What's new in jms/serializer v2.0

After more than two years of work (the first commit was on July 20, 2016), jms/serializer v2.0 is about to see the light. Here is a preview of the changes, new features and improvements that will be released soon.

jms/serializer is a popular library for serializing objects to XML and JSON and deserializing them back. Its install counter on Packagist has already passed 20 million downloads.

Version 1.x works well, but over the years it has accumulated inconsistent behaviours, and even though it is one of the fastest PHP serializers, I realized that there was still room for improvement. Version 2.0 is not a big-bang release but an improved version of what was available in the 1.x series.

Upgrading from 1.x to 2.0 should be really simple and in some cases fully transparent. The metadata format is untouched and the user API is almost identical. If you have implemented more advanced features on top of the serializer by overriding core classes, things might be more complicated, but it is mainly a matter of adding PHP 7.2 type hints (7.2 is the minimum PHP version for 2.0). Here is the initial UPGRADING guide.

Side note: yes, jms/serializer finally has a logo!

jms/serializer logo

License

jms/serializer 2.0 is licensed under the permissive MIT license (as are the 1.x versions starting from 1.12.0). This solves some adoption issues in projects whose licenses are incompatible with the serializer's previous license, Apache-2.0.

Improvements

Here is a quick list of the improvements that are immediately noticeable when using version 2.0.

1) Performance

The new serializer is approximately 35% faster than v1.x and consumes 3% less memory.

This was achieved by moving more information into the compiled metadata, reducing the number of context accesses, using simpler access strategies, using a faster event dispatcher and a few other tricks. In most cases it was about reducing the number of repetitive operations by moving them to a place where they are invoked only once per serialization call (instead of once for each object in the object graph).
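To make the idea concrete, here is a minimal, self-contained sketch of that kind of hoisting. It is not the serializer's actual code: resolveRenamer(), renameKeys(), serializePerObject() and serializeHoisted() are hypothetical names, and the call counter exists only to make the difference visible.

```php
<?php
// Hypothetical stand-in for work that depends only on the configuration,
// not on the object being serialized (e.g. resolving a naming strategy).
function resolveRenamer(array $config, int &$calls): callable
{
    $calls++; // count how often the lookup runs
    return $config['uppercase'] ? 'strtoupper' : 'strtolower';
}

// Renames the keys of one "object" (a plain array here) with the given callable.
function renameKeys(array $object, callable $rename): array
{
    return array_combine(array_map($rename, array_keys($object)), array_values($object));
}

// Naive variant: the lookup is repeated for every object in the graph.
function serializePerObject(array $objects, array $config, int &$calls): array
{
    $out = [];
    foreach ($objects as $object) {
        $out[] = renameKeys($object, resolveRenamer($config, $calls));
    }
    return $out;
}

// Hoisted variant: the lookup runs once per serialization call.
function serializeHoisted(array $objects, array $config, int &$calls): array
{
    $rename = resolveRenamer($config, $calls);
    $out = [];
    foreach ($objects as $object) {
        $out[] = renameKeys($object, $rename);
    }
    return $out;
}

$objects = [['Id' => 1], ['Id' => 2], ['Id' => 3]];
$config = ['uppercase' => true];

$naiveCalls = 0;
$hoistedCalls = 0;
$naive = serializePerObject($objects, $config, $naiveCalls);
$hoisted = serializeHoisted($objects, $config, $hoistedCalls);
```

Both variants produce the same output; only the number of lookups differs (once per object versus once per call).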

I ran some benchmarks using ivory-serializer-benchmark. The JMS serializer was configured to use the APCu cache for metadata and annotations (by default files are used), the Symfony serializer was configured as in this PR, and the Ivory serializer was configured by the author of the original benchmark.

The JMS serializer is the only serializer with complete Doctrine support out of the box. To make the comparison fairer, there are two versions of the JMS benchmark: the standard one (with all the features enabled) and a minimal one with Doctrine support disabled.

The source code used for the benchmark is available here, and here are some results from a recent Travis CI build (n.

Here is a more detailed set of benchmarks run on my laptop (Intel® Core™ i7-8550U, 16GB RAM). I have excluded the Symfony Object Normalizer from all of them: its performance is so bad that the graphs would become unreadable if I had to fit it into the graph area. The following graphs summarize the benchmark results (lower is better).

Graphs: complex object graph, mid-complex object graph (two graphs), simple object graph, very simple object graph, trivial object graph.

Let's explain the meaning of this benchmark. Horizontal and vertical complexity are characteristics of the benchmarked object graph.

  • Big values of horizontal complexity indicate a very "deep" object graph, with many objects having many child objects (horizontal=100 indicates an object with a chain of 200 nested objects)
  • Big values of vertical complexity indicate objects that have many children, but those children have no child objects of their own (vertical=100 indicates an object having 200 direct child objects)
  • Big values of both vertical and horizontal complexity indicate a very complex object graph that combines the previous two cases
  • Small values of vertical and/or horizontal complexity indicate very simple object graphs with one, two or a few nested objects

Benchmark conclusions: version 2.0 got ~30% faster on non-trivial object graphs, but it got slower on very simple cases when compared to 1.x.
My personal opinion on the results is that in trivial cases we are talking about a difference of 300 microseconds when serializing a graph of fewer than 5-10 objects. Benchmarking such a small workload makes no sense. The difference between v1 and v2 is mostly caused by the more complex internal structure, such as the graph factories and visitor factories introduced in v2. Another conclusion is that this is a benchmark on big graphs with no database access; if your application needs to serialize 5-50 objects, the performance impact is not that visible. Choose the serializer that has the features and DX you want or need.

2) Simpler context object

Version 2.0 does not depend on the phpcollection library anymore, which means that "context attributes" are now a plain PHP array. This makes the context API simple and intuitive.

v1.x:

<?php
use JMS\Serializer\SerializationContext;
use PhpOption\Some;

$context = SerializationContext::create();
$context->attributes->set("foo", "bar"); // set a value
$holder = $context->attributes->get("foo"); // returns a PhpOption\Option
if ($holder instanceof Some) {
    $value = $holder->get(); // get a value
} else {
    // value was not in the context attributes
}

v2.x:

<?php
use JMS\Serializer\SerializationContext;

$context = SerializationContext::create();
$context->setAttribute("foo", "bar"); // set a value
if ($context->hasAttribute("foo")) {
    $value = $context->getAttribute("foo"); // get a value
} else {
    // value was not in the context attributes
}

3) No more obsolete exclusion strategies

Another unexpected behaviour of the context object was "obsolete exclusion strategies": in v1.x, calling setGroups() a second time did not replace the groups set by the first call.

v1.x:

<?php
use JMS\Serializer\SerializationContext;

$context = SerializationContext::create();
$context->setGroups(["foo"]);
$context->setGroups(["bar"]);

// now the context has "foo" and "bar" as groups

v2.x:

<?php
use JMS\Serializer\SerializationContext;

$context = SerializationContext::create();
$context->setGroups(["foo"]);
$context->setGroups(["bar"]);

// now the context has only "bar"

The new behaviour is more consistent and in line with what is considered standard.

4) Recursive calls

The visitors are now fully stateless, so they can be used in recursive contexts.

5) Metadata errors vs Runtime errors

In 2.0, all the "metadata-related" errors extend the JMS\Serializer\Exception\InvalidMetadataException exception. Thanks to this, it is possible to distinguish configuration errors from errors caused by invalid object state.
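As a rough sketch of how calling code could take advantage of this, the helper below runs any serialization callback and reports which kind of failure occurred. classifyFailure() is a hypothetical helper written for this post, not part of jms/serializer; only the exception class name comes from the library. The two calls at the bottom use stand-in callbacks so that the snippet runs on its own.

```php
<?php
use JMS\Serializer\Exception\InvalidMetadataException;

/**
 * Runs a serialization callback and reports which kind of failure occurred.
 * Hypothetical helper, not part of jms/serializer.
 */
function classifyFailure(callable $serialize): string
{
    try {
        $serialize();
        return 'ok';
    } catch (InvalidMetadataException $e) {
        // Configuration problem: something is wrong in the metadata.
        return 'metadata error: ' . $e->getMessage();
    } catch (\Throwable $e) {
        // Anything else, e.g. an object in an invalid state.
        return 'runtime error: ' . $e->getMessage();
    }
}

// Stand-in callbacks; a real one would wrap a serialize() call.
$ok = classifyFailure(function () {
    return '{"username":"alice"}';
});
$failed = classifyFailure(function () {
    throw new \RuntimeException('invalid state');
});
```

With the library installed, the callback would typically wrap something like $serializer->serialize($user, 'json'), where $serializer comes from SerializerBuilder::create()->build().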

Fixed inconsistencies

The list of fixes is pretty long; the full list can be found in CHANGELOG.md.

Here are some of the most important ones.

1) Stateless JSON visitor

The old JSON visitor had the concept of a "root" node, and the user had to take care of it: make sure the root was always present and not overwrite it when it was already set. This made custom handlers unintuitive and caused many unexpected behaviours.

Let's see how a custom JSON handler for a hypothetical User object was implemented.

v1.x:

<?php
use App\Entity\User;
use JMS\Serializer\GraphNavigatorInterface;
use JMS\Serializer\Handler\SubscribingHandlerInterface;
use JMS\Serializer\JsonSerializationVisitor;

final class CustomUserHandler implements SubscribingHandlerInterface
{
    public static function getSubscribingMethods()
    {
        return [
            [
               'direction' => GraphNavigatorInterface::DIRECTION_SERIALIZATION,
               'type' => User::class,
               'format' => 'json',
               'method' => 'serializeUserToJson',
            ]
        ];
    }

    public function serializeUserToJson(JsonSerializationVisitor $visitor, User $user)
    {
        // custom user serialization
        $data = [
            'username' => $user->getUsername()
        ];

        // extra root check
        if ($visitor->getRoot() === null) {
            $visitor->setRoot($data);
        }
        return $data;
    }
}

v2.x:

<?php
use App\Entity\User;
use JMS\Serializer\GraphNavigatorInterface;
use JMS\Serializer\Handler\SubscribingHandlerInterface;
use JMS\Serializer\JsonSerializationVisitor;

final class CustomUserHandler implements SubscribingHandlerInterface
{
    public static function getSubscribingMethods()
    {
        return [
            [
               'direction' => GraphNavigatorInterface::DIRECTION_SERIALIZATION,
               'type' => User::class,
               'format' => 'json',
               'method' => 'serializeUserToJson',
            ]
        ];
    }

    public function serializeUserToJson(JsonSerializationVisitor $visitor, User $user)
    {
        // custom user serialization
        return [
            'username' => $user->getUsername()
        ];
    }
}

2) @MaxDepth on object collections

When using the "max depth" exclusion strategy, traversing arrays and collections (Doctrine's ArrayCollection, for example) behaved differently. To achieve a desired "max depth" of 1, arrays had to be declared with @MaxDepth(1), but collections had to be declared with @MaxDepth(2). This was a very old bug that had effectively become a feature in v1.x; in 2.0 it has been fixed, and @MaxDepth(1) now works as expected in both cases.
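In practice, this means that in 2.0 an array property and a collection property can be annotated the same way. The entity below is a hypothetical example (Author, Book and Review are made-up names); the annotations are ordinary docblocks, so the class definition itself runs without the library installed.

```php
<?php
use JMS\Serializer\Annotation as Serializer;

// Hypothetical entity, used only to show where the annotation goes.
class Author
{
    /**
     * A plain PHP array of related objects.
     *
     * @Serializer\Type("array<App\Entity\Book>")
     * @Serializer\MaxDepth(1)
     */
    private $books = [];

    /**
     * A Doctrine collection of related objects. In v1.x this needed
     * MaxDepth(2) to behave like the array above; in 2.0 both use MaxDepth(1).
     *
     * @Serializer\Type("ArrayCollection<App\Entity\Review>")
     * @Serializer\MaxDepth(1)
     */
    private $reviews;
}

// Depth checks are applied only when they are enabled on the context:
// $context = \JMS\Serializer\SerializationContext::create()->enableMaxDepthChecks();
```

When upgrading, collections that were annotated with @MaxDepth(2) to work around the old behaviour are the ones to review.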

3) Consistent Exclusion Strategies

In v1.x the exclusion strategies sometimes returned null, sometimes returned empty objects such as {}, and in other cases actually excluded the property. Now the behaviour is more consistent and handles collections and circular references better (see #895).

Other changes

Many other changes made it possible to reduce complexity and improve speed. Here are some of them:

  • A relaxed version of doctrine/coding-standard has been adopted as the official coding standard. The codebase is now checked by PHPCS for coding violations on each build, which helps keep bugs out of released code.

  • The VisitorInterface has been replaced by two more specialized interfaces: SerializationVisitorInterface and DeserializationVisitorInterface.

  • Type parsing now uses HOA Compiler, an AST-based lexer and parser.
  • Updated to jms/metadata v2.0, which does not rely on reflection as heavily as v1.x, gaining some performance and saving some memory.
  • To improve performance, the graph navigators and visitors are stateless and instantiated on each serialize/deserialize call.
  • Removed support for "obsolete" libraries such as Propel and PHPCollection; the minimum supported Symfony version is now 3.0.
  • Removed YAML serialization support (to not be confused with YAML metadata support that is still part of the serializer core)
  • Removed PHP metadata definitions support

What's next?

Today the beta release has been tagged; in two to four weeks the RC release will be tagged. Before entering the RC stage it will still be possible to break things by introducing big changes (if necessary), but once in the RC stage no new features will be added until 2.0 is completed (any pull requests that are not bug fixes will land in v2.1). Four weeks after the last RC, a stable version will be tagged.

In case of big changes in the beta period or bugs discovered in the RC phase, a second beta or RC release could happen, shifting the release process by one or two iterations.

In parallel, for Symfony users, the new JMSSerializer bundle (v3.0) will be released to integrate the new serializer into Symfony. The JMSSerializer bundle will follow the same release schedule as the serializer library.

Many popular libraries use the serializer, such as willdurand/Hateoas, nelmio/NelmioApiDocBundle or FriendsOfSymfony/FOSRestBundle, and they need to be upgraded as well. I hope to see some feedback from the PHP community.

Do you want to help?

That's great! As you can see, there is a lot of work that needs to be done. Feel free to contact me via GitHub or email. You can test it with your project, write documentation or tests, or contribute to the project in other ways. Do not hesitate!

You can also support me on https://www.patreon.com/goetas.

Feedback via email or comments is always welcome! :)

php, api, json, rest, serializer, xml
