Hello,

This is a “big idea” in that I don’t see it being helpful for TinkerPop3, but it is something to consider for TinkerPop4.

We currently support two “internal” serialization mechanisms: Gryo and GraphSON. We also support GraphML, but this is primarily for data reading/writing (interoperability with other graph frameworks). The reason we have Gryo is that it’s “fast” and good for things that are Java-to-Java (or internal to the Gremlin VM, like OLAP serialization). However:

  1. With the development of the Gremlin language variants outside the JVM (e.g. Gremlin-C# and Gremlin-Python), GraphSON is the only viable serialization format.
  2. With the development of other Gremlin virtual machines (e.g. CosmosDB’s Gremlin .NET implementation), GraphSON is the only viable serialization format.

I think we should work to get rid of Gryo for TinkerPop4 and make GraphSON the default/standard/universal serialization mechanism. To do this, we need to make it “fast.”

I saw an excellent talk by the ArangoDB guys at GraphDay about their immutable JSON serialization format. In short, they keep a binary JSON representation and can get data out of it without having to turn it into a nested-Map structure. In this way, they can do fast lookup operations directly on the byte stream. We could learn from their efforts and develop a class along the lines of:

  GraphSONByteStream.toJSON()    -> yields the String { [ ] } JSON representation.
  GraphSONByteStream.toMap()     -> yields the nested Map structure we are familiar with in TinkerPop.
  GraphSONByteStream.get("/id")  -> does a random-access lookup into the byte stream so we don’t have to deserialize; it is all byte-offset based.

In short, Gryo is too hard to manage and too Java-specific. In the future, we should look to making an ultra-fast GraphSON serializer/deserializer, along with ensuring an elegant, self-consistent representation of all the necessary TinkerPop objects.

Thoughts?

Marko.

http://markorodriguez.com
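
P.S. To make the GraphSONByteStream idea a bit more concrete, here is a minimal sketch in Java. Everything in it is an assumption made for illustration: the byte layout is a toy one (not ArangoDB's format and not any existing GraphSON encoding), the of() factory and the private helpers are hypothetical, and toJSON() is left out to keep it short. Each value is a one-byte tag followed by its payload, and each object field stores the byte size of its value, so get() can hop over fields it does not need instead of building the nested Map.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy layout (an assumption, not a real format):
//   'L' + 8-byte long | 'S' + int length + UTF-8 bytes
//   'O' + int fieldCount + fields, each: int keyLen, key, int valueSize, value
final class GraphSONByteStream {
    private final ByteBuffer buf;

    private GraphSONByteStream(byte[] bytes) { this.buf = ByteBuffer.wrap(bytes); }

    // Hypothetical factory: encodes a nested Map once into the byte layout.
    static GraphSONByteStream of(Map<String, Object> map) {
        return new GraphSONByteStream(encode(map));
    }

    private static byte[] encode(Object v) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bos);
            if (v instanceof Map) {
                Map<?, ?> m = (Map<?, ?>) v;
                out.writeByte('O');
                out.writeInt(m.size());
                for (Map.Entry<?, ?> e : m.entrySet()) {
                    byte[] key = e.getKey().toString().getBytes(StandardCharsets.UTF_8);
                    byte[] child = encode(e.getValue());
                    out.writeInt(key.length);
                    out.write(key);
                    out.writeInt(child.length); // size prefix lets readers skip this value
                    out.write(child);
                }
            } else if (v instanceof Number) {
                out.writeByte('L');
                out.writeLong(((Number) v).longValue());
            } else {
                byte[] s = String.valueOf(v).getBytes(StandardCharsets.UTF_8);
                out.writeByte('S');
                out.writeInt(s.length);
                out.write(s);
            }
            return bos.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Random-access lookup such as get("/properties/name"); only the
    // value at the end of the path is materialized.
    Object get(String path) {
        int pos = 0;
        for (String part : path.split("/")) {
            if (part.isEmpty()) continue;
            pos = find(pos, part);
            if (pos < 0) return null;
        }
        return decode(pos);
    }

    // Full materialization into the familiar nested Map.
    @SuppressWarnings("unchecked")
    Map<String, Object> toMap() { return (Map<String, Object>) decode(0); }

    // Scans the fields of the object at pos; returns the offset of the
    // wanted field's value, or -1. Non-matching values are skipped by size.
    private int find(int pos, String wanted) {
        if (buf.get(pos) != 'O') return -1;
        int n = buf.getInt(pos + 1);
        int p = pos + 5;
        for (int i = 0; i < n; i++) {
            String key = readString(p);
            p += 4 + buf.getInt(p);          // move past keyLen + key bytes
            int size = buf.getInt(p);
            if (key.equals(wanted)) return p + 4;
            p += 4 + size;                   // skip the value without decoding it
        }
        return -1;
    }

    private Object decode(int pos) {
        byte tag = buf.get(pos);
        if (tag == 'L') return buf.getLong(pos + 1);
        if (tag == 'S') return readString(pos + 1);
        Map<String, Object> m = new LinkedHashMap<>();
        int n = buf.getInt(pos + 1);
        int p = pos + 5;
        for (int i = 0; i < n; i++) {
            String key = readString(p);
            p += 4 + buf.getInt(p);
            int size = buf.getInt(p);
            m.put(key, decode(p + 4));
            p += 4 + size;
        }
        return m;
    }

    private String readString(int pos) {
        int len = buf.getInt(pos);
        return new String(buf.array(), pos + 4, len, StandardCharsets.UTF_8);
    }
}
```

With a vertex-like Map holding "id", "label" and a nested "properties" Map, GraphSONByteStream.of(map).get("/properties/name") walks two levels of size-prefixed fields and decodes only the final string. In this sketch all numbers are stored as longs, so an Integer id comes back as a Long. A real implementation would presumably add sorted keys or an offset index for faster lookup, plus the full GraphSON type system.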