FYI, I have extended the example to use the Protobuf-like solution for
domain-specific objects which I mentioned above. The overall flow of the
example is this:

   - The client creates and populates a TinkerGraph instance. One of the
   properties has the key "livesIn" and a value which is an instance of a
   domain-specific BoundingBox class.
   - The client encodes the graph to an instance of the Thrift-generated
   Graph class. The BoundingBox is serialized using a JSON-based encoder which
   has been added to an encoder registry that is shared between client and
   server.
   - The Thrift-generated code sends the encoded graph across the wire to
   the server, which receives it again as an instance of the Thrift-generated
   Graph class.
   - The server decodes the graph to a new instance of TinkerGraph. The
   serialized bounding box is deserialized to an instance of the
   domain-specific BoundingBox class, and becomes a property value in the
   server's graph.
   - The server prints out some info and writes the received graph to disk
   as a GraphSON file so we can see that it is true to the client's original
   graph.

Note: I'm stretching the notion of "serialized" values somewhat in that, in
these graphs, a serialized value is a record with two fields (or an object
with two member variables): the encoded value itself (in this case, a JSON
blob), and a type identifier.
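
To make the two-field record concrete, here is a minimal sketch of a shared
registry that maps a type identifier to an encoder/decoder pair. It is not the
code in the branch: the names SerializedValue, Codec and EncoderRegistry, the
BoundingBox fields, the "example:BoundingBox" identifier and the hand-rolled
JSON are all assumptions made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SerializedValueSketch {

    /** Hypothetical stand-in for the domain-specific class. */
    static final class BoundingBox {
        final double minLat, minLon, maxLat, maxLon;

        BoundingBox(double minLat, double minLon, double maxLat, double maxLon) {
            this.minLat = minLat;
            this.minLon = minLon;
            this.maxLat = maxLat;
            this.maxLon = maxLon;
        }
    }

    /** The two-field record: a type identifier plus the encoded value. */
    static final class SerializedValue {
        final String typeId;
        final String encoded;

        SerializedValue(String typeId, String encoded) {
            this.typeId = typeId;
            this.encoded = encoded;
        }
    }

    /** Encoder/decoder pair for one domain-specific type (assumed shape). */
    static final class Codec<T> {
        final Class<T> type;
        final Function<T, String> encoder;
        final Function<String, T> decoder;

        Codec(Class<T> type, Function<T, String> encoder, Function<String, T> decoder) {
            this.type = type;
            this.encoder = encoder;
            this.decoder = decoder;
        }
    }

    /** Registry that client and server would both construct identically. */
    static final class EncoderRegistry {
        private final Map<String, Codec<?>> codecsByTypeId = new HashMap<>();
        private final Map<Class<?>, String> typeIdsByClass = new HashMap<>();

        <T> void register(String typeId, Codec<T> codec) {
            codecsByTypeId.put(typeId, codec);
            typeIdsByClass.put(codec.type, typeId);
        }

        SerializedValue encode(Object value) {
            String typeId = typeIdsByClass.get(value.getClass());
            if (typeId == null) {
                throw new IllegalArgumentException("no encoder for " + value.getClass());
            }
            @SuppressWarnings("unchecked")
            Codec<Object> codec = (Codec<Object>) codecsByTypeId.get(typeId);
            return new SerializedValue(typeId, codec.encoder.apply(value));
        }

        Object decode(SerializedValue value) {
            Codec<?> codec = codecsByTypeId.get(value.typeId);
            if (codec == null) {
                throw new IllegalArgumentException("no decoder for " + value.typeId);
            }
            return codec.decoder.apply(value.encoded);
        }
    }

    // Deliberately tiny JSON handling; a real encoder would use a JSON library.
    private static final Pattern FIELD =
            Pattern.compile("\"(\\w+)\":(-?[0-9][0-9.eE+-]*)");

    static String toJson(BoundingBox b) {
        return "{\"minLat\":" + b.minLat + ",\"minLon\":" + b.minLon
                + ",\"maxLat\":" + b.maxLat + ",\"maxLon\":" + b.maxLon + "}";
    }

    static BoundingBox fromJson(String json) {
        Map<String, Double> fields = new HashMap<>();
        Matcher m = FIELD.matcher(json);
        while (m.find()) {
            fields.put(m.group(1), Double.parseDouble(m.group(2)));
        }
        return new BoundingBox(fields.get("minLat"), fields.get("minLon"),
                fields.get("maxLat"), fields.get("maxLon"));
    }

    static EncoderRegistry newRegistry() {
        EncoderRegistry registry = new EncoderRegistry();
        registry.register("example:BoundingBox", new Codec<BoundingBox>(
                BoundingBox.class, SerializedValueSketch::toJson, SerializedValueSketch::fromJson));
        return registry;
    }

    public static void main(String[] args) {
        EncoderRegistry registry = newRegistry();
        SerializedValue wire = registry.encode(new BoundingBox(37.7, -122.5, 37.8, -122.4));
        System.out.println(wire.typeId + " " + wire.encoded);
        BoundingBox decoded = (BoundingBox) registry.decode(wire);
        System.out.println(decoded.minLat + " " + decoded.maxLon);
    }
}
```

The point of the sketch is only that the wire format carries the pair
(type identifier, encoded value), so the receiving side can pick a decoder
without the Thrift definitions knowing anything about BoundingBox.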

Josh



On Wed, Jul 7, 2021 at 6:51 AM Joshua Shinavier <j...@fortytwo.net> wrote:

> Hi Stephen,
>
> Good questions. Let's elevate this discussion (about the specifics of
> graphs and traversal results over Thrift) to the dev list. See inline.
>
>
> On Wed, Jul 7, 2021 at 5:08 AM Stephen Mallette <spmalle...@gmail.com>
> wrote:
>
>> So, what happens if a returned Vertex contained a ByteBuffer or
>> InetAddress as a property value? I assume the thrift definition has to be
>> adjusted to include those types if you expect them in the results?
>>
>
>
> What you see in the diff, currently, captures the types specifically
> mentioned in Graph.Features (see graph_features.yaml). In order to support
> other types natively, we should update Graph.Features in parallel. Byte
> arrays can be captured using Thrift's binary type. Domain-specific types
> like InetAddress probably should not be built in, just as specific element
> labels and property keys are not built in at this level. However, that is
> not the only possible answer. Certain very common types like IP addresses,
> dates and intervals, units of measurement, etc. *could* be built into the
> type system, but IMO probably shouldn't. Instead, we should give users a
> way of encoding and decoding domain-specific objects using a handful of
> atomic types. InetAddress in this case is encoded either as a string or a
> struct.
>
>
>
>> How would provider specific types (like a Point or special instances of P
>> in JanusGraph) fit into something like this - how would providers (or
>> users) extend on our thrift definitions?
>>
>
> Point is definitely a domain-specific type which you would not see at this
> level of schema. Maybe I can illustrate encoding and decoding
> domain-specific types in the branch; using the current simple type system,
> you could turn the Point into a map with three keys, like "latitude",
> "longitude" and "type". When receiving a map with "type" equal to "Point",
> you turn it back into a native Point object. We could also use a strategy
> similar to Protobuf's Any type, where we send a struct with two fields over
> the wire: one field provides the data of the Point, and the other field
> provides a URL which specifies the type, i.e. how the object should be
> decoded. It is probably worthwhile to add a "record" type variant to
> Graph.Features in any case.
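
A rough sketch of the map-based option, for illustration only: the Point class
here is a hypothetical two-field stand-in (not JanusGraph's actual type), and
the helper names encode/decode are made up. The only things taken from the
paragraph above are the three keys "latitude", "longitude" and "type".

```java
import java.util.HashMap;
import java.util.Map;

final class PointMapSketch {

    /** Hypothetical domain-specific type. */
    static final class Point {
        final double latitude;
        final double longitude;

        Point(double latitude, double longitude) {
            this.latitude = latitude;
            this.longitude = longitude;
        }
    }

    /** Turn a Point into a plain map that a simple type system can carry. */
    static Map<String, Object> encode(Point p) {
        Map<String, Object> map = new HashMap<>();
        map.put("type", "Point");
        map.put("latitude", p.latitude);
        map.put("longitude", p.longitude);
        return map;
    }

    /** Rebuild a Point when the "type" key says so; otherwise pass the map through. */
    static Object decode(Map<String, Object> map) {
        if ("Point".equals(map.get("type"))) {
            double lat = ((Number) map.get("latitude")).doubleValue();
            double lon = ((Number) map.get("longitude")).doubleValue();
            return new Point(lat, lon);
        }
        return map; // not a recognized domain-specific type
    }

    public static void main(String[] args) {
        Object decoded = decode(encode(new Point(52.5, 13.4)));
        System.out.println(decoded instanceof Point);
    }
}
```

One drawback of this variant is that an ordinary map which happens to contain
a "type" key equal to "Point" is indistinguishable from an encoded Point, which
is part of the appeal of the separate type field in the Any-style approach.
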
>
>
>
>> I think that the idea of having a more strict definition on the types
>> Gremlin supports is starting to materialize given the constraints on
>> serializable types of GraphSON and then further restricted in GraphBinary.
>> We actually have a list of types that haven't changed much in years at this
>> point:
>>
>> https://tinkerpop.apache.org/docs/3.5.0/dev/io/
>>
>
>
> We might want to go through this list with a fine-toothed comb (i.e. we
> probably don't want both a Date atomic type and a Timestamp type unless
> they have different precision/granularity, in which case I would make that
> explicit in the name of the type, e.g. UnixTimeSeconds vs. UnixTimeMillis).
>
>
>> I think we could actually even limit them further and then the dream would
>> be to prevent them from being so JVM specific.
>>
>
>
> Yes, I would argue for limiting them to very domain-independent atomic
> types, probably excluding the timestamp type(s) as well as UUID and Class.
> However, as I say it's possible to include a few specialized types if the
> user demand is really high. It's just more stuff which needs to be
> implemented in each Gremlin language variant.
>
>
>
>> It would be nice to elevate the discussion of supported types out of
>> serialization and into the Gremlin language layer itself, which would then
>> in turn drive serialization discussions.
>>
>
>
> That's where I see this going. The specification of Gremlin traversal
> structure in YAML (already illustrated in the branch) translates neatly
> into traversals over the wire using Thrift. To that and the basic graph
> structure specification, we need a specification for other kinds of objects
> which appear in traversal results, such as paths.
>
>
> Josh
>
>
> [original message clipped]
>
