Re: [DISCUSS] New IO format for GLVs/Gremlin Server

Robert Dale Wed, 13 Jul 2016 06:16:36 -0700

Hi, Stephen.  I think that's a bad example. You may recall I brought
up that issue in the forum.  However, it's actually attributable to the
default ID manager, ANY (kept for historical reasons), which I think is a
really bad default (and reason) because it only leads to confusion.  Java
is one of the few, if not the only, brain-damaged languages where 5 != 5
!= 5: boxed numbers of different types (say, Integer, Long, and Short)
holding the same value are not equal under equals().  In Java, number
objects must be coerced into a like form for comparison. The other ID
managers do this coercion.  Saner languages do this under the covers.
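
To make the point concrete, here is a minimal sketch in plain Java. It
does not use TinkerPop at all: the class name NumericIdLookup and the
coerce() helper are hypothetical, and the HashMap only stands in for an
ID index. It illustrates the general idea of coercing ids to one numeric
type; it is not the actual ID manager code.

```java
import java.util.HashMap;
import java.util.Map;

class NumericIdLookup {

    // Hypothetical helper (not TinkerPop API): normalize any numeric id to Long.
    static Long coerce(Object id) {
        if (id instanceof Number) {
            return ((Number) id).longValue();
        }
        throw new IllegalArgumentException("Not a numeric id: " + id);
    }

    public static void main(String[] args) {
        // Boxed numbers of different types are never equal, even with the same value.
        System.out.println(Integer.valueOf(5).equals(Long.valueOf(5L))); // false

        // Ids stored as given: an Integer key cannot be found with a Long.
        Map<Object, String> rawIndex = new HashMap<>();
        rawIndex.put(1, "v[1]");
        System.out.println(rawIndex.get(1));  // v[1]
        System.out.println(rawIndex.get(1L)); // null

        // Ids coerced to Long on both write and read: either form finds the entry.
        Map<Long, String> coercedIndex = new HashMap<>();
        coercedIndex.put(coerce(1), "v[1]");
        System.out.println(coercedIndex.get(coerce(1)));  // v[1]
        System.out.println(coercedIndex.get(coerce(1L))); // v[1]
    }
}
```

The raw map behaves like the console session quoted below, while the
coerced map gives the behaviour I would expect from a default.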


On Wed, Jul 13, 2016 at 8:56 AM, Stephen Mallette <[email protected]> wrote:
> Robert, thanks for joining this discussion.
>
>> I wonder if it even makes sense to type numbers according to their
>> memory model. As objects, Byte, Short, and Integer occupy the same
>> space. Long isn't much more.  So in Java we're not saving much space.
>> Jackson will attempt to parse in order: int, long, BigInt, BigDecimal.
>> The JSON JSR uses only BigDecimal. Some non-jvm languages don't even
>> have this concept.  Does anything in gremlin actually require this?
>
> If the intended numeric type isn't preserved, weird things can happen with
> graphs that have a schema (like Titan/DSE). Even TinkerGraph using the
> default ID manager will not be happy if you try to do a lookup of Long
> identifiers with an Integer:
>
> gremlin> graph = TinkerFactory.createModern()
> ==>tinkergraph[vertices:6 edges:6]
> gremlin> graph.vertices(1)
> ==>v[1]
> gremlin> graph.vertices(1L)
> gremlin>
>
>
>
>
> On Wed, Jul 13, 2016 at 8:17 AM, Robert Dale <[email protected]> wrote:
>
>> Marko, I agree that empty object properties should not be represented.
>> I think if you saw that in an example then it was probably for
>> demonstration purposes.
>>
>> Kevin, can you expand on this comment:
>>
>> > the format you suggest would lead to the same inconsistencies as in
>> > GraphSON 1.0.
>> > Since the type is at the same level as the data itself, whether the
>> > container is an Array or an Object
>> > https://github.com/apache/tinkerpop/pull/351#issuecomment-231351653
>>
>> What exactly are the inconsistencies?  What is the problem in
>> determining an array or object?
>> This is a natural JSON array (or list): []
>> This is a natural JSON object: {}
>>
>> Type at the object level is a common pattern and supported feature of
>> Jackson.  Also, GeoJSON would be a natural fit as it also stores
>> 'type' at the object level. Titan supports GeoJSON currently.  I
>> wonder if it would make sense to promote geometry to gremlin.
>>
>> We should probably start documenting a table of supported types. (If
>> there is one, please provide link)
>>
>> I wonder if it even makes sense to type numbers according to their
>> memory model. As objects, Byte, Short, and Integer occupy the same
>> space. Long isn't much more.  So in Java we're not saving much space.
>> Jackson will attempt to parse in order: int, long, BigInt, BigDecimal.
>> The JSON JSR uses only BigDecimal. Some non-jvm languages don't even
>> have this concept.  Does anything in gremlin actually require this?
>> I'm thinking that this is only going to be relevant at the domain
>> model level. This way json native numbers can be used and not need
>> typing.
>>
>> Additionally, I think that all things that will be typed should always
>> be typed. For the use case of ingesting a saved graph from a file, it
>> can probably be assumed that the top-level objects are vertices since
>> the graph is vertex-centric and everything else follows naturally.
>> I'm not entirely sure what is required for submitting traversals to
>> gremlin server from GLV.  However, if this is used for the results
>> from gremlin server then the results could start with any one of path,
>> vertex, edge, property, vertex property, etc. So you'll need that type
>> data there.
>>
>> --
>> Robert Dale
>>
>> On Tue, Jul 12, 2016 at 8:35 AM, Marko Rodriguez <[email protected]>
>> wrote:
>> > Hi,
>> >
>> > I’m not following this PR too closely, so what I’m saying might be
>> > already known/argued against/etc.
>> >
>> >         1. I think we should go with Robert Dale’s proposal of int32,
>> >            int64, Vertex, uuid, etc. instead of Java class names.
>> >         2. In Java we then have a Map<String,Class> for typecasting
>> >            accordingly.
>> >         3. This would make GraphSON 2.0 perfect for Bytecode
>> >            serialization in TINKERPOP-1278.
>> >         4. I think that if a Vertex, Edge, etc. doesn’t have properties,
>> >            outV, etc. then don’t even have those fields in the
>> >            representation.
>> >         5. Most of the serialization back and forth will be ReferenceXXX
>> >            elements and thus, don’t create more Maps/lists for no reason
>> >            (fewer chars).
>> >
>> > For me, my interest in this work is all about a language-agnostic way
>> > of sending Gremlin traversal bytecode between different languages. This
>> > work is exactly what I am looking for.
>> >
>> > Thanks,
>> > Marko.
>> >
>> > http://markorodriguez.com
>> >
>> >
>> >
>> >> On Jul 9, 2016, at 9:48 AM, Stephen Mallette <[email protected]> wrote:
>> >>
>> >> With all the work on GLVs and the recent work on GraphSON 2.0, I think
>> >> it's important that we have a solid, efficient, programming language
>> >> neutral, lossless serialization format. Right now that format is
>> >> GraphSON and it works for that purpose (even more so with 2.0). Given
>> >> some discussion on the GraphSON 2.0 PR driven a bit by Robert Dale:
>> >>
>> >> https://github.com/apache/tinkerpop/pull/351#issuecomment-231157389
>> >>
>> >> I wonder if we shouldn't consider another IO format that has Gremlin
>> >> Server/GLVs in mind. At this point I'm not suggesting anything
>> >> specific - I'm just hanging the idea out for further discussion and
>> >> brainstorming.
>> >> Thoughts?
>> >
>>
>>
>>
>> --
>> Robert Dale
>>



-- 
Robert Dale
