On 2016-07-15 14:44 (+0100), Robert Dale <[email protected]> wrote:
> It looks to me like a self-inflicted problem because the things that
> are typed are already native to json so it's redundant. And to go a
> step further, I wouldn't consider the types to be 'correct' because
> everything that is a HashMap is really a Vertex, Edge, or Property.
>
> On Thu, Jul 14, 2016 at 10:03 AM, [email protected]
> <[email protected]> wrote:
> >
> >
> > On 2016-07-13 13:17 (+0100), Robert Dale <[email protected]> wrote:
> >> Marko, I agree that empty object properties should not be represented.
> >> I think if you saw that in an example then it was probably for
> >> demonstration purposes.
> >>
> >> Kevin, can you expand on this comment:
> >>
> >> > the format you suggest would lead to the same inconsistencies as in
> >> > GraphSON 1.0.
> >> > Since the type is at the same level as the data itself, whether the
> >> > container is an Array or an Object
> >> > https://github.com/apache/tinkerpop/pull/351#issuecomment-231351653
> >>
> >> What exactly are the inconsistencies? What is the problem in
> >> determining an array or object?
> >> This is a natural JSON array (or list): []
> >> This is a natural JSON object: {}
> >>
> >> Type at the object level is a common pattern and supported feature of
> >> Jackson. Also, GeoJSON would be a natural fit as it also stores
> >> 'type' at the object level. Titan supports GeoJSON currently. I
> >> wonder if it would make sense to promote geometry to gremlin.
> >>
> >
> > I probably wasn't clear enough. In my first email, explaining my motivation
> > for improving GraphSON 1.0, one of the things I noted was that, depending on
> > the enclosing element (either an Array or a Map), a type is described
> > either as an element of the Array or as a key/value pair in the Map,
> > respectively. You can see that in the "embedded types" example of the
> > TinkerPop docs:
> > http://tinkerpop.apache.org/docs/current/reference/#graphson-reader-writer .
> >
> > There you can see that the type "java.util.ArrayList" is a simple element
> > of the enclosing array, but the "java.util.HashMap" type is a field of the
> > enclosing Map as {"@class" : "java.util.HashMap", ...}. This does not seem
> > consistent to me and even though I know that Jackson handles it well, it
> > seems that we'd better provide a consistent enclosing format that we know
> > is fixed whatever the enclosed data is, to make the automatic type
> > detection for other parsers in other libraries/languages easier. Does that
> > make sense ?
> >
> >> We should probably start documenting a table of supported types. (If
> >> there is one, please provide link)
> >>
> >> I wonder if it even makes sense to type numbers according to their
> >> memory model. As objects, Byte, Short, and Integer occupy the same
> >> space. Long isn't much more. So in Java we're not saving much space.
> >> Jackson will attempt to parse in order: int, long, BigInt, BigDecimal.
> >> The JSON JSR uses only BigDecimal. Some non-jvm languages don't even
> >> have this concept. Does anything in gremlin actually require this?
> >> I'm thinking that this is only going to be relevant at the domain
> >> model level. This way json native numbers can be used and not need
> >> typing.
> >>
> >> Additionally, I think that all things that will be typed should always
> >> be typed. For the use cases of ingesting a saved graph from a file, it
> >> can probably be assumed that the top-level objects are vertices since
> >> the graph is vertex-centric and everything else follows naturally.
> >> I'm not entirely sure what is required for submitting traversals to
> >> gremlin server from GLV. However, if this is used for the results
> >> from gremlin server then the results could start with any one of path,
> >> vertex, edge, property, vertex property, etc. So you'll need that type
> >> data there.
> >>
> >> --
> >> Robert Dale
> >>
> >> On Tue, Jul 12, 2016 at 8:35 AM, Marko Rodriguez <[email protected]>
> >> wrote:
> >> > Hi,
> >> >
> >> > I'm not following this PR too closely, so what I might be saying is
> >> > already known/argued against/etc.
> >> >
> >> > 1. I think we should go with Robert Dale's proposal of int32,
> >> > int64, Vertex, uuid, etc. instead of Java class names.
> >> > 2. In Java we then have a Map<String,Class> for typecasting
> >> > accordingly.
> >> > 3. This would make GraphSON 2.0 perfect for Bytecode
> >> > serialization in TINKERPOP-1278.
> >> > 4. I think that if a Vertex, Edge, etc. doesn't have
> >> > properties, outV, etc. then don't even have those fields in the
> >> > representation.
> >> > 5. Most of the serialization back and forth will be ReferenceXXX
> >> > elements and thus, don't create more Maps/lists for no reason;
> >> > less chars.
> >> >
> >> > For me, my interests with this work is all about a language agnostic way
> >> > of sending Gremlin traversal bytecode between different languages. This
> >> > work is exactly what I am looking for.
> >> >
> >> > Thanks,
> >> > Marko.
> >> >
> >> > http://markorodriguez.com
> >> >
> >> >
> >> >
> >> >> On Jul 9, 2016, at 9:48 AM, Stephen Mallette <[email protected]>
> >> >> wrote:
> >> >>
> >> >> With all the work on GLVs and the recent work on GraphSON 2.0, I think
> >> >> it's
> >> >> important that we have a solid, efficient, programming language neutral,
> >> >> lossless serialization format. Right now that format is GraphSON and it
> >> >> works for that purpose (ever more so with 2.0). Given some discussion
> >> >> on
> >> >> the GraphSON 2.0 PR driven a bit by Robert Dale:
> >> >>
> >> >> https://github.com/apache/tinkerpop/pull/351#issuecomment-231157389
> >> >>
> >> >> I wonder if we shouldn't consider another IO format that has Gremlin
> >> >> Server/GLVs in mind. At this point I'm not suggesting anything specific
> >> >> -
> >> >> I'm just hanging the idea out for further discussion and brain storming.
> >> >> Thoughts?
> >> >
> >>
> >>
> >>
> >> --
> >> Robert Dale
>
>
>
> --
> Robert Dale
Correct, these types weren't relevant... I only wanted to show you the
format...
However, I can't work out the structure behind the format you suggest, and I
can't form a clear, explicit picture of it from the example you provided in
the TP-1274 PR. Could you please give an example of how you would imagine the
serialized JSON of:
- a list of typed values, like List<UUID>
- a list of typed and untyped values, like a list with UUIDs and booleans
- a map of typed and untyped values
How would you define that format in a general way? Like what I did when saying
"- untyped : value
- typed : {"@type" : "typeName", "value" : value}"
Just trying to understand your point better.
Also, what downsides do you see with the format suggested above?
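
For concreteness, the "untyped : value / typed : {"@type" : ..., "value" : ...}"
rule above could be sketched in Python roughly as follows, applied to the three
cases listed earlier. This is only an illustration under assumptions: the
helper names encode and decode are hypothetical, the type name "uuid" is
borrowed from Marko's list, and none of this is how GraphSON or Jackson
actually implements typing.

```python
import json
import uuid

TYPE_KEY = "@type"   # key names taken from the rule quoted above
VALUE_KEY = "value"


def encode(obj):
    """Wrap values JSON cannot express natively; leave native ones untouched."""
    if isinstance(obj, uuid.UUID):
        # "uuid" as a type name is an assumption borrowed from Marko's list.
        return {TYPE_KEY: "uuid", VALUE_KEY: str(obj)}
    if isinstance(obj, list):
        return [encode(item) for item in obj]
    if isinstance(obj, dict):
        return {key: encode(val) for key, val in obj.items()}
    return obj  # bool, number, string, None: untyped


def decode(node):
    """Inverse of encode(); the wrapper looks the same inside a list or a map."""
    if isinstance(node, list):
        return [decode(item) for item in node]
    if isinstance(node, dict):
        if set(node) == {TYPE_KEY, VALUE_KEY}:
            if node[TYPE_KEY] == "uuid":
                return uuid.UUID(node[VALUE_KEY])
            raise ValueError("unknown type: " + str(node[TYPE_KEY]))
        return {key: decode(val) for key, val in node.items()}
    return node


u1 = uuid.UUID("00000000-0000-0000-0000-000000000001")
u2 = uuid.UUID("00000000-0000-0000-0000-000000000002")

typed_list = [u1, u2]                    # like List<UUID>
mixed_list = [u1, True]                  # UUIDs and booleans
mixed_map = {"id": u2, "active": False}  # typed and untyped map values

for sample in (typed_list, mixed_list, mixed_map):
    print(json.dumps(encode(sample)))
```

One caveat of this sketch: a plain map whose only keys happen to be "@type"
and "value" would be read back as a typed value.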