> First, is there a wiki that we can keep updated with decisions or at least decision points? I know there's an old wiki, but is there/will there be a new wiki?
No - we don't have a wiki. Design decisions tend to get trapped in the mailing list (or JIRA) which isn't so good. Maybe that's a separate discussion.

> Neo4j via NeoGraph appears to do the right thing for vertex IDs and
> properties. It treats all types, primitive or object, from byte to long, double, float as numbers.

Perhaps we could take a stronger stance on this in the test cases? Does anyone know what graphs this would impact besides Titan and TinkerGraph (I suspect DSE Graph, but not 100% sure)?

On Wed, Jul 13, 2016 at 1:49 PM, Robert Dale <robd...@gmail.com> wrote:

> First, is there a wiki that we can keep updated with decisions or at
> least decision points? I know there's an old wiki, but is there/will
> there be a new wiki?
>
> Stephen, IMO, that's still bad behavior. That says to me a number is
> not a number. But, yes, schemaless does allow one to put crap in and
> get crap out. So designers should be aware of these types of pitfalls.
> Neo4j via NeoGraph appears to do the right thing for vertex IDs and
> properties. It treats all types, primitive or object, from byte to
> long, double, float as numbers. This is pretty standard behavior in
> SQL, JDBC drivers, and other NoSQL technologies.
>
> On Wed, Jul 13, 2016 at 11:30 AM, Stephen Mallette <spmalle...@gmail.com> wrote:
> > Marko, the namespacing idea seems smart.
> >
> > Robert, I think other graphs have similar behavior to TinkerGraph's
> > default. In Titan, the absence of a schema (default, obviously) produces
> > this:
> >
> > gremlin> graph = TitanFactory.open('conf/titan-cassandra-es.properties')
> > ==>standardtitangraph[cassandrathrift:[127.0.0.1]]
> > gremlin> graph.addVertex("n",100D)
> > ==>v[4288]
> > gremlin> graph.traversal().V().has('n',100f)
> > gremlin> graph.traversal().V().has('n',100d)
> > ==>v[4288]
> >
> > This kind of problem has caused trouble for years and years in TinkerPop
> > and allowing the type to be embedded seemed like a good solution.
> > Of course, you bring up a good point about javascript - to this point we've
> > relied on JS devs to conform to java/groovy types by forcing conversion in
> > their gremlin scripts or configuring their graphs to avoid use of types
> > that would produce these kinds of ambiguous results.
> >
> > On Wed, Jul 13, 2016 at 9:51 AM, Robert Dale <robd...@gmail.com> wrote:
> >
> >> And just to be clear, I'm not necessarily disagreeing. But I think
> >> it's important to understand where and why it's necessary.
> >>
> >> For example, if I'm writing a gremlin script (string), I don't type my
> >> input numbers. It's rightly converted by the underlying architecture.
> >> (I'm guessing groovy which has enhanced number support). Also, if a
> >> GLV is submitting typed numbers, how would that work? For example, in
> >> Javascript?
> >>
> >> On Wed, Jul 13, 2016 at 9:16 AM, Robert Dale <robd...@gmail.com> wrote:
> >> > Hi, Stephen. I think that's a bad example. You may recall I brought
> >> > up that issue in the forum. However, it's actually attributed to the
> >> > default ID manager of ANY (for historical) which I think is a really
> >> > bad default (and reason) because it only leads to confusion. Java is
> >> > one of the few, if not only, brain-damaged languages where 5 != 5 !=
> >> > 5. In Java, number objects must be coerced into like form for
> >> > comparison. The other ID managers do this coercion. Saner languages
> >> > do this under the covers.
> >> >
> >> > On Wed, Jul 13, 2016 at 8:56 AM, Stephen Mallette <spmalle...@gmail.com> wrote:
> >> >> Robert, thanks for joining this discussion.
> >> >>
> >> >>> I wonder if it even makes sense to type numbers according to their
> >> >>> memory model. As objects, Byte, Short, and Integer occupy the same
> >> >>> space. Long isn't much more. So in Java we're not saving much space.
> >> >>> Jackson will attempt to parse in order: int, long, BigInt, BigDecimal.
> >> >>> The JSON JSR uses only BigDecimal.
> >> >>> Some non-jvm languages don't even
> >> >>> have this concept. Does anything in gremlin actually require this?
> >> >>
> >> >> If the intended numeric type isn't preserved, weird things can happen with
> >> >> graphs that have a schema (like Titan/DSE). Even TinkerGraph using the
> >> >> default ID manager will not be happy if you try to do a lookup of Long
> >> >> identifiers with an Integer:
> >> >>
> >> >> gremlin> graph = TinkerFactory.createModern()
> >> >> ==>tinkergraph[vertices:6 edges:6]
> >> >> gremlin> graph.vertices(1)
> >> >> ==>v[1]
> >> >> gremlin> graph.vertices(1L)
> >> >> gremlin>
> >> >>
> >> >> On Wed, Jul 13, 2016 at 8:17 AM, Robert Dale <robd...@gmail.com> wrote:
> >> >>
> >> >>> Marko, I agree that empty object properties should not be represented.
> >> >>> I think if you saw that in an example then it was probably for
> >> >>> demonstration purposes.
> >> >>>
> >> >>> Kevin, can you expand on this comment:
> >> >>>
> >> >>> > the format you suggest would lead to the same inconsistencies as in
> >> >>> > GraphSON 1.0.
> >> >>> > Since the type is at the same level than the data itself, whether the
> >> >>> > container is an Array or an Object
> >> >>> > https://github.com/apache/tinkerpop/pull/351#issuecomment-231351653
> >> >>>
> >> >>> What exactly are the inconsistencies? What is the problem in
> >> >>> determining an array or object?
> >> >>> This is a natural JSON array (or list): []
> >> >>> This is a natural JSON object: {}
> >> >>>
> >> >>> Type at the object level is a common pattern and supported feature of
> >> >>> Jackson. Also, GeoJSON would be a natural fit as it also stores
> >> >>> 'type' at the object level. Titan supports GeoJSON currently. I
> >> >>> wonder if it would make sense to promote geometry to gremlin.
> >> >>>
> >> >>> We should probably start documenting a table of supported types.
> >> >>> (If there is one, please provide link)
> >> >>>
> >> >>> I wonder if it even makes sense to type numbers according to their
> >> >>> memory model. As objects, Byte, Short, and Integer occupy the same
> >> >>> space. Long isn't much more. So in Java we're not saving much space.
> >> >>> Jackson will attempt to parse in order: int, long, BigInt, BigDecimal.
> >> >>> The JSON JSR uses only BigDecimal. Some non-jvm languages don't even
> >> >>> have this concept. Does anything in gremlin actually require this?
> >> >>> I'm thinking that this is only going to be relevant at the domain
> >> >>> model level. This way json native numbers can be used and not need
> >> >>> typing.
> >> >>>
> >> >>> Additionally, I think that all things that will be typed should always
> >> >>> be typed. For the use cases of ingesting a saved graph from a file, it
> >> >>> can probably be assumed that the top-level objects are vertices since
> >> >>> the graph is vertex-centric and everything else follows naturally.
> >> >>> I'm not entirely sure what is required for submitting traversals to
> >> >>> gremlin server from GLV. However, if this is used for the results
> >> >>> from gremlin server then the results could start with any one of path,
> >> >>> vertex, edge, property, vertex property, etc. So you'll need that type
> >> >>> data there.
> >> >>>
> >> >>> --
> >> >>> Robert Dale
> >> >>>
> >> >>> On Tue, Jul 12, 2016 at 8:35 AM, Marko Rodriguez <okramma...@gmail.com> wrote:
> >> >>> > Hi,
> >> >>> >
> >> >>> > I’m not following this PR too closely so what I might be saying is
> >> >>> > already known/argued against/etc.
> >> >>> >
> >> >>> > 1. I think we should go with Robert Dale’s proposal of int32,
> >> >>> >    int64, Vertex, uuid, etc. instead of Java class names.
> >> >>> > 2. In Java we then have a Map<String,Class> for typecasting
> >> >>> >    accordingly.
> >> >>> > 3. This would make GraphSON 2.0 perfect for Bytecode
> >> >>> >    serialization in TINKERPOP-1278.
> >> >>> > 4. I think that if a Vertex, Edge, etc. doesn’t have properties,
> >> >>> >    outV, etc. then don’t even have those fields in the representation.
> >> >>> > 5. Most of the serialization back and forth will be ReferenceXXX
> >> >>> >    elements and thus, don’t create more Maps/lists for no reason - less chars.
> >> >>> >
> >> >>> > For me, my interest in this work is all about a language agnostic way
> >> >>> > of sending Gremlin traversal bytecode between different languages. This
> >> >>> > work is exactly what I am looking for.
> >> >>> >
> >> >>> > Thanks,
> >> >>> > Marko.
> >> >>> >
> >> >>> > http://markorodriguez.com
> >> >>> >
> >> >>> >> On Jul 9, 2016, at 9:48 AM, Stephen Mallette <spmalle...@gmail.com> wrote:
> >> >>> >>
> >> >>> >> With all the work on GLVs and the recent work on GraphSON 2.0, I think it's
> >> >>> >> important that we have a solid, efficient, programming language neutral,
> >> >>> >> lossless serialization format. Right now that format is GraphSON and it
> >> >>> >> works for that purpose (ever more so with 2.0). Given some discussion on
> >> >>> >> the GraphSON 2.0 PR driven a bit by Robert Dale:
> >> >>> >>
> >> >>> >> https://github.com/apache/tinkerpop/pull/351#issuecomment-231157389
> >> >>> >>
> >> >>> >> I wonder if we shouldn't consider another IO format that has Gremlin
> >> >>> >> Server/GLVs in mind. At this point I'm not suggesting anything specific -
> >> >>> >> I'm just hanging the idea out for further discussion and brain storming.
> >> >>> >> Thoughts?
> >> >>> >
> >> >>>
> >> >>> --
> >> >>> Robert Dale
> >> >
> >> > --
> >> > Robert Dale
> >>
> >> --
> >> Robert Dale
> >
> --
> Robert Dale
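
For anyone reading the thread later, the "5 != 5 != 5" point about Java can be shown in plain Java without any graph at all. The sketch below is only an illustration: `NumberCoercion`, `sameValue`, and `toBigDecimal` are hypothetical names made up for this example, not TinkerPop, Titan, or Neo4j APIs, and coercing through `BigDecimal` is just one possible way to get number-is-a-number comparison. It assumes only the standard boxed types and finite floating-point values (NaN and infinities are not handled).

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical helper for illustration only; not part of any TinkerPop API.
class NumberCoercion {

    // Coerce a boxed number into BigDecimal so values of different types can be compared.
    // Assumption: floating-point inputs are finite (NaN/Infinity would throw here).
    static BigDecimal toBigDecimal(Number n) {
        if (n instanceof BigDecimal) {
            return (BigDecimal) n;
        }
        if (n instanceof BigInteger) {
            return new BigDecimal((BigInteger) n);
        }
        if (n instanceof Float || n instanceof Double) {
            return BigDecimal.valueOf(n.doubleValue());
        }
        // Byte, Short, Integer, Long all fit in a long without loss.
        return BigDecimal.valueOf(n.longValue());
    }

    // Compare by numeric value; compareTo ignores scale, so 100 and 100.0 match.
    static boolean sameValue(Number a, Number b) {
        return toBigDecimal(a).compareTo(toBigDecimal(b)) == 0;
    }

    public static void main(String[] args) {
        Integer i = Integer.valueOf(5);
        Long l = Long.valueOf(5L);

        // Boxed equals() is type-sensitive: an Integer is never equal to a Long.
        System.out.println(i.equals(l));        // false
        // After coercion the two compare as the same number.
        System.out.println(sameValue(i, l));    // true
        // Same idea for the float/double property lookup shown in the thread.
        System.out.println(sameValue(100f, 100d)); // true
    }
}
```

The type-sensitive `equals()` on boxed numbers is what a lookup keyed on the raw object runs into; a store that coerces before comparing behaves like the second and third calls.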