First and foremost: yes, we would still have both, so nothing goes away,
which is good.

As for the rest, I understand. When you query a document in Mongo, for
example, you get the whole document. Most folks coming from other systems
would expect that, I guess. The problem is that I don't think we can go in
the other direction and make this work consistently between OLAP and OLTP,
which is its own worry. We would get the same confusion when someone issues
a query against Spark and gets a ReferenceVertex rather than a
DetachedVertex. I'm not tied to the idea of making ReferenceVertex the
default in Gremlin Server for 3.3.x; I guess we can put that up for debate
when the time comes.
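
To make the trade-off concrete, here is a toy sketch in plain Java. It does
not use TinkerPop at all: ToyVertex, asReference and asDetached are
hypothetical names made up for illustration, and the number of fields is
only a rough stand-in for serialized size, not a measurement of Gryo or
GraphSON.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class VertexPayloadSketch {

    // Hypothetical stand-in for a stored vertex; not a TinkerPop class.
    static final class ToyVertex {
        final Object id;
        final String label;
        // Multi-properties: one key may hold several values.
        final Map<String, List<Object>> properties = new LinkedHashMap<>();

        ToyVertex(Object id, String label) {
            this.id = id;
            this.label = label;
        }

        ToyVertex property(String key, Object value) {
            properties.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
            return this;
        }
    }

    // Reference-style view: only the identifier travels.
    static List<Object> asReference(ToyVertex v) {
        List<Object> out = new ArrayList<>();
        out.add(v.id);
        return out;
    }

    // Detached-style view: identifier, label and every property value travel.
    static List<Object> asDetached(ToyVertex v) {
        List<Object> out = new ArrayList<>();
        out.add(v.id);
        out.add(v.label);
        for (Map.Entry<String, List<Object>> e : v.properties.entrySet()) {
            for (Object value : e.getValue()) {
                out.add(e.getKey());
                out.add(value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        ToyVertex marko = new ToyVertex(1, "person")
                .property("name", "marko")
                .property("location", "san diego")
                .property("location", "santa fe");
        System.out.println("reference fields: " + asReference(marko).size());
        System.out.println("detached fields:  " + asDetached(marko).size());
    }
}
```

The reference-style view stays the same size no matter how many properties
the vertex carries, while the detached-style view grows with every
(multi-)property value. That growth is the cost in question; the flip side
is that the caller has to ask for properties explicitly.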

On Thu, May 19, 2016 at 9:20 AM, Dylan Millikin <dylan.milli...@gmail.com>
wrote:

> Hey,
>
> I'm a little torn here. On one side it's good to have the option of
> returning a ReferenceVertex, which is currently really complicated to do.
> On the other hand this new behavior is far from intuitive and has some
> issues that are hard to overcome.
>
> If I'm understanding you correctly, both behaviors would still exist and
> we would just switch the default mode, right? I would like to debate whether
> or not this new behavior should be the default (I don't really know where I
> stand, but just for the sake of being thorough).
>
> Setting aside the actual issues this introduces (I'm pretty sure it's only
> going to concern very few people, and they can use whatever conf), people
> coming from the SQL world who already have trouble adjusting to Gremlin
> will find this counter-intuitive. After all, these people couldn't care less
> about ReferenceVertex, whereas it's very natural to query a vertex and get
> its info. Not to mention that the ways of getting the properties differ
> between handling a vertex directly and using a traversal, and are not very
> consistent.
>
> Again, I don't really know where I stand on this, I just wanted to be
> thorough.
>
> Thoughts?
>
> On Wed, May 18, 2016 at 4:04 PM, Stephen Mallette <spmalle...@gmail.com>
> wrote:
>
> > I'll try to keep this simple, as serialization tends to be anything but
> > simple....
> >
> > Forgetting GraphML, which has its own rules, GraphSON and Gryo are the two
> > key serialization modules that we have in IO. We use these for both
> > serialization to disk and serialization over the network in Gremlin
> > Server. If you issue a request like:
> >
> > g.V()
> >
> > it returns vertices obviously. For both Gryo and GraphSON, those vertices
> > are converted to DetachedVertex which includes the properties of the
> > Vertex. This can be tremendously expensive, especially if the graph
> > supports multi-properties.
> >
> > I think that Gremlin Server should take a hint from OLAP in relation to
> > this issue. With OLAP, a Vertex is converted to a ReferenceVertex where
> > we only get the element identifier passed around.
> >
> > gremlin> graph = GraphFactory.open('conf/hadoop/hadoop-gryo.properties')
> > ==>hadoopgraph[gryoinputformat->gryooutputformat]
> > gremlin> g = graph.traversal().withComputer(SparkGraphComputer)
> > ==>graphtraversalsource[hadoopgraph[gryoinputformat->gryooutputformat], sparkgraphcomputer]
> > gremlin> l = g.V().toList();[]
> > gremlin> l[0].class
> > ==>class org.apache.tinkerpop.gremlin.structure.util.reference.ReferenceVertex
> >
> > If you want more information, it is up to you to issue your query to
> > request that information - for example:
> >
> > g.V().valueMap(true)
> >
> > I think Gremlin Server should work in the same fashion (i.e. return a
> > ReferenceVertex when a Vertex is serialized over the network). It would
> > ease up on serialization overhead and force users to be more explicit
> > about the data that they want, which would prevent unnecessary
> > performance surprises. This change might also be nice for the efficiency
> > of RemoteGraph/Connection implementations.
> >
> > This has bothered me for a while, but we carried over the pattern from
> > TinkerPop 2.x of sending back properties and I've been concerned about
> > introducing a break in trying to improve that.  I dug into it more today
> > and my analysis seems to indicate that this change can occur without
> > breaking all the code that's currently out there. I think that we could
> > keep the existing serialization model and simply add in the
> > ReferenceVertex approach as a configuration option for 3.2.1 and then
> > make it the default for 3.3.x.
> >
> > If there are no objections in the next 72 hours (Saturday, May 21, 2016,
> > 4pm EST) I'll assume lazy consensus and move forward.
> >
>
