Ah yeah, the lack of consistency between OLTP and OLAP -is- an issue. You're
right, let's let this sit and review it at a later time.


On Thu, May 19, 2016 at 9:58 AM, Stephen Mallette <spmalle...@gmail.com>
wrote:

> So, first and foremost, yes we would still have both, so nothing goes away
> - so that's good.
>
> As for the rest, I understand. When you query a document in Mongo, you get
> the whole document, for example. Most folks would expect that, coming from
> other systems I guess. The problem is I don't think we can go the other
> direction to make it work consistently between OLAP and OLTP, which is its
> own worry. We would get the same confusion when someone issues their query
> against Spark and gets a ReferenceVertex rather than a DetachedVertex. I'm
> not tied to the idea of making ReferenceVertex the default in Gremlin
> Server for 3.3.x - I guess we can put that up for debate when the time
> comes.
>
> On Thu, May 19, 2016 at 9:20 AM, Dylan Millikin <dylan.milli...@gmail.com>
> wrote:
>
> > Hey,
> >
> > I'm a little torn here. On one side it's good to have the option of
> > returning a ReferenceVertex, which is currently really complicated to do.
> > On the other hand, this new behavior is far from intuitive and has some
> > issues that are difficult to overcome.
> >
> > If I'm understanding you correctly, both behaviors would still live on;
> > we would just switch the default mode, right? I would like to debate whether
> > or not this new behavior should be the default (I don't really know where I
> > stand, but just for the sake of being thorough).
> >
> > Setting aside the actual issues this introduces (I'm pretty sure they will
> > only concern very few people, who can use whatever conf they need), people
> > coming from the SQL world who already have trouble adjusting to Gremlin
> > will find this counter-intuitive. After all, these people couldn't care less
> > about ReferenceVertex; on the other hand, it's very natural to query a
> > vertex and get its info. Not to mention that the ways of getting the
> > properties differ between handling a vertex directly and using a
> > traversal, and are not very consistent.
> >
> > Again, I don't really know where I stand on this, I just wanted to be
> > thorough.
> >
> > Thoughts?
> >
> > On Wed, May 18, 2016 at 4:04 PM, Stephen Mallette <spmalle...@gmail.com>
> > wrote:
> >
> > > I'll try to keep this simple, as serialization tends to be anything but
> > > simple....
> > >
> > > Forgetting GraphML, which has its own rules, GraphSON and Gryo are the two
> > > key serialization modules that we have in IO.  We use these for both
> > > serialization to disk as well as serialization over the network in Gremlin
> > > Server. If you issue a request like:
> > >
> > > g.V()
> > >
> > > it returns vertices, obviously. For both Gryo and GraphSON, those vertices
> > > are converted to DetachedVertex, which includes the properties of the
> > > Vertex. This can be tremendously expensive, especially if the graph
> > > supports multi-properties.
> > >
> > > I think that Gremlin Server should take a hint from OLAP in relation to
> > > this issue. With OLAP, a Vertex is converted to a ReferenceVertex where we
> > > only get the element identifier passed around.
> > >
> > > gremlin> graph = GraphFactory.open('conf/hadoop/hadoop-gryo.properties')
> > > ==>hadoopgraph[gryoinputformat->gryooutputformat]
> > > gremlin> g = graph.traversal().withComputer(SparkGraphComputer)
> > > ==>graphtraversalsource[hadoopgraph[gryoinputformat->gryooutputformat], sparkgraphcomputer]
> > > gremlin> l = g.V().toList();[]
> > > gremlin> l[0].class
> > > ==>class org.apache.tinkerpop.gremlin.structure.util.reference.ReferenceVertex
> > >
> > > If you want more information, it is up to you to issue your query to
> > > request that information - for example:
> > >
> > > g.V().valueMap(true)
> > >
> > > I think Gremlin Server should work in the same fashion (i.e. return a
> > > ReferenceVertex when a Vertex is serialized over the network).  It would
> > > ease up on serialization overhead and force users to be more explicit about
> > > the data that they want, which would prevent unnecessary performance
> > > surprises. This change might also be nice for the efficiency of
> > > RemoteGraph/Connection implementations.
> > >
> > > This has bothered me for a while, but we carried over the pattern from
> > > TinkerPop 2.x of sending back properties and I've been concerned about
> > > introducing a break in trying to improve that.  I dug into it more today
> > > and my analysis seems to indicate that this change can occur without
> > > breaking all the code that's currently out there. I think that we could
> > > keep the existing serialization model and simply add in the ReferenceVertex
> > > approach as a configuration option for 3.2.1 and then make it the default
> > > for 3.3.x.
> > >
> > > If there are no objections in the next 72 hours (Saturday, May 21, 2016,
> > > 4pm EST) I'll assume lazy consensus and move forward.
> > >
> >
>
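
To make the trade-off discussed in the thread concrete, here is a minimal
sketch in plain Python. It does not use TinkerPop, GraphSON, or Gryo. The
vertex dictionary, `to_detached`, and `to_reference` are hypothetical names
invented for this illustration; they only mimic the idea of a
DetachedVertex (identifier plus all properties) versus a ReferenceVertex
(identifier only, as described above), and JSON stands in for the real wire
format.

```python
import json

# Hypothetical in-memory vertex, NOT a TinkerPop class. The "location" key
# holds several values to mimic a multi-property.
vertex = {
    "id": 1,
    "label": "person",
    "properties": {
        "name": [{"id": 0, "value": "marko"}],
        "location": [
            {"id": 6, "value": "san diego"},
            {"id": 7, "value": "santa cruz"},
            {"id": 8, "value": "brussels"},
        ],
    },
}


def to_detached(v):
    """Mimic a 'detached' form: identifier, label, and every property."""
    return {"id": v["id"], "label": v["label"], "properties": v["properties"]}


def to_reference(v):
    """Mimic a 'reference' form: only the element identifier is kept."""
    return {"id": v["id"]}


# Compare the serialized payload sizes of the two forms.
detached_bytes = len(json.dumps(to_detached(vertex)).encode("utf-8"))
reference_bytes = len(json.dumps(to_reference(vertex)).encode("utf-8"))

print("detached payload bytes: ", detached_bytes)
print("reference payload bytes:", reference_bytes)
```

The gap between the two sizes grows with the number of properties per
vertex, which is the overhead concern raised above. A client that receives
only the reference form would have to ask for properties explicitly, for
example with a traversal such as g.V().valueMap(true).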
