Hi,

Just to check: will it be configurable whether the server returns a
ReferenceVertex or the Vertex together with its properties?

Thanks
Pieter

On 19/05/2016 16:39, Stephen Mallette wrote:
> Ok - that's what i thought - just wanted to be certain I didn't
> misunderstand you. I'm fine with that. Thanks.
>
> On Thu, May 19, 2016 at 10:18 AM, Dylan Millikin <dylan.milli...@gmail.com>
> wrote:
>
>> When to switch default behavior. At this point there's no question about
>> adding this feature. It's definitely a requirement. Once the feature is in
>> place we can perhaps discuss it more or put it to a vote (which behavior to
>> default on what version)?
>>
>> On Thu, May 19, 2016 at 10:15 AM, Stephen Mallette <spmalle...@gmail.com>
>> wrote:
>>
>>> sorry - what aspect of this do you want to have "sit" for later review?
>>>
>>> On Thu, May 19, 2016 at 10:10 AM, Dylan Millikin <dylan.milli...@gmail.com>
>>> wrote:
>>>
>>>> Ah yeah the lack of consistency between OLTP and OLAP -is- an issue.
>>>> You're right, let's let this sit and review it at a later time.
>>>>
>>>>
>>>> On Thu, May 19, 2016 at 9:58 AM, Stephen Mallette <spmalle...@gmail.com>
>>>> wrote:
>>>>
>>>>> So, first and foremost, yes we would still have both, so nothing goes
>>>>> away - so that's good.
>>>>>
>>>>> As for the rest, I understand. When you query a document in Mongo, you
>>>>> get the whole document for example. Most folks would expect that, coming
>>>>> from other systems I guess. The problem is I don't think we can go the
>>>>> other direction to make it work consistently between OLAP and OLTP, which
>>>>> is its own worry. We would get the same confusion when someone issues
>>>>> their query against Spark and gets a ReferenceVertex rather than a
>>>>> DetachedVertex. I'm not tied to the idea of making ReferenceVertex the
>>>>> default in Gremlin Server for 3.3.x - I guess we can put that up for
>>>>> debate when the time comes.
>>>>>
>>>>> On Thu, May 19, 2016 at 9:20 AM, Dylan Millikin <dylan.milli...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hey,
>>>>>>
>>>>>> I'm a little torn here. On one side it's good to have the option of
>>>>>> returning a ReferenceVertex, which is currently really complicated to do.
>>>>>> On the other hand this new behavior is far from intuitive and has some
>>>>>> issues that are hard to overcome.
>>>>>>
>>>>>> If I'm understanding you correctly, both behaviors would still live on
>>>>>> and we would just switch the default mode, right? I would like to debate
>>>>>> whether or not this new behavior should be the default (I don't really
>>>>>> know where I stand, but just for the sake of being thorough).
>>>>>>
>>>>>> Setting aside the actual issues this introduces (I'm pretty sure they will
>>>>>> only concern very few people, and they can use whatever conf they need),
>>>>>> people coming from the SQL world who already have trouble adjusting to
>>>>>> Gremlin will find this counter-intuitive. After all, these people couldn't
>>>>>> care less about ReferenceVertex; on the other hand, it's very natural to
>>>>>> query a vertex and get its info. Not to mention that the ways of getting
>>>>>> the properties differ between handling a vertex directly and using a
>>>>>> traversal, and are not very consistent.
>>>>>>
>>>>>> Again, I don't really know where I stand on this, I just wanted to be
>>>>>> thorough.
>>>>>>
>>>>>> Thoughts?
>>>>>>
>>>>>> On Wed, May 18, 2016 at 4:04 PM, Stephen Mallette <spmalle...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I'll try to keep this simple, as serialization tends to be anything but
>>>>>>> simple....
>>>>>>>
>>>>>>> Forgetting GraphML which has its own rules, GraphSON and Gryo are the two
>>>>>>> key serialization modules that we have in IO.  We use these for both
>>>>>>> serialization to disk as well as serialization over the network in
>>>>>>> Gremlin Server. If you issue a request like:
>>>>>>>
>>>>>>> g.V()
>>>>>>>
>>>>>>> it returns vertices obviously. For both Gryo and GraphSON, those vertices
>>>>>>> are converted to DetachedVertex which includes the properties of the
>>>>>>> Vertex. This can be tremendously expensive, especially if the graph
>>>>>>> supports multi-properties.
>>>>>>>
>>>>>>> I think that Gremlin Server should take a hint from OLAP in relation to
>>>>>>> this issue. With OLAP, a Vertex is converted to a ReferenceVertex where
>>>>>>> we only get the element identifier passed around.
>>>>>>>
>>>>>>> gremlin> graph = GraphFactory.open('conf/hadoop/hadoop-gryo.properties')
>>>>>>> ==>hadoopgraph[gryoinputformat->gryooutputformat]
>>>>>>> gremlin> g = graph.traversal().withComputer(SparkGraphComputer)
>>>>>>> ==>graphtraversalsource[hadoopgraph[gryoinputformat->gryooutputformat], sparkgraphcomputer]
>>>>>>> gremlin> l = g.V().toList();[]
>>>>>>> gremlin> l[0].class
>>>>>>> ==>class org.apache.tinkerpop.gremlin.structure.util.reference.ReferenceVertex
>>>>>>>
>>>>>>> If you want more information, it is up to you to issue your query to
>>>>>>> request that information - for example:
>>>>>>>
>>>>>>> g.V().valueMap(true)
>>>>>>>
>>>>>>> I think Gremlin Server should work in the same fashion (i.e. return a
>>>>>>> ReferenceVertex when a Vertex is serialized over the network). It would
>>>>>>> ease up on serialization overhead and force users to be more explicit
>>>>>>> about the data that they want which would prevent unnecessary
>>>>>>> performance surprises. This change might also be nice for the efficiency
>>>>>>> of RemoteGraph/Connection implementations.
>>>>>>>
>>>>>>> This has bothered me for a while, but we carried over the pattern from
>>>>>>> TinkerPop 2.x of sending back properties and I've been concerned about
>>>>>>> introducing a break in trying to improve that.  I dug into it more today
>>>>>>> and my analysis seems to indicate that this change can occur without
>>>>>>> breaking all the code that's currently out there. I think that we could
>>>>>>> keep the existing serialization model and simply add in the
>>>>>>> ReferenceVertex approach as a configuration option for 3.2.1 and then
>>>>>>> make it the default for 3.3.x.
>>>>>>>
>>>>>>> If there are no objections in the next 72 hours (Saturday, May 21, 2016,
>>>>>>> 4pm EST) I'll assume lazy consensus and move forward.
>>>>>>>
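
The trade-off in the original proposal (id-only "reference" results versus
"detached" results that carry every property, selected by a configuration
option) can be sketched roughly in plain Java. This is only a toy model under
loud assumptions: VertexSketch, ToyVertex, toReference, toDetached and the
referenceOnly flag are hypothetical names invented for illustration. They are
not TinkerPop classes, and this is not how Gremlin Server implements or
configures serialization.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration only; NOT TinkerPop classes.
class VertexSketch {

    // Toy vertex: an id, a label, and multi-properties (key -> list of values).
    static final class ToyVertex {
        final Object id;
        final String label;
        final Map<String, List<Object>> properties;

        ToyVertex(Object id, String label, Map<String, List<Object>> properties) {
            this.id = id;
            this.label = label;
            this.properties = properties;
        }
    }

    // "Reference" style: only the element identifier is passed around.
    static Map<String, Object> toReference(ToyVertex v) {
        Map<String, Object> out = new LinkedHashMap<>();
        out.put("id", v.id);
        return out;
    }

    // "Detached" style: the identifier plus the label and every property value.
    static Map<String, Object> toDetached(ToyVertex v) {
        Map<String, Object> out = new LinkedHashMap<>();
        out.put("id", v.id);
        out.put("label", v.label);
        out.put("properties", v.properties);
        return out;
    }

    // Hypothetical configuration switch between the two behaviors.
    static Map<String, Object> serialize(ToyVertex v, boolean referenceOnly) {
        return referenceOnly ? toReference(v) : toDetached(v);
    }

    public static void main(String[] args) {
        Map<String, List<Object>> props = new LinkedHashMap<>();
        props.put("name", List.of("marko"));
        // A multi-property: the detached form grows with every value it holds.
        props.put("location", List.of("san diego", "santa cruz", "brussels"));
        ToyVertex v = new ToyVertex(1, "person", props);

        System.out.println(serialize(v, true));   // id only
        System.out.println(serialize(v, false));  // id, label and all properties
    }
}
```

With the flag on, a client that wants properties has to ask for them
explicitly, which is the role g.V().valueMap(true) plays in the proposal.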
