Now that the Gryo 3.0 PR has been issued and is awaiting review, I've moved
on to GraphSON 3.0 to hopefully resolve the following problems:

https://issues.apache.org/jira/browse/TINKERPOP-1427
https://issues.apache.org/jira/browse/TINKERPOP-1574
https://issues.apache.org/jira/browse/TINKERPOP-1414

Those three issues basically mean that GraphSON 3.0 will be the default for
TinkerPop 3.3.0. There will be no such thing as "untyped" GraphSON anymore
- that really doesn't make sense given the move toward bytecode/GLVs. It
also leaves users with one less choice to make. Finally, this revised
version of GraphSON will take care of "collections" that do not serialize
well for Gremlin needs due to limitations of JSON itself. The best example
of this deficiency is in maps where keys can't be anything but strings. In
Gremlin it is a common pattern to do stuff like:

g.V(1).out().groupCount()

As it stands GLVs have no support for that result, because it is a map
keyed by vertices rather than strings. Other than that, there are no big
changes to GraphSON - it works pretty well and has improved dramatically in
speed for 3.2.5 so it should be a good basis for GLVs to
build from. That's about it - hoping to get all these items fixed up by end
of next week and ready for a pull request.
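
To make the map limitation concrete, here is a rough sketch in plain Python (standard library only, no TinkerPop involved). The (id, label) tuples standing in for vertices, the helper names, and the "entries" layout are all assumptions made up for illustration; this is not the GraphSON 3.0 format, just one way a typed list of pairs could get around string-only keys.

```python
import json

# Stand-in for a groupCount() result keyed by vertices; each vertex is
# modelled here as a hypothetical (id, label) tuple.
counts = {(3, "software"): 1, (2, "person"): 1, (4, "person"): 1}


def plain_json_fails(result):
    """Return True if json.dumps rejects the map because of its keys."""
    try:
        json.dumps(result)
    except TypeError:
        return True
    return False


def encode_map(result):
    """Encode a map as a list of [key, value] pairs (hypothetical layout)."""
    pairs = [[list(key), value] for key, value in result.items()]
    return json.dumps({"type": "map", "entries": pairs})


def decode_map(text):
    """Rebuild the map, turning each key back into a hashable tuple."""
    doc = json.loads(text)
    return {tuple(key): value for key, value in doc["entries"]}


if __name__ == "__main__":
    print(plain_json_fails(counts))
    print(decode_map(encode_map(counts)) == counts)
```

Even simple non-string keys do not survive a plain JSON round trip: json.loads(json.dumps({1: 2})) comes back keyed by the string "1", which is the same kind of type loss a GLV would hit.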



On Wed, Jun 28, 2017 at 8:23 AM, Stephen Mallette <[email protected]>
wrote:

> hahaha - i can't win. there was other work done on Gryo 3.0 a pretty long
> time ago, that made it useful around Request/ResponseMessage serialization
> in Gremlin Server. I'd long forgotten about that. I guess I will finish up
> the work - the work just won't be harried by anything that had anything to
> do with TINKERPOP-1592. It was almost there anyway I think.
>
> On Tue, Jun 27, 2017 at 3:02 PM, Stephen Mallette <[email protected]>
> wrote:
>
>> You had me at the problem with multi-properties. withDetachment() doesn't
>> seem to address that well, which makes this feel a lot more hacky than
>> when it started.
>>
>> I think not doing withDetachment() reduces the need and urgency to do
>> Gryo 3.0. HaltedTraverserStrategy already works as does all the
>> serialization that goes with it. The only issue that is messed up is that I
>> need to "fix" TraversalMetrics serialization in Gryo 1.0, but we've never
>> been super consistent about versioning Gryo anyway, so i wonder if that
>> minor break matters on a version where we are allowing for breaks. I think
>> i'll kill my efforts on Gryo 3.0 for now and save some code.
>>
>>
>>
>> On Tue, Jun 27, 2017 at 11:17 AM, Marko Rodriguez <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>> In this email I will argue why TINKERPOP-1592 is a bad idea.
>>>
>>> GremlinServer does “too much.” In the future, the concept for
>>> GremlinServer for TinkerPop4 should be “network I/O to the Gremlin virtual
>>> machine.” That is it. So what is GremlinServer doing now that is bad?
>>> Currently GremlinServer is “detaching” elements from the graph and
>>> populating those elements with more data than the user requested. What do I
>>> mean?:
>>>
>>> g.V(1)
>>>
>>> Currently that returns a DetachedVertex which is vertex 1’s id, label,
>>> and all of its properties. Now here is the crazy part. What if there are 1M
>>> properties (e.g. timestamped multi-properties on sensor network vertices).
>>> Doh! However, what did the user ask for? They asked for v[1] and that is
>>> all they should get. This is known as ReferenceVertex. If they wanted some
>>> subset of the timestamped sensor network data, they should have done:
>>>
>>> g.V(1).properties('sensor').has('timestamp',gt(2017))
>>>
>>> Thus, we should only return the data people explicitly ask for in the
>>> traversal. The TINKERPOP-1592 ticket is a hack for DetachedVertex by
>>> allowing users to specify withDetachment(properties:["not_sensor"]),
>>> but then it is not expressive enough. Ultimately, for generality, users
>>> will want to specify full traversals in their withDetachment()
>>> specification. Now you are talking SubgraphStrategy. Dar! — and guess what,
>>> GremlinServer doesn’t respect SubgraphStrategy. This is the problem with
>>> everything NOT being traversal — once you start using the “Blueprints API”
>>> you start getting inconsistent behavior/functionality. Thus, GremlinServer
>>> does too much — just execute the traversal and return the result.
>>>
>>> Next, DetachedXXX starts to get I-N-S-A-N-E when you start talking GLVs.
>>> Now we have the Blueprints API implemented in C#, Python, etc. Noooooooo!
>>> GLVs should only implement ReferenceXXX which is the bare minimum
>>> specification of a graph object such that it can be re-attached (referenced
>>> back) to the source graph. That's it. Don’t start populating it with
>>> properties — “what about edges?” — “can it get the neighboring vertices
>>> properties too?” — “what about ...?” — if you want that data, you traverse
>>> to it yourself!
>>>
>>> So, what is the solution to the problem at hand — ReferenceXXX. These
>>> element classes are the minimal amount of data required to re-attach to the
>>> source graph. Thus, if you do g.V(1), you get back id/label. However, if
>>> you want to then get the sensor data, you do g.V(v1).properties(…).
>>> Moreover, there is a little hidden gem called HaltedTraverserStrategy that
>>> allows the user to specify their desired element class —
>>> <https://github.com/apache/tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/strategy/decoration/HaltedTraverserStrategy.java>.
>>>
>>> If GremlinServer is yielding too much data, simply do this:
>>>
>>> g = g.withStrategy(HaltedTraverserStrategy.reference())
>>> g.V(1) // ahh… fresh and clean.
>>>
>>> The trick to software is not to write it. If you are a software
>>> developer, you are not as good as the guy who runs the deli down the street
>>> cause guess what, he lives just fine and doesn’t write a lick of code.
>>>
>>> Marko.
>>>
>>> http://markorodriguez.com
>>>
>>>
>>>
>>> > On Jun 26, 2017, at 2:21 PM, Stephen Mallette <[email protected]> wrote:
>>> >
>>> > Looking back at this thread, since there were no objections to doing
>>> > "pre-releases" of GLVs, I think we can postpone the test suite
>>> > changes. So, i'm fine with that being off the table.
>>> >
>>> > Looking at my list, I'm also surprised that I didn't include:
>>> >
>>> > https://issues.apache.org/jira/browse/TINKERPOP-1592
>>> >
>>> > I think that will be important for providing more flexibility to users to
>>> > shape results returned from traversals. That, of course, is important for
>>> > GLV remoting so that users can return only data that matters to them.
>>> > TINKERPOP-1592 funnels into the GraphSON/Gryo 3.0 stuff mentioned
>>> > previously as we seek to make improvements there in terms of
>>> > efficiency/performance/usability. Marko will be taking a look at the 1592
>>> > ticket.
>>> >
>>> > I think there is a good body of nice-to-have tickets (after going through
>>> > them all in the last couple of weeks to do some housekeeping) so we'll see
>>> > what we can get in there and what we can't after those more crucial bits
>>> > are done. I believe that we could start thinking about release of 3.3.0 in
>>> > the next 4 weeks or so.
>>> >
>>> > If there are any other thoughts for what's going on with 3.3.0 please let
>>> > them be known.
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > On Thu, Jun 1, 2017 at 2:08 PM, Stephen Mallette <[email protected]>
>>> > wrote:
>>> >
>>> >> I was just thinking about what needs to be done for 3.3.0 to get it ready
>>> >> for release. I thought I'd send this email to collect any ideas:
>>> >>
>>> >> + Dynamically load the MetricManager in Gremlin Server (TINKERPOP-1550)
>>> >> + Clean up IO - both GraphSON 3.0 and Gryo 3.0
>>> >> + Remove more deprecated code
>>> >> + Test framework improvements (GLVs and in the structure/process suites)
>>> >>
>>> >> I suppose these could shift and change between now and when we think it's
>>> >> ready to release. I have no idea how much time it will take to get this all
>>> >> done, but let's see if anyone else has any other important things for 3.3.0.
>>> >>
>>> >>
>>> >>
>>> >>
>>>
>>>
>>
>
