kuppitz and i have been batting this one back and forth for a while now.
it's gotten crazy and ultimately we were backed into a corner with no good
solutions. the least of all evils involved creating Gryo 3.1 so that spark
tests could pass with gryo given the revision to property ids being locally
unique. So basically, Spark will always want to use Gryo 3.1 on 3.4.0+ or
else you will get weirdness for traversals like:

g.V().hasLabel("person").properties().dedup().value()

For network serialization with Gryo you can stay on gryo 3.0/1.0 and not
see any problem which means that 3.4.0 remains compatible with 3.3.x on
those gryo versions. On 3.4.0 we will default to 3.1, but keep 3.0
configurations in default Gremlin Server.

Gryo seemed so straightforward a solution for us in those early TP3
days...........ends up being just a little faster than GraphSON and only
useful on the JVM when we're flooded with GLVs. dah....

On Mon, Nov 19, 2018 at 9:33 AM Stephen Mallette <spmalle...@gmail.com>
wrote:

> Reading back through things again........I wrote on the issue a long while
> back that this was going to be a change that just involved writing tests to
> enforce this "uniqueness concept", but this change is happening at a more
> foundational level it seems. Any reason to not just go that route somehow?
> If we really wanted to change this at a foundational level we'd need to
> stamp a new version of Gryo out (should never have used the default
> serializer for "detached"), but I don't think we want a third version of
> Gryo for 3.4.0 (if ever, considering that Jorge has a better solution
> hopefully coming along).
>
> On Wed, Nov 14, 2018 at 10:18 AM Daniel Kuppitz <m...@gremlin.guru> wrote:
>
>> Spark was the very first thing that failed - I think it was only the newly
>> added test
>> <
>> https://github.com/apache/tinkerpop/pull/993/files#diff-7b625ebb74c59ffe5521b63a49797b64R169
>> >,
>> but of course, the failure made sense as this query can't work if the
>> parent element's id lost somewhere down the line.
>> So, in one way or another, we have to remember the parent id if we want to
>> allow local property id uniqueness.
>>
>> Cheers,
>> Daniel
>>
>>
>> On Tue, Nov 13, 2018 at 7:50 PM Stephen Mallette <spmalle...@gmail.com>
>> wrote:
>>
>> > Gryo is so completely inflexible some times :|
>> >
>> > How did we end up having to do this again? Reading back through the
>> thread,
>> > it seemed like it was because of Spark testing. What kinds of tests were
>> > failing for Spark because of this? I assume it wasn't just property
>> > assertions in the tests themselves, right? Or was Spark just blowing up
>> and
>> > erroring out?
>> >
>> > On Tue, Nov 13, 2018 at 10:10 AM Daniel Kuppitz <m...@gremlin.guru>
>> wrote:
>> >
>> > > That was my conclusion as well, I didn't see a way to make it
>> > non-breaking.
>> > > Maybe 3.4.0 should just have a special serializer for
>> VertexProperties..?
>> > >
>> > >
>> > > On Tue, Nov 13, 2018 at 5:00 AM Stephen Mallette <
>> spmalle...@gmail.com>
>> > > wrote:
>> > >
>> > > > If i'm thinking about this right....I guess Gryo used the default
>> Kryo
>> > > > serialization for 1.0 and 3.0 so  by changing that we effectively
>> break
>> > > > both version of Gryo in 3.4.0. And that would mean that you couldn't
>> > > > connect to older versions of the Java driver to 3.4.0 systems - you
>> > would
>> > > > have to upgrade. Is that the correct conclusion?
>> > > >
>> > > > On Mon, Nov 12, 2018 at 9:28 AM Daniel Kuppitz <m...@gremlin.guru>
>> > wrote:
>> > > >
>> > > > > I actually opened a PR <
>> https://github.com/apache/tinkerpop/pull/993
>> > >
>> > > > > (makes it easier to look at the changes). Removing transient
>> didn't
>> > > quite
>> > > > > do the job, so I overwrote the (de)serialization methods and only
>> > kept
>> > > > the
>> > > > > parent element id. The only thing that's broken now is the
>> > > serialization
>> > > > > test suite.
>> > > > >
>> > > > > Cheers,
>> > > > > Daniel
>> > > > >
>> > > > >
>> > > > > On Mon, Nov 12, 2018 at 4:59 AM Stephen Mallette <
>> > spmalle...@gmail.com
>> > > >
>> > > > > wrote:
>> > > > >
>> > > > > > I don't think we can remove the transient keyword. Doesn't that
>> > > balloon
>> > > > > the
>> > > > > > amount of data shuffled around the network? I think that was the
>> > > reason
>> > > > > we
>> > > > > > made that transient in the first place.
>> > > > > >
>> > > > > > What parts of the test suite are failing because of this? Could
>> you
>> > > > post
>> > > > > > links to a couple of examples of failing assertions?
>> > > > > >
>> > > > > > On Sat, Nov 10, 2018 at 2:01 PM Daniel Kuppitz <m...@gremlin.guru
>> >
>> > > > wrote:
>> > > > > >
>> > > > > > > Removing the transient attribute from the
>> > DetachedVertexProperty's
>> > > > > vertex
>> > > > > > > field [1] makes all Spark tests pass. Easy peasy, but holy
>> cow,
>> > > that
>> > > > > > little
>> > > > > > > change probably has a huge impact on things I can't even
>> think of
>> > > > right
>> > > > > > > now.
>> > > > > > >
>> > > > > > > [1]
>> > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://github.com/apache/tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/structure/util/detached/DetachedVertexProperty.java#L40
>> > > > > > >
>> > > > > > > Cheers,
>> > > > > > > Daniel
>> > > > > > >
>> > > > > > >
>> > > > > > > On Sat, Nov 10, 2018 at 11:49 AM Daniel Kuppitz
>> <m...@gremlin.guru
>> > >
>> > > > > wrote:
>> > > > > > >
>> > > > > > > > Treating null parents as a wildcard makes the majority of
>> tests
>> > > > just
>> > > > > > > pass,
>> > > > > > > > but Spark integration tests fail. Spark is all about
>> detached
>> > > > > elements
>> > > > > > > (or
>> > > > > > > > reference elements) and I guess there's no way to make those
>> > > tests
>> > > > > pass
>> > > > > > > > without making detached properties remember their parent
>> > elements
>> > > > (or
>> > > > > > at
>> > > > > > > > least just their parent element ids).
>> > > > > > > >
>> > > > > > > > Cheers,
>> > > > > > > > Daniel
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > On Fri, Nov 9, 2018 at 4:53 AM Stephen Mallette <
>> > > > > spmalle...@gmail.com>
>> > > > > > > > wrote:
>> > > > > > > >
>> > > > > > > >> If we can, I think that we should just keep derser
>> semantics
>> > as
>> > > > they
>> > > > > > > are.
>> > > > > > > >> We've gone this long with them having null parent element
>> ids
>> > > and
>> > > > > > there
>> > > > > > > >> haven't been any raised complains that I'm aware of.
>> > > > > > > >>
>> > > > > > > >> > I thought about skipping the parent id comparison in case
>> > one
>> > > of
>> > > > > the
>> > > > > > > >> elements parents is *null*
>> > > > > > > >>
>> > > > > > > >> I suppose the question we need to answer stated directly
>> is:
>> > > What
>> > > > do
>> > > > > > we
>> > > > > > > >> want equals() to mean for "detached" properties? Maybe it's
>> > not
>> > > > > wrong
>> > > > > > to
>> > > > > > > >> consider "detached" properties without a parent to have a
>> form
>> > > of
>> > > > > > > wildcard
>> > > > > > > >> status. If you were to "attach" such a property then its id
>> > > space
>> > > > > > would
>> > > > > > > be
>> > > > > > > >> restricted to the element you're adding it to, so it seems
>> > safe
>> > > in
>> > > > > > that
>> > > > > > > >> respect. In the "tree test" that's failing that you
>> > referenced,
>> > > > the
>> > > > > > > >> deserialized tree is a wildcard match for the original
>> > > tree...the
>> > > > > idea
>> > > > > > > >> sorta works in that capacity, but perhaps there are other
>> > > > scenarios
>> > > > > > > where
>> > > > > > > >> that's a really bad thing?
>> > > > > > > >>
>> > > > > > > >> Does a "null" parent element mean "detached"? I guess the
>> > answer
>> > > > is
>> > > > > > > >> "sometimes" because in some cases "detached" keeps the
>> parent
>> > > > > element.
>> > > > > > > Are
>> > > > > > > >> there other scenarios though where the parent element can
>> be
>> > > > null? I
>> > > > > > > can't
>> > > > > > > >> really think of any right now.
>> > > > > > > >>
>> > > > > > > >> On Thu, Nov 8, 2018 at 12:48 PM Daniel Kuppitz
>> > <m...@gremlin.guru
>> > > >
>> > > > > > wrote:
>> > > > > > > >>
>> > > > > > > >> > Working on TINKERPOP-2051
>> > > > > > > >> > <https://issues.apache.org/jira/browse/TINKERPOP-2051>,
>> I
>> > > > > thought I
>> > > > > > > >> found
>> > > > > > > >> > a
>> > > > > > > >> > pretty easy solution: In order to allow locally unique
>> > > property
>> > > > > ids,
>> > > > > > > we
>> > > > > > > >> > only need to compare the ids of two properties as well as
>> > the
>> > > > ids
>> > > > > of
>> > > > > > > >> their
>> > > > > > > >> > parent elements. That was basically a one-line change and
>> > had
>> > > > the
>> > > > > > > >> expected
>> > > > > > > >> > effect:
>> > > > > > > >> >
>> > > > > > > >> >
>> > > > > > > >> > gremlin> g = TinkerGraph.open().traversal()
>> > > > > > > >> > ==>graphtraversalsource[tinkergraph[vertices:0 edges:0],
>> > > > standard]
>> > > > > > > >> > gremlin> g.addV("person").property("name", "alice", id,
>> > > > > > > "name").as("a").
>> > > > > > > >> > ......1>   addV("person").property("name", "bob", id,
>> > "name").
>> > > > > > > >> > ......2>   addE("knows").from("a")
>> > > > > > > >> > ==>e[2][0-knows->1]
>> > > > > > > >> > gremlin>
>> > > > > > > >> > gremlin> g.V().properties("name")
>> > > > > > > >> > ==>vp[name->alice]
>> > > > > > > >> > ==>vp[name->bob]
>> > > > > > > >> > gremlin> g.V().properties("name").dedup()
>> > > > > > > >> > ==>vp[name->alice]
>> > > > > > > >> > ==>vp[name->bob]
>> > > > > > > >> >
>> > > > > > > >> >
>> > > > > > > >> > Without the change in id comparison, the last statement
>> > would
>> > > > have
>> > > > > > > only
>> > > > > > > >> > returned 1 of the properties. Looked easy enough, but -
>> of
>> > > > course
>> > > > > -
>> > > > > > it
>> > > > > > > >> > broke the test suite. This change had an impact on 2
>> types
>> > of
>> > > > > tests:
>> > > > > > > >> >
>> > > > > > > >> >    - those, that deal with serialization and
>> deserialization
>> > > > > > > >> >    - those, that make use of bytecode
>> > > > > > > >> >
>> > > > > > > >> > Perhaps the latter can be seen as a more specific case of
>> > the
>> > > > > > former.
>> > > > > > > >> The
>> > > > > > > >> > problem is that serialized/deserialized/detached
>> properties
>> > no
>> > > > > > longer
>> > > > > > > >> have
>> > > > > > > >> > a reference to their former parent element. I thought
>> about
>> > > > > skipping
>> > > > > > > the
>> > > > > > > >> > parent id comparison in case one of the elements parents
>> is
>> > > > > *null*,
>> > > > > > > but
>> > > > > > > >> at
>> > > > > > > >> > some point that just felt wrong and made me think of all
>> > kind
>> > > of
>> > > > > > > horror
>> > > > > > > >> > scenarios this could/would lead to.
>> > > > > > > >> >
>> > > > > > > >> > So my question is, what would be the smartest way to
>> handle
>> > > > that?
>> > > > > > > Adjust
>> > > > > > > >> > all test cases, so they no longer rely on full equality?
>> Fix
>> > > > > > > >> > (de)serialization to include parent element ids?
>> Something
>> > > else?
>> > > > > > > >> >
>> > > > > > > >> > *Example of a failing test:*
>> > > > > > > >> >
>> > > > > > > >> > [ERROR]
>> > >  SerializationTest$GryoV1d0Test.shouldSerializeTree:254
>> > > > > The
>> > > > > > > >> > objects differ
>> > > > > > > >> > expected:
>> > > > > > > >> >
>> > > > > > > >> >
>> > > > > > > >>
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> org.apache.tinkerpop.gremlin.process.traversal.step.util.Tree<{v[1]={v[2]={vp[name->vadas]={}},
>> > > > > > > >> > v[3]={vp[name->lop]={}}, v[4]={vp[name->josh]={}}}}>
>> > > > > > > >> > but was:
>> > > > > > > >> >
>> > > > > > > >> >
>> > > > > > > >>
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> org.apache.tinkerpop.gremlin.process.traversal.step.util.Tree<{v[1]={v[2]={vp[name->vadas]={}},
>> > > > > > > >> > v[3]={vp[name->lop]={}}, v[4]={vp[name->josh]={}}}}>
>> > > > > > > >> >
>> > > > > > > >> >
>> > > > > > > >> > Crazy thought: We could just change the tests to compare
>> the
>> > > > > > > toString()
>> > > > > > > >> > versions of results....
>> > > > > > > >> >
>> > > > > > > >> > Cheers,
>> > > > > > > >> > Daniel
>> > > > > > > >> >
>> > > > > > > >>
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>

Reply via email to