Josh, thanks for your thoughts - some responses inline:

On Tue, Dec 8, 2020 at 10:16 PM Josh Perryman <joshperry...@gmail.com>
wrote:

> I'll offer some thoughts. I'm seeing upsertV() as an idempotent getOrCreate
> call which always returns a vertex with the label/property values specified
> within the step. It's sort of a declarative pattern: "return this vertex to
> me, find it if you can, create it if you must."
>

I like this description - I've added it to the gist, though it's a bit at
odds with Dave's previous post, so we'll consider it a temporary addition
until he responds.


> On that account, I do like the simplification in 1. Repetition shouldn't be
> necessary. In an ideal world, the engine should know the primary
> identifiers (name or id) and find/create the vertex based on them. Any
> other included values will be "trued up" as well. But this may be a bridge
> too far for TinkerPop since knowing identifiers may require a specified
> schema. I'd prefer to omit the third input, but it might be necessary to
> keep it so that the second input can be for the matching use case.
>

In my most recent post on gremlin-users I think I came up with a nice way
to get rid of the second Map. One Map that forms the full list of
properties for upserting is easier than partitioning two Maps that
essentially merge together. I imagine it's unlikely that application code
will have that separation naturally so users will have the added step of
trying to separate their data into searchable vs "just data". Getting us to
one Map argument will simplify APIs for us and reduce complexity to users.
Here is what I'd proposed for those not following over there:

// match on name and age (or perhaps whatever the underlying graph system
thinks is best?)
g.upsertV('person', [name:'marko',age:29])

// match on name only
g.upsertV('person', [name:'marko',age:29]).by('name')

// explicitly match on name and age
g.upsertV('person', [name:'marko',age:29]).
  by('name').by('age')

// match on id only
g.upsertV('person', [(T.id): 100, name:'marko',age:29]).by(T.id)

// match on whatever the by(Traversal) predicate defines
g.upsertV('person', [name:'marko',age:29]).
  by(has('name', 'marko'))

// match on id, then update age
g.upsertV('person', [(T.id): 100, name:'marko']).by(T.id).
  property('age',29)

With this model, we get one Map argument that represents the complete
property set to be added/updated to the graph and the user can hint on what
key they wish to match on using by() where that sort of step modulation
should be a well understood and familiar concept in Gremlin at this point.

So that means I think 2 should always match or update the additional
> values. Again, we're specifying the expected result and letting the engine
> figure out best how to return that results and appropriately maintain
> state.
>

I again like this description, but we'll see what Dave's thoughts are since
he's a bit behind on the threads at this point I think.


> I'm also presuming that anything not included as inputs to the upsertV()
> step are then to be handled by following steps. I'm hoping that is a
> sufficient approach for addressing the multi/meta property use cases
> brought up in 3.
>

yeah................it needs more thought. I spent more time thinking on
this issue yesterday than I have for all the previous posts combined and I
think it yielded something good in that revised syntax. It's going to take
more of that kind of elbow grease to dig into these lesser use cases to
make sure we aren't coding ourselves into corners.


> I do like the idea of using modulators (with(), by()) for more
> sophisticated usage and advanced use cases. Also, the streaming examples
> are quite elegant allowing for a helpful separation of data and logic.
>

cool - hope you like the revised syntax I posted then. :)


> That's my humble take. This is a very welcome addition to the language and
> I appreciate the thoughtful & collaborative approach to the design
> considerations.
>

Thanks again and please keep the thoughts coming. Lots of other interesting
design discussions seem to be brewing.


>
> Josh
>
> On Tue, Dec 8, 2020 at 8:57 AM Stephen Mallette <spmalle...@gmail.com>
> wrote:
>
> > I started a expanded this discussion to gremlin-users for a wider
> audience
> > and the thread is starting to grow:
> >
> > https://groups.google.com/g/gremlin-users/c/QBmiOUkA0iI/m/pj5Ukiq6AAAJ
> >
> > I guess we'll need to summarize that discussion back here now....
> >
> > I did have some more thoughts to hang out there and figured that I
> wouldn't
> > convolute the discussion on gremlin-users with it so I will continue the
> > discussion here.
> >
> > 1, The very first couple of examples seem wrong (or at least not best
> > demonstrating the usage):
> >
> > g.upsertV('person', [name: 'marko'],
> >                     [name: 'marko', age: 29])
> > g.upsertV('person', [(T.id): 1],
> >                     [(T.id): 1, name: 'Marko'])
> >
> > should instead be:
> >
> > g.upsertV('person', [name: 'marko'],
> >                     [age: 29])
> > g.upsertV('person', [(T.id): 1],
> >                     [name: 'Marko'])
> >
> > 2. I can't recall if we made this clear anywhere but in situations where
> we
> > "get" rather than "create" do the additional properties act in an update
> > fashion to the element that was found? I think I've been working on the
> > assumption that it would, though perhaps that is not always desirable?
> >
> > 3. We really never settled up how to deal with multi/meta-properties.
> That
> > story should be clear so that when we document upsert() we include the
> > approaches for the fallback patterns that don't meet the 90% of use cases
> > we are targeting and I sense that we're saying that meta/multi-properties
> > don't fit in that bucket. So with that in mind, I don't think that the
> > following works for metaproperties:
> >
> > g.upsertV('person', [(T.id): 1],
> >                     [name:[acl: 'public'])
> >
> > as it doesn't let us set the value for the "name" just the pairs for the
> > meta-properties. I guess a user would have to fall back to:
> >
> > g.upsertV('person', [(T.id): 1]).
> >   property('name','marko','acl','public')
> >
> > // or use the additional properties syntax
> > g.upsertV('person', [(T.id): 1],[name:'marko']).
> >   properties('name').property('acl','public')
> >
> > // or if there were multi-properties then maybe...
> > g.upsertV('person', [(T.id): 1],[name:'marko']).
> >   properties('name').hasValue('marko').property('acl','public')
> >
> > As for multi-properties, I dont think we should assume that a List object
> > should be interpreted by Gremlin as a multi-property. Perhaps we just
> rely
> > on the underlying graph to properly deal with that given a schema or the
> > user falls back to:
> >
> > // this ends up as however the graph deals with a List object
> > g.upsertV('person', [(T.id): 1], [lang: ['java', 'scala', 'java'])
> >
> > // this is explicit
> > g.upsertV('person', [(T.id): 1]).
> >   property(list, 'lang', 'java').
> >   property(list, 'lang, 'scala').
> >   property(list, 'lang', 'java')
> >
> > If that makes sense to everyone, I will update the gist.
> >
> >
> >
> >
> >
> >
> >
> > On Tue, Oct 27, 2020 at 11:51 PM David Bechberger <d...@bechberger.com>
> > wrote:
> >
> > > Hello Stephen,
> > >
> > > Thanks for making that gist, its much easier to follow along with the
> > > proposed syntax there.  To your specific comments:
> > >
> > > #1 - My only worry with the term upsert is that it is not as widely a
> > used
> > > term for this sort of pattern as "Merge" (e.g. SQL, Cypher).  However I
> > > don't have a strong opinion on this, so I am fine with either.
> > > #2 - My only real objective here is to make sure that we make the
> 80-90%
> > > case easy and straightforward.  I think that having the fallback option
> > of
> > > using the current syntax for any complicated edge cases should be
> > > considered here as well. I'd appreciate your thoughts here as these are
> > > good points you bring up that definitely fall into the 80-90% use case.
> > > #3 - Those points make sense to me, not sure I have anything further to
> > add
> > > #4 - I don't think I would expect to have the values extracted from the
> > > traversal filters but I'd be interested in other opinions on that.
> > >
> > > Thanks,
> > > Dave
> > >
> > > On Thu, Oct 15, 2020 at 5:30 AM Stephen Mallette <spmalle...@gmail.com
> >
> > > wrote:
> > >
> > > > It's been a couple weeks since we moved this thread so I went back
> > > through
> > > > it and refreshed my mind with what's there. Dave, nice job putting
> all
> > > > those examples together. I've recollected them all together and have
> > just
> > > > been reviewing them in a more formatted style with proper Groovy
> syntax
> > > > than what can be accomplished in this mailing list. Anyone who cares
> to
> > > > review in that form can do so here (i've pasted its markdown contents
> > at
> > > > the bottom of this post as well):
> > > >
> > > > https://gist.github.com/spmallette/5cd448f38d5dae832c67d890b576df31
> > > >
> > > > In this light, I have the following comments and thoughts:
> > > >
> > > > 1. I feel more inclined toward saving "merge" for a more generalized
> > step
> > > > of that name and have thus renamed the examples to use "upsert".
> > > >
> > > > 2. I still don't quite feel comfortable with meta/multi-properties
> > > > examples. Perhaps you could explain further as it's possible I'm not
> > > > thinking things through properly. For meta-properties the example is:
> > > >
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, name:[first: Marko', last: 'R'])
> > > >
> > > > So I see that "name" takes a Map here, but how does Gremlin know if
> > that
> > > > means "meta-property" or "Map value" and if the former, then how do
> you
> > > set
> > > > the value of "name"?  My question is similar to the multi-property
> > > examples
> > > > - using this one:
> > > >
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1,lang: ['java', 'scala'])
> > > >
> > > > How does Gremlin know if the user means that "lang" should be a
> > > > "multi-property" with Cardinality.list (or Cardinality.set for that
> > > matter)
> > > > or a value of Cardinality.single with a List/Array value? I can
> perhaps
> > > > suggest some solutions here but wanted to make sure I wasn't just
> > > > misunderstanding something.
> > > >
> > > > 3. The "upsert with stream" example is:
> > > >
> > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > > >          [(id): 2, (label): 'person', name: 'josh']).
> > > >   upsertV(values(label), valueMap(id), valueMap())
> > > >
> > > > That would establish an API of upsertV(Traversal, Traversal,
> Traversal)
> > > > which would enable your final thought of your last post with:
> > > >
> > > > g.V().upsertV('person', __.has('person','age',gt(29)).has('state',
> > 'NY'),
> > > >                         [active: true])
> > > >
> > > > The open-ended nature of that is neat but comes with some complexity
> > with
> > > > steps that traditionally take an "open" Traversal (and this step now
> > has
> > > up
> > > > to three of them) - perhaps, that's our own fault in some ways. For
> the
> > > > "stream" concept, values()/valueMap() won't immediately work that way
> > as
> > > > they only apply to Element...given today's Gremlin semantics, I think
> > > we'd
> > > > re-write that as:
> > > >
> > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > > >          [(id): 2, (label): 'person', name: 'josh']).
> > > >   upsertV(select(label), select(id), select(id,label,'name'))
> > > >
> > > > Though, even that doesn't quite work because you can't select(T)
> right
> > > now
> > > > which means you'd need to go with:
> > > >
> > > > g.inject([id: 1, label: 'person', name: 'marko'],
> > > >          [id: 2, label: 'person', name: 'josh']).
> > > >   upsertV(select('label'), select('id'), select('id','label','name'))
> > > >
> > > > not terrible in a way and perhaps we could fix select(T) - can't
> > remember
> > > > why that isn't allowed except that select() operates over multiple
> > scopes
> > > > and perhaps trying T through those scopes causes trouble.
> > > >
> > > > 4. Continuing with the use of Traversal as arguments, let's go back
> to:
> > > >
> > > > g.V().upsertV('person', __.has('person','age',gt(29)).has('state',
> > 'NY'),
> > > >                         [active: true])
> > > >
> > > > I originally had it in my mind to prefer this approach over a Map for
> > the
> > > > search as it provides a great deal of flexibility and fits the common
> > > > filtering patterns that users follow really well. I suppose the
> > question
> > > is
> > > > what to do in the situation of "create". Would we extract all the
> > strict
> > > > equality filter values from the initial has(), thus, in the example,
> > > > "state" but not "age", and create a vertex that has "state=NY,
> > > > active=true"? That is possible with how things are designed in
> Gremlin
> > > > today I think.
> > > >
> > > > This syntax does create some conflict with the subject of 3 and
> streams
> > > > because the traverser in the streams case is the incoming Map but for
> > > this
> > > > context it's meant as a filter of V(). Getting too whacky?
> > > >
> > > > 5. I like the idea of using a side-effect to capture whether the
> > element
> > > > was created or not. that makes sense. I think that can work.
> > > >
> > > > ==============================
> > > > = gist contents
> > > > ==============================
> > > >
> > > > # API
> > > >
> > > > ```java
> > > > upsertV(String label, Map matchOrCreateProperties)
> > > > upsertV(String label, Map matchOrCreateProperties, Map
> > > > additionalProperties)
> > > > upsertV(Traversal label, Traversal matchOrCreateProperties)
> > > > upsertV(Traversal label, Traversal matchOrCreateProperties, Traversal
> > > > additionalProperties)
> > > > upsertV(...).
> > > >   with(WithOptions.sideEffectLabel, 'a')
> > > >
> > > > upsertE(String label, Map matchOrCreateProperties)
> > > > upsertE(Traversal label, Traversal matchOrCreateProperties)
> > > > upsertE(String label, Map matchOrCreateProperties, Map
> > > > additionalProperties)
> > > > upsertE(Traversal label, Traversal matchOrCreateProperties, Traversal
> > > > additionalProperties)
> > > > upsertE(...).
> > > >   from(Vertex incidentOut).to(Vertex incidentIn)
> > > >   with(WithOptions.sideEffectLabel, 'a')
> > > > ```
> > > >
> > > > # Examples
> > > >
> > > > ## upsert
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [name: 'marko'],
> > > >                     [name: 'marko', age: 29])
> > > > ```
> > > >
> > > > ## upsert with id
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, name: 'Marko'])
> > > > ```
> > > >
> > > > ## upsert with Meta Properties
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, name:[first: Marko', last: 'R'])
> > > > ```
> > > >
> > > > ## upsert with Multi Properties
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1,lang: ['java', 'scala'])
> > > > ```
> > > >
> > > > ## upsert with Single Cardinality
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, lang: 'java')
> > > > ```
> > > >
> > > > ## upsert with List Cardinality
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, lang: ['java', 'scala', 'java'])
> > > > ```
> > > >
> > > > ## upsert with Set Cardinality
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, lang: ['java','scala'])
> > > > ```
> > > >
> > > > ## upsert with List Cardinality - and add value to list
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, lang: ['java','scala','java']).
> > > >   property(Cardinality.list, 'lang', 'java')
> > > > ```
> > > >
> > > > ## upsert with stream
> > > >
> > > > ```groovy
> > > > // doesn't work with today's Gremlin semantics
> > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > > >          [(id): 2, (label): 'person', name: 'josh']).
> > > >   upsertV(values(label), valueMap(id), valueMap())
> > > >
> > > > // the following would be more in line with what's possible with
> > existing
> > > > Gremlin semantics:
> > > > g.inject([id: 1, label: 'person', name: 'marko'],
> > > >          [id: 2, label: 'person', name: 'josh']).
> > > >   upsertV(select('label'), select('id'), select('id','label','name'))
> > > > ```
> > > >
> > > > ## upsert with reporting added or updated side effect
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [name: 'marko'],
> > > >                     [name: 'marko', age: 29]).
> > > >     with(WithOptions.sideEffectLabel, 'a').
> > > >   project('vertex', 'added').
> > > >     by().
> > > >     by(cap('a'))
> > > > ```
> > > >
> > > > ## upsert Edge assuming a self-relation of "marko"
> > > >
> > > > ```groovy
> > > > g.V().has('person','name','marko').
> > > >   upsertE( 'self', [weight:0.5])
> > > > ```
> > > >
> > > > ## upsert Edge with incident vertex checks
> > > >
> > > > ```groovy
> > > > g.upsertE('knows', [weight:0.5]).
> > > >    from(V(1)).to(V(2))
> > > > g.V().has('person','name','marko').
> > > >   upsertE( 'knows', [weight:0.5]).
> > > >     to(V(2))
> > > > ```
> > > >
> > > > ## upsert using a Traversal for the match
> > > >
> > > > ```groovy
> > > > g.V().upsertV('person', __.has('person','age',gt(29)).has('state',
> > 'NY'),
> > > >                         [active: true])
> > > > ```
> > > >
> > > >
> > > >
> > > >
> > > > On Mon, Sep 28, 2020 at 6:49 PM David Bechberger <
> d...@bechberger.com>
> > > > wrote:
> > > >
> > > > > So, I've been doing some additional thinking about the ways that
> this
> > > > could
> > > > > work based on the comments below and have put my comments inline.
> > > > >
> > > > > Dave
> > > > >
> > > > > On Tue, Sep 22, 2020 at 6:05 AM Stephen Mallette <
> > spmalle...@gmail.com
> > > >
> > > > > wrote:
> > > > >
> > > > > > I added some thoughts inline below:
> > > > > >
> > > > > > On Fri, Sep 18, 2020 at 3:51 PM David Bechberger <
> > > d...@bechberger.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Thanks for the detailed comments Stephen.  I have addressed
> them
> > > > inline
> > > > > > > below.
> > > > > > >
> > > > > > > I did read the proposal from earlier and I think that we are in
> > > close
> > > > > > > agreement with what we are trying to accomplish.  I also fully
> > > > support
> > > > > > > Josh's comment on providing a mechanism for submitting a map of
> > > > > > properties
> > > > > > > as manually unrolling this all right now leads to a lot of
> > > potential
> > > > > for
> > > > > > > error and a long messy traversal.
> > > > > > >
> > > > > > > I'm looking forward to this discussion on how to merge these
> two
> > > > > > proposals.
> > > > > > >
> > > > > > >
> > > > > > > 1. How would multi/meta-properties fit into the API you've
> > > proposed?
> > > > > > >
> > > > > > > My first thought here is that multi-properties would be
> > represented
> > > > as
> > > > > > > lists in the map, e.g.
> > > > > > >
> > > > > > > {names: ['Dave', 'David']}
> > > > > > >
> > > > > > > and meta-properties would be represented as maps in the maps,
> > e.g.
> > > > > > >
> > > > > > > {name: {first: 'Dave', last: 'Bechberger'}}
> > > > > > >
> > > > > > > I can't say I've thought through all the implications of this
> > > though
> > > > so
> > > > > > it
> > > > > > > is an area we would need to explore.
> > > > > > >
> > > > > >
> > > > > > The first implication that comes to mind is that it makes the
> > > > assumption
> > > > > > the user wants multi/meta-properties as opposed to a single
> > > cardinality
> > > > > of
> > > > > > List in the first case and a Map as a property value in the
> second
> > > > case.
> > > > > I
> > > > > > suppose that graphs with a schema could resolve those assumptions
> > but
> > > > > > graphs that are schemaless would have a problem. The issue could
> be
> > > > > > resolved by specialized configuration of "g" or per merge() step
> > > using
> > > > a
> > > > > > with() modulator I suppose but that goes into a yet another level
> > of
> > > > > > implications to consider. I've often wondered if the start point
> > for
> > > > > > getting types/schema into TP3 without a full rewrite would be in
> > this
> > > > > form
> > > > > > where Gremlin would be given hints as to what to expect as to the
> > > types
> > > > > of
> > > > > > data it might encounter while traversing. Anyway, I'd be hesitant
> > to
> > > go
> > > > > > down paths that don't account for multi/metaproperties well.
> They
> > > are
> > > > > > first class citizens in TP3 (with those hoping for extension of
> at
> > > > least
> > > > > > multiproperties to edges) and while I find them a constant
> > annoyance
> > > > for
> > > > > so
> > > > > > many reasons, we're kinda stuck with them.
> > > > > >
> > > > >
> > > > > I agree we need to account for multi/meta properties as 1st class
> > > > > citizens.  This is my current thinking on the syntax for the
> > situations
> > > > we
> > > > > have laid out so far:
> > > > >
> > > > > *Merge*
> > > > > g.mergeV('name', {'name': 'marko'}, {'name': 'marko', 'age': 29})
> > > > >
> > > > >
> > > > > *Merge with id* g.mergeV('name', {T.id: 1}, {T.id: 1, 'name':
> > 'Marko'})
> > > > >
> > > > >
> > > > > *Merge with Meta Properties* g.mergeV('name', {T.id: 1}, {T.id: 1,
> > > > 'name':
> > > > > {'first': Marko', 'last': 'R'})
> > > > >
> > > > >
> > > > > *Merge with Multi Properties * g.mergeV('name', {T.id: 1}, {T.id:
> 1,
> > > > > 'lang': ['java', 'scala'])
> > > > >
> > > > >
> > > > > *Merge with Single Cardinality* g.mergeV('name', {T.id: 1}, {T.id:
> 1,
> > > > > 'lang': 'java')
> > > > >
> > > > >
> > > > > *Merge with List Cardinality* g.mergeV('name', {T.id: 1}, {T.id: 1,
> > > > 'lang':
> > > > > ['java', 'scala', 'java'])
> > > > >
> > > > >
> > > > > *Merge with Set Cardinality * g.mergeV('name', {T.id: 1}, {T.id: 1,
> > > > 'lang':
> > > > > ['java', 'scala'])
> > > > >
> > > > > Since in a mergeV() scenario we are only ever adding whatever
> values
> > > are
> > > > > passed in there would be no need to specify the cardinality of the
> > > > property
> > > > > being added.  If they wanted to add a value to an existing property
> > > then
> > > > > the current property() method would still be available on the
> output
> > > > > traversal. e.g.
> > > > > g.mergeV('name', {T.id: 1}, {T.id: 1, 'lang': ['java', 'scala',
> > > > > 'java']).property(Cardinality.list, 'lang, 'java')
> > > > >
> > > > >
> > > > > *Merge with stream * g.inject([{id: 1, label: 'person', 'name':
> > > 'marko'},
> > > > > {id: 2, label: 'person', 'name': 'josh'}]).
> > > > >   mergeV(values('label'), valueMap('id'), valueMap())
> > > > >
> > > > >
> > > > > *Merge with reporting added or updated side effect*
> g.mergeV('name',
> > > > > {'name': 'marko'}, {'name': 'marko', 'age': 29}).
> > > > >   with(WithOptions.sideEffectLabel, 'a').
> > > > >   project('vertex', 'added').
> > > > >     by().
> > > > >     by(cap('a'))
> > > > >
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > > > 2. How would users set the T.id on creation? would that T.id
> just
> > > be
> > > > a
> > > > > > key
> > > > > > > in the first Map argument?
> > > > > > >
> > > > > > > Yes, I was thinking that T.id would be the key name, e.g.:
> > > > > > >
> > > > > > > g.mergeV('name', {T.id: 1}, {'name': 'Marko'})
> > > > > > >
> > > > > >
> > > > > > ok - I can't say I see a problem with that atm.
> > > > > >
> > > > > >
> > > > > > > 3. I do like the general idea of a match on multiple properties
> > for
> > > > the
> > > > > > > first argument as a convenience but wonder about the
> specificity
> > of
> > > > > this
> > > > > > > API a bit as it focuses heavily on equality - I suppose that's
> > most
> > > > > cases
> > > > > > > for get-or-create, so perhaps that's ok.
> > > > > > >
> > > > > > > In most cases I've seen use exact matches on the vertex or
> > edge.  I
> > > > > think
> > > > > > > it might be best to keep this straightforward as any complex
> edge
> > > > cases
> > > > > > > still can perform the same functionality using the coalesce()
> > > > pattern.
> > > > > > >
> > > > > >
> > > > > > I played around with the idea to use modulators to apply
> additional
> > > > > > constraints:
> > > > > >
> > > > > > g.mergeV('person', {'state':'NY'}, {'active': true}).
> > > > > >     by(has('age', gt(29))
> > > > > >
> > > > > > I suppose that's neat but maybe not...we could easily get the
> same
> > > > thing
> > > > > > from:
> > > > > >
> > > > > > g.V().has('person','age',gt(29)).
> > > > > >   mergeV('person', {'state':'NY'}, {'active': true}).
> > > > > >
> > > > > > which is more readable though it does make it harder for
> providers
> > to
> > > > > > optimize upsert operations if they could make use of the has() as
> > > part
> > > > of
> > > > > > that. I think I like has() as part of the main query rather than
> > in a
> > > > > by()
> > > > > > - just thinking out loud.
> > > > > >
> > > > > >
> > > > > Another potential option here would be to allow the second
> parameter
> > of
> > > > > mergeV() accept a traversal.  Not sure how much complexity that
> would
> > > add
> > > > > though
> > > > >
> > > > > g.V().mergeV('person', __.has('person','age',gt(29)).has('state',
> > > 'NY'}.,
> > > > > {'active': true}).
> > > > >
> > > > >
> > > > > >
> > > > > > > 4. I think your suggestion points to one of the troubles
> Gremlin
> > > has
> > > > > > which
> > > > > > > we see with "algorithms" - extending the language with new
> steps
> > > that
> > > > > > > provides a form of "sugar" (e.g. in algorithms we end up with
> > > > > > > shortestPath() step) pollutes the core language a bit, hence my
> > > > > > > generalization of "merging" in my link above which fits into
> the
> > > core
> > > > > > > Gremlin language style. There is a bigger picture where we are
> > > > missing
> > > > > > > something in Gremlin that lets us extend the language in ways
> > that
> > > > let
> > > > > us
> > > > > > > easily introduce new steps that aren't for general purpose.
> This
> > > > issue
> > > > > is
> > > > > > > discussed in terms of "algorithms" here:
> > > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I
> think
> > > we
> > > > > > could
> > > > > > > see how there might be some "mutation" extension steps that
> would
> > > > cover
> > > > > > > your suggested API, plus batch operations, etc. We need a way
> to
> > > add
> > > > > > > "sugar" without it interfering with the consistency of the
> core.
> > > > > > Obviously
> > > > > > > this is a bigger issue but perhaps important to solve to
> > implement
> > > > > steps
> > > > > > in
> > > > > > > the fashion you describe.
> > > > > > >
> > > > > > > I agree that the "algorithm" steps do seem a bit odd in the
> core
> > > > > language
> > > > > > > but I think the need is there.  I'd be interested in furthering
> > > this
> > > > > > > discussion but I think these "pattern" steps may or may not be
> > the
> > > > same
> > > > > > as
> > > > > > > the algorithm steps.  I'll have to think on that.
> > > > > >
> > > > > >
> > > > > >
> > > > > > > 5. I suppose that the reason for mergeE and mergeV is to
> specify
> > > what
> > > > > > > element type the first Map argument should be applied to? what
> > > about
> > > > > > > mergeVP (i.e. vertex property as it too is an element) ? That's
> > > > tricky
> > > > > > but
> > > > > > > I don't think we should miss that. Perhaps merge() could be a
> > > > "complex
> > > > > > > modulator"?? that's a new concept of course, but you would do
> > > > > > g.V().merge()
> > > > > > > and the label and first Map would fold to VertexStartStep (i.e.
> > > V())
> > > > > for
> > > > > > > the lookup and then a MergeStep would follow - thus a "complex
> > > > > modulator"
> > > > > > > as it does more than just change the behavior of the previous
> > step
> > > -
> > > > it
> > > > > > > also adds its own. I suppose it could also add has() steps
> > followed
> > > > by
> > > > > > the
> > > > > > > MergeStep and then the has() operations would fold in normally
> as
> > > > they
> > > > > do
> > > > > > > today. In this way, we can simplify to just one single
> > > > > > > merge(String,Map,Map). ??
> > > > > > >
> > > > > > > I agree that we should also think about how to include
> properties
> > > in
> > > > > this
> > > > > > > merge construct.  The reason I was thinking about mergeV() and
> > > > mergeE()
> > > > > > is
> > > > > > > that it follows the same pattern as the already well understood
> > > > > > > addV()/addE() steps.  I am a bit wary of trying to generalize
> > this
> > > > down
> > > > > > to
> > > > > > > a single merge() step as these sorts of complex overloads make
> it
> > > > hard
> > > > > to
> > > > > > > figure out which Gremlin step you should use for a particular
> > > pattern
> > > > > > (e.g.
> > > > > > > upsert vertex or upsert edge).  One thing that I think
> customers
> > > find
> > > > > > > useful about the addV/addE step is that they are very
> > discoverable,
> > > > the
> > > > > > > name tells me what functionality to expect from that step.
> > > > > > >
> > > > > >
> > > > > > I'm starting to agree with the idea that we have to do something
> > like
> > > > > > mergeV()/E() as it wouldn't quite work as a start step otherwise.
> > We
> > > > > > couldn't do:
> > > > > >
> > > > > > g.merge()
> > > > > >
> > > > > > as we wouldnt know what sort of Element it applied to. If so, I
> > > wonder
> > > > if
> > > > > > it would be better to preserve "merge" for my more general
> use-case
> > > and
> > > > > > prefer upsertV()/E()?
> > > > > >
> > > > > > Also, perhaps mergeVP() doesn't need to exist as perhaps it is an
> > > > > uncommon
> > > > > > case (compared to V/E()) and we don't really have addVP() as an
> > > > analogous
> > > > > > step. Perhaps existing coalesce() patterns and/or my more general
> > > > purpose
> > > > > > merge() step would satisfy those situations??
> > > > > >
> > > > >
> > > > > I think that a mergeVP() type step may require a different pattern
> > > since
> > > > it
> > > > > would really need to accept a map of values otherwise it would
> > > > essentially
> > > > > perform the same function as property()
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > > > 6. One thing neither my approach nor yours seems to do is tell
> > the
> > > > user
> > > > > > if
> > > > > > > they created something or updated something - that's another
> > thing
> > > > I've
> > > > > > > seen users want to have in get-or-create. Here again we go
> deeper
> > > > into
> > > > > a
> > > > > > > less general step specification as alluded to in 4, but a
> merge()
> > > > step
> > > > > as
> > > > > > > proposed in 5, might return [Element,boolean] so as to provide
> an
> > > > > > indicator
> > > > > > > of creation?
> > > > > > >
> > > > > > > Hmm, I'll have to think about that.  How would returning
> multiple
> > > > > values
> > > > > > > work if we want to chain these together. e.g. Add a vertex
> > between
> > > > two
> > > > > > > edges but make sure the vertices exist?
> > > > > > >
> > > > > >
> > > > >
> > > > > Added a thought on how to accomplish this above that I'd like to
> get
> > > > > thoughts on.
> > > > >
> > > > >
> > > > > > >
> > > > > > > 7. You were just introducing your ideas here, so perhaps you
> > > haven't
> > > > > > gotten
> > > > > > > this far yet, but a shortcoming to doing merge(String,Map,Map)
> is
> > > > that
> > > > > it
> > > > > > > leaves open no opportunity to stream a List of Maps to a
> merge()
> > > for
> > > > a
> > > > > > form
> > > > > > > of batch loading which is mighty common and one of the
> variations
> > > of
> > > > > the
> > > > > > > coalesce() pattern that I alluded to at the start of all this.
> I
> > > > think
> > > > > > that
> > > > > > > we would want to be sure that we left open the option to do
> that
> > > > > somehow.
> > > > > > > 8. If we had a general purpose merge() step I wonder if it
> makes
> > > > > > developing
> > > > > > > the API as you suggested easier to do?
> > > > > > >
> > > > > > > Hmm, let me think about that one.
> > > > > > >
> > > > > > >
> > > > > > I will add an item 8 to think about which I didn't mention
> before:
> > > > > >
> > > > > > 8. The signature you suggested for mergeE() is:
> > > > > >
> > > > > > mergeE(String, Map, Map)
> > > > > >      String - The edge label
> > > > > >      Map (first) - The properties to match existing edge on
> > > > > >      Map (second) - Any additional properties to set if a new
> edge
> > is
> > > > > > created (optional)
> > > > > >
> > > > > > but does that exactly answer the need? Typically the idea is to
> > > detect
> > > > an
> > > > > > edge between two vertices not globally in the graph by way of
> some
> > > > > > properties. This signature doesn't seem to allow for that as it
> > > doesn't
> > > > > > allow specification of the vertices to test against. Perhaps the
> > > answer
> > > > > is
> > > > > > to use from() and to() modulators?
> > > > > >
> > > > > > g.mergeE('knows', {'weight':0.5}).
> > > > > >   from(V(1)).to(V(2))
> > > > > > g.V().has('person','name','marko').
> > > > > >   mergeE( 'knows', {'weight':0.5}).
> > > > > >     to(V(2))
> > > > > > g.V().has('person','name','marko').
> > > > > >   mergeE( 'self', {'weight':0.5})
> > > > > >
> > > > > > That seems to work?
> > > > > >
> > > > >
> > > > > I think that is an elegant solution to that problem.  I like it and
> > it
> > > > > keeps in line with the way that addE() works.
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > > On Thu, Sep 17, 2020 at 4:31 AM Stephen Mallette <
> > > > spmalle...@gmail.com
> > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > I like our coalesce() pattern but it is verbose and over time
> > it
> > > > has
> > > > > > gone
> > > > > > > > from a simple pattern to one with numerous variations for all
> > > > manner
> > > > > of
> > > > > > > > different sorts of merge-like operations. As such, I do think
> > we
> > > > > should
> > > > > > > > introduce something to cover this pattern.
> > > > > > > >
> > > > > > > > I like that you used the word "merge" in your description of
> > this
> > > > as
> > > > > it
> > > > > > > is
> > > > > > > > the word I've liked using. You might want to give a look at
> my
> > > > > proposed
> > > > > > > > merge() step from earlier in the year:
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://lists.apache.org/thread.html/r34ff112e18f4e763390303501fc07c82559d71667444339bde61053f%40%3Cdev.tinkerpop.apache.org%3E
> > > > > > > >
> > > > > > > > I'm just going to dump thoughts as they come regarding what
> you
> > > > > wrote:
> > > > > > > >
> > > > > > > > 1. How would multi/meta-properties fit into the API you've
> > > > proposed?
> > > > > > > > 2. How would users set the T.id on creation? would that T.id
> > just
> > > > be
> > > > > a
> > > > > > > key
> > > > > > > > in the first Map argument?
> > > > > > > > 3. I do like the general idea of a match on multiple
> properties
> > > for
> > > > > the
> > > > > > > > first argument as a convenience but wonder about the
> > specificity
> > > of
> > > > > > this
> > > > > > > > API a bit as it focuses heavily on equality - I suppose
> that's
> > > most
> > > > > > cases
> > > > > > > > for get-or-create, so perhaps that's ok.
> > > > > > > > 4. I think your suggestion points to one of the troubles
> > Gremlin
> > > > has
> > > > > > > which
> > > > > > > > we see with "algorithms" - extending the language with new
> > steps
> > > > that
> > > > > > > > provides a form of "sugar" (e.g. in algorithms we end up with
> > > > > > > > shortestPath() step) pollutes the core language a bit, hence
> my
> > > > > > > > generalization of "merging" in my link above which fits into
> > the
> > > > core
> > > > > > > > Gremlin language style. There is a bigger picture where we
> are
> > > > > missing
> > > > > > > > something in Gremlin that lets us extend the language in ways
> > > that
> > > > > let
> > > > > > us
> > > > > > > > easily introduce new steps that aren't for general purpose.
> > This
> > > > > issue
> > > > > > is
> > > > > > > > discussed in terms of "algorithms" here:
> > > > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I
> > think
> > > > we
> > > > > > > could
> > > > > > > > see how there might be some "mutation" extension steps that
> > would
> > > > > cover
> > > > > > > > your suggested API, plus batch operations, etc. We need a way
> > to
> > > > add
> > > > > > > > "sugar" without it interfering with the consistency of the
> > core.
> > > > > > > Obviously
> > > > > > > > this is a bigger issue but perhaps important to solve to
> > > implement
> > > > > > steps
> > > > > > > in
> > > > > > > > the fashion you describe.
> > > > > > > > 5. I suppose that the reason for mergeE and mergeV is to
> > specify
> > > > what
> > > > > > > > element type the first Map argument should be applied to?
> what
> > > > about
> > > > > > > > mergeVP (i.e. vertex property as it too is an element) ?
> That's
> > > > > tricky
> > > > > > > but
> > > > > > > > I don't think we should miss that. Perhaps merge() could be a
> > > > > "complex
> > > > > > > > modulator"?? that's a new concept of course, but you would do
> > > > > > > g.V().merge()
> > > > > > > > and the label and first Map would fold to VertexStartStep
> (i.e.
> > > > V())
> > > > > > for
> > > > > > > > the lookup and then a MergeStep would follow - thus a
> "complex
> > > > > > modulator"
> > > > > > > > as it does more than just change the behavior of the previous
> > > step
> > > > -
> > > > > it
> > > > > > > > also adds its own. I suppose it could also add has() steps
> > > followed
> > > > > by
> > > > > > > the
> > > > > > > > MergeStep and then the has() operations would fold in
> normally
> > as
> > > > > they
> > > > > > do
> > > > > > > > today. In this way, we can simplify to just one single
> > > > > > > > merge(String,Map,Map). ??
> > > > > > > > 6. One thing neither my approach nor yours seems to do is
> tell
> > > the
> > > > > user
> > > > > > > if
> > > > > > > > they created something or updated something - that's another
> > > thing
> > > > > I've
> > > > > > > > seen users want to have in get-or-create. Here again we go
> > deeper
> > > > > into
> > > > > > a
> > > > > > > > less general step specification as alluded to in 4, but a
> > merge()
> > > > > step
> > > > > > as
> > > > > > > > proposed in 5, might return [Element,boolean] so as to
> provide
> > an
> > > > > > > indicator
> > > > > > > > of creation?
> > > > > > > > 7. You were just introducing your ideas here, so perhaps you
> > > > haven't
> > > > > > > gotten
> > > > > > > > this far yet, but a shortcoming to doing
> merge(String,Map,Map)
> > is
> > > > > that
> > > > > > it
> > > > > > > > leaves open no opportunity to stream a List of Maps to a
> > merge()
> > > > for
> > > > > a
> > > > > > > form
> > > > > > > > of batch loading which is mighty common and one of the
> > variations
> > > > of
> > > > > > the
> > > > > > > > coalesce() pattern that I alluded to at the start of all
> this.
> > I
> > > > > think
> > > > > > > that
> > > > > > > > we would want to be sure that we left open the option to do
> > that
> > > > > > somehow.
> > > > > > > > 8. If we had a general purpose merge() step I wonder if it
> > makes
> > > > > > > developing
> > > > > > > > the API as you suggested easier to do?
> > > > > > > >
> > > > > > > > I think I'd like to solve the problems you describe in your
> > post
> > > as
> > > > > > well
> > > > > > > as
> > > > > > > > the ones in mine. There is some relation there, but gaps as
> > well.
> > > > > With
> > > > > > > more
> > > > > > > > discussion here we can figure something out.
> > > > > > > >
> > > > > > > > Thanks for starting this talk - good one!
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, Sep 16, 2020 at 9:26 PM David Bechberger <
> > > > > d...@bechberger.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > I've had a few on and off discussions with a few people
> here,
> > > so
> > > > I
> > > > > > > wanted
> > > > > > > > > to send this out to everyone for feedback.
> > > > > > > > >
> > > > > > > > > What are people's thoughts on creating a new set of steps
> > that
> > > > > codify
> > > > > > > > > common Gremlin best practices?
> > > > > > > > >
> > > > > > > > > I think there are several common Gremlin patterns where
> users
> > > > would
> > > > > > > > benefit
> > > > > > > > > from the additional guidance that these codified steps
> > > represent.
> > > > > > The
> > > > > > > > > first one I would recommend though is codifying the element
> > > > > existence
> > > > > > > > > pattern into a single Gremlin step, something like:
> > > > > > > > >
> > > > > > > > > mergeV(String, Map, Map)
> > > > > > > > >      String - The vertex label
> > > > > > > > >      Map (first) - The properties to match existing
> vertices
> > on
> > > > > > > > >      Map (second) - Any additional properties to set if a
> new
> > > > > vertex
> > > > > > is
> > > > > > > > > created (optional)
> > > > > > > > > mergeE(String, Map, Map)
> > > > > > > > >      String - The edge label
> > > > > > > > >      Map (first) - The properties to match existing edge on
> > > > > > > > >      Map (second) - Any additional properties to set if a
> new
> > > > edge
> > > > > is
> > > > > > > > > created (optional)
> > > > > > > > >
> > > > > > > > > In each of these cases these steps would perform the same
> > > upsert
> > > > > > > > > functionality as the element existence pattern.
> > > > > > > > >
> > > > > > > > > Example:
> > > > > > > > >
> > > > > > > > > g.V().has('person','name','stephen').
> > > > > > > > >            fold().
> > > > > > > > >            coalesce(unfold(),
> > > > > > > > >                     addV('person').
> > > > > > > > >                       property('name','stephen').
> > > > > > > > >                       property('age',34))
> > > > > > > > >
> > > > > > > > > would become:
> > > > > > > > >
> > > > > > > > > g.mergeV('person', {'name': 'stephen'}, {'age', 34})
> > > > > > > > >
> > > > > > > > > I think that this change would be a good addition to the
> > > language
> > > > > for
> > > > > > > > > several reasons:
> > > > > > > > >
> > > > > > > > > * This codifies the best practice for a specific
> > action/recipe,
> > > > > which
> > > > > > > > > reduces the chance that someone uses the pattern
> incorrectly
> > > > > > > > > * Most complex Gremlin traversals are verbose.  Reducing
> the
> > > > amount
> > > > > > of
> > > > > > > > code
> > > > > > > > > that needs to be written and maintained allows for a better
> > > > > developer
> > > > > > > > > experience.
> > > > > > > > > * It will lower the bar of entry for a developer by making
> > > these
> > > > > > > actions
> > > > > > > > > more discoverable.  The more we can help bring these
> patterns
> > > to
> > > > > the
> > > > > > > > > forefront of the language via these pattern/meta steps the
> > more
> > > > we
> > > > > > > guide
> > > > > > > > > users towards writing better Gremlin faster
> > > > > > > > > * This allows DB vendors to optimize for this pattern
> > > > > > > > >
> > > > > > > > > I know that this would likely be the first step in Gremlin
> > that
> > > > > > > codifies
> > > > > > > > a
> > > > > > > > > pattern, so I'd like to get other's thoughts on this?
> > > > > > > > >
> > > > > > > > > Dave
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to