So, I've been doing some additional thinking about the ways that this could
work based on the comments below and have put my comments inline.

Dave

On Tue, Sep 22, 2020 at 6:05 AM Stephen Mallette <spmalle...@gmail.com>
wrote:

> I added some thoughts inline below:
>
> On Fri, Sep 18, 2020 at 3:51 PM David Bechberger <d...@bechberger.com>
> wrote:
>
> > Thanks for the detailed comments Stephen.  I have addressed them inline
> > below.
> >
> > I did read the proposal from earlier and I think that we are in close
> > agreement with what we are trying to accomplish.  I also fully support
> > Josh's comment on providing a mechanism for submitting a map of
> properties
> > as manually unrolling this all right now leads to a lot of potential for
> > error and a long messy traversal.
> >
> > I'm looking forward to this discussion on how to merge these two
> proposals.
> >
> >
> > 1. How would multi/meta-properties fit into the API you've proposed?
> >
> > My first thought here is that multi-properties would be represented as
> > lists in the map, e.g.
> >
> > {names: ['Dave', 'David']}
> >
> > and meta-properties would be represented as maps in the maps, e.g.
> >
> > {name: {first: 'Dave', last: 'Bechberger'}}
> >
> > I can't say I've thought through all the implications of this though so
> it
> > is an area we would need to explore.
> >
>
> The first implication that comes to mind is that it makes the assumption
> the user wants multi/meta-properties as opposed to a single cardinality of
> List in the first case and a Map as a property value in the second case. I
> suppose that graphs with a schema could resolve those assumptions but
> graphs that are schemaless would have a problem. The issue could be
> resolved by specialized configuration of "g" or per merge() step using a
> with() modulator I suppose but that goes into a yet another level of
> implications to consider. I've often wondered if the start point for
> getting types/schema into TP3 without a full rewrite would be in this form
> where Gremlin would be given hints as to what to expect as to the types of
> data it might encounter while traversing. Anyway, I'd be hesitant to go
> down paths that don't account for multi/metaproperties well.  They are
> first class citizens in TP3 (with those hoping for extension of at least
> multiproperties to edges) and while I find them a constant annoyance for so
> many reasons, we're kinda stuck with them.
>

I agree we need to account for multi/meta properties as 1st class
citizens.  This is my current thinking on the syntax for the situations we
have laid out so far:

*Merge*
g.mergeV('name', {'name': 'marko'}, {'name': 'marko', 'age': 29})


*Merge with id* g.mergeV('name', {T.id: 1}, {T.id: 1, 'name': 'Marko'})


*Merge with Meta Properties* g.mergeV('name', {T.id: 1}, {T.id: 1, 'name':
{'first': Marko', 'last': 'R'})


*Merge with Multi Properties * g.mergeV('name', {T.id: 1}, {T.id: 1,
'lang': ['java', 'scala'])


*Merge with Single Cardinality* g.mergeV('name', {T.id: 1}, {T.id: 1,
'lang': 'java')


*Merge with List Cardinality* g.mergeV('name', {T.id: 1}, {T.id: 1, 'lang':
['java', 'scala', 'java'])


*Merge with Set Cardinality * g.mergeV('name', {T.id: 1}, {T.id: 1, 'lang':
['java', 'scala'])

Since in a mergeV() scenario we are only ever adding whatever values are
passed in there would be no need to specify the cardinality of the property
being added.  If they wanted to add a value to an existing property then
the current property() method would still be available on the output
traversal. e.g.
g.mergeV('name', {T.id: 1}, {T.id: 1, 'lang': ['java', 'scala',
'java']).property(Cardinality.list, 'lang, 'java')


*Merge with stream * g.inject([{id: 1, label: 'person', 'name': 'marko'},
{id: 2, label: 'person', 'name': 'josh'}]).
  mergeV(values('label'), valueMap('id'), valueMap())


*Merge with reporting added or updated side effect* g.mergeV('name',
{'name': 'marko'}, {'name': 'marko', 'age': 29}).
  with(WithOptions.sideEffectLabel, 'a').
  project('vertex', 'added').
    by().
    by(cap('a'))



>
>
> > 2. How would users set the T.id on creation? would that T.id just be a
> key
> > in the first Map argument?
> >
> > Yes, I was thinking that T.id would be the key name, e.g.:
> >
> > g.mergeV('name', {T.id: 1}, {'name': 'Marko'})
> >
>
> ok - I can't say I see a problem with that atm.
>
>
> > 3. I do like the general idea of a match on multiple properties for the
> > first argument as a convenience but wonder about the specificity of this
> > API a bit as it focuses heavily on equality - I suppose that's most cases
> > for get-or-create, so perhaps that's ok.
> >
> > In most cases I've seen use exact matches on the vertex or edge.  I think
> > it might be best to keep this straightforward as any complex edge cases
> > still can perform the same functionality using the coalesce() pattern.
> >
>
> I played around with the idea to use modulators to apply additional
> constraints:
>
> g.mergeV('person', {'state':'NY'}, {'active': true}).
>     by(has('age', gt(29))
>
> I suppose that's neat but maybe not...we could easily get the same thing
> from:
>
> g.V().has('person','age',gt(29)).
>   mergeV('person', {'state':'NY'}, {'active': true}).
>
> which is more readable though it does make it harder for providers to
> optimize upsert operations if they could make use of the has() as part of
> that. I think I like has() as part of the main query rather than in a by()
> - just thinking out loud.
>
>
Another potential option here would be to allow the second parameter of
mergeV() accept a traversal.  Not sure how much complexity that would add
though

g.V().mergeV('person', __.has('person','age',gt(29)).has('state', 'NY'}.,
{'active': true}).


>
> > 4. I think your suggestion points to one of the troubles Gremlin has
> which
> > we see with "algorithms" - extending the language with new steps that
> > provides a form of "sugar" (e.g. in algorithms we end up with
> > shortestPath() step) pollutes the core language a bit, hence my
> > generalization of "merging" in my link above which fits into the core
> > Gremlin language style. There is a bigger picture where we are missing
> > something in Gremlin that lets us extend the language in ways that let us
> > easily introduce new steps that aren't for general purpose. This issue is
> > discussed in terms of "algorithms" here:
> > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I think we
> could
> > see how there might be some "mutation" extension steps that would cover
> > your suggested API, plus batch operations, etc. We need a way to add
> > "sugar" without it interfering with the consistency of the core.
> Obviously
> > this is a bigger issue but perhaps important to solve to implement steps
> in
> > the fashion you describe.
> >
> > I agree that the "algorithm" steps do seem a bit odd in the core language
> > but I think the need is there.  I'd be interested in furthering this
> > discussion but I think these "pattern" steps may or may not be the same
> as
> > the algorithm steps.  I'll have to think on that.
>
>
>
> > 5. I suppose that the reason for mergeE and mergeV is to specify what
> > element type the first Map argument should be applied to? what about
> > mergeVP (i.e. vertex property as it too is an element) ? That's tricky
> but
> > I don't think we should miss that. Perhaps merge() could be a "complex
> > modulator"?? that's a new concept of course, but you would do
> g.V().merge()
> > and the label and first Map would fold to VertexStartStep (i.e. V()) for
> > the lookup and then a MergeStep would follow - thus a "complex modulator"
> > as it does more than just change the behavior of the previous step - it
> > also adds its own. I suppose it could also add has() steps followed by
> the
> > MergeStep and then the has() operations would fold in normally as they do
> > today. In this way, we can simplify to just one single
> > merge(String,Map,Map). ??
> >
> > I agree that we should also think about how to include properties in this
> > merge construct.  The reason I was thinking about mergeV() and mergeE()
> is
> > that it follows the same pattern as the already well understood
> > addV()/addE() steps.  I am a bit wary of trying to generalize this down
> to
> > a single merge() step as these sorts of complex overloads make it hard to
> > figure out which Gremlin step you should use for a particular pattern
> (e.g.
> > upsert vertex or upsert edge).  One thing that I think customers find
> > useful about the addV/addE step is that they are very discoverable, the
> > name tells me what functionality to expect from that step.
> >
>
> I'm starting to agree with the idea that we have to do something like
> mergeV()/E() as it wouldn't quite work as a start step otherwise. We
> couldn't do:
>
> g.merge()
>
> as we wouldnt know what sort of Element it applied to. If so, I wonder if
> it would be better to preserve "merge" for my more general use-case and
> prefer upsertV()/E()?
>
> Also, perhaps mergeVP() doesn't need to exist as perhaps it is an uncommon
> case (compared to V/E()) and we don't really have addVP() as an analogous
> step. Perhaps existing coalesce() patterns and/or my more general purpose
> merge() step would satisfy those situations??
>

I think that a mergeVP() type step may require a different pattern since it
would really need to accept a map of values otherwise it would essentially
perform the same function as property()


>
>
> > 6. One thing neither my approach nor yours seems to do is tell the user
> if
> > they created something or updated something - that's another thing I've
> > seen users want to have in get-or-create. Here again we go deeper into a
> > less general step specification as alluded to in 4, but a merge() step as
> > proposed in 5, might return [Element,boolean] so as to provide an
> indicator
> > of creation?
> >
> > Hmm, I'll have to think about that.  How would returning multiple values
> > work if we want to chain these together. e.g. Add a vertex between two
> > edges but make sure the vertices exist?
> >
>

Added a thought on how to accomplish this above that I'd like to get
thoughts on.


> >
> > 7. You were just introducing your ideas here, so perhaps you haven't
> gotten
> > this far yet, but a shortcoming to doing merge(String,Map,Map) is that it
> > leaves open no opportunity to stream a List of Maps to a merge() for a
> form
> > of batch loading which is mighty common and one of the variations of the
> > coalesce() pattern that I alluded to at the start of all this. I think
> that
> > we would want to be sure that we left open the option to do that somehow.
> > 8. If we had a general purpose merge() step I wonder if it makes
> developing
> > the API as you suggested easier to do?
> >
> > Hmm, let me think about that one.
> >
> >
> I will add an item 8 to think about which I didn't mention before:
>
> 8. The signature you suggested for mergeE() is:
>
> mergeE(String, Map, Map)
>      String - The edge label
>      Map (first) - The properties to match existing edge on
>      Map (second) - Any additional properties to set if a new edge is
> created (optional)
>
> but does that exactly answer the need? Typically the idea is to detect an
> edge between two vertices not globally in the graph by way of some
> properties. This signature doesn't seem to allow for that as it doesn't
> allow specification of the vertices to test against. Perhaps the answer is
> to use from() and to() modulators?
>
> g.mergeE('knows', {'weight':0.5}).
>   from(V(1)).to(V(2))
> g.V().has('person','name','marko').
>   mergeE( 'knows', {'weight':0.5}).
>     to(V(2))
> g.V().has('person','name','marko').
>   mergeE( 'self', {'weight':0.5})
>
> That seems to work?
>

I think that is an elegant solution to that problem.  I like it and it
keeps in line with the way that addE() works.


>
>
>
> > On Thu, Sep 17, 2020 at 4:31 AM Stephen Mallette <spmalle...@gmail.com>
> > wrote:
> >
> > > I like our coalesce() pattern but it is verbose and over time it has
> gone
> > > from a simple pattern to one with numerous variations for all manner of
> > > different sorts of merge-like operations. As such, I do think we should
> > > introduce something to cover this pattern.
> > >
> > > I like that you used the word "merge" in your description of this as it
> > is
> > > the word I've liked using. You might want to give a look at my proposed
> > > merge() step from earlier in the year:
> > >
> > >
> > >
> >
> https://lists.apache.org/thread.html/r34ff112e18f4e763390303501fc07c82559d71667444339bde61053f%40%3Cdev.tinkerpop.apache.org%3E
> > >
> > > I'm just going to dump thoughts as they come regarding what you wrote:
> > >
> > > 1. How would multi/meta-properties fit into the API you've proposed?
> > > 2. How would users set the T.id on creation? would that T.id just be a
> > key
> > > in the first Map argument?
> > > 3. I do like the general idea of a match on multiple properties for the
> > > first argument as a convenience but wonder about the specificity of
> this
> > > API a bit as it focuses heavily on equality - I suppose that's most
> cases
> > > for get-or-create, so perhaps that's ok.
> > > 4. I think your suggestion points to one of the troubles Gremlin has
> > which
> > > we see with "algorithms" - extending the language with new steps that
> > > provides a form of "sugar" (e.g. in algorithms we end up with
> > > shortestPath() step) pollutes the core language a bit, hence my
> > > generalization of "merging" in my link above which fits into the core
> > > Gremlin language style. There is a bigger picture where we are missing
> > > something in Gremlin that lets us extend the language in ways that let
> us
> > > easily introduce new steps that aren't for general purpose. This issue
> is
> > > discussed in terms of "algorithms" here:
> > > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I think we
> > could
> > > see how there might be some "mutation" extension steps that would cover
> > > your suggested API, plus batch operations, etc. We need a way to add
> > > "sugar" without it interfering with the consistency of the core.
> > Obviously
> > > this is a bigger issue but perhaps important to solve to implement
> steps
> > in
> > > the fashion you describe.
> > > 5. I suppose that the reason for mergeE and mergeV is to specify what
> > > element type the first Map argument should be applied to? what about
> > > mergeVP (i.e. vertex property as it too is an element) ? That's tricky
> > but
> > > I don't think we should miss that. Perhaps merge() could be a "complex
> > > modulator"?? that's a new concept of course, but you would do
> > g.V().merge()
> > > and the label and first Map would fold to VertexStartStep (i.e. V())
> for
> > > the lookup and then a MergeStep would follow - thus a "complex
> modulator"
> > > as it does more than just change the behavior of the previous step - it
> > > also adds its own. I suppose it could also add has() steps followed by
> > the
> > > MergeStep and then the has() operations would fold in normally as they
> do
> > > today. In this way, we can simplify to just one single
> > > merge(String,Map,Map). ??
> > > 6. One thing neither my approach nor yours seems to do is tell the user
> > if
> > > they created something or updated something - that's another thing I've
> > > seen users want to have in get-or-create. Here again we go deeper into
> a
> > > less general step specification as alluded to in 4, but a merge() step
> as
> > > proposed in 5, might return [Element,boolean] so as to provide an
> > indicator
> > > of creation?
> > > 7. You were just introducing your ideas here, so perhaps you haven't
> > gotten
> > > this far yet, but a shortcoming to doing merge(String,Map,Map) is that
> it
> > > leaves open no opportunity to stream a List of Maps to a merge() for a
> > form
> > > of batch loading which is mighty common and one of the variations of
> the
> > > coalesce() pattern that I alluded to at the start of all this. I think
> > that
> > > we would want to be sure that we left open the option to do that
> somehow.
> > > 8. If we had a general purpose merge() step I wonder if it makes
> > developing
> > > the API as you suggested easier to do?
> > >
> > > I think I'd like to solve the problems you describe in your post as
> well
> > as
> > > the ones in mine. There is some relation there, but gaps as well. With
> > more
> > > discussion here we can figure something out.
> > >
> > > Thanks for starting this talk - good one!
> > >
> > >
> > >
> > > On Wed, Sep 16, 2020 at 9:26 PM David Bechberger <d...@bechberger.com>
> > > wrote:
> > >
> > > > I've had a few on and off discussions with a few people here, so I
> > wanted
> > > > to send this out to everyone for feedback.
> > > >
> > > > What are people's thoughts on creating a new set of steps that codify
> > > > common Gremlin best practices?
> > > >
> > > > I think there are several common Gremlin patterns where users would
> > > benefit
> > > > from the additional guidance that these codified steps represent.
> The
> > > > first one I would recommend though is codifying the element existence
> > > > pattern into a single Gremlin step, something like:
> > > >
> > > > mergeV(String, Map, Map)
> > > >      String - The vertex label
> > > >      Map (first) - The properties to match existing vertices on
> > > >      Map (second) - Any additional properties to set if a new vertex
> is
> > > > created (optional)
> > > > mergeE(String, Map, Map)
> > > >      String - The edge label
> > > >      Map (first) - The properties to match existing edge on
> > > >      Map (second) - Any additional properties to set if a new edge is
> > > > created (optional)
> > > >
> > > > In each of these cases these steps would perform the same upsert
> > > > functionality as the element existence pattern.
> > > >
> > > > Example:
> > > >
> > > > g.V().has('person','name','stephen').
> > > >            fold().
> > > >            coalesce(unfold(),
> > > >                     addV('person').
> > > >                       property('name','stephen').
> > > >                       property('age',34))
> > > >
> > > > would become:
> > > >
> > > > g.mergeV('person', {'name': 'stephen'}, {'age', 34})
> > > >
> > > > I think that this change would be a good addition to the language for
> > > > several reasons:
> > > >
> > > > * This codifies the best practice for a specific action/recipe, which
> > > > reduces the chance that someone uses the pattern incorrectly
> > > > * Most complex Gremlin traversals are verbose.  Reducing the amount
> of
> > > code
> > > > that needs to be written and maintained allows for a better developer
> > > > experience.
> > > > * It will lower the bar of entry for a developer by making these
> > actions
> > > > more discoverable.  The more we can help bring these patterns to the
> > > > forefront of the language via these pattern/meta steps the more we
> > guide
> > > > users towards writing better Gremlin faster
> > > > * This allows DB vendors to optimize for this pattern
> > > >
> > > > I know that this would likely be the first step in Gremlin that
> > codifies
> > > a
> > > > pattern, so I'd like to get other's thoughts on this?
> > > >
> > > > Dave
> > > >
> > >
> >
>

Reply via email to