Re: [DISCUSS] Creating pattern steps to codify best practices

2020-09-17 Thread Stephen Mallette
I like our coalesce() pattern but it is verbose and over time it has gone
from a simple pattern to one with numerous variations for all manner of
different sorts of merge-like operations. As such, I do think we should
introduce something to cover this pattern.

I like that you used the word "merge" in your description of this as it is
the word I've liked using. You might want to give a look at my proposed
merge() step from earlier in the year:

https://lists.apache.org/thread.html/r34ff112e18f4e763390303501fc07c82559d71667444339bde61053f%40%3Cdev.tinkerpop.apache.org%3E

I'm just going to dump thoughts as they come regarding what you wrote:

1. How would multi/meta-properties fit into the API you've proposed?
2. How would users set the T.id on creation? Would that T.id just be a key
in the first Map argument?
3. I do like the general idea of a match on multiple properties for the
first argument as a convenience but wonder about the specificity of this
API a bit as it focuses heavily on equality - I suppose that's most cases
for get-or-create, so perhaps that's ok.
4. I think your suggestion points to one of the troubles Gremlin has which
we see with "algorithms" - extending the language with new steps that
provide a form of "sugar" (e.g. in algorithms we end up with
shortestPath() step) pollutes the core language a bit, hence my
generalization of "merging" in my link above which fits into the core
Gremlin language style. There is a bigger picture where we are missing
something in Gremlin that lets us extend the language in ways that let us
easily introduce new steps that aren't for general purpose. This issue is
discussed in terms of "algorithms" here:
https://issues.apache.org/jira/browse/TINKERPOP-1991 but I think we could
see how there might be some "mutation" extension steps that would cover
your suggested API, plus batch operations, etc. We need a way to add
"sugar" without it interfering with the consistency of the core. Obviously
this is a bigger issue but perhaps important to solve to implement steps in
the fashion you describe.
5. I suppose that the reason for mergeE and mergeV is to specify what
element type the first Map argument should be applied to? What about
mergeVP (i.e. vertex property, as it too is an element)? That's tricky but
I don't think we should miss that. Perhaps merge() could be a "complex
modulator"? That's a new concept of course, but you would do g.V().merge()
and the label and first Map would fold to VertexStartStep (i.e. V()) for
the lookup and then a MergeStep would follow - thus a "complex modulator"
as it does more than just change the behavior of the previous step - it
also adds its own. I suppose it could also add has() steps followed by the
MergeStep and then the has() operations would fold in normally as they do
today. In this way, we can simplify to just one single
merge(String,Map,Map). ??
6. One thing neither my approach nor yours seems to do is tell the user if
they created something or updated something - that's another thing I've
seen users want to have in get-or-create. Here again we go deeper into a
less general step specification as alluded to in 4, but a merge() step as
proposed in 5, might return [Element,boolean] so as to provide an indicator
of creation?
7. You were just introducing your ideas here, so perhaps you haven't gotten
this far yet, but a shortcoming to doing merge(String,Map,Map) is that it
leaves open no opportunity to stream a List of Maps to a merge() for a form
of batch loading which is mighty common and one of the variations of the
coalesce() pattern that I alluded to at the start of all this. I think that
we would want to be sure that we left open the option to do that somehow.
8. If we had a general purpose merge() step I wonder if it makes developing
the API as you suggested easier to do?
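
To make the semantics under discussion a bit more concrete, here is a minimal
in-memory sketch in Python of a get-or-create merge(label, match, onCreate)
that returns an (element, created) pair as floated in point 6, plus a batch
wrapper for the List-of-Maps case from point 7. ToyGraph, Vertex, merge_v and
merge_all are hypothetical names made up for illustration; none of this is
TinkerPop API, and it only covers equality matching on single-valued
properties.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass
class Vertex:
    id: int
    label: str
    properties: Dict[str, object] = field(default_factory=dict)


class ToyGraph:
    """In-memory stand-in for a graph (hypothetical, not TinkerPop API)."""

    def __init__(self) -> None:
        self.vertices: List[Vertex] = []
        self._ids = count(1)

    def merge_v(
        self,
        label: str,
        match: Dict[str, object],
        on_create: Optional[Dict[str, object]] = None,
    ) -> Tuple[Vertex, bool]:
        """Return (vertex, created): reuse a match or create a new vertex."""
        for v in self.vertices:
            # Equality-only matching on label and every key of the first Map.
            if v.label == label and all(
                k in v.properties and v.properties[k] == val
                for k, val in match.items()
            ):
                return v, False
        # No match: create with the match properties plus the optional extras.
        v = Vertex(next(self._ids), label, {**match, **(on_create or {})})
        self.vertices.append(v)
        return v, True

    def merge_all(
        self, label: str, rows: Iterable[Dict[str, object]]
    ) -> List[Tuple[Vertex, bool]]:
        """Batch form: apply merge_v to a stream of match Maps."""
        return [self.merge_v(label, row) for row in rows]
```

Calling merge_v twice with the same match Map returns the same vertex, with
created set to True only the first time; the second Map is ignored when a
match already exists.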

I think I'd like to solve the problems you describe in your post as well as
the ones in mine. There is some relation there, but gaps as well. With more
discussion here we can figure something out.

Thanks for starting this talk - good one!



On Wed, Sep 16, 2020 at 9:26 PM David Bechberger wrote:

> I've had a few on and off discussions with a few people here, so I wanted
> to send this out to everyone for feedback.
>
> What are people's thoughts on creating a new set of steps that codify
> common Gremlin best practices?
>
> I think there are several common Gremlin patterns where users would benefit
> from the additional guidance that these codified steps represent.  The
> first one I would recommend though is codifying the element existence
> pattern into a single Gremlin step, something like:
>
> mergeV(String, Map, Map)
>  String - The vertex label
>  Map (first) - The properties to match existing vertices on
>  Map (second) - Any additional properties to set if a new vertex is
> created (optional)
> mergeE(String, Map, Map)
>  String - The edge label
>  Map (first) - The properties to match existing edge on
>  Map (seco

[jira] [Closed] (TINKERPOP-2412) Add missing query tests

2020-09-17 Thread Stephen Mallette (Jira)


 [ 
https://issues.apache.org/jira/browse/TINKERPOP-2412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Mallette closed TINKERPOP-2412.
---
Fix Version/s: 3.4.9
               3.5.0
     Assignee: Stephen Mallette
   Resolution: Done

> Add missing query tests
> ---
>
> Key: TINKERPOP-2412
> URL: https://issues.apache.org/jira/browse/TINKERPOP-2412
> Project: TinkerPop
> Issue Type: Improvement
> Components: test-suite
> Affects Versions: 3.4.8
> Reporter: Divij Vaidya
> Assignee: Stephen Mallette
> Priority: Minor
> Fix For: 3.5.0, 3.4.9
>
>
> In our Gremlin query test suites, we do not have tests that cover the
> following query patterns:
> # "blah().barrier().as('x').blah().select('x')"
> # "blah().valueMap().as('x').blah().select('x')"
> # "blah().path().as('x').blah().select('x')"
> # "blah().count().as('x').blah().select('x')"
> These tests fall into one category: defining an alias and using the
> value of the alias further ahead in the traversal.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (TINKERPOP-2412) Add missing query tests

2020-09-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/TINKERPOP-2412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17197757#comment-17197757
 ] 

ASF GitHub Bot commented on TINKERPOP-2412:
---

spmallette merged pull request #1327:
URL: https://github.com/apache/tinkerpop/pull/1327


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add missing query tests
> ---
>
> Key: TINKERPOP-2412
> URL: https://issues.apache.org/jira/browse/TINKERPOP-2412
> Project: TinkerPop
> Issue Type: Improvement
> Components: test-suite
> Affects Versions: 3.4.8
> Reporter: Divij Vaidya
> Priority: Minor
>
> In our Gremlin query test suites, we do not have tests that cover the
> following query patterns:
> # "blah().barrier().as('x').blah().select('x')"
> # "blah().valueMap().as('x').blah().select('x')"
> # "blah().path().as('x').blah().select('x')"
> # "blah().count().as('x').blah().select('x')"
> These tests fall into one category: defining an alias and using the
> value of the alias further ahead in the traversal.





[jira] [Updated] (TINKERPOP-2249) Improve RequestOption passing

2020-09-17 Thread Stephen Mallette (Jira)


 [ 
https://issues.apache.org/jira/browse/TINKERPOP-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Mallette updated TINKERPOP-2249:

Description: 
{{RequestOptions}} provides a way to set certain control arguments to send to 
the server like a per-request timeout or batch size. For bytecode traversals, 
these are set somewhat statically and selectively by way of {{with()}} which 
are then gathered by {{DriverRemoteConnection}} into a {{RequestOptions}} 
object and then sent to the server on the request - it's basically leading to a 
giant switch statement as new special options are being added.

Perhaps we should also consider a "request headers" space on the 
{{RequestMessage}} rather than just throwing all these options in at the "args" 
level.

Another idea might be to do {{with(TraversalSourceOptions)}} where 
{{TraversalSourceOptions}} would be a new interface that something like 
{{RequestOptions}} could implement. That way we could pass around 
{{RequestOptions}} in a consistent way:

{code}
submit("g.V()", RequestOptions.build().timeout(100).create())
g.with(RequestOptions.build().timeout(100).create()).V()
{code}

rather than:

{code}
g.with("evaluationTimeout", 100)
{code}

  was:
{{RequestOptions}} provides a way to set certain control arguments to send to 
the server like a per-request timeout or batch size. For bytecode traversals, 
these are set somewhat statically and selectively by way of {{with()}} which 
are then gathered by {{DriverRemoteConnection}} into a {{RequestOptions}} 
object and then sent to the server on the request - it's basically leading to a 
giant switch statement as new special options are being added.

Perhaps we should also consider a "request headers" space on the 
{{RequestMessage}} rather than just throwing all these options in at the "args" 
level.


> Improve RequestOption passing
> -
>
> Key: TINKERPOP-2249
> URL: https://issues.apache.org/jira/browse/TINKERPOP-2249
> Project: TinkerPop
> Issue Type: Improvement
> Components: driver, server
> Affects Versions: 3.4.2
> Reporter: Stephen Mallette
> Priority: Major
>
> {{RequestOptions}} provides a way to set certain control arguments to send to 
> the server like a per-request timeout or batch size. For bytecode traversals, 
> these are set somewhat statically and selectively by way of {{with()}} which 
> are then gathered by {{DriverRemoteConnection}} into a {{RequestOptions}} 
> object and then sent to the server on the request - it's basically leading to 
> a giant switch statement as new special options are being added.
> Perhaps we should also consider a "request headers" space on the 
> {{RequestMessage}} rather than just throwing all these options in at the 
> "args" level.
> Another idea might be to do {{with(TraversalSourceOptions)}} where 
> {{TraversalSourceOptions}} would be a new interface that something like 
> {{RequestOptions}} could implement. That way we could pass around 
> {{RequestOptions}} in a consistent way:
> {code}
> submit("g.V()", RequestOptions.build().timeout(100).create())
> g.with(RequestOptions.build().timeout(100).create()).V()
> {code}
> rather than:
> {code}
> g.with("evaluationTimeout", 100)
> {code}
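
As a rough illustration of the with(TraversalSourceOptions) idea, the sketch
below models it in plain Python: an options object knows how to flatten
itself into the same key/value pairs that with("evaluationTimeout", 100)
would set, so both call styles end up with identical settings. The
as_with_args method, the ToySource class and the "batchSize" key are
assumptions made for this example; they are not the driver's actual API.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol, Union


class TraversalSourceOptions(Protocol):
    """Hypothetical interface: anything that can flatten to with() pairs."""

    def as_with_args(self) -> Dict[str, object]: ...


@dataclass(frozen=True)
class RequestOptions:
    """Toy stand-in for the driver's RequestOptions (not the real class)."""

    timeout: Optional[int] = None
    batch_size: Optional[int] = None

    def as_with_args(self) -> Dict[str, object]:
        args: Dict[str, object] = {}
        if self.timeout is not None:
            args["evaluationTimeout"] = self.timeout
        if self.batch_size is not None:
            args["batchSize"] = self.batch_size  # assumed key name
        return args


class ToySource:
    """Toy traversal source that just accumulates settings."""

    def __init__(self, settings: Optional[Dict[str, object]] = None) -> None:
        self.settings: Dict[str, object] = dict(settings or {})

    def with_(
        self,
        key_or_options: Union[str, TraversalSourceOptions],
        value: object = None,
    ) -> "ToySource":
        if isinstance(key_or_options, str):
            extra = {key_or_options: value}  # existing string-keyed style
        else:
            extra = key_or_options.as_with_args()  # proposed options style
        return ToySource({**self.settings, **extra})
```

With this shape, ToySource().with_(RequestOptions(timeout=100)) and
ToySource().with_("evaluationTimeout", 100) carry the same settings, which
avoids a per-option switch at the point where the request is built.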





[jira] [Commented] (TINKERPOP-2296) Per query timeout not working from Python

2020-09-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/TINKERPOP-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17197910#comment-17197910
 ] 

ASF GitHub Bot commented on TINKERPOP-2296:
---

spmallette opened a new pull request #1329:
URL: https://github.com/apache/tinkerpop/pull/1329


   https://issues.apache.org/jira/browse/TINKERPOP-2296
   
   Builds with `mvn clean install -pl gremlin-python`
   
   VOTE +1





> Per query timeout not working from Python
> -
>
> Key: TINKERPOP-2296
> URL: https://issues.apache.org/jira/browse/TINKERPOP-2296
> Project: TinkerPop
> Issue Type: Bug
> Components: python, server
> Affects Versions: 3.4.3
> Environment: Gremlin Server 3.4.3 and GremlinPython latest
> Reporter: Kelvin R. Lawrence
> Assignee: Stephen Mallette
> Priority: Major
>
> I had a discussion with [~spmallette] about some problems I have been running
> into trying to get per-query timeouts to work using the Python GLV client. As
> best I can tell, the timeout setting is simply ignored: the query executes
> to completion, taking as many seconds as needed. Stephen asked for a Jira, so
> I am writing this up here.
> Using the air-routes data set this query can take a few seconds, so it should
> definitely time out at 200ms. Using the Java client the same query works,
> although I had an exception when using the Binary serializer; it worked
> when using GraphSON. I don't know yet if that is relevant to this issue.
>  
> {{paths = (g.V().with_('scriptEvaluationTimeout', 200).}}
> {{         has('airport', 'code', 'AUS').}}
> {{         repeat(__.out('route').simplePath()).}}
> {{         until(__.has('code', 'AGR')).}}
> {{         path().by('code').}}
> {{         limit(10).}}
> {{         toList())}}
>  
> As a sidenote it would be nice if the Python client had an equivalent of
> the Java Tokens class so that constant variable names rather than strings
> could be used for the 'scriptEvaluationTimeout' part.
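
A minimal sketch of what the sidenote asks for might look like the following.
The Tokens class and timeout_option helper here are hypothetical Python names
invented for this example (they are not part of gremlin-python); the two
string values are the ones that appear in this digest.

```python
class Tokens:
    """Hypothetical Python counterpart to the Java Tokens constants."""

    ARGS_EVAL_TIMEOUT = "evaluationTimeout"
    ARGS_SCRIPT_EVAL_TIMEOUT = "scriptEvaluationTimeout"


def timeout_option(millis, key=Tokens.ARGS_SCRIPT_EVAL_TIMEOUT):
    """Build the (key, value) pair for a per-query timeout in milliseconds."""
    if millis < 0:
        raise ValueError("timeout must be non-negative")
    return key, millis


# Intended use, assuming a traversal object exposing with_(key, value):
#   key, value = timeout_option(200)
#   g.V().with_(key, value)...
```

This only removes the string literal from the call site; it does not by
itself address the timeout being ignored.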





Re: [DISCUSS] Accepting gremlint donation

2020-09-17 Thread Øyvind Sæbø
I've created a pull request for adding ASF licence headers to all the files
of the project here: https://github.com/OyvindSabo/gremlint/pull/57

I'll merge it once the legal department at Ardoq has had time to fill out
the CCLA. I requested an estimate for when they would have time for this,
and unfortunately it might not be until October, so I guess that indeed
leaves me some time to do some clean-up.

On Wed, Sep 16, 2020 at 16:01, Stephen Mallette wrote:

> Sorry - to be clear, I will start the VOTE thread once the IP Clearance
> form is complete.
>
> On Wed, Sep 16, 2020 at 9:53 AM Stephen Mallette wrote:
>
> > Interestingly there is a similar process going on right now in the Apache
> > Cassandra community where they are going through IP Clearance. In watching
> > that it seems an actual VOTE thread is better than a "consensus" thread, so
> > I will start that now for completeness purposes and to lower friction as we
> > head toward incubator.
> >
> > On Wed, Sep 16, 2020 at 7:05 AM Stephen Mallette wrote:
> >
> >> >1. I can create a GitHub issue for adding the required headers in
> >> all the source files.
> >>
> >> excellent. I'm still not completely clear on that step being a
> >> prerequisite from the documentation as i saw some IP Clearance examples
> >> with and without it, but if I know one thing about Apache and its Ways,
> >> it's best not to use "what other projects do" as your reasoning for doing
> >> something. anyway, once that commit is in place we can reference that
> >> commit id (or any after it) for the donation. I suppose that if there were
> >> any other "clean-up" you wanted to do before that time, now would be the
> >> time to do it.
> >>
> >> >   2. There should be no issues there.
> >> >   3. The project has zero dependencies, so there should be no issues
> >> there either.
> >>
> >> well - that's easy then!
> >>
> >> a quick side note as we continue this process - the incubator site has
> >> regenerated itself so our page is available now:
> >>
> >> https://incubator.apache.org/ip-clearance/tinkerpop-gremlint.html
> >>
> >> you can see what steps remain - i've updated the document to reflect your
> >> responses to items 2 and 3 above.
> >>
> >> On Tue, Sep 15, 2020 at 11:19 AM Øyvind Sæbø wrote:
> >>
> >>> Yes, I'm following along. Cool to hear that we can move forward with
> >>> this.
> >>>
> >>> Ardoq (the company on whose behalf the project will be donated) and I
> >>> will start filling out the required ICLA and CCLA.
> >>>
> >>> Regarding the points you mentioned:
> >>>
> >>>    1. I can create a GitHub issue for adding the required headers in all
> >>>    the source files.
> >>>    2. There should be no issues there.
> >>>    3. The project has zero dependencies, so there should be no issues there
> >>>    either.
> >>>
> >>>
> >>> On Tue, Sep 15, 2020 at 15:48, Stephen Mallette <spmalle...@gmail.com>
> >>> wrote:
> >>>
> >>> > I've set up the IP Clearance form for incubator here (website hasn't
> >>> > generated the HTML yet I guess):
> >>> >
> >>> > https://svn.apache.org/repos/asf/incubator/public/trunk/content/ip-clearance/tinkerpop-gremlint.xml
> >>> >
> >>> > In the checklist of items there are a few items pertaining to the code
> >>> > base itself:
> >>> >
> >>> > 1. Check and make sure that the files that have been donated have been
> >>> > updated to reflect the new ASF copyright
> >>> > 2. Check and make sure that for all items included with the distribution
> >>> > that is not under the Apache license, we have the right to combine with
> >>> > Apache-licensed code and redistribute.
> >>> > 3. Check and make sure that all items depended upon by the project is
> >>> > covered by one or more of the following approved licenses: Apache, BSD,
> >>> > Artistic, MIT/X, MIT/W3C, MPL 1.1, or something with essentially the same
> >>> > terms.
> >>> >
> >>> > For item 1 I assume that means the code base, in the state at which we
> >>> > accept it, should have the ASF license header in it, with an appropriate
> >>> > NOTICE file if necessary:
> >>> >
> >>> > https://www.apache.org/legal/src-headers.html
> >>> >
> >>> > For 2 and 3, I don't think we have any issues there but would need to
> >>> > confirm.
> >>> >
> >>> > Øyvind, I believe you're on the list following along - could you please
> >>> > comment on the above for us?
> >>> >
> >>> >
> >>> >
> >>> >
> >>> > On Tue, Sep 15, 2020 at 7:30 AM Stephen Mallette <spmalle...@gmail.com>
> >>> > wrote:
> >>> >
> >>> > > As there haven't been any objections here, it sounds like we can go ahead
> >>> > > with this process. I believe that we will need to go through the IP
> >>> > > Clearance process in incubator:
> >>> > >
> >>> > > https://incubator.apache.org/ip-clearance/
> >>> > >
> >>> > > and engage Apache In