[DISCUSS] Creating pattern steps to codify best practices

2020-09-16 Thread David Bechberger
I've had a few on and off discussions with a few people here, so I wanted
to send this out to everyone for feedback.

What are people's thoughts on creating a new set of steps that codify
common Gremlin best practices?

I think there are several common Gremlin patterns where users would benefit
from the additional guidance that these codified steps represent.  The
first one I would recommend though is codifying the element existence
pattern into a single Gremlin step, something like:

mergeV(String, Map, Map)
  String - The vertex label
  Map (first) - The properties to match existing vertices on
  Map (second) - Any additional properties to set if a new vertex is
                 created (optional)
mergeE(String, Map, Map)
  String - The edge label
  Map (first) - The properties to match existing edges on
  Map (second) - Any additional properties to set if a new edge is
                 created (optional)

In each of these cases these steps would perform the same upsert
functionality as the element existence pattern.

Example:

g.V().has('person','name','stephen').
  fold().
  coalesce(unfold(),
           addV('person').
             property('name','stephen').
             property('age',34))

would become:

g.mergeV('person', {'name': 'stephen'}, {'age': 34})
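
To make the intended match-or-create semantics concrete, here is a minimal Java sketch over a plain in-memory list. Everything in it is an assumption for illustration: the class MergeSketch, its nested Vertex type, and this mergeV method are hypothetical and are not TinkerPop APIs. It also assumes that the match properties are written to a newly created vertex along with the additional ones, as the coalesce() example does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory illustration of the proposed mergeV semantics.
// None of these names are TinkerPop APIs.
final class MergeSketch {

    // In this sketch a vertex is just a label plus a mutable property map.
    static final class Vertex {
        final String label;
        final Map<String, Object> properties = new HashMap<>();

        Vertex(String label) {
            this.label = label;
        }
    }

    private final List<Vertex> vertices = new ArrayList<>();

    // Returns the first vertex with the given label whose properties contain
    // every entry of `match`; otherwise creates a vertex carrying both the
    // `match` and the `onCreate` properties.
    Vertex mergeV(String label, Map<String, Object> match, Map<String, Object> onCreate) {
        for (Vertex v : vertices) {
            if (v.label.equals(label)
                    && v.properties.entrySet().containsAll(match.entrySet())) {
                return v; // existing vertex found: onCreate is ignored
            }
        }
        Vertex created = new Vertex(label);
        created.properties.putAll(match);
        created.properties.putAll(onCreate);
        vertices.add(created);
        return created;
    }

    int size() {
        return vertices.size();
    }

    public static void main(String[] args) {
        MergeSketch g = new MergeSketch();
        Vertex first = g.mergeV("person", Map.of("name", "stephen"), Map.of("age", 34));
        // Second call matches the vertex created above, so nothing new is added.
        Vertex second = g.mergeV("person", Map.of("name", "stephen"), Map.of("age", 99));
        System.out.println("same vertex: " + (first == second));
        System.out.println("vertex count: " + g.size());
    }
}
```

The point of the sketch is only that the second Map is consulted on the create path and never on the match path, which is the behavior the coalesce() form above expresses.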

I think that this change would be a good addition to the language for
several reasons:

* This codifies the best practice for a specific action/recipe, which
reduces the chance that someone uses the pattern incorrectly.
* Most complex Gremlin traversals are verbose.  Reducing the amount of code
that needs to be written and maintained allows for a better developer
experience.
* It will lower the barrier to entry for developers by making these actions
more discoverable.  The more we can bring these patterns to the forefront
of the language via these pattern/meta steps, the more we guide users
towards writing better Gremlin faster.
* This allows DB vendors to optimize for this pattern.

I know that this would likely be the first step in Gremlin that codifies a
pattern, so I'd like to get others' thoughts on this.

Dave


Re: [DISCUSS] Accepting gremlint donation

2020-09-16 Thread Stephen Mallette
Sorry - to be clear, I will start the VOTE thread once the IP Clearance
form is complete.


Re: [DISCUSS] Accepting gremlint donation

2020-09-16 Thread Stephen Mallette
Interestingly there is a similar process going on right now in the Apache
Cassandra community where they are going through IP Clearance. In watching
that it seems an actual VOTE thread is better than a "consensus" thread, so
I will start that now for completeness purposes and to lower friction as we
head toward incubator.


[jira] [Updated] (TINKERPOP-2424) Reduce chance for OOME with large results to Java driver

2020-09-16 Thread Stephen Mallette (Jira)


 [ https://issues.apache.org/jira/browse/TINKERPOP-2424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen Mallette updated TINKERPOP-2424:

Description: 
Originally mentioned here:

https://groups.google.com/g/gremlin-users/c/I4HQC9JkzSo/m/fYfd5o0UAQAJ

It is pretty easy to reproduce with an empty TinkerGraph in Gremlin Server with 
{{evaluationTimeout}} set to something large, using this script in the Gremlin 
Console with {{-Xmx512m}}:

{code}
cluster = Cluster.open()
client = cluster.connect()
client.submit("g.addV().as('a').addE('self').iterate()")
rs = client.submit("g.V().emit().repeat(out()).valueMap(true).limit(1000)");[]
iterator = rs.iterator();[]
x = 0
while(iterator.hasNext()) {
  x++
  if (x % 1 == 0) {
    System.out.println(x + "-[" + rs.getAvailableItemCount() + "]-" + iterator.next());
  }
}
{code}

The {{LinkedBlockingQueue}} of the {{ResultQueue}} is unbounded and can fill 
faster than it can be consumed, so on a system with limited memory an OOME can 
loom. 

While we tend to discourage iteration of large result sets (e.g. {{g.V()}}), I 
suppose we should do what we can to keep users out of OOME situations. Not sure 
of the best way to do this, but some simple experimentation showed that 
bounding the queue helps (tried with 10), though it does require the adding 
of new results to be blocked until more are consumed.

{code}
+++ b/gremlin-driver/src/main/java/org/apache/tinkerpop/gremlin/driver/Connection.java
@@ -233,7 +233,7 @@ final class Connection {
 
                     cluster.executor().submit(() -> resultQueueSetup.completeExceptionally(f.cause()));
                 } else {
-                    final LinkedBlockingQueue<Result> resultLinkedBlockingQueue = new LinkedBlockingQueue<>();
+                    final LinkedBlockingQueue<Result> resultLinkedBlockingQueue = new LinkedBlockingQueue<>(10);
                     final CompletableFuture<Void> readCompleted = new CompletableFuture<>();
 
                     readCompleted.whenCompleteAsync((v, t) -> {
diff --git a/gremlin-driver/src/main/java/org/apache/tinkerpop/gremlin/driver/ResultQueue.java b/gremlin-driver/src/main/java/org/apache/tinkerpop/gremlin/driver/ResultQueue.java
index 29a6453431..4b52ae1671 100644
--- a/gremlin-driver/src/main/java/org/apache/tinkerpop/gremlin/driver/ResultQueue.java
+++ b/gremlin-driver/src/main/java/org/apache/tinkerpop/gremlin/driver/ResultQueue.java
@@ -70,7 +70,7 @@ final class ResultQueue {
      * @param result a return value from the {@link Traversal} or script submitted for execution
      */
     public void add(final Result result) {
-        this.resultLinkedBlockingQueue.offer(result);
+        while(!this.resultLinkedBlockingQueue.offer(result)) {}
         tryDrainNextWaiting(false);
     }
{code}
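
As a hedged illustration of the back-pressure idea outside the driver, the standalone Java sketch below uses {{LinkedBlockingQueue.put}}, which blocks the producer while a bounded queue is full instead of spinning on {{offer}}. The class name BoundedQueueSketch, its run method, and the capacity values are assumptions made up for this example. It is not driver code, and it does not address whether blocking the thread that adds results would be acceptable inside the driver.

```java
import java.util.concurrent.LinkedBlockingQueue;

// Standalone illustration only; not TinkerPop driver code.
final class BoundedQueueSketch {

    // Pushes `total` integers through a queue bounded at `capacity` while the
    // calling thread consumes them. Returns the largest queue size the
    // consumer observed, which cannot exceed `capacity`.
    static int run(int capacity, int total) {
        LinkedBlockingQueue<Integer> queue = new LinkedBlockingQueue<>(capacity);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < total; i++) {
                    queue.put(i); // blocks while the queue is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        int maxSeen = 0;
        try {
            for (int i = 0; i < total; i++) {
                maxSeen = Math.max(maxSeen, queue.size());
                queue.take(); // blocks while the queue is empty
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while consuming", e);
        }
        return maxSeen;
    }

    public static void main(String[] args) {
        System.out.println("max queued items observed: " + run(10, 100_000));
    }
}
```

With a bound in place, memory held by queued results is capped by the capacity regardless of how fast the producer runs; the cost is that the producing thread stalls whenever the consumer falls behind.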


> Reduce chance for OOME with large results to Java driver
> 
>
> Key: TINKERPOP-2424
> URL: https://issues.apache.org/jira/browse/TINKERPOP-2424
> Project: TinkerPop
>  Issue Type: Improvement
>  Components: driver
>Affects Versions: 3.4.8
>Reporter: Stephen Mallette
>Priority: Minor
>

Re: [DISCUSS] Accepting gremlint donation

2020-09-16 Thread Stephen Mallette
> 1. I can create a GitHub issue for adding the required headers in all
> the source files.

excellent. I'm still not completely clear on that step being a prerequisite
from the documentation as i saw some IP Clearance examples with and without
it, but if I know one thing about Apache and its Ways, it's best not to use
"what other projects do" as your reasoning for doing something. anyway,
once that commit is in place we can reference that commit id (or any after
it) for the donation. I suppose that if there were any other "clean-up" you
wanted to do before that time, now would be the time to do it.

> 2. There should be no issues there.
> 3. The project has zero dependencies, so there should be no issues
> there either.

well - that's easy then!

a quick side note as we continue this process - the incubator site has
regenerated itself so our page is available now:

https://incubator.apache.org/ip-clearance/tinkerpop-gremlint.html

you can see what steps remain - i've updated the document to reflect your
responses to items 2 and 3 above.

On Tue, Sep 15, 2020 at 11:19 AM Øyvind Sæbø wrote:

> Yes, I'm following along. Cool to hear that we can move forward with this.
>
> Ardoq (the company on whose behalf the project will be donated) and I
> will start filling out the required ICLA and CCLA.
>
> Regarding the points you mentioned:
>
>    1. I can create a GitHub issue for adding the required headers in all
>       the source files.
>    2. There should be no issues there.
>    3. The project has zero dependencies, so there should be no issues there
>       either.
>
>
> On Tue, Sep 15, 2020 at 3:48 PM Stephen Mallette wrote:
>
> > I've set up the IP Clearance form for incubator here (website hasn't
> > generated the HTML yet I guess):
> >
> > https://svn.apache.org/repos/asf/incubator/public/trunk/content/ip-clearance/tinkerpop-gremlint.xml
> >
> > In the checklist of items there, there are a few items pertaining to the
> > code base itself:
> >
> > 1. Check and make sure that the files that have been donated have been
> > updated to reflect the new ASF copyright
> > 2. Check and make sure that for all items included with the distribution
> > that is not under the Apache license, we have the right to combine with
> > Apache-licensed code and redistribute.
> > 3. Check and make sure that all items depended upon by the project is
> > covered by one or more of the following approved licenses: Apache, BSD,
> > Artistic, MIT/X, MIT/W3C, MPL 1.1, or something with essentially the same
> > terms.
> >
> > For item 1 I assume that means the code base, in the state at which we
> > accept it, should have the ASF license header in it, with an appropriate
> > NOTICE file if necessary:
> >
> > https://www.apache.org/legal/src-headers.html
> >
> > For 2 and 3, I don't think we have any issues there but would need to
> > confirm.
> >
> > Øyvind, I believe you're on the list following along - could you please
> > comment on the above for us?
> >
> > On Tue, Sep 15, 2020 at 7:30 AM Stephen Mallette wrote:
> >
> > > As there haven't been any objections here, it sounds like we can go
> > > ahead with this process. I believe that we will need to go through the
> > > IP Clearance process in incubator:
> > >
> > > https://incubator.apache.org/ip-clearance/
> > >
> > > and engage Apache Infra about a gremlint.com domain transfer. And then
> > > of course we will need to figure out "how" we make it part of the code
> > > base (where it goes, how it fits in the release process, etc.) - my
> > > preference would be to see it come in on 3.4.x so that we can
> > > immediately have an official release of it, but we'll see how it goes.
> > > I suppose we will continue to use this thread for all this sort of
> > > discussion for now unless it gets too busy, in which case we can spawn
> > > off other threads as needed.
> > >
> > > On Thu, Sep 10, 2020 at 6:31 PM David Bechberger wrote:
> > >
> > >> I definitely agree that having this sort of tool freely available
> > >> would be very helpful to the community as a whole.
> > >>
> > >> I also would be able to help create the translator between GLVs and
> > >> text representations as this is something I and many others have
> > >> struggled with many times.
> > >>
> > >> Thanks,
> > >> Dave
> > >>
> > >> On Thu, Sep 10, 2020 at 2:14 PM Kelvin Lawrence wrote:
> > >>
> > >> > I really like the idea of having an Apache TinkerPop hosted linter
> > >> > and style guide "enforcer". I have spent many wasted hours hand
> > >> > formatting long Gremlin queries people have asked me to look at over
> > >> > the years and the latest version of Gremlint makes that so much
> > >> > easier. I also really like the idea of extending the tool in the
> > >> > direction of "Gremlin converter". I hear from a lot of user