[jira] [Commented] (TINKERPOP-2438) Provide a way for scripts to respect with() specification of timeout

2020-12-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/TINKERPOP-2438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17248172#comment-17248172
 ] 

ASF GitHub Bot commented on TINKERPOP-2438:
---

spmallette opened a new pull request #1371:
URL: https://github.com/apache/tinkerpop/pull/1371


   https://issues.apache.org/jira/browse/TINKERPOP-2438
   
   Even though TINKERPOP-2438 is already done this is a bit of a follow on to 
that change. The GremlinScriptChecker performs the same exact task as the 
GremlinAstChecker but it does so with regex parsing which should be faster than 
processing a full AST. I've left the GremlinASTChecker where it is because I 
wonder if there isn't yet some usage for more rigorous sort of checks we might 
come up with in the future or perhaps have checks that only work well with AST 
processing. It can be removed in the future if we like.
   
   VOTE +1



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Provide a way for scripts to respect with() specification of timeout
> 
>
> Key: TINKERPOP-2438
> URL: https://issues.apache.org/jira/browse/TINKERPOP-2438
> Project: TinkerPop
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.4.8
>Reporter: Stephen Mallette
>Assignee: Stephen Mallette
>Priority: Major
> Fix For: 3.5.0, 3.4.9
>
>
> This issue sorta relates to the Gremlin Console but I think it's generally a 
> server side problem as it could easily occur with HTTP or just scripts sent 
> over web sockets. Folks tend to see all the ways they can set timeouts and 
> then mix/match them. It only remains a problem with sending a script as 
> {{g.with("evalTimeout",100)}} because the server won't know that this value 
> was set until the script is passed to {{eval()}} but by then it's too late 
> because we would have already started the timeout countdown.  While most 
> users wouldn't send that as a bare script submission it's a common mistake in 
> the Gremlin Console and similar environments where it is not clear you are 
> working with a remote graph. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Creating pattern steps to codify best practices

2020-12-11 Thread Stephen Mallette
+1 to no new step. I haven't yet thought of a reason why this shouldn't
just be a variation on addV/E().

On Fri, Dec 11, 2020 at 1:03 PM David Bechberger 
wrote:

> I agree with Josh's description of what the upsertV() functionality is
> intended to be.
>
> I also fully support the simplification provided by the by() modulation
> that Stephen suggested, removing the second map.  I think that provides a
> much cleaner and easier to comprehend syntax.
>
> With that agreement, I think this does beg the question of if this should
> be a new step (upsertV()) or just an additional signature on the addV()
> step? I
>
> Stephen alluded to this on the dev list and after thinking about this a bit
> I think that I am favor of not adding a step and just adding a new
> signature to the existing step (if possible).  Thoughts?
>
>
> Dave
>
> On Wed, Dec 9, 2020 at 4:33 AM Stephen Mallette 
> wrote:
>
> > Josh, thanks for your thoughts - some responses inline:
> >
> > On Tue, Dec 8, 2020 at 10:16 PM Josh Perryman 
> > wrote:
> >
> > > I'll offer some thoughts. I'm seeing upsertV() as an idempotent
> > getOrCreate
> > > call which always returns a vertex with the label/property values
> > specified
> > > within the step. It's sort of a declarative pattern: "return this
> vertex
> > to
> > > me, find it if you can, create it if you must."
> > >
> >
> > I like this description - I've added it to the gist, though it's a bit at
> > odds with Dave's previous post, so we'll consider it a temporary addition
> > until he responds.
> >
> >
> > > On that account, I do like the simplification in 1. Repetition
> shouldn't
> > be
> > > necessary. In an ideal world, the engine should know the primary
> > > identifiers (name or id) and find/create the vertex based on them. Any
> > > other included values will be "trued up" as well. But this may be a
> > bridge
> > > too far for TinkerPop since knowing identifiers may require a specified
> > > schema. I'd prefer to omit the third input, but it might be necessary
> to
> > > keep it so that the second input can be for the matching use case.
> > >
> >
> > In my most recent post on gremlin-users I think I came up with a nice way
> > to get rid of the second Map. One Map that forms the full list of
> > properties for upserting is easier than partitioning two Maps that
> > essentially merge together. I imagine it's unlikely that application code
> > will have that separation naturally so users will have the added step of
> > trying to separate their data into searchable vs "just data". Getting us
> to
> > one Map argument will simplify APIs for us and reduce complexity to
> users.
> > Here is what I'd proposed for those not following over there:
> >
> > // match on name and age (or perhaps whatever the underlying graph system
> > thinks is best?)
> > g.upsertV('person', [name:'marko',age:29])
> >
> > // match on name only
> > g.upsertV('person', [name:'marko',age:29]).by('name')
> >
> > // explicitly match on name and age
> > g.upsertV('person', [name:'marko',age:29]).
> >   by('name').by('age')
> >
> > // match on id only
> > g.upsertV('person', [(T.id): 100, name:'marko',age:29]).by(T.id)
> >
> > // match on whatever the by(Traversal) predicate defines
> > g.upsertV('person', [name:'marko',age:29]).
> >   by(has('name', 'marko'))
> >
> > // match on id, then update age
> > g.upsertV('person', [(T.id): 100, name:'marko']).by(T.id).
> >   property('age',29)
> >
> > With this model, we get one Map argument that represents the complete
> > property set to be added/updated to the graph and the user can hint on
> what
> > key they wish to match on using by() where that sort of step modulation
> > should be a well understood and familiar concept in Gremlin at this
> point.
> >
> > So that means I think 2 should always match or update the additional
> > > values. Again, we're specifying the expected result and letting the
> > engine
> > > figure out best how to return that results and appropriately maintain
> > > state.
> > >
> >
> > I again like this description, but we'll see what Dave's thoughts are
> since
> > he's a bit behind on the threads at this point I think.
> >
> >
> > > I'm also presuming that anything not included as inputs to the
> upsertV()
> > > step are then to be handled by following steps. I'm hoping that is a
> > > sufficient approach for addressing the multi/meta property use cases
> > > brought up in 3.
> > >
> >
> > yeahit needs more thought. I spent more time thinking on
> > this issue yesterday than I have for all the previous posts combined and
> I
> > think it yielded something good in that revised syntax. It's going to
> take
> > more of that kind of elbow grease to dig into these lesser use cases to
> > make sure we aren't coding ourselves into corners.
> >
> >
> > > I do like the idea of using modulators (with(), by()) for more
> > > sophisticated usage and advanced use cases. Also, the streaming
> examples
> > > are quite elegant allowing for a 

Re: [DISCUSS] Creating pattern steps to codify best practices

2020-12-11 Thread David Bechberger
I agree with Josh's description of what the upsertV() functionality is
intended to be.

I also fully support the simplification provided by the by() modulation
that Stephen suggested, removing the second map.  I think that provides a
much cleaner and easier to comprehend syntax.

With that agreement, I think this does beg the question of if this should
be a new step (upsertV()) or just an additional signature on the addV()
step? I

Stephen alluded to this on the dev list and after thinking about this a bit
I think that I am favor of not adding a step and just adding a new
signature to the existing step (if possible).  Thoughts?


Dave

On Wed, Dec 9, 2020 at 4:33 AM Stephen Mallette 
wrote:

> Josh, thanks for your thoughts - some responses inline:
>
> On Tue, Dec 8, 2020 at 10:16 PM Josh Perryman 
> wrote:
>
> > I'll offer some thoughts. I'm seeing upsertV() as an idempotent
> getOrCreate
> > call which always returns a vertex with the label/property values
> specified
> > within the step. It's sort of a declarative pattern: "return this vertex
> to
> > me, find it if you can, create it if you must."
> >
>
> I like this description - I've added it to the gist, though it's a bit at
> odds with Dave's previous post, so we'll consider it a temporary addition
> until he responds.
>
>
> > On that account, I do like the simplification in 1. Repetition shouldn't
> be
> > necessary. In an ideal world, the engine should know the primary
> > identifiers (name or id) and find/create the vertex based on them. Any
> > other included values will be "trued up" as well. But this may be a
> bridge
> > too far for TinkerPop since knowing identifiers may require a specified
> > schema. I'd prefer to omit the third input, but it might be necessary to
> > keep it so that the second input can be for the matching use case.
> >
>
> In my most recent post on gremlin-users I think I came up with a nice way
> to get rid of the second Map. One Map that forms the full list of
> properties for upserting is easier than partitioning two Maps that
> essentially merge together. I imagine it's unlikely that application code
> will have that separation naturally so users will have the added step of
> trying to separate their data into searchable vs "just data". Getting us to
> one Map argument will simplify APIs for us and reduce complexity to users.
> Here is what I'd proposed for those not following over there:
>
> // match on name and age (or perhaps whatever the underlying graph system
> thinks is best?)
> g.upsertV('person', [name:'marko',age:29])
>
> // match on name only
> g.upsertV('person', [name:'marko',age:29]).by('name')
>
> // explicitly match on name and age
> g.upsertV('person', [name:'marko',age:29]).
>   by('name').by('age')
>
> // match on id only
> g.upsertV('person', [(T.id): 100, name:'marko',age:29]).by(T.id)
>
> // match on whatever the by(Traversal) predicate defines
> g.upsertV('person', [name:'marko',age:29]).
>   by(has('name', 'marko'))
>
> // match on id, then update age
> g.upsertV('person', [(T.id): 100, name:'marko']).by(T.id).
>   property('age',29)
>
> With this model, we get one Map argument that represents the complete
> property set to be added/updated to the graph and the user can hint on what
> key they wish to match on using by() where that sort of step modulation
> should be a well understood and familiar concept in Gremlin at this point.
>
> So that means I think 2 should always match or update the additional
> > values. Again, we're specifying the expected result and letting the
> engine
> > figure out best how to return that results and appropriately maintain
> > state.
> >
>
> I again like this description, but we'll see what Dave's thoughts are since
> he's a bit behind on the threads at this point I think.
>
>
> > I'm also presuming that anything not included as inputs to the upsertV()
> > step are then to be handled by following steps. I'm hoping that is a
> > sufficient approach for addressing the multi/meta property use cases
> > brought up in 3.
> >
>
> yeahit needs more thought. I spent more time thinking on
> this issue yesterday than I have for all the previous posts combined and I
> think it yielded something good in that revised syntax. It's going to take
> more of that kind of elbow grease to dig into these lesser use cases to
> make sure we aren't coding ourselves into corners.
>
>
> > I do like the idea of using modulators (with(), by()) for more
> > sophisticated usage and advanced use cases. Also, the streaming examples
> > are quite elegant allowing for a helpful separation of data and logic.
> >
>
> cool - hope you like the revised syntax I posted then. :)
>
>
> > That's my humble take. This is a very welcome addition to the language
> and
> > I appreciate the thoughtful & collaborative approach to the design
> > considerations.
> >
>
> Thanks again and please keep the thoughts coming. Lots of other interesting
> design discussions seem to be brewing.
>
>
> >
> 

Re: [DISCUSS] code freeze 3.4.9

2020-12-11 Thread Stephen Mallette
I've done all the post-release work. 3.4-dev is open for business again.
Thanks!

On Mon, Dec 7, 2020 at 7:37 AM Stephen Mallette 
wrote:

> Thanks for noticing that - fixed that last week with:
>
>
> https://github.com/apache/tinkerpop/commit/d51253ae515fe822fd7d5f4c8d7761f9ebad1704
>
>
>
> On Fri, Dec 4, 2020 at 4:29 AM  wrote:
>
>> I think I found a minor error in the upgrade docs: The link in the
>> section Per Request Options that should point to the Gremlin Drivers and
>> Variants section of the reference docs instead tries to find that section
>> in the upgrade docs. So, the link is broken.
>>
>> -Ursprüngliche Nachricht-
>> Von: Stephen Mallette 
>> Gesendet: Donnerstag, 3. Dezember 2020 14:37
>> An: dev@tinkerpop.apache.org
>> Betreff: Re: [DISCUSS] code freeze 3.4.9
>>
>> I've pushed the 3.4.9-SNAPSHOT documentation for those interested in
>> doing a review:
>>
>> https://tinkerpop.apache.org/docs/3.4.9-SNAPSHOT/
>>
>> The upgrade docs are especially full this release. Definitely the longest
>> list of improvements on 3.4.x since 3.4.0:
>>
>> https://tinkerpop.apache.org/docs/3.4.9-SNAPSHOT/upgrade/
>>
>>
>>
>> On Wed, Dec 2, 2020 at 4:18 PM Stephen Mallette 
>> wrote:
>>
>> > Following up on TINKERPOP-2438 - I think I've addressed the
>> > performance issue that was noted. It's not a perfect solution but it
>> > was easy/obvious and I feel safe enough to merge:
>> >
>> > https://github.com/apache/tinkerpop/pull/1368
>> >
>> > I've formed the pull request for review with the intention to merge it
>> > for 3.4.9. Running docker tests now - please let me know if there are
>> > any concerns.
>> >
>> > On Mon, Nov 30, 2020 at 5:12 PM Stephen Mallette
>> > 
>> > wrote:
>> >
>> >> As we roll into 3.4.9 code freeze week we can use this thread for any
>> >> issues or concerns that are found related to testing this week.
>> >>
>> >> To get the ball rolling, a regression has been noticed in performance
>> >> around low-latency scripts and seems to be related to the new
>> >> GremlinASTChecker. My earlier tests on this feature didn't note the
>> >> cumulative damage of the additional AST processing that would occur
>> >> on small/fast scripts where it is much harder to hide millisecond
>> >> range increments. I've reverted that functionality:
>> >>
>> >>
>> >> https://github.com/apache/tinkerpop/commit/daad70bec96d4a1d3a0579bee1
>> >> b73934d9293664
>> >>
>> >> and reopened the issue:
>> >>
>> >> https://issues.apache.org/jira/browse/TINKERPOP-2438
>> >>
>> >> I opted to not revert the entire body of work because I think the
>> >> idea is still valid in general but it needs more attention paid to
>> >> these particular queries. I also don't think it's a reason to pause
>> >> 3.4.9 release over. This release is too full of features to let this
>> issue hold things up.
>> >>
>> >> I've published a 3.4.9-SNAPSHOT if you're like to test and I will
>> >> post back when freshy docs are published.
>> >>
>> >>
>>
>>


[jira] [Commented] (TINKERPOP-2490) RangeGlobalStep touches next traverser when high limit is already hit

2020-12-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/TINKERPOP-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17247744#comment-17247744
 ] 

ASF GitHub Bot commented on TINKERPOP-2490:
---

junshiguo opened a new pull request #1370:
URL: https://github.com/apache/tinkerpop/pull/1370


   https://issues.apache.org/jira/browse/TINKERPOP-2490
   
   In FilterStep, the processNextStart() method will first retrieve next 
traverser and then apply filtering logic. But for RangleGlobalStep, if high 
limit is already hit, there will be no need to get next traverser.
   
   e.g. g.V().limit(1). This query will touch 2 vertices although only 1 vertex 
will be returned.
   
   This PR added high limit check before retrieving next traverser for 
filtering. Functionality is not affected, but we can expect better performance 
if getting next traverser is heavy.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> RangeGlobalStep touches next traverser when high limit is already hit
> -
>
> Key: TINKERPOP-2490
> URL: https://issues.apache.org/jira/browse/TINKERPOP-2490
> Project: TinkerPop
>  Issue Type: Bug
>  Components: process
>Affects Versions: 3.4.8
>Reporter: Guo Junshi
>Priority: Major
>
> In FilterStep, the processNextStart() method will first retrieve next 
> traverser and then apply filtering logic. But for RangleGlobalStep, if high 
> limit is already hit, there will be no need to get next traverser.
> {code:java}
> @Override
> protected Traverser.Admin processNextStart() {
> while (true) {
> final Traverser.Admin traverser = this.starts.next();
> if (this.filter(traverser))
> return traverser;
> }
> }
> {code}
> An example would be limit step: g.V().limit(1). This query will touch 2 
> vertices although only 1 vertex will be returned.
> This extra data loading will cause performance defects if DB data loading is 
> involved. It is not a functionality bug, but for better performance, we'd 
> better check high range limit first before touching next traversal.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (TINKERPOP-2490) RangeGlobalStep touches next traverser when high limit is already hit

2020-12-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/TINKERPOP-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17247743#comment-17247743
 ] 

ASF GitHub Bot commented on TINKERPOP-2490:
---

junshiguo closed pull request #1370:
URL: https://github.com/apache/tinkerpop/pull/1370


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> RangeGlobalStep touches next traverser when high limit is already hit
> -
>
> Key: TINKERPOP-2490
> URL: https://issues.apache.org/jira/browse/TINKERPOP-2490
> Project: TinkerPop
>  Issue Type: Bug
>  Components: process
>Affects Versions: 3.4.8
>Reporter: Guo Junshi
>Priority: Major
>
> In FilterStep, the processNextStart() method will first retrieve next 
> traverser and then apply filtering logic. But for RangleGlobalStep, if high 
> limit is already hit, there will be no need to get next traverser.
> {code:java}
> @Override
> protected Traverser.Admin processNextStart() {
> while (true) {
> final Traverser.Admin traverser = this.starts.next();
> if (this.filter(traverser))
> return traverser;
> }
> }
> {code}
> An example would be limit step: g.V().limit(1). This query will touch 2 
> vertices although only 1 vertex will be returned.
> This extra data loading will cause performance defects if DB data loading is 
> involved. It is not a functionality bug, but for better performance, we'd 
> better check high range limit first before touching next traversal.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)