[jira] [Commented] (TINKERPOP-1800) Remote connect issue

2017-10-18 Thread Loveneet kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/TINKERPOP-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16210480#comment-16210480
 ] 

Loveneet kumar commented on TINKERPOP-1800:
---

Thank you. It was my mistake; I will read the documentation more carefully next time.

> Remote connect issue
> 
>
> Key: TINKERPOP-1800
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1800
> Project: TinkerPop
>  Issue Type: Bug
>  Components: groovy
>Affects Versions: 3.3.0
> Environment: WINDOWS 10
>Reporter: Loveneet kumar
>
> In a Windows 10 environment I am facing this issue:
> {color:#59afe1}gremlin> remote connect tinkerpop.server conf/remote.yaml
> groovysh_parse: 2: expecting EOF, found 'conf' @ line 2, column 33.
>remote connect tinkerpop.server conf/remote.yaml
>^
> 1 error
> Type ':help' or ':h' for help.
> Display stack trace? [yN]
> gremlin> :d
> Buffer is empty
> gremlin> :{color}
> After changing the slash direction:
> {color:#59afe1}gremlin> remote connect tinkerpop.server conf\remote.yaml
> groovysh_parse: 2: unexpected char: '\' @ line 2, column 37.
>remote connect tinkerpop.server conf\remote.yaml{color}
>^
> 1 error



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] tinkerpop pull request #735: TINKERPOP-1803: inject() doesn't re-attach with...

2017-10-18 Thread okram
GitHub user okram opened a pull request:

https://github.com/apache/tinkerpop/pull/735

TINKERPOP-1803: inject() doesn't re-attach with remote traversals

https://issues.apache.org/jira/browse/TINKERPOP-1803

Fixed an "attachment" bug in `InjectStep` with a solution generalized to 
`StartStep`.

VOTE +1

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apache/tinkerpop TINKERPOP-1803

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/tinkerpop/pull/735.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #735


commit 79b621c9a0ddc2d96f951c54ee3f1db3c8490d4c
Author: Marko A. Rodriguez 
Date:   2017-10-18T22:45:16Z

Fixed an attachment bug in `InjectStep` with a solution generalized to `StartStep`.




---


[jira] [Commented] (TINKERPOP-1803) inject() doesn't re-attach with remote traversals

2017-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TINKERPOP-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16210246#comment-16210246
 ] 

ASF GitHub Bot commented on TINKERPOP-1803:
---

GitHub user okram opened a pull request:

https://github.com/apache/tinkerpop/pull/735

TINKERPOP-1803: inject() doesn't re-attach with remote traversals

https://issues.apache.org/jira/browse/TINKERPOP-1803

Fixed an "attachment" bug in `InjectStep` with a solution generalized to 
`StartStep`.

VOTE +1

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apache/tinkerpop TINKERPOP-1803

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/tinkerpop/pull/735.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #735


commit 79b621c9a0ddc2d96f951c54ee3f1db3c8490d4c
Author: Marko A. Rodriguez 
Date:   2017-10-18T22:45:16Z

Fixed an attachment bug in `InjectStep` with a solution generalized to `StartStep`.




> inject() doesn't re-attach with remote traversals
> -
>
> Key: TINKERPOP-1803
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1803
> Project: TinkerPop
>  Issue Type: Bug
>  Components: process
>Affects Versions: 3.2.6
>Reporter: stephen mallette
>Assignee: Marko A. Rodriguez
>Priority: Critical
>
> In the console we get this:
> {code}
> gremlin> v2 = g.V(2).next()
> ==>v[2]
> gremlin> g.V(1).out().inject(v2).values("name")
> ==>vadas
> ==>lop
> ==>vadas
> ==>josh
> {code}
> From gremlin-python we can see:
> {code}
> >>> v2 = g.V(2).next()
> >>> g.V(1).out().inject(v2).values("name").toList()
> [u'lop', u'vadas', u'josh']
> {code}
> and using {{withRemote()}} in Java:
> {code}
> gremlin> v2 = g.V(2).next()
> ==>v[2]
> gremlin> g.V(1).out().inject(v2).values("name")
> ==>lop
> ==>vadas
> ==>josh
> {code}
> Since {{inject()}} doesn't re-attach the vertex, when {{values()}} gets called 
> it acts on a reference vertex with no properties and returns nothing.





[jira] [Assigned] (TINKERPOP-1803) inject() doesn't re-attach with remote traversals

2017-10-18 Thread Marko A. Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/TINKERPOP-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marko A. Rodriguez reassigned TINKERPOP-1803:
-

Assignee: Marko A. Rodriguez

> inject() doesn't re-attach with remote traversals
> -
>
> Key: TINKERPOP-1803
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1803
> Project: TinkerPop
>  Issue Type: Bug
>  Components: process
>Affects Versions: 3.2.6
>Reporter: stephen mallette
>Assignee: Marko A. Rodriguez
>Priority: Critical
>
> In the console we get this:
> {code}
> gremlin> v2 = g.V(2).next()
> ==>v[2]
> gremlin> g.V(1).out().inject(v2).values("name")
> ==>vadas
> ==>lop
> ==>vadas
> ==>josh
> {code}
> From gremlin-python we can see:
> {code}
> >>> v2 = g.V(2).next()
> >>> g.V(1).out().inject(v2).values("name").toList()
> [u'lop', u'vadas', u'josh']
> {code}
> and using {{withRemote()}} in Java:
> {code}
> gremlin> v2 = g.V(2).next()
> ==>v[2]
> gremlin> g.V(1).out().inject(v2).values("name")
> ==>lop
> ==>vadas
> ==>josh
> {code}
> Since {{inject()}} doesn't re-attach the vertex, when {{values()}} gets called 
> it acts on a reference vertex with no properties and returns nothing.





[jira] [Closed] (TINKERPOP-1797) LambdaRestrictionStrategy and LambdaMapStep in `by()`-modulation.

2017-10-18 Thread Marko A. Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/TINKERPOP-1797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marko A. Rodriguez closed TINKERPOP-1797.
-
   Resolution: Fixed
Fix Version/s: 3.3.1
   3.2.7

> LambdaRestrictionStrategy and LambdaMapStep in `by()`-modulation.
> -
>
> Key: TINKERPOP-1797
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1797
> Project: TinkerPop
>  Issue Type: Bug
>  Components: process
>Affects Versions: 3.2.6
>Reporter: Marko A. Rodriguez
>Assignee: Marko A. Rodriguez
> Fix For: 3.2.7, 3.3.1
>
>
> {code}
> gremlin> g.V().groupCount().by(label).order(local).by(values)
> The provided step contains a lambda comparator: 
> OrderLocalStep([[[LambdaMapStep(values)@[~gremlin.incidentToAdjacent, 
> ~gremlin.pathRetraction]], incr]])
> Type ':help' or ':h' for help.
> Display stack trace? [yN]
> {code}





[jira] [Commented] (TINKERPOP-1797) LambdaRestrictionStrategy and LambdaMapStep in `by()`-modulation.

2017-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TINKERPOP-1797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16209852#comment-16209852
 ] 

ASF GitHub Bot commented on TINKERPOP-1797:
---

Github user asfgit closed the pull request at:

https://github.com/apache/tinkerpop/pull/730


> LambdaRestrictionStrategy and LambdaMapStep in `by()`-modulation.
> -
>
> Key: TINKERPOP-1797
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1797
> Project: TinkerPop
>  Issue Type: Bug
>  Components: process
>Affects Versions: 3.2.6
>Reporter: Marko A. Rodriguez
>Assignee: Marko A. Rodriguez
>
> {code}
> gremlin> g.V().groupCount().by(label).order(local).by(values)
> The provided step contains a lambda comparator: 
> OrderLocalStep([[[LambdaMapStep(values)@[~gremlin.incidentToAdjacent, 
> ~gremlin.pathRetraction]], incr]])
> Type ':help' or ':h' for help.
> Display stack trace? [yN]
> {code}





[GitHub] tinkerpop pull request #730: TINKERPOP-1797: LambdaRestrictionStrategy and L...

2017-10-18 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/tinkerpop/pull/730


---


[jira] [Commented] (TINKERPOP-1797) LambdaRestrictionStrategy and LambdaMapStep in `by()`-modulation.

2017-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TINKERPOP-1797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16209845#comment-16209845
 ] 

ASF GitHub Bot commented on TINKERPOP-1797:
---

Github user spmallette commented on the issue:

https://github.com/apache/tinkerpop/pull/730
  
All tests pass with `docker/build.sh -t -n -i`

VOTE +1


> LambdaRestrictionStrategy and LambdaMapStep in `by()`-modulation.
> -
>
> Key: TINKERPOP-1797
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1797
> Project: TinkerPop
>  Issue Type: Bug
>  Components: process
>Affects Versions: 3.2.6
>Reporter: Marko A. Rodriguez
>Assignee: Marko A. Rodriguez
>
> {code}
> gremlin> g.V().groupCount().by(label).order(local).by(values)
> The provided step contains a lambda comparator: 
> OrderLocalStep([[[LambdaMapStep(values)@[~gremlin.incidentToAdjacent, 
> ~gremlin.pathRetraction]], incr]])
> Type ':help' or ':h' for help.
> Display stack trace? [yN]
> {code}





[GitHub] tinkerpop issue #730: TINKERPOP-1797: LambdaRestrictionStrategy and LambdaMa...

2017-10-18 Thread spmallette
Github user spmallette commented on the issue:

https://github.com/apache/tinkerpop/pull/730
  
All tests pass with `docker/build.sh -t -n -i`

VOTE +1


---


[jira] [Created] (TINKERPOP-1803) inject() doesn't re-attach with remote traversals

2017-10-18 Thread stephen mallette (JIRA)
stephen mallette created TINKERPOP-1803:
---

 Summary: inject() doesn't re-attach with remote traversals
 Key: TINKERPOP-1803
 URL: https://issues.apache.org/jira/browse/TINKERPOP-1803
 Project: TinkerPop
  Issue Type: Bug
  Components: process
Affects Versions: 3.2.6
Reporter: stephen mallette
Priority: Critical


In the console we get this:

{code}
gremlin> v2 = g.V(2).next()
==>v[2]
gremlin> g.V(1).out().inject(v2).values("name")
==>vadas
==>lop
==>vadas
==>josh
{code}

From gremlin-python we can see:

{code}
>>> v2 = g.V(2).next()
>>> g.V(1).out().inject(v2).values("name").toList()
[u'lop', u'vadas', u'josh']
{code}

and using {{withRemote()}} in Java:

{code}
gremlin> v2 = g.V(2).next()
==>v[2]
gremlin> g.V(1).out().inject(v2).values("name")
==>lop
==>vadas
==>josh
{code}

Since {{inject()}} doesn't re-attach the vertex, when {{values()}} gets called 
it acts on a reference vertex with no properties and returns nothing.
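
The failure mode described above can be illustrated with a toy model. The sketch below does not use TinkerPop classes; `ToyVertex`, `ToyGraph`, `reference` and `attach` are hypothetical names invented for this illustration. It assumes only what the description states: a reference element carries an id but no properties, so a property lookup returns nothing until the reference is resolved against the graph again.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Toy model only: none of these are TinkerPop classes.
class ReattachDemo {
    static final class ToyVertex {
        final int id;
        final Map<String, String> properties;

        ToyVertex(int id, Map<String, String> properties) {
            this.id = id;
            this.properties = properties;
        }

        // Returns the property value, or empty when the vertex carries no such property.
        Optional<String> value(String key) {
            return Optional.ofNullable(properties.get(key));
        }
    }

    static final class ToyGraph {
        private final Map<Integer, ToyVertex> vertices = new HashMap<>();

        void add(int id, String name) {
            Map<String, String> props = new HashMap<>();
            props.put("name", name);
            vertices.put(id, new ToyVertex(id, props));
        }

        // A "reference" keeps the id but drops all properties.
        static ToyVertex reference(int id) {
            return new ToyVertex(id, new HashMap<>());
        }

        // Re-attaching swaps a reference for the graph's own copy of the vertex.
        ToyVertex attach(ToyVertex ref) {
            ToyVertex attached = vertices.get(ref.id);
            if (attached == null) {
                throw new IllegalStateException("no vertex with id " + ref.id);
            }
            return attached;
        }
    }

    public static void main(String[] args) {
        ToyGraph graph = new ToyGraph();
        graph.add(2, "vadas");
        ToyVertex ref = ToyGraph.reference(2);
        System.out.println(ref.value("name").isPresent());         // false
        System.out.println(graph.attach(ref).value("name").get()); // vadas
    }
}
```

In this toy the caller attaches explicitly; the issue is about a step doing the equivalent on the caller's behalf before downstream steps read properties.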







[jira] [Created] (TINKERPOP-1802) hasId() fails for empty collections

2017-10-18 Thread Daniel Kuppitz (JIRA)
Daniel Kuppitz created TINKERPOP-1802:
-

 Summary: hasId() fails for empty collections
 Key: TINKERPOP-1802
 URL: https://issues.apache.org/jira/browse/TINKERPOP-1802
 Project: TinkerPop
  Issue Type: Bug
  Components: process
Affects Versions: 3.2.6, 3.3.0
Reporter: Daniel Kuppitz
Assignee: Daniel Kuppitz


{noformat}
gremlin> g.V().hasId(within([]))
0
Type ':help' or ':h' for help.
Display stack trace? [yN]
{noformat}





[jira] [Commented] (TINKERPOP-1752) Gremlin.Net: Generate completely type-safe methods

2017-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TINKERPOP-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16209601#comment-16209601
 ] 

ASF GitHub Bot commented on TINKERPOP-1752:
---

Github user FlorianHockmann commented on the issue:

https://github.com/apache/tinkerpop/pull/712
  
I finally found some time to work on this again and fixed the issues 
mentioned by @jorgebay. However, I left the `Bindings` implementation unchanged 
despite the problems with concurrent access as it seems to be still the best 
solution. (I'm of course open for suggestions on how this can be improved.)

I also noticed that my changes broke the `WithoutStrategies` source step as 
that now correctly expects to get the `Types` of the Strategies to exclude for 
which Gremlin.Net had no serializer. So I added a serializer that works 
basically like the respective one in gremlin-python as it also simply creates 
an object of the `Type` and then this object will be serialized as before. A 
unit test ensures that all Strategies have a parameterless constructor as we 
can't serialize their `Type` otherwise.
Honestly, I was a bit surprised that I had to serialize `Types` by 
serializing an empty object of that `Type` although the IO docs show that the 
GraphSON type `g:Class` [can be serialized like 
this](http://tinkerpop.apache.org/docs/3.3.0/dev/io/#_class):
```json
{
  "@type" : "g:Class",
  "@value" : "java.io.File"
}
```
but the Gremlin Server couldn't deserialize the Strategy class when I 
serialized it like this. So is the documentation wrong here? @spmallette: Could 
you clarify my confusion here?
Also the IO docs don't mention how `TraversalStrategies` are serialized in 
general. Should we add that?

The build is currently failing, but that seems to be caused by 
travis-ci/travis-ci#8607. I built Gremlin.Net locally and executed the tests 
without any problems.

BTW: Would it make sense to create a separate pull request for `master` or 
can we simply execute `generate.groovy` later when this is merged from `tp32` 
into `master`?


> Gremlin.Net: Generate completely type-safe methods
> --
>
> Key: TINKERPOP-1752
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1752
> Project: TinkerPop
>  Issue Type: Improvement
>  Components: dotnet
>Affects Versions: 3.2.5
>Reporter: Florian Hockmann
>Priority: Minor
>
> Currently the generated traversal methods in Gremlin.Net take {{params 
> object[] args}} as an argument which allows the user to provide an arbitrary 
> number of arguments with any type. While this makes the generation rather 
> simple, it doesn't tell the user which arguments are actually valid so users 
> can submit completely invalid traversals like:
> {code}
> g.V(1).AddE(1234, "invalidArgument2").Next()
> {code}
> Type-safe methods could also use the original argument names to tell users 
> something about what kind of values the methods expect. Consider for example 
> the following method signatures for the C# step {{AddE}} that are basically a 
> 1:1 representation of the original Java {{addE}} step:
> {code}
> public GraphTraversal< S , Edge > AddE (Direction direction, string 
> firstVertexKeyOrEdgeLabel, string edgeLabelOrSecondVertexKey, params object[] 
> propertyKeyValues);
> public GraphTraversal< S , Edge > AddE (string edgeLabel);
> {code}
> Implementing this should make TINKERPOP-1725 obsolete and also resolve 
> TINKERPOP-1751.
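
The difference between the two API styles can be sketched in Java (used here only for illustration; the issue itself concerns the generated C# methods). `UntypedTraversal` and `TypedTraversal` are hypothetical classes, not Gremlin.Net or TinkerPop types, and the local run-time check in the untyped variant is merely a stand-in for an invalid traversal being rejected later.

```java
class TypeSafetyDemo {
    // Accepts anything; invalid arguments are only detected at run time.
    static final class UntypedTraversal {
        String addE(Object... args) {
            if (args.length != 1 || !(args[0] instanceof String)) {
                throw new IllegalArgumentException("addE expects a single String edge label");
            }
            return "addE(" + args[0] + ")";
        }
    }

    // The signature documents the expected argument; addE(1234, "x") would not compile.
    static final class TypedTraversal {
        String addE(String edgeLabel) {
            return "addE(" + edgeLabel + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(new TypedTraversal().addE("knows")); // addE(knows)
        try {
            new UntypedTraversal().addE(1234, "invalidArgument2");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected at run time: " + e.getMessage());
        }
    }
}
```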





[GitHub] tinkerpop issue #712: TINKERPOP-1752: Gremlin.Net: Generate completely type-...

2017-10-18 Thread FlorianHockmann
Github user FlorianHockmann commented on the issue:

https://github.com/apache/tinkerpop/pull/712
  
I finally found some time to work on this again and fixed the issues 
mentioned by @jorgebay. However, I left the `Bindings` implementation unchanged 
despite the problems with concurrent access as it seems to be still the best 
solution. (I'm of course open for suggestions on how this can be improved.)

I also noticed that my changes broke the `WithoutStrategies` source step as 
that now correctly expects to get the `Types` of the Strategies to exclude for 
which Gremlin.Net had no serializer. So I added a serializer that works 
basically like the respective one in gremlin-python as it also simply creates 
an object of the `Type` and then this object will be serialized as before. A 
unit test ensures that all Strategies have a parameterless constructor as we 
can't serialize their `Type` otherwise.
Honestly, I was a bit surprised that I had to serialize `Types` by 
serializing an empty object of that `Type` although the IO docs show that the 
GraphSON type `g:Class` [can be serialized like 
this](http://tinkerpop.apache.org/docs/3.3.0/dev/io/#_class):
```json
{
  "@type" : "g:Class",
  "@value" : "java.io.File"
}
```
but the Gremlin Server couldn't deserialize the Strategy class when I 
serialized it like this. So is the documentation wrong here? @spmallette: Could 
you clarify my confusion here?
Also the IO docs don't mention how `TraversalStrategies` are serialized in 
general. Should we add that?

The build is currently failing, but that seems to be caused by 
travis-ci/travis-ci#8607. I built Gremlin.Net locally and executed the tests 
without any problems.

BTW: Would it make sense to create a separate pull request for `master` or 
can we simply execute `generate.groovy` later when this is merged from `tp32` 
into `master`?


---


[jira] [Updated] (TINKERPOP-1801) OLAP profile() step return incorrect timing

2017-10-18 Thread stephen mallette (JIRA)

 [ 
https://issues.apache.org/jira/browse/TINKERPOP-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stephen mallette updated TINKERPOP-1801:

Component/s: hadoop

>  OLAP profile() step return incorrect timing
> 
>
> Key: TINKERPOP-1801
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1801
> Project: TinkerPop
>  Issue Type: Bug
>  Components: hadoop
>Affects Versions: 3.3.0, 3.2.6
>Reporter: Artem Aliev
>
> The graph ProfileStep measures the time spent in next()/hasNext() calls, expecting 
> those calls to recurse into the following steps.
> GraphComputer, however, uses message passing (RDD joins).
> So next() does not recursively call the following steps; a message is generated instead, and 
> most of the time is taken by message passing (the RDD join). 
> Thus, on a graph computer the time between ProfileSteps should be measured, not 
> the time inside them.
> Another approach is to get Spark statistics with a SparkListener and add the 
> Spark stage timings to the profiler metrics. That would work only for Spark but 
> would give a better representation of step costs.
> The simple fix is to measure the time between OLAP iterations and add it to the 
> profiler step.
> This will not take the computer setup time into account, but it will be precise 
> enough for long-running queries.
> To reproduce:
> TinkerPop 3.2.6, Gremlin console:
> {code}
> plugin activated: tinkerpop.server
> plugin activated: tinkerpop.utilities
> plugin activated: tinkerpop.spark
> plugin activated: tinkerpop.tinkergraph
> gremlin> graph = 
> GraphFactory.open('conf/hadoop/hadoop-grateful-gryo.properties')
> gremlin> g = graph.traversal().withComputer(SparkGraphComputer)
> ==>graphtraversalsource[hadoopgraph[gryoinputformat->gryooutputformat], 
> sparkgraphcomputer]
> gremlin> g.V().out().out().count().profile()
> ==>Traversal Metrics
> Step                             Count  Traversers   Time (ms)    % Dur
> =======================================================================
> GraphStep(vertex,[])               808         808       2.025    18.35
> VertexStep(OUT,vertex)            8049         562       4.430    40.14
> VertexStep(OUT,edge)            327370        7551       4.581    41.50
> CountGlobalStep                      1           1       0.001     0.01
> >TOTAL                               -           -      11.038        -
> gremlin> clock(1){g.V().out().out().count().next() }
> ==>3421.92758
> gremlin>
> {code}
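
The simple fix proposed in the description above, measuring the time between OLAP iterations and crediting it to the profiled step, can be sketched as follows. This is a minimal illustration and not the code from the pull request; `IterationTimer`, `onIterationEnd` and the injectable clock are hypothetical names chosen for the example.

```java
import java.util.function.LongSupplier;

// Hypothetical helper, not a TinkerPop class: accumulates the wall-clock time
// that elapses between successive graph-computer iterations.
class IterationTimer {
    private final LongSupplier nanoClock; // injectable so the sketch can be tested
    private long lastMark;
    private long totalNanos;

    IterationTimer(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
        this.lastMark = nanoClock.getAsLong();
    }

    // Call once at the end of every iteration: the time since the previous
    // mark (step work plus message passing) is added to the running total.
    void onIterationEnd() {
        long now = nanoClock.getAsLong();
        totalNanos += now - lastMark;
        lastMark = now;
    }

    double totalMillis() {
        return totalNanos / 1_000_000.0;
    }

    public static void main(String[] args) throws InterruptedException {
        IterationTimer timer = new IterationTimer(System::nanoTime);
        for (int i = 0; i < 3; i++) {
            Thread.sleep(10); // stand-in for one iteration's work
            timer.onIterationEnd();
        }
        System.out.println("measured ms: " + timer.totalMillis());
    }
}
```

Because the timer only starts when it is constructed, anything that happens before the first iteration (such as computer setup) is not counted, which matches the limitation noted in the description.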





[jira] [Updated] (TINKERPOP-1800) Remote connect issue

2017-10-18 Thread stephen mallette (JIRA)

 [ 
https://issues.apache.org/jira/browse/TINKERPOP-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stephen mallette updated TINKERPOP-1800:

Fix Version/s: (was: 3.3.0)

> Remote connect issue
> 
>
> Key: TINKERPOP-1800
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1800
> Project: TinkerPop
>  Issue Type: Bug
>  Components: groovy
>Affects Versions: 3.3.0
> Environment: WINDOWS 10
>Reporter: Loveneet kumar
>
> In a Windows 10 environment I am facing this issue:
> {color:#59afe1}gremlin> remote connect tinkerpop.server conf/remote.yaml
> groovysh_parse: 2: expecting EOF, found 'conf' @ line 2, column 33.
>remote connect tinkerpop.server conf/remote.yaml
>^
> 1 error
> Type ':help' or ':h' for help.
> Display stack trace? [yN]
> gremlin> :d
> Buffer is empty
> gremlin> :{color}
> After changing the slash direction:
> {color:#59afe1}gremlin> remote connect tinkerpop.server conf\remote.yaml
> groovysh_parse: 2: unexpected char: '\' @ line 2, column 37.
>remote connect tinkerpop.server conf\remote.yaml{color}
>^
> 1 error





[jira] [Commented] (TINKERPOP-1801) OLAP profile() step return incorrect timing

2017-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TINKERPOP-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16209481#comment-16209481
 ] 

ASF GitHub Bot commented on TINKERPOP-1801:
---

Github user okram commented on the issue:

https://github.com/apache/tinkerpop/pull/734
  
This is a nice update @artem-aliev because it doesn't change API and it is 
general for all `GraphComputer` implementations. Great! A couple things please 
for a solid VOTE.

1. Please update the `CHANGELOG.asciidoc` with the change you made.
2. In this PR discussion, please provide a `CUT/PASTE` of what the new 
metrics `toString()` looks like so people can judge its merits.

Thank you.


>  OLAP profile() step return incorrect timing
> 
>
> Key: TINKERPOP-1801
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1801
> Project: TinkerPop
>  Issue Type: Bug
>Affects Versions: 3.3.0, 3.2.6
>Reporter: Artem Aliev
>
> The graph ProfileStep measures the time spent in next()/hasNext() calls, expecting 
> those calls to recurse into the following steps.
> GraphComputer, however, uses message passing (RDD joins).
> So next() does not recursively call the following steps; a message is generated instead, and 
> most of the time is taken by message passing (the RDD join). 
> Thus, on a graph computer the time between ProfileSteps should be measured, not 
> the time inside them.
> Another approach is to get Spark statistics with a SparkListener and add the 
> Spark stage timings to the profiler metrics. That would work only for Spark but 
> would give a better representation of step costs.
> The simple fix is to measure the time between OLAP iterations and add it to the 
> profiler step.
> This will not take the computer setup time into account, but it will be precise 
> enough for long-running queries.
> To reproduce:
> TinkerPop 3.2.6, Gremlin console:
> {code}
> plugin activated: tinkerpop.server
> plugin activated: tinkerpop.utilities
> plugin activated: tinkerpop.spark
> plugin activated: tinkerpop.tinkergraph
> gremlin> graph = 
> GraphFactory.open('conf/hadoop/hadoop-grateful-gryo.properties')
> gremlin> g = graph.traversal().withComputer(SparkGraphComputer)
> ==>graphtraversalsource[hadoopgraph[gryoinputformat->gryooutputformat], 
> sparkgraphcomputer]
> gremlin> g.V().out().out().count().profile()
> ==>Traversal Metrics
> Step                             Count  Traversers   Time (ms)    % Dur
> =======================================================================
> GraphStep(vertex,[])               808         808       2.025    18.35
> VertexStep(OUT,vertex)            8049         562       4.430    40.14
> VertexStep(OUT,edge)            327370        7551       4.581    41.50
> CountGlobalStep                      1           1       0.001     0.01
> >TOTAL                               -           -      11.038        -
> gremlin> clock(1){g.V().out().out().count().next() }
> ==>3421.92758
> gremlin>
> {code}





[GitHub] tinkerpop issue #734: TINKERPOP-1801: fix profile() timing in OLAP by adding...

2017-10-18 Thread okram
Github user okram commented on the issue:

https://github.com/apache/tinkerpop/pull/734
  
This is a nice update @artem-aliev because it doesn't change API and it is 
general for all `GraphComputer` implementations. Great! A couple things please 
for a solid VOTE.

1. Please update the `CHANGELOG.asciidoc` with the change you made.
2. In this PR discussion, please provide a `CUT/PASTE` of what the new 
metrics `toString()` looks like so people can judge its merits.

Thank you.


---


[jira] [Commented] (TINKERPOP-1786) Recipe and missing manifest items for Spark on Yarn

2017-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TINKERPOP-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16209465#comment-16209465
 ] 

ASF GitHub Bot commented on TINKERPOP-1786:
---

Github user vtslab commented on the issue:

https://github.com/apache/tinkerpop/pull/721
  
I am fine with the PR now. Build server needs a check, though.


> Recipe and missing manifest items for Spark on Yarn
> ---
>
> Key: TINKERPOP-1786
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1786
> Project: TinkerPop
>  Issue Type: Improvement
>  Components: hadoop
>Affects Versions: 3.3.0, 3.1.8, 3.2.6
> Environment: gremlin-console
>Reporter: Marc de Lignie
>Priority: Minor
> Fix For: 3.2.7, 3.3.1
>
>
> Thorough documentation for running OLAP queries on Spark on Yarn has been 
> missing, keeping some users from getting the benefits of this nice feature of 
> the Tinkerpop stack and resulting in a significant number of questions on the 
> gremlin users list.





[GitHub] tinkerpop issue #721: TINKERPOP-1786 Recipe and missing manifest items for S...

2017-10-18 Thread vtslab
Github user vtslab commented on the issue:

https://github.com/apache/tinkerpop/pull/721
  
I am fine with the PR now. Build server needs a check, though.


---


Re: Notes on TraverserSet and Sqlg optimizations

2017-10-18 Thread pieter gmail
Yes, the hashCode() and equals() implementations are correct. They are, however, slightly 
heavier operations than in TinkerGraph, as the id of a Sqlg Element is a more 
complex object holding the label and its id.


I should have mentioned that in Sqlg the traverser is always a 
B_LP_O_P_S_SE_SL_Traverser. As Sqlg returns multiple VertexSteps in one 
go, I use the path information to reconstruct the JDBC ResultSet from the 
db. This makes the hashCode() and equals() operations heavier, as they are 
called on B_LP_O_P_S_SE_SL_Traverser, which calls hashCode() and equals() 
on Path, and those in turn are non-trivial operations.
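
To illustrate why hashing a path-tracking traverser costs more, here is a toy sketch. It is not TinkerPop's B_LP_O_P_S_SE_SL_Traverser; `PathTraverser` and `CountingElement` are hypothetical classes invented for this example. The only assumption is that the traverser's hashCode() folds in the hash of every element on its path, so the work grows with the path length.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

class PathHashDemo {
    // Counts how often an element hash is computed.
    static int hashCalls = 0;

    // Stand-in for an element whose id is a composite of label and id.
    static final class CountingElement {
        final String label;
        final long id;

        CountingElement(String label, long id) {
            this.label = label;
            this.id = id;
        }

        @Override
        public int hashCode() {
            hashCalls++;
            return Objects.hash(label, id);
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof CountingElement)) return false;
            CountingElement other = (CountingElement) o;
            return id == other.id && label.equals(other.label);
        }
    }

    // Stand-in for a traverser that carries its full path.
    static final class PathTraverser {
        final List<CountingElement> path;

        PathTraverser(List<CountingElement> path) {
            this.path = path;
        }

        @Override
        public int hashCode() {
            return path.hashCode(); // List.hashCode() visits every element
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof PathTraverser && path.equals(((PathTraverser) o).path);
        }
    }

    // Number of element hashes needed to hash one traverser with the given path length.
    static int hashCallsFor(int pathLength) {
        List<CountingElement> path = new ArrayList<>();
        for (int i = 0; i < pathLength; i++) {
            path.add(new CountingElement("A", i));
        }
        hashCalls = 0;
        new PathTraverser(path).hashCode();
        return hashCalls;
    }

    public static void main(String[] args) {
        System.out.println(hashCallsFor(1)); // 1
        System.out.println(hashCallsFor(5)); // 5
    }
}
```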


Cheers
Pieter



On 17/10/2017 23:58, Marko Rodriguez wrote:

…do your vertices implement hashCode() and equals() “correctly” ?

Marko.




On Oct 17, 2017, at 2:40 PM, Stephen Mallette  wrote:


So if I understand correctly the map is only needed for bulking so quite

often is not needed.

afaik, it is only used for bulking though it's hard to characterize how
often it is used - i suppose it all depends on the types of traversals you
write and the nature of the data being traversed.


A significant difference.

The performance numbers are interesting. You don't get a speedup in Sqlg
when bulking would be enacted, though - only when bulking would have
no effect - correct?



On Fri, Oct 13, 2017 at 3:48 PM, pieter gmail 
wrote:


Hi,

Doing step optimizations I am noticing a rather severe performance hit in
TraverserSet.

Sqlg does a secondary optimization on steps that it cannot optimize from
the GraphStep. Before the secondary optimization these steps execute
at least one query for each incoming start. The optimization caches the
incoming start traversers and the step is executed for all incoming
traversers in one go. This has the effect of changing the semantics into a
breadth-first traversal as opposed to the default depth-first.
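
The batching idea above can be sketched with a toy example. `FakeDatabase`, `outPerStart` and `outBatched` are hypothetical names, and the in-memory map merely stands in for a SQL backend; the point is only the difference in the number of round trips, not how Sqlg builds its queries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BatchingDemo {
    // Stand-in for a SQL backend that counts round trips.
    static final class FakeDatabase {
        final Map<Integer, List<Integer>> outEdges = new HashMap<>();
        int queries = 0;

        // One round trip that resolves the out-neighbours of all given ids.
        List<Integer> queryOut(List<Integer> ids) {
            queries++;
            List<Integer> result = new ArrayList<>();
            for (Integer id : ids) {
                result.addAll(outEdges.getOrDefault(id, Collections.<Integer>emptyList()));
            }
            return result;
        }
    }

    // Depth-first style: one query per incoming start.
    static List<Integer> outPerStart(FakeDatabase db, List<Integer> starts) {
        List<Integer> result = new ArrayList<>();
        for (Integer start : starts) {
            result.addAll(db.queryOut(Collections.singletonList(start)));
        }
        return result;
    }

    // Breadth-first style: cache every start, then issue a single query.
    static List<Integer> outBatched(FakeDatabase db, List<Integer> starts) {
        return db.queryOut(new ArrayList<>(starts));
    }

    public static void main(String[] args) {
        FakeDatabase db = new FakeDatabase();
        db.outEdges.put(1, Arrays.asList(10, 11));
        db.outEdges.put(2, Arrays.asList(20));
        List<Integer> starts = Arrays.asList(1, 2);

        List<Integer> perStart = outPerStart(db, starts);
        System.out.println(perStart + " queries: " + db.queries); // [10, 11, 20] queries: 2

        db.queries = 0;
        List<Integer> batched = outBatched(db, starts);
        System.out.println(batched + " queries: " + db.queries);  // [10, 11, 20] queries: 1
    }
}
```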

So basically the replaced step's code looks as follows:

    @Override
    protected Traverser.Admin processNextStart() throws NoSuchElementException {
        if (this.first) {
            this.first = false;
            // Drain and cache every incoming start before the step runs once.
            while (this.starts.hasNext()) {
                Traverser.Admin start = this.starts.next();
                this.traversal.addStart(start);
            }


The performance hit is in the this.traversal.addStart(start) which ends up
putting the start into the TraverserSet's internal LinkedHashMap.

So if I understand correctly the map is only needed for bulking so quite
often is not needed. Replacing the map with an ArrayList improves the
performance drastically.
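
As a rough illustration of the trade-off, the sketch below contrasts a bulking set backed by a LinkedHashMap with a plain list. It is a toy model, not TinkerPop's TraverserSet; `ToyTraverser` and `BulkingSet` are hypothetical. The map-based version must hash and compare every incoming traverser so that equal ones can be merged by summing their bulk, while the list simply appends without calling hashCode() or equals() at all.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BulkingDemo {
    static final class ToyTraverser {
        final String value;
        long bulk = 1;

        ToyTraverser(String value) {
            this.value = value;
        }

        @Override
        public int hashCode() {
            return value.hashCode();
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof ToyTraverser && value.equals(((ToyTraverser) o).value);
        }
    }

    // Merges equal traversers by summing their bulk, preserving insertion order.
    static final class BulkingSet {
        private final Map<ToyTraverser, ToyTraverser> map = new LinkedHashMap<>();

        void add(ToyTraverser t) {
            ToyTraverser existing = map.get(t); // costs a hashCode() and possibly an equals()
            if (existing == null) {
                map.put(t, t);
            } else {
                existing.bulk += t.bulk;
            }
        }

        int size() {
            return map.size();
        }

        long bulkOf(String value) {
            ToyTraverser t = map.get(new ToyTraverser(value));
            return t == null ? 0 : t.bulk;
        }
    }

    public static void main(String[] args) {
        BulkingSet set = new BulkingSet();
        List<ToyTraverser> list = new ArrayList<>();
        for (String v : new String[] {"lop", "vadas", "lop"}) {
            set.add(new ToyTraverser(v));
            list.add(new ToyTraverser(v)); // no hashing, no merging
        }
        System.out.println(set.size() + " entries, bulk of lop = " + set.bulkOf("lop")); // 2 entries, bulk of lop = 2
        System.out.println(list.size() + " entries in the plain list");                  // 3 entries in the plain list
    }
}
```

The list is cheaper per insert but never merges duplicates, so it only pays off when bulking would have had little to merge in the first place.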

For the test the optimization does the following. I replace the
TraversalFilterStep with a custom SqlgTraversalFilterStep which extends
a custom SqlgAbstractStep. The custom SqlgAbstractStep in turn replaces the
ExpandableStepIterator with a custom SqlgExpandableStepIterator which is a
copy of ExpandableStepIterator except that it replaces TraverserSet with a
List traversers = new ArrayList<>();

    @Test
    public void testSqlgTraversalFilterStepPerformance() {
        this.sqlgGraph.tx().normalBatchModeOn();
        int count = 1;
        for (int i = 0; i < count; i++) {
            Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name", "a1");
            Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "name", "b1");
            a1.addEdge("ab", b1);
        }
        this.sqlgGraph.tx().commit();

        StopWatch stopWatch = new StopWatch();
        for (int i = 0; i < 1000; i++) {
            stopWatch.start();
            GraphTraversal traversal = this.sqlgGraph.traversal()
                    .V().hasLabel("A")
                    .where(__.out().hasLabel("B"));
            List vertices = traversal.toList();
            Assert.assertEquals(count, vertices.size());
            stopWatch.stop();
            System.out.println(stopWatch.toString());
            stopWatch.reset();
        }
    }

Without the ArrayList optimization the output is,
0:00:12.198
0:00:09.756
0:00:09.435
0:00:14.466
0:00:10.197
0:00:04.937
0:00:02.974
0:00:02.942
0:00:02.977
0:00:03.142
0:00:03.207

With the ArrayList optimization the output is,
0:00:00.334
0:00:00.147
0:00:00.114
0:00:00.100
... time for jit
0:00:00.055
0:00:00.056
0:00:00.054
0:00:00.053
0:00:00.054
0:00:00.055

A significant difference.

For TinkerGraph this test's optimization is moot as the TraversalFilterStep
resets the step for every step, making the TraverserSet's map empty so the
traverser's equals method is never called.

Not sure if there are scenarios where this optimization will be any good
for TinkerGraph, but I thought I'd let you know how I am optimizing steps.

A concern is that I am now replacing core steps, which moves Sqlg further
away from the reference implementation, making it fragile to changes in
TinkerPop and harder to keep up with upstream changes. Perhaps there is a way
to make TraverserSet's current behavior configurable?

Cheers
Pieter

Re: Notes on TraverserSet and Sqlg optimizations

2017-10-18 Thread pieter gmail
Currently Sqlg's optimization strategies remove bulking as it does not 
work with Sqlg's way of accessing the database. Sqlg fetches many 
VertexSteps in one go and bulking needs it to be on a one-by-one basis. 
Bulking is still possible but only by removing Sqlg's strategies from 
the traversal. The way I understand bulking, it is only of use for a 
particular graph shape: graphs with lots of references from the same label 
back to itself. For the kind of graphs I work on, and hopefully those of most of 
my users, the graphs are more like trees where bulking is less useful.


Later I hope to look at bulking and see if it's possible to predict 
whether a query would be better off with bulking.


Cheers
Pieter

On 17/10/2017 22:40, Stephen Mallette wrote:

So if I understand correctly the map is only needed for bulking so quite

often is not needed.

afaik, it is only used for bulking though it's hard to characterize how
often it is used - i suppose it all depends on the types of traversals you
write and the nature of the data being traversed.


A significant difference.

The performance numbers are interesting. You don't get a speedup in Sqlg
when bulking would be enacted, though - only when bulking would have
no effect - correct?



On Fri, Oct 13, 2017 at 3:48 PM, pieter gmail 
wrote:


Hi,

Doing step optimizations I am noticing a rather severe performance hit in
TraverserSet.

Sqlg does a secondary optimization on steps that it cannot optimize from
the GraphStep. Before the secondary optimization these steps execute
at least one query for each incoming start. The optimization caches the
incoming start traversers and the step is executed for all incoming
traversers in one go. This has the effect of changing the semantics into a
breadth-first traversal as opposed to the default depth-first.

So basically the replaced step's code looks as follows:

    @Override
    protected Traverser.Admin processNextStart() throws NoSuchElementException {
        if (this.first) {
            this.first = false;
            // Drain and cache every incoming start before the step runs once.
            while (this.starts.hasNext()) {
                Traverser.Admin start = this.starts.next();
                this.traversal.addStart(start);
            }
 

The performance hit is in the this.traversal.addStart(start) which ends up
putting the start into the TraverserSet's internal LinkedHashMap.

So if I understand correctly the map is only needed for bulking so quite
often is not needed. Replacing the map with an ArrayList improves the
performance drastically.

For the test the optimization does the following. I replace the
TraversalFilterStep with a custom SqlgTraversalFilterStep which extends
a custom SqlgAbstractStep. The custom SqlgAbstractStep in turn replaces the
ExpandableStepIterator with a custom SqlgExpandableStepIterator which is a
copy of ExpandableStepIterator except that it replaces TraverserSet with a
List traversers = new ArrayList<>();

    @Test
    public void testSqlgTraversalFilterStepPerformance() {
        this.sqlgGraph.tx().normalBatchModeOn();
        int count = 1;
        for (int i = 0; i < count; i++) {
            Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name", "a1");
            Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "name", "b1");
            a1.addEdge("ab", b1);
        }
        this.sqlgGraph.tx().commit();

        StopWatch stopWatch = new StopWatch();
        for (int i = 0; i < 1000; i++) {
            stopWatch.start();
            GraphTraversal traversal = this.sqlgGraph.traversal()
                    .V().hasLabel("A")
                    .where(__.out().hasLabel("B"));
            List vertices = traversal.toList();
            Assert.assertEquals(count, vertices.size());
            stopWatch.stop();
            System.out.println(stopWatch.toString());
            stopWatch.reset();
        }
    }

Without the ArrayList optimization the output is,
0:00:12.198
0:00:09.756
0:00:09.435
0:00:14.466
0:00:10.197
0:00:04.937
0:00:02.974
0:00:02.942
0:00:02.977
0:00:03.142
0:00:03.207

With the ArrayList optimization the output is,
0:00:00.334
0:00:00.147
0:00:00.114
0:00:00.100
... time for jit
0:00:00.055
0:00:00.056
0:00:00.054
0:00:00.053
0:00:00.054
0:00:00.055

A significant difference.

For TinkerGraph this test's optimization is moot as the TraversalFilterStep
resets the step for every step, making the TraverserSet's map empty so the
traverser's equals method is never called.

Not sure if there are scenarios where this optimization will be any good
for TinkerGraph, but I thought I'd let you know how I am optimizing steps.

A concern is that I am now replacing core steps, which moves Sqlg further
away from the reference implementation, making it fragile to changes in
TinkerPop and harder to keep up with upstream changes. Perhaps there is a way
to make TraverserSet's current behavior configurable?

Cheers
Pieter