Re: [DISCUSS] driver for gremlin-python
If I may suggest an alternative. Most (all?) languages have dependency management. Would it not be better to have GLV's simply depend on third party projects (when they exist and are decently up to date)? These dependencies can be pulled during testing. If features are lacking the community can provide patches, etc. If a project dies we can always switch it out for another one. When it comes to users, they simply download their GLV and go. Ideally the GLV would exist as a package in their dependency management system of choice. (as a side note apt-get/yum install gremlin-server would be amazing) My understanding around the driver needing modifications is -I'm guessing- mostly to accommodate for the updates in TINKERPOP-1278 rather than from drivers needing to be specific for GLVs. If not let me know. This obviously comes with it's own downsides. If a project dies and there's no replacement we basically have to fall back to your proposal by forking and maintaining it ourselves. I also don't know how licensing would work in this case, but I'm guessing a license change could be akin to the project dying in regards to TP's dependance on it. I just think that as we get more GLVs, providing CI with the correct environments to run various lower level driver tests in various languages is going to be a headache. It would be much better to keep them as separate projects. If I completely missed the original intent of the thread also do let me know. On Fri, Aug 5, 2016 at 4:57 PM, Stephen Mallette wrote: > I mentioned this thread to David Brown and asked if he thought that > gremlinclient would be a good fit for this issue. He said that he thought > it would be with some modifications. He's agreed to simplify gremliclient > with the intent of making it the driver for TinkerPop's gremlin-python. > Assuming no objections, he said that he could look to have a pull request > to TINKERPOP-1278 by end of next week. > > > > > On Thu, Aug 4, 2016 at 1:39 PM, Stephen Mallette > wrote: > > > TinkerPop has long held to the idea that "drivers" would be a bit like > > graph implementations. TinkerPop builds reference implementation of a > graph > > and third-parties maintain "other" implementations separately. The same > > idea has been held for "drivers" in that we have our reference > > implementation of the Java driver and third-parties produce > implementations > > for other languages. > > > > GLVs sorta changed things though. Now we have a model where we maintain > > GLVs of other languages in TinkerPop itself. While somewhat risky to > > include the added code/complexity, maintaining the GLV in TinkerPop has a > > lot of good advantages for users and for growing our community. We also > > have come up with ways to manage the risk, by establishing a really nice > > initial reference implementation of gremlin-python which sets a high bar > > for new GLVs to come in (i.e. hooked into tinkerpop test suite, automated > > build mixed with maven, non-native JVM language focused, etc.). > > > > Of course, by introducing GLVs to TinkerPop though, we now have the issue > > of conflict with TinkerPop only having a single "reference > implementation" > > for drivers because without a driver the GLV is pretty useless. We've > > mitigated that in gremlin-python so far with a simple REST implementation > > of a driver, but as we advance the capabilities of GLVs that approach > won't > > be good for the long term. > > > > So - it seems like we (TinkerPop) should have a driver to go with each > > GLV? To me, the upsides are big for users: > > > > 1. Users don't have to hunt for drivers elsewhere - they just grab their > > GLV and go. I suppose there is still opportunity for third-party drivers > to > > be built, but at least users have an immediate easy option within the > > Apache TinkerPop. > > 2. Our docs for GLV usage are completely in Apache TinkerPop and are > > "production ready" (i.e. not REST based then go switch to something > > third-party) > > > > There are some downsides which basically boil down to "Mo' code Mo' > > problems" but basically: > > > > 1. I believe that it will be expected that we maintain some parity in > > terms of features and performance with the drivers. We can't have the > java > > driver be 10x over the python driver > > 2. The GLV code base becomes more complex. A high performing driver will > > not be trivial. > > 3. A whole class of downsides related to GLVs in general where we don't > > have the necessary language expertise to maintain it - though - again, > > mitigated by the high bar we set with our reference implementation for > > gremlin-python. Also, rather than a negative, one could also see this as > an > > opportunity to grow the TinkerPop community. > > > > Ultimately, I think all of this is outweighed by the simple fact that we > > need a driver in Apache TinkerPop to fully test GLVs. Without that we're > > kinda of stuck ourselves. > > > > > > > > > > > > >
[GitHub] tinkerpop issue #375: Finished renaming distribution from 'apache-' to 'apac...
Github user joshsh commented on the issue: https://github.com/apache/tinkerpop/pull/375 Sure. Here: https://github.com/apache/tinkerpop/pull/376 Note that the distros for giraph-gremlin, spark-gremlin, and hadoop-gremlin don't use the apache-tinkerpop- prefix. I haven't changed them. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] tinkerpop pull request #375: Finished renaming distribution from 'apache-' t...
Github user joshsh closed the pull request at: https://github.com/apache/tinkerpop/pull/375 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] tinkerpop pull request #376: Finished renaming distribution from 'apache-' t...
GitHub user joshsh opened a pull request: https://github.com/apache/tinkerpop/pull/376 Finished renaming distribution from 'apache-' to 'apache-tinkerpop-' Fixed gremlin.sh and assembly descriptor, updated documentation You can merge this pull request into a Git repository by running: $ git pull https://github.com/joshsh/tinkerpop tp31 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/tinkerpop/pull/376.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #376 commit 6732921299af12894d130faece2269a59dbeb180 Author: Joshua Shinavier Date: 2016-08-05T23:17:57Z Finished renaming distribution from 'apache-' to 'apache-tinkerpop-' Fixed gremlin.sh and assembly descriptor, updated documentation --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] tinkerpop issue #375: Finished renaming distribution from 'apache-' to 'apac...
Github user spmallette commented on the issue: https://github.com/apache/tinkerpop/pull/375 hey @joshsh - wowtotally missed all those. this would have all bit me on release day so thanks for that. can i ask that you please retarget your pull request for the tp31 branch? we need these fixes in the 3.1.x line of code where applicable. Or perhaps it's a PR to each tp31 and master? seems like i see a few fixes that only apply on master. what do you think? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] tinkerpop pull request #375: Finished renaming distribution from 'apache-' t...
GitHub user joshsh opened a pull request: https://github.com/apache/tinkerpop/pull/375 Finished renaming distribution from 'apache-' to 'apache-tinkerpop-' Fixed gremlin.sh and assembly descriptor, updated documentation You can merge this pull request into a Git repository by running: $ git pull https://github.com/joshsh/tinkerpop master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/tinkerpop/pull/375.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #375 commit c12538916c9b8511006791dee8d3d93802f40f87 Author: Joshua Shinavier Date: 2016-08-05T22:19:06Z Finished renaming distribution from 'apache-' to 'apache-tinkerpop-' Fixed gremlin.sh and assembly descriptor, updated documentation --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Closed] (TINKERPOP-1396) Traversal Induced Values Recipe
[ https://issues.apache.org/jira/browse/TINKERPOP-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stephen mallette closed TINKERPOP-1396. --- Resolution: Done completed on https://github.com/apache/tinkerpop/pull/371 > Traversal Induced Values Recipe > --- > > Key: TINKERPOP-1396 > URL: https://issues.apache.org/jira/browse/TINKERPOP-1396 > Project: TinkerPop > Issue Type: Improvement > Components: documentation >Affects Versions: 3.2.1 >Reporter: stephen mallette >Assignee: stephen mallette >Priority: Minor > Fix For: 3.2.2 > > > Develop a recipe for "traversal induced values" - those traversal that use > data from the traversal itself to parameterize other later steps in that same > traversal. This is a commonly asked question on the mailing list. > If there's a better name for this topic, I'm open to hearing what that might > be. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] tinkerpop pull request #371: Added new recipe for Traversal Induced Values.
Github user asfgit closed the pull request at: https://github.com/apache/tinkerpop/pull/371 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (TINKERPOP-1350) Server locks when submitting parallel requests on session
[ https://issues.apache.org/jira/browse/TINKERPOP-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15410090#comment-15410090 ] ASF GitHub Bot commented on TINKERPOP-1350: --- Github user okram commented on the issue: https://github.com/apache/tinkerpop/pull/367 VOTE +1. > Server locks when submitting parallel requests on session > - > > Key: TINKERPOP-1350 > URL: https://issues.apache.org/jira/browse/TINKERPOP-1350 > Project: TinkerPop > Issue Type: Bug > Components: server >Affects Versions: 3.1.2-incubating >Reporter: stephen mallette >Assignee: stephen mallette >Priority: Critical > Fix For: 3.1.4, 3.2.2 > > > This really is only a problem when there is some form of long blocking script > submitted and only on a session when done in parallel, like: > {code} > final ResultSet first = client.submit( > "Object mon1 = 'mon1';\n" + > "synchronized (mon1) {\n" + > "mon1.wait();\n" + > "} "); > final ResultSet second = client.submit( > "Object mon2 = 'mon2';\n" + > "synchronized (mon2) {\n" + > "mon2.wait();\n" + > "}"); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] tinkerpop issue #367: TINKERPOP-1350 was never quite fixed in 3.1.3.
Github user okram commented on the issue: https://github.com/apache/tinkerpop/pull/367 VOTE +1. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[DISCUSS] ASF Board Draft Report - August 2016
## Description: Apache TinkerPop is a graph computing framework for both graph databases (OLTP) and graph analytic systems (OLAP). ## Activity: As discussed in the previous report, TinkerPop was preparing for the first releases outside of incubation. Those releases were voted on positively by the community and released the week of July 18th. We release 3.1.3 which was largely a maintenance release with some minor features and 3.2.1 which is the recommended version to be on. Going forward, we expect to continue to maintain the 3.1.x line of code providing bug fixes when needed. New development continues on the 3.2.x line and we expect our initial release of Gremlin Language Variants[1] for Python to be part of that line, likely 3.2.2. ## Issues: There are no issues requiring board attention at this time. ## Releases: - 3.1.3 (July 18, 2016) - 3.2.1 (July 18, 2016) ## PMC/Committer: - Last PMC addition was Dylan Millikin - May 2016 - Last committer addition was Michael Pollmeier - April 2016 ## Links [1] http://tinkerpop.apache.org/docs/3.2.1-SNAPSHOT/tutorials/gremlin-language-variants/
Re: [DISCUSS] driver for gremlin-python
I mentioned this thread to David Brown and asked if he thought that gremlinclient would be a good fit for this issue. He said that he thought it would be with some modifications. He's agreed to simplify gremliclient with the intent of making it the driver for TinkerPop's gremlin-python. Assuming no objections, he said that he could look to have a pull request to TINKERPOP-1278 by end of next week. On Thu, Aug 4, 2016 at 1:39 PM, Stephen Mallette wrote: > TinkerPop has long held to the idea that "drivers" would be a bit like > graph implementations. TinkerPop builds reference implementation of a graph > and third-parties maintain "other" implementations separately. The same > idea has been held for "drivers" in that we have our reference > implementation of the Java driver and third-parties produce implementations > for other languages. > > GLVs sorta changed things though. Now we have a model where we maintain > GLVs of other languages in TinkerPop itself. While somewhat risky to > include the added code/complexity, maintaining the GLV in TinkerPop has a > lot of good advantages for users and for growing our community. We also > have come up with ways to manage the risk, by establishing a really nice > initial reference implementation of gremlin-python which sets a high bar > for new GLVs to come in (i.e. hooked into tinkerpop test suite, automated > build mixed with maven, non-native JVM language focused, etc.). > > Of course, by introducing GLVs to TinkerPop though, we now have the issue > of conflict with TinkerPop only having a single "reference implementation" > for drivers because without a driver the GLV is pretty useless. We've > mitigated that in gremlin-python so far with a simple REST implementation > of a driver, but as we advance the capabilities of GLVs that approach won't > be good for the long term. > > So - it seems like we (TinkerPop) should have a driver to go with each > GLV? To me, the upsides are big for users: > > 1. Users don't have to hunt for drivers elsewhere - they just grab their > GLV and go. I suppose there is still opportunity for third-party drivers to > be built, but at least users have an immediate easy option within the > Apache TinkerPop. > 2. Our docs for GLV usage are completely in Apache TinkerPop and are > "production ready" (i.e. not REST based then go switch to something > third-party) > > There are some downsides which basically boil down to "Mo' code Mo' > problems" but basically: > > 1. I believe that it will be expected that we maintain some parity in > terms of features and performance with the drivers. We can't have the java > driver be 10x over the python driver > 2. The GLV code base becomes more complex. A high performing driver will > not be trivial. > 3. A whole class of downsides related to GLVs in general where we don't > have the necessary language expertise to maintain it - though - again, > mitigated by the high bar we set with our reference implementation for > gremlin-python. Also, rather than a negative, one could also see this as an > opportunity to grow the TinkerPop community. > > Ultimately, I think all of this is outweighed by the simple fact that we > need a driver in Apache TinkerPop to fully test GLVs. Without that we're > kinda of stuck ourselves. > > > > > >
[jira] [Commented] (TINKERPOP-1400) SubgraphStrategy introduces infinite recursion if filter has Vertex/Edge steps.
[ https://issues.apache.org/jira/browse/TINKERPOP-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15409951#comment-15409951 ] ASF GitHub Bot commented on TINKERPOP-1400: --- Github user analytically commented on a diff in the pull request: https://github.com/apache/tinkerpop/pull/374#discussion_r73747869 --- Diff: gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/strategy/decoration/SubgraphStrategy.java --- @@ -156,20 +182,36 @@ private static void transferLabels(final Step from, final Step to) { private Builder() { } -public Builder vertexCriterion(final Traversal predicate) { -vertexCriterion = predicate; +public Builder vertices(final Traversal vertexPredicate) { +this.vertexCriterion = vertexPredicate; --- End diff -- Best rename the local vertexCriterion to vertexPredicate too no? > SubgraphStrategy introduces infinite recursion if filter has Vertex/Edge > steps. > --- > > Key: TINKERPOP-1400 > URL: https://issues.apache.org/jira/browse/TINKERPOP-1400 > Project: TinkerPop > Issue Type: Bug > Components: process >Affects Versions: 3.2.1 >Reporter: Marko A. Rodriguez >Assignee: Marko A. Rodriguez > > James from the mailing list reported: > {code} > gremlin> graph = TinkerFactory.createModern() > ==>tinkergraph[vertices:6 edges:6] > gremlin> g = graph.traversal() > ==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard] > gremlin> s = SubgraphStrategy.build().vertexCriterion(hasId(1)).create() > ==>SubgraphStrategy > gremlin> g.V().filter(hasId(1)) > ==>v[1] > gremlin> g.withStrategies(s).V() > ==>v[1] > works as expected. But if I change the predicate traversal to something > slightly more complex, e.g. in('knows').hasId(1) things start to go haywire. > The single step predicates works as expected in 3.1.1-incubating, 3.1.3 and > 3.2.1. > In 3.1.1-incubating the multi-step predicate subgraph strategy seems to end > up generating the same traversal as using filter(...) but fails to execute: > $ sh apache-gremlin-console-3.1.1-incubating/bin/gremlin.sh > \,,,/ > (o o) > -oOOo-(3)-oOOo- > plugin activated: tinkerpop.server > plugin activated: tinkerpop.utilities > plugin activated: tinkerpop.tinkergraph > gremlin> graph = TinkerFactory.createModern() > ==>tinkergraph[vertices:6 edges:6] > gremlin> g = graph.traversal() > ==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard] > gremlin> s = > SubgraphStrategy.build().vertexCriterion(__.in('knows').hasId(1)).create() > ==>SubgraphStrategy > gremlin> g1 = GraphTraversalSource.build().with(s).create(graph) > ==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard] > gremlin> g.V().filter(__.in('knows').hasId(1)).explain() > ==>Traversal Explanation > === > Original Traversal [GraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > ConnectiveStrategy [D] [GraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > IdentityRemovalStrategy [O] [GraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > IncidentToAdjacentStrategy [O] [GraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > AdjacentToIncidentStrategy [O] [GraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > FilterRankingStrategy[O] [GraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > MatchPredicateStrategy [O] [GraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > RangeByIsCountStrategy [O] [GraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > TinkerGraphStepStrategy [P] [TinkerGraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > EngineDependentStrategy [F] [TinkerGraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > ProfileStrategy [F] [TinkerGraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > StandardVerificationStrategy [V] [TinkerGraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > ComputerVerificationStrategy [V] [TinkerGraph
[GitHub] tinkerpop pull request #374: TINKERPOP-1400: SubgraphStrategy introduces inf...
Github user analytically commented on a diff in the pull request: https://github.com/apache/tinkerpop/pull/374#discussion_r73747869 --- Diff: gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/strategy/decoration/SubgraphStrategy.java --- @@ -156,20 +182,36 @@ private static void transferLabels(final Step from, final Step to) { private Builder() { } -public Builder vertexCriterion(final Traversal predicate) { -vertexCriterion = predicate; +public Builder vertices(final Traversal vertexPredicate) { +this.vertexCriterion = vertexPredicate; --- End diff -- Best rename the local vertexCriterion to vertexPredicate too no? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] tinkerpop issue #372: DSP-1397 Fix StarGraph addEdge for self-edges
Github user okram commented on the issue: https://github.com/apache/tinkerpop/pull/372 Ah. Gotcha. Yea, please feel free to update my branch source with respect test that does the break and I can work to fix it. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
Re: path query optimization
Hello, This is cool. Check out also ImmutablePath.extend(labels) as that is ultimately what Traverser.addLabels() calls. We have a lot of set copying and I don’t know if its needed (as you seem to be demonstrating). What I don’t like about your solution is the explicit reference to the B_L_P…Traverser in AbstractStep. See if you can work your solution without it. Good luck, Marko. http://markorodriguez.com > On Aug 5, 2016, at 12:44 PM, pieter-gmail wrote: > > Sorry forgot to add a rather important part. > > I changed ImmutablePath's constructor to > >private ImmutablePath(final ImmutablePathImpl previousPath, final > Object currentObject, final Set currentLabels) { >this.previousPath = previousPath; >this.currentObject = currentObject; >this.currentLabels = currentLabels; > //this.currentLabels.addAll(currentLabels); >} > > Setting the collection directly as oppose to `addAll` > > Thanks > Pieter > > > On 05/08/2016 20:40, pieter-gmail wrote: >> Hi, >> >> I have been optimizing Sqlg of late and eventually arrived at TinkerPop >> code. >> >> The gremlin in particular that I am interested is path queries. >> >> Here is the test that I am running in jmh. >> >>//@Setup >>Vertex a = graph.addVertex(T.label, "A", "name", "a1"); >>for (int i = 1; i < 1_000_001; i++) { >>Vertex b = graph.addVertex(T.label, "B", "name", "name_" + i); >>a.addEdge("outB", b); >>for (int j = 0; j < 1; j++) { >>Vertex c = graph.addVertex(T.label, "C", "name", "name_" >> + i + " " + j); >>b.addEdge("outC", c); >>} >>} >> >> And the query being benchmarked is >> >>GraphTraversal traversal = >> g.V(a).as("a").out().as("b").out().as("c").path(); >>while (traversal.hasNext()) { >>Path path = traversal.next(); >>} >> >> Before the optimization, (as things are now) >> >> Benchmark Mode Cnt Score Error Units >> GremlinPathBenchmark.g_path avgt 100 1.086 ± 0.020 s/op >> >> The optimization I did is in AbstractStep.prepareTraversalForNextStep, >> to not call addLabels() for path gremlins as the labels are known by the >> step and do not change again so there is not need to keep adding them. >> >>private final Traverser.Admin prepareTraversalForNextStep(final >> Traverser.Admin traverser) { >>if (!this.traverserStepIdAndLabelsSetByChild) { >>traverser.setStepId(this.nextStep.getId()); >>if (traverser instanceof B_LP_O_P_S_SE_SL_Traverser) { >>} else { >>traverser.addLabels(this.labels); >>} >>} >>return traverser; >>} >> >> After optimization, >> >> Benchmark Mode Cnt Score Error Units >> GremlinPathBenchmark.g_path avgt 100 0.680 ± 0.004 s/op >> >> 1.086 vs 0.689 seconds for the traversal. >> >> I ran the Structured and Process test suites. 2 tests are failing with >> this optimization. >> >> InjectTest.g_VX1X_out_name_injectXdanielX_asXaX_mapXlengthX_path fails with >> >> "java.lang.IllegalArgumentException: The step with label a does not exist" >> >> and >> >> SerializationTest.shouldSerializePathAsDetached fails with >> >> "Caused by: java.lang.IllegalArgumentException: Class is not registered: >> java.util.Collections$UnmodifiableSet" >> >> Before investigating the failures is this optimization worth pursuing? >> >> Thanks >> Pieter >> >
[jira] [Commented] (TINKERPOP-1400) SubgraphStrategy introduces infinite recursion if filter has Vertex/Edge steps.
[ https://issues.apache.org/jira/browse/TINKERPOP-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15409927#comment-15409927 ] ASF GitHub Bot commented on TINKERPOP-1400: --- GitHub user okram opened a pull request: https://github.com/apache/tinkerpop/pull/374 TINKERPOP-1400: SubgraphStrategy introduces infinite recursion if filter has Vertex/Edge steps. https://issues.apache.org/jira/browse/TINKERPOP-1400 There was a severe bug in SubgraphStrategy where the traversal filters that were added for sub-graphing were being recursively applied yielded a StackOverflow. We did not have complex enough tests in SubgraphStrategyProcessTest to illicit the bug. The fix is clever using Step label markers to know if a traversal whose is having their strategy applied is a vertex/edge subgraph filter. Its clever. * Note: given the differences in strategy application code, this can not easily go into the `tp31`-line without a rewrite. Thus, this is headed to `master/`. CHANGELOG ``` * Fixed a severe bug in `SubgraphStrategy`. * Deprecated `SubgraphStrategy.Builder.vertexCriterion()/edgeCriterion()` in favor of `vertices()/edges()`. ``` VOTE +1. You can merge this pull request into a Git repository by running: $ git pull https://github.com/apache/tinkerpop TINKERPOP-1400 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/tinkerpop/pull/374.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #374 commit 9d6a4957468a65d15180ddc31f5020255ad14f20 Author: Marko A. Rodriguez Date: 2016-08-05T19:16:58Z Fixed a severe bug in SubgraphStrategy where infinite recurssion occurs if the strategy is not smart about how child traverals with Vertex/EdgeSteps are analyzed. Also Deprecated vertexCriteria() method with vertices() likewise for edgeCritera() in SubGraphStrategy.Builder to be consistent with GraphFilter style (same concept). In fact, moving forward, SubGraphStrategy could take a GraphFilter. > SubgraphStrategy introduces infinite recursion if filter has Vertex/Edge > steps. > --- > > Key: TINKERPOP-1400 > URL: https://issues.apache.org/jira/browse/TINKERPOP-1400 > Project: TinkerPop > Issue Type: Bug > Components: process >Affects Versions: 3.2.1 >Reporter: Marko A. Rodriguez >Assignee: Marko A. Rodriguez > > James from the mailing list reported: > {code} > gremlin> graph = TinkerFactory.createModern() > ==>tinkergraph[vertices:6 edges:6] > gremlin> g = graph.traversal() > ==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard] > gremlin> s = SubgraphStrategy.build().vertexCriterion(hasId(1)).create() > ==>SubgraphStrategy > gremlin> g.V().filter(hasId(1)) > ==>v[1] > gremlin> g.withStrategies(s).V() > ==>v[1] > works as expected. But if I change the predicate traversal to something > slightly more complex, e.g. in('knows').hasId(1) things start to go haywire. > The single step predicates works as expected in 3.1.1-incubating, 3.1.3 and > 3.2.1. > In 3.1.1-incubating the multi-step predicate subgraph strategy seems to end > up generating the same traversal as using filter(...) but fails to execute: > $ sh apache-gremlin-console-3.1.1-incubating/bin/gremlin.sh > \,,,/ > (o o) > -oOOo-(3)-oOOo- > plugin activated: tinkerpop.server > plugin activated: tinkerpop.utilities > plugin activated: tinkerpop.tinkergraph > gremlin> graph = TinkerFactory.createModern() > ==>tinkergraph[vertices:6 edges:6] > gremlin> g = graph.traversal() > ==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard] > gremlin> s = > SubgraphStrategy.build().vertexCriterion(__.in('knows').hasId(1)).create() > ==>SubgraphStrategy > gremlin> g1 = GraphTraversalSource.build().with(s).create(graph) > ==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard] > gremlin> g.V().filter(__.in('knows').hasId(1)).explain() > ==>Traversal Explanation > === > Original Traversal [GraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > ConnectiveStrategy [D] [GraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > IdentityRemovalStrategy [O] [GraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > IncidentToAdjacentStrategy [O] [GraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],
[GitHub] tinkerpop pull request #374: TINKERPOP-1400: SubgraphStrategy introduces inf...
GitHub user okram opened a pull request: https://github.com/apache/tinkerpop/pull/374 TINKERPOP-1400: SubgraphStrategy introduces infinite recursion if filter has Vertex/Edge steps. https://issues.apache.org/jira/browse/TINKERPOP-1400 There was a severe bug in SubgraphStrategy where the traversal filters that were added for sub-graphing were being recursively applied yielded a StackOverflow. We did not have complex enough tests in SubgraphStrategyProcessTest to illicit the bug. The fix is clever using Step label markers to know if a traversal whose is having their strategy applied is a vertex/edge subgraph filter. Its clever. * Note: given the differences in strategy application code, this can not easily go into the `tp31`-line without a rewrite. Thus, this is headed to `master/`. CHANGELOG ``` * Fixed a severe bug in `SubgraphStrategy`. * Deprecated `SubgraphStrategy.Builder.vertexCriterion()/edgeCriterion()` in favor of `vertices()/edges()`. ``` VOTE +1. You can merge this pull request into a Git repository by running: $ git pull https://github.com/apache/tinkerpop TINKERPOP-1400 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/tinkerpop/pull/374.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #374 commit 9d6a4957468a65d15180ddc31f5020255ad14f20 Author: Marko A. Rodriguez Date: 2016-08-05T19:16:58Z Fixed a severe bug in SubgraphStrategy where infinite recurssion occurs if the strategy is not smart about how child traverals with Vertex/EdgeSteps are analyzed. Also Deprecated vertexCriteria() method with vertices() likewise for edgeCritera() in SubGraphStrategy.Builder to be consistent with GraphFilter style (same concept). In fact, moving forward, SubGraphStrategy could take a GraphFilter. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
Re: path query optimization
Sorry forgot to add a rather important part. I changed ImmutablePath's constructor to private ImmutablePath(final ImmutablePathImpl previousPath, final Object currentObject, final Set currentLabels) { this.previousPath = previousPath; this.currentObject = currentObject; this.currentLabels = currentLabels; //this.currentLabels.addAll(currentLabels); } Setting the collection directly as oppose to `addAll` Thanks Pieter On 05/08/2016 20:40, pieter-gmail wrote: > Hi, > > I have been optimizing Sqlg of late and eventually arrived at TinkerPop > code. > > The gremlin in particular that I am interested is path queries. > > Here is the test that I am running in jmh. > > //@Setup > Vertex a = graph.addVertex(T.label, "A", "name", "a1"); > for (int i = 1; i < 1_000_001; i++) { > Vertex b = graph.addVertex(T.label, "B", "name", "name_" + i); > a.addEdge("outB", b); > for (int j = 0; j < 1; j++) { > Vertex c = graph.addVertex(T.label, "C", "name", "name_" > + i + " " + j); > b.addEdge("outC", c); > } > } > > And the query being benchmarked is > > GraphTraversal traversal = > g.V(a).as("a").out().as("b").out().as("c").path(); > while (traversal.hasNext()) { > Path path = traversal.next(); > } > > Before the optimization, (as things are now) > > Benchmark Mode Cnt Score Error Units > GremlinPathBenchmark.g_path avgt 100 1.086 ± 0.020 s/op > > The optimization I did is in AbstractStep.prepareTraversalForNextStep, > to not call addLabels() for path gremlins as the labels are known by the > step and do not change again so there is not need to keep adding them. > > private final Traverser.Admin prepareTraversalForNextStep(final > Traverser.Admin traverser) { > if (!this.traverserStepIdAndLabelsSetByChild) { > traverser.setStepId(this.nextStep.getId()); > if (traverser instanceof B_LP_O_P_S_SE_SL_Traverser) { > } else { > traverser.addLabels(this.labels); > } > } > return traverser; > } > > After optimization, > > Benchmark Mode Cnt Score Error Units > GremlinPathBenchmark.g_path avgt 100 0.680 ± 0.004 s/op > > 1.086 vs 0.689 seconds for the traversal. > > I ran the Structured and Process test suites. 2 tests are failing with > this optimization. > > InjectTest.g_VX1X_out_name_injectXdanielX_asXaX_mapXlengthX_path fails with > > "java.lang.IllegalArgumentException: The step with label a does not exist" > > and > > SerializationTest.shouldSerializePathAsDetached fails with > > "Caused by: java.lang.IllegalArgumentException: Class is not registered: > java.util.Collections$UnmodifiableSet" > > Before investigating the failures is this optimization worth pursuing? > > Thanks > Pieter >
RE: path query optimization
Hi, I have been optimizing Sqlg of late and eventually arrived at TinkerPop code. The gremlin in particular that I am interested is path queries. Here is the test that I am running in jmh. //@Setup Vertex a = graph.addVertex(T.label, "A", "name", "a1"); for (int i = 1; i < 1_000_001; i++) { Vertex b = graph.addVertex(T.label, "B", "name", "name_" + i); a.addEdge("outB", b); for (int j = 0; j < 1; j++) { Vertex c = graph.addVertex(T.label, "C", "name", "name_" + i + " " + j); b.addEdge("outC", c); } } And the query being benchmarked is GraphTraversal traversal = g.V(a).as("a").out().as("b").out().as("c").path(); while (traversal.hasNext()) { Path path = traversal.next(); } Before the optimization, (as things are now) Benchmark Mode Cnt Score Error Units GremlinPathBenchmark.g_path avgt 100 1.086 ± 0.020 s/op The optimization I did is in AbstractStep.prepareTraversalForNextStep, to not call addLabels() for path gremlins as the labels are known by the step and do not change again so there is not need to keep adding them. private final Traverser.Admin prepareTraversalForNextStep(final Traverser.Admin traverser) { if (!this.traverserStepIdAndLabelsSetByChild) { traverser.setStepId(this.nextStep.getId()); if (traverser instanceof B_LP_O_P_S_SE_SL_Traverser) { } else { traverser.addLabels(this.labels); } } return traverser; } After optimization, Benchmark Mode Cnt Score Error Units GremlinPathBenchmark.g_path avgt 100 0.680 ± 0.004 s/op 1.086 vs 0.689 seconds for the traversal. I ran the Structured and Process test suites. 2 tests are failing with this optimization. InjectTest.g_VX1X_out_name_injectXdanielX_asXaX_mapXlengthX_path fails with "java.lang.IllegalArgumentException: The step with label a does not exist" and SerializationTest.shouldSerializePathAsDetached fails with "Caused by: java.lang.IllegalArgumentException: Class is not registered: java.util.Collections$UnmodifiableSet" Before investigating the failures is this optimization worth pursuing? Thanks Pieter
[GitHub] tinkerpop issue #372: DSP-1397 Fix StarGraph addEdge for self-edges
Github user dalaro commented on the issue: https://github.com/apache/tinkerpop/pull/372 Thanks for the replies. @spmallette I will investigate whether this behavior also exists on 3.1 and retarget if so. I think this will involve making a standalone test instead of just testing TP against my app and stimulating the failure that way. @okram Yes, I realize that bothE() should return two edges. The problem this PR tried to address was that, after applying an inE graph filter to a self-looped StarVertex, the StarVertex had two edges in its outEdges map and no edges in its inEdges map (should be one each). I think this is not just some obscure internals problem, because I discovered it through a traversal. Maybe I can articulate this better with a test. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] tinkerpop issue #372: DSP-1397 Fix StarGraph addEdge for self-edges
Github user dalaro commented on the issue: https://github.com/apache/tinkerpop/pull/372 @okram also, thanks for the code; I'll have a closer look at it when I iterate on this PR. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Assigned] (TINKERPOP-1400) SubgraphStrategy introduces infinite recursion if filter has Vertex/Edge steps.
[ https://issues.apache.org/jira/browse/TINKERPOP-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marko A. Rodriguez reassigned TINKERPOP-1400: - Assignee: Marko A. Rodriguez > SubgraphStrategy introduces infinite recursion if filter has Vertex/Edge > steps. > --- > > Key: TINKERPOP-1400 > URL: https://issues.apache.org/jira/browse/TINKERPOP-1400 > Project: TinkerPop > Issue Type: Bug > Components: process >Affects Versions: 3.2.1 >Reporter: Marko A. Rodriguez >Assignee: Marko A. Rodriguez > > James from the mailing list reported: > {code} > gremlin> graph = TinkerFactory.createModern() > ==>tinkergraph[vertices:6 edges:6] > gremlin> g = graph.traversal() > ==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard] > gremlin> s = SubgraphStrategy.build().vertexCriterion(hasId(1)).create() > ==>SubgraphStrategy > gremlin> g.V().filter(hasId(1)) > ==>v[1] > gremlin> g.withStrategies(s).V() > ==>v[1] > works as expected. But if I change the predicate traversal to something > slightly more complex, e.g. in('knows').hasId(1) things start to go haywire. > The single step predicates works as expected in 3.1.1-incubating, 3.1.3 and > 3.2.1. > In 3.1.1-incubating the multi-step predicate subgraph strategy seems to end > up generating the same traversal as using filter(...) but fails to execute: > $ sh apache-gremlin-console-3.1.1-incubating/bin/gremlin.sh > \,,,/ > (o o) > -oOOo-(3)-oOOo- > plugin activated: tinkerpop.server > plugin activated: tinkerpop.utilities > plugin activated: tinkerpop.tinkergraph > gremlin> graph = TinkerFactory.createModern() > ==>tinkergraph[vertices:6 edges:6] > gremlin> g = graph.traversal() > ==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard] > gremlin> s = > SubgraphStrategy.build().vertexCriterion(__.in('knows').hasId(1)).create() > ==>SubgraphStrategy > gremlin> g1 = GraphTraversalSource.build().with(s).create(graph) > ==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard] > gremlin> g.V().filter(__.in('knows').hasId(1)).explain() > ==>Traversal Explanation > === > Original Traversal [GraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > ConnectiveStrategy [D] [GraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > IdentityRemovalStrategy [O] [GraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > IncidentToAdjacentStrategy [O] [GraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > AdjacentToIncidentStrategy [O] [GraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > FilterRankingStrategy[O] [GraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > MatchPredicateStrategy [O] [GraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > RangeByIsCountStrategy [O] [GraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > TinkerGraphStepStrategy [P] [TinkerGraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > EngineDependentStrategy [F] [TinkerGraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > ProfileStrategy [F] [TinkerGraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > StandardVerificationStrategy [V] [TinkerGraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > ComputerVerificationStrategy [V] [TinkerGraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > Final Traversal[TinkerGraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > gremlin> g.V().filter(__.in('knows').hasId(1)) > ==>v[2] > ==>v[4] > gremlin> g1.V().explain() > ==>Traversal Explanation > === > Original Traversal [GraphStep([],vertex)] > ConnectiveStrategy [D] [GraphStep([],vertex)] > SubgraphStrategy [D] [GraphStep([],vertex), > TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] > IdentityRemov
[jira] [Created] (TINKERPOP-1400) SubgraphStrategy introduces infinite recursion if filter has Vertex/Edge steps.
Marko A. Rodriguez created TINKERPOP-1400: - Summary: SubgraphStrategy introduces infinite recursion if filter has Vertex/Edge steps. Key: TINKERPOP-1400 URL: https://issues.apache.org/jira/browse/TINKERPOP-1400 Project: TinkerPop Issue Type: Bug Components: process Affects Versions: 3.2.1 Reporter: Marko A. Rodriguez James from the mailing list reported: {code} gremlin> graph = TinkerFactory.createModern() ==>tinkergraph[vertices:6 edges:6] gremlin> g = graph.traversal() ==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard] gremlin> s = SubgraphStrategy.build().vertexCriterion(hasId(1)).create() ==>SubgraphStrategy gremlin> g.V().filter(hasId(1)) ==>v[1] gremlin> g.withStrategies(s).V() ==>v[1] works as expected. But if I change the predicate traversal to something slightly more complex, e.g. in('knows').hasId(1) things start to go haywire. The single step predicates works as expected in 3.1.1-incubating, 3.1.3 and 3.2.1. In 3.1.1-incubating the multi-step predicate subgraph strategy seems to end up generating the same traversal as using filter(...) but fails to execute: $ sh apache-gremlin-console-3.1.1-incubating/bin/gremlin.sh \,,,/ (o o) -oOOo-(3)-oOOo- plugin activated: tinkerpop.server plugin activated: tinkerpop.utilities plugin activated: tinkerpop.tinkergraph gremlin> graph = TinkerFactory.createModern() ==>tinkergraph[vertices:6 edges:6] gremlin> g = graph.traversal() ==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard] gremlin> s = SubgraphStrategy.build().vertexCriterion(__.in('knows').hasId(1)).create() ==>SubgraphStrategy gremlin> g1 = GraphTraversalSource.build().with(s).create(graph) ==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard] gremlin> g.V().filter(__.in('knows').hasId(1)).explain() ==>Traversal Explanation === Original Traversal [GraphStep([],vertex), TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] ConnectiveStrategy [D] [GraphStep([],vertex), TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] IdentityRemovalStrategy [O] [GraphStep([],vertex), TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] IncidentToAdjacentStrategy [O] [GraphStep([],vertex), TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] AdjacentToIncidentStrategy [O] [GraphStep([],vertex), TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] FilterRankingStrategy[O] [GraphStep([],vertex), TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] MatchPredicateStrategy [O] [GraphStep([],vertex), TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] RangeByIsCountStrategy [O] [GraphStep([],vertex), TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] TinkerGraphStepStrategy [P] [TinkerGraphStep([],vertex), TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] EngineDependentStrategy [F] [TinkerGraphStep([],vertex), TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] ProfileStrategy [F] [TinkerGraphStep([],vertex), TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] StandardVerificationStrategy [V] [TinkerGraphStep([],vertex), TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] ComputerVerificationStrategy [V] [TinkerGraphStep([],vertex), TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] Final Traversal[TinkerGraphStep([],vertex), TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] gremlin> g.V().filter(__.in('knows').hasId(1)) ==>v[2] ==>v[4] gremlin> g1.V().explain() ==>Traversal Explanation === Original Traversal [GraphStep([],vertex)] ConnectiveStrategy [D] [GraphStep([],vertex)] SubgraphStrategy [D] [GraphStep([],vertex), TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] IdentityRemovalStrategy [O] [GraphStep([],vertex), TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] IncidentToAdjacentStrategy [O] [GraphStep([],vertex), TraversalFilterStep([VertexStep(IN,[knows],vertex), HasStep([~id.eq(1)])])] AdjacentToIncidentStrategy [O] [GraphStep([],vertex), TraversalFilterStep([VertexStep(IN,[knows],vertex), Ha