[jira] [Commented] (RYA-343) AccumuloLoadStatementsFile fails when loading data to a PCJ-enabled table.
[ https://issues.apache.org/jira/browse/RYA-343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126702#comment-16126702 ] ASF GitHub Bot commented on RYA-343: Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/205#discussion_r133100980 --- Diff: common/rya.api/src/main/java/org/apache/rya/api/instance/RyaDetailsToConfiguration.java --- @@ -53,14 +53,16 @@ public static void addRyaDetailsToConfiguration(final RyaDetails details, final checkAndSet(conf, ConfigurationFields.USE_FREETEXT, details.getFreeTextIndexDetails().isEnabled()); //RYA-215checkAndSet(conf, ConfigurationFields.USE_GEO, details.getGeoIndexDetails().isEnabled()); checkAndSet(conf, ConfigurationFields.USE_TEMPORAL, details.getTemporalIndexDetails().isEnabled()); -PCJIndexDetails pcjDetails = details.getPCJIndexDetails(); - if (pcjDetails.isEnabled() && pcjDetails.getFluoDetails().isPresent()) { - checkAndSet(conf, ConfigurationFields.USE_PCJ_UPDATER, true); - conf.set(ConfigurationFields.FLUO_APP_NAME, pcjDetails.getFluoDetails().get().getUpdateAppName()); - conf.set(ConfigurationFields.PCJ_UPDATER_TYPE, "FLUO"); - } else { - checkAndSet(conf, ConfigurationFields.USE_PCJ_UPDATER, false); - } +final PCJIndexDetails pcjDetails = details.getPCJIndexDetails(); +if (pcjDetails.isEnabled() && pcjDetails.getFluoDetails().isPresent()) { +checkAndSet(conf, ConfigurationFields.USE_PCJ_UPDATER, true); +conf.set(ConfigurationFields.FLUO_APP_NAME, pcjDetails.getFluoDetails().get().getUpdateAppName()); +conf.set(ConfigurationFields.PCJ_UPDATER_TYPE, "FLUO"); +conf.set(ConfigurationFields.PCJ_STORAGE_TYPE, "ACCUMULO"); --- End diff -- So this is probably okay for now. But this method could take in either an instance of AccumuloRyaInstanceDetails or an instance of MongoRyaInstanceDetails. 
So the storage type could be Accumulo or Mongo (note that the Mongo storage type doesn't currently exist in master, but will be merged in as part of a pull request). Given that there is currently no updater for Mongo, there will be no Fluo details, so the Accumulo storage type will never be set for a Mongo-backed instance. But this could change once an updater is implemented for Mongo. The best thing to do is to add a getStorageType() method to the RyaInstanceDetails, and then just pass this into the config so that the storage type aligns with the RyaInstanceDetails type. > AccumuloLoadStatementsFile fails when loading data to a PCJ-enabled table. > -- > > Key: RYA-343 > URL: https://issues.apache.org/jira/browse/RYA-343 > Project: Rya > Issue Type: Sub-task > Components: clients >Reporter: Jeff Dasch >Assignee: Jeff Dasch > > Issue occurs when calling {{AccumuloLoadStatementsFile.loadStatements()}} to > load data to a PCJ-enabled table. I believe this is a recent regression. > {noformat} > 2017-08-14 13:46:51,802 [Spring Shell] WARN > org.apache.rya.api.client.accumulo.AccumuloLoadStatementsFile - Exception > while loading: > org.apache.rya.api.persist.RyaDAOException: > java.lang.IllegalArgumentException: The 'rya.indexing.pcj.storageType' > property must have one of the following values: [ACCUMULO] > at org.apache.rya.accumulo.AccumuloRyaDAO.init(AccumuloRyaDAO.java:165) > at > org.apache.rya.sail.config.RyaSailFactory.getAccumuloDAO(RyaSailFactory.java:155) > at > org.apache.rya.sail.config.RyaSailFactory.getRyaSail(RyaSailFactory.java:100) > at > org.apache.rya.sail.config.RyaSailFactory.getInstance(RyaSailFactory.java:67) > at > org.apache.rya.api.client.accumulo.AccumuloLoadStatementsFile.loadStatements(AccumuloLoadStatementsFile.java:91) > at org.apache.rya.shell.RyaCommands.loadData(RyaCommands.java:121) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.springframework.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:210) > at > org.springframework.shell.core.SimpleExecutionStrategy.invoke(SimpleExecutionStrategy.java:64) > at > org.springframework.shell.core.SimpleExecutionStrategy.execute(SimpleExecutionStrategy.java:57) > at >
[GitHub] incubator-rya pull request #205: RYA-343 Fixed rya.api connection issue for ...
Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/205#discussion_r133100980 --- Diff: common/rya.api/src/main/java/org/apache/rya/api/instance/RyaDetailsToConfiguration.java --- @@ -53,14 +53,16 @@ public static void addRyaDetailsToConfiguration(final RyaDetails details, final checkAndSet(conf, ConfigurationFields.USE_FREETEXT, details.getFreeTextIndexDetails().isEnabled()); //RYA-215checkAndSet(conf, ConfigurationFields.USE_GEO, details.getGeoIndexDetails().isEnabled()); checkAndSet(conf, ConfigurationFields.USE_TEMPORAL, details.getTemporalIndexDetails().isEnabled()); -PCJIndexDetails pcjDetails = details.getPCJIndexDetails(); - if (pcjDetails.isEnabled() && pcjDetails.getFluoDetails().isPresent()) { - checkAndSet(conf, ConfigurationFields.USE_PCJ_UPDATER, true); - conf.set(ConfigurationFields.FLUO_APP_NAME, pcjDetails.getFluoDetails().get().getUpdateAppName()); - conf.set(ConfigurationFields.PCJ_UPDATER_TYPE, "FLUO"); - } else { - checkAndSet(conf, ConfigurationFields.USE_PCJ_UPDATER, false); - } +final PCJIndexDetails pcjDetails = details.getPCJIndexDetails(); +if (pcjDetails.isEnabled() && pcjDetails.getFluoDetails().isPresent()) { +checkAndSet(conf, ConfigurationFields.USE_PCJ_UPDATER, true); +conf.set(ConfigurationFields.FLUO_APP_NAME, pcjDetails.getFluoDetails().get().getUpdateAppName()); +conf.set(ConfigurationFields.PCJ_UPDATER_TYPE, "FLUO"); +conf.set(ConfigurationFields.PCJ_STORAGE_TYPE, "ACCUMULO"); --- End diff -- So this is probably okay for now. But this method could take in either an instance of AccumuloRyaInstanceDetails or an instance of MongoRyaInstanceDetails. So the storage type could be Accumulo or Mongo (note that the Mongo storage type doesn't currently exist in master, but will be merged in as part of a pull request). Given that there is currently no updater for Mongo, there will be no Fluo details, so the Accumulo storage type will never be set for a Mongo-backed instance. 
But this could change once an updater is implemented for Mongo. The best thing to do is to add a getStorageType() method to the RyaInstanceDetails, and then just pass this into the config so that the storage type aligns with the RyaInstanceDetails type. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
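The review suggestion above (derive the storage type from the instance details instead of hard-coding "ACCUMULO") might look roughly like the following. This is only a sketch under stated assumptions: `StorageTypeSketch`, `InstanceDetailsSketch`, `DetailsToConfigSketch`, and the `sketch.*` keys are hypothetical stand-ins, not Rya's real classes or configuration fields; only the `rya.indexing.pcj.storageType` key is taken from the stack trace in RYA-343.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; these are not Rya's real types.
enum StorageTypeSketch { ACCUMULO, MONGO }

final class InstanceDetailsSketch {
    final StorageTypeSketch storageType; // what a getStorageType() accessor would return
    final boolean pcjEnabled;
    final String fluoAppName;            // null when no Fluo updater is configured

    InstanceDetailsSketch(StorageTypeSketch storageType, boolean pcjEnabled, String fluoAppName) {
        this.storageType = storageType;
        this.pcjEnabled = pcjEnabled;
        this.fluoAppName = fluoAppName;
    }

    StorageTypeSketch getStorageType() {
        return storageType;
    }
}

final class DetailsToConfigSketch {
    // Only this key appears in the RYA-343 stack trace; the other keys are made up.
    static final String PCJ_STORAGE_TYPE = "rya.indexing.pcj.storageType";
    static final String USE_PCJ_UPDATER = "sketch.usePcjUpdater";
    static final String FLUO_APP_NAME = "sketch.fluoAppName";
    static final String PCJ_UPDATER_TYPE = "sketch.pcjUpdaterType";

    static Map<String, String> toConfiguration(InstanceDetailsSketch details) {
        Map<String, String> conf = new LinkedHashMap<>();
        boolean hasUpdater = details.pcjEnabled && details.fluoAppName != null;
        conf.put(USE_PCJ_UPDATER, String.valueOf(hasUpdater));
        if (hasUpdater) {
            conf.put(FLUO_APP_NAME, details.fluoAppName);
            conf.put(PCJ_UPDATER_TYPE, "FLUO");
            // Taken from the details object rather than the literal "ACCUMULO",
            // so a Mongo-backed instance would report MONGO once it has an updater.
            conf.put(PCJ_STORAGE_TYPE, details.getStorageType().name());
        }
        return conf;
    }
}
```

With this shape, the configuration follows whatever the details object reports, so no change would be needed here when a Mongo updater is added.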
[jira] [Commented] (RYA-295) Implement owl:allValuesFrom inference
[ https://issues.apache.org/jira/browse/RYA-295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126480#comment-16126480 ] ASF GitHub Bot commented on RYA-295: Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/201 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/390/ Failed Tests: 3 incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.prospector: 3 org.apache.rya.prospector.mr.ProspectorTest.testCount org.apache.rya.prospector.service.ProspectorServiceEvalStatsDAOTest.testCount org.apache.rya.prospector.service.ProspectorServiceEvalStatsDAOTest.testNoAuthsCount > Implement owl:allValuesFrom inference > - > > Key: RYA-295 > URL: https://issues.apache.org/jira/browse/RYA-295 > Project: Rya > Issue Type: Sub-task > Components: sail >Reporter: Jesse Hatfield >Assignee: Jesse Hatfield > > An *{{owl:allValuesFrom}}* restriction defines the set of resources for > which, given a particular predicate and other type, every value of that > predicate is a member of that type. Note that there may be no values at all. > For example, the ontology may state that resources of type {{:Person}} have > all values from {{:Person}} for type {{:parent}}: that is, a person's parents > are all people as well. Therefore, a pattern of the form {{?x rdf:type > :Person}} should be expanded to: > {noformat} > { ?y rdf:type :Person . > ?y :parent ?x } > UNION > { ?x rdf:type :Person } > {noformat} > i.e. we can infer {{?x}}'s personhood from the fact that child {{?y}} is > known to satisfy the restriction. > Notes: > -We can infer "x is a person, therefore all of x's parents are people". But > we can't infer "all of x's parents are people, therefore x is a person", > because of the open world semantics: we don't know that the parents given by > the data are in fact all of x's parents. 
(If there were also a cardinality > restriction and we could presume consistency, then we could infer this in the > right circumstances, but this is outside the scope of basic allValuesFrom > support.) This differs from most other property restriction rules in that we > can't infer that an object belongs to the class defined by the restriction, > but rather use the fact that an object is already known to belong in that > class in order to infer something about its neighbors in the graph (the types > of the values). > -The example above could be applied recursively, but to implement this as a > simple query rewrite we'll need to limit recursion depth (and interactions > with other rules, for the same reasons). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
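The single-level rewrite described in the RYA-295 issue can be sketched as a string expansion. This is a minimal illustration, assuming the pattern is handled as plain text; `AllValuesFromRewriteSketch` and its parameters are hypothetical and say nothing about how the inference engine in the pull request represents queries.

```java
final class AllValuesFromRewriteSketch {
    /**
     * Expands the pattern "subjectVar rdf:type valueType" for a restriction stating
     * that members of restrictionType have all values of property from valueType.
     * The rewrite is applied once only; it does not recurse.
     */
    static String expand(String subjectVar, String freshVar, String valueType,
                         String restrictionType, String property) {
        if (subjectVar.equals(freshVar)) {
            throw new IllegalArgumentException("freshVar must differ from subjectVar");
        }
        // Branch 1: some member of the restriction class points at the subject via the property.
        String inferred = "{ " + freshVar + " rdf:type " + restrictionType + " . "
                + freshVar + " " + property + " " + subjectVar + " }";
        // Branch 2: the subject is directly stated to have the value type.
        String asserted = "{ " + subjectVar + " rdf:type " + valueType + " }";
        return inferred + " UNION " + asserted;
    }
}
```

Calling `expand("?x", "?y", ":Person", ":Person", ":parent")` reproduces the expansion given in the issue text on a single line. Keeping the rewrite to one level mirrors the note about limiting recursion depth.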
[GitHub] incubator-rya issue #201: RYA-295 owl:allValuesFrom inference
Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/201 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/390/ Failed Tests: 3 incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.prospector: 3 org.apache.rya.prospector.mr.ProspectorTest.testCount org.apache.rya.prospector.service.ProspectorServiceEvalStatsDAOTest.testCount org.apache.rya.prospector.service.ProspectorServiceEvalStatsDAOTest.testNoAuthsCount
[jira] [Commented] (RYA-298) Implement rdfs:domain inference
[ https://issues.apache.org/jira/browse/RYA-298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126469#comment-16126469 ] ASF GitHub Bot commented on RYA-298: Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/197 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/389/ > Implement rdfs:domain inference > --- > > Key: RYA-298 > URL: https://issues.apache.org/jira/browse/RYA-298 > Project: Rya > Issue Type: Sub-task > Components: sail >Reporter: Jesse Hatfield >Assignee: Jesse Hatfield > > If a predicate has an *{{rdfs:domain}}* of some class, then the subject of > any triple including that predicate belongs to the class. > If the ontology states that {{:advisor}} has the domain of {{:Person}}, then > the inference engine should rewrite queries of the form {{?x rdf:type > :Person}} to check for resources which have any {{:advisor}} (as well as any > specifically stated to have type {{:Person}}).
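The effect of the rdfs:domain rule in RYA-298 can be illustrated over a small in-memory list of triples. This is a rough sketch only: `DomainInferenceSketch`, the string-array triple layout, and the sample data are assumptions made for the example, and the real change works by rewriting queries rather than scanning triples.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class DomainInferenceSketch {
    /**
     * Returns every subject that is either stated to have type cls, or that
     * appears as the subject of a predicate whose rdfs:domain is cls.
     * Each triple is a {subject, predicate, object} array.
     */
    static Set<String> instancesOf(String cls, List<String[]> triples, Map<String, String> domainOf) {
        Set<String> result = new TreeSet<>();
        for (String[] t : triples) {
            String subject = t[0];
            String predicate = t[1];
            String object = t[2];
            boolean statedType = predicate.equals("rdf:type") && object.equals(cls);
            boolean impliedByDomain = cls.equals(domainOf.get(predicate));
            if (statedType || impliedByDomain) {
                result.add(subject);
            }
        }
        return result;
    }

    // Made-up sample data: :advisor has the domain :Person, as in the issue text.
    static Set<String> demo() {
        List<String[]> triples = Arrays.asList(
                new String[] {":alice", "rdf:type", ":Person"},
                new String[] {":bob", ":advisor", ":carol"},
                new String[] {":dave", ":knows", ":alice"});
        Map<String, String> domainOf = Collections.singletonMap(":advisor", ":Person");
        return instancesOf(":Person", triples, domainOf);
    }
}
```

In the sample data, `:alice` qualifies through a stated type and `:bob` through having an `:advisor`. `:carol` is not included because the domain rule says nothing about the object of the triple.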
[GitHub] incubator-rya issue #197: RYA-298, RYA-299 Domain/range inference.
Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/197 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/389/
[jira] [Commented] (RYA-295) Implement owl:allValuesFrom inference
[ https://issues.apache.org/jira/browse/RYA-295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126460#comment-16126460 ] ASF GitHub Bot commented on RYA-295: Github user jessehatfield commented on the issue: https://github.com/apache/incubator-rya/pull/201 Fixed a typo in a log message and a redundant line in the example; updated with latest changes to master. > Implement owl:allValuesFrom inference > - > > Key: RYA-295 > URL: https://issues.apache.org/jira/browse/RYA-295 > Project: Rya > Issue Type: Sub-task > Components: sail >Reporter: Jesse Hatfield >Assignee: Jesse Hatfield > > An *{{owl:allValuesFrom}}* restriction defines the set of resources for > which, given a particular predicate and other type, every value of that > predicate is a member of that type. Note that there may be no values at all. > For example, the ontology may state that resources of type {{:Person}} have > all values from {{:Person}} for type {{:parent}}: that is, a person's parents > are all people as well. Therefore, a pattern of the form {{?x rdf:type > :Person}} should be expanded to: > {noformat} > { ?y rdf:type :Person . > ?y :parent ?x } > UNION > { ?x rdf:type :Person } > {noformat} > i.e. we can infer {{?x}}'s personhood from the fact that child {{?y}} is > known to satisfy the restriction. > Notes: > -We can infer "x is a person, therefore all of x's parents are people". But > we can't infer "all of x's parents are people, therefore x is a person", > because of the open world semantics: we don't know that the parents given by > the data are in fact all of x's parents. (If there were also a cardinality > restriction and we could presume consistency, then we could infer this in the > right circumstances, but this is outside the scope of basic allValuesFrom > support.) 
This differs from most other property restriction rules in that we > can't infer that an object belongs to the class defined by the restriction, > but rather use the fact that an object is already known to belong in that > class in order to infer something about its neighbors in the graph (the types > of the values). > -The example above could be applied recursively, but to implement this as a > simple query rewrite we'll need to limit recursion depth (and interactions > with other rules, for the same reasons).
[GitHub] incubator-rya issue #201: RYA-295 owl:allValuesFrom inference
Github user jessehatfield commented on the issue: https://github.com/apache/incubator-rya/pull/201 Fixed a typo in a log message and a redundant line in the example; updated with latest changes to master.
[jira] [Commented] (RYA-298) Implement rdfs:domain inference
[ https://issues.apache.org/jira/browse/RYA-298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126458#comment-16126458 ] ASF GitHub Bot commented on RYA-298: Github user jessehatfield commented on the issue: https://github.com/apache/incubator-rya/pull/197 Updated. At some point it would be nice to make a pass over the inference engine for thread-safety across both the old and new logic, maybe once the currently-pending extensions are added. > Implement rdfs:domain inference > --- > > Key: RYA-298 > URL: https://issues.apache.org/jira/browse/RYA-298 > Project: Rya > Issue Type: Sub-task > Components: sail >Reporter: Jesse Hatfield >Assignee: Jesse Hatfield > > If a predicate has an *{{rdfs:domain}}* of some class, then the subject of > any triple including that predicate belongs to the class. > If the ontology states that {{:advisor}} has the domain of {{:Person}}, then > the inference engine should rewrite queries of the form {{?x rdf:type > :Person}} to check for resources which have any {{:advisor}} (as well as any > specifically stated to have type {{:Person}}).
[GitHub] incubator-rya issue #197: RYA-298, RYA-299 Domain/range inference.
Github user jessehatfield commented on the issue: https://github.com/apache/incubator-rya/pull/197 Updated. At some point it would be nice to make a pass over the inference engine for thread-safety across both the old and new logic, maybe once the currently-pending extensions are added.
[jira] [Commented] (RYA-343) AccumuloLoadStatementsFile fails when loading data to a PCJ-enabled table.
[ https://issues.apache.org/jira/browse/RYA-343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126315#comment-16126315 ] ASF GitHub Bot commented on RYA-343: Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/205 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/388/ > AccumuloLoadStatementsFile fails when loading data to a PCJ-enabled table. > -- > > Key: RYA-343 > URL: https://issues.apache.org/jira/browse/RYA-343 > Project: Rya > Issue Type: Sub-task > Components: clients >Reporter: Jeff Dasch >Assignee: Jeff Dasch > > Issue occurs when calling {{AccumuloLoadStatementsFile.loadStatements()}} to > load data to a PCJ-enabled table. I believe this is a recent regression. > {noformat} > 2017-08-14 13:46:51,802 [Spring Shell] WARN > org.apache.rya.api.client.accumulo.AccumuloLoadStatementsFile - Exception > while loading: > org.apache.rya.api.persist.RyaDAOException: > java.lang.IllegalArgumentException: The 'rya.indexing.pcj.storageType' > property must have one of the following values: [ACCUMULO] > at org.apache.rya.accumulo.AccumuloRyaDAO.init(AccumuloRyaDAO.java:165) > at > org.apache.rya.sail.config.RyaSailFactory.getAccumuloDAO(RyaSailFactory.java:155) > at > org.apache.rya.sail.config.RyaSailFactory.getRyaSail(RyaSailFactory.java:100) > at > org.apache.rya.sail.config.RyaSailFactory.getInstance(RyaSailFactory.java:67) > at > org.apache.rya.api.client.accumulo.AccumuloLoadStatementsFile.loadStatements(AccumuloLoadStatementsFile.java:91) > at org.apache.rya.shell.RyaCommands.loadData(RyaCommands.java:121) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) 
> at > org.springframework.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:210) > at > org.springframework.shell.core.SimpleExecutionStrategy.invoke(SimpleExecutionStrategy.java:64) > at > org.springframework.shell.core.SimpleExecutionStrategy.execute(SimpleExecutionStrategy.java:57) > at > org.springframework.shell.core.AbstractShell.executeCommand(AbstractShell.java:127) > at > org.springframework.shell.core.JLineShell.promptLoop(JLineShell.java:533) > at org.springframework.shell.core.JLineShell.run(JLineShell.java:179) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.IllegalArgumentException: The > 'rya.indexing.pcj.storageType' property must have one of the following > values: [ACCUMULO] > at > com.google.common.base.Preconditions.checkArgument(Preconditions.java:92) > at > org.apache.rya.indexing.external.PrecomputedJoinStorageSupplier.get(PrecomputedJoinStorageSupplier.java:68) > at > org.apache.rya.indexing.external.PrecomputedJoinIndexer.init(PrecomputedJoinIndexer.java:139) > at org.apache.rya.accumulo.AccumuloRyaDAO.init(AccumuloRyaDAO.java:156) > ... 16 more > {noformat}
[GitHub] incubator-rya issue #205: RYA-343 Fixed rya.api connection issue for PCJ-ena...
Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/205 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/388/
[jira] [Commented] (RYA-343) AccumuloLoadStatementsFile fails when loading data to a PCJ-enabled table.
[ https://issues.apache.org/jira/browse/RYA-343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126273#comment-16126273 ] ASF GitHub Bot commented on RYA-343: GitHub user jdasch opened a pull request: https://github.com/apache/incubator-rya/pull/205 RYA-343 Fixed rya.api connection issue for PCJ-enabled rya table. ## Description Added a couple configs that weren't getting set and caused a regression. Purpose of this PR is to get a fix in place now. We should probably make a new JIRA to ensure this regression is captured by an IT and to expand PCJDetails to store/retrieve this information. ### Tests None. ### Links [Jira: RYA-343](https://issues.apache.org/jira/browse/RYA-343) ### Checklist - [ ] Code Review - [ ] Squash Commits People To Review @amihalik @meiercaleb You can merge this pull request into a Git repository by running: $ git pull https://github.com/jdasch/incubator-rya RYA-343 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-rya/pull/205.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #205 commit 81477c5815f3f54a3ccd5d32e87e6692461b32a9 Author: jdasch Date: 2017-08-14T19:21:10Z RYA-343 Fixed rya.api connection issue for PCJ-enabled rya table. > AccumuloLoadStatementsFile fails when loading data to a PCJ-enabled table. > -- > > Key: RYA-343 > URL: https://issues.apache.org/jira/browse/RYA-343 > Project: Rya > Issue Type: Sub-task > Components: clients >Reporter: Jeff Dasch >Assignee: Jeff Dasch > > Issue occurs when calling {{AccumuloLoadStatementsFile.loadStatements()}} to > load data to a PCJ-enabled table. I believe this is a recent regression. 
> {noformat} > 2017-08-14 13:46:51,802 [Spring Shell] WARN > org.apache.rya.api.client.accumulo.AccumuloLoadStatementsFile - Exception > while loading: > org.apache.rya.api.persist.RyaDAOException: > java.lang.IllegalArgumentException: The 'rya.indexing.pcj.storageType' > property must have one of the following values: [ACCUMULO] > at org.apache.rya.accumulo.AccumuloRyaDAO.init(AccumuloRyaDAO.java:165) > at > org.apache.rya.sail.config.RyaSailFactory.getAccumuloDAO(RyaSailFactory.java:155) > at > org.apache.rya.sail.config.RyaSailFactory.getRyaSail(RyaSailFactory.java:100) > at > org.apache.rya.sail.config.RyaSailFactory.getInstance(RyaSailFactory.java:67) > at > org.apache.rya.api.client.accumulo.AccumuloLoadStatementsFile.loadStatements(AccumuloLoadStatementsFile.java:91) > at org.apache.rya.shell.RyaCommands.loadData(RyaCommands.java:121) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.springframework.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:210) > at > org.springframework.shell.core.SimpleExecutionStrategy.invoke(SimpleExecutionStrategy.java:64) > at > org.springframework.shell.core.SimpleExecutionStrategy.execute(SimpleExecutionStrategy.java:57) > at > org.springframework.shell.core.AbstractShell.executeCommand(AbstractShell.java:127) > at > org.springframework.shell.core.JLineShell.promptLoop(JLineShell.java:533) > at org.springframework.shell.core.JLineShell.run(JLineShell.java:179) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.IllegalArgumentException: The > 'rya.indexing.pcj.storageType' property must have one of the following > values: [ACCUMULO] > at > com.google.common.base.Preconditions.checkArgument(Preconditions.java:92) > at > 
org.apache.rya.indexing.external.PrecomputedJoinStorageSupplier.get(PrecomputedJoinStorageSupplier.java:68) > at > org.apache.rya.indexing.external.PrecomputedJoinIndexer.init(PrecomputedJoinIndexer.java:139) > at org.apache.rya.accumulo.AccumuloRyaDAO.init(AccumuloRyaDAO.java:156) > ... 16 more > {noformat}
[GitHub] incubator-rya pull request #205: RYA-343 Fixed rya.api connection issue for ...
GitHub user jdasch opened a pull request: https://github.com/apache/incubator-rya/pull/205 RYA-343 Fixed rya.api connection issue for PCJ-enabled rya table. ## Description Added a couple configs that weren't getting set and caused a regression. Purpose of this PR is to get a fix in place now. We should probably make a new JIRA to ensure this regression is captured by an IT and to expand PCJDetails to store/retrieve this information. ### Tests None. ### Links [Jira: RYA-343](https://issues.apache.org/jira/browse/RYA-343) ### Checklist - [ ] Code Review - [ ] Squash Commits People To Review @amihalik @meiercaleb You can merge this pull request into a Git repository by running: $ git pull https://github.com/jdasch/incubator-rya RYA-343 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-rya/pull/205.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #205 commit 81477c5815f3f54a3ccd5d32e87e6692461b32a9 Author: jdasch Date: 2017-08-14T19:21:10Z RYA-343 Fixed rya.api connection issue for PCJ-enabled rya table.
[jira] [Created] (RYA-343) AccumuloLoadStatementsFile fails when loading data to a PCJ-enabled table.
Jeff Dasch created RYA-343: -- Summary: AccumuloLoadStatementsFile fails when loading data to a PCJ-enabled table. Key: RYA-343 URL: https://issues.apache.org/jira/browse/RYA-343 Project: Rya Issue Type: Sub-task Components: clients Reporter: Jeff Dasch Assignee: Jeff Dasch Issue occurs when calling {{AccumuloLoadStatementsFile.loadStatements()}} to load data to a PCJ-enabled table. I believe this is a recent regression. {noformat} 2017-08-14 13:46:51,802 [Spring Shell] WARN org.apache.rya.api.client.accumulo.AccumuloLoadStatementsFile - Exception while loading: org.apache.rya.api.persist.RyaDAOException: java.lang.IllegalArgumentException: The 'rya.indexing.pcj.storageType' property must have one of the following values: [ACCUMULO] at org.apache.rya.accumulo.AccumuloRyaDAO.init(AccumuloRyaDAO.java:165) at org.apache.rya.sail.config.RyaSailFactory.getAccumuloDAO(RyaSailFactory.java:155) at org.apache.rya.sail.config.RyaSailFactory.getRyaSail(RyaSailFactory.java:100) at org.apache.rya.sail.config.RyaSailFactory.getInstance(RyaSailFactory.java:67) at org.apache.rya.api.client.accumulo.AccumuloLoadStatementsFile.loadStatements(AccumuloLoadStatementsFile.java:91) at org.apache.rya.shell.RyaCommands.loadData(RyaCommands.java:121) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.springframework.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:210) at org.springframework.shell.core.SimpleExecutionStrategy.invoke(SimpleExecutionStrategy.java:64) at org.springframework.shell.core.SimpleExecutionStrategy.execute(SimpleExecutionStrategy.java:57) at org.springframework.shell.core.AbstractShell.executeCommand(AbstractShell.java:127) at org.springframework.shell.core.JLineShell.promptLoop(JLineShell.java:533) at 
org.springframework.shell.core.JLineShell.run(JLineShell.java:179) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.IllegalArgumentException: The 'rya.indexing.pcj.storageType' property must have one of the following values: [ACCUMULO] at com.google.common.base.Preconditions.checkArgument(Preconditions.java:92) at org.apache.rya.indexing.external.PrecomputedJoinStorageSupplier.get(PrecomputedJoinStorageSupplier.java:68) at org.apache.rya.indexing.external.PrecomputedJoinIndexer.init(PrecomputedJoinIndexer.java:139) at org.apache.rya.accumulo.AccumuloRyaDAO.init(AccumuloRyaDAO.java:156) ... 16 more {noformat}
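The trace shows an argument check rejecting the value of 'rya.indexing.pcj.storageType'. A minimal sketch of such a check follows, assuming an unset property arrives as null; `StorageTypeCheckSketch` is hypothetical and written in plain Java, not a copy of PrecomputedJoinStorageSupplier or of Guava's Preconditions.

```java
import java.util.Arrays;
import java.util.List;

final class StorageTypeCheckSketch {
    static final String KEY = "rya.indexing.pcj.storageType";
    // The only value listed in the error message from the trace.
    static final List<String> ALLOWED = Arrays.asList("ACCUMULO");

    /** Returns the configured value, or throws if it is missing or unsupported. */
    static String requireStorageType(String configured) {
        if (configured == null || !ALLOWED.contains(configured)) {
            throw new IllegalArgumentException("The '" + KEY
                    + "' property must have one of the following values: " + ALLOWED);
        }
        return configured;
    }

    /** Helper for the example: the error message for a value, or null if it is accepted. */
    static String messageFor(String configured) {
        try {
            requireStorageType(configured);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }
}
```

Under that assumption, a configuration that never sets the key fails this check even though the instance is Accumulo-backed, which fits the pull request's description of configs that were not being set.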
[jira] [Created] (RYA-342) Improve documentation for deploying the RYA PCJ Updater
Jeff Dasch created RYA-342: -- Summary: Improve documentation for deploying the RYA PCJ Updater Key: RYA-342 URL: https://issues.apache.org/jira/browse/RYA-342 Project: Rya Issue Type: Sub-task Components: docs Reporter: Jeff Dasch Priority: Minor
[jira] [Created] (RYA-341) refactor rya.fluo.pcj.app to be a leaf project
Jeff Dasch created RYA-341: -- Summary: refactor rya.fluo.pcj.app to be a leaf project Key: RYA-341 URL: https://issues.apache.org/jira/browse/RYA-341 Project: Rya Issue Type: Sub-task Components: build Reporter: Jeff Dasch Assignee: Jeff Dasch Priority: Minor This will allow us to properly scope (hadoop, accumulo, and fluo) dependencies as runtime and remove any now unnecessary filtering created by RYA-340. Refactor any reusable code into a new project (rya.fluo.pcj.(common/runtime/impl)). Currently rya.fluo.pcj.api and rya.periodic.service.notification are dependent on it.
[jira] [Created] (RYA-340) rya.fluo.pcj.app is not currently deployable in fluo-1.0.0
Jeff Dasch created RYA-340:

Summary: rya.fluo.pcj.app is not currently deployable in fluo-1.0.0
Key: RYA-340
URL: https://issues.apache.org/jira/browse/RYA-340
Project: Rya
Issue Type: Sub-task
Components: build
Reporter: Jeff Dasch
Assignee: Jeff Dasch

There is a guava versioning incompatibility. Fastest fix is to improve the generated artifact through filtering.

{noformat}
Exception in thread "ServiceDelegate STARTING" java.lang.IncompatibleClassChangeError: class org.apache.twill.internal.utils.Dependencies$DependencyClassVisitor has interface org.objectweb.asm.ClassVisitor as super class
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at org.apache.twill.internal.utils.Dependencies.findClassDependencies(Dependencies.java:86)
    at org.apache.twill.internal.ApplicationBundler.findDependencies(ApplicationBundler.java:198)
    at org.apache.twill.internal.ApplicationBundler.createBundle(ApplicationBundler.java:155)
    at org.apache.twill.internal.ApplicationBundler.createBundle(ApplicationBundler.java:126)
    at org.apache.twill.yarn.YarnTwillPreparer.createAppMasterJar(YarnTwillPreparer.java:402)
    at org.apache.twill.yarn.YarnTwillPreparer.access$200(YarnTwillPreparer.java:108)
    at org.apache.twill.yarn.YarnTwillPreparer$1.call(YarnTwillPreparer.java:299)
    at org.apache.twill.yarn.YarnTwillPreparer$1.call(YarnTwillPreparer.java:289)
    at org.apache.twill.yarn.YarnTwillController.doStartUp(YarnTwillController.java:97)
    at org.apache.twill.internal.AbstractZKServiceController.startUp(AbstractZKServiceController.java:76)
    at org.apache.twill.internal.AbstractExecutionServiceController$ServiceDelegate.startUp(AbstractExecutionServiceController.java:175)
    at com.google.common.util.concurrent.AbstractIdleService$1$1.run(AbstractIdleService.java:43)
    at java.lang.Thread.run(Thread.java:748)
{noformat}

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (RYA-339) Fluo PCJ Improvements
Jeff Dasch created RYA-339:

Summary: Fluo PCJ Improvements
Key: RYA-339
URL: https://issues.apache.org/jira/browse/RYA-339
Project: Rya
Issue Type: Improvement
Components: build
Reporter: Jeff Dasch
Assignee: Jeff Dasch
Priority: Minor

This is a holder issue for several subtasks regarding Fluo PCJs.
[jira] [Commented] (RYA-337) Batch Queries to MongoDB
[ https://issues.apache.org/jira/browse/RYA-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125933#comment-16125933 ]

ASF GitHub Bot commented on RYA-337:

Github user meiercaleb commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r132990636

--- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/MongoDBQueryEngine.java ---
@@ -140,72 +121,35 @@ public MongoDBRdfConfiguration getConf() {
     public CloseableIteration batchQuery(
             final Collection stmts, MongoDBRdfConfiguration conf)
             throws RyaDAOException {
-        if (conf == null) {
-            conf = configuration;
-        }
-        final Long maxResults = conf.getLimit();
-        final Set queries = new HashSet();
-
-        try {
-            for (final RyaStatement stmt : stmts) {
-                queries.add( strategy.getQuery(stmt));
-            }
+        final Map queries = new HashMap<>();

-            // TODO not sure what to do about regex ranges?
-            final RyaStatementCursorIterator iterator = new RyaStatementCursorIterator(getCollection(conf), queries,
-                    strategy, configuration.getAuthorizations());
-
-            if (maxResults != null) {
-                iterator.setMaxResults(maxResults);
-            }
-            return iterator;
-        } catch (final Exception e) {
-            throw new RyaDAOException(e);
+        for (final RyaStatement stmt : stmts) {
+            queries.put(stmt, new MapBindingSet());
         }
+        return new RyaStatementCursorIterator(queryWithBindingSet(queries.entrySet(), conf));
     }
+
     @Override
     public CloseableIterable query(final RyaQuery ryaQuery) throws RyaDAOException {
-        final Set queries = new HashSet();
-
-        try {
-            queries.add( strategy.getQuery(ryaQuery));
-
-            // TODO not sure what to do about regex ranges?
-            // TODO this is gross
-            final RyaStatementCursorIterable iterator = new RyaStatementCursorIterable(
-                    new NonCloseableRyaStatementCursorIterator(new RyaStatementCursorIterator(getCollection(getConf()),
-                            queries, strategy, configuration.getAuthorizations())));
-
-            return iterator;
-        } catch (final Exception e) {
-            throw new RyaDAOException(e);
-        }
+        return query(new BatchRyaQuery(Collections.singleton(ryaQuery.getQuery())));
--- End diff --

Null check

> Batch Queries to MongoDB
>
> Key: RYA-337
> URL: https://issues.apache.org/jira/browse/RYA-337
> Project: Rya
> Issue Type: Improvement
> Components: dao
> Reporter: Aaron Mihalik
> Assignee: Aaron Mihalik
>
> Currently the MongoDB DAO sends one query at a time to Mongo. Instead, the
> DAO should send a batch of queries and perform a client side hash join (like
> the Accumulo DAO)
[jira] [Commented] (RYA-337) Batch Queries to MongoDB
[ https://issues.apache.org/jira/browse/RYA-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125934#comment-16125934 ]

ASF GitHub Bot commented on RYA-337:

Github user meiercaleb commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r132989696

--- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/MongoDBQueryEngine.java ---
@@ -86,22 +84,10 @@ public MongoDBRdfConfiguration getConf() {
     public CloseableIteration query(
             final RyaStatement stmt, MongoDBRdfConfiguration conf)
             throws RyaDAOException {
-        if (conf == null) {
-            conf = configuration;
-        }
-        final Long maxResults = conf.getLimit();
-        final Set queries = new HashSet();
-        final DBObject query = strategy.getQuery(stmt);
-        queries.add(query);
-        final MongoDatabase db = mongoClient.getDatabase(conf.getMongoDBName());
-        final MongoCollection collection = db.getCollection(conf.getTriplesCollectionName());
-        final RyaStatementCursorIterator iterator = new RyaStatementCursorIterator(collection, queries, strategy,
-                conf.getAuthorizations());
-
-        if (maxResults != null) {
-            iterator.setMaxResults(maxResults);
-        }
-        return iterator;
+        Entry entry = new AbstractMap.SimpleEntry<>(stmt, new MapBindingSet());
--- End diff --

Check that the RyaStatement and Config are not null.
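The checks requested in the review comment above could look roughly like the following. This is only a sketch under stated assumptions: `NullCheckSketch`, `Conf`, `DEFAULT_CONF` and `describeQuery` are hypothetical stand-ins, not Rya classes or methods. It rejects a null statement up front and keeps the fallback to a default configuration that the removed lines performed, which is one possible reading of "Config [is] not null".

```java
import java.util.Objects;

// Hypothetical stand-ins for RyaStatement / MongoDBRdfConfiguration; not Rya classes.
final class NullCheckSketch {

    static final class Conf {
        final String name;

        Conf(final String name) {
            this.name = name;
        }
    }

    // Assumed default, playing the role of the engine's own configuration field.
    private static final Conf DEFAULT_CONF = new Conf("default");

    // Fails fast on a null statement and falls back to the default configuration.
    static String describeQuery(final String stmt, final Conf conf) {
        Objects.requireNonNull(stmt, "stmt must not be null");
        final Conf effective = (conf != null) ? conf : DEFAULT_CONF;
        return stmt + "@" + effective.name;
    }

    public static void main(final String[] args) {
        System.out.println(describeQuery("s p o", null));
    }
}
```

Whether a null configuration should fall back or be rejected is a design choice the review comment leaves open; swapping the ternary for a second `Objects.requireNonNull` call gives the stricter variant.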
[jira] [Commented] (RYA-337) Batch Queries to MongoDB
[ https://issues.apache.org/jira/browse/RYA-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125935#comment-16125935 ]

ASF GitHub Bot commented on RYA-337:

Github user meiercaleb commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r132980017

--- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/iter/RyaStatementBindingSetCursorIterator.java ---
@@ -44,20 +48,21 @@
 public class RyaStatementBindingSetCursorIterator implements CloseableIteration, RyaDAOException> {
     private static final Logger log = Logger.getLogger(RyaStatementBindingSetCursorIterator.class);
+
+    private static final int QUERY_BATCH_SIZE = 50;
--- End diff --

Any particular reason you went with 50 here?
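On the batch-size question above: one way to make the value easy to tune is to pass it in rather than hard-code it. The sketch below is not taken from the pull request; `BatchSketch` and `nextBatch` are hypothetical names, and it only illustrates draining an iterator in fixed-size chunks, the same shape as a loop bounded by `QUERY_BATCH_SIZE`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of Rya.
final class BatchSketch {

    // Pulls at most batchSize items from the iterator.
    // An empty result means the iterator is exhausted.
    static <T> List<T> nextBatch(final Iterator<T> it, final int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        final List<T> batch = new ArrayList<>();
        while (it.hasNext() && batch.size() < batchSize) {
            batch.add(it.next());
        }
        return batch;
    }

    public static void main(final String[] args) {
        final Iterator<Integer> queries = Arrays.asList(1, 2, 3, 4, 5).iterator();
        List<Integer> batch;
        // Each pass stands in for one batched round trip to the database.
        while (!(batch = nextBatch(queries, 2)).isEmpty()) {
            System.out.println(batch);
        }
    }
}
```

With the size as a parameter, a value such as 50 could come from configuration and be compared against alternatives in a benchmark.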
[GitHub] incubator-rya pull request #204: RYA-337 Adding batch queries to MongoDB. Cl...
Github user meiercaleb commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r132979641

--- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/iter/RyaStatementBindingSetCursorIterator.java ---
@@ -91,47 +96,81 @@ private boolean currentBindingSetIteratorIsValid() {
     }

     private void findNextResult() {
-        if (!currentResultCursorIsValid()) {
-            findNextValidResultCursor();
+        if (!currentBatchQueryResultCursorIsValid()) {
+            submitBatchQuery();
         }
-        if (currentResultCursorIsValid()) {
+
+        if (currentBatchQueryResultCursorIsValid()) {
             // convert to Rya Statement
-            final Document queryResult = resultsIterator.next();
+            final Document queryResult = batchQueryResultsIterator.next();
             final DBObject dbo = (DBObject) JSON.parse(queryResult.toJson());
-            currentStatement = strategy.deserializeDBObject(dbo);
-            currentBindingSetIterator = currentBindingSetCollection.iterator();
+            currentResultStatement = strategy.deserializeDBObject(dbo);
+
+            // Find all of the queries in the executed RangeMap that this result matches
+            // and collect all of those binding sets
+            Set bsList = new HashSet<>();
+            for (RyaStatement executedQuery : executedRangeMap.keys()) {
+                if (isResultForQuery(executedQuery, currentResultStatement)) {
+                    bsList.addAll(executedRangeMap.get(executedQuery));
+                }
+            }
+            currentBindingSetIterator = bsList.iterator();
+        }
+
+        // Handle case of invalid currentResultStatement or no binding sets returned
+        if ((currentBindingSetIterator == null || !currentBindingSetIterator.hasNext()) && (currentBatchQueryResultCursorIsValid() || queryIterator.hasNext())) {
+            findNextResult();
         }
     }
+
+    private static boolean isResultForQuery(RyaStatement query, RyaStatement result) {
+        return isResult(query.getSubject(), result.getSubject()) &&
+                isResult(query.getPredicate(), result.getPredicate()) &&
+                isResult(query.getObject(), result.getObject()) &&
+                isResult(query.getContext(), result.getContext());
+    }
+
+    private static boolean isResult(RyaType query, RyaType result) {
+        return (query == null) || query.equals(result);
+    }

-    private void findNextValidResultCursor() {
-        while (queryIterator.hasNext()){
-            final DBObject currentQuery = queryIterator.next();
-            currentBindingSetCollection = rangeMap.get(currentQuery);
-            // Executing redact aggregation to only return documents the user
-            // has access to.
-            final List pipeline = new ArrayList<>();
-            pipeline.add(new Document("$match", currentQuery));
-            pipeline.addAll(AggregationUtil.createRedactPipeline(auths));
-            log.debug(pipeline);
-
-            final AggregateIterable aggIter = coll.aggregate(pipeline);
-            aggIter.batchSize(1000);
-            resultsIterator = aggIter.iterator();
-            if (resultsIterator.hasNext()) {
-                break;
-            }
+    private void submitBatchQuery() {
+        int count = 0;
+        executedRangeMap.clear();
+        final List pipeline = new ArrayList<>();
+        final List match = new ArrayList<>();
+
+        while (queryIterator.hasNext() && count < QUERY_BATCH_SIZE){
+            count++;
+            RyaStatement query = queryIterator.next();
+            executedRangeMap.putAll(query, rangeMap.get(query));
+            final DBObject currentQuery = strategy.getQuery(query);
+            match.add(currentQuery);
         }
-    }

-    private boolean currentResultCursorIsValid() {
-        return (resultsIterator != null) && resultsIterator.hasNext();
+        if (match.size() > 1) {
+            pipeline.add(new Document("$match", new Document("$or", match)));
--- End diff --

Did you compare the performance of $or with $in? Seems like $or is used to see if a field value satisfies at least one of (possibly) many general logical expressions, while $in checks to see if a value is in some sort of indexed collection. My guess is that the added generality of $or would make it less performant for batch queries than $in.
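To make the $or versus $in comparison concrete, the sketch below builds the two filter shapes as plain `java.util` maps instead of driver `Document` objects, so it runs without MongoDB. `MatchFilterSketch`, both method names and the field name `subject` are assumptions for illustration only. Note that $in tests a single field against a list of values, so it can only replace $or when every query in the batch constrains the same one field; the queries in the diff may bind any mix of subject, predicate, object and context.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; field names and helpers are not taken from Rya.
final class MatchFilterSketch {

    // Builds {"$or": [ {field: v1}, {field: v2}, ... ]}
    static Map<String, Object> orFilter(final String field, final List<String> values) {
        final List<Map<String, Object>> clauses = new ArrayList<>();
        for (final String value : values) {
            clauses.add(Collections.<String, Object>singletonMap(field, value));
        }
        return Collections.<String, Object>singletonMap("$or", clauses);
    }

    // Builds {field: {"$in": [v1, v2, ...]}}
    static Map<String, Object> inFilter(final String field, final List<String> values) {
        final List<String> copy = new ArrayList<>(values);
        final Map<String, Object> in = Collections.<String, Object>singletonMap("$in", copy);
        return Collections.<String, Object>singletonMap(field, in);
    }

    public static void main(final String[] args) {
        final List<String> subjects = Arrays.asList("urn:a", "urn:b");
        System.out.println(orFilter("subject", subjects));
        System.out.println(inFilter("subject", subjects));
    }
}
```

Which form is faster for a given collection and index layout would need to be measured; the sketch only shows that the two filters are interchangeable in the single-field case.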
[GitHub] incubator-rya pull request #204: RYA-337 Adding batch queries to MongoDB. Cl...
Github user meiercaleb commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r132990694

--- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/MongoDBQueryEngine.java ---
@@ -140,72 +121,35 @@ public MongoDBRdfConfiguration getConf() {
     public CloseableIteration batchQuery(
             final Collection stmts, MongoDBRdfConfiguration conf)
             throws RyaDAOException {
-        if (conf == null) {
-            conf = configuration;
-        }
-        final Long maxResults = conf.getLimit();
-        final Set queries = new HashSet();
-
-        try {
-            for (final RyaStatement stmt : stmts) {
-                queries.add( strategy.getQuery(stmt));
-            }
+        final Map queries = new HashMap<>();

-            // TODO not sure what to do about regex ranges?
-            final RyaStatementCursorIterator iterator = new RyaStatementCursorIterator(getCollection(conf), queries,
-                    strategy, configuration.getAuthorizations());
-
-            if (maxResults != null) {
-                iterator.setMaxResults(maxResults);
-            }
-            return iterator;
-        } catch (final Exception e) {
-            throw new RyaDAOException(e);
+        for (final RyaStatement stmt : stmts) {
+            queries.put(stmt, new MapBindingSet());
         }
+        return new RyaStatementCursorIterator(queryWithBindingSet(queries.entrySet(), conf));
     }
+
     @Override
     public CloseableIterable query(final RyaQuery ryaQuery) throws RyaDAOException {
-        final Set queries = new HashSet();
-
-        try {
-            queries.add( strategy.getQuery(ryaQuery));
-
-            // TODO not sure what to do about regex ranges?
-            // TODO this is gross
-            final RyaStatementCursorIterable iterator = new RyaStatementCursorIterable(
-                    new NonCloseableRyaStatementCursorIterator(new RyaStatementCursorIterator(getCollection(getConf()),
-                            queries, strategy, configuration.getAuthorizations())));
-
-            return iterator;
-        } catch (final Exception e) {
-            throw new RyaDAOException(e);
-        }
+        return query(new BatchRyaQuery(Collections.singleton(ryaQuery.getQuery())));
     }
+
     @Override
     public CloseableIterable query(final BatchRyaQuery batchRyaQuery) throws RyaDAOException {
-        try {
-            final Set queries = new HashSet();
-            for (final RyaStatement statement : batchRyaQuery.getQueries()){
-                queries.add( strategy.getQuery(statement));
+        final Map queries = new HashMap<>();
--- End diff --

Null check.

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] incubator-rya pull request #204: RYA-337 Adding batch queries to MongoDB. Cl...
Github user meiercaleb commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r132990271

--- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/MongoDBQueryEngine.java ---
@@ -111,25 +97,20 @@ public MongoDBRdfConfiguration getConf() {
         if (conf == null) {
--- End diff --

Check that stmts is not null
[jira] [Commented] (RYA-337) Batch Queries to MongoDB
[ https://issues.apache.org/jira/browse/RYA-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125772#comment-16125772 ]

ASF GitHub Bot commented on RYA-337:

Github user asfgit commented on the issue:

https://github.com/apache/incubator-rya/pull/204

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/387/
[GitHub] incubator-rya issue #204: RYA-337 Adding batch queries to MongoDB. Closes #2...
Github user asfgit commented on the issue:

https://github.com/apache/incubator-rya/pull/204

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/386/

Failed Tests: 1
incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.indexing.example: 1
ExamplesTest.MongoRyaDirectExampleTest
[GitHub] incubator-rya pull request #204: RYA-337 Adding batch queries to MongoDB. Cl...
GitHub user amihalik opened a pull request:

https://github.com/apache/incubator-rya/pull/204

RYA-337 Adding batch queries to MongoDB. Closes #204

## Description
Added a batch query mechanism to MongoDB DAO and simplified MongoDBQueryEngine

### Tests
Ran unit tests

### Links
[Jira](https://issues.apache.org/jira/browse/RYA-337)

### Checklist
- [ ] Code Review
- [ ] Squash Commits

People To Review
@isper3at @pujav65 @meiercaleb

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/amihalik/incubator-rya RYA-337

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-rya/pull/204.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #204

commit d19ed3c7fb4ca71bc3c75e1aafc77c4b89c76270
Author: Aaron Mihalik
Date: 2017-08-08T15:17:37Z

RYA-337 Adding batch queries to MongoDB. Closes #204

Additionally, simplifying MongoDBQueryEngine