[jira] [Commented] (RYA-343) AccumuloLoadStatementsFile fails when loading data to a PCJ-enabled table.

2017-08-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126702#comment-16126702
 ] 

ASF GitHub Bot commented on RYA-343:


Github user meiercaleb commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/205#discussion_r133100980
  
--- Diff: common/rya.api/src/main/java/org/apache/rya/api/instance/RyaDetailsToConfiguration.java ---
@@ -53,14 +53,16 @@ public static void addRyaDetailsToConfiguration(final RyaDetails details, final
         checkAndSet(conf, ConfigurationFields.USE_FREETEXT, details.getFreeTextIndexDetails().isEnabled());
         //RYA-215checkAndSet(conf, ConfigurationFields.USE_GEO, details.getGeoIndexDetails().isEnabled());
         checkAndSet(conf, ConfigurationFields.USE_TEMPORAL, details.getTemporalIndexDetails().isEnabled());
-        PCJIndexDetails pcjDetails = details.getPCJIndexDetails();
-        if (pcjDetails.isEnabled() && pcjDetails.getFluoDetails().isPresent()) {
-            checkAndSet(conf, ConfigurationFields.USE_PCJ_UPDATER, true);
-            conf.set(ConfigurationFields.FLUO_APP_NAME, pcjDetails.getFluoDetails().get().getUpdateAppName());
-            conf.set(ConfigurationFields.PCJ_UPDATER_TYPE, "FLUO");
-        } else {
-            checkAndSet(conf, ConfigurationFields.USE_PCJ_UPDATER, false);
-        }
+        final PCJIndexDetails pcjDetails = details.getPCJIndexDetails();
+        if (pcjDetails.isEnabled() && pcjDetails.getFluoDetails().isPresent()) {
+            checkAndSet(conf, ConfigurationFields.USE_PCJ_UPDATER, true);
+            conf.set(ConfigurationFields.FLUO_APP_NAME, pcjDetails.getFluoDetails().get().getUpdateAppName());
+            conf.set(ConfigurationFields.PCJ_UPDATER_TYPE, "FLUO");
+            conf.set(ConfigurationFields.PCJ_STORAGE_TYPE, "ACCUMULO");
--- End diff --

So this is probably okay for now.  But this method could take in either an 
instance of AccumuloRyaInstanceDetails or an instance of 
MongoRyaInstanceDetails, so the storage type could be Accumulo or Mongo (note 
that the Mongo storage type doesn't currently exist in master, but will be 
merged in as part of a pull request).  Given that there is currently no updater 
for Mongo, there will be no Fluo details, so the Accumulo storage type will 
never be set for a Mongo-backed instance.  But this could change once an 
updater is implemented for Mongo.  The best thing to do is to add a 
getStorageType() method to RyaInstanceDetails, and then just pass this into 
the config so that the storage type aligns with the RyaInstanceDetails type.
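
A minimal sketch of the suggestion above, for illustration only. The types here (StorageType, InstanceDetails, AccumuloDetails, MongoDetails) are hypothetical stand-ins and not the real Rya classes, a plain Map stands in for the Hadoop Configuration, and the property key is copied from the RYA-343 error message on the assumption that it is the key behind ConfigurationFields.PCJ_STORAGE_TYPE.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: none of these types are the real Rya classes. */
final class StorageTypeSketch {

    /** Assumed set of PCJ storage back ends (MONGO is not in master per the review). */
    enum StorageType { ACCUMULO, MONGO }

    /** Stand-in for RyaInstanceDetails carrying the proposed accessor. */
    interface InstanceDetails {
        StorageType getStorageType();
    }

    static final class AccumuloDetails implements InstanceDetails {
        @Override
        public StorageType getStorageType() { return StorageType.ACCUMULO; }
    }

    static final class MongoDetails implements InstanceDetails {
        @Override
        public StorageType getStorageType() { return StorageType.MONGO; }
    }

    /** Key as it appears in the RYA-343 error message (assumed to match the config field). */
    static final String PCJ_STORAGE_TYPE = "rya.indexing.pcj.storageType";

    /** Copies the storage type from the details object instead of hard-coding "ACCUMULO". */
    static void addStorageType(InstanceDetails details, Map<String, String> conf) {
        conf.put(PCJ_STORAGE_TYPE, details.getStorageType().name());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        addStorageType(new AccumuloDetails(), conf);
        System.out.println(PCJ_STORAGE_TYPE + " = " + conf.get(PCJ_STORAGE_TYPE));
    }
}
```

With this shape the config value follows whichever details type is passed in, so a future Mongo updater would not silently inherit the Accumulo setting.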


> AccumuloLoadStatementsFile fails when loading data to a PCJ-enabled table.
> --
>
> Key: RYA-343
> URL: https://issues.apache.org/jira/browse/RYA-343
> Project: Rya
> Issue Type: Sub-task
> Components: clients
> Reporter: Jeff Dasch
> Assignee: Jeff Dasch
>
> Issue occurs when calling {{AccumuloLoadStatementsFile.loadStatements()}} to 
> load data to a PCJ-enabled table.  I believe this is a recent regression.
> {noformat}
> 2017-08-14 13:46:51,802 [Spring Shell] WARN  org.apache.rya.api.client.accumulo.AccumuloLoadStatementsFile - Exception while loading:
> org.apache.rya.api.persist.RyaDAOException: java.lang.IllegalArgumentException: The 'rya.indexing.pcj.storageType' property must have one of the following values: [ACCUMULO]
>   at org.apache.rya.accumulo.AccumuloRyaDAO.init(AccumuloRyaDAO.java:165)
>   at org.apache.rya.sail.config.RyaSailFactory.getAccumuloDAO(RyaSailFactory.java:155)
>   at org.apache.rya.sail.config.RyaSailFactory.getRyaSail(RyaSailFactory.java:100)
>   at org.apache.rya.sail.config.RyaSailFactory.getInstance(RyaSailFactory.java:67)
>   at org.apache.rya.api.client.accumulo.AccumuloLoadStatementsFile.loadStatements(AccumuloLoadStatementsFile.java:91)
>   at org.apache.rya.shell.RyaCommands.loadData(RyaCommands.java:121)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.springframework.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:210)
>   at org.springframework.shell.core.SimpleExecutionStrategy.invoke(SimpleExecutionStrategy.java:64)
>   at org.springframework.shell.core.SimpleExecutionStrategy.execute(SimpleExecutionStrategy.java:57)
>   at 
> 

[GitHub] incubator-rya pull request #205: RYA-343 Fixed rya.api connection issue for ...

2017-08-14 Thread meiercaleb
Github user meiercaleb commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/205#discussion_r133100980
  
--- Diff: common/rya.api/src/main/java/org/apache/rya/api/instance/RyaDetailsToConfiguration.java ---
@@ -53,14 +53,16 @@ public static void addRyaDetailsToConfiguration(final RyaDetails details, final
         checkAndSet(conf, ConfigurationFields.USE_FREETEXT, details.getFreeTextIndexDetails().isEnabled());
         //RYA-215checkAndSet(conf, ConfigurationFields.USE_GEO, details.getGeoIndexDetails().isEnabled());
         checkAndSet(conf, ConfigurationFields.USE_TEMPORAL, details.getTemporalIndexDetails().isEnabled());
-        PCJIndexDetails pcjDetails = details.getPCJIndexDetails();
-        if (pcjDetails.isEnabled() && pcjDetails.getFluoDetails().isPresent()) {
-            checkAndSet(conf, ConfigurationFields.USE_PCJ_UPDATER, true);
-            conf.set(ConfigurationFields.FLUO_APP_NAME, pcjDetails.getFluoDetails().get().getUpdateAppName());
-            conf.set(ConfigurationFields.PCJ_UPDATER_TYPE, "FLUO");
-        } else {
-            checkAndSet(conf, ConfigurationFields.USE_PCJ_UPDATER, false);
-        }
+        final PCJIndexDetails pcjDetails = details.getPCJIndexDetails();
+        if (pcjDetails.isEnabled() && pcjDetails.getFluoDetails().isPresent()) {
+            checkAndSet(conf, ConfigurationFields.USE_PCJ_UPDATER, true);
+            conf.set(ConfigurationFields.FLUO_APP_NAME, pcjDetails.getFluoDetails().get().getUpdateAppName());
+            conf.set(ConfigurationFields.PCJ_UPDATER_TYPE, "FLUO");
+            conf.set(ConfigurationFields.PCJ_STORAGE_TYPE, "ACCUMULO");
--- End diff --

So this is probably okay for now.  But this method could take in either an 
instance of AccumuloRyaInstanceDetails or an instance of 
MongoRyaInstanceDetails, so the storage type could be Accumulo or Mongo (note 
that the Mongo storage type doesn't currently exist in master, but will be 
merged in as part of a pull request).  Given that there is currently no updater 
for Mongo, there will be no Fluo details, so the Accumulo storage type will 
never be set for a Mongo-backed instance.  But this could change once an 
updater is implemented for Mongo.  The best thing to do is to add a 
getStorageType() method to RyaInstanceDetails, and then just pass this into 
the config so that the storage type aligns with the RyaInstanceDetails type.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (RYA-295) Implement owl:allValuesFrom inference

2017-08-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126480#comment-16126480
 ] 

ASF GitHub Bot commented on RYA-295:


Github user asfgit commented on the issue:

https://github.com/apache/incubator-rya/pull/201
  

Refer to this link for build results (access rights to CI server needed): 

https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/390/

Failed Tests: 3
incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.prospector: 3
org.apache.rya.prospector.mr.ProspectorTest.testCount
org.apache.rya.prospector.service.ProspectorServiceEvalStatsDAOTest.testCount
org.apache.rya.prospector.service.ProspectorServiceEvalStatsDAOTest.testNoAuthsCount



> Implement owl:allValuesFrom inference
> -
>
> Key: RYA-295
> URL: https://issues.apache.org/jira/browse/RYA-295
> Project: Rya
> Issue Type: Sub-task
> Components: sail
> Reporter: Jesse Hatfield
> Assignee: Jesse Hatfield
>
> An *{{owl:allValuesFrom}}* restriction defines the set of resources for 
> which, given a particular predicate and other type, every value of that 
> predicate is a member of that type. Note that there may be no values at all.
> For example, the ontology may state that resources of type {{:Person}} have 
> all values from {{:Person}} for predicate {{:parent}}: that is, a person's parents 
> are all people as well. Therefore, a pattern of the form {{?x rdf:type 
> :Person}} should be expanded to:
> {noformat}
> { ?y rdf:type :Person .
>   ?y :parent ?x }
> UNION
> { ?x rdf:type :Person }
> {noformat}
> i.e. we can infer {{?x}}'s personhood from the fact that child {{?y}} is 
> known to satisfy the restriction.
> Notes:
> - We can infer "x is a person, therefore all of x's parents are people". But 
> we can't infer "all of x's parents are people, therefore x is a person", 
> because of the open world semantics: we don't know that the parents given by 
> the data are in fact all of x's parents. (If there were also a cardinality 
> restriction and we could presume consistency, then we could infer this in the 
> right circumstances, but this is outside the scope of basic allValuesFrom 
> support.) This differs from most other property restriction rules in that we 
> can't infer that an object belongs to the class defined by the restriction, 
> but rather use the fact that an object is already known to belong in that 
> class in order to infer something about its neighbors in the graph (the types 
> of the values).
> - The example above could be applied recursively, but to implement this as a 
> simple query rewrite we'll need to limit recursion depth (and interactions 
> with other rules, for the same reasons).
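
The expansion described in the issue can be sketched over an in-memory list of triples. This is an illustration only, not Rya's inference engine: the class and method names are hypothetical, triples are plain String arrays, and the rule is applied exactly once, which is one simple way to respect the recursion limit mentioned in the notes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of a single, non-recursive owl:allValuesFrom expansion. */
final class AllValuesFromSketch {

    static final String RDF_TYPE = "rdf:type";

    /**
     * Evaluates { ?y rdf:type restrictionType . ?y property ?x } UNION { ?x rdf:type valueType }.
     * Each triple is a {subject, predicate, object} array.
     */
    static Set<String> instancesOf(String valueType, String restrictionType,
                                   String property, List<String[]> triples) {
        Set<String> restricted = new HashSet<>();
        Set<String> result = new TreeSet<>();
        for (String[] t : triples) {
            if (t[1].equals(RDF_TYPE)) {
                if (t[2].equals(valueType)) {
                    result.add(t[0]);          // right-hand branch: stated type
                }
                if (t[2].equals(restrictionType)) {
                    restricted.add(t[0]);      // candidates for ?y
                }
            }
        }
        for (String[] t : triples) {
            if (t[1].equals(property) && restricted.contains(t[0])) {
                result.add(t[2]);              // left-hand branch: value of a restricted subject
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> triples = Arrays.asList(
                new String[] {":alice", RDF_TYPE, ":Person"},
                new String[] {":alice", ":parent", ":bob"},
                new String[] {":bob", ":parent", ":eve"},     // not reached: only one level is applied
                new String[] {":carol", ":parent", ":dave"}); // :carol is not known to be a :Person
        System.out.println(instancesOf(":Person", ":Person", ":parent", triples));
    }
}
```

Here :bob is inferred from :alice, while :eve is not, because :bob's own type is only inferred and the rule is not re-applied.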



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] incubator-rya issue #201: RYA-295 owl:allValuesFrom inference

2017-08-14 Thread asfgit
Github user asfgit commented on the issue:

https://github.com/apache/incubator-rya/pull/201
  

Refer to this link for build results (access rights to CI server needed): 

https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/390/

Failed Tests: 3
incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.prospector: 3
org.apache.rya.prospector.mr.ProspectorTest.testCount
org.apache.rya.prospector.service.ProspectorServiceEvalStatsDAOTest.testCount
org.apache.rya.prospector.service.ProspectorServiceEvalStatsDAOTest.testNoAuthsCount





[jira] [Commented] (RYA-298) Implement rdfs:domain inference

2017-08-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126469#comment-16126469
 ] 

ASF GitHub Bot commented on RYA-298:


Github user asfgit commented on the issue:

https://github.com/apache/incubator-rya/pull/197
  

Refer to this link for build results (access rights to CI server needed): 

https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/389/



> Implement rdfs:domain inference
> ---
>
> Key: RYA-298
> URL: https://issues.apache.org/jira/browse/RYA-298
> Project: Rya
> Issue Type: Sub-task
> Components: sail
> Reporter: Jesse Hatfield
> Assignee: Jesse Hatfield
>
> If a predicate has an *{{rdfs:domain}}* of some class, then the subject of 
> any triple including that predicate belongs to the class.
> If the ontology states that {{:advisor}} has the domain of {{:Person}}, then 
> the inference engine should rewrite queries of the form {{?x rdf:type 
> :Person}} to check for resources which have any {{:advisor}} (as well as any 
> specifically stated to have type {{:Person}} ).
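
The rewrite described above can be sketched as plain text manipulation. This is only an illustration under assumptions: the class and method names are hypothetical, the generated variable names (?domainObj0, ...) are invented for the example, and a real engine would more likely rewrite a parsed query model than a string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: expands a type pattern using rdfs:domain statements. */
final class DomainRewriteSketch {

    /**
     * Builds a UNION of the stated-type pattern and one pattern per property
     * whose rdfs:domain is the requested type.
     */
    static String rewriteTypePattern(String var, String type, List<String> propertiesWithDomain) {
        List<String> branches = new ArrayList<>();
        branches.add("{ " + var + " rdf:type " + type + " }");
        int i = 0;
        for (String property : propertiesWithDomain) {
            // Any subject of a triple using this property belongs to the domain class.
            branches.add("{ " + var + " " + property + " ?domainObj" + i + " }");
            i++;
        }
        return String.join(" UNION ", branches);
    }

    public static void main(String[] args) {
        System.out.println(rewriteTypePattern("?x", ":Person", Arrays.asList(":advisor")));
    }
}
```

For the example in the issue, the call in main produces a pattern that matches resources stated to be a :Person as well as any resource that has an :advisor.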





[GitHub] incubator-rya issue #197: RYA-298, RYA-299 Domain/range inference.

2017-08-14 Thread asfgit
Github user asfgit commented on the issue:

https://github.com/apache/incubator-rya/pull/197
  

Refer to this link for build results (access rights to CI server needed): 

https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/389/





[jira] [Commented] (RYA-295) Implement owl:allValuesFrom inference

2017-08-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126460#comment-16126460
 ] 

ASF GitHub Bot commented on RYA-295:


Github user jessehatfield commented on the issue:

https://github.com/apache/incubator-rya/pull/201
  
Fixed a typo in a log message and a redundant line in the example; updated 
with latest changes to master.


> Implement owl:allValuesFrom inference
> -
>
> Key: RYA-295
> URL: https://issues.apache.org/jira/browse/RYA-295
> Project: Rya
> Issue Type: Sub-task
> Components: sail
> Reporter: Jesse Hatfield
> Assignee: Jesse Hatfield
>
> An *{{owl:allValuesFrom}}* restriction defines the set of resources for 
> which, given a particular predicate and other type, every value of that 
> predicate is a member of that type. Note that there may be no values at all.
> For example, the ontology may state that resources of type {{:Person}} have 
> all values from {{:Person}} for predicate {{:parent}}: that is, a person's parents 
> are all people as well. Therefore, a pattern of the form {{?x rdf:type 
> :Person}} should be expanded to:
> {noformat}
> { ?y rdf:type :Person .
>   ?y :parent ?x }
> UNION
> { ?x rdf:type :Person }
> {noformat}
> i.e. we can infer {{?x}}'s personhood from the fact that child {{?y}} is 
> known to satisfy the restriction.
> Notes:
> - We can infer "x is a person, therefore all of x's parents are people". But 
> we can't infer "all of x's parents are people, therefore x is a person", 
> because of the open world semantics: we don't know that the parents given by 
> the data are in fact all of x's parents. (If there were also a cardinality 
> restriction and we could presume consistency, then we could infer this in the 
> right circumstances, but this is outside the scope of basic allValuesFrom 
> support.) This differs from most other property restriction rules in that we 
> can't infer that an object belongs to the class defined by the restriction, 
> but rather use the fact that an object is already known to belong in that 
> class in order to infer something about its neighbors in the graph (the types 
> of the values).
> - The example above could be applied recursively, but to implement this as a 
> simple query rewrite we'll need to limit recursion depth (and interactions 
> with other rules, for the same reasons).





[GitHub] incubator-rya issue #201: RYA-295 owl:allValuesFrom inference

2017-08-14 Thread jessehatfield
Github user jessehatfield commented on the issue:

https://github.com/apache/incubator-rya/pull/201
  
Fixed a typo in a log message and a redundant line in the example; updated 
with latest changes to master.




[jira] [Commented] (RYA-298) Implement rdfs:domain inference

2017-08-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126458#comment-16126458
 ] 

ASF GitHub Bot commented on RYA-298:


Github user jessehatfield commented on the issue:

https://github.com/apache/incubator-rya/pull/197
  
Updated. At some point it would be nice to make a pass over the inference 
engine for thread-safety across both the old and new logic, maybe once the 
currently-pending extensions are added.


> Implement rdfs:domain inference
> ---
>
> Key: RYA-298
> URL: https://issues.apache.org/jira/browse/RYA-298
> Project: Rya
> Issue Type: Sub-task
> Components: sail
> Reporter: Jesse Hatfield
> Assignee: Jesse Hatfield
>
> If a predicate has an *{{rdfs:domain}}* of some class, then the subject of 
> any triple including that predicate belongs to the class.
> If the ontology states that {{:advisor}} has the domain of {{:Person}}, then 
> the inference engine should rewrite queries of the form {{?x rdf:type 
> :Person}} to check for resources which have any {{:advisor}} (as well as any 
> specifically stated to have type {{:Person}} ).





[GitHub] incubator-rya issue #197: RYA-298, RYA-299 Domain/range inference.

2017-08-14 Thread jessehatfield
Github user jessehatfield commented on the issue:

https://github.com/apache/incubator-rya/pull/197
  
Updated. At some point it would be nice to make a pass over the inference 
engine for thread-safety across both the old and new logic, maybe once the 
currently-pending extensions are added.




[jira] [Commented] (RYA-343) AccumuloLoadStatementsFile fails when loading data to a PCJ-enabled table.

2017-08-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126315#comment-16126315
 ] 

ASF GitHub Bot commented on RYA-343:


Github user asfgit commented on the issue:

https://github.com/apache/incubator-rya/pull/205
  

Refer to this link for build results (access rights to CI server needed): 

https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/388/



> AccumuloLoadStatementsFile fails when loading data to a PCJ-enabled table.
> --
>
> Key: RYA-343
> URL: https://issues.apache.org/jira/browse/RYA-343
> Project: Rya
> Issue Type: Sub-task
> Components: clients
> Reporter: Jeff Dasch
> Assignee: Jeff Dasch
>
> Issue occurs when calling {{AccumuloLoadStatementsFile.loadStatements()}} to 
> load data to a PCJ-enabled table.  I believe this is a recent regression.
> {noformat}
> 2017-08-14 13:46:51,802 [Spring Shell] WARN  org.apache.rya.api.client.accumulo.AccumuloLoadStatementsFile - Exception while loading:
> org.apache.rya.api.persist.RyaDAOException: java.lang.IllegalArgumentException: The 'rya.indexing.pcj.storageType' property must have one of the following values: [ACCUMULO]
>   at org.apache.rya.accumulo.AccumuloRyaDAO.init(AccumuloRyaDAO.java:165)
>   at org.apache.rya.sail.config.RyaSailFactory.getAccumuloDAO(RyaSailFactory.java:155)
>   at org.apache.rya.sail.config.RyaSailFactory.getRyaSail(RyaSailFactory.java:100)
>   at org.apache.rya.sail.config.RyaSailFactory.getInstance(RyaSailFactory.java:67)
>   at org.apache.rya.api.client.accumulo.AccumuloLoadStatementsFile.loadStatements(AccumuloLoadStatementsFile.java:91)
>   at org.apache.rya.shell.RyaCommands.loadData(RyaCommands.java:121)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.springframework.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:210)
>   at org.springframework.shell.core.SimpleExecutionStrategy.invoke(SimpleExecutionStrategy.java:64)
>   at org.springframework.shell.core.SimpleExecutionStrategy.execute(SimpleExecutionStrategy.java:57)
>   at org.springframework.shell.core.AbstractShell.executeCommand(AbstractShell.java:127)
>   at org.springframework.shell.core.JLineShell.promptLoop(JLineShell.java:533)
>   at org.springframework.shell.core.JLineShell.run(JLineShell.java:179)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalArgumentException: The 'rya.indexing.pcj.storageType' property must have one of the following values: [ACCUMULO]
>   at com.google.common.base.Preconditions.checkArgument(Preconditions.java:92)
>   at org.apache.rya.indexing.external.PrecomputedJoinStorageSupplier.get(PrecomputedJoinStorageSupplier.java:68)
>   at org.apache.rya.indexing.external.PrecomputedJoinIndexer.init(PrecomputedJoinIndexer.java:139)
>   at org.apache.rya.accumulo.AccumuloRyaDAO.init(AccumuloRyaDAO.java:156)
>   ... 16 more
> {noformat}





[GitHub] incubator-rya issue #205: RYA-343 Fixed rya.api connection issue for PCJ-ena...

2017-08-14 Thread asfgit
Github user asfgit commented on the issue:

https://github.com/apache/incubator-rya/pull/205
  

Refer to this link for build results (access rights to CI server needed): 

https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/388/





[jira] [Commented] (RYA-343) AccumuloLoadStatementsFile fails when loading data to a PCJ-enabled table.

2017-08-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126273#comment-16126273
 ] 

ASF GitHub Bot commented on RYA-343:


GitHub user jdasch opened a pull request:

https://github.com/apache/incubator-rya/pull/205

RYA-343 Fixed rya.api connection issue for PCJ-enabled rya table.


## Description
Added a couple configs that weren't getting set and caused a regression.

Purpose of this PR is to get a fix in place now.  We should probably make a 
new JIRA to ensure this regression is captured by an IT and to expand 
PCJDetails to store/retrieve this information.

### Tests
None.

### Links
[Jira: RYA-343](https://issues.apache.org/jira/browse/RYA-343)

### Checklist
- [ ] Code Review
- [ ] Squash Commits

### People To Review
@amihalik 
@meiercaleb 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jdasch/incubator-rya RYA-343

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-rya/pull/205.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #205


commit 81477c5815f3f54a3ccd5d32e87e6692461b32a9
Author: jdasch 
Date:   2017-08-14T19:21:10Z

RYA-343 Fixed rya.api connection issue for PCJ-enabled rya table.




> AccumuloLoadStatementsFile fails when loading data to a PCJ-enabled table.
> --
>
> Key: RYA-343
> URL: https://issues.apache.org/jira/browse/RYA-343
> Project: Rya
> Issue Type: Sub-task
> Components: clients
> Reporter: Jeff Dasch
> Assignee: Jeff Dasch
>
> Issue occurs when calling {{AccumuloLoadStatementsFile.loadStatements()}} to 
> load data to a PCJ-enabled table.  I believe this is a recent regression.
> {noformat}
> 2017-08-14 13:46:51,802 [Spring Shell] WARN  org.apache.rya.api.client.accumulo.AccumuloLoadStatementsFile - Exception while loading:
> org.apache.rya.api.persist.RyaDAOException: java.lang.IllegalArgumentException: The 'rya.indexing.pcj.storageType' property must have one of the following values: [ACCUMULO]
>   at org.apache.rya.accumulo.AccumuloRyaDAO.init(AccumuloRyaDAO.java:165)
>   at org.apache.rya.sail.config.RyaSailFactory.getAccumuloDAO(RyaSailFactory.java:155)
>   at org.apache.rya.sail.config.RyaSailFactory.getRyaSail(RyaSailFactory.java:100)
>   at org.apache.rya.sail.config.RyaSailFactory.getInstance(RyaSailFactory.java:67)
>   at org.apache.rya.api.client.accumulo.AccumuloLoadStatementsFile.loadStatements(AccumuloLoadStatementsFile.java:91)
>   at org.apache.rya.shell.RyaCommands.loadData(RyaCommands.java:121)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.springframework.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:210)
>   at org.springframework.shell.core.SimpleExecutionStrategy.invoke(SimpleExecutionStrategy.java:64)
>   at org.springframework.shell.core.SimpleExecutionStrategy.execute(SimpleExecutionStrategy.java:57)
>   at org.springframework.shell.core.AbstractShell.executeCommand(AbstractShell.java:127)
>   at org.springframework.shell.core.JLineShell.promptLoop(JLineShell.java:533)
>   at org.springframework.shell.core.JLineShell.run(JLineShell.java:179)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalArgumentException: The 'rya.indexing.pcj.storageType' property must have one of the following values: [ACCUMULO]
>   at com.google.common.base.Preconditions.checkArgument(Preconditions.java:92)
>   at org.apache.rya.indexing.external.PrecomputedJoinStorageSupplier.get(PrecomputedJoinStorageSupplier.java:68)
>   at org.apache.rya.indexing.external.PrecomputedJoinIndexer.init(PrecomputedJoinIndexer.java:139)
>   at org.apache.rya.accumulo.AccumuloRyaDAO.init(AccumuloRyaDAO.java:156)
>   ... 16 more
> {noformat}





[GitHub] incubator-rya pull request #205: RYA-343 Fixed rya.api connection issue for ...

2017-08-14 Thread jdasch
GitHub user jdasch opened a pull request:

https://github.com/apache/incubator-rya/pull/205

RYA-343 Fixed rya.api connection issue for PCJ-enabled rya table.


## Description
Added a couple configs that weren't getting set and caused a regression.

Purpose of this PR is to get a fix in place now.  We should probably make a 
new JIRA to ensure this regression is captured by an IT and to expand 
PCJDetails to store/retrieve this information.

### Tests
None.

### Links
[Jira: RYA-343](https://issues.apache.org/jira/browse/RYA-343)

### Checklist
- [ ] Code Review
- [ ] Squash Commits

### People To Review
@amihalik 
@meiercaleb 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jdasch/incubator-rya RYA-343

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-rya/pull/205.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #205


commit 81477c5815f3f54a3ccd5d32e87e6692461b32a9
Author: jdasch 
Date:   2017-08-14T19:21:10Z

RYA-343 Fixed rya.api connection issue for PCJ-enabled rya table.






[jira] [Created] (RYA-343) AccumuloLoadStatementsFile fails when loading data to a PCJ-enabled table.

2017-08-14 Thread Jeff Dasch (JIRA)
Jeff Dasch created RYA-343:
--

Summary: AccumuloLoadStatementsFile fails when loading data to a PCJ-enabled table.
Key: RYA-343
URL: https://issues.apache.org/jira/browse/RYA-343
Project: Rya
Issue Type: Sub-task
Components: clients
Reporter: Jeff Dasch
Assignee: Jeff Dasch


Issue occurs when calling {{AccumuloLoadStatementsFile.loadStatements()}} to 
load data to a PCJ-enabled table.  I believe this is a recent regression.

{noformat}
2017-08-14 13:46:51,802 [Spring Shell] WARN  org.apache.rya.api.client.accumulo.AccumuloLoadStatementsFile - Exception while loading:
org.apache.rya.api.persist.RyaDAOException: java.lang.IllegalArgumentException: The 'rya.indexing.pcj.storageType' property must have one of the following values: [ACCUMULO]
    at org.apache.rya.accumulo.AccumuloRyaDAO.init(AccumuloRyaDAO.java:165)
    at org.apache.rya.sail.config.RyaSailFactory.getAccumuloDAO(RyaSailFactory.java:155)
    at org.apache.rya.sail.config.RyaSailFactory.getRyaSail(RyaSailFactory.java:100)
    at org.apache.rya.sail.config.RyaSailFactory.getInstance(RyaSailFactory.java:67)
    at org.apache.rya.api.client.accumulo.AccumuloLoadStatementsFile.loadStatements(AccumuloLoadStatementsFile.java:91)
    at org.apache.rya.shell.RyaCommands.loadData(RyaCommands.java:121)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.springframework.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:210)
    at org.springframework.shell.core.SimpleExecutionStrategy.invoke(SimpleExecutionStrategy.java:64)
    at org.springframework.shell.core.SimpleExecutionStrategy.execute(SimpleExecutionStrategy.java:57)
    at org.springframework.shell.core.AbstractShell.executeCommand(AbstractShell.java:127)
    at org.springframework.shell.core.JLineShell.promptLoop(JLineShell.java:533)
    at org.springframework.shell.core.JLineShell.run(JLineShell.java:179)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: The 'rya.indexing.pcj.storageType' property must have one of the following values: [ACCUMULO]
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:92)
    at org.apache.rya.indexing.external.PrecomputedJoinStorageSupplier.get(PrecomputedJoinStorageSupplier.java:68)
    at org.apache.rya.indexing.external.PrecomputedJoinIndexer.init(PrecomputedJoinIndexer.java:139)
    at org.apache.rya.accumulo.AccumuloRyaDAO.init(AccumuloRyaDAO.java:156)
    ... 16 more
{noformat}
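
The trace bottoms out in a precondition on the 'rya.indexing.pcj.storageType' property. As a rough illustration of that kind of check (not the actual PrecomputedJoinStorageSupplier code), the sketch below validates a config value against an allowed set. The class and method names are hypothetical, a plain Map stands in for the configuration object, and the allowed set is taken from the message in the log.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a storage-type precondition; not the real Rya code. */
final class StorageTypeCheckSketch {

    /** Property name and allowed values as they appear in the logged message. */
    static final String PCJ_STORAGE_TYPE = "rya.indexing.pcj.storageType";
    static final Set<String> ALLOWED = Collections.singleton("ACCUMULO");

    /** Returns the configured storage type, or throws if it is missing or unsupported. */
    static String requireStorageType(Map<String, String> conf) {
        String value = conf.get(PCJ_STORAGE_TYPE);
        if (value == null || !ALLOWED.contains(value)) {
            throw new IllegalArgumentException("The '" + PCJ_STORAGE_TYPE
                    + "' property must have one of the following values: " + ALLOWED);
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        try {
            requireStorageType(conf); // property never set, as in the reported regression
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
        conf.put(PCJ_STORAGE_TYPE, "ACCUMULO");
        System.out.println(requireStorageType(conf));
    }
}
```

An unset property fails the check in the same way as an unsupported value, which is consistent with the report that the failure appeared once the value stopped being written into the configuration.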






[jira] [Created] (RYA-342) Improve documentation for deploying the RYA PCJ Updater

2017-08-14 Thread Jeff Dasch (JIRA)
Jeff Dasch created RYA-342:
--

Summary: Improve documentation for deploying the RYA PCJ Updater
Key: RYA-342
URL: https://issues.apache.org/jira/browse/RYA-342
Project: Rya
Issue Type: Sub-task
Components: docs
Reporter: Jeff Dasch
Priority: Minor








[jira] [Created] (RYA-341) refactor rya.fluo.pcj.app to be a leaf project

2017-08-14 Thread Jeff Dasch (JIRA)
Jeff Dasch created RYA-341:
--

Summary: refactor rya.fluo.pcj.app to be a leaf project
Key: RYA-341
URL: https://issues.apache.org/jira/browse/RYA-341
Project: Rya
Issue Type: Sub-task
Components: build
Reporter: Jeff Dasch
Assignee: Jeff Dasch
Priority: Minor


This will allow us to properly scope the hadoop, accumulo, and fluo 
dependencies as runtime and remove any filtering introduced by RYA-340 that is 
no longer necessary.  Refactor any reusable code into a new project 
(rya.fluo.pcj.(common/runtime/impl)).

Currently rya.fluo.pcj.api and rya.periodic.service.notification are dependent 
on it.





[jira] [Created] (RYA-340) rya.fluo.pcj.app is not currently deployable in fluo-1.0.0

2017-08-14 Thread Jeff Dasch (JIRA)
Jeff Dasch created RYA-340:
--

 Summary: rya.fluo.pcj.app is not currently deployable in fluo-1.0.0
 Key: RYA-340
 URL: https://issues.apache.org/jira/browse/RYA-340
 Project: Rya
  Issue Type: Sub-task
  Components: build
Reporter: Jeff Dasch
Assignee: Jeff Dasch


There is a Guava version incompatibility.  The fastest fix is to improve the 
generated artifact through filtering.


{noformat}
Exception in thread "ServiceDelegate STARTING" java.lang.IncompatibleClassChangeError: class org.apache.twill.internal.utils.Dependencies$DependencyClassVisitor has interface org.objectweb.asm.ClassVisitor as super class
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.twill.internal.utils.Dependencies.findClassDependencies(Dependencies.java:86)
at org.apache.twill.internal.ApplicationBundler.findDependencies(ApplicationBundler.java:198)
at org.apache.twill.internal.ApplicationBundler.createBundle(ApplicationBundler.java:155)
at org.apache.twill.internal.ApplicationBundler.createBundle(ApplicationBundler.java:126)
at org.apache.twill.yarn.YarnTwillPreparer.createAppMasterJar(YarnTwillPreparer.java:402)
at org.apache.twill.yarn.YarnTwillPreparer.access$200(YarnTwillPreparer.java:108)
at org.apache.twill.yarn.YarnTwillPreparer$1.call(YarnTwillPreparer.java:299)
at org.apache.twill.yarn.YarnTwillPreparer$1.call(YarnTwillPreparer.java:289)
at org.apache.twill.yarn.YarnTwillController.doStartUp(YarnTwillController.java:97)
at org.apache.twill.internal.AbstractZKServiceController.startUp(AbstractZKServiceController.java:76)
at org.apache.twill.internal.AbstractExecutionServiceController$ServiceDelegate.startUp(AbstractExecutionServiceController.java:175)
at com.google.common.util.concurrent.AbstractIdleService$1$1.run(AbstractIdleService.java:43)
at java.lang.Thread.run(Thread.java:748)

{noformat}






[jira] [Created] (RYA-339) Fluo PCJ Improvements

2017-08-14 Thread Jeff Dasch (JIRA)
Jeff Dasch created RYA-339:
--

 Summary: Fluo PCJ Improvements
 Key: RYA-339
 URL: https://issues.apache.org/jira/browse/RYA-339
 Project: Rya
  Issue Type: Improvement
  Components: build
Reporter: Jeff Dasch
Assignee: Jeff Dasch
Priority: Minor


This is a holder issue for several subtasks regarding Fluo PCJs.





[jira] [Commented] (RYA-337) Batch Queries to MongoDB

2017-08-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125933#comment-16125933
 ] 

ASF GitHub Bot commented on RYA-337:


Github user meiercaleb commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r132990636
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/MongoDBQueryEngine.java ---
@@ -140,72 +121,35 @@ public MongoDBRdfConfiguration getConf() {
 public CloseableIteration batchQuery(
 final Collection stmts, MongoDBRdfConfiguration 
conf)
 throws RyaDAOException {
-if (conf == null) {
-conf = configuration;
-}
-final Long maxResults = conf.getLimit();
-final Set queries = new HashSet();
-
-try {
-for (final RyaStatement stmt : stmts) {
-queries.add( strategy.getQuery(stmt));
- }
+final Map queries = new HashMap<>();
 
-// TODO not sure what to do about regex ranges?
-final RyaStatementCursorIterator iterator = new 
RyaStatementCursorIterator(getCollection(conf), queries,
-strategy, configuration.getAuthorizations());
-
-if (maxResults != null) {
-iterator.setMaxResults(maxResults);
-}
-return iterator;
-} catch (final Exception e) {
-throw new RyaDAOException(e);
+for (final RyaStatement stmt : stmts) {
+queries.put(stmt, new MapBindingSet());
 }
 
+return new 
RyaStatementCursorIterator(queryWithBindingSet(queries.entrySet(), conf));
 }
+
 @Override
 public CloseableIterable query(final RyaQuery ryaQuery)
 throws RyaDAOException {
-final Set queries = new HashSet();
-
-try {
-queries.add( strategy.getQuery(ryaQuery));
-
-// TODO not sure what to do about regex ranges?
-// TODO this is gross
-final RyaStatementCursorIterable iterator = new 
RyaStatementCursorIterable(
-new NonCloseableRyaStatementCursorIterator(new 
RyaStatementCursorIterator(getCollection(getConf()),
-queries, strategy, 
configuration.getAuthorizations())));
-
-return iterator;
-} catch (final Exception e) {
-throw new RyaDAOException(e);
-}
+return query(new 
BatchRyaQuery(Collections.singleton(ryaQuery.getQuery())));
--- End diff --

Null check


> Batch Queries to MongoDB
> 
>
> Key: RYA-337
> URL: https://issues.apache.org/jira/browse/RYA-337
> Project: Rya
>  Issue Type: Improvement
>  Components: dao
>Reporter: Aaron Mihalik
>Assignee: Aaron Mihalik
>
> Currently the MongoDB DAO sends one query at a time to Mongo.  Instead, the 
> DAO should send a batch of queries and perform a client side hash join (like 
> the Accumulo DAO)
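
The client-side hash join mentioned in the description can be sketched roughly 
as follows. This is a minimal illustration under stated assumptions, not the 
Rya implementation: the class ClientSideHashJoin, the String[] row layout, and 
the sample URIs are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not part of Rya.
class ClientSideHashJoin {

    /**
     * Joins pending bindings with the rows returned by one batched query.
     * Each input is a String[2]: element 0 is the join key (for example a
     * subject URI), element 1 is the payload carried along with it.
     */
    static List<String> join(List<String[]> bindings, List<String[]> results) {
        // Build phase: hash every pending binding by its join key.
        Map<String, List<String>> bindingsByKey = new HashMap<>();
        for (String[] binding : bindings) {
            bindingsByKey.computeIfAbsent(binding[0], k -> new ArrayList<>()).add(binding[1]);
        }

        // Probe phase: look each returned row up by key instead of
        // querying the store once per binding.
        List<String> joined = new ArrayList<>();
        for (String[] row : results) {
            List<String> matches = bindingsByKey.get(row[0]);
            if (matches == null) {
                continue; // this row belongs to no pending binding
            }
            for (String payload : matches) {
                joined.add(payload + " + " + row[0] + " " + row[1]);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        List<String[]> bindings = Arrays.asList(
                new String[] {"urn:alice", "?x=1"},
                new String[] {"urn:bob", "?x=2"});
        List<String[]> results = Arrays.asList(
                new String[] {"urn:bob", "urn:worksAt urn:acme"},
                new String[] {"urn:carol", "urn:worksAt urn:acme"});
        for (String line : join(bindings, results)) {
            System.out.println(line);
        }
    }
}
```

The point of the build/probe split is that the store is asked once per batch, 
and matching results back to their bindings is a map lookup on the client.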





[jira] [Commented] (RYA-337) Batch Queries to MongoDB

2017-08-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125934#comment-16125934
 ] 

ASF GitHub Bot commented on RYA-337:


Github user meiercaleb commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r132989696
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/MongoDBQueryEngine.java ---
@@ -86,22 +84,10 @@ public MongoDBRdfConfiguration getConf() {
 public CloseableIteration query(
 final RyaStatement stmt, MongoDBRdfConfiguration conf)
 throws RyaDAOException {
-if (conf == null) {
-conf = configuration;
-}
-final Long maxResults = conf.getLimit();
-final Set queries = new HashSet();
-final DBObject query = strategy.getQuery(stmt);
-queries.add(query);
-final MongoDatabase db = 
mongoClient.getDatabase(conf.getMongoDBName());
-final MongoCollection collection = 
db.getCollection(conf.getTriplesCollectionName());
-final RyaStatementCursorIterator iterator = new 
RyaStatementCursorIterator(collection, queries, strategy,
-conf.getAuthorizations());
-
-if (maxResults != null) {
-iterator.setMaxResults(maxResults);
-}
-return iterator;
+Entry entry = new 
AbstractMap.SimpleEntry<>(stmt, new MapBindingSet());
--- End diff --

Check that the RyaStatement and Config are not null.
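
A null check of the kind requested here might look like the sketch below. It 
is only an illustration: NullCheckSketch and describeQuery are hypothetical 
names, plain strings stand in for the RyaStatement and the configuration, and 
java.util.Objects is used so the example runs without extra dependencies.

```java
import java.util.Objects;

// Hypothetical illustration; the names below are not Rya's.
class NullCheckSketch {

    /** Builds a description of a query, rejecting null arguments up front. */
    static String describeQuery(String statement, String config) {
        // Fail fast with a message that names the offending argument.
        Objects.requireNonNull(statement, "statement must not be null");
        Objects.requireNonNull(config, "config must not be null");
        return "query " + statement + " using " + config;
    }

    public static void main(String[] args) {
        System.out.println(describeQuery("urn:alice ?p ?o", "default-config"));
        try {
            describeQuery(null, "default-config");
        } catch (NullPointerException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Note that the removed lines in the diff fell back to the engine's own 
configuration when conf was null, so whether to reject a null conf or keep 
that fallback is a choice for the author.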


> Batch Queries to MongoDB
> 
>
> Key: RYA-337
> URL: https://issues.apache.org/jira/browse/RYA-337
> Project: Rya
>  Issue Type: Improvement
>  Components: dao
>Reporter: Aaron Mihalik
>Assignee: Aaron Mihalik
>
> Currently the MongoDB DAO sends one query at a time to Mongo.  Instead, the 
> DAO should send a batch of queries and perform a client side hash join (like 
> the Accumulo DAO)





[jira] [Commented] (RYA-337) Batch Queries to MongoDB

2017-08-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125932#comment-16125932
 ] 

ASF GitHub Bot commented on RYA-337:


Github user meiercaleb commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r132979641
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/iter/RyaStatementBindingSetCursorIterator.java
 ---
@@ -91,47 +96,81 @@ private boolean currentBindingSetIteratorIsValid() {
 }
 
 private void findNextResult() {
-if (!currentResultCursorIsValid()) {
-findNextValidResultCursor();
+if (!currentBatchQueryResultCursorIsValid()) {
+submitBatchQuery();
 }
-if (currentResultCursorIsValid()) {
+
+if (currentBatchQueryResultCursorIsValid()) {
 // convert to Rya Statement
-final Document queryResult = resultsIterator.next();
+final Document queryResult = batchQueryResultsIterator.next();
 final DBObject dbo = (DBObject) 
JSON.parse(queryResult.toJson());
-currentStatement = strategy.deserializeDBObject(dbo);
-currentBindingSetIterator = 
currentBindingSetCollection.iterator();
+currentResultStatement = strategy.deserializeDBObject(dbo);
+
+// Find all of the queries in the executed RangeMap that this 
result matches
+// and collect all of those binding sets
+Set bsList = new HashSet<>();
+for (RyaStatement executedQuery : executedRangeMap.keys()) {
+if (isResultForQuery(executedQuery, 
currentResultStatement)) {
+bsList.addAll(executedRangeMap.get(executedQuery));
+}
+}
+currentBindingSetIterator = bsList.iterator();
+}
+
+// Handle case of invalid currentResultStatement or no binding 
sets returned
+if ((currentBindingSetIterator == null || 
!currentBindingSetIterator.hasNext()) && 
(currentBatchQueryResultCursorIsValid() || queryIterator.hasNext())) {
+findNextResult();
 }
 }
+
+private static boolean isResultForQuery(RyaStatement query, 
RyaStatement result) {
+return isResult(query.getSubject(), result.getSubject()) &&
+isResult(query.getPredicate(), result.getPredicate()) &&
+isResult(query.getObject(), result.getObject()) &&
+isResult(query.getContext(), result.getContext());
+}
+
+private static boolean isResult(RyaType query, RyaType result) {
+return (query == null) || query.equals(result);
+}
 
-private void findNextValidResultCursor() {
-while (queryIterator.hasNext()){
-final DBObject currentQuery = queryIterator.next();
-currentBindingSetCollection = rangeMap.get(currentQuery);
-// Executing redact aggregation to only return documents the 
user
-// has access to.
-final List pipeline = new ArrayList<>();
-pipeline.add(new Document("$match", currentQuery));
-pipeline.addAll(AggregationUtil.createRedactPipeline(auths));
-log.debug(pipeline);
-
-final AggregateIterable aggIter = 
coll.aggregate(pipeline);
-aggIter.batchSize(1000);
-resultsIterator = aggIter.iterator();
-if (resultsIterator.hasNext()) {
-break;
-}
+private void submitBatchQuery() {
+int count = 0;
+executedRangeMap.clear();
+final List pipeline = new ArrayList<>();
+final List match = new ArrayList<>();
+
+while (queryIterator.hasNext() && count < QUERY_BATCH_SIZE){
+count++;
+RyaStatement query = queryIterator.next();
+executedRangeMap.putAll(query, rangeMap.get(query));
+final DBObject currentQuery = strategy.getQuery(query);
+match.add(currentQuery);
 }
-}
 
-private boolean currentResultCursorIsValid() {
-return (resultsIterator != null) && resultsIterator.hasNext();
+if (match.size() > 1) {
+pipeline.add(new Document("$match", new Document("$or", 
match)));
--- End diff --

Did you compare the performance of $or with $in? Seems like $or is used to 
see if a field value satisfies at least one of (possibly) many general logical 
expressions, while $in checks to see if a value is in some sort of indexed 
collection.  My guess is that the added generality of $or would make it less 
performant for batch queries than $in.

[jira] [Commented] (RYA-337) Batch Queries to MongoDB

2017-08-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125935#comment-16125935
 ] 

ASF GitHub Bot commented on RYA-337:


Github user meiercaleb commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r132980017
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/iter/RyaStatementBindingSetCursorIterator.java
 ---
@@ -44,20 +48,21 @@
 
 public class RyaStatementBindingSetCursorIterator implements 
CloseableIteration, RyaDAOException> {
 private static final Logger log = 
Logger.getLogger(RyaStatementBindingSetCursorIterator.class);
+
+private static final int QUERY_BATCH_SIZE = 50;
--- End diff --

Any particular reason you went with 50 here?
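
One way to make this question easier to answer by measurement is to pass the 
batch size in as a parameter instead of hard-coding it. The sketch below is an 
assumption-laden illustration, not code from the pull request: QueryBatcher 
and nextBatch are hypothetical names, and the loop only mirrors the 
"hasNext() && count < QUERY_BATCH_SIZE" condition visible in the diff.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration; not part of Rya.
class QueryBatcher {

    /** Drains up to batchSize elements from the iterator into a new list. */
    static <T> List<T> nextBatch(Iterator<T> source, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        List<T> batch = new ArrayList<>(batchSize);
        while (source.hasNext() && batch.size() < batchSize) {
            batch.add(source.next());
        }
        return batch;
    }

    public static void main(String[] args) {
        Iterator<String> queries = Arrays.asList("q1", "q2", "q3", "q4", "q5").iterator();
        List<String> batch = nextBatch(queries, 2);
        while (!batch.isEmpty()) {
            System.out.println(batch);
            batch = nextBatch(queries, 2);
        }
    }
}
```

With the size as a parameter, different values can be tried against a 
representative workload without recompiling.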


> Batch Queries to MongoDB
> 
>
> Key: RYA-337
> URL: https://issues.apache.org/jira/browse/RYA-337
> Project: Rya
>  Issue Type: Improvement
>  Components: dao
>Reporter: Aaron Mihalik
>Assignee: Aaron Mihalik
>
> Currently the MongoDB DAO sends one query at a time to Mongo.  Instead, the 
> DAO should send a batch of queries and perform a client side hash join (like 
> the Accumulo DAO)





[GitHub] incubator-rya pull request #204: RYA-337 Adding batch queries to MongoDB. Cl...

2017-08-14 Thread meiercaleb
Github user meiercaleb commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r132989696
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/MongoDBQueryEngine.java ---
@@ -86,22 +84,10 @@ public MongoDBRdfConfiguration getConf() {
 public CloseableIteration query(
 final RyaStatement stmt, MongoDBRdfConfiguration conf)
 throws RyaDAOException {
-if (conf == null) {
-conf = configuration;
-}
-final Long maxResults = conf.getLimit();
-final Set queries = new HashSet();
-final DBObject query = strategy.getQuery(stmt);
-queries.add(query);
-final MongoDatabase db = 
mongoClient.getDatabase(conf.getMongoDBName());
-final MongoCollection collection = 
db.getCollection(conf.getTriplesCollectionName());
-final RyaStatementCursorIterator iterator = new 
RyaStatementCursorIterator(collection, queries, strategy,
-conf.getAuthorizations());
-
-if (maxResults != null) {
-iterator.setMaxResults(maxResults);
-}
-return iterator;
+Entry entry = new 
AbstractMap.SimpleEntry<>(stmt, new MapBindingSet());
--- End diff --

Check that the RyaStatement and Config are not null.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-rya pull request #204: RYA-337 Adding batch queries to MongoDB. Cl...

2017-08-14 Thread meiercaleb
Github user meiercaleb commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r132990636
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/MongoDBQueryEngine.java ---
@@ -140,72 +121,35 @@ public MongoDBRdfConfiguration getConf() {
 public CloseableIteration batchQuery(
 final Collection stmts, MongoDBRdfConfiguration 
conf)
 throws RyaDAOException {
-if (conf == null) {
-conf = configuration;
-}
-final Long maxResults = conf.getLimit();
-final Set queries = new HashSet();
-
-try {
-for (final RyaStatement stmt : stmts) {
-queries.add( strategy.getQuery(stmt));
- }
+final Map queries = new HashMap<>();
 
-// TODO not sure what to do about regex ranges?
-final RyaStatementCursorIterator iterator = new 
RyaStatementCursorIterator(getCollection(conf), queries,
-strategy, configuration.getAuthorizations());
-
-if (maxResults != null) {
-iterator.setMaxResults(maxResults);
-}
-return iterator;
-} catch (final Exception e) {
-throw new RyaDAOException(e);
+for (final RyaStatement stmt : stmts) {
+queries.put(stmt, new MapBindingSet());
 }
 
+return new 
RyaStatementCursorIterator(queryWithBindingSet(queries.entrySet(), conf));
 }
+
 @Override
 public CloseableIterable query(final RyaQuery ryaQuery)
 throws RyaDAOException {
-final Set queries = new HashSet();
-
-try {
-queries.add( strategy.getQuery(ryaQuery));
-
-// TODO not sure what to do about regex ranges?
-// TODO this is gross
-final RyaStatementCursorIterable iterator = new 
RyaStatementCursorIterable(
-new NonCloseableRyaStatementCursorIterator(new 
RyaStatementCursorIterator(getCollection(getConf()),
-queries, strategy, 
configuration.getAuthorizations())));
-
-return iterator;
-} catch (final Exception e) {
-throw new RyaDAOException(e);
-}
+return query(new 
BatchRyaQuery(Collections.singleton(ryaQuery.getQuery())));
--- End diff --

Null check




[GitHub] incubator-rya pull request #204: RYA-337 Adding batch queries to MongoDB. Cl...

2017-08-14 Thread meiercaleb
Github user meiercaleb commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r132979641
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/iter/RyaStatementBindingSetCursorIterator.java
 ---
@@ -91,47 +96,81 @@ private boolean currentBindingSetIteratorIsValid() {
 }
 
 private void findNextResult() {
-if (!currentResultCursorIsValid()) {
-findNextValidResultCursor();
+if (!currentBatchQueryResultCursorIsValid()) {
+submitBatchQuery();
 }
-if (currentResultCursorIsValid()) {
+
+if (currentBatchQueryResultCursorIsValid()) {
 // convert to Rya Statement
-final Document queryResult = resultsIterator.next();
+final Document queryResult = batchQueryResultsIterator.next();
 final DBObject dbo = (DBObject) 
JSON.parse(queryResult.toJson());
-currentStatement = strategy.deserializeDBObject(dbo);
-currentBindingSetIterator = 
currentBindingSetCollection.iterator();
+currentResultStatement = strategy.deserializeDBObject(dbo);
+
+// Find all of the queries in the executed RangeMap that this 
result matches
+// and collect all of those binding sets
+Set bsList = new HashSet<>();
+for (RyaStatement executedQuery : executedRangeMap.keys()) {
+if (isResultForQuery(executedQuery, 
currentResultStatement)) {
+bsList.addAll(executedRangeMap.get(executedQuery));
+}
+}
+currentBindingSetIterator = bsList.iterator();
+}
+
+// Handle case of invalid currentResultStatement or no binding 
sets returned
+if ((currentBindingSetIterator == null || 
!currentBindingSetIterator.hasNext()) && 
(currentBatchQueryResultCursorIsValid() || queryIterator.hasNext())) {
+findNextResult();
 }
 }
+
+private static boolean isResultForQuery(RyaStatement query, 
RyaStatement result) {
+return isResult(query.getSubject(), result.getSubject()) &&
+isResult(query.getPredicate(), result.getPredicate()) &&
+isResult(query.getObject(), result.getObject()) &&
+isResult(query.getContext(), result.getContext());
+}
+
+private static boolean isResult(RyaType query, RyaType result) {
+return (query == null) || query.equals(result);
+}
 
-private void findNextValidResultCursor() {
-while (queryIterator.hasNext()){
-final DBObject currentQuery = queryIterator.next();
-currentBindingSetCollection = rangeMap.get(currentQuery);
-// Executing redact aggregation to only return documents the 
user
-// has access to.
-final List pipeline = new ArrayList<>();
-pipeline.add(new Document("$match", currentQuery));
-pipeline.addAll(AggregationUtil.createRedactPipeline(auths));
-log.debug(pipeline);
-
-final AggregateIterable aggIter = 
coll.aggregate(pipeline);
-aggIter.batchSize(1000);
-resultsIterator = aggIter.iterator();
-if (resultsIterator.hasNext()) {
-break;
-}
+private void submitBatchQuery() {
+int count = 0;
+executedRangeMap.clear();
+final List pipeline = new ArrayList<>();
+final List match = new ArrayList<>();
+
+while (queryIterator.hasNext() && count < QUERY_BATCH_SIZE){
+count++;
+RyaStatement query = queryIterator.next();
+executedRangeMap.putAll(query, rangeMap.get(query));
+final DBObject currentQuery = strategy.getQuery(query);
+match.add(currentQuery);
 }
-}
 
-private boolean currentResultCursorIsValid() {
-return (resultsIterator != null) && resultsIterator.hasNext();
+if (match.size() > 1) {
+pipeline.add(new Document("$match", new Document("$or", 
match)));
--- End diff --

Did you compare the performance of $or with $in? Seems like $or is used to 
see if a field value satisfies at least one of (possibly) many general logical 
expressions, while $in checks to see if a value is in some sort of indexed 
collection.  My guess is that the added generality of $or would make it less 
performant for batch queries than $in.
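
To make the comparison concrete, the two match shapes can be sketched as 
below. This is an illustration only: plain Java maps stand in for BSON 
documents, BatchMatchShapes is a hypothetical class, and the field names 
"subject", "predicate", and "object" are assumptions rather than Rya's actual 
document schema. In MongoDB, $in tests a single field against a list of 
values, while $or accepts a list of complete sub-queries, so $in is only a 
drop-in replacement when every pattern in the batch constrains the same one 
field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; plain maps stand in for BSON documents.
class BatchMatchShapes {

    /** { "$or": [ pattern, ... ] }: each pattern may constrain different fields. */
    static Map<String, Object> orMatch(List<Map<String, Object>> patterns) {
        Object copy = new ArrayList<Map<String, Object>>(patterns);
        return Collections.singletonMap("$or", copy);
    }

    /** { field: { "$in": [ value, ... ] } }: one field tested against a list of values. */
    static Map<String, Object> inMatch(String field, List<String> values) {
        Object in = Collections.singletonMap("$in", new ArrayList<String>(values));
        return Collections.singletonMap(field, in);
    }

    public static void main(String[] args) {
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("subject", "urn:alice");
        first.put("predicate", "urn:knows");
        Map<String, Object> second = new LinkedHashMap<>();
        second.put("object", "urn:acme");

        // Two patterns over different fields: expressible with $or.
        System.out.println(orMatch(Arrays.asList(first, second)));
        // Many values for one field: expressible with $in.
        System.out.println(inMatch("subject", Arrays.asList("urn:alice", "urn:bob")));
    }
}
```

Which form is faster for a given batch would still need to be measured 
against the collection's indexes.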



[GitHub] incubator-rya pull request #204: RYA-337 Adding batch queries to MongoDB. Cl...

2017-08-14 Thread meiercaleb
Github user meiercaleb commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r132990694
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/MongoDBQueryEngine.java ---
@@ -140,72 +121,35 @@ public MongoDBRdfConfiguration getConf() {
 public CloseableIteration batchQuery(
 final Collection stmts, MongoDBRdfConfiguration 
conf)
 throws RyaDAOException {
-if (conf == null) {
-conf = configuration;
-}
-final Long maxResults = conf.getLimit();
-final Set queries = new HashSet();
-
-try {
-for (final RyaStatement stmt : stmts) {
-queries.add( strategy.getQuery(stmt));
- }
+final Map queries = new HashMap<>();
 
-// TODO not sure what to do about regex ranges?
-final RyaStatementCursorIterator iterator = new 
RyaStatementCursorIterator(getCollection(conf), queries,
-strategy, configuration.getAuthorizations());
-
-if (maxResults != null) {
-iterator.setMaxResults(maxResults);
-}
-return iterator;
-} catch (final Exception e) {
-throw new RyaDAOException(e);
+for (final RyaStatement stmt : stmts) {
+queries.put(stmt, new MapBindingSet());
 }
 
+return new 
RyaStatementCursorIterator(queryWithBindingSet(queries.entrySet(), conf));
 }
+
 @Override
 public CloseableIterable query(final RyaQuery ryaQuery)
 throws RyaDAOException {
-final Set queries = new HashSet();
-
-try {
-queries.add( strategy.getQuery(ryaQuery));
-
-// TODO not sure what to do about regex ranges?
-// TODO this is gross
-final RyaStatementCursorIterable iterator = new 
RyaStatementCursorIterable(
-new NonCloseableRyaStatementCursorIterator(new 
RyaStatementCursorIterator(getCollection(getConf()),
-queries, strategy, 
configuration.getAuthorizations())));
-
-return iterator;
-} catch (final Exception e) {
-throw new RyaDAOException(e);
-}
+return query(new 
BatchRyaQuery(Collections.singleton(ryaQuery.getQuery())));
 }
+
 @Override
 public CloseableIterable query(final BatchRyaQuery 
batchRyaQuery)
 throws RyaDAOException {
- try {
- final Set queries = new HashSet();
-for (final RyaStatement statement : 
batchRyaQuery.getQueries()){
-queries.add( strategy.getQuery(statement));
+final Map queries = new HashMap<>();
--- End diff --

Null check.




[GitHub] incubator-rya pull request #204: RYA-337 Adding batch queries to MongoDB. Cl...

2017-08-14 Thread meiercaleb
Github user meiercaleb commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r132990271
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/MongoDBQueryEngine.java ---
@@ -111,25 +97,20 @@ public MongoDBRdfConfiguration getConf() {
 if (conf == null) {
--- End diff --

Check that stmts is not null




[GitHub] incubator-rya pull request #204: RYA-337 Adding batch queries to MongoDB. Cl...

2017-08-14 Thread meiercaleb
Github user meiercaleb commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r132980017
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/iter/RyaStatementBindingSetCursorIterator.java
 ---
@@ -44,20 +48,21 @@
 
 public class RyaStatementBindingSetCursorIterator implements 
CloseableIteration, RyaDAOException> {
 private static final Logger log = 
Logger.getLogger(RyaStatementBindingSetCursorIterator.class);
+
+private static final int QUERY_BATCH_SIZE = 50;
--- End diff --

Any particular reason you went with 50 here?




[jira] [Commented] (RYA-337) Batch Queries to MongoDB

2017-08-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125772#comment-16125772
 ] 

ASF GitHub Bot commented on RYA-337:


Github user asfgit commented on the issue:

https://github.com/apache/incubator-rya/pull/204
  

Refer to this link for build results (access rights to CI server needed): 

https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/387/



> Batch Queries to MongoDB
> 
>
> Key: RYA-337
> URL: https://issues.apache.org/jira/browse/RYA-337
> Project: Rya
>  Issue Type: Improvement
>  Components: dao
>Reporter: Aaron Mihalik
>Assignee: Aaron Mihalik
>
> Currently the MongoDB DAO sends one query at a time to Mongo.  Instead, the 
> DAO should send a batch of queries and perform a client side hash join (like 
> the Accumulo DAO)





[GitHub] incubator-rya issue #204: RYA-337 Adding batch queries to MongoDB. Closes #2...

2017-08-14 Thread asfgit
Github user asfgit commented on the issue:

https://github.com/apache/incubator-rya/pull/204
  

Refer to this link for build results (access rights to CI server needed): 

https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/386/

Failed Tests: 1
incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.indexing.example: 1
ExamplesTest.MongoRyaDirectExampleTest





[GitHub] incubator-rya pull request #204: RYA-337 Adding batch queries to MongoDB. Cl...

2017-08-14 Thread amihalik
GitHub user amihalik opened a pull request:

https://github.com/apache/incubator-rya/pull/204

RYA-337 Adding batch queries to MongoDB. Closes #204

## Description

Added a batch query mechanism to MongoDB DAO and simplified 
MongoDBQueryEngine

### Tests

Ran unit tests 

### Links
[Jira](https://issues.apache.org/jira/browse/RYA-337)

### Checklist
- [ ] Code Review
- [ ] Squash Commits

#### People To Review
@isper3at @pujav65 @meiercaleb 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/amihalik/incubator-rya RYA-337

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-rya/pull/204.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #204


commit d19ed3c7fb4ca71bc3c75e1aafc77c4b89c76270
Author: Aaron Mihalik 
Date:   2017-08-08T15:17:37Z

RYA-337 Adding batch queries to MongoDB. Closes #204

Additionally, simplifying MongoDBQueryEngine



