[jira] [Commented] (RYA-337) Batch Queries to MongoDB

2017-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16133863#comment-16133863
 ] 

ASF GitHub Bot commented on RYA-337:


Github user asfgit closed the pull request at:

https://github.com/apache/incubator-rya/pull/204


> Batch Queries to MongoDB
> 
>
> Key: RYA-337
> URL: https://issues.apache.org/jira/browse/RYA-337
> Project: Rya
>  Issue Type: Improvement
>  Components: dao
>Reporter: Aaron Mihalik
>Assignee: Aaron Mihalik
>
> Currently the MongoDB DAO sends one query at a time to Mongo.  Instead, the 
> DAO should send a batch of queries and perform a client side hash join (like 
> the Accumulo DAO)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (RYA-337) Batch Queries to MongoDB

2017-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16133163#comment-16133163
 ] 

ASF GitHub Bot commented on RYA-337:


Github user asfgit commented on the issue:

https://github.com/apache/incubator-rya/pull/204
  

Refer to this link for build results (access rights to CI server needed): 

https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/410/



> Batch Queries to MongoDB
> 
>
> Key: RYA-337
> URL: https://issues.apache.org/jira/browse/RYA-337
> Project: Rya
>  Issue Type: Improvement
>  Components: dao
>Reporter: Aaron Mihalik
>Assignee: Aaron Mihalik
>
> Currently the MongoDB DAO sends one query at a time to Mongo.  Instead, the 
> DAO should send a batch of queries and perform a client side hash join (like 
> the Accumulo DAO)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (RYA-337) Batch Queries to MongoDB

2017-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16133112#comment-16133112
 ] 

ASF GitHub Bot commented on RYA-337:


Github user amihalik commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r133984224
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/iter/RyaStatementBindingSetCursorIterator.java
 ---
@@ -44,20 +48,21 @@
 
 public class RyaStatementBindingSetCursorIterator implements 
CloseableIteration, RyaDAOException> {
 private static final Logger log = 
Logger.getLogger(RyaStatementBindingSetCursorIterator.class);
+
+private static final int QUERY_BATCH_SIZE = 50;
--- End diff --

Empirical testing... To support the "show me the first 10" or "first 100" 
type queries.  I noticed that if we left this at 1000, then the "show me the 
first 10" or "first 100" type queries were very slow.  However, the "show me 
all" (where all = 30k), we equally as quick using 50 or 1000.


> Batch Queries to MongoDB
> 
>
> Key: RYA-337
> URL: https://issues.apache.org/jira/browse/RYA-337
> Project: Rya
>  Issue Type: Improvement
>  Components: dao
>Reporter: Aaron Mihalik
>Assignee: Aaron Mihalik
>
> Currently the MongoDB DAO sends one query at a time to Mongo.  Instead, the 
> DAO should send a batch of queries and perform a client side hash join (like 
> the Accumulo DAO)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (RYA-337) Batch Queries to MongoDB

2017-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16133105#comment-16133105
 ] 

ASF GitHub Bot commented on RYA-337:


Github user amihalik commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r133983577
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/MongoDBQueryEngine.java ---
@@ -140,72 +121,35 @@ public MongoDBRdfConfiguration getConf() {
 public CloseableIteration batchQuery(
 final Collection stmts, MongoDBRdfConfiguration 
conf)
 throws RyaDAOException {
-if (conf == null) {
-conf = configuration;
-}
-final Long maxResults = conf.getLimit();
-final Set queries = new HashSet();
-
-try {
-for (final RyaStatement stmt : stmts) {
-queries.add( strategy.getQuery(stmt));
- }
+final Map queries = new HashMap<>();
 
-// TODO not sure what to do about regex ranges?
-final RyaStatementCursorIterator iterator = new 
RyaStatementCursorIterator(getCollection(conf), queries,
-strategy, configuration.getAuthorizations());
-
-if (maxResults != null) {
-iterator.setMaxResults(maxResults);
-}
-return iterator;
-} catch (final Exception e) {
-throw new RyaDAOException(e);
+for (final RyaStatement stmt : stmts) {
+queries.put(stmt, new MapBindingSet());
 }
 
+return new 
RyaStatementCursorIterator(queryWithBindingSet(queries.entrySet(), conf));
 }
+
 @Override
 public CloseableIterable query(final RyaQuery ryaQuery)
 throws RyaDAOException {
-final Set queries = new HashSet();
-
-try {
-queries.add( strategy.getQuery(ryaQuery));
-
-// TODO not sure what to do about regex ranges?
-// TODO this is gross
-final RyaStatementCursorIterable iterator = new 
RyaStatementCursorIterable(
-new NonCloseableRyaStatementCursorIterator(new 
RyaStatementCursorIterator(getCollection(getConf()),
-queries, strategy, 
configuration.getAuthorizations(;
-
-return iterator;
-} catch (final Exception e) {
-throw new RyaDAOException(e);
-}
+return query(new 
BatchRyaQuery(Collections.singleton(ryaQuery.getQuery(;
 }
+
 @Override
 public CloseableIterable query(final BatchRyaQuery 
batchRyaQuery)
 throws RyaDAOException {
- try {
- final Set queries = new HashSet();
-for (final RyaStatement statement : 
batchRyaQuery.getQueries()){
-queries.add( strategy.getQuery(statement));
+final Map queries = new HashMap<>();
--- End diff --

done


> Batch Queries to MongoDB
> 
>
> Key: RYA-337
> URL: https://issues.apache.org/jira/browse/RYA-337
> Project: Rya
>  Issue Type: Improvement
>  Components: dao
>Reporter: Aaron Mihalik
>Assignee: Aaron Mihalik
>
> Currently the MongoDB DAO sends one query at a time to Mongo.  Instead, the 
> DAO should send a batch of queries and perform a client side hash join (like 
> the Accumulo DAO)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (RYA-337) Batch Queries to MongoDB

2017-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16133103#comment-16133103
 ] 

ASF GitHub Bot commented on RYA-337:


Github user amihalik commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r133983396
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/MongoDBQueryEngine.java ---
@@ -140,72 +121,35 @@ public MongoDBRdfConfiguration getConf() {
 public CloseableIteration batchQuery(
 final Collection stmts, MongoDBRdfConfiguration 
conf)
 throws RyaDAOException {
-if (conf == null) {
-conf = configuration;
-}
-final Long maxResults = conf.getLimit();
-final Set queries = new HashSet();
-
-try {
-for (final RyaStatement stmt : stmts) {
-queries.add( strategy.getQuery(stmt));
- }
+final Map queries = new HashMap<>();
 
-// TODO not sure what to do about regex ranges?
-final RyaStatementCursorIterator iterator = new 
RyaStatementCursorIterator(getCollection(conf), queries,
-strategy, configuration.getAuthorizations());
-
-if (maxResults != null) {
-iterator.setMaxResults(maxResults);
-}
-return iterator;
-} catch (final Exception e) {
-throw new RyaDAOException(e);
+for (final RyaStatement stmt : stmts) {
+queries.put(stmt, new MapBindingSet());
 }
 
+return new 
RyaStatementCursorIterator(queryWithBindingSet(queries.entrySet(), conf));
 }
+
 @Override
 public CloseableIterable query(final RyaQuery ryaQuery)
 throws RyaDAOException {
-final Set queries = new HashSet();
-
-try {
-queries.add( strategy.getQuery(ryaQuery));
-
-// TODO not sure what to do about regex ranges?
-// TODO this is gross
-final RyaStatementCursorIterable iterator = new 
RyaStatementCursorIterable(
-new NonCloseableRyaStatementCursorIterator(new 
RyaStatementCursorIterator(getCollection(getConf()),
-queries, strategy, 
configuration.getAuthorizations(;
-
-return iterator;
-} catch (final Exception e) {
-throw new RyaDAOException(e);
-}
+return query(new 
BatchRyaQuery(Collections.singleton(ryaQuery.getQuery(;
--- End diff --

done


> Batch Queries to MongoDB
> 
>
> Key: RYA-337
> URL: https://issues.apache.org/jira/browse/RYA-337
> Project: Rya
>  Issue Type: Improvement
>  Components: dao
>Reporter: Aaron Mihalik
>Assignee: Aaron Mihalik
>
> Currently the MongoDB DAO sends one query at a time to Mongo.  Instead, the 
> DAO should send a batch of queries and perform a client side hash join (like 
> the Accumulo DAO)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (RYA-337) Batch Queries to MongoDB

2017-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16133101#comment-16133101
 ] 

ASF GitHub Bot commented on RYA-337:


Github user amihalik commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r133983122
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/MongoDBQueryEngine.java ---
@@ -86,22 +84,10 @@ public MongoDBRdfConfiguration getConf() {
 public CloseableIteration query(
 final RyaStatement stmt, MongoDBRdfConfiguration conf)
 throws RyaDAOException {
-if (conf == null) {
-conf = configuration;
-}
-final Long maxResults = conf.getLimit();
-final Set queries = new HashSet();
-final DBObject query = strategy.getQuery(stmt);
-queries.add(query);
-final MongoDatabase db = 
mongoClient.getDatabase(conf.getMongoDBName());
-final MongoCollection collection = 
db.getCollection(conf.getTriplesCollectionName());
-final RyaStatementCursorIterator iterator = new 
RyaStatementCursorIterator(collection, queries, strategy,
-conf.getAuthorizations());
-
-if (maxResults != null) {
-iterator.setMaxResults(maxResults);
-}
-return iterator;
+Entry entry = new 
AbstractMap.SimpleEntry<>(stmt, new MapBindingSet());
--- End diff --

done


> Batch Queries to MongoDB
> 
>
> Key: RYA-337
> URL: https://issues.apache.org/jira/browse/RYA-337
> Project: Rya
>  Issue Type: Improvement
>  Components: dao
>Reporter: Aaron Mihalik
>Assignee: Aaron Mihalik
>
> Currently the MongoDB DAO sends one query at a time to Mongo.  Instead, the 
> DAO should send a batch of queries and perform a client side hash join (like 
> the Accumulo DAO)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (RYA-337) Batch Queries to MongoDB

2017-08-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16133102#comment-16133102
 ] 

ASF GitHub Bot commented on RYA-337:


Github user amihalik commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r133983248
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/MongoDBQueryEngine.java ---
@@ -111,25 +97,20 @@ public MongoDBRdfConfiguration getConf() {
 if (conf == null) {
--- End diff --

done


> Batch Queries to MongoDB
> 
>
> Key: RYA-337
> URL: https://issues.apache.org/jira/browse/RYA-337
> Project: Rya
>  Issue Type: Improvement
>  Components: dao
>Reporter: Aaron Mihalik
>Assignee: Aaron Mihalik
>
> Currently the MongoDB DAO sends one query at a time to Mongo.  Instead, the 
> DAO should send a batch of queries and perform a client side hash join (like 
> the Accumulo DAO)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (RYA-337) Batch Queries to MongoDB

2017-08-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125931#comment-16125931
 ] 

ASF GitHub Bot commented on RYA-337:


Github user meiercaleb commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r132990271
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/MongoDBQueryEngine.java ---
@@ -111,25 +97,20 @@ public MongoDBRdfConfiguration getConf() {
 if (conf == null) {
--- End diff --

Check that stmts is not null


> Batch Queries to MongoDB
> 
>
> Key: RYA-337
> URL: https://issues.apache.org/jira/browse/RYA-337
> Project: Rya
>  Issue Type: Improvement
>  Components: dao
>Reporter: Aaron Mihalik
>Assignee: Aaron Mihalik
>
> Currently the MongoDB DAO sends one query at a time to Mongo.  Instead, the 
> DAO should send a batch of queries and perform a client side hash join (like 
> the Accumulo DAO)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (RYA-337) Batch Queries to MongoDB

2017-08-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125933#comment-16125933
 ] 

ASF GitHub Bot commented on RYA-337:


Github user meiercaleb commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r132990636
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/MongoDBQueryEngine.java ---
@@ -140,72 +121,35 @@ public MongoDBRdfConfiguration getConf() {
 public CloseableIteration batchQuery(
 final Collection stmts, MongoDBRdfConfiguration 
conf)
 throws RyaDAOException {
-if (conf == null) {
-conf = configuration;
-}
-final Long maxResults = conf.getLimit();
-final Set queries = new HashSet();
-
-try {
-for (final RyaStatement stmt : stmts) {
-queries.add( strategy.getQuery(stmt));
- }
+final Map queries = new HashMap<>();
 
-// TODO not sure what to do about regex ranges?
-final RyaStatementCursorIterator iterator = new 
RyaStatementCursorIterator(getCollection(conf), queries,
-strategy, configuration.getAuthorizations());
-
-if (maxResults != null) {
-iterator.setMaxResults(maxResults);
-}
-return iterator;
-} catch (final Exception e) {
-throw new RyaDAOException(e);
+for (final RyaStatement stmt : stmts) {
+queries.put(stmt, new MapBindingSet());
 }
 
+return new 
RyaStatementCursorIterator(queryWithBindingSet(queries.entrySet(), conf));
 }
+
 @Override
 public CloseableIterable query(final RyaQuery ryaQuery)
 throws RyaDAOException {
-final Set queries = new HashSet();
-
-try {
-queries.add( strategy.getQuery(ryaQuery));
-
-// TODO not sure what to do about regex ranges?
-// TODO this is gross
-final RyaStatementCursorIterable iterator = new 
RyaStatementCursorIterable(
-new NonCloseableRyaStatementCursorIterator(new 
RyaStatementCursorIterator(getCollection(getConf()),
-queries, strategy, 
configuration.getAuthorizations(;
-
-return iterator;
-} catch (final Exception e) {
-throw new RyaDAOException(e);
-}
+return query(new 
BatchRyaQuery(Collections.singleton(ryaQuery.getQuery(;
--- End diff --

Null check


> Batch Queries to MongoDB
> 
>
> Key: RYA-337
> URL: https://issues.apache.org/jira/browse/RYA-337
> Project: Rya
>  Issue Type: Improvement
>  Components: dao
>Reporter: Aaron Mihalik
>Assignee: Aaron Mihalik
>
> Currently the MongoDB DAO sends one query at a time to Mongo.  Instead, the 
> DAO should send a batch of queries and perform a client side hash join (like 
> the Accumulo DAO)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (RYA-337) Batch Queries to MongoDB

2017-08-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125934#comment-16125934
 ] 

ASF GitHub Bot commented on RYA-337:


Github user meiercaleb commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r132989696
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/MongoDBQueryEngine.java ---
@@ -86,22 +84,10 @@ public MongoDBRdfConfiguration getConf() {
 public CloseableIteration query(
 final RyaStatement stmt, MongoDBRdfConfiguration conf)
 throws RyaDAOException {
-if (conf == null) {
-conf = configuration;
-}
-final Long maxResults = conf.getLimit();
-final Set queries = new HashSet();
-final DBObject query = strategy.getQuery(stmt);
-queries.add(query);
-final MongoDatabase db = 
mongoClient.getDatabase(conf.getMongoDBName());
-final MongoCollection collection = 
db.getCollection(conf.getTriplesCollectionName());
-final RyaStatementCursorIterator iterator = new 
RyaStatementCursorIterator(collection, queries, strategy,
-conf.getAuthorizations());
-
-if (maxResults != null) {
-iterator.setMaxResults(maxResults);
-}
-return iterator;
+Entry entry = new 
AbstractMap.SimpleEntry<>(stmt, new MapBindingSet());
--- End diff --

Check that the RyaStatement and Config are not null.


> Batch Queries to MongoDB
> 
>
> Key: RYA-337
> URL: https://issues.apache.org/jira/browse/RYA-337
> Project: Rya
>  Issue Type: Improvement
>  Components: dao
>Reporter: Aaron Mihalik
>Assignee: Aaron Mihalik
>
> Currently the MongoDB DAO sends one query at a time to Mongo.  Instead, the 
> DAO should send a batch of queries and perform a client side hash join (like 
> the Accumulo DAO)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (RYA-337) Batch Queries to MongoDB

2017-08-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125932#comment-16125932
 ] 

ASF GitHub Bot commented on RYA-337:


Github user meiercaleb commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r132979641
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/iter/RyaStatementBindingSetCursorIterator.java
 ---
@@ -91,47 +96,81 @@ private boolean currentBindingSetIteratorIsValid() {
 }
 
 private void findNextResult() {
-if (!currentResultCursorIsValid()) {
-findNextValidResultCursor();
+if (!currentBatchQueryResultCursorIsValid()) {
+submitBatchQuery();
 }
-if (currentResultCursorIsValid()) {
+
+if (currentBatchQueryResultCursorIsValid()) {
 // convert to Rya Statement
-final Document queryResult = resultsIterator.next();
+final Document queryResult = batchQueryResultsIterator.next();
 final DBObject dbo = (DBObject) 
JSON.parse(queryResult.toJson());
-currentStatement = strategy.deserializeDBObject(dbo);
-currentBindingSetIterator = 
currentBindingSetCollection.iterator();
+currentResultStatement = strategy.deserializeDBObject(dbo);
+
+// Find all of the queries in the executed RangeMap that this 
result matches
+// and collect all of those binding sets
+Set bsList = new HashSet<>();
+for (RyaStatement executedQuery : executedRangeMap.keys()) {
+if (isResultForQuery(executedQuery, 
currentResultStatement)) {
+bsList.addAll(executedRangeMap.get(executedQuery));
+}
+}
+currentBindingSetIterator = bsList.iterator();
+}
+
+// Handle case of invalid currentResultStatement or no binding 
sets returned
+if ((currentBindingSetIterator == null || 
!currentBindingSetIterator.hasNext()) && 
(currentBatchQueryResultCursorIsValid() || queryIterator.hasNext())) {
+findNextResult();
 }
 }
+
+private static boolean isResultForQuery(RyaStatement query, 
RyaStatement result) {
+return isResult(query.getSubject(), result.getSubject()) &&
+isResult(query.getPredicate(), result.getPredicate()) &&
+isResult(query.getObject(), result.getObject()) &&
+isResult(query.getContext(), result.getContext());
+}
+
+private static boolean isResult(RyaType query, RyaType result) {
+return (query == null) || query.equals(result);
+}
 
-private void findNextValidResultCursor() {
-while (queryIterator.hasNext()){
-final DBObject currentQuery = queryIterator.next();
-currentBindingSetCollection = rangeMap.get(currentQuery);
-// Executing redact aggregation to only return documents the 
user
-// has access to.
-final List pipeline = new ArrayList<>();
-pipeline.add(new Document("$match", currentQuery));
-pipeline.addAll(AggregationUtil.createRedactPipeline(auths));
-log.debug(pipeline);
-
-final AggregateIterable aggIter = 
coll.aggregate(pipeline);
-aggIter.batchSize(1000);
-resultsIterator = aggIter.iterator();
-if (resultsIterator.hasNext()) {
-break;
-}
+private void submitBatchQuery() {
+int count = 0;
+executedRangeMap.clear();
+final List pipeline = new ArrayList<>();
+final List match = new ArrayList<>();
+
+while (queryIterator.hasNext() && count < QUERY_BATCH_SIZE){
+count++;
+RyaStatement query = queryIterator.next();
+executedRangeMap.putAll(query, rangeMap.get(query));
+final DBObject currentQuery = strategy.getQuery(query);
+match.add(currentQuery);
 }
-}
 
-private boolean currentResultCursorIsValid() {
-return (resultsIterator != null) && resultsIterator.hasNext();
+if (match.size() > 1) {
+pipeline.add(new Document("$match", new Document("$or", 
match)));
--- End diff --

Did you compare the performance of $or with $in? Seems like $or is used to 
see if a field value satisfies at least one of (possibly) many general logical 
expressions, while $in checks to see if a value is in some sort of indexed 
collection.  My guess is that t

[jira] [Commented] (RYA-337) Batch Queries to MongoDB

2017-08-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125930#comment-16125930
 ] 

ASF GitHub Bot commented on RYA-337:


Github user meiercaleb commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r132990694
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/MongoDBQueryEngine.java ---
@@ -140,72 +121,35 @@ public MongoDBRdfConfiguration getConf() {
 public CloseableIteration batchQuery(
 final Collection stmts, MongoDBRdfConfiguration 
conf)
 throws RyaDAOException {
-if (conf == null) {
-conf = configuration;
-}
-final Long maxResults = conf.getLimit();
-final Set queries = new HashSet();
-
-try {
-for (final RyaStatement stmt : stmts) {
-queries.add( strategy.getQuery(stmt));
- }
+final Map queries = new HashMap<>();
 
-// TODO not sure what to do about regex ranges?
-final RyaStatementCursorIterator iterator = new 
RyaStatementCursorIterator(getCollection(conf), queries,
-strategy, configuration.getAuthorizations());
-
-if (maxResults != null) {
-iterator.setMaxResults(maxResults);
-}
-return iterator;
-} catch (final Exception e) {
-throw new RyaDAOException(e);
+for (final RyaStatement stmt : stmts) {
+queries.put(stmt, new MapBindingSet());
 }
 
+return new 
RyaStatementCursorIterator(queryWithBindingSet(queries.entrySet(), conf));
 }
+
 @Override
 public CloseableIterable query(final RyaQuery ryaQuery)
 throws RyaDAOException {
-final Set queries = new HashSet();
-
-try {
-queries.add( strategy.getQuery(ryaQuery));
-
-// TODO not sure what to do about regex ranges?
-// TODO this is gross
-final RyaStatementCursorIterable iterator = new 
RyaStatementCursorIterable(
-new NonCloseableRyaStatementCursorIterator(new 
RyaStatementCursorIterator(getCollection(getConf()),
-queries, strategy, 
configuration.getAuthorizations(;
-
-return iterator;
-} catch (final Exception e) {
-throw new RyaDAOException(e);
-}
+return query(new 
BatchRyaQuery(Collections.singleton(ryaQuery.getQuery(;
 }
+
 @Override
 public CloseableIterable query(final BatchRyaQuery 
batchRyaQuery)
 throws RyaDAOException {
- try {
- final Set queries = new HashSet();
-for (final RyaStatement statement : 
batchRyaQuery.getQueries()){
-queries.add( strategy.getQuery(statement));
+final Map queries = new HashMap<>();
--- End diff --

Null check.


> Batch Queries to MongoDB
> 
>
> Key: RYA-337
> URL: https://issues.apache.org/jira/browse/RYA-337
> Project: Rya
>  Issue Type: Improvement
>  Components: dao
>Reporter: Aaron Mihalik
>Assignee: Aaron Mihalik
>
> Currently the MongoDB DAO sends one query at a time to Mongo.  Instead, the 
> DAO should send a batch of queries and perform a client side hash join (like 
> the Accumulo DAO)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (RYA-337) Batch Queries to MongoDB

2017-08-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125935#comment-16125935
 ] 

ASF GitHub Bot commented on RYA-337:


Github user meiercaleb commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/204#discussion_r132980017
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/iter/RyaStatementBindingSetCursorIterator.java
 ---
@@ -44,20 +48,21 @@
 
 public class RyaStatementBindingSetCursorIterator implements 
CloseableIteration, RyaDAOException> {
 private static final Logger log = 
Logger.getLogger(RyaStatementBindingSetCursorIterator.class);
+
+private static final int QUERY_BATCH_SIZE = 50;
--- End diff --

Any particular reason you went with 50 here?


> Batch Queries to MongoDB
> 
>
> Key: RYA-337
> URL: https://issues.apache.org/jira/browse/RYA-337
> Project: Rya
>  Issue Type: Improvement
>  Components: dao
>Reporter: Aaron Mihalik
>Assignee: Aaron Mihalik
>
> Currently the MongoDB DAO sends one query at a time to Mongo.  Instead, the 
> DAO should send a batch of queries and perform a client side hash join (like 
> the Accumulo DAO)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (RYA-337) Batch Queries to MongoDB

2017-08-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125772#comment-16125772
 ] 

ASF GitHub Bot commented on RYA-337:


Github user asfgit commented on the issue:

https://github.com/apache/incubator-rya/pull/204
  

Refer to this link for build results (access rights to CI server needed): 

https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/387/



> Batch Queries to MongoDB
> 
>
> Key: RYA-337
> URL: https://issues.apache.org/jira/browse/RYA-337
> Project: Rya
>  Issue Type: Improvement
>  Components: dao
>Reporter: Aaron Mihalik
>Assignee: Aaron Mihalik
>
> Currently the MongoDB DAO sends one query at a time to Mongo.  Instead, the 
> DAO should send a batch of queries and perform a client side hash join (like 
> the Accumulo DAO)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (RYA-337) Batch Queries to MongoDB

2017-08-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125745#comment-16125745
 ] 

ASF GitHub Bot commented on RYA-337:


Github user asfgit commented on the issue:

https://github.com/apache/incubator-rya/pull/204
  

Refer to this link for build results (access rights to CI server needed): 

https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/386/Failed
 Tests: 1incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.indexing.example:
 1ExamplesTest.MongoRyaDirectExampleTest



> Batch Queries to MongoDB
> 
>
> Key: RYA-337
> URL: https://issues.apache.org/jira/browse/RYA-337
> Project: Rya
>  Issue Type: Improvement
>  Components: dao
>Reporter: Aaron Mihalik
>Assignee: Aaron Mihalik
>
> Currently the MongoDB DAO sends one query at a time to Mongo.  Instead, the 
> DAO should send a batch of queries and perform a client side hash join (like 
> the Accumulo DAO)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (RYA-337) Batch Queries to MongoDB

2017-08-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125707#comment-16125707
 ] 

ASF GitHub Bot commented on RYA-337:


GitHub user amihalik opened a pull request:

https://github.com/apache/incubator-rya/pull/204

RYA-337 Adding batch queries to MongoDB. Closes #204

## Description

Added a batch query mechanism to MongoDB DAO and simplified 
MongoDBQueryEngine

### Tests

Ran unit tests 

### Links
[Jira](https://issues.apache.org/jira/browse/RYA-337)

### Checklist
- [ ] Code Review
- [ ] Squash Commits

 People To Review
@isper3at @pujav65 @meiercaleb 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/amihalik/incubator-rya RYA-337

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-rya/pull/204.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #204


commit d19ed3c7fb4ca71bc3c75e1aafc77c4b89c76270
Author: Aaron Mihalik 
Date:   2017-08-08T15:17:37Z

RYA-337 Adding batch queries to MongoDB. Closes #204

Additionally, simplifying MongoDBQueryEngine




> Batch Queries to MongoDB
> 
>
> Key: RYA-337
> URL: https://issues.apache.org/jira/browse/RYA-337
> Project: Rya
>  Issue Type: Improvement
>  Components: dao
>Reporter: Aaron Mihalik
>Assignee: Aaron Mihalik
>
> Currently the MongoDB DAO sends one query at a time to Mongo.  Instead, the 
> DAO should send a batch of queries and perform a client side hash join (like 
> the Accumulo DAO)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)