[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121986#comment-16121986
 ] 

ASF GitHub Bot commented on RYA-316:


Github user asfgit closed the pull request at:

https://github.com/apache/incubator-rya/pull/199


> Long LineStrings break MongoDB ingest
> -
>
> Key: RYA-316
> URL: https://issues.apache.org/jira/browse/RYA-316
> Project: Rya
>  Issue Type: Bug
>  Components: dao
>Reporter: Aaron Mihalik
>Assignee: Andrew Smith
>
> MongoDB will reject statements if they contain very long linestrings.
> Basically, the mongodb index key is limited to 1024 chars, so the insert will 
> fail if the literal is longer.
> [Here is some example 
> code|https://github.com/amihalik/rya-mongo-debugging/blob/master/src/main/java/com/github/amihalik/rya/mongo/debugging/linestring/LoadLineString.java].
>   I think the inserts will work if you use 10 points, but fail if you use 
> linestrings with 100 points.
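
For illustration only, here is a minimal sketch of why the point count matters. It assumes a plain WKT literal with made-up coordinates and treats the limit as 1024 bytes; the class, constant, and method names are hypothetical and are not taken from the linked LoadLineString example. It compares the byte length of the literal with the fixed length of a SHA-256 digest of the same literal (the diff under review hashes values with SHA-256).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

class LineStringKeySketch {
    // Assumption: the index key limit described in the issue, taken as bytes.
    static final int ASSUMED_KEY_LIMIT_BYTES = 1024;

    // Builds a WKT LINESTRING with the given number of made-up points.
    static String lineStringWkt(int points) {
        StringBuilder sb = new StringBuilder("LINESTRING (");
        for (int i = 0; i < points; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(String.format(Locale.ROOT, "%.6f %.6f",
                    -77.0 + i * 0.001, 38.0 + i * 0.001));
        }
        return sb.append(")").toString();
    }

    // SHA-256 digest of the UTF-8 bytes of the value; always 32 bytes long.
    static byte[] sha256(String value) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(value.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    public static void main(String[] args) {
        for (int points : new int[] {10, 100}) {
            String wkt = lineStringWkt(points);
            int literalBytes = wkt.getBytes(StandardCharsets.UTF_8).length;
            System.out.println(points + " points: literal=" + literalBytes
                    + " bytes, fits=" + (literalBytes <= ASSUMED_KEY_LIMIT_BYTES)
                    + ", digest=" + sha256(wkt).length + " bytes");
        }
    }
}
```

With these made-up coordinates the 10-point literal stays under the assumed limit and the 100-point literal does not, while the digest length is the same for both. Real literals may differ in size depending on coordinate precision and any datatype suffix.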



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121954#comment-16121954
 ] 

ASF GitHub Bot commented on RYA-316:


Github user amihalik commented on the issue:

https://github.com/apache/incubator-rya/pull/199
  
@isper3at thanks!  Merging now...




[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121950#comment-16121950
 ] 

ASF GitHub Bot commented on RYA-316:


Github user asfgit commented on the issue:

https://github.com/apache/incubator-rya/pull/199
  

Refer to this link for build results (access rights to CI server needed): 

https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/382/





[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121736#comment-16121736
 ] 

ASF GitHub Bot commented on RYA-316:


Github user amihalik commented on the issue:

https://github.com/apache/incubator-rya/pull/199
  
@pujav65 totally agree.  That's why I added the "please get some metrics" 
portion.




[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121728#comment-16121728
 ] 

ASF GitHub Bot commented on RYA-316:


Github user pujav65 commented on the issue:

https://github.com/apache/incubator-rya/pull/199
  
The reason I wanted it broken apart was so this could be merged and more 
time could be spent figuring out if hashing is necessary.  So I say merge this 
and do that in a follow-on.




[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121682#comment-16121682
 ] 

ASF GitHub Bot commented on RYA-316:


Github user amihalik commented on the issue:

https://github.com/apache/incubator-rya/pull/199
  
I added the "please hash everything" task as requested by @pujav65 and we 
can continue the work on that task.  The task is more of a "please get some 
metrics then decide if we want to hash everything."  The task is [RYA-338 - 
Revisit Core indcies in MongoDB](https://issues.apache.org/jira/browse/RYA-338)




[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121663#comment-16121663
 ] 

ASF GitHub Bot commented on RYA-316:


Github user amihalik commented on the issue:

https://github.com/apache/incubator-rya/pull/199
  
@isper3at Change line 80 from "SUBJECT" to "SUBJECT_HASH" and I'll be 
satisfied and I'll merge it.  Thanks!




[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120301#comment-16120301
 ] 

ASF GitHub Bot commented on RYA-316:


Github user pujav65 commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/199#discussion_r132249498
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java
 ---
@@ -53,8 +54,11 @@
 public static final String OBJECT_TYPE_VALUE = 
XMLSchema.ANYURI.stringValue();
 public static final String CONTEXT = "context";
 public static final String PREDICATE = "predicate";
-public static final String OBJECT = "object";
+public static final String PREDICATE_HASH = "predicate_hash";
+public static final String OBJECT = "object_original";
--- End diff --

I'm pretty sure Mongo further condenses the data, so I'm not sure hashing 
is necessary in order for it to store in memory.  You're adding a lot of 
overhead to query.  I'm ok with adding it now if you think it's necessary.




[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120257#comment-16120257
 ] 

ASF GitHub Bot commented on RYA-316:


Github user amihalik commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/199#discussion_r132245115
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java
 ---
@@ -64,14 +68,14 @@
 @Override
 public void createIndices(final DBCollection coll){
 BasicDBObject doc = new BasicDBObject();
-doc.put(SUBJECT, 1);
-doc.put(PREDICATE, 1);
+doc.put(SUBJECT_HASH, 1);
+doc.put(PREDICATE_HASH, 1);
 coll.createIndex(doc);
--- End diff --

@pujav65 thanks.  

@isper3at clearly a bug.  please add OBJECT_HASH, OBJECT_TYPE_HASH to the 
first index. 




[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120245#comment-16120245
 ] 

ASF GitHub Bot commented on RYA-316:


Github user pujav65 commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/199#discussion_r132243060
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java
 ---
@@ -64,14 +68,14 @@
 @Override
 public void createIndices(final DBCollection coll){
 BasicDBObject doc = new BasicDBObject();
-doc.put(SUBJECT, 1);
-doc.put(PREDICATE, 1);
+doc.put(SUBJECT_HASH, 1);
+doc.put(PREDICATE_HASH, 1);
 coll.createIndex(doc);
--- End diff --

When the Mongo db backend was first implemented, you could only do indices 
over two fields-- the first is the primary index, the second the secondary 
index.  That may have changed since.  The indices we originally had were 
subject, predicate, object, and then subject/predicate, predicate/object, and 
object/subject.  Not including object type might be a bug, but I had 
thought that was addressed at some point.  Also, one could argue that the 
single-field indices were redundant-- I had wanted to test that but never got 
around to it.
If you can now index over more than two fields, then we might want to 
revisit this.  




[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120235#comment-16120235
 ] 

ASF GitHub Bot commented on RYA-316:


Github user amihalik commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/199#discussion_r132242206
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java
 ---
@@ -53,8 +54,11 @@
 public static final String OBJECT_TYPE_VALUE = 
XMLSchema.ANYURI.stringValue();
 public static final String CONTEXT = "context";
 public static final String PREDICATE = "predicate";
-public static final String OBJECT = "object";
+public static final String PREDICATE_HASH = "predicate_hash";
+public static final String OBJECT = "object_original";
--- End diff --

@pujav65 I'm concerned about index size.  please hash everything.  If you 
want another ticket for "please hash everything" I'm fine with that, but let's 
knock that out while @isper3at is cleaning this stuff up.  Key thing with mongo 
is to get the index to fit in memory, so let's do that.




[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120195#comment-16120195
 ] 

ASF GitHub Bot commented on RYA-316:


Github user amihalik commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/199#discussion_r132235261
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java
 ---
@@ -64,14 +68,14 @@
 @Override
 public void createIndices(final DBCollection coll){
 BasicDBObject doc = new BasicDBObject();
-doc.put(SUBJECT, 1);
-doc.put(PREDICATE, 1);
+doc.put(SUBJECT_HASH, 1);
+doc.put(PREDICATE_HASH, 1);
 coll.createIndex(doc);
-doc = new BasicDBObject(PREDICATE, 1);
-doc.put(OBJECT, 1);
+doc = new BasicDBObject(PREDICATE_HASH, 1);
+doc.put(OBJECT_HASH, 1);
 doc.put(OBJECT_TYPE, 1);
 coll.createIndex(doc);
-doc = new BasicDBObject(OBJECT, 1);
+doc = new BasicDBObject(OBJECT_HASH, 1);
 doc.put(OBJECT_TYPE, 1);
 doc.put(SUBJECT, 1);
--- End diff --

SUBJECT_HASH




[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120194#comment-16120194
 ] 

ASF GitHub Bot commented on RYA-316:


Github user amihalik commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/199#discussion_r132235567
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java
 ---
@@ -64,14 +68,14 @@
 @Override
 public void createIndices(final DBCollection coll){
 BasicDBObject doc = new BasicDBObject();
-doc.put(SUBJECT, 1);
-doc.put(PREDICATE, 1);
+doc.put(SUBJECT_HASH, 1);
+doc.put(PREDICATE_HASH, 1);
 coll.createIndex(doc);
--- End diff --

@pujav65  Looking over this index creation code... this seems like a bug... 
where's the SPO index?  I think this first index should be SUBJECT_HASH, 
PREDICATE_HASH, OBJECT_HASH, OBJECT_TYPE_HASH
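
As a rough sketch of the concern here: MongoDB can use a compound index for equality matches on a leading prefix of its fields, so a two-field subject/predicate index only helps with the first two fields of a fully bound statement. The class and method below are hypothetical, and every field name except "predicate_hash" (which appears in the diff) is an assumed spelling. This models only the prefix rule, not the query planner.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class IndexPrefixSketch {
    // Counts how many leading fields of a compound index are bound by the query.
    // A larger count means the index narrows the search further.
    static int boundPrefixLength(List<String> indexFields, Set<String> boundFields) {
        int count = 0;
        for (String field : indexFields) {
            if (!boundFields.contains(field)) {
                break;
            }
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Index as it appears in the diff, and the four-field index proposed in the comment.
        List<String> twoFieldIndex = Arrays.asList("subject_hash", "predicate_hash");
        List<String> proposedIndex = Arrays.asList(
                "subject_hash", "predicate_hash", "object_hash", "object_type_hash");

        // A fully bound statement: subject, predicate, object, and object type are all known.
        Set<String> fullyBound = new HashSet<>(proposedIndex);

        System.out.println("two-field index prefix: "
                + boundPrefixLength(twoFieldIndex, fullyBound));
        System.out.println("proposed index prefix: "
                + boundPrefixLength(proposedIndex, fullyBound));
    }
}
```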




[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120185#comment-16120185
 ] 

ASF GitHub Bot commented on RYA-316:


Github user amihalik commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/199#discussion_r132235041
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java
 ---
@@ -53,8 +54,11 @@
 public static final String OBJECT_TYPE_VALUE = 
XMLSchema.ANYURI.stringValue();
 public static final String CONTEXT = "context";
 public static final String PREDICATE = "predicate";
-public static final String OBJECT = "object";
+public static final String PREDICATE_HASH = "predicate_hash";
+public static final String OBJECT = "object_original";
--- End diff --

yep, might as well hash context and object type as well.




[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120183#comment-16120183
 ] 

ASF GitHub Bot commented on RYA-316:


Github user pujav65 commented on the issue:

https://github.com/apache/incubator-rya/pull/199
  
looks good to me.  aaron's different hashing suggestion can be done later 
-- add it to jira to track if you don't want to do it now.  




[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120155#comment-16120155
 ] 

ASF GitHub Bot commented on RYA-316:


Github user isper3at commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/199#discussion_r132230502
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java
 ---
@@ -53,8 +54,11 @@
 public static final String OBJECT_TYPE_VALUE = 
XMLSchema.ANYURI.stringValue();
 public static final String CONTEXT = "context";
 public static final String PREDICATE = "predicate";
-public static final String OBJECT = "object";
+public static final String PREDICATE_HASH = "predicate_hash";
+public static final String OBJECT = "object_original";
--- End diff --

woops.  I'll make it just object.  did you want a hash for context as well?




[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120156#comment-16120156
 ] 

ASF GitHub Bot commented on RYA-316:


Github user isper3at commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/199#discussion_r132230655
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java
 ---
@@ -85,14 +89,14 @@ public DBObject getQuery(final RyaStatement stmt) {
 final RyaURI context = stmt.getContext();
 final BasicDBObject query = new BasicDBObject();
 if (subject != null){
-query.append(SUBJECT, subject.getData());
+query.append(SUBJECT_HASH, 
DigestUtils.sha256Hex(subject.getData()));
--- End diff --

I can store as either.  Not really sure if there are any pros-cons between 
the two




[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119949#comment-16119949
 ] 

ASF GitHub Bot commented on RYA-316:


Github user amihalik commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/199#discussion_r132191684
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java
 ---
@@ -53,8 +54,11 @@
 public static final String OBJECT_TYPE_VALUE = 
XMLSchema.ANYURI.stringValue();
 public static final String CONTEXT = "context";
 public static final String PREDICATE = "predicate";
-public static final String OBJECT = "object";
+public static final String PREDICATE_HASH = "predicate_hash";
+public static final String OBJECT = "object_original";
--- End diff --

Can you change this to just "object", or change "context", "predicate", and 
"subject" to "xxx_original"?




[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119948#comment-16119948
 ] 

ASF GitHub Bot commented on RYA-316:


Github user amihalik commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/199#discussion_r132192953
  
--- Diff: 
dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java
 ---
@@ -85,14 +89,14 @@ public DBObject getQuery(final RyaStatement stmt) {
 final RyaURI context = stmt.getContext();
 final BasicDBObject query = new BasicDBObject();
 if (subject != null){
-query.append(SUBJECT, subject.getData());
+query.append(SUBJECT_HASH, 
DigestUtils.sha256Hex(subject.getData()));
--- End diff --

Can we store/query in binary (32 bytes) vs hex string (64 bytes)?
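
To make the size difference concrete, here is a small sketch using only the JDK's MessageDigest rather than the DigestUtils call in the diff. The class name, helper names, and the sample URI are hypothetical. It shows that the raw SHA-256 digest is 32 bytes while its hex encoding is 64 characters.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class HashSizeSketch {
    // Raw SHA-256 digest of the UTF-8 bytes of the value.
    static byte[] sha256(String value) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(value.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    // Lowercase hex encoding: two characters per digest byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] raw = sha256("urn:example:subject"); // made-up subject URI
        String hex = toHex(raw);
        System.out.println("binary digest: " + raw.length + " bytes");   // 32
        System.out.println("hex digest:    " + hex.length() + " chars"); // 64
    }
}
```

Whether the smaller binary form is worth it depends on how the values are queried and inspected; hex strings are easier to read in a shell, binary halves the stored size per hashed field.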




[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118678#comment-16118678
 ] 

ASF GitHub Bot commented on RYA-316:


Github user isper3at commented on the issue:

https://github.com/apache/incubator-rya/pull/199
  
asfbot build




[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118632#comment-16118632
 ] 

ASF GitHub Bot commented on RYA-316:


Github user asfgit commented on the issue:

https://github.com/apache/incubator-rya/pull/199
  

Refer to this link for build results (access rights to CI server needed): 

https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/371/

Failed Tests: 3

incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.prospector: 3
  org.apache.rya.prospector.mr.ProspectorTest.testCount
  org.apache.rya.prospector.service.ProspectorServiceEvalStatsDAOTest.testCount
  org.apache.rya.prospector.service.ProspectorServiceEvalStatsDAOTest.testNoAuthsCount





[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118539#comment-16118539
 ] 

ASF GitHub Bot commented on RYA-316:


Github user asfgit commented on the issue:

https://github.com/apache/incubator-rya/pull/199
  

Refer to this link for build results (access rights to CI server needed): 

https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/368/

Failed Tests: 2

incubator-rya-master-with-optionals-pull-requests/org.apache.rya:mongodb.rya: 1
    org.apache.rya.mongodb.SimpleMongoDBStorageStrategyTest.testSerializeStatementToDBO

incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.indexing: 1
    org.apache.rya.indexing.mongo.MongoTemporalIndexerTest.org.apache.rya.indexing.mongo.MongoTemporalIndexerTest





[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118523#comment-16118523
 ] 

ASF GitHub Bot commented on RYA-316:


Github user asfgit commented on the issue:

https://github.com/apache/incubator-rya/pull/199
  

Refer to this link for build results (access rights to CI server needed): 

https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/369/

Failed Tests: 1

incubator-rya-master-with-optionals-pull-requests/org.apache.rya:mongodb.rya: 1
    org.apache.rya.mongodb.SimpleMongoDBStorageStrategyTest.testSerializeStatementToDBO





[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16117781#comment-16117781
 ] 

ASF GitHub Bot commented on RYA-316:


Github user isper3at commented on the issue:

https://github.com/apache/incubator-rya/pull/199
  
sure, shouldn't be an issue




[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16116904#comment-16116904
 ] 

ASF GitHub Bot commented on RYA-316:


GitHub user isper3at opened a pull request:

https://github.com/apache/incubator-rya/pull/199

RYA-316 Long OBJ string

## Description
>What Changed?

Hash the indexed object field with SHA256.
This will allow the indexer not to break
if the object is longer than 1024 bytes.
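
A minimal Java sketch of the idea described above, not the code from this pull request. Hashing the object value with SHA-256 yields a fixed 64-character hex string, so the indexed value stays far below the 1024-byte index key limit however long the literal is. The class and method names (`ObjectHashSketch`, `sha256Hex`) are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class ObjectHashSketch {

    /** Returns the SHA-256 digest of the value as 64 lowercase hex characters. */
    static String sha256Hex(String value) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                // %02x prints each byte as two hex digits (negative bytes as unsigned)
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Build a literal much longer than 1024 bytes (hypothetical WKT-like text)
        StringBuilder longLiteral = new StringBuilder("LINESTRING (");
        for (int i = 0; i < 1000; i++) {
            longLiteral.append(i).append(' ').append(i).append(", ");
        }
        longLiteral.append("0 0)");
        System.out.println(longLiteral.length() + " chars -> "
                + sha256Hex(longLiteral.toString()).length() + " chars");
    }
}
```

The original value would still have to be stored alongside the hash, since the digest cannot be reversed.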

### Tests
>Coverage?

Updated the tests with the new fields

### Links
[Jira](https://issues.apache.org/jira/browse/RYA-316)

### Checklist
- [ ] Code Review
- [ ] Squash Commits

#### People To Review
@meiercaleb 
@amihalik 
@pujav65 


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/isper3at/incubator-rya RYA-316

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-rya/pull/199.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #199


commit 94f6c9bad01d0cf7c2716895678692065796fa13
Author: isper3at 
Date:   2017-08-07T17:28:46Z

RYA-316 Long OBJ string

Hash the indexed object field with SHA256.
This will allow the indexer not to break
if the object is longer than 1024 bytes.






[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-04 Thread Puja Valiyil (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115209#comment-16115209
 ] 

Puja Valiyil  commented on RYA-316:
---

It wouldn't be a hash index-- you would just store the hash as if it was the 
original literal.  Basically encoding it.  
Good point about the geo search-- that was a lot of my concern with the 
truncating, so I'm ok with it.  



[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-04 Thread Andrew Smith (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115206#comment-16115206
 ] 

Andrew Smith commented on RYA-316:
--

Hash indices can't be used in compound indices, and geo searches are 
performed on the geo field, not the object field.




[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-04 Thread Puja Valiyil (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115198#comment-16115198
 ] 

Puja Valiyil  commented on RYA-316:
---

Since you're also storing the object value still (just basing index values for 
queries off the sha1 hash), you don't have to worry about going back and forth 
from the hash.  So if you had the triple pattern (?s ?pred obj), you would 
query for triples with a hash value field equal to the hash of obj.  Compound 
indices would use the hash obj instead of the obj value.  The scan would return 
the entire document, so you wouldn't need to convert back from the hash.  We 
would be storing the object twice, but that's not that big a deal.
As I said before, truncating isn't valid especially in a geo use case- you 
would no longer be storing the entire geo literal which makes it impossible to 
accurately do geo search functions.  I'm not sure how we would query on that 
either-- if you have a long literal do you break it up in the query?  You'd be 
doing a lot of filtering client side.  
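
The lookup described above can be sketched in plain Java, with in-memory maps standing in for MongoDB documents. This is an illustration under assumptions, not Rya's storage code: the field names (`object`, `objectHash`) and the class `HashQuerySketch` are hypothetical. A pattern such as (?s ?pred obj) is answered by hashing `obj` and comparing hashes. Because each match carries the full object value, nothing has to be converted back from the hash.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class HashQuerySketch {

    /** SHA-1 of the value as 40 lowercase hex characters. */
    static String sha1Hex(String value) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-1")
                    .digest(value.getBytes(StandardCharsets.UTF_8));
            return String.format("%040x", new BigInteger(1, hash));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Stored form of a triple: the full object value plus its hash. */
    static Map<String, String> toDocument(String subject, String predicate, String object) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("subject", subject);
        doc.put("predicate", predicate);
        doc.put("object", object);              // kept whole, never indexed
        doc.put("objectHash", sha1Hex(object)); // short, safe to index
        return doc;
    }

    /** Answers (?s ?pred obj) by matching on the hash field only. */
    static List<Map<String, String>> findByObject(List<Map<String, String>> docs, String object) {
        String wanted = sha1Hex(object);
        List<Map<String, String>> matches = new ArrayList<>();
        for (Map<String, String> doc : docs) {
            if (wanted.equals(doc.get("objectHash"))) {
                matches.add(doc); // whole document, including the original object
            }
        }
        return matches;
    }
}
```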



[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-04 Thread Andrew Smith (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115146#comment-16115146
 ] 

Andrew Smith commented on RYA-316:
--

right, the truncating would end up with:
Subject=Andrew
Predicate=plays
Object=trumpet_sup-
Object_suffix=er long object value
Object type=string

So when you go to retrieve it, the two values just get concatenated.  This 
should be fine: since we won't index on the suffix, it can be up to 16 MB long 
(MongoDB's document size limit).  
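
A rough Java sketch of the split-and-concatenate scheme from this comment, under assumptions. The names `TruncateSketch`, `split`, and `join` are hypothetical, and the limit is counted in Java `char`s for simplicity, whereas MongoDB measures index keys in bytes, so non-ASCII text would need a byte-aware cut.

```java
final class TruncateSketch {

    /**
     * Splits an object value into an indexed prefix and an unindexed suffix.
     * Values that already fit get an empty suffix.
     */
    static String[] split(String object, int maxIndexedChars) {
        if (object.length() <= maxIndexedChars) {
            return new String[] {object, ""};
        }
        return new String[] {
            object.substring(0, maxIndexedChars),
            object.substring(maxIndexedChars)
        };
    }

    /** Rebuilds the original value when the document is read back. */
    static String join(String prefix, String suffix) {
        return prefix + suffix;
    }

    public static void main(String[] args) {
        String[] parts = split("trumpet_super long object value", 11);
        System.out.println("Object=" + parts[0]);
        System.out.println("Object_suffix=" + parts[1]);
        System.out.println("Rejoined=" + join(parts[0], parts[1]));
    }
}
```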



[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-04 Thread Andrew Smith (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115113#comment-16115113
 ] 

Andrew Smith commented on RYA-316:
--

I may have found another solution: if I specify the PO index to be a 
compound index where OBJECT is a text field (which is how it is treated 
anyway), it works.  I just tried it and had no problems.




[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-04 Thread Andrew Smith (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115102#comment-16115102
 ] 

Andrew Smith commented on RYA-316:
--

The problem is we would have to get rid of all compound indexes that use 
object.  We could build the compound index over the hashObject field, but I'm 
not sure how that'll work.  We would also have to convert to/from the SHA of 
the object before/after each query and at insert time.  The level of effort for 
truncating and SHA-ing is the same, and I'm almost done with truncating.  If we 
decide to switch to SHA, it'll be trivial.




[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-04 Thread Puja Valiyil (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115100#comment-16115100
 ] 

Puja Valiyil  commented on RYA-316:
---

Oh and I don't think truncating is a valid solution-- you lose portions of very 
long strings.  It would probably be better to not store the triple and instead 
detect and log the error at that point.  



[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-04 Thread Puja Valiyil (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115095#comment-16115095
 ] 

Puja Valiyil  commented on RYA-316:
---

I had thought we would add a hash of the object to each triple document and 
then use that for queries, etc.  So for the triple (Andrew, plays, trumpet) we 
would store the following:
Subject=Andrew
Predicate=plays
Object=trumpet
Hashobject=SHA-1 hash of trumpet
Object type=string
The ID would be the concatenation of subject, predicate, and hash of object.
There's a concern about losing alphabetic sorting for the PO index, but I'm not 
sure we even make use of that in a way that matters.


[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-08-04 Thread Andrew Smith (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115032#comment-16115032
 ] 

Andrew Smith commented on RYA-316:
--

A quick fix is to truncate the object value to 1024 so it fits in the index 
key.  Alternatively, we can create multiple indexers and gracefully catch 
invalid index keys.  I'll implement the truncation since that's what I talked 
about with Caleb, but I'll also look into doing this gracefully with hashing 
as an alternative.



[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest

2017-07-19 Thread Aaron Mihalik (JIRA)

[ 
https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093325#comment-16093325
 ] 

Aaron Mihalik commented on RYA-316:
---

[~Pujav65][~meierca...@gmail.com]

The issue is when Mongo tries to put the Object/literal into the PO and OS 
indices.  The index keys are way too long.

Unfortunately, I can't figure out an obvious way around this.  Mongo can do 
hash indices (awesome!), but cannot do that over compound keys (ugh).

I would like to make this solution as general purpose as possible.  I think we 
might have to truncate the values that we put in the index.  Or we might roll 
our own hash index.

--Aaron
