We are storing RDF statement data in Accumulo in POS (Predicate, Object,
Subject) order. The table is designed to hold 100 million records.

Ex:
p1|o1|s1
p1|o1|s5
p1|o2|s3
p1|o2|s2
p2|o1|s4

The data is sorted on the first two parts of the key (p1 & o1, etc.).

When I apply a prefix range (p1|o1 to p2|o1), I get the subjects
in the order [s1, s5, s3, s2, s4].

But with this order my scan would have to go back and forth over the table.
I would like to get the list of subjects as [s1, s2, s3, s4, s5] while
reading through the iterators.
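
To make the question concrete, here is a minimal plain-Java sketch of the
two orderings. It does not use the Accumulo API at all: the rows are simply
hard-coded in the order listed above, and the class and method names
(PosScanSketch, subjectsInScanOrder, subjectsSorted) are hypothetical. The
client-side sort only illustrates the output I am after; it is not the
server-side iterator solution I am asking about.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class PosScanSketch {

    // Extracts the subject (last field) from a "predicate|object|subject" row.
    public static String subjectOf(String row) {
        return row.substring(row.lastIndexOf('|') + 1);
    }

    // Subjects in the same order as the rows come back from the scan.
    public static List<String> subjectsInScanOrder(List<String> scannedRows) {
        List<String> subjects = new ArrayList<>();
        for (String row : scannedRows) {
            subjects.add(subjectOf(row));
        }
        return subjects;
    }

    // Desired result: the same subjects in one sorted order.
    // Note: this is lexicographic String order, so "s150" sorts before "s2".
    public static List<String> subjectsSorted(List<String> scannedRows) {
        List<String> subjects = subjectsInScanOrder(scannedRows);
        Collections.sort(subjects);
        return subjects;
    }

    public static void main(String[] args) {
        // Rows as listed in the example above (assumed scan order).
        List<String> rows = List.of(
                "p1|o1|s1", "p1|o1|s5", "p1|o2|s3", "p1|o2|s2", "p2|o1|s4");

        System.out.println(subjectsInScanOrder(rows)); // [s1, s5, s3, s2, s4]
        System.out.println(subjectsSorted(rows));      // [s1, s2, s3, s4, s5]
    }
}
```

Collecting and sorting on the client like this means holding every subject
in the range in memory, which is what I would like to avoid at this scale.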

Is there any way I can get the above result?

Also, if I apply a Range filter on the same table, I get sets in distinct
orders, such as [s2, s3, s5] and [s200, s150, s500]. Even in this case, how
can I make the scanner read the data in a single sorted order?
