Re: BatchScanner taking too much time to scan rows

2015-05-14 Thread Dylan Hutchison
I think this is the same issue I found for ACCUMULO-3710, only in my case the tserver ran out of memory. Accumulo doesn't handle large numbers of small, disjoint ranges well. I bet there's room for improvement on both the client and tablet ser

Re: BatchScanner taking too much time to scan rows

2015-05-14 Thread vaibhav thapliyal
Dylan, could you elaborate on the average query time you had?
Thanks,
Vaibhav
On 14-May-2015 11:03 pm, "Dylan Hutchison" wrote:
> I think this is the same issue I found for ACCUMULO-3710, only in my case
> the tserver ran out of memory. Accum

Re: BatchScanner taking too much time to scan rows

2015-05-14 Thread Dylan Hutchison
I didn't have an average query time; the tablet server crashed. A quick solution is to batch the ranges into groups of 50k (or 500k, I forgot which one) and run many BatchScans, which is not ideal. I think I achieved 33k entries/second retrieval on a single-node Accumulo. Accumulo is better for sequenti
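The batching workaround described above could be sketched roughly as follows. This is a minimal, library-agnostic illustration, not Accumulo client code: `Range`, `chunk_ranges`, `scan_in_groups`, and the `scan_group` callback are hypothetical names, and the default group size of 50,000 is just one of the two figures mentioned in the message.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

# Hypothetical stand-in for a scan range: (start_row, end_row).
Range = Tuple[str, str]


def chunk_ranges(ranges: Sequence[Range], group_size: int = 50_000) -> Iterator[List[Range]]:
    """Yield consecutive groups of at most group_size ranges."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    for start in range(0, len(ranges), group_size):
        yield list(ranges[start:start + group_size])


def scan_in_groups(
    ranges: Sequence[Range],
    scan_group: Callable[[List[Range]], list],
    group_size: int = 50_000,
) -> list:
    """Run one batch scan per group and concatenate the results.

    scan_group is an assumed callback that performs a single batch scan
    over the ranges it is given and returns the entries it found.
    """
    results = []
    for group in chunk_ranges(ranges, group_size):
        results.extend(scan_group(group))
    return results
```

In a real client, `scan_group` would wrap whatever call sets the ranges on a batch scanner and drains its results; keeping it as a callback leaves the grouping logic independent of the client API.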

Re: BatchScanner taking too much time to scan rows

2015-05-14 Thread Dylan Hutchison
Sorry, just remembered that my setup was to scan an index table and gather rowIDs, then scan a main data table using the rowIDs as the BatchScan ranges. Effectively it is a join of part of the index table to a main data table. The scan rate I achieved is therefore double the value I cited previou
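The two-step lookup described above (gather rowIDs from an index table, then fetch those rows from the main table) can be illustrated with plain dictionaries. This is only a sketch under stated assumptions: the in-memory tables, `lookup_row_ids`, and `fetch_rows` are hypothetical stand-ins, not the setup actually used in the message.

```python
from typing import Dict, Iterable, List, Tuple


def lookup_row_ids(index_table: Dict[str, List[str]], terms: Iterable[str]) -> List[str]:
    """Step 1: scan the index for each term and gather the matching row IDs.

    Row IDs are de-duplicated and sorted so the second step sees each row once.
    """
    row_ids = set()
    for term in terms:
        row_ids.update(index_table.get(term, []))
    return sorted(row_ids)


def fetch_rows(main_table: Dict[str, str], row_ids: Iterable[str]) -> List[Tuple[str, str]]:
    """Step 2: use each row ID as a single-row lookup against the main table."""
    return [(rid, main_table[rid]) for rid in row_ids if rid in main_table]
```

Each returned entry costs one read of the index and one read of the main table, which is why the join counts two scanned entries per result.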

Re: Mini Accumulo cluster

2015-05-14 Thread Dave Hardcastle
Josh, thanks for your response. My iterators will do the same number of seeks; they differ only in the implementation of the functions used to perform filtering, so I think I'll get a reasonable comparison, but I won't read too much into the results.
On 13 May 2015 at 21:19, Josh Elser wr