The simple intuition about batch scanners is that they provide parallelism by
using multiple fetch threads to contact multiple servers at once. It's
pretty common to structure your key so that records are likely to be
scattered around your cluster, either by using a hash or random number
inside your
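
As a rough illustration of the key-scattering idea above (this is not code from the thread): a minimal Java sketch, assuming a two-digit shard prefix derived from a hash of the row ID. The class name `ShardedKey`, the `NN_` prefix format, and the shard count of 16 are all hypothetical choices made for the example; it uses only the Java standard library and does not call Accumulo.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Accumulo: prepends a hash-derived shard
// prefix to a row ID so that similar IDs spread across the key space.
final class ShardedKey {

    // Returns "NN_rowId", where NN is a stable shard number in [0, numShards).
    static String shardedRow(String rowId, int numShards) {
        checkShards(numShards);
        // floorMod keeps the result non-negative even for negative hash codes.
        int shard = Math.floorMod(rowId.hashCode(), numShards);
        return String.format("%02d_%s", shard, rowId);
    }

    // Lists every shard prefix; a reader that wants all rows must cover each one.
    static List<String> shardPrefixes(int numShards) {
        checkShards(numShards);
        List<String> prefixes = new ArrayList<>();
        for (int i = 0; i < numShards; i++) {
            prefixes.add(String.format("%02d_", i));
        }
        return prefixes;
    }

    // The two-digit prefix format assumed here only supports up to 100 shards.
    private static void checkShards(int numShards) {
        if (numShards < 1 || numShards > 100) {
            throw new IllegalArgumentException("numShards must be in 1..100");
        }
    }

    public static void main(String[] args) {
        for (String id : new String[] {"subject:alice", "subject:bob", "subject:carol"}) {
            System.out.println(shardedRow(id, 16));
        }
        System.out.println(shardPrefixes(4));
    }
}
```

With prefixes like these, each shard could become one range passed to a BatchScanner, so its fetch threads are likely to have work on several servers at once. The trade-off is that reading rows back in their original order then means visiting every shard.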
Inline
John
On Sun, Aug 12, 2012 at 5:49 PM, Steven Troxell wrote:
> Hi All,
>
> I was wondering if someone would be willing to help evaluate my reasoning
> on the use of Scanner vs. BatchScanner, and see if I'm making the proper
> assumptions.
>
> The background is I am attempting to benchmark
Hi All,
I was wondering if someone would be willing to help evaluate my reasoning
on the use of Scanner vs. BatchScanner, and see if I'm making the proper
assumptions.
The background is I am attempting to benchmark an RDF application using
Accumulo by evaluating the impact of scaling on performan