All,

Three questions, all asking the same thing:
Can an Accumulo scan or batch scan run like a MapReduce job?

I have an Accumulo 2.0 cluster. In Hadoop, I can launch a MapReduce job on the name node, and Hadoop distributes the job over the nodes of the cluster so it runs in parallel. In Accumulo, I am calling the batch scanner from some non-Java code that is first distributed across the cluster; on each node it then attaches to Accumulo and runs the scan. This works on single-node Accumulo, so far so good.

Now I need to scale up and run it multi-node. My concern is that I will wind up running the same scan on every node, which would return an array of identical result sets. Am I correct? Can I somehow get the Hadoop MapReduce effect in Accumulo?

Thanks

Geoffry Roberts
Lead Technologist
702.290.9098
[email protected]
Booz | Allen | Hamilton
BoozAllen.com
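[Editor's sketch] For concreteness, here is a minimal sketch of the partitioned-scan idea raised in the question, assuming the Accumulo 2.0 Java client API. The table name, the client properties path, and the idea of handing each node its index and the total node count are illustrative assumptions, not something from the original post. Each node builds one Range per tablet from the table's split points and keeps only a disjoint slice of them, so the separately launched scans divide the table instead of all returning the same result set.

import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map.Entry;

import org.apache.accumulo.core.client.Accumulo;
import org.apache.accumulo.core.client.AccumuloClient;
import org.apache.accumulo.core.client.BatchScanner;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Range;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;
import org.apache.hadoop.io.Text;

public class PartitionedScan {
  public static void main(String[] args) throws Exception {
    String table = "mytable";                    // hypothetical table name
    int nodeIndex = Integer.parseInt(args[0]);   // this node's position, 0..numNodes-1
    int numNodes = Integer.parseInt(args[1]);    // total number of scanning nodes

    try (AccumuloClient client = Accumulo.newClient()
        .from("/path/to/accumulo-client.properties") // hypothetical client config path
        .build()) {

      // Build one Range per tablet from the table's split points:
      // tablet i covers (previous split, split].
      Collection<Text> splits = client.tableOperations().listSplits(table);
      List<Range> allRanges = new ArrayList<>();
      Text prev = null;
      for (Text split : splits) {
        allRanges.add(new Range(prev, false, split, true));
        prev = split;
      }
      allRanges.add(new Range(prev, false, null, true)); // last tablet, open-ended

      // Each node keeps a disjoint slice of the ranges, so the scans
      // partition the table across nodes instead of duplicating it.
      List<Range> myRanges = new ArrayList<>();
      for (int i = nodeIndex; i < allRanges.size(); i += numNodes) {
        myRanges.add(allRanges.get(i));
      }

      try (BatchScanner scanner =
          client.createBatchScanner(table, Authorizations.EMPTY, 10)) {
        scanner.setRanges(myRanges);
        for (Entry<Key, Value> entry : scanner) {
          // process entry.getKey() / entry.getValue() here
        }
      }
    }
  }
}

Note that even on one node the BatchScanner already fans its ranges out in parallel to the tablet servers that host them; the slicing above only keeps separately launched processes from repeating each other's work.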
