Re: Accumulo performance on various hardware configurations

Josh Elser Wed, 29 Aug 2018 08:48:44 -0700

To answer your original question: YCSB is a standard benchmarking tool for databases that provides various types of read/write workloads.


https://github.com/brianfrankcooper/YCSB/tree/master/accumulo1.7


On 8/29/18 8:04 AM, guy sharon wrote:

hi,
Continuing my performance benchmarks, I'm still trying to figure out if the results I'm getting are reasonable and why throwing more hardware at the problem doesn't help. What I'm doing is a full table scan on a table with 6M entries. This is Accumulo 1.7.4 with Zookeeper 3.4.12 and Hadoop 2.8.4. The table is populated by org.apache.accumulo.examples.simple.helloworld.InsertWithBatchWriter modified to write 6M entries instead of 50k. Reads are performed by "bin/accumulo org.apache.accumulo.examples.simple.helloworld.ReadData -i muchos -z localhost:2181 -u root -t hellotable -p secret". Here are the results I got:
1. 5 tserver cluster as configured by Muchos (https://github.com/apache/fluo-muchos), running on m5d.large AWS machines (2vCPU, 8GB RAM) running CentOS 7. Master is on a separate server. Scan took 12 seconds.
2. As above except with m5d.xlarge (4vCPU, 16GB RAM). Same results.
3. Splitting the table to 4 tablets causes the runtime to increase to 16 seconds.
4. 7 tserver cluster running m5d.xlarge servers. 12 seconds.
5. Single node cluster on m5d.12xlarge (48 cores, 192GB RAM), running Amazon Linux. Configuration as provided by Uno (https://github.com/apache/fluo-uno). Total time was 26 seconds.
Offhand I would say this is very slow. I'm guessing I'm making some sort of newbie (possibly configuration) mistake but I can't figure out what it is. Can anyone point me to something that might help me find out what it is?
thanks,
Guy.
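
For reference, the scan times quoted above can be turned into approximate scan rates, which makes the configurations easier to compare. This is only a rough sketch in Python: the labels are shorthand for the five numbered results, and it assumes that "Same results" in item 2 means 12 seconds.

```python
# Scan times (seconds) as reported in the message above, for a 6M-entry table.
# Assumption: "Same results" for the m5d.xlarge cluster means 12 seconds.
ENTRIES = 6_000_000

scan_seconds = {
    "5 tservers, m5d.large": 12,
    "5 tservers, m5d.xlarge": 12,
    "table split into 4 tablets": 16,
    "7 tservers, m5d.xlarge": 12,
    "single node, m5d.12xlarge (Uno)": 26,
}


def entries_per_second(entries, seconds):
    """Average scan rate over the whole run."""
    return entries / seconds


throughput = {
    name: entries_per_second(ENTRIES, seconds)
    for name, seconds in scan_seconds.items()
}

for name, rate in throughput.items():
    print(f"{name}: {rate:,.0f} entries/s")
```

The 12-second runs work out to 500,000 entries per second regardless of the number or size of tservers, so the rate does not change with the added hardware in these measurements.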
