On Wed, Aug 29, 2018 at 9:37 AM, guy sharon <guy.sharon.1...@gmail.com> wrote:
> hi Marc,
>
> Just ran the test again with the changes you suggested. Setup: 5 tservers on
> CentOS 7, 4 CPUs and 16 GB RAM, Accumulo 1.7.4, table with 6M rows.
> org.apache.accumulo.examples.simple.helloworld.ReadData now uses a
> BatchScanner with 10 threads. I got:
>
> $ time install/accumulo-1.7.4/bin/accumulo
> org.apache.accumulo.examples.simple.helloworld.ReadData -i muchos -z
> localhost:2181 -u root -t hellotable -p secret

Doing a performance test this way has overhead that would not be seen in
a long-running Java process. I think the accumulo script runs a Java
process to check something before running the Java process for
ReadData. Also, Java is really slow when it first starts because it
is interpreting and loading classes. Once the classes are loaded and
the bytecode is compiled, execution is much faster. For performance
tests I usually loop a few times in Java, running the same test and
printing the time for each iteration. The first run is usually
really slow, then it speeds up until the times stabilize.
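A minimal sketch of that kind of warm-up loop (busyWork here is a made-up stand-in; in a real benchmark you would replace it with the actual scan):

```java
// Hypothetical warm-up harness: runs the same workload repeatedly inside one
// JVM so class loading and JIT compilation are amortized across iterations.
public class WarmupTimer {

    // Stand-in workload; swap in the real table scan for an actual benchmark.
    static long busyWork() {
        long sum = 0;
        for (int i = 0; i < 5_000_000; i++) {
            sum += i % 7;
        }
        return sum;
    }

    public static void main(String[] args) {
        for (int run = 1; run <= 5; run++) {
            long start = System.nanoTime();
            long result = busyWork();
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            // Early iterations are typically slower; later ones stabilize.
            System.out.println("run " + run + ": " + elapsedMs
                    + " ms (result=" + result + ")");
        }
    }
}
```

Only the stabilized later iterations are worth comparing between cluster configurations; the first one or two mostly measure JVM startup effects.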


>
> real    0m16.979s
> user    0m13.670s
> sys    0m0.599s
>
> So this doesn't really improve things. That looks strange to me, as I'd
> expect Accumulo to use the threads to speed things up. Unless the full scan
> makes it use just one thread, on the assumption that the entries are adjacent
> on disk and faster to read sequentially than to jump back and forth between
> threads. What do you think?
>
> BR,
> Guy.
>
> On Wed, Aug 29, 2018 at 3:25 PM Marc <phroc...@apache.org> wrote:
>>
>> Guy,
>>   To clarify:
>>
>> [1] If you have four tablets it's reasonable to suspect that the RPC
>> time to access those servers may increase a bit if you access them
>> sequentially versus in parallel.
>> On Wed, Aug 29, 2018 at 8:16 AM Marc <phroc...@apache.org> wrote:
>> >
>> > Guy,
>> >   The ReadData example appears to use a sequential scanner. Can you
>> > change that to a batch scanner and see if there is improvement [1]?
>> > Also, while you are there can you remove the log statement or set your
>> > log level so that the trace message isn't printed?
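A rough sketch of what that change might look like (connection details are placeholders taken from the command line above, and this assumes the Accumulo 1.7 client API; it counts entries instead of logging each one, since per-entry logging dominates the runtime):

```java
import java.util.Collections;
import java.util.Map.Entry;

import org.apache.accumulo.core.client.BatchScanner;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Range;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;

// Hypothetical sketch of swapping ReadData's sequential Scanner for a
// BatchScanner; instance/user/table names are placeholders.
public class BatchScanSketch {
    public static void main(String[] args) throws Exception {
        Connector conn = new ZooKeeperInstance("muchos", "localhost:2181")
                .getConnector("root", new PasswordToken("secret"));

        // 10 query threads let the client hit multiple tablet servers in parallel.
        BatchScanner scanner =
                conn.createBatchScanner("hellotable", Authorizations.EMPTY, 10);
        try {
            scanner.setRanges(Collections.singleton(new Range())); // full table
            long count = 0;
            for (Entry<Key, Value> entry : scanner) {
                count++; // count only; don't log each entry
            }
            System.out.println("entries: " + count);
        } finally {
            scanner.close();
        }
    }
}
```

Note that a BatchScanner returns entries in no particular order, which is fine for a full-scan throughput test but matters if the client depends on sorted output.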
>> >
>> > In this case we are reading the entirety of that data. If you were to
>> > perform a query you would likely prefer to do it at the data instead
>> > of bringing all data back to the client.
>> >
>> > What are your expectations? It appears very slow. Do you want
>> > faster client-side access to the data? Certainly improvements could be
>> > made -- of that I have no doubt -- but the time to bring 6M entries to
>> > the client is a cost you will incur if you use the ReadData example.
>> >
>> > [1] If you have four tablets it's reasonable to suspect that the RPC
>> > time to access those servers may increase a bit.
>> >
>> > On Wed, Aug 29, 2018 at 8:05 AM guy sharon <guy.sharon.1...@gmail.com>
>> > wrote:
>> > >
>> > > hi,
>> > >
>> > > Continuing my performance benchmarks, I'm still trying to figure out
>> > > if the results I'm getting are reasonable and why throwing more hardware 
>> > > at
>> > > the problem doesn't help. What I'm doing is a full table scan on a table
>> > > with 6M entries. This is Accumulo 1.7.4 with Zookeeper 3.4.12 and Hadoop
>> > > 2.8.4. The table is populated by
>> > > org.apache.accumulo.examples.simple.helloworld.InsertWithBatchWriter
>> > > modified to write 6M entries instead of 50k. Reads are performed by
>> > > "bin/accumulo org.apache.accumulo.examples.simple.helloworld.ReadData -i
>> > > muchos -z localhost:2181 -u root -t hellotable -p secret". Here are the
>> > > results I got:
>> > >
>> > > 1. 5 tserver cluster as configured by Muchos
>> > > (https://github.com/apache/fluo-muchos), running on m5d.large AWS 
>> > > machines
>> > > (2vCPU, 8GB RAM) running CentOS 7. Master is on a separate server. Scan 
>> > > took
>> > > 12 seconds.
>> > > 2. As above except with m5d.xlarge (4vCPU, 16GB RAM). Same results.
>> > > 3. Splitting the table to 4 tablets causes the runtime to increase to
>> > > 16 seconds.
>> > > 4. 7 tserver cluster running m5d.xlarge servers. 12 seconds.
>> > > 5. Single node cluster on m5d.12xlarge (48 cores, 192GB RAM), running
>> > > Amazon Linux. Configuration as provided by Uno
>> > > (https://github.com/apache/fluo-uno). Total time was 26 seconds.
>> > >
>> > > Offhand I would say this is very slow. I'm guessing I'm making some
>> > > sort of newbie (possibly configuration) mistake, but I can't figure out
>> > > what it is. Can anyone point me to something that might help me find it?
>> > >
>> > > thanks,
>> > > Guy.
>> > >
>> > >
