Keith, what do you think about the throughput achieved?
It was around 15k messages per second, right?
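
As a rough check, assuming the ~1.3 billion transactions reported in the
quoted message below were spread evenly over the 24-hour run, the average
works out to roughly 15k per second. A minimal sketch of that arithmetic
(the class and method names are made up for illustration, and the inputs are
the rounded figures from the thread, not exact counts):

```java
// Rough check of the average transaction rate over a fixed-length run.
// The inputs are the approximate figures quoted in this thread
// (~1.3 billion transactions, 24 hours); all names are illustrative only.
class ThroughputCheck {

    // Average number of transactions per second, truncated to a whole number.
    static long perSecond(long transactions, long hours) {
        long seconds = hours * 60 * 60;
        return transactions / seconds;
    }

    public static void main(String[] args) {
        long rate = perSecond(1_300_000_000L, 24);
        System.out.println("average tx/s: " + rate);
    }
}
```

This is only an average over the whole run. It says nothing about peaks, and
it counts transactions rather than messages.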

*Dummy questions:*
I've noticed that when I increase the number of processes
(threads/applications) loading data, the throughput gets better (obviously).
But I didn't reach Fluo's maximum. Is MapReduce the best way to
load data into Fluo?

Is there any difference between using "fluo exec" to start a Fluo client and
using "java -jar" or some other way?
Are there other clients for loading data into or transacting with Fluo?
Python, Go...
What's the difference between using a Loader and a client transaction?
Asynchronous vs. synchronous?

*(sorry for the disconnected questions!)*
Thanks!

Alan Camillo
*BlueShift* | IT Director
Cel.: +55 11 98283-6358
Tel.: +55 11 4605-5082

2018-01-10 13:19 GMT-02:00 Keith Turner <ke...@deenlo.com>:

> I completed a successful 24hr run of the Fluo stress test on a 10 node
> EC2 cluster.  For the test 1 billion random integers were loaded via
> map reduce and then 370 million were loaded by Fluo.  This resulted in
> ~1.3 billion transactions executing and ~13 million collisions.  Fluo
> commit dbad51d was used for the test.  Below is the final output from
> the test.
>
> *****Verifying Fluo & MapReduce results match*****
> Success! Fluo & MapReduce both calculated 1369064132 unique integers
>
> During the test CPU utilization was not uniformly high.  Looking at
> the Accumulo monitor some nodes would have lots of queued scans.
> Running jstack on those nodes showed lots of threads trying to reserve
> open files.  However there were only a few threads actually running
> scans.  This seemed very odd and I plan to investigate further.  I had
> set the max open files to 1000 and all tablets had only 3 to 4 files.
> Therefore if 1000 files were reserved I would have expected to see
> lots of scans running, however this was not what I saw.
>
>
> Below is a gist with info about config used for the test.
>
> https://gist.github.com/keith-turner/e28ee6cd4941210f34e5cd0e6a6b3106
>
