Hey Roger,

We did some benchmarking and saw performance very similar to what you've
described: ~40k writes/sec and ~20k reads/sec, per container, on a Virident
SSD. This was without any changelog. Are you using a changelog on the store?
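
For reference, by "changelog" I just mean the stores.*.changelog property.
A minimal config would look something like this (the store and stream names
here are placeholders, not what we actually ran):

  stores.my-store.factory=org.apache.samza.storage.kv.RocksDbKeyValueStorageEngineFactory
  stores.my-store.changelog=kafka.my-store-changelog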

When we attached a changelog to the store, writes dropped significantly
(to ~1000 writes/sec). When we hooked up VisualVM, we saw that
the container was spending > 99% of its time in KafkaSystemProducer.send().

We're currently doing two things:

1. Working with our performance team to understand and tune RocksDB
properly.
2. Upgrading the Kafka producer to use the new Java-based API. (SAMZA-227)

For (1), it seems like we should be able to get a lot higher throughput
from RocksDB. Anecdotally, we've heard that RocksDB requires many threads
in order to max out an SSD, and since Samza is single-threaded, we could
just be hitting a RocksDB bottleneck. We won't know until we dig into the
problem (which we started investigating last week). The current plan is to
start by benchmarking RocksDB JNI outside of Samza, and see what we can
get. From there, we'll know our "speed of light", and can try to get Samza
as close as possible to it. If RocksDB JNI can't be made to go "fast",
then we'll have to understand why.
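
As a rough sketch of what I mean by benchmarking RocksDB JNI outside of
Samza, something along these lines (single-threaded, completely untuned; the
path, payload size, and record count are arbitrary, not our actual benchmark):

  import org.rocksdb.Options;
  import org.rocksdb.RocksDB;
  import org.rocksdb.RocksDBException;

  public class RocksDbWriteBench {
    public static void main(String[] args) throws RocksDBException {
      RocksDB.loadLibrary();
      Options options = new Options().setCreateIfMissing(true);
      RocksDB db = RocksDB.open(options, "/tmp/rocksdb-bench");

      byte[] value = new byte[150];            // arbitrary ~150 byte payload
      long count = 10000000L;
      long start = System.currentTimeMillis();
      for (long i = 0; i < count; i++) {
        // single-threaded puts, no write batching
        db.put(String.valueOf(i).getBytes(), value);
      }
      long elapsedMs = System.currentTimeMillis() - start;
      System.out.println((count * 1000L / elapsedMs) + " writes/sec");

      db.close();
    }
  }

That should tell us roughly what a single thread can push through RocksDB on
the same hardware, independent of the rest of the container.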

(2) should help with the changelog issue. I believe the slowness with the
changelog comes from the changelog using a sync producer to send to Kafka,
which blocks whenever a batch is flushed. In the new API, the concept of a
"sync" producer goes away: all writes are handled on an async sender thread
(though we can still guarantee that writes are safely written before
checkpointing, which is what we need).
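
To make the async point concrete, here's a rough sketch against the new Java
producer API (not the actual KafkaSystemProducer code; the broker address,
topic name, and sizes are made up):

  import java.util.ArrayList;
  import java.util.List;
  import java.util.Properties;
  import java.util.concurrent.Future;
  import org.apache.kafka.clients.producer.KafkaProducer;
  import org.apache.kafka.clients.producer.ProducerRecord;
  import org.apache.kafka.clients.producer.RecordMetadata;

  public class AsyncProducerSketch {
    public static void main(String[] args) throws Exception {
      Properties props = new Properties();
      props.put("bootstrap.servers", "localhost:9092");  // made-up broker
      props.put("key.serializer",
          "org.apache.kafka.common.serialization.ByteArraySerializer");
      props.put("value.serializer",
          "org.apache.kafka.common.serialization.ByteArraySerializer");

      KafkaProducer<byte[], byte[]> producer =
          new KafkaProducer<byte[], byte[]>(props);

      // send() just queues the record and returns a Future immediately;
      // batching and I/O happen on the producer's background sender thread.
      List<Future<RecordMetadata>> pending =
          new ArrayList<Future<RecordMetadata>>();
      for (int i = 0; i < 100000; i++) {
        pending.add(producer.send(new ProducerRecord<byte[], byte[]>(
            "my-store-changelog", ("key-" + i).getBytes(), new byte[150])));
      }

      // Before a checkpoint we only need to block until everything queued
      // so far has been acked, rather than blocking on every batch flush.
      for (Future<RecordMetadata> f : pending) {
        f.get();
      }
      producer.close();
    }
  }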

In short, I agree, it seems slow. We see this behavior, too. We're digging
into it.

Cheers,
Chris

On 1/17/15 12:58 PM, "Roger Hoover" <[email protected]> wrote:

>Michael,
>
>Thanks for the response.  I used VisualVM and YourKit and see the CPU is
>not being used (0.1%).  I took a few thread dumps and see the main thread
>blocked on the flush() method inside the KV store.
>
>On Sat, Jan 17, 2015 at 7:09 AM, Michael Rose <[email protected]>
>wrote:
>
>> Is your process at 100% CPU? I suspect you're spending most of your
>> time in JSON deserialization, but profile it and check.
>>
>> Michael
>>
>> On Friday, January 16, 2015, Roger Hoover <[email protected]>
>> wrote:
>>
>> > Hi guys,
>> >
>> > I'm testing a job that needs to load 40M records (6GB in Kafka as JSON)
>> > from a bootstrap topic.  The topic has 4 partitions and I'm running the
>> > job using the ProcessJobFactory so all four tasks are in one container.
>> >
>> > Using RocksDB, it's taking 19 minutes to load all the data, which
>> > amounts to 35k records/sec or 5MB/s based on input size.  I ran iostat
>> > during this time and saw the disk write throughput was 14MB/s.
>> >
>> > I didn't tweak any of the storage settings.
>> >
>> > A few questions:
>> > 1) Does this seem low?  I'm running on a Macbook Pro with SSD.
>> > 2) Do you have any recommendations for improving the load speed?
>> >
>> > Thanks,
>> >
>> > Roger
>> >
>>
