Re: Local state write throughput

Roger Hoover Tue, 20 Jan 2015 16:49:20 -0800

"Btw, I see that you're using 4 partitions ? Any reason you're not using
like >= 128 and running with more containers ?"


In my case, the table that I need to join is small (1.5GB) so rather than
partitioning the state, I have each task keep it's own full copy.  This
makes the overall job flow simpler in that I don't need partitioning jobs
and I can add more partitions if needed without a complex migration process
or data loss.

So, in this case, adding more partitions won't decrease the start up time.

On Tue, Jan 20, 2015 at 4:19 PM, Chinmay Soman <[email protected]>
wrote:

> I remember running both RocksDB and LevelDB and it was definitely better
> (in that 1 test case, it was ~40K vs ~30K random writes/sec) - but I
> haven't done any exhaustive comparison.
>
> Btw, I see that you're using 4 partitions ? Any reason you're not using
> like >= 128 and running with more containers ?
>
> On Tue, Jan 20, 2015 at 4:05 PM, Roger Hoover <[email protected]>
> wrote:
>
> > Thanks, Chris.
> >
> > I am not using a changelog for the store because the the bootstrap stream
> > is a master copy of the data and the job can recover from there.  No need
> > to write out another copy.  Is this the way you typically do it for
> > stream/table joins?
> >
> > Great to know you that you're looking into the performance issues.  I
> love
> > the idea of local state for isolation and predictable throughput but the
> > current write throughput puts hard limits on the amount of local state
> that
> > a container can have without really long initialization/recovery times.
> >
> > Is my tests, LevelDB has about the same performance.  Have you noticed
> that
> > as well?
> >
> > Cheers,
> >
> > Roger
> >
> > On Tue, Jan 20, 2015 at 9:28 AM, Chris Riccomini <
> > [email protected]> wrote:
> >
> > > Hey Roger,
> > >
> > > We did some benchmarking, and discovered very similar performance to
> what
> > > you've described. We saw ~40k writes/sec, and ~20 k reads/sec,
> > > per-container, on a Virident SSD. This was without any changelog. Are
> you
> > > using a changelog on the store?
> > >
> > > When we attached a changelog to the store, the writes dropped
> > > significantly (~1000 writes/sec). When we hooked up VisualVM, we saw
> that
> > > the container was spending > 99% of its time in
> > KafkaSystemProducer.send().
> > >
> > > We're currently doing two things:
> > >
> > > 1. Working with our performance team to understand and tune RocksDB
> > > properly.
> > > 2. Upgrading the Kafka producer to use the new Java-based API.
> > (SAMZA-227)
> > >
> > > For (1), it seems like we should be able to get a lot higher throughput
> > > from RocksDB. Anecdotally, we've heard that RocksDB requires many
> threads
> > > in order to max out an SSD, and since Samza is single-threaded, we
> could
> > > just be hitting a RocksDB bottleneck. We won't know until we dig into
> the
> > > problem (which we started investigating last week). The current plan is
> > to
> > > start by benchmarking RocksDB JNI outside of Samza, and see what we can
> > > get. From there, we'll know our "speed of light", and can try to get
> > Samza
> > > as close as possible to it. If RocksDB JNI can't be made to go "fast",
> > > then we'll have to understand why.
> > >
> > > (2) should help with the changelog issue. I believe that the slowness
> > with
> > > the changelog is caused because the changelog is using a sync producer
> to
> > > send to Kafka, and is blocking when a batch is flushed. In the new API,
> > > the concept of a "sync" producer is removed. All writes are handled on
> an
> > > async writer thread (though we can still guarantee writes are safely
> > > written before checkpointing, which is what we need).
> > >
> > > In short, I agree, it seems slow. We see this behavior, too. We're
> > digging
> > > into it.
> > >
> > > Cheers,
> > > Chris
> > >
> > > On 1/17/15 12:58 PM, "Roger Hoover" <[email protected]> wrote:
> > >
> > > >Michael,
> > > >
> > > >Thanks for the response.  I used VisualVM and YourKit and see the CPU
> is
> > > >not being used (0.1%).  I took a few thread dumps and see the main
> > thread
> > > >blocked on the flush() method inside the KV store.
> > > >
> > > >On Sat, Jan 17, 2015 at 7:09 AM, Michael Rose <[email protected]
> >
> > > >wrote:
> > > >
> > > >> Is your process at 100% CPU? I suspect you're spending most of your
> > > >>time in
> > > >> JSON deserialization, but profile it and check.
> > > >>
> > > >> Michael
> > > >>
> > > >> On Friday, January 16, 2015, Roger Hoover <[email protected]>
> > > >>wrote:
> > > >>
> > > >> > Hi guys,
> > > >> >
> > > >> > I'm testing a job that needs to load 40M records (6GB in Kafka as
> > > >>JSON)
> > > >> > from a bootstrap topic.  The topic has 4 partitions and I'm
> running
> > > >>the
> > > >> job
> > > >> > using the ProcessJobFactory so all four tasks are in one
> container.
> > > >> >
> > > >> > Using RocksDB, it's taking 19 minutes to load all the data which
> > > >>amounts
> > > >> to
> > > >> > 35k records/sec or 5MB/s based on input size.  I ran iostat during
> > > >>this
> > > >> > time as see the disk write throughput is 14MB/s.
> > > >> >
> > > >> > I didn't tweak any of the storage settings.
> > > >> >
> > > >> > A few questions:
> > > >> > 1) Does this seem low?  I'm running on a Macbook Pro with SSD.
> > > >> > 2) Do you have any recommendations for improving the load speed?
> > > >> >
> > > >> > Thanks,
> > > >> >
> > > >> > Roger
> > > >> >
> > > >>
> > >
> > >
> >
>
>
>
> --
> Thanks and regards
>
> Chinmay Soman
>

Re: Local state write throughput

Reply via email to