Good to know. Thanks, Jay and Chris. Since I want the job to accept updates, it may be worthwhile for me to add a changelog so recoveries are faster.
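Concretely, I'm picturing something like the configuration below: the input consumed as a bootstrap stream (so updates keep arriving after the initial load), plus a separate changelog on the store for fast restores. The system, topic, and store names are made up:

  # Hypothetical names throughout: system "kafka", input topic
  # "model-data", store "model-store".
  task.inputs=kafka.model-data

  # Consume the input as a bootstrap stream: the task reads it from the
  # oldest offset before processing anything else, and keeps receiving
  # new updates afterwards.
  systems.kafka.streams.model-data.samza.bootstrap=true
  systems.kafka.streams.model-data.samza.offset.default=oldest

  # Separate changelog topic, so recovery uses the optimized restore
  # path instead of replaying the master topic through process().
  stores.model-store.factory=org.apache.samza.storage.kv.RocksDbKeyValueStorageEngineFactory
  stores.model-store.changelog=kafka.model-store-changelog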
On Tue, Jan 20, 2015 at 4:38 PM, Chris Riccomini <[email protected]> wrote:

> Hey Roger,
>
> To add to Jay's comment, if you don't care about getting updates after
> the initial bootstrap, you can configure a store with a changelog
> pointed to your bootstrap topic. This will cause the SamzaContainer to
> bootstrap using the optimized code that Jay described. Just make sure
> you don't write to the store (since it would put the mutation back into
> your bootstrap stream). This configuration won't allow new updates to
> come into the store until the job is restarted. If you use the
> 'bootstrap stream' concept, then you continue getting updates after the
> initial bootstrap. The 'bootstrap' stream also allows you to have
> arbitrary logic, which might be useful for your job--not sure.
>
> Cheers,
> Chris
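(For reference, the changelog-pointed-at-the-bootstrap-topic setup Chris describes would presumably boil down to something like this -- store and topic names made up, and the task must never write to the store:)

  # Point the store's changelog at the existing master topic. On startup
  # the container restores the store directly from that topic via the
  # optimized restore path; updates arriving later are not picked up
  # until the job restarts.
  stores.model-store.factory=org.apache.samza.storage.kv.RocksDbKeyValueStorageEngineFactory
  stores.model-store.changelog=kafka.model-data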
> On 1/20/15 4:30 PM, "Jay Kreps" <[email protected]> wrote:
>
> > It's also worth noting that restoring from a changelog *should* be
> > much faster than restoring from upstream. The restore case is
> > optimized and batches the updates and skips serialization, both of
> > which help a ton with performance.
> >
> > -Jay
> >
> > On Tue, Jan 20, 2015 at 4:19 PM, Chinmay Soman
> > <[email protected]> wrote:
> >
> > > I remember running both RocksDB and LevelDB and RocksDB was
> > > definitely better (in that one test case, it was ~40K vs ~30K
> > > random writes/sec) - but I haven't done any exhaustive comparison.
> > >
> > > Btw, I see that you're using 4 partitions? Any reason you're not
> > > using something like 128 or more and running with more containers?
> > >
> > > On Tue, Jan 20, 2015 at 4:05 PM, Roger Hoover
> > > <[email protected]> wrote:
> > >
> > > > Thanks, Chris.
> > > >
> > > > I am not using a changelog for the store because the bootstrap
> > > > stream is a master copy of the data and the job can recover from
> > > > there. No need to write out another copy. Is this the way you
> > > > typically do it for stream/table joins?
> > > >
> > > > Great to know that you're looking into the performance issues. I
> > > > love the idea of local state for isolation and predictable
> > > > throughput, but the current write throughput puts hard limits on
> > > > the amount of local state that a container can have without
> > > > really long initialization/recovery times.
> > > >
> > > > In my tests, LevelDB has about the same performance. Have you
> > > > noticed that as well?
> > > >
> > > > Cheers,
> > > >
> > > > Roger
> > > >
> > > > On Tue, Jan 20, 2015 at 9:28 AM, Chris Riccomini
> > > > <[email protected]> wrote:
> > > >
> > > > > Hey Roger,
> > > > >
> > > > > We did some benchmarking, and discovered very similar
> > > > > performance to what you've described. We saw ~40k writes/sec,
> > > > > and ~20k reads/sec, per-container, on a Virident SSD. This was
> > > > > without any changelog. Are you using a changelog on the store?
> > > > >
> > > > > When we attached a changelog to the store, the writes dropped
> > > > > significantly (~1000 writes/sec). When we hooked up VisualVM,
> > > > > we saw that the container was spending over 99% of its time in
> > > > > KafkaSystemProducer.send().
> > > > >
> > > > > We're currently doing two things:
> > > > >
> > > > > 1. Working with our performance team to understand and tune
> > > > > RocksDB properly.
> > > > > 2. Upgrading the Kafka producer to use the new Java-based API.
> > > > > (SAMZA-227)
> > > > >
> > > > > For (1), it seems like we should be able to get a lot higher
> > > > > throughput from RocksDB. Anecdotally, we've heard that RocksDB
> > > > > requires many threads in order to max out an SSD, and since
> > > > > Samza is single-threaded, we could just be hitting a RocksDB
> > > > > bottleneck. We won't know until we dig into the problem (which
> > > > > we started investigating last week). The current plan is to
> > > > > start by benchmarking RocksDB JNI outside of Samza, and see
> > > > > what we can get. From there, we'll know our "speed of light",
> > > > > and can try to get Samza as close as possible to it. If RocksDB
> > > > > JNI can't be made to go "fast", then we'll have to understand
> > > > > why.
> > > > >
> > > > > (2) should help with the changelog issue. I believe the
> > > > > slowness with the changelog is caused by the changelog using a
> > > > > sync producer to send to Kafka, which blocks whenever a batch
> > > > > is flushed. In the new API, the concept of a "sync" producer is
> > > > > removed. All writes are handled on an async writer thread
> > > > > (though we can still guarantee writes are safely written before
> > > > > checkpointing, which is what we need).
> > > > >
> > > > > In short, I agree, it seems slow. We see this behavior, too.
> > > > > We're digging into it.
> > > > >
> > > > > Cheers,
> > > > > Chris
> > > > >
> > > > > On 1/17/15 12:58 PM, "Roger Hoover" <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > Michael,
> > > > > >
> > > > > > Thanks for the response. I used VisualVM and YourKit and see
> > > > > > that the CPU is not being used (0.1%). I took a few thread
> > > > > > dumps and see the main thread blocked on the flush() method
> > > > > > inside the KV store.
> > > > > >
> > > > > > On Sat, Jan 17, 2015 at 7:09 AM, Michael Rose
> > > > > > <[email protected]> wrote:
> > > > > >
> > > > > > > Is your process at 100% CPU? I suspect you're spending most
> > > > > > > of your time in JSON deserialization, but profile it and
> > > > > > > check.
> > > > > > >
> > > > > > > Michael
> > > > > > >
> > > > > > > On Friday, January 16, 2015, Roger Hoover
> > > > > > > <[email protected]> wrote:
> > > > > > >
> > > > > > > > Hi guys,
> > > > > > > >
> > > > > > > > I'm testing a job that needs to load 40M records (6GB in
> > > > > > > > Kafka as JSON) from a bootstrap topic. The topic has 4
> > > > > > > > partitions and I'm running the job using the
> > > > > > > > ProcessJobFactory so all four tasks are in one container.
> > > > > > > >
> > > > > > > > Using RocksDB, it's taking 19 minutes to load all the
> > > > > > > > data, which amounts to 35k records/sec or 5MB/s based on
> > > > > > > > input size. I ran iostat during this time and saw that
> > > > > > > > the disk write throughput was 14MB/s.
> > > > > > > >
> > > > > > > > I didn't tweak any of the storage settings.
> > > > > > > >
> > > > > > > > A few questions:
> > > > > > > > 1) Does this seem low? I'm running on a MacBook Pro with
> > > > > > > > SSD.
> > > > > > > > 2) Do you have any recommendations for improving the load
> > > > > > > > speed?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Roger
> > >
> > > --
> > > Thanks and regards
> > >
> > > Chinmay Soman
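For anyone picking up item (1) later: a bare-bones write micro-benchmark against RocksDB JNI, outside of Samza, might look roughly like the sketch below. The class name, path, and workload sizes are made up, all tuning is left at defaults, and it assumes the rocksdbjni artifact is on the classpath.

  import java.util.Random;

  import org.rocksdb.Options;
  import org.rocksdb.RocksDB;
  import org.rocksdb.RocksDBException;

  public class RocksDbWriteBench {
    public static void main(String[] args) throws RocksDBException {
      RocksDB.loadLibrary();

      int numRecords = 1_000_000;    // scale toward 40M for a realistic run
      byte[] key = new byte[16];
      byte[] value = new byte[150];  // ~150-byte values, roughly 6GB / 40M
      Random random = new Random(42);
      random.nextBytes(value);

      Options options = new Options().setCreateIfMissing(true);
      RocksDB db = RocksDB.open(options, "/tmp/rocksdb-bench");
      try {
        long start = System.nanoTime();
        for (int i = 0; i < numRecords; i++) {
          random.nextBytes(key);     // random writes, like the tests above
          db.put(key, value);
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.printf("%d writes in %d ms (%.0f writes/sec)%n",
            numRecords, elapsedMs, numRecords * 1000.0 / elapsedMs);
      } finally {
        db.close();
      }
    }
  }

A second variant that runs the same loop from several threads against one RocksDB instance would test the "RocksDB needs many threads to max out an SSD" theory directly.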

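And on item (2), my reading of the new Java producer pattern is sketched below: send() returns a Future immediately and batching happens on a background I/O thread, so nothing blocks per message; the caller only waits on outstanding futures at checkpoint time. This is just an illustration of the pattern, not Samza's actual KafkaSystemProducer -- the broker address and topic name are placeholders.

  import java.util.ArrayList;
  import java.util.List;
  import java.util.Properties;
  import java.util.concurrent.Future;

  import org.apache.kafka.clients.producer.KafkaProducer;
  import org.apache.kafka.clients.producer.ProducerRecord;
  import org.apache.kafka.clients.producer.RecordMetadata;

  public class AsyncChangelogWriter {
    public static void main(String[] args) throws Exception {
      Properties props = new Properties();
      props.put("bootstrap.servers", "localhost:9092");  // placeholder
      props.put("acks", "all");
      props.put("key.serializer",
          "org.apache.kafka.common.serialization.ByteArraySerializer");
      props.put("value.serializer",
          "org.apache.kafka.common.serialization.ByteArraySerializer");

      KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props);
      List<Future<RecordMetadata>> inFlight = new ArrayList<>();

      // Sends are queued and batched by the producer's background thread;
      // each call returns a Future without blocking.
      for (int i = 0; i < 10_000; i++) {
        byte[] k = Integer.toString(i).getBytes("UTF-8");
        inFlight.add(producer.send(
            new ProducerRecord<byte[], byte[]>("store-changelog", k, k)));
      }

      // At checkpoint time, block until every outstanding write is acked,
      // preserving the "safely written before checkpointing" guarantee.
      for (Future<RecordMetadata> f : inFlight) {
        f.get();
      }
      producer.close();
    }
  }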