Hey Roger,

> Is this the way you typically do it for stream/table joins?
Yup, in cases where the source stream has a copy of the data, and you fully
bootstrap, there's no need to have a changelog.

> but the current write throughput puts hard limits on the amount of local
> state that a container can have without really long
> initialization/recovery times.

Yeah. The current guidance we give is <10G/container, and every gig adds to
startup recovery time. We definitely want to decrease startup times.

> In my tests, LevelDB has about the same performance. Have you noticed
> that as well?

I haven't checked, but I'm not surprised. LevelDB and RocksDB are fairly
similar. Anecdotally, we have found LevelDB less predictable. It performs
on par with RocksDB, and then sometimes falls off a cliff for some reason
(data size, access patterns, etc.).

Cheers,
Chris

On 1/20/15 4:05 PM, "Roger Hoover" <[email protected]> wrote:

>Thanks, Chris.
>
>I am not using a changelog for the store because the bootstrap stream
>is a master copy of the data and the job can recover from there. No need
>to write out another copy. Is this the way you typically do it for
>stream/table joins?
>
>Great to know that you're looking into the performance issues. I love
>the idea of local state for isolation and predictable throughput, but the
>current write throughput puts hard limits on the amount of local state
>that a container can have without really long initialization/recovery
>times.
>
>In my tests, LevelDB has about the same performance. Have you noticed
>that as well?
>
>Cheers,
>
>Roger
>
>On Tue, Jan 20, 2015 at 9:28 AM, Chris Riccomini <[email protected]> wrote:
>
>> Hey Roger,
>>
>> We did some benchmarking, and discovered very similar performance to
>> what you've described. We saw ~40k writes/sec, and ~20k reads/sec,
>> per-container, on a Virident SSD. This was without any changelog. Are
>> you using a changelog on the store?
>>
>> When we attached a changelog to the store, the writes dropped
>> significantly (~1000 writes/sec). When we hooked up VisualVM, we saw
>> that the container was spending >99% of its time in
>> KafkaSystemProducer.send().
>>
>> We're currently doing two things:
>>
>> 1. Working with our performance team to understand and tune RocksDB
>>    properly.
>> 2. Upgrading the Kafka producer to use the new Java-based API
>>    (SAMZA-227).
>>
>> For (1), it seems like we should be able to get a lot higher throughput
>> from RocksDB. Anecdotally, we've heard that RocksDB requires many
>> threads in order to max out an SSD, and since Samza is single-threaded,
>> we could just be hitting a RocksDB bottleneck. We won't know until we
>> dig into the problem (which we started investigating last week). The
>> current plan is to start by benchmarking RocksDB JNI outside of Samza,
>> and see what we can get. From there, we'll know our "speed of light",
>> and can try to get Samza as close as possible to it. If RocksDB JNI
>> can't be made to go "fast", then we'll have to understand why.
>>
>> (2) should help with the changelog issue. I believe that the slowness
>> with the changelog is caused because the changelog is using a sync
>> producer to send to Kafka, and is blocking when a batch is flushed. In
>> the new API, the concept of a "sync" producer is removed. All writes
>> are handled on an async writer thread (though we can still guarantee
>> writes are safely written before checkpointing, which is what we need).
>>
>> In short, I agree, it seems slow. We see this behavior, too. We're
>> digging into it.
>>
>> Cheers,
>> Chris
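As a rough sketch of step (1) above -- timing RocksDB JNI by itself, outside
Samza -- a single-threaded write loop against rocksdbjni might look like the
following. The class name, DB path, and record sizing are illustrative (the
~150-byte value roughly matches Roger's 6GB / 40M-record figure), and the API
shown is the current try-with-resources style; the JNI builds of that era
released native handles with dispose() instead:

    import org.rocksdb.Options;
    import org.rocksdb.RocksDB;
    import org.rocksdb.RocksDBException;

    public class RocksDbWriteBench {
      public static void main(String[] args) throws RocksDBException {
        RocksDB.loadLibrary();
        int numRecords = 1_000_000;     // placeholder workload size
        byte[] value = new byte[150];   // ~150-byte payload, roughly 6GB/40M
        try (Options options = new Options().setCreateIfMissing(true);
             RocksDB db = RocksDB.open(options, "/tmp/rocksdb-bench")) {
          long start = System.nanoTime();
          for (int i = 0; i < numRecords; i++) {
            // Single-threaded puts, mirroring Samza's one-writer-per-container model.
            db.put(String.valueOf(i).getBytes(), value);
          }
          long elapsedMs = (System.nanoTime() - start) / 1_000_000;
          System.out.printf("%d writes in %d ms (%.0f writes/sec)%n",
              numRecords, elapsedMs, numRecords * 1000.0 / elapsedMs);
        }
      }
    }

Running the same loop from several writer threads would also test the
"RocksDB needs many threads to max out an SSD" theory mentioned above.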
>>
>> On 1/17/15 12:58 PM, "Roger Hoover" <[email protected]> wrote:
>>
>> >Michael,
>> >
>> >Thanks for the response. I used VisualVM and YourKit and see the CPU
>> >is not being used (0.1%). I took a few thread dumps and see the main
>> >thread blocked on the flush() method inside the KV store.
>> >
>> >On Sat, Jan 17, 2015 at 7:09 AM, Michael Rose <[email protected]>
>> >wrote:
>> >
>> >> Is your process at 100% CPU? I suspect you're spending most of your
>> >> time in JSON deserialization, but profile it and check.
>> >>
>> >> Michael
>> >>
>> >> On Friday, January 16, 2015, Roger Hoover <[email protected]>
>> >> wrote:
>> >>
>> >> > Hi guys,
>> >> >
>> >> > I'm testing a job that needs to load 40M records (6GB in Kafka as
>> >> > JSON) from a bootstrap topic. The topic has 4 partitions, and I'm
>> >> > running the job using the ProcessJobFactory, so all four tasks are
>> >> > in one container.
>> >> >
>> >> > Using RocksDB, it's taking 19 minutes to load all the data, which
>> >> > amounts to 35k records/sec or 5MB/s based on input size. I ran
>> >> > iostat during this time and saw that the disk write throughput is
>> >> > 14MB/s.
>> >> >
>> >> > I didn't tweak any of the storage settings.
>> >> >
>> >> > A few questions:
>> >> > 1) Does this seem low? I'm running on a MacBook Pro with SSD.
>> >> > 2) Do you have any recommendations for improving the load speed?
>> >> >
>> >> > Thanks,
>> >> >
>> >> > Roger
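For reference, the bootstrap-stream/table-join pattern discussed in this
thread -- materializing the bootstrap topic into a local store with no
changelog, then joining the other stream against it -- might look roughly
like the minimal sketch below, using the 0.x-era task API. The store name
("table-store"), topic names, output stream, and String types are
illustrative placeholders, not details from the thread:

    import org.apache.samza.config.Config;
    import org.apache.samza.storage.kv.KeyValueStore;
    import org.apache.samza.system.IncomingMessageEnvelope;
    import org.apache.samza.system.OutgoingMessageEnvelope;
    import org.apache.samza.system.SystemStream;
    import org.apache.samza.task.InitableTask;
    import org.apache.samza.task.MessageCollector;
    import org.apache.samza.task.StreamTask;
    import org.apache.samza.task.TaskContext;
    import org.apache.samza.task.TaskCoordinator;

    public class StreamTableJoinTask implements StreamTask, InitableTask {
      private static final SystemStream OUTPUT =
          new SystemStream("kafka", "joined-output");  // placeholder output

      private KeyValueStore<String, String> table;

      @Override
      @SuppressWarnings("unchecked")
      public void init(Config config, TaskContext context) {
        // Must match the stores.table-store.* entries in the job config; no
        // changelog is configured because the bootstrap topic is the master
        // copy of the data.
        table = (KeyValueStore<String, String>) context.getStore("table-store");
      }

      @Override
      public void process(IncomingMessageEnvelope envelope,
          MessageCollector collector, TaskCoordinator coordinator) {
        String stream = envelope.getSystemStreamPartition().getStream();
        if ("table-topic".equals(stream)) {
          // Table side: materialize the bootstrapped record into local state.
          table.put((String) envelope.getKey(), (String) envelope.getMessage());
        } else {
          // Stream side: join the event against the local table.
          String match = table.get((String) envelope.getKey());
          if (match != null) {
            collector.send(new OutgoingMessageEnvelope(OUTPUT, envelope.getKey(), match));
          }
        }
      }
    }

In the job config, the table topic would be flagged as a bootstrap stream
(systems.kafka.streams.table-topic.samza.bootstrap=true), and the
stores.table-store.changelog property would simply be left unset, per
Chris's point that a fully bootstrapped master copy makes a changelog
unnecessary.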
