Thanks Chris,

We will test our product using SerialGC to see how it behaves.
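
To compare the collectors, I was thinking of logging cumulative GC time from inside the process, roughly as below. This is only a sketch: the class name GcReport is made up for illustration, and it uses the standard java.lang.management API rather than anything Samza-specific. For the serial run we would start the JVM with -XX:+UseSerialGC.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper (not part of Samza): reports what the JVM's
// garbage collectors have done so far in this process.
public class GcReport {

    /** Sums the approximate accumulated collection time (ms) over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // One line per collector, so the log also shows which GC is active.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        System.out.println("total GC ms: " + totalGcMillis());
    }
}
```

The reported time is an approximation of elapsed collection time, so I would treat it as a rough indicator of pause overhead rather than an exact measurement.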

One concern I have is about the Kafka topic sizes. Assuming "stop-the-world" GC pauses will be more noticeable with SerialGC than with the parallel GC, should we increase the Kafka topic sizes to accommodate the data that arrives during those pauses? Or, more broadly: what are the best practices for measuring and setting the right size for Kafka topics? Can anyone share their experience with that?

Thanks,
Dotan

On Tue, Oct 28, 2014 at 5:53 PM, Chris Riccomini <[email protected]> wrote:

> Hey Dotan,
>
> We run all of our jobs using SerialGC by default. For a few of our
> higher-throughput jobs, we've had better luck with parallel GC or G1, but
> in general, serial works fine.
>
> Cheers,
> Chris
>
> On 10/28/14 8:34 AM, "Dotan Patrich" <[email protected]> wrote:
>
> >Hi All,
> >
> >I encountered some issues caused by having too many threads for a user on
> >linux CentOS. Investigating this deeper, it turned out that the JVM spawn
> >over 31 threads per process for GC. Having about 18 Samza processes running
> >on the machine we soon got near to the 1000 threads limit per user.
> >I was thinking of running the Samza JVM with SerialGC instead of parallel
> >GC to avoid having so many threads in the environment. In addition,
> >theoretically this seems to be better fitted for situations where we prefer
> >throughput over latency in a single-core environments (this is roughly what
> >we Samza tasks is assigned with).
> >
> >Before doing so, I would really appreciate you insights - did anyone
> >encountered this issue before? Does changing the GC to be serial is a good
> >solution?
> >
> >Thanks,
> >Dotan
