Hi Bo,

That's an interesting question. Since the task.opts option lets users set whatever GC configuration they prefer for their Samza jobs, we don't really have a "recommended" GC. The best choice would probably also depend on the application's usage pattern. Our perf partner Tao Feng @LinkedIn may have some more insights.
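To illustrate how a collector is selected through task.opts, here is a minimal sketch of a job config fragment. The heap size and the choice of flags are assumptions based on the numbers in your mail (8~10GB per process, Parallel Scavenge + Parallel Old on Java 8), not a recommendation or a tested setup:

```properties
# Illustrative only: JVM options passed to each Samza container via task.opts.
# -Xmx10g is an assumed heap size taken from the 8~10GB figure in the question.
# -XX:+UseParallelGC / -XX:+UseParallelOldGC select Parallel Scavenge + Parallel Old on Java 8.
task.opts=-Xmx10g -XX:+UseParallelGC -XX:+UseParallelOldGC

# Alternative to compare against (Serial + Serial Old); enable one task.opts line at a time.
# task.opts=-Xmx10g -XX:+UseSerialGC
```

Whichever collector is tried, the heap has to fit within the memory granted to the container, and adding GC logging flags to the same option would make it easier to compare throughput between runs.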
@Tao, do you have any comments on this?

-Yi

On Sun, Jan 31, 2016 at 7:58 PM, Liu Bo <diabl...@gmail.com> wrote:

> Hi group
>
> We are trying to migrate our current streaming pipeline to Samza. Our
> pipeline has several NLP modules, such as segment, POS, and a lot of score
> calculation. Each process normally needs 8~10GB memory.
>
> Our goal is high throughput, so we use Parallel Scavenge + Parallel Old in
> our current setup. We've tried G1 in Java 8 U65; it's not so good for
> throughput.
>
> My question is: since Samza is designed for one core, does it mean that
> Serial + Serial Old is the best garbage collector for Samza? On paper
> serial is more efficient.
>
> If it's not, could someone share your experience on Samza GC tuning for
> discussion? Thanks in advance.
>
> --
> All the best
>
> Liu Bo
>