Hi, Bo,

That's an interesting question. Since the task.opts option lets users set
whatever GC configuration they prefer for their Samza jobs, we don't really
have a "recommended" GC. The best choice probably also depends on the
application's usage pattern. Our perf partner Tao Feng @LinkedIn may have
more insights.
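
For illustration only, here is a rough sketch of how a collector could be
selected through task.opts in the job's properties file. The heap size and
the choice of flags are assumptions picked to match the numbers in the
question (Java 8 HotSpot, roughly 10 GB per process), not a recommendation:

    # Hypothetical example: parallel young + old collectors with a 10 GB heap
    task.opts=-Xmx10g -XX:+UseParallelGC -XX:+UseParallelOldGC

    # Alternative to compare against: serial young + old collectors
    # task.opts=-Xmx10g -XX:+UseSerialGC

Whichever variant is tried, measuring throughput and pause times on the
actual workload is probably the only reliable way to choose.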

@Tao, do you have any comments on this?

-Yi

On Sun, Jan 31, 2016 at 7:58 PM, Liu Bo <diabl...@gmail.com> wrote:

> Hi group
>
> We are trying to migrate our current streaming pipeline to Samza. Our
> pipeline has several NLP modules, such as segmentation and POS tagging,
> plus a lot of score calculation. Each process normally needs 8-10 GB of
> memory.
>
> Our goal is high throughput, so we use Parallel Scavenge + Parallel Old in
> our current setup. We've tried G1 on Java 8u65, but its throughput was not
> as good.
>
> My question is: since Samza is designed for one core, does that mean
> Serial + Serial Old is the best garbage collector for Samza? On paper,
> serial is more efficient.
>
> If it's not, could someone share their experience with Samza GC tuning
> for discussion? Thanks in advance.
>
> --
> All the best
>
> Liu Bo
>
