Hi Bo,

I don't think an application would usually want the Serial GC, which is
intended mainly for single-processor machines. With 8~10 GB of memory,
the stop-the-world (STW) pauses under the Serial GC could be quite long.
Even though Samza is designed for one core, as you mentioned, if the
upstream message rate is high you would still need multiple Samza
containers consuming the upstream data to avoid message lag (falling
behind the producer's offset). If full GCs of ~10 s each happen
frequently, the achievable QPS could be very low, and I imagine it would
be hard for the Samza job to keep up with the upstream message rate.
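
One rough way to compare collectors on your own workload is to read the
JVM's standard GC MXBeans and see what share of uptime goes to
collection. The sketch below is only an illustration: the class name
GcOverhead and the helper pauseFraction are made up for this example,
and getCollectionTime() is an approximate figure that matches pause
time only for stop-the-world collectors such as Serial and Parallel.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class, not part of Samza.
class GcOverhead {
    // Share of wall-clock time spent in GC, clamped to [0, 1].
    static double pauseFraction(long gcTimeMs, long uptimeMs) {
        if (uptimeMs <= 0) {
            return 0.0;
        }
        return Math.min(1.0, Math.max(0L, gcTimeMs) / (double) uptimeMs);
    }

    public static void main(String[] args) {
        long totalGcMs = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long timeMs = gc.getCollectionTime(); // -1 if the collector does not report it
            System.out.println(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + timeMs);
            if (timeMs > 0) {
                totalGcMs += timeMs;
            }
        }
        long uptimeMs = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.println("GC share of uptime: " + pauseFraction(totalGcMs, uptimeMs));
    }
}
```

Logging the same numbers from inside the real job, once with
-XX:+UseSerialGC and once with -XX:+UseParallelGC, should show whether
the serial collector's pauses are affordable at your message rate.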

Thanks,
-Tao

On Sun, Jan 31, 2016 at 7:58 PM, Liu Bo <diabl...@gmail.com> wrote:

> Hi group
>
> We are trying to migrate our current streaming pipeline to samza. Our
> pipeline has several NLP modules, such as segment, POS, and a lot of score
> calculation. Each process normally needs 8~10GB memory.
>
> Our goal is high throughput so we use Parallel Scavenge + Parallel Old in
> our current setup. We've tried G1 in Java 8 U65, it's not so good for
> throughput.
>
> My question is since samza is designed for one core, does it mean that
> Serial + Serial Old is the best garbage collector for samza? On paper
> serial is more efficient.
>
> If it's not, could someone share your experience on samza GC tuning for
> discussion? Thanks in advance.
>
> --
> All the best
>
> Liu Bo
>
