Hi group,

We are trying to migrate our current streaming pipeline to Samza. Our pipeline has several NLP modules, such as segmentation and POS tagging, plus a lot of score calculation. Each process normally needs 8~10 GB of memory.
Our goal is high throughput, so we use Parallel Scavenge + Parallel Old in our current setup. We've tried G1 on Java 8u65, but it did not do so well on throughput.

My question is: since Samza is designed for one core, does that mean Serial + Serial Old is the best garbage collector for Samza? On paper, serial is more efficient in that case. If it's not, could someone share their experience with Samza GC tuning for discussion?

Thanks in advance.

--
All the best

Liu Bo
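
P.S. In case it helps the discussion, here is a minimal sketch of how the collectors could be compared on throughput. It only uses the standard `java.lang.management` API; the class name `GcReport` and its helper methods are made up for illustration and are not part of Samza. The idea is to run the same workload once per collector (for example with `-XX:+UseSerialGC`, `-XX:+UseParallelGC -XX:+UseParallelOldGC`, or `-XX:+UseG1GC`) and look at the fraction of uptime not spent in GC. As far as I understand, such JVM flags can be passed to Samza containers through the `task.opts` setting, but please correct me if that is wrong.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of Samza: reports which collectors are
// active in this JVM and roughly how much time they have consumed.
class GcReport {

    // One line per collector: name, number of collections, accumulated time.
    static String describe() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(" count=").append(gc.getCollectionCount())
              .append(" timeMs=").append(gc.getCollectionTime())
              .append('\n');
        }
        return sb.toString();
    }

    // Sum of accumulated collection time over all collectors.
    // getCollectionTime() returns -1 when undefined, so negatives are skipped.
    static long totalGcTimeMs() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // Fraction of uptime not spent in GC, clamped to [0, 1].
    static double throughput(long gcMs, long uptimeMs) {
        if (uptimeMs <= 0) {
            return 1.0;
        }
        double fraction = 1.0 - (double) gcMs / uptimeMs;
        return Math.max(0.0, Math.min(1.0, fraction));
    }

    public static void main(String[] args) {
        long uptimeMs = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.print(describe());
        System.out.println("throughput=" + throughput(totalGcTimeMs(), uptimeMs));
    }
}
```

For the serial and parallel collectors the reported time is stop-the-world time, so the number is a fair approximation of throughput; for G1 part of the work is concurrent, so the figure should be read with more caution.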