Hi,

I'm using Spark Streaming 1.3 on CDH5.1 in yarn-cluster mode. I've found
that YARN is killing the driver and executor processes because of excessive
memory use. Here's what I've tried:

1. Xmx is set to 512M and GC looks fine (one young GC roughly every 10s),
so the extra memory is not being used by the heap.
2. I set both memoryOverhead params (spark.yarn.driver.memoryOverhead and
spark.yarn.executor.memoryOverhead) to 1024 (the default is 384), but the
memory just keeps growing until it hits the limit.
3. The problem does not show up in low-throughput jobs, nor in standalone
mode.
4. The test job just receives messages from Kafka with a batch interval of
1, does some filtering and aggregation, and then prints to the executor logs
(sketched below), so the 'leak' is not coming from some third-party library.
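
In case it helps, the job is essentially the following. This is a minimal
sketch, not the exact code: I'm assuming the direct Kafka API, a 1-second
batch interval, and placeholder broker/topic names; the real job only
differs in the filter/aggregation logic.

import kafka.serializer.StringDecoder

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object KafkaMemoryTest {
  def main(args: Array[String]): Unit = {
    // Submitted roughly like this, per items 1 and 2 above:
    //   spark-submit --master yarn-cluster \
    //     --driver-memory 512m --executor-memory 512m \
    //     --conf spark.yarn.driver.memoryOverhead=1024 \
    //     --conf spark.yarn.executor.memoryOverhead=1024 \
    //     ...
    val conf = new SparkConf().setAppName("KafkaMemoryTest")
    val ssc = new StreamingContext(conf, Seconds(1)) // 1-second batches

    // Broker and topic names are placeholders.
    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")
    val topics = Set("test-topic")

    val messages = KafkaUtils.createDirectStream[
      String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topics)

    // Simple filtering and aggregation, then println on the executors so
    // the output lands in the executor logs.
    messages
      .map(_._2)
      .filter(_.nonEmpty)
      .map(line => (line.split(",")(0), 1L))
      .reduceByKey(_ + _)
      .foreachRDD(rdd => rdd.foreach(record => println(record)))

    ssc.start()
    ssc.awaitTermination()
  }
}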

I built Spark 1.3 myself, against the correct Hadoop version.

Any ideas would be appreciated.

Thanks.

-- 
Jerry
