Re: Spark ML's RandomForestClassifier OOM

2017-01-10 Thread Julio Antonio Soto de Vicente
No. I am running Spark on YARN on a 3-node testing cluster. My guess is that, given the number of splits produced by a hundred trees of depth 30 (which should be more than 100 * 2^30), either the executors or the driver die with an OOM while trying to store all the split metadata. I guess that the same

Re: OOM on yarn-cluster mode

2016-01-19 Thread Julio Antonio Soto de Vicente
Hi, I tried with --driver-memory 16G (more than enough to read a simple Parquet table), but the problem still persists. Everything works fine in yarn-client mode. -- Julio Antonio Soto de Vicente > On 19 Jan 2016, at 22:18, Saisai Shao <sai.sai.s...@gmail.com> wrote: > >
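For context, the flag being discussed is passed to spark-submit; in yarn-cluster mode the driver runs inside the YARN ApplicationMaster container, so its memory is governed by --driver-memory plus the driver memory overhead, while in yarn-client mode the driver runs in the local submitting JVM. A hedged sketch of the invocation (the application JAR and class names here are placeholders, not from the thread):

```shell
# yarn-cluster mode: driver lives in the YARN AM container.
# The container must fit driver memory + overhead, or YARN kills it.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --driver-memory 16G \
  --conf spark.driver.memoryOverhead=1600m \
  --class com.example.MyApp \
  my-app.jar

# yarn-client mode (the case that works in the message): the driver
# runs locally, so only the local JVM needs the 16G, not a YARN container.
spark-submit \
  --master yarn \
  --deploy-mode client \
  --driver-memory 16G \
  --class com.example.MyApp \
  my-app.jar
```

One common cause of a yarn-cluster-only OOM is a YARN container size cap (yarn.scheduler.maximum-allocation-mb) below driver memory plus overhead; that is a plausible avenue to check, not a confirmed diagnosis from this thread.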