Re: Spark ML's RandomForestClassifier OOM

2017-01-10 Thread Julio Antonio Soto de Vicente
> Hth
>
>> On 10 Jan 2017 10:07 am, "Julio Antonio Soto" <ju...@esbet.es> wrote:
>> Hi,
>>
>> I am running into OOM problems while training a Spark ML
>> RandomForestClassifier (maxDepth of 30, 32 maxBins, 100 trees).
>

Spark ML's RandomForestClassifier OOM

2017-01-10 Thread Julio Antonio Soto
Hi, I am running into OOM problems while training a Spark ML RandomForestClassifier (maxDepth of 30, 32 maxBins, 100 trees). My dataset is arguably pretty big given the executor count and size (8x5G), with approximately 20M rows and 130 features. The "fun fact" is that a single
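
For a rough sense of scale, the figures quoted in the message above can be turned into a back-of-envelope estimate. This is only a sketch: it assumes the 130 features are stored as dense 64-bit doubles and ignores JVM and Spark overhead, and the helper names are made up for illustration. It is not a diagnosis of the OOM.

```python
# Back-of-envelope sizing using the figures quoted in the post:
# 20M rows, 130 features, 8 executors x 5 GB, maxDepth=30, 100 trees.
# Everything else here is an assumption made for illustration.

ROWS = 20_000_000
FEATURES = 130
BYTES_PER_DOUBLE = 8  # assumption: dense features stored as 64-bit doubles

EXECUTORS = 8
EXECUTOR_MEM_GB = 5

MAX_DEPTH = 30
NUM_TREES = 100


def dense_dataset_gb(rows, features, bytes_per_value=BYTES_PER_DOUBLE):
    """Raw size of a dense feature matrix in decimal GB, ignoring overhead."""
    return rows * features * bytes_per_value / 1e9


def max_nodes_per_tree(max_depth):
    """Upper bound on nodes in a binary tree whose root sits at depth 0."""
    return 2 ** (max_depth + 1) - 1


dataset_gb = dense_dataset_gb(ROWS, FEATURES)
cluster_gb = EXECUTORS * EXECUTOR_MEM_GB
node_bound = max_nodes_per_tree(MAX_DEPTH)

print(f"dense feature data:      {dataset_gb:.1f} GB")
print(f"total executor memory:   {cluster_gb} GB")
print(f"node bound per tree:     {node_bound:,}")
print(f"node bound for forest:   {NUM_TREES * node_bound:,}")
```

Under these assumptions the raw feature data alone is about 20.8 GB against 40 GB of total executor memory, and a depth of 30 allows up to 2^31 - 1 nodes per tree. The node bound is loose, since a tree cannot have more leaves than training rows, but it shows how quickly a deep forest can grow.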

OOM on yarn-cluster mode

2016-01-19 Thread Julio Antonio Soto
Spark 1.5.2 and YARN from Hadoop 2.6.0-cdh5.5.1. Any help would be greatly appreciated! Thank you. -- Julio Antonio Soto de Vicente

Re: OOM on yarn-cluster mode

2016-01-19 Thread Julio Antonio Soto de Vicente
Hi, I tried with --driver-memory 16G (more than enough to read a simple parquet table), but the problem persists. Everything works fine in yarn-client mode.
--
Julio Antonio Soto de Vicente
> On 19 Jan 2016, at 22:18, Saisai Shao <sai.sai.s...@gmail.com> wrote:
>
>