Java Heap Space Error

2018-02-16 Thread Vinay Muttineni
Hello, I am trying to debug a PySpark program and quite frankly, I am stumped. I see the following error in the logs. I verified the input parameters - all appear to be in order. Driver and executors appear to be proper - about 3MB of 7GB being used on each node. I do see that the DAG plan that

OOM error with GMMs on 4GB dataset

2015-05-04 Thread Vinay Muttineni
Hi, I am training a GMM with 10 Gaussians on a 4 GB dataset (720,000 * 760). The Spark (1.3.1) job is allocated 120 executors with 6GB each and the driver also has 6GB. Spark Config Params: .set(spark.hadoop.validateOutputSpecs, false).set(spark.dynamicAllocation.enabled,