Re: OutOfMemory with wide (289 column) dataframe

2016-04-01 Thread ludflu
This was a big help! For the benefit of my fellow travelers running Spark on EMR: I made a JSON file with the following:

    [
      {
        "Classification": "yarn-site",
        "Properties": {
          "yarn.nodemanager.pmem-check-enabled": "false",
          "yarn.nodemanager.vmem-check-enabled": "false"
        }
      }
    ]

and then I created my
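(A sketch of how a configuration file like this is typically applied on EMR, assuming it was saved as `yarn-config.json`; the file name, cluster name, and instance settings here are illustrative, not from the original post:)

    # Pass the classification file when creating the cluster.
    # EMR merges these properties into yarn-site.xml on every node,
    # disabling YARN's physical/virtual memory checks.
    aws emr create-cluster \
        --name "spark-cluster" \
        --release-label emr-4.4.0 \
        --applications Name=Spark \
        --instance-type m3.xlarge \
        --instance-count 3 \
        --configurations file://yarn-config.json

Disabling the pmem/vmem checks stops YARN from killing executors whose resident or virtual memory briefly exceeds the container limit, which is a common cause of seemingly random executor losses under memory pressure.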

OutOfMemory with wide (289 column) dataframe

2016-03-31 Thread ludflu
I'm building a Spark job against Spark 1.6.0 / EMR 4.4 in Scala. I'm attempting to concatenate a bunch of dataframe columns and then explode them into new rows (just using the built-in concat and explode functions). It works great in my unit test, but I get out-of-memory errors when I run against my
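(A minimal sketch of the concat-then-explode pattern described above, using a hypothetical three-column dataframe in place of the 289-column one; shown with a modern SparkSession for a self-contained example, whereas the original post targeted Spark 1.6. The column names and delimiter are assumptions, not taken from the post:)

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{concat_ws, explode, split}

    object WideConcatExplode {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("concat-explode-sketch")
          .master("local[*]")
          .getOrCreate()
        import spark.implicits._

        // Hypothetical narrow stand-in for the wide (289-column) dataframe.
        val df = Seq(("a", "b", "c"), ("x", "y", "z")).toDF("c1", "c2", "c3")

        // Join all columns into one delimited string, split it back into
        // an array, and explode the array into one row per column value.
        val exploded = df
          .withColumn("joined", concat_ws(",", df.columns.map(df(_)): _*))
          .withColumn("value", explode(split($"joined", ",")))
          .select("value")

        exploded.show()  // one row per original cell value
        spark.stop()
      }
    }

Note that exploding a 289-element array multiplies the row count by 289, so the intermediate data can balloon well past what the input size suggests; that amplification is one plausible contributor to the out-of-memory behavior described here.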