Re: OutOfMemory with wide (289 column) dataframe

2016-04-01 Thread Ted Yu
a gzip'd json file. gzip files are NOT splittable, so it wasn't properly > parallelized, which means that the joins were causing a lot of memory > pressure. I recompressed it with bzip2 and my job has been running with no > errors. > > Thanks again!

Re: OutOfMemory with wide (289 column) dataframe

2016-04-01 Thread ludflu
gzip'd json file. gzip files are NOT splittable, so it wasn't properly parallelized, which means that the joins were causing a lot of memory pressure. I recompressed it with bzip2 and my job has been running with no errors. Thanks again!
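The fix described above (recompressing a gzip'd JSON file as bzip2 so Spark can split it into multiple input partitions) can be sketched in plain Python. This is a hypothetical standalone example, not code from the thread; the file names and chunk size are made up for illustration:

```python
import bz2
import gzip
import json
import os
import tempfile

def gzip_to_bzip2(src_gz, dst_bz2, chunk_size=1 << 20):
    """Recompress a gzip file as bzip2 by streaming decompressed chunks.

    bzip2 output is block-based, so splittable input formats (e.g. in
    Hadoop/Spark) can divide it across tasks; a single gzip stream cannot
    be split and is read by one task.
    """
    with gzip.open(src_gz, "rb") as fin, bz2.open(dst_bz2, "wb") as fout:
        for chunk in iter(lambda: fin.read(chunk_size), b""):
            fout.write(chunk)

# Demo with a small JSON-lines file in a temp directory.
tmpdir = tempfile.mkdtemp()
src = os.path.join(tmpdir, "data.json.gz")
dst = os.path.join(tmpdir, "data.json.bz2")

records = [json.dumps({"id": i}) for i in range(100)]
with gzip.open(src, "wt") as f:
    f.write("\n".join(records))

gzip_to_bzip2(src, dst)

with bz2.open(dst, "rt") as f:
    assert f.read().splitlines() == records
```

In Spark itself, both formats are read the same way (e.g. `sqlContext.read.json(path)` in the 1.x API the thread predates 2.0); the difference is only how many input splits, and therefore tasks, the file yields.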

OutOfMemory with wide (289 column) dataframe

2016-03-31 Thread ludflu
B? Any words of wisdom would be really appreciated! -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/OutOfMemory-with-wide-289-column-dataframe-tp26651.html