You can take a look at https://issues.apache.org/jira/browse/SPARK-12837


Yong

Spark driver requires large memory space for serialized 
...<https://issues.apache.org/jira/browse/SPARK-12837>
Executing a SQL statement with a large number of partitions requires a large 
amount of driver memory, even when there is no request to collect data back to 
the driver.
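As a workaround, two things commonly help: raising spark.driver.maxResultSize, or reducing the number of tasks (and hence per-task result statuses the driver must hold) by coalescing before the write. A minimal sketch, assuming Spark 2.x; the session setup, input path, and partition count are illustrative, not from the original thread:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical session for illustration only.
val spark = SparkSession.builder()
  .appName("save-example")
  // Raise the cap on serialized task results; "0" disables the limit entirely.
  .config("spark.driver.maxResultSize", "2g")
  .getOrCreate()

// Placeholder input; substitute your own dataset.
val mydataset = spark.read.parquet("/path/to/input")

// Fewer partitions means fewer tasks, so less serialized result
// metadata accumulates on the driver during the save.
mydataset.coalesce(200).write.csv("outputlocation")
```

Coalescing trades write parallelism for lower driver overhead, so the partition count is a tuning choice, not a fixed recommendation.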




________________________________
From: Bahubali Jain <bahub...@gmail.com>
Sent: Thursday, March 16, 2017 1:39 PM
To: user@spark.apache.org
Subject: Dataset : Issue with Save

Hi,
While saving a dataset using mydataset.write().csv("outputlocation"), I am 
running into an exception:

"Total size of serialized results of 3722 tasks (1024.0 MB) is bigger than 
spark.driver.maxResultSize (1024.0 MB)"

Does it mean that, to save a dataset, the whole of the dataset's contents is 
sent to the driver, similar to a collect() action?

Thanks,
Baahu
