Re: OutOfMemory error with Spark ML 1.5 logreg example

2015-09-07 Thread Zoltán Tóth
Aaand, the error! :)

Exception in thread "org.apache.hadoop.hdfs.PeerCache@4e000abf" Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "org.apache.hadoop.hdfs.PeerCache@4e000abf"
Exception in thread "Thread-7" Exception: java.lang.OutOfMemoryError thrown

Re: OutOfMemory error with Spark ML 1.5 logreg example

2015-09-07 Thread Zoltán Zvara
Hey, I'd try to debug and profile ResolvedDataSource. As far as I know, your write will be performed by the JVM.

On Mon, Sep 7, 2015 at 4:11 PM Tóth Zoltán wrote:
> Unfortunately I'm getting the same error:
> The other interesting things are that:
> - the parquet files got

Re: OutOfMemory error with Spark ML 1.5 logreg example

2015-09-07 Thread boci
Hi,

Can you try using the save method instead of write? e.g.: out_df.save("path", "parquet")

b0c1
--
Skype: boci13, Hangout: boci.b...@gmail.com

On Mon, Sep 7, 2015 at

Re: OutOfMemory error with Spark ML 1.5 logreg example

2015-09-07 Thread Tóth Zoltán
Unfortunately I'm getting the same error. The other interesting things are that:
- the parquet files actually got written to HDFS (also with .write.parquet())
- the application gets stuck in the RUNNING state for good even after the error is thrown

15/09/07 10:01:10 INFO spark.ContextCleaner:

Re: OutOfMemory error with Spark ML 1.5 logreg example

2015-09-07 Thread Zsolt Tóth
Hi,

I ran your example on Spark 1.4.1 and 1.5.0-rc3. It succeeds on 1.4.1 but throws the OOM on 1.5.0. Do any of you know which PR introduced this issue?

Zsolt

2015-09-07 16:33 GMT+02:00 Zoltán Zvara :
> Hey, I'd try to debug, profile ResolvedDataSource. As far as I

Re: OutOfMemory error in Spark Core

2015-01-15 Thread Akhil Das
Did you try increasing the parallelism?

Thanks
Best Regards

On Fri, Jan 16, 2015 at 10:41 AM, Anand Mohan chinn...@gmail.com wrote:
We have our Analytics App built on Spark 1.1 Core, Parquet, Avro and Spray. We are using the Kryo serializer for the Avro objects read from Parquet and we are using

Re: OutOfMemory Error

2014-08-20 Thread MEETHU MATHEW
Hi,

How do I increase the heap size? And what is the difference between the Spark executor memory and the heap size?

Thanks & Regards,
Meethu M

On Monday, 18 August 2014 12:35 PM, Akhil Das ak...@sigmoidanalytics.com wrote:
I believe spark.shuffle.memoryFraction is the one you are looking for.

RE: OutOfMemory Error

2014-08-20 Thread Shao, Saisai
/configuration.html

Thanks
Jerry

From: MEETHU MATHEW [mailto:meethu2...@yahoo.co.in]
Sent: Wednesday, August 20, 2014 4:48 PM
To: Akhil Das; Ghousia
Cc: user@spark.apache.org
Subject: Re: OutOfMemory Error

Hi , How to increase the heap size? What is the difference between spark executor memory and heap

Re: OutOfMemory Error

2014-08-19 Thread Ghousia
Hi, any further info on this? Do you think it would be useful to implement an in-memory buffer that stores the contents of the new RDD? When the buffer reaches a configured threshold, its contents would be spilled to the local disk. This would save us from the OutOfMemory error.
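The spill-to-disk buffer proposed above could be sketched in plain Python roughly as follows. This is only an illustration of the idea, not anything Spark provides; the class name, threshold parameter, and pickle-based spill format are all assumptions for the sketch:

```python
import os
import pickle
import tempfile

class SpillableBuffer:
    """Holds items in memory up to a configured threshold, then spills to local disk."""

    def __init__(self, max_items=1000):
        self.max_items = max_items   # configured threshold
        self.items = []              # in-memory buffer
        self.spill_files = []        # paths of spilled chunks on local disk

    def add(self, item):
        self.items.append(item)
        if len(self.items) >= self.max_items:
            self._spill()

    def _spill(self):
        # Write the current buffer to a temp file and release the memory.
        fd, path = tempfile.mkstemp(suffix=".spill")
        with os.fdopen(fd, "wb") as f:
            pickle.dump(self.items, f)
        self.spill_files.append(path)
        self.items = []

    def __iter__(self):
        # Replay spilled chunks first, then whatever is still in memory.
        for path in self.spill_files:
            with open(path, "rb") as f:
                yield from pickle.load(f)
        yield from self.items

buf = SpillableBuffer(max_items=3)
for i in range(8):
    buf.add(i)
print(list(buf))              # -> [0, 1, 2, 3, 4, 5, 6, 7]
print(len(buf.spill_files))   # -> 2 (two chunks of 3 were spilled)
```

A real implementation inside Spark would of course need to integrate with the block manager and serializers, but the shape of the idea is the same.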

Re: OutOfMemory Error

2014-08-18 Thread Akhil Das
Hi Ghousia,

You can try the following:
1. Increase the heap size: https://spark.apache.org/docs/0.9.0/configuration.html
2. Increase the number of partitions: http://stackoverflow.com/questions/21698443/spark-best-practice-for-retrieving-big-data-from-rdd-to-local-machine
3. You could try
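For suggestions 1 and 2 above, a sketch of how they are typically applied at submit time (the memory sizes and partition count below are placeholder values, not recommendations):

```shell
# 1. Increase the heap: executor/driver memory sets the JVM heap (-Xmx)
#    for those processes.
spark-submit \
  --driver-memory 4g \
  --executor-memory 8g \
  my_job.py

# 2. Increase the number of partitions, either globally in
#    conf/spark-defaults.conf:
#      spark.default.parallelism   200
#    or per-RDD at the call site:
#      rdd.repartition(200)
```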

Re: OutOfMemory Error

2014-08-18 Thread Ghousia
Thanks for the answer, Akhil. We are currently working around this issue by increasing the number of partitions, and we are persisting RDDs with DISK_ONLY. But the issue remains with heavy computations within an RDD. It would be better if we had the option of spilling the intermediate transformation

Re: OutOfMemory Error

2014-08-18 Thread Akhil Das
I believe spark.shuffle.memoryFraction is the one you are looking for.

spark.shuffle.memoryFraction: Fraction of Java heap to use for aggregation and cogroups during shuffles, if spark.shuffle.spill is true. At any given time, the collective size of all in-memory maps used for shuffles is
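For reference, a sketch of how that setting would be applied (the 0.4 below is an arbitrary example value; the default for spark.shuffle.memoryFraction in Spark of that era was 0.2):

```shell
# In conf/spark-defaults.conf:
#   spark.shuffle.memoryFraction   0.4
#   spark.shuffle.spill            true

# Or per submission:
spark-submit --conf spark.shuffle.memoryFraction=0.4 my_job.py
```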