You also need to check the memory you give to the Spark driver
(the spark.driver.memory property).
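Note that when Spark runs through spark-submit, ZEPPELIN_INTP_MEM may not size the driver JVM; one way to pass spark.driver.memory is through SPARK_SUBMIT_OPTIONS in the zeppelin-env file (or set spark.driver.memory directly in the Spark interpreter settings). A minimal sketch for zeppelin-env.cmd; the 4g value is illustrative, not a recommendation:

```shell
REM In zeppelin-env.cmd (on Linux, use `export` in zeppelin-env.sh instead of `set`).
REM Passes driver memory to spark-submit when Zeppelin launches the Spark interpreter.
set SPARK_SUBMIT_OPTIONS="--driver-memory 4g"
```

After changing this, restart the Spark interpreter so the new driver JVM picks up the setting.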
On 26/03/17 07:40, RUSHIKESH RAUT wrote:
Yes, I know it is inevitable if the data is large. I want to know how to
increase the interpreter memory to handle large data.
Thanks,
Rushikesh Raut
On Mar 26, 2017 8:56 AM, "Jianfeng (Jeff) Zhang" <[email protected]> wrote:
How large is your data? This problem is inevitable if your data is
too large; you can try to use a Spark DataFrame if that works for you.
Best Regards,
Jeff Zhang
From: RUSHIKESH RAUT <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Saturday, March 25, 2017 at 5:06 PM
To: "[email protected]" <[email protected]>
Subject: Zeppelin out of memory issue - (GC overhead limit exceeded)
Hi everyone,
I am trying to load some data from a Hive table into my notebook and
then convert this DataFrame into an R data frame using the spark.r
interpreter. This works perfectly for a small amount of data, but if
the data size is increased it gives me the error:
java.lang.OutOfMemoryError: GC overhead limit exceeded
I have tried increasing ZEPPELIN_MEM and ZEPPELIN_INTP_MEM in
the zeppelin-env.cmd file, but I am still facing this issue. I have
used the following configuration:
set ZEPPELIN_MEM="-Xms4096m -Xmx4096m -XX:MaxPermSize=2048m"
set ZEPPELIN_INTP_MEM="-Xmx4096m -Xms4096m -XX:MaxPermSize=2048m"
I am sure that this much memory should be sufficient for my data, but
I am still getting the same error. Any guidance will be much
appreciated.
Thanks,
Rushikesh Raut