Re: 1.4.0 regression: out-of-memory errors on small data

2015-07-06 Thread Denny Lee
...@servian.com.au, user user@spark.apache.org Subject: Re: 1.4.0 regression: out-of-memory errors on small data I have never seen an issue like this. Setting the PermGen size to 256m should solve the problem. Can you send me your test file and the command used to launch the Spark shell or your application?

Re: 1.4.0 regression: out-of-memory errors on small data

2015-07-06 Thread Yin Huai
Date: Monday, July 6, 2015 at 11:41 AM To: Denny Lee denny.g@gmail.com Cc: Simeon Simeonov s...@swoop.com, Andy Huang andy.hu...@servian.com.au, user user@spark.apache.org Subject: Re: 1.4.0 regression: out-of-memory errors on small data Hi Sim, I think the right way to set the PermGen

Re: 1.4.0 regression: out-of-memory errors on small data

2015-07-06 Thread Yin Huai
, Andy Huang andy.hu...@servian.com.au, user user@spark.apache.org Subject: Re: 1.4.0 regression: out-of-memory errors on small data I have never seen an issue like this. Setting the PermGen size to 256m should solve the problem. Can you send me your test file and the command used to launch the spark

Re: 1.4.0 regression: out-of-memory errors on small data

2015-07-06 Thread Simeon Simeonov
...@servian.com.au, user user@spark.apache.org Subject: Re: 1.4.0 regression: out-of-memory errors on small data Hi Sim, I think the right way to set the PermGen size is through driver extra JVM options, i.e. --conf

Re: 1.4.0 regression: out-of-memory errors on small data

2015-07-05 Thread Andy Huang
We have hit the same issue in the Spark shell when registering a temp table. We observed it happening with those who had JDK 6. The problem went away after installing JDK 8. This was only for the tutorial materials, which were about loading a Parquet file. Regards Andy On Sat, Jul 4, 2015 at 2:54 AM,

Re: 1.4.0 regression: out-of-memory errors on small data

2015-07-05 Thread Denny Lee
I had run into the same problem where everything was working swimmingly with Spark 1.3.1. When I switched to Spark 1.4, either upgrading to Java 8 (from Java 7) or bumping up the PermGen size solved my issue. HTH! On Mon, Jul 6, 2015 at 8:31 AM Andy Huang andy.hu...@servian.com.au
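Both workarounds mentioned in this thread attack the same limit: on JDK 7 and earlier, class metadata lives in a fixed-size PermGen space, while Java 8 removed PermGen in favor of a natively sized Metaspace (so `-XX:MaxPermSize` no longer applies). A quick way to confirm which JDK the shell will pick up (output format varies by vendor):

```shell
# Print the JVM version the shell would use; PermGen (and -XX:MaxPermSize)
# only exists on Java 7 and earlier.
java -version 2>&1 | head -n 1
```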

Re: 1.4.0 regression: out-of-memory errors on small data

2015-07-05 Thread Simeon Simeonov
...@swoop.com Cc: Denny Lee denny.g@gmail.com, Andy Huang andy.hu...@servian.com.au, user user@spark.apache.org Subject: Re: 1.4.0 regression: out-of-memory errors on small data I have never

Re: 1.4.0 regression: out-of-memory errors on small data

2015-07-05 Thread Simeon Simeonov
-of-memory errors on small data Sim, Can you increase the PermGen size? Please let me know what your setting is when the problem disappears. Thanks, Yin On Sun, Jul 5, 2015 at 5:59 PM, Denny Lee denny.g@gmail.com wrote: I had run into the same problem where

Re: 1.4.0 regression: out-of-memory errors on small data

2015-07-05 Thread Yin Huai
@gmail.com Cc: Andy Huang andy.hu...@servian.com.au, Simeon Simeonov s...@swoop.com, user user@spark.apache.org Subject: Re: 1.4.0 regression: out-of-memory errors on small data Sim, Can you increase the PermGen size? Please let me know what your setting is when the problem disappears

Re: 1.4.0 regression: out-of-memory errors on small data

2015-07-05 Thread Yin Huai
Sim, Can you increase the PermGen size? Please let me know what your setting is when the problem disappears. Thanks, Yin On Sun, Jul 5, 2015 at 5:59 PM, Denny Lee denny.g@gmail.com wrote: I had run into the same problem where everything was working swimmingly with Spark 1.3.1. When I

Re: 1.4.0 regression: out-of-memory errors on small data

2015-07-02 Thread Yin Huai
Hi Sim, It seems you already set the PermGen size to 256m, right? I notice that in your shell, you created a HiveContext (it further increased the memory consumption on PermGen). But the Spark shell has already created a HiveContext for you (sqlContext). You can use asInstanceOf to access
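Yin's suggestion can be sketched as follows in a 1.4.0 spark-shell session (a sketch, assuming the shell's predefined `sqlContext` is a HiveContext as the thread states; the query is a placeholder):

```scala
// In spark-shell 1.4.0 the predefined sqlContext is already a HiveContext;
// casting and reusing it avoids loading Hive's classes a second time,
// which is what inflates PermGen usage.
import org.apache.spark.sql.hive.HiveContext

val hiveCtx = sqlContext.asInstanceOf[HiveContext]
hiveCtx.sql("SHOW TABLES").show() // placeholder query
```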

Re: 1.4.0 regression: out-of-memory errors on small data

2015-07-02 Thread Simeon Simeonov
Date: Thursday, July 2, 2015 at 4:34 PM To: Simeon Simeonov s...@swoop.com Cc: user user@spark.apache.org Subject: Re: 1.4.0 regression: out-of-memory errors on small data Hi Sim, It seems you already set the PermGen size to 256m, right? I notice

1.4.0 regression: out-of-memory errors on small data

2015-07-02 Thread sim
A very simple Spark SQL COUNT operation succeeds in spark-shell for 1.3.1 and fails with a series of out-of-memory errors in 1.4.0. This gist https://gist.github.com/ssimeonov/a49b75dc086c3ac6f3c4 includes the code and the full output from the 1.3.1 and 1.4.0 runs, including the command line
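The failing operation described above might look like this minimal sketch (the file name and exact code are placeholders; the real repro is in the linked gist):

```scala
// Hypothetical shape of the repro per the thread: load a small file,
// register it, and run a COUNT -- reportedly fine on 1.3.1 but OOMs
// on 1.4.0 under the default PermGen size.
val df = sqlContext.jsonFile("data.json") // placeholder input file
df.registerTempTable("t")
sqlContext.sql("SELECT COUNT(*) FROM t").show()
```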

Re: 1.4.0 regression: out-of-memory errors on small data

2015-07-02 Thread Yin Huai
Hi Sim, Spark 1.4.0's memory consumption on PermGen is higher than Spark 1.3's (explained in https://issues.apache.org/jira/browse/SPARK-8776). Can you add --conf spark.driver.extraJavaOptions=-XX:MaxPermSize=256m in the command you used to launch the Spark shell? This will increase the PermGen size
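Concretely, the suggested launch line would look like this (256m is the thread's suggested starting value, a tuning knob rather than a fixed requirement):

```shell
# Raise the driver's PermGen cap before starting the 1.4.0 shell; the
# HiveContext's class loading otherwise exhausts the default PermGen.
spark-shell --conf "spark.driver.extraJavaOptions=-XX:MaxPermSize=256m"
```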