You meant "SPARK_REPL_OPTS"? I did a quick search; it looks like it was removed in 1.0, so I don't think it affected the shell's behavior.
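For the archive, the invocation that worked in this thread passes the PermGen setting through driver extra JVM options instead of the env var. A minimal sketch based on Sim's original command (the binary path and spark-csv version are from his setup; adjust for yours):

spark-1.4.0-bin-hadoop2.6/bin/spark-shell \
  --conf "spark.driver.extraJavaOptions=-XX:MaxPermSize=256m" \
  --packages com.databricks:spark-csv_2.10:1.0.3 \
  --driver-memory 4g --executor-memory 4g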
On Mon, Jul 6, 2015 at 9:04 AM, Simeon Simeonov <s...@swoop.com> wrote:

> Yin, that did the trick.
>
> I'm curious what the effect of the environment variable was, however, as
> the behavior of the shell changed from hanging to quitting when the env
> var value got to 1g.
>
> /Sim
>
> Simeon Simeonov, Founder & CTO, Swoop <http://swoop.com/>
> @simeons <http://twitter.com/simeons> | blog.simeonov.com | 617.299.6746
>
>
> From: Yin Huai <yh...@databricks.com>
> Date: Monday, July 6, 2015 at 11:41 AM
> To: Denny Lee <denny.g....@gmail.com>
> Cc: Simeon Simeonov <s...@swoop.com>, Andy Huang <andy.hu...@servian.com.au>,
> user <user@spark.apache.org>
> Subject: Re: 1.4.0 regression: out-of-memory errors on small data
>
> Hi Sim,
>
> I think the right way to set the PermGen size is through driver extra
> JVM options, i.e.
>
> --conf "spark.driver.extraJavaOptions=-XX:MaxPermSize=256m"
>
> Can you try it? Without this conf, your driver's PermGen size is still
> 128m.
>
> Thanks,
>
> Yin
>
> On Mon, Jul 6, 2015 at 4:07 AM, Denny Lee <denny.g....@gmail.com> wrote:
>
>> I went ahead and tested your file; the results can be seen in the gist:
>> https://gist.github.com/dennyglee/c933b5ae01c57bd01d94.
>>
>> Basically, when running {Java 7, MaxPermSize = 256} or {Java 8,
>> default}, the query ran without any issues. I was able to recreate the
>> issue with {Java 7, default}. I included the commands I used to start
>> the spark-shell, but basically I just used all defaults (no alteration
>> to driver or executor memory); the only additional flag was
>> --driver-class-path, to connect to a MySQL Hive metastore. This is on
>> an OS X MacBook Pro.
>>
>> One thing I did notice is that your version of Java 7 is version 51
>> while mine is version 79. Could you see if updating to Java 7 version
>> 79 perhaps allows you to use the MaxPermSize setting?
>>
>> On Mon, Jul 6, 2015 at 1:36 PM, Simeon Simeonov <s...@swoop.com> wrote:
>>
>>> The file is at
>>> https://www.dropbox.com/s/a00sd4x65448dl2/apache-spark-failure-data-part-00000.gz?dl=1
>>>
>>> The command was included in the gist:
>>>
>>> SPARK_REPL_OPTS="-XX:MaxPermSize=256m"
>>> spark-1.4.0-bin-hadoop2.6/bin/spark-shell --packages
>>> com.databricks:spark-csv_2.10:1.0.3 --driver-memory 4g --executor-memory 4g
>>>
>>> /Sim
>>>
>>> Simeon Simeonov, Founder & CTO, Swoop <http://swoop.com/>
>>> @simeons <http://twitter.com/simeons> | blog.simeonov.com | 617.299.6746
>>>
>>>
>>> From: Yin Huai <yh...@databricks.com>
>>> Date: Monday, July 6, 2015 at 12:59 AM
>>> To: Simeon Simeonov <s...@swoop.com>
>>> Cc: Denny Lee <denny.g....@gmail.com>, Andy Huang <andy.hu...@servian.com.au>,
>>> user <user@spark.apache.org>
>>> Subject: Re: 1.4.0 regression: out-of-memory errors on small data
>>>
>>> I have never seen an issue like this. Setting the PermGen size to 256m
>>> should solve the problem. Can you send me your test file and the
>>> command used to launch the spark shell or your application?
>>>
>>> Thanks,
>>>
>>> Yin
>>>
>>> On Sun, Jul 5, 2015 at 9:17 PM, Simeon Simeonov <s...@swoop.com> wrote:
>>>
>>>> Yin,
>>>>
>>>> With 512Mb PermGen, the process still hung and had to be kill -9ed.
>>>>
>>>> At 1Gb, the spark shell & associated processes stopped hanging and
>>>> started exiting with:
>>>>
>>>> scala> println(dfCount.first.getLong(0))
>>>> 15/07/06 00:10:07 INFO storage.MemoryStore: ensureFreeSpace(235040)
>>>> called with curMem=0, maxMem=2223023063
>>>> 15/07/06 00:10:07 INFO storage.MemoryStore: Block broadcast_2 stored as
>>>> values in memory (estimated size 229.5 KB, free 2.1 GB)
>>>> 15/07/06 00:10:08 INFO storage.MemoryStore: ensureFreeSpace(20184)
>>>> called with curMem=235040, maxMem=2223023063
>>>> 15/07/06 00:10:08 INFO storage.MemoryStore: Block broadcast_2_piece0
>>>> stored as bytes in memory (estimated size 19.7 KB, free 2.1 GB)
>>>> 15/07/06 00:10:08 INFO storage.BlockManagerInfo: Added
>>>> broadcast_2_piece0 in memory on localhost:65464 (size: 19.7 KB, free: 2.1 GB)
>>>> 15/07/06 00:10:08 INFO spark.SparkContext: Created broadcast 2 from
>>>> first at <console>:30
>>>> java.lang.OutOfMemoryError: PermGen space
>>>> Stopping spark context.
>>>> Exception in thread "main"
>>>> Exception: java.lang.OutOfMemoryError thrown from the
>>>> UncaughtExceptionHandler in thread "main"
>>>> 15/07/06 00:10:14 INFO storage.BlockManagerInfo: Removed
>>>> broadcast_2_piece0 on localhost:65464 in memory (size: 19.7 KB, free: 2.1 GB)
>>>>
>>>> That did not change up until 4Gb of PermGen space and 8Gb each for the
>>>> driver & executor.
>>>>
>>>> I stopped at this point because the exercise started looking silly. It
>>>> is clear that 1.4.0 is using memory in a substantially different manner.
>>>>
>>>> I'd be happy to share the test file so you can reproduce this in your
>>>> own environment.
>>>>
>>>> /Sim
>>>>
>>>> Simeon Simeonov, Founder & CTO, Swoop <http://swoop.com/>
>>>> @simeons <http://twitter.com/simeons> | blog.simeonov.com | 617.299.6746
>>>>
>>>>
>>>> From: Yin Huai <yh...@databricks.com>
>>>> Date: Sunday, July 5, 2015 at 11:04 PM
>>>> To: Denny Lee <denny.g....@gmail.com>
>>>> Cc: Andy Huang <andy.hu...@servian.com.au>, Simeon Simeonov <s...@swoop.com>,
>>>> user <user@spark.apache.org>
>>>> Subject: Re: 1.4.0 regression: out-of-memory errors on small data
>>>>
>>>> Sim,
>>>>
>>>> Can you increase the PermGen size? Please let me know what your
>>>> setting is when the problem disappears.
>>>>
>>>> Thanks,
>>>>
>>>> Yin
>>>>
>>>> On Sun, Jul 5, 2015 at 5:59 PM, Denny Lee <denny.g....@gmail.com> wrote:
>>>>
>>>>> I had run into the same problem: everything was working swimmingly
>>>>> with Spark 1.3.1, but when I switched to Spark 1.4, either upgrading
>>>>> to Java 8 (from Java 7) or bumping up the PermGen size solved my
>>>>> issue. HTH!
>>>>>
>>>>> On Mon, Jul 6, 2015 at 8:31 AM, Andy Huang <andy.hu...@servian.com.au>
>>>>> wrote:
>>>>>
>>>>>> We have hit the same issue in the spark shell when registering a
>>>>>> temp table. We observed it happening with those who had JDK 6; the
>>>>>> problem went away after installing JDK 8. This was only for the
>>>>>> tutorial materials, which were about loading a Parquet file.
>>>>>>
>>>>>> Regards,
>>>>>> Andy
>>>>>>
>>>>>> On Sat, Jul 4, 2015 at 2:54 AM, sim <s...@swoop.com> wrote:
>>>>>>
>>>>>>> @bipin, in my case the error happens immediately in a fresh shell
>>>>>>> in 1.4.0.
>>>>>>>
>>>>>>> --
>>>>>>> View this message in context:
>>>>>>> http://apache-spark-user-list.1001560.n3.nabble.com/1-4-0-regression-out-of-memory-errors-on-small-data-tp23595p23614.html
>>>>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>>>>>> For additional commands, e-mail: user-h...@spark.apache.org
>>>>>>
>>>>>> --
>>>>>> Andy Huang | Managing Consultant | Servian Pty Ltd | t: 02 9376 0700 |
>>>>>> f: 02 9376 0730 | m: 0433221979
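For anyone who lands on this thread later: if you'd rather not pass the flag on every launch, the same setting should also work from conf/spark-defaults.conf. A sketch, assuming a stock spark-1.4.0-bin-hadoop2.6 layout:

# conf/spark-defaults.conf
# Raise the driver's PermGen cap (the 1.4.0 default of 128m is what this
# thread kept blowing through on Java 7)
spark.driver.extraJavaOptions  -XX:MaxPermSize=256m

Note that Java 8 removed PermGen entirely (class metadata moved to Metaspace), so -XX:MaxPermSize is ignored there. That is consistent with Denny's results above, where {Java 8, default} ran clean and {Java 7, default} reproduced the error.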