Yin,

With 512MB of PermGen, the process still hung and had to be kill -9'ed.

At 1GB, the Spark shell & associated processes stopped hanging and started exiting with:

scala> println(dfCount.first.getLong(0))
15/07/06 00:10:07 INFO storage.MemoryStore: ensureFreeSpace(235040) called with curMem=0, maxMem=2223023063
15/07/06 00:10:07 INFO storage.MemoryStore: Block broadcast_2 stored as values in memory (estimated size 229.5 KB, free 2.1 GB)
15/07/06 00:10:08 INFO storage.MemoryStore: ensureFreeSpace(20184) called with curMem=235040, maxMem=2223023063
15/07/06 00:10:08 INFO storage.MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 19.7 KB, free 2.1 GB)
15/07/06 00:10:08 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on localhost:65464 (size: 19.7 KB, free: 2.1 GB)
15/07/06 00:10:08 INFO spark.SparkContext: Created broadcast 2 from first at <console>:30
java.lang.OutOfMemoryError: PermGen space
Stopping spark context.
Exception in thread "main"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "main"
15/07/06 00:10:14 INFO storage.BlockManagerInfo: Removed broadcast_2_piece0 on localhost:65464 in memory (size: 19.7 KB, free: 2.1 GB)

That did not change even with 4GB of PermGen space and 8GB each for the driver & executor.
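(For anyone reproducing this: here is roughly how such limits can be passed when launching the shell. The flags are standard spark-submit options; the sizes shown are illustrative, not a recommendation.)

./bin/spark-shell \
  --driver-memory 8g \
  --driver-java-options "-XX:MaxPermSize=4g" \
  --conf spark.executor.memory=8g \
  --conf spark.executor.extraJavaOptions=-XX:MaxPermSize=4g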

I stopped at this point because the exercise started looking silly. It is clear 
that 1.4.0 is using memory in a substantially different manner.

I'd be happy to share the test file so you can reproduce this in your own 
environment.

/Sim

Simeon Simeonov, Founder & CTO, Swoop <http://swoop.com/>
@simeons <http://twitter.com/simeons> | blog.simeonov.com <http://blog.simeonov.com/> | 617.299.6746


From: Yin Huai <yh...@databricks.com>
Date: Sunday, July 5, 2015 at 11:04 PM
To: Denny Lee <denny.g....@gmail.com>
Cc: Andy Huang <andy.hu...@servian.com.au>, Simeon Simeonov <s...@swoop.com>, user <user@spark.apache.org>
Subject: Re: 1.4.0 regression: out-of-memory errors on small data

Sim,

Can you increase the PermGen size? Please let me know what your setting is when the problem disappears.
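For example, something along these lines should work (512m is just a starting point):

./bin/spark-shell --driver-java-options "-XX:MaxPermSize=512m"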

Thanks,

Yin

On Sun, Jul 5, 2015 at 5:59 PM, Denny Lee <denny.g....@gmail.com> wrote:
I had run into the same problem, where everything was working swimmingly with Spark 1.3.1. When I switched to Spark 1.4, either upgrading to Java 8 (from Java 7) or knocking up the PermGen size solved my issue. HTH!
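In case it helps, the PermGen bump can also go in conf/spark-defaults.conf so every shell picks it up (sizes illustrative):

spark.driver.extraJavaOptions    -XX:MaxPermSize=256m
spark.executor.extraJavaOptions  -XX:MaxPermSize=256m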



On Mon, Jul 6, 2015 at 8:31 AM Andy Huang <andy.hu...@servian.com.au> wrote:
We have hit the same issue in the Spark shell when registering a temp table. We observed it happening for those who had JDK 6. The problem went away after installing JDK 8. This was only for the tutorial materials, which were about loading a Parquet file.
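A quick way to check which JDK a shell will use (Spark honors JAVA_HOME when it is set):

$ java -version
$ echo $JAVA_HOME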

Regards
Andy

On Sat, Jul 4, 2015 at 2:54 AM, sim <s...@swoop.com> wrote:
@bipin, in my case the error happens immediately in a fresh shell in 1.4.0.








--
Andy Huang | Managing Consultant | Servian Pty Ltd | t: 02 9376 0700 | f: 02 9376 0730 | m: 0433221979
