Solved the problem by specifying the PermGen size when submitting the job
(even setting it to just a few MB was enough).

It seems Java 8 has removed the Permanent Generation space, so the corresponding
JVM arguments should be ignored. Yet I can still use --driver-java-options
"-XX:PermSize=80M -XX:MaxPermSize=100m" to specify them when submitting the
Spark job, which is weird. I don't know whether it has anything to do with
py4j, as I am not familiar with it.
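
For reference, the submit command looked roughly like the sketch below. The
script name and the exact memory values are just placeholders, not something
from this thread, and passing the same flags through
spark.driver.extraJavaOptions with --conf should be an equivalent way to do it:

  # Hypothetical example: pass PermGen sizing to the driver JVM at submit time.
  # my_job.py and the memory values are placeholders.
  spark-submit \
    --master yarn \
    --deploy-mode cluster \
    --driver-java-options "-XX:PermSize=80M -XX:MaxPermSize=100m" \
    my_job.py

  # Equivalent form using a Spark configuration property:
  spark-submit \
    --master yarn \
    --deploy-mode cluster \
    --conf "spark.driver.extraJavaOptions=-XX:PermSize=80M -XX:MaxPermSize=100m" \
    my_job.py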

2016-10-13 17:00 GMT+08:00 Shady Xu <shad...@gmail.com>:

> Hi,
>
> I have a problem when running Spark SQL by PySpark on Java 8. Below is the
> log.
>
>
> 16/10/13 16:46:40 INFO spark.SparkContext: Starting job: sql at NativeMethodAccessorImpl.java:-2
> Exception in thread "dag-scheduler-event-loop" java.lang.OutOfMemoryError: PermGen space
>       at java.lang.ClassLoader.defineClass1(Native Method)
>       at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
>       at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>       at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
>       at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
>       at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>       at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>       at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>       at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>       at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>       at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:857)
>       at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1630)
>       at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1622)
>       at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1611)
>       at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> Exception in thread "shuffle-server-2" java.lang.OutOfMemoryError: PermGen space
> Exception in thread "shuffle-server-4" java.lang.OutOfMemoryError: PermGen space
> Exception in thread "threadDeathWatcher-2-1" java.lang.OutOfMemoryError: PermGen space
>
>
> I tried increasing the driver memory, but it didn't help. However, things are
> fine when I run the same code after switching to Java 7. I also find it fine to
> run the SparkPi example on Java 8. So I believe the problem lies with PySpark
> rather than with the Spark core.
>
>
> I am using Spark 2.0.1 and run the program in YARN cluster mode. Any ideas
> are appreciated.
>
>
