Re: OOM when running Spark SQL by PySpark on Java 8

2016-10-13 Thread Shady Xu
All nodes of my YARN cluster are running Java 7, but I submit the job from a Java 8 client. I realised I was running the job in yarn-cluster mode, which is why setting '--driver-java-options' is effective. Now the problem is: why does submitting a job from a Java 8 client to a Java 7 cluster cause a
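For anyone reading the archive, the deploy-mode distinction being described is roughly this (a sketch; the flag values and the script name my_job.py are illustrative assumptions, not from the thread):

  # yarn-cluster mode: the driver JVM runs inside the YARN cluster
  # (Java 7 here), so submit-time PermGen flags land on a Java 7 JVM.
  spark-submit \
    --master yarn \
    --deploy-mode cluster \
    --driver-java-options "-XX:MaxPermSize=256m" \
    my_job.py

  # yarn-client mode: the driver JVM runs on the submitting machine
  # (Java 8 here), where the same flags are ignored with a warning.
  spark-submit \
    --master yarn \
    --deploy-mode client \
    --driver-java-options "-XX:MaxPermSize=256m" \
    my_job.py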

Re: OOM when running Spark SQL by PySpark on Java 8

2016-10-13 Thread Sean Owen
You can specify it; it just doesn't do anything but cause a warning on Java 8. It won't work in general to have such a tiny PermGen. If it works, it means you're on Java 8, where the flag is ignored. You should set MaxPermSize if anything, not PermSize. However, the error indicates you are not using
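This is easy to check directly on a Java 8 machine (the warning text below is approximate):

  java -XX:MaxPermSize=256m -version
  # Prints something like:
  #   Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
  #   MaxPermSize=256m; support was removed in 8.0
  # and then runs normally, i.e. the flag has no effect beyond the warning.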

Re: OOM when running Spark SQL by PySpark on Java 8

2016-10-13 Thread Shady Xu
Solved the problem by specifying the PermGen size when submitting the job (even just a few MB). It seems Java 8 has removed the Permanent Generation space, so the corresponding JVM arguments are ignored. But I can still use --driver-java-options "-XX:PermSize=80M -XX:MaxPermSize=100m" to specify
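Assembled into a full command, the fix described would read roughly like this (the flags are quoted from the message; the master, deploy mode, and script name my_job.py are assumptions):

  spark-submit \
    --master yarn \
    --deploy-mode cluster \
    --driver-java-options "-XX:PermSize=80M -XX:MaxPermSize=100m" \
    my_job.py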

Re: OOM when running Spark SQL by PySpark on Java 8

2016-10-13 Thread Sean Owen
The error doesn't say you're out of memory; it says you're out of PermGen. If you see this, you aren't running Java 8, AFAIK, because Java 8 has no PermGen. But if you're running Java 7 and you investigate what this error means, you'll find you need to increase PermGen. This is mentioned in the
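On Java 7, the usual remedy is to raise the PermGen ceiling on both the driver and the executor side; a hedged sketch (the sizes and script name are illustrative, not from the thread):

  spark-submit \
    --driver-java-options "-XX:MaxPermSize=512m" \
    --conf "spark.executor.extraJavaOptions=-XX:MaxPermSize=512m" \
    my_job.py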

OOM when running Spark SQL by PySpark on Java 8

2016-10-13 Thread Shady Xu
Hi, I have a problem when running Spark SQL via PySpark on Java 8. Below is the log.

16/10/13 16:46:40 INFO spark.SparkContext: Starting job: sql at NativeMethodAccessorImpl.java:-2
Exception in thread "dag-scheduler-event-loop" java.lang.OutOfMemoryError: PermGen space at
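Given the replies above, the first thing worth checking is which JVM each side actually runs; a sketch (the host name is a placeholder):

  # JVM used by spark-submit on the client machine:
  java -version
  echo "$JAVA_HOME"

  # JVM on a cluster node, which is what a yarn-cluster driver
  # and the executors actually run on:
  ssh nodemanager-host java -version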