Re: Setting up jvm in pyspark from shell

2014-09-11 Thread Davies Liu
The heap size of the JVM cannot be changed dynamically, so you
 need to configure it before running pyspark.

If you run it in local mode, you should configure spark.driver.memory
 (available in 1.1 or master).
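
For example, in conf/spark-defaults.conf (a minimal sketch; adjust the
size to what your data actually needs):

    spark.driver.memory   2g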

Or, you can use --driver-memory 2G (should work in 1.0+)
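
For example, when launching the shell (assuming the standard launch
scripts that ship with Spark):

    ./bin/pyspark --driver-memory 2G

Note that sc.setSystemProperty() can't help here: by the time the
SparkContext exists, the driver JVM is already running with a fixed
maximum heap (-Xmx), and spark.executor.memory does not apply to the
driver in local mode anyway.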

On Wed, Sep 10, 2014 at 10:43 PM, Mohit Singh <mohit1...@gmail.com> wrote:
 Hi,
   I am using pyspark shell and am trying to create an rdd from numpy matrix
 rdd = sc.parallelize(matrix)
 I am getting the following error:
 JVMDUMP039I Processing dump event systhrow, detail
 java/lang/OutOfMemoryError at 2014/09/10 22:41:44 - please wait.
 JVMDUMP032I JVM requested Heap dump using
 '/global/u2/m/msingh/heapdump.20140910.224144.29660.0005.phd' in response to
 an event
 JVMDUMP010I Heap dump written to
 /global/u2/m/msingh/heapdump.20140910.224144.29660.0005.phd
 JVMDUMP032I JVM requested Java dump using
 '/global/u2/m/msingh/javacore.20140910.224144.29660.0006.txt' in response to
 an event
 JVMDUMP010I Java dump written to
 /global/u2/m/msingh/javacore.20140910.224144.29660.0006.txt
 JVMDUMP032I JVM requested Snap dump using
 '/global/u2/m/msingh/Snap.20140910.224144.29660.0007.trc' in response to an
 event
 JVMDUMP010I Snap dump written to
 /global/u2/m/msingh/Snap.20140910.224144.29660.0007.trc
 JVMDUMP013I Processed dump event systhrow, detail
 java/lang/OutOfMemoryError.
 Exception AttributeError: "'SparkContext' object has no attribute '_jsc'" in
 <bound method SparkContext.__del__ of <pyspark.context.SparkContext object
 at 0x11f9450>> ignored
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/usr/common/usg/spark/1.0.2/python/pyspark/context.py", line 271, in
 parallelize
     jrdd = readRDDFromFile(self._jsc, tempFile.name, numSlices)
   File
 "/usr/common/usg/spark/1.0.2/python/lib/py4j-0.8.1-src.zip/py4j/java_gateway.py",
 line 537, in __call__
   File
 "/usr/common/usg/spark/1.0.2/python/lib/py4j-0.8.1-src.zip/py4j/protocol.py",
 line 300, in get_return_value
 py4j.protocol.Py4JJavaError: An error occurred while calling
 z:org.apache.spark.api.python.PythonRDD.readRDDFromFile.
 : java.lang.OutOfMemoryError: Java heap space
 at
 org.apache.spark.api.python.PythonRDD$.readRDDFromFile(PythonRDD.scala:279)
 at org.apache.spark.api.python.PythonRDD.readRDDFromFile(PythonRDD.scala)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:88)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
 at java.lang.reflect.Method.invoke(Method.java:618)
 at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
 at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
 at py4j.Gateway.invoke(Gateway.java:259)
 at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
 at py4j.commands.CallCommand.execute(CallCommand.java:79)
 at py4j.GatewayConnection.run(GatewayConnection.java:207)
 at java.lang.Thread.run(Thread.java:804)

 I did try setSystemProperty:
 sc.setSystemProperty("spark.executor.memory", "20g")
 How do I increase the JVM heap from the shell?

 --
 Mohit

 When you want success as badly as you want the air, then you will get it.
 There is no other secret of success.
 -Socrates
