Re: java server error - spark

2016-06-15 Thread spR
Hey,

Thanks, now it worked. :)

On Wed, Jun 15, 2016 at 6:59 PM, Jeff Zhang  wrote:

> Then the only solution is to increase your driver memory, although it is
> still limited by your machine's physical memory. See "--driver-memory".
>
> [remainder of quoted thread and stack trace snipped]

Re: java server error - spark

2016-06-15 Thread Jeff Zhang
Then the only solution is to increase your driver memory, although it is
still limited by your machine's physical memory. See "--driver-memory".
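
For local mode this is the flag that matters, because the whole job runs
inside the single driver JVM. A minimal sketch of how it might be applied,
assuming Spark 1.6 in client mode (the script name query_mysql.py and the
8g value are placeholders):

# spark.driver.memory cannot be raised from SparkConf in client mode,
# because the driver JVM is already running when the conf is read; pass
# the flag at launch time instead:
#
#     spark-submit --driver-memory 8g query_mysql.py
#
from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

# Non-JVM settings can still come from SparkConf inside the script.
conf = SparkConf().setMaster("local[4]").setAppName("My app")
sc = SparkContext(conf=conf)
sqlContext = SQLContext(sc)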

On Thu, Jun 16, 2016 at 9:53 AM, spR  wrote:

> Hey,
>
> But I just have one machine. I am running everything on my laptop. Won't I
> be able to do this processing in local mode then?
>
> Regards,
> Tejaswini
>
> [remainder of quoted thread and stack trace snipped]

Re: java server error - spark

2016-06-15 Thread spR
Hey,

But I just have one machine. I am running everything on my laptop. Won't I
be able to do this processing in local mode then?

Regards,
Tejaswini

On Wed, Jun 15, 2016 at 6:32 PM, Jeff Zhang  wrote:

> You are using local mode; --executor-memory won't take effect in local
> mode. Please use a cluster mode instead.
>
> [remainder of quoted thread and stack trace snipped]

Re: java server error - spark

2016-06-15 Thread Jeff Zhang
You are using local mode; --executor-memory won't take effect in local
mode. Please use a cluster mode instead.
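
In local mode the "executors" are just threads inside the one driver JVM,
so there is no separate executor heap to size. A rough sketch of the
distinction, assuming PySpark 1.6 (the spark://host:7077 master URL is a
placeholder):

from pyspark import SparkConf

# Local mode: a single JVM. spark.executor.memory is ignored here; only
# the driver heap (--driver-memory) bounds the job.
local_conf = SparkConf().setMaster("local[4]").setAppName("My app")

# Cluster mode: executors run as separate JVMs, so
# spark.executor.memory / --executor-memory now takes effect.
cluster_conf = (SparkConf()
                .setMaster("spark://host:7077")
                .setAppName("My app")
                .set("spark.executor.memory", "12g"))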

On Thu, Jun 16, 2016 at 9:32 AM, Jeff Zhang  wrote:

> Specify --executor-memory in your spark-submit command.
>
> [remainder of quoted thread and stack trace snipped]

Re: java server error - spark

2016-06-15 Thread Jeff Zhang
Specify --executor-memory in your spark-submit command.
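
For example, the flag might be passed like this (the master URL and script
name are placeholders; this assumes a cluster is actually available, since
the flag is ignored in local mode):

    spark-submit --master spark://host:7077 --executor-memory 12g my_job.py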



On Thu, Jun 16, 2016 at 9:01 AM, spR  wrote:

> Thank you. Can you please tell me how to increase the executor memory?
>
> [remainder of quoted thread and stack trace snipped]

Re: java server error - spark

2016-06-15 Thread spR
Hey,

I did this in my notebook, but I still get the same error. Is this the
right way to do it?

from pyspark import SparkConf
conf = (SparkConf()
        .setMaster("local[4]")
        .setAppName("My app")
        .set("spark.executor.memory", "12g"))
sc.conf = conf
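
(A note on the snippet above: assigning a new SparkConf to sc.conf after
the SparkContext is already running has no effect, because a context's
configuration is fixed at creation time, and in local mode
spark.executor.memory is ignored anyway. A minimal corrected sketch,
assuming the notebook allows restarting the context:)

from pyspark import SparkConf, SparkContext

sc.stop()  # discard the running context; its configuration is immutable

conf = (SparkConf()
        .setMaster("local[4]")
        .setAppName("My app"))
sc = SparkContext(conf=conf)

# The driver heap itself still has to be set at launch time, e.g. with
# "--driver-memory 12g" on spark-submit or in PYSPARK_SUBMIT_ARGS for a
# notebook session.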

On Wed, Jun 15, 2016 at 5:59 PM, Jeff Zhang  wrote:

> >>> Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
>
>
> It is an OOM on the executor. Please try to increase the executor memory
> ("--executor-memory").
>
> [quoted stack trace snipped]

Re: java server error - spark

2016-06-15 Thread spR
Thank you. Can you please tell me how to increase the executor memory?



On Wed, Jun 15, 2016 at 5:59 PM, Jeff Zhang  wrote:

> >>> Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
>
>
> It is an OOM on the executor. Please try to increase the executor memory
> ("--executor-memory").
>
> [quoted stack trace snipped]

Re: java server error - spark

2016-06-15 Thread Jeff Zhang
>>> Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded


It is an OOM on the executor. Please try to increase the executor memory
("--executor-memory").





On Thu, Jun 16, 2016 at 8:54 AM, spR  wrote:

> [quoted stack trace snipped]

Re: java server error - spark

2016-06-15 Thread spR
Hey,

error trace -

---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
<ipython-input> in <module>()
----> 1 temp.take(2)

/Users/my/Documents/My_Study_folder/spark-1.6.1/python/pyspark/sql/dataframe.pyc in take(self, num)
    304         with SCCallSiteSync(self._sc) as css:
    305             port = self._sc._jvm.org.apache.spark.sql.execution.EvaluatePython.takeAndServe(
--> 306                 self._jdf, num)
    307         return list(_load_from_socket(port, BatchedSerializer(PickleSerializer())))
    308

/Users/my/Documents/My_Study_folder/spark-1.6.1/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py in __call__(self, *args)
    811         answer = self.gateway_client.send_command(command)
    812         return_value = get_return_value(
--> 813             answer, self.gateway_client, self.target_id, self.name)
    814
    815         for temp_arg in temp_args:

/Users/my/Documents/My_Study_folder/spark-1.6.1/python/pyspark/sql/utils.pyc in deco(*a, **kw)
     43     def deco(*a, **kw):
     44         try:
---> 45             return f(*a, **kw)
     46         except py4j.protocol.Py4JJavaError as e:
     47             s = e.java_exception.toString()

/Users/my/Documents/My_Study_folder/spark-1.6.1/python/lib/py4j-0.9-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    306                 raise Py4JJavaError(
    307                     "An error occurred while calling {0}{1}{2}.\n".
--> 308                     format(target_id, ".", name), value)
    309             else:
    310                 raise Py4JError(

Py4JJavaError: An error occurred while calling z:org.apache.spark.sql.execution.EvaluatePython.takeAndServe.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 3.0 failed 1 times, most recent failure: Lost task 0.0 in stage 3.0 (TID 76, localhost): java.lang.OutOfMemoryError: GC overhead limit exceeded
        at com.mysql.jdbc.MysqlIO.nextRowFast(MysqlIO.java:2205)
        at com.mysql.jdbc.MysqlIO.nextRow(MysqlIO.java:1984)
        at com.mysql.jdbc.MysqlIO.readSingleRowSet(MysqlIO.java:3403)
        at com.mysql.jdbc.MysqlIO.getResultSet(MysqlIO.java:470)
        at com.mysql.jdbc.MysqlIO.readResultsForQueryOrUpdate(MysqlIO.java:3105)
        at com.mysql.jdbc.MysqlIO.readAllResults(MysqlIO.java:2336)
        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2729)
        at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2549)
        at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1861)
        at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1962)
        at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1.<init>(JDBCRDD.scala:363)
        at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD.compute(JDBCRDD.scala:339)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:89)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
        at [remainder of the trace truncated in the archive]

Re: java server error - spark

2016-06-15 Thread Jeff Zhang
Could you paste the full stacktrace?

On Thu, Jun 16, 2016 at 7:24 AM, spR  wrote:

> Hi,
> I am getting this error while executing a query using sqlcontext.sql.
>
> The table has around 2.5 GB of data to be scanned.
>
> First I get an out-of-memory exception, even though I have 16 GB of RAM.
>
> Then my notebook dies and I get the error below:
>
> Py4JNetworkError: An error occurred while trying to connect to the Java server
>
>
> Thank You
>



-- 
Best Regards

Jeff Zhang