Re: GC overhead limit exceeded

2016-05-16 Thread Aleksandr Modestov
Hi,

"Why did you though you have enough memory for your task? You checked task
statistics in your WebUI?". I mean that I have jnly about 5Gb data but
spark.driver memory in 60Gb. I check task statistics in web UI.
But really spark says that
*"05-16 17:50:06.254 127.0.0.1:54321    1534
#e Thread WARN: Swapping!  GC CALLBACK, (K/V:29.74 GB + POJO:18.97 GB +
FREE:8.79 GB == MEM_MAX:57.50 GB), desiredKV=7.19 GB OOM!Exception in
thread "Heartbeat" java.lang.OutOfMemoryError: Java heap space"*
But why doesn't Spark spill the data to disk?
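
As a hedged sketch (the path, the partition count, and the sqlContext
handle are illustrative assumptions, not taken from my actual job):
persisting with a storage level that is allowed to fall back to disk,
combined with the repartitioning advice in the reply quoted below, would
look like this:

    import org.apache.spark.storage.StorageLevel

    // Hypothetical 5 GB input; more, smaller partitions reduce the GC
    // pressure of any single task.
    val df = sqlContext.read.parquet("/data/input").repartition(200)
    // MEMORY_AND_DISK lets cached partitions spill to disk instead of
    // forcing everything to stay on the heap.
    df.persist(StorageLevel.MEMORY_AND_DISK)
    df.count() // materialize the cache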

On Mon, May 16, 2016 at 5:11 PM, Takeshi Yamamuro wrote:

> Hi,
>
> Why did you think you had enough memory for your task? Did you check the
> task statistics in your WebUI?
> Anyway, if you get stuck on the GC issue, you'd be better off increasing
> the number of partitions.
>
> // maropu
>
> On Mon, May 16, 2016 at 10:00 PM, AlexModestov <aleksandrmodes...@gmail.com> wrote:
>
>> I get this error in Apache Spark...
>>
>> "spark.driver.memory 60g
>> spark.python.worker.memory 60g
>> spark.master local[*]"
>>
>> The amount of data is only about 5 GB, but Spark reports "GC overhead
>> limit exceeded". I would guess that my conf file provides enough
>> resources.
>>
>> "16/05/16 15:13:02 WARN NettyRpcEndpointRef: Error sending message
>> [message
>> = Heartbeat(driver,[Lscala.Tuple2;@87576f9,BlockManagerId(driver,
>> localhost,
>> 59407))] in 1 attempts
>> org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [10
>> seconds]. This timeout is controlled by spark.executor.heartbeatInterval
>> at
>> org.apache.spark.rpc.RpcTimeout.org
>> $apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
>> at
>>
>> org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
>> at
>>
>> org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
>> at
>>
>> scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
>> at
>> org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
>> at
>> org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:101)
>> at
>> org.apache.spark.executor.Executor.org
>> $apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:449)
>> at
>>
>> org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply$mcV$sp(Executor.scala:470)
>> at
>>
>> org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:470)
>> at
>>
>> org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:470)
>> at
>> org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1765)
>> at
>> org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:470)
>> at
>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>> at
>> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>> at
>>
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>> at
>>
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>> at
>>
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>> at
>>
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>> at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.util.concurrent.TimeoutException: Futures timed out after
>> [10 seconds]
>> at
>> scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
>> at
>> scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
>> at
>> scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
>> at
>>
>> scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
>> at scala.concurrent.Await$.result(package.scala:107)
>> at
>> org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
>> ... 14 more
>> 16/05/16 15:13:02 WARN NettyRpcEnv: Ignored message:
>> HeartbeatResponse(false)
>> 05-16 15:13:26.398 127.0.0.1:54321   2059   #e Thread WARN: Swapping!
>> GC CALLBACK, (K/V:29.74 GB + POJO:16.74 GB + FREE:11.03 GB ==
>> MEM_MAX:57.50
>> GB), desiredKV=7.19 GB OOM!
>> 05-16 15:13:44.528 127.0.0.1:54321   2059   #e Thread WARN: Swapping!
>> GC CALLBACK, (K/V:29.74 GB + POJO:16.86 GB + FREE:10.90 GB ==
>> MEM_MAX:57.50
>> GB), desiredKV=7.19 GB OOM!
>> 05-16 15:13:56.847 127.0.0.1:54321   2059   #e Thread WARN: Swapping!
>> GC CALLBACK, (K/V:29.74 GB + POJO:16.88 GB + FREE:10.88 GB ==
>> MEM_MAX:57.50
>> GB), desiredKV=7.19 GB OOM!
>> 05-16 15:14:10.215 127.0.0.1:54321   2059   #e Thread WARN: Swapping!
>> GC CALLBACK, (K/V:29.74 GB + POJO:16.90 GB + FREE:10.86 GB ==
>> MEM_MAX:57.50
>> GB), desiredKV=7.19 GB OOM!
>> 05-16 15:14:33.6
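
The stack trace quoted above says the timeout is controlled by
spark.executor.heartbeatInterval. As a hedged sketch (the values are
illustrative, not tested against this job), relaxing the heartbeat and
network timeouts can keep the executor alive through long GC pauses while
the underlying memory pressure is investigated:

    import org.apache.spark.SparkConf

    // Illustrative values only: this buys headroom during long GC pauses,
    // it does not fix the memory pressure itself.
    val conf = new SparkConf()
      .set("spark.executor.heartbeatInterval", "60s")
      .set("spark.network.timeout", "600s") // keep well above the heartbeat interval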

Re: Scala from Jupyter

2016-02-16 Thread Aleksandr Modestov
Thank you!
I will test Spark Notebook.

On Tue, Feb 16, 2016 at 3:37 PM, andy petrella wrote:

> Hello Alex!
>
> Rajeev is right; come over to the spark notebook gitter room, where
> you'll be helped by many experienced people if you have any trouble:
> https://gitter.im/andypetrella/spark-notebook
>
> The spark notebook has many integrated, reactive (Scala), and extendable
> (Scala) plotting capabilities.
>
> cheers and have fun!
> andy
>
> On Tue, Feb 16, 2016 at 1:04 PM Rajeev Reddy wrote:
>
>> Hello,
>>
>> Let me understand your query correctly.
>>
>> Case 1. You have a Jupyter installation for Python and you want to use it
>> for Scala.
>> Solution: You can install kernels other than Python. Ref:
>> 
>>
>> Case 2. You want to use Spark with Scala.
>> Solution: You can create a notebook config in which you create the Spark
>> context yourself and inject it back into your notebook (see the sketch
>> below), OR install another kind of notebook such as spark-notebook or
>> Apache Zeppelin.
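>>
>> As a minimal sketch of that first option (the app name and master are
>> illustrative assumptions, not a fixed recipe), a notebook cell can build
>> the context by hand:
>>
>>     import org.apache.spark.{SparkConf, SparkContext}
>>
>>     // Hand-built context for a local notebook session; the notebook
>>     // then uses `sc` the way a regular Spark shell would.
>>     val conf = new SparkConf()
>>       .setAppName("jupyter-scala")
>>       .setMaster("local[*]")
>>     val sc = new SparkContext(conf)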
>>
>>
>> In my experience with Case 2, I have been using, and prefer,
>> spark-notebook over Zeppelin.
>>
>>
>> On Tue, Feb 16, 2016 at 4:49 PM, AlexModestov <aleksandrmodes...@gmail.com> wrote:
>>
>>> Hello!
>>> I want to use Scala from Jupyter (or maybe something else, if you could
>>> recommend anything; I mean an IDE). Does anyone know how I can do this?
>>> Thank you!
>>>
>>
>>
>> --
>> Thanks,
>> Rajeev Reddy
>> Software Development Engineer  1
>> IXP - Information Intelligence (I2) Team
>> Flipkart Internet Pvt. Ltd (flipkart.com)
>> http://rajeev-reddy.com
>> +91-8001618957
>>
> --
> andy
>