After a bit of research, I figured out that one of the workers was hung cleaning up during GC, and the connection usually times out since the default is 60 seconds, so I set it to a higher number and that eliminated the issue. You may want to try this:
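As a rough sketch of where these properties go (assuming PySpark; the property names and values are the ones from this thread, and the timeout is in seconds while the frame size is in MB):

```python
# Hypothetical sketch: collect the timeout/frame-size settings before
# building a SparkContext. Values are from this thread, not defaults.
settings = {
    "spark.core.connection.ack.wait.timeout": "600",  # default is 60 seconds
    "spark.akka.frameSize": "50",                     # in MB
}

# With PySpark available, these would be applied when creating the context:
# from pyspark import SparkConf, SparkContext
# conf = SparkConf()
# for key, value in settings.items():
#     conf.set(key, value)
# sc = SparkContext(conf=conf)
```

Setting these before the context is created matters, because most spark.* properties are read once at startup.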
sc.set("spark.core.connection.ack.wait.timeout", "600")
sc.set("spark.akka.frameSize", "50")

Thanks
Best Regards

On Wed, Oct 8, 2014 at 6:06 PM, jamborta <jambo...@gmail.com> wrote:
> I am still puzzled by this. In my case the data is allowed to write to
> disk, and I usually get different errors if it is out of memory.
>
> My guess is that akka kills the executors for some reason.
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Any-issues-with-repartition-tp13462p15929.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org