One thing that confused me a lot while debugging was the error message. The
important one

  WARN ReliableDeliverySupervisor: Association with remote system
[akka.tcp://sparkDriver@xxx:50278] has failed, address is now gated for
[5000] ms. Reason is: [Disassociated].

is logged only at WARN level.
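
For reference, the workaround from the quoted message below could be applied
roughly like this. It is only a sketch, assuming the property is set in
conf/spark-defaults.conf (the file location is an assumption, not something
stated in this thread); the key and the value 100 are the ones from the
quoted message.

```properties
# conf/spark-defaults.conf (assumed location)
# Workaround reported for SPARK-3923; value taken from the quoted message.
spark.akka.heartbeat.interval   100
```

The same key/value pair can presumably be set programmatically on the
SparkConf before the context is created, if editing the defaults file is not
an option.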

Jianshi


On Tue, Oct 14, 2014 at 4:36 AM, Jianshi Huang <jianshi.hu...@gmail.com>
wrote:

> Turned out it was caused by this issue:
> https://issues.apache.org/jira/browse/SPARK-3923
>
> Setting spark.akka.heartbeat.interval to 100 solved it.
>
> Jianshi
>
> On Mon, Oct 13, 2014 at 4:24 PM, Jianshi Huang <jianshi.hu...@gmail.com>
> wrote:
>
>> Hmm... it failed again, just lasted a little bit longer.
>>
>> Jianshi
>>
>> On Mon, Oct 13, 2014 at 4:15 PM, Jianshi Huang <jianshi.hu...@gmail.com>
>> wrote:
>>
>>> https://issues.apache.org/jira/browse/SPARK-3106
>>>
>>> I'm seeing the same errors described in SPARK-3106 (no other types of
>>> errors confirmed), running a bunch of SQL queries on Spark 1.2.0 built
>>> from the latest master HEAD.
>>>
>>> Any updates to this issue?
>>>
>>> My main task is to join a huge fact table with a dozen dim tables (using
>>> HiveContext) and then map the result to my class object. It failed a
>>> couple of times; I have now cached the intermediate table and it
>>> currently seems to be working fine... I had no idea why until I found
>>> SPARK-3106.
>>>
>>> Cheers,
>>> --
>>> Jianshi Huang
>>>
>>> LinkedIn: jianshi
>>> Twitter: @jshuang
>>> Github & Blog: http://huangjs.github.com/
>>>



