Re: Driver aborts on Mesos when unable to connect to one of external shuffle services

2018-04-16 Thread igor.berman
Hi Szuromi, We manage the external shuffle service with Marathon, not manually. Sometimes, though, e.g. when adding a new node to the cluster, there is some delay between Mesos scheduling tasks on a slave and Marathon scheduling the external shuffle service task on that node.

Re: Driver aborts on Mesos when unable to connect to one of external shuffle services

2018-04-12 Thread Szuromi Tamás
Hi Igor, Have you started the external shuffle service manually? Cheers 2018-04-12 10:48 GMT+02:00 igor.berman: > Hi, > any input regarding is it expected: > Driver starts and unable to connect to external shuffle service on one of > the nodes (no matter what is the

Driver aborts on Mesos when unable to connect to one of external shuffle services

2018-04-12 Thread igor.berman
Hi, any input on whether the following is expected? The driver starts and is unable to connect to the external shuffle service on one of the nodes (no matter what the reason is). This makes the framework go to Inactive mode in the Mesos UI. However, it seems that the driver doesn't exit and continues to execute tasks (or tries to).