Hi All,
I have a big job that normally takes more than an hour to run end to end.
Inexplicably, though, it exits midway (roughly 80% of the job actually
completes, but not all of it), without any apparent error or exception in
the logs.
I have submitted the same job many times, and it ...
Hi,
We are running Spark 1.4.0 on a Mesosphere cluster (~250 GB of memory
across 16 active hosts).
Spark jobs are submitted in coarse-grained mode.
Suddenly, our jobs get killed without any error.
A truncated excerpt from the driver log around the kill:
  ... ip-10-0-2-193.us-west-2.compute.internal, PROCESS_LOCAL, 1514 bytes)
  15/09/01 10:48:24 INFO TaskSetManager: ...
If it is not some other user, then it's the kernel triggering the kill: the
job might be using far too much memory or swap. Check your resource usage
while the job is running and look at the memory overhead, etc.
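For reference, a quick way to confirm a kernel OOM kill is to check dmesg on
the affected host for "Killed process" entries. If that turns out to be the
cause, one common mitigation is to leave each executor more non-heap
headroom. A minimal sketch in Scala, assuming Spark 1.4 on Mesos in
coarse-grained mode; the app name and memory sizes below are made-up
placeholders, not values from this thread:

    import org.apache.spark.{SparkConf, SparkContext}

    // Sketch: reserve extra non-heap headroom per executor so its total
    // memory footprint (heap + off-heap) stays under the container limit
    // and the kernel OOM killer is less likely to step in.
    val conf = new SparkConf()
      .setAppName("example-job")                          // placeholder name
      .set("spark.mesos.coarse", "true")                  // coarse-grained mode, as in this thread
      .set("spark.executor.memory", "8g")                 // JVM heap per executor (assumed size)
      .set("spark.mesos.executor.memoryOverhead", "1024") // extra MiB beyond the heap (assumed value)
    val sc = new SparkContext(conf)

Raising spark.mesos.executor.memoryOverhead trades a little cluster capacity
for stability, since off-heap allocations (netty buffers, JVM overhead) are
what usually push a container over its limit.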
Thanks
Best Regards
On Tue, Sep 1, 2015 at 5:56 PM, Silvio Bernardinello wrote: