Thanks all. I know I have data skew, but the data is unpredictable and the
skew is hard to pin down every time.
Do you think this workaround is reasonable?

    // needs java.util.concurrent.{Executors, ExecutorService, Callable,
    // Future, TimeUnit, TimeoutException}
    ExecutorService executor = Executors.newCachedThreadPool();
    Callable<Result> task = () -> simulation.run();
    Future<Result> future = executor.submit(task);
    try {
        simResult = future.get(20, TimeUnit.MINUTES);
    } catch (TimeoutException ex) {
        SPARKLOG.info("Task timed out");
        // get() only stops waiting; interrupt the worker so it can actually stop
        future.cancel(true);
    } finally {
        executor.shutdown();
    }

This forces the task to time out if it runs for more than 20 minutes.
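
One caveat, as a sketch rather than anything confirmed in this thread: get()
with a timeout only stops the caller from waiting, and cancel(true) only
delivers an interrupt, so simulation.run() has to check the interrupt flag for
the cancellation to actually free the thread. A minimal, self-contained
illustration (all class and method names here are placeholders, not the real
Simulation/Result classes):

    // Minimal sketch, not the real Simulation class: the loop polls the
    // interrupt flag so that future.cancel(true) can actually stop it.
    import java.util.concurrent.*;

    public class CancellableSimulation {

        // placeholder for the real Result type used above
        static class Result { long value; }

        // placeholder for simulation.run(); checks for interruption each step
        static Result runSimulation() throws InterruptedException {
            Result result = new Result();
            for (long step = 0; step < 1_000_000_000L; step++) {
                if (Thread.currentThread().isInterrupted()) {
                    throw new InterruptedException("cancelled at step " + step);
                }
                result.value += step;  // stands in for one unit of real work
            }
            return result;
        }

        public static void main(String[] args) throws Exception {
            ExecutorService executor = Executors.newCachedThreadPool();
            Callable<Result> task = CancellableSimulation::runSimulation;
            Future<Result> future = executor.submit(task);
            try {
                Result r = future.get(20, TimeUnit.MINUTES);
                System.out.println("done: " + r.value);
            } catch (TimeoutException ex) {
                future.cancel(true);  // interrupts the worker; the loop notices and exits
                System.out.println("Task timed out and was cancelled");
            } finally {
                executor.shutdown();
            }
        }
    }

If run() never reaches an interruption check (for example, it is stuck in a
tight computation loop), the thread will keep running and holding memory in
the background even after the timeout fires.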


On Thu, Jun 16, 2016 at 5:00 AM, Jacek Laskowski <ja...@japila.pl> wrote:

> Hi,
>
> I'd check the Details for Stage page in the web UI.
>
> Pozdrawiam,
> Jacek Laskowski
> ----
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
>
> On Thu, Jun 16, 2016 at 6:45 AM, Utkarsh Sengar <utkarsh2...@gmail.com>
> wrote:
> > This SO question was asked about a year ago:
> >
> > http://stackoverflow.com/questions/31799755/how-to-deal-with-tasks-running-too-long-comparing-to-others-in-job-in-yarn-cli
> >
> > I answered it with a suggestion to try speculation, but that doesn't
> > quite do what the OP expects. I have been running into this issue more
> > these days. Out of 5000 tasks, 4950 complete in 5 minutes, but the last
> > 50 never really complete; I have tried waiting for 4 hours. This could
> > be a memory issue, or maybe the way Spark's fine-grained mode works with
> > Mesos; I am trying to enable JmxSink to get a heap dump.
> >
> > But in the meantime, is there a better fix for this (in any version of
> > Spark; I am using 1.5.1 but can upgrade)? It would be great if the last
> > 50 tasks in my example could be killed (timed out) and the stage could
> > complete successfully.
> >
> > --
> > Thanks,
> > -Utkarsh
>



-- 
Thanks,
-Utkarsh
