Currently, a failed task makes the JVM exit, and there is no workaround for
that. Before we can change this, we would need to be able to verify that task
execution is isolated, so that a task failure does not end up “corrupting”
the host.



Bikas



*From:* Thaddeus Diamond [mailto:[email protected]]
*Sent:* Wednesday, July 30, 2014 3:15 PM
*To:* [email protected]
*Subject:* Reusing Containers Of Failed Tasks



Hi,



I turned on container reuse and upped the time that containers linger after
task vertex completion (tez.am.container.session.delay-allocation-millis),
but I'm still having an issue.  Sometimes, the Processor I created will
fail due to application logic in one DAG but not the next. The trivial
example is:



class MyProcessor implements LogicalIOProcessor {
  // Other non-application logic code
  public void run(...) {
    if (new Random().nextBoolean()) {
      throw new FooBarBazException();
    }
  }
}



In this case I don't want the task JVM to be deallocated, because the failure
was caused by application logic, and the next time I start a DAG I will incur
the long task JVM startup delay.



I see the following code in the source (TaskScheduler#deallocateTask(...))
that I think is the cause of this:



        if (!taskSucceeded || !shouldReuseContainers) {
          if (LOG.isDebugEnabled()) {
            LOG.debug("Releasing container, containerId=" + container.getId()
                + ", taskSucceeded=" + taskSucceeded
                + ", reuseContainersFlag=" + shouldReuseContainers);
          }
          releaseContainer(container.getId());
        }
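
To make the effect of that condition concrete, here is a minimal standalone sketch of how I read it. The class and method names (ReleaseDecision, shouldRelease) are hypothetical and are not part of Tez; only the boolean expression is taken from the snippet above.

```java
// Hypothetical stand-in for the check quoted above; not the actual Tez class.
final class ReleaseDecision {
  private ReleaseDecision() {}

  // Mirrors the quoted condition: the container is kept only when the task
  // succeeded AND container reuse is enabled; otherwise it is released.
  static boolean shouldRelease(boolean taskSucceeded, boolean shouldReuseContainers) {
    return !taskSucceeded || !shouldReuseContainers;
  }

  public static void main(String[] args) {
    boolean[] values = {true, false};
    // Print the decision for all four flag combinations.
    for (boolean succeeded : values) {
      for (boolean reuse : values) {
        System.out.println("taskSucceeded=" + succeeded
            + ", reuseContainersFlag=" + reuse
            + " -> release=" + shouldRelease(succeeded, reuse));
      }
    }
  }
}
```

If that reading is right, a failed task leads to a release even when reuse is turned on, which matches what I am seeing.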



Is this something that can be fixed in master? Or is there a
workaround/conf I can set to get this working?



Thanks,

Thad
