Re: "Timed out while stopping the job generator" plus subsequent failures

Tobias Pfeiffer Wed, 11 Mar 2015 22:26:28 -0700

Hi,

I discovered what caused my issue when running on YARN and was able to work
around it.

On Wed, Mar 11, 2015 at 7:43 PM, Tobias Pfeiffer <t...@preferred.jp> wrote:

> The processing itself is complete, i.e., the batch currently processed at
> the time of stop() is finished and no further batches are processed.
> However, something keeps the streaming context from stopping properly. In
> local[n] mode, this is not actually a problem (other than I have to wait 20
> seconds for shutdown), but in yarn-cluster mode, I get an error
>
>   akka.actor.InvalidActorNameException: actor name [JobGenerator] is not unique!
>

It seems that not all checkpointed RDDs have been cleaned up (metadata
cleared, checkpoint directories deleted, etc.?) by the time
streamingContext.stop() returns; the cleanup only finishes afterwards. In
particular, when I add `Thread.sleep(5000)` after my streamingContext.stop()
call, it works and I can start a different streamingContext afterwards.
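
For reference, a minimal sketch of that workaround in Scala. The object and
method names (`StreamingRestart`, `stopAndWait`) are made up for illustration,
and the arguments passed to stop() are an assumption (keep the SparkContext
alive so a second streaming context can reuse it); the only part taken from
my actual code is "stop, then sleep for 5 seconds". It needs Spark Streaming
on the classpath.

```scala
import org.apache.spark.streaming.StreamingContext

object StreamingRestart {
  // Hypothetical helper: stop the streaming context, then wait a fixed
  // grace period so that pending cleanup can finish before the caller
  // creates a new StreamingContext.
  def stopAndWait(ssc: StreamingContext, graceMillis: Long = 5000L): Unit = {
    // Assumption: the underlying SparkContext is kept alive for reuse.
    ssc.stop(stopSparkContext = false, stopGracefully = false)
    // Fixed delay; 5000 ms is simply the value that worked here, not a
    // guaranteed upper bound on the cleanup time.
    Thread.sleep(graceMillis)
  }
}

// Usage sketch (firstContext and sc are placeholders):
//   StreamingRestart.stopAndWait(firstContext)
//   val second = new StreamingContext(sc, Seconds(1))
//   // ... set up streams on `second` ...
//   second.start()
```

The sleep is a blind wait, so it may be too short on a loaded cluster and is
wasted time whenever the cleanup finishes early.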

This is pretty ugly. Does anyone know a way to poll whether it is safe to
continue, or whether some RDDs are still waiting to be cleaned up?

Thanks
Tobias
