Hi Roger,

" but it only spawns one container and still hangs after bootstrap"
    -- This is probably because your local machine does not have enough
resources for the second container. I checked your log file, and each
container is about 4GB.
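
If memory is indeed the limit, one thing to try locally is lowering the
per-container request in the job config. This is only a sketch: the value
below is an illustrative guess, not something taken from your job, so pick a
number that fits both the task and your laptop.

```properties
# Assumed value, for illustration only: memory requested per Samza container, in MB.
yarn.container.memory.mb=1024
# Already set in your local run; kept here for context.
yarn.container.count=2
```

The local NodeManager's yarn.nodemanager.resource.memory-mb (in yarn-site.xml)
also has to be large enough to cover both containers plus the application
master.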

"When I run it on our YARN cluster with a single container, it works
correctly.  When I tried it with 5 containers, it gets hung after consuming
the bootstrap topic."
   -- Have you figured it out? I have looked at your log and also the
code. My suspicion is that a null envelope is somehow blocking the
process. If you can paste the trace-level log, it will be more helpful,
because many of the log statements in the chooser are at trace level.
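
A minimal sketch of how to turn that on, assuming the job's logging is
configured through a log4j 1.x properties file (if you use log4j.xml, the
equivalent is a <logger> element with the same names):

```properties
# Trace-level output for the message choosers, including the bootstrapping chooser.
log4j.logger.org.apache.samza.system.chooser=TRACE
# Assumption: SystemConsumers is also worth tracing, since it drives the chooser.
log4j.logger.org.apache.samza.system.SystemConsumers=TRACE
```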

Thanks,

Fang, Yan
[email protected]

On Thu, Jun 18, 2015 at 5:20 PM, Roger Hoover <[email protected]>
wrote:

> I need some help.  I have a job which bootstraps one stream and then is
> supposed to read from two.  When I run it on our YARN cluster with a single
> container, it works correctly.  When I tried it with 5 containers, it gets
> hung after consuming the bootstrap topic.  I ran it with the grid script on
> my laptop (Mac OS X) with yarn.container.count=2 but it only spawns one
> container and still hangs after bootstrap.
>
> Debug logs are here: http://pastebin.com/af3KPvju
>
> I looked at JMX metrics and see:
> - Task Metrics - no value for kafka offset of non-bootstrapped stream
> - SystemConsumerMetrics
>   - choose null keeps incrementing
>   - ssps-needed-by-chooser 1
>   - unprocessed-messages 62k
> - Bootstrapping Chooser
>   - lagging partitions 4
>   - lagging-batch-streams - 4
>   - batch-resets - 0
>
> Has anyone seen this or can offer ideas of how to better debug it?
>
> I'm using Samza 0.9.0 and YARN 2.4.0.
>
> Thanks!
>
> Roger
>
