Thank you, Yan.  I'll get a trace level log as soon as I can.

> On Jun 19, 2015, at 12:05 PM, Yan Fang <yanfang...@gmail.com> wrote:
> 
> Hi Roger,
> 
> " but it only spawns one container and still hangs after bootstrap"
>    -- this is probably because your local machine does not have enough
> resources for the second container. I checked your log file, and each
> container is about 4GB.
> 
> "When I run it on our YARN cluster with a single container, it works
> correctly.  When I tried it with 5 containers, it gets hung after consuming
> the bootstrap topic."
>   -- Have you figured it out? I have looked at your log and also the
> code. My suspicion is that a null envelope is somehow blocking the
> process. If you can paste the trace-level log, it will be more helpful,
> because many of the log statements in the chooser are at trace level.
> 
> Thanks,
> 
> Fang, Yan
> yanfang...@gmail.com
> 
> On Thu, Jun 18, 2015 at 5:20 PM, Roger Hoover <roger.hoo...@gmail.com>
> wrote:
> 
>> I need some help.  I have a job which bootstraps one stream and then is
>> supposed to read from two.  When I run it on our YARN cluster with a single
>> container, it works correctly.  When I tried it with 5 containers, it gets
>> hung after consuming the bootstrap topic.  I ran it with the grid script on
>> my laptop (Mac OS X) with yarn.container.count=2 but it only spawns one
>> container and still hangs after bootstrap.
>> 
>> Debug logs are here: http://pastebin.com/af3KPvju
>> 
>> I looked at JMX metrics and see:
>> - Task Metrics: no value for Kafka offset of non-bootstrapped stream
>> - SystemConsumerMetrics
>>   - choose null: keeps incrementing
>>   - ssps-needed-by-chooser: 1
>>   - unprocessed-messages: 62k
>> - Bootstrapping Chooser
>>   - lagging partitions: 4
>>   - lagging-batch-streams: 4
>>   - batch-resets: 0
>> 
>> Has anyone seen this or can offer ideas of how to better debug it?
>> 
>> I'm using Samza 0.9.0 and YARN 2.4.0.
>> 
>> Thanks!
>> 
>> Roger
>> 
