Thank you, Yan. I'll get a trace-level log as soon as I can.

Sent from my iPhone

> On Jun 19, 2015, at 12:05 PM, Yan Fang <yanfang...@gmail.com> wrote:
>
> Hi Roger,
>
> "but it only spawns one container and still hangs after bootstrap"
> -- This is probably because your local machine does not have enough
> resources for the second container. I checked your log file, and each
> container is about 4GB.
>
> "When I run it on our YARN cluster with a single container, it works
> correctly. When I tried it with 5 containers, it gets hung after consuming
> the bootstrap topic."
> -- Have you figured it out? I have looked at your log and also the
> code. My suspicion is that a null envelope is somehow blocking the
> process. If you can paste the trace-level log, it will be more helpful,
> because many logs in the chooser are trace level.
>
> Thanks,
>
> Fang, Yan
> yanfang...@gmail.com
>
> On Thu, Jun 18, 2015 at 5:20 PM, Roger Hoover <roger.hoo...@gmail.com>
> wrote:
>
>> I need some help. I have a job which bootstraps one stream and then is
>> supposed to read from two. When I run it on our YARN cluster with a single
>> container, it works correctly. When I tried it with 5 containers, it gets
>> hung after consuming the bootstrap topic. I ran it with the grid script on
>> my laptop (Mac OS X) with yarn.container.count=2 but it only spawns one
>> container and still hangs after bootstrap.
>>
>> Debug logs are here: http://pastebin.com/af3KPvju
>>
>> I looked at JMX metrics and see:
>> - Task Metrics - no value for kafka offset of non-bootstrapped stream
>> - SystemConsumerMetrics
>>   - choose null keeps incrementing
>>   - ssps-needed-by-chooser 1
>>   - unprocessed-messages 62k
>> - Bootstrapping Chooser
>>   - lagging partitions 4
>>   - lagging-batch-streams - 4
>>   - batch-resets - 0
>>
>> Has anyone seen this or can offer ideas of how to better debug it?
>>
>> I'm using Samza 0.9.0 and YARN 2.4.0.
>>
>> Thanks!
>>
>> Roger
>>