I need some help.  I have a job that bootstraps one stream and is then
supposed to read from two.  When I run it on our YARN cluster with a single
container, it works correctly.  When I ran it with 5 containers, it hung
after consuming the bootstrap topic.  I also ran it with the grid script on
my laptop (Mac OS X) with yarn.container.count=2, but it spawned only one
container and still hung after bootstrap.
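
For context, here is a minimal sketch of the kind of configuration this setup implies. The system name "kafka" and the topic names "bootstrap-topic" and "other-topic" are placeholders for illustration, not the actual names from the job; the keys themselves are the standard Samza ones for input streams, bootstrapping, and container count.

```properties
# Both input streams (placeholder topic names)
task.inputs=kafka.bootstrap-topic,kafka.other-topic

# Mark the first stream as a bootstrap stream and read it from the beginning
systems.kafka.streams.bootstrap-topic.samza.bootstrap=true
systems.kafka.streams.bootstrap-topic.samza.offset.default=oldest

# Number of YARN containers to spread the tasks across
yarn.container.count=2
```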

Debug logs are here: http://pastebin.com/af3KPvju

I looked at the JMX metrics and see:
- Task Metrics - no value for the Kafka offset of the non-bootstrapped stream
- SystemConsumerMetrics
  - choose null keeps incrementing
  - ssps-needed-by-chooser 1
  - unprocessed-messages 62k
- Bootstrapping Chooser
  - lagging partitions 4
  - lagging-batch-streams 4
  - batch-resets 0

Has anyone seen this or can offer ideas of how to better debug it?

I'm using Samza 0.9.0 and YARN 2.4.0.

Thanks!

Roger
