Hey Raghu, Runner is Spark on Yarn cluster. Executor/Thread would be the Thread(Core) within a Spark Executor. I have nothing I can find in the logs saying that they aren't being closed cleanly outside of forced shutdown. Interval I have now is 30ms with a min read time of 2secs. Watching the finished tasks for the read job shows each thread running for the 2secs, some die early with no log as to why. However watching the offset within Kafka Manager shows the lag growing, as if the commit isn't happening. However the event throughput shows it matching that of the kafka topic msgs inbound. I can't seem to find a working option for getting the offsets KafkaIO believes it has out of either to compare numbers. I tried the Spark Metrics sinks provided by the runner, as KafkaIO appears to provide it via those. But they don't appear to work and actually caused issues, no longer appeared, with the Streaming stats Spark maintains.
Consumer group name is consistent as a bash script is being used to launch jobs to guarantee uniformity between launches. >From what I've been able to dissect from the KafkaIO class. I believe the watermark management layer via the PartitionState might be the best tiein for doing manual kafka offset management via the Consumer.commitSync() or Consumer.commitAsync() methods. This would allow better uniform and consistent reporting of offset in sync with the values being maintained in the watermark. On Mon, Nov 6, 2017 at 11:11 AM, Raghu Angadi <[email protected]> wrote: > What is the runner? > Can you elaborate a bit more what you mean by 'executor/thread shutting > down'? If the KafkaIO reader shutdown cleanly, it would call close the > consumer cleanly, triggering auto commit. But if you shutdown a pipeline, > it might not cleanly close the consumers. What is the auto_commit interval > you have? Please note that there is no way to coordinate consistency > between Beam pipeline and externally maintained auto commit offsets since > it is outside Beam. 'Drain' feature Dataflow can help (it lets a clean > shutdown of the pipeline), also note that many runners provide clean ways > to update a pipeline that keeps all the state from previous run (in this > case Kafka offsets), which is the only way for Beam to provide its > processing guarantees across runs. > > KafkaIO leaves auto_commit handling entirely to KafkaConsumer. If you are > seeing consumer is not honoring the auto_committed offset, please check the > log from KafkConsumer on the worker. Only user error I could think of is > some typo in consumer group name upon restart. > > Currently KafkaIO does not actively participate in auto_commit management. > It lets user directly set KafkaConsumer configuration. May be there is a > case for some more active support for auto_commit management. Please > provide more details in your case so that we can discuss actual specifics > and potential improvements it provides. > > > On Mon, Nov 6, 2017 at 8:15 AM, NerdyNick <[email protected]> wrote: > >> There seems to be a lot of oddities with the auto offset committer and >> the watermark management as well as kafka offsets in general. >> >> One issue I keep having is the auto committer will just not commit any >> offsets. So the topic will look like it's backing up. From what I've been >> able to trace on it it appears to be in relation to the executor/thread >> shutting down before the auto commit has a chance to run. Even though the >> min read times are set. It still prematurely shuts down. Turning auto >> commit interval down seems to help but doesn't resolve the issue. Just >> seems to allow it to correct itself much quicker. >> >> Another I just had happen is after restarting a pipeline the auto >> committed offsets reset to the earliest record and the pipeline appears to >> be working on those records. Which is odd in contrary to a lot of things. >> When I shut the pipeline down it was only a few thousand records behind. >> The consumer is configured to start at the latest offset not the earliest. >> Give that It would appear the recorded watermarks had an odd corruption or >> something where they believed they where in the past. >> >> -- >> Nick Verbeck - NerdyNick >> ---------------------------------------------------- >> NerdyNick.com >> Coloco.ubuntu-rocks.org >> > > -- Nick Verbeck - NerdyNick ---------------------------------------------------- NerdyNick.com Coloco.ubuntu-rocks.org
