ahipp13 opened a new issue, #34213:
URL: https://github.com/apache/airflow/issues/34213

   ### Apache Airflow version
   
   Other Airflow 2 version (please specify below)
   
   ### What happened
   
   When running the Airflow Kafka Provider Operator "ConsumeFromTopicOperator", 
I had one of my runs fail. Naturally since I have the "commit_cadence" option 
set to "end_of_operator", I was expecting to have duplicate records since it 
should have not commit the offset because the operator failed. Well the day 
ended and my counts were off, and when I looked in my DB I found that during 
the time it failed is when it missed the messages. So when the DAG run failed 
the offset was for some reason still committed even though I had set it to 
"end_of_operator".
   
   ### What you think should happen instead
   
   Based on your description, the offset should not get committed until the 
operator has completed successfully. If the DAG fails, it should go back to the 
offset the operator started on. 
   
   ### How to reproduce
   
   Run the Kafka Provider on a topic and mid DAG run fail it, and see if it 
goes back and gets the messages it missed. The connection information I used is:
   
   {
     "bootstrap.servers": SERVERS,
     "group.id": GROUPID,
     "auto.offset.reset": "earliest",
     "security.protocol": "SSL",
     "ssl.ca.location": "CA",
     "ssl.certificate.location": "CERT",
     "ssl.key.location": "KEY",
     "ssl.key.password": "PW"
   }
   
   ### Operating System
   
   PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-apache-kafka==1.1.2
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   Looking through the Confluent Kafka Documentation, I suspect what is 
happening here is because for Confluent's consumers they have an option 
"enable.auto.commit" that defaults to true, and it commits the offset every 5 
seconds (https://docs.confluent.io/platform/current/clients/consumer.html#id1). 
When I turned this option to false, it worked as expected and I was getting 
duplicate messages on fails.
   
   I don't really know what the expected behavior here is, but either 1) the 
code should be changed to turn this option off in the source code or 2) the 
documentation should specifically say that you need to turn this option to 
false in order for the commit_cadence option to work.
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to