[ https://issues.apache.org/jira/browse/AIRFLOW-6522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17012324#comment-17012324 ]
ASF GitHub Bot commented on AIRFLOW-6522:
-----------------------------------------

rconroy293 commented on pull request #7120: [AIRFLOW-6522] Clear log file to fix duplication in S3TaskHandler
URL: https://github.com/apache/airflow/pull/7120

When a sensor runs in "reschedule" mode, the same task instance (including its try number) can run more than once on the same worker. Accordingly, this change clears the local log file when the logger is re-initialized, so that log lines from earlier runs aren't uploaded again when the logger is closed.

---
Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg)

- [ ] Description above provides context of the change
- [ ] Commit message/PR title starts with `[AIRFLOW-NNNN]`. AIRFLOW-NNNN = JIRA ID<sup>*</sup>
- [ ] Unit tests coverage for changes (not needed for documentation changes)
- [ ] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
- [ ] Relevant documentation is updated including usage instructions.
- [ ] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).

<sup>*</sup> For document-only changes commit message can start with `[AIRFLOW-XXXX]`.

---
In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> Sensors in reschedule mode with S3TaskHandler can cause log duplication
> -----------------------------------------------------------------------
>
> Key: AIRFLOW-6522
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6522
> Project: Apache Airflow
> Issue Type: Bug
> Components: logging
> Affects Versions: 1.10.6
> Reporter: Robert Conroy
> Assignee: Robert Conroy
> Priority: Minor
>
> With sensors using {{reschedule}} mode and {{S3TaskHandler}} for logging, the
> task instance log gets a bunch of duplicate messages. I believe this is
> happening because contents of the local log file are appended to what's
> already in S3. The local log file may contain log messages that have already
> been uploaded to S3 if the task is sent back to a worker that had already
> processed a poke for that task instance.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
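
The duplication described in the issue, and the effect of clearing the local file on re-initialization, can be reproduced with a small standalone sketch. This is not Airflow's code: `FakeRemoteLogHandler`, `run_pokes`, the `clear_on_init` flag, the `dag/task/1.log` key, and the dict standing in for S3 are all hypothetical names chosen for illustration. It only assumes, as the issue states, that closing the handler appends the whole local file to what is already stored remotely.

```python
import logging
import os
import tempfile


class FakeRemoteLogHandler(logging.Handler):
    """Hypothetical stand-in for a remote-upload task log handler (not Airflow code)."""

    def __init__(self, local_path, remote_store, clear_on_init):
        super().__init__()
        self.local_path = local_path
        self.remote_store = remote_store  # dict standing in for S3: key -> text
        self.clear_on_init = clear_on_init
        self.key = None
        self.file_handler = None

    def set_context(self, key):
        """Called each time the task instance starts (or restarts) on this worker."""
        self.key = key
        # mode="w" truncates lines left by an earlier poke; mode="a" keeps them.
        mode = "w" if self.clear_on_init else "a"
        self.file_handler = logging.FileHandler(self.local_path, mode=mode)

    def emit(self, record):
        self.file_handler.emit(record)

    def close(self):
        """Append the local file's contents to whatever is already stored remotely."""
        if self.file_handler is not None:
            self.file_handler.close()
            with open(self.local_path) as f:
                local = f.read()
            self.remote_store[self.key] = self.remote_store.get(self.key, "") + local
            self.file_handler = None
        super().close()


def run_pokes(clear_on_init, pokes=3):
    """Simulate several pokes of one task instance landing on the same worker."""
    remote = {}
    key = "dag/task/1.log"  # same try number, so the same remote key every time
    with tempfile.TemporaryDirectory() as tmp:
        local_path = os.path.join(tmp, "1.log")  # and the same local file
        for i in range(pokes):
            handler = FakeRemoteLogHandler(local_path, remote, clear_on_init)
            handler.set_context(key)
            record = logging.LogRecord(
                "sensor", logging.INFO, "sensor.py", 0, f"poke {i}", None, None
            )
            handler.emit(record)
            handler.close()
    return remote[key].splitlines()


if __name__ == "__main__":
    print("without clearing:", run_pokes(clear_on_init=False))
    print("with clearing:   ", run_pokes(clear_on_init=True))
```

Without clearing, each close re-uploads every line from the earlier pokes, so three pokes leave six lines in the simulated remote log. With clearing, each poke contributes only its own line.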