[ 
https://issues.apache.org/jira/browse/AIRFLOW-6522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17012324#comment-17012324
 ] 

ASF GitHub Bot commented on AIRFLOW-6522:
-----------------------------------------

rconroy293 commented on pull request #7120: [AIRFLOW-6522] Clear log file to 
fix duplication in S3TaskHandler
URL: https://github.com/apache/airflow/pull/7120
 
 
   When a sensor runs in "reschedule" mode, the same task instance (with the
   same try number) can run on the same worker more than once. This change
   clears the local log file when the logger is re-initialized, so that log
   lines from earlier runs aren't uploaded again when the logger is closed.
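
A minimal sketch of the idea, using only the standard `logging` module. The helper names `init_task_logger` and `run_poke` are hypothetical and are not Airflow's API; the only assumption is that "clearing the local log file" amounts to opening it in write mode instead of append mode.

```python
import logging
import os
import tempfile


def init_task_logger(local_path: str, clear_existing: bool) -> logging.Logger:
    """Attach a fresh file handler for one run of a task instance.

    Hypothetical helper for illustration only. With clear_existing=True the
    local file is truncated (mode "w"), so only lines from this run remain.
    """
    logger = logging.getLogger("demo.task." + local_path)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    # Drop handlers left over from a previous run on this worker.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()
    handler = logging.FileHandler(local_path, mode="w" if clear_existing else "a")
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    return logger


def run_poke(local_path: str, message: str, clear_existing: bool) -> str:
    """Log one line, close the handler, and return the local file contents.

    The returned text stands in for what would be uploaded on close.
    """
    logger = init_task_logger(local_path, clear_existing)
    logger.info(message)
    for handler in list(logger.handlers):
        logger.removeHandler(handler)
        handler.close()
    with open(local_path) as f:
        return f.read()
```

With `clear_existing=True`, a second call on the same path returns only the second message; with `False`, it returns both, which is the content that would be uploaded twice.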
   
 


> Sensors in reschedule mode with S3TaskHandler can cause log duplication
> -----------------------------------------------------------------------
>
>                 Key: AIRFLOW-6522
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6522
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: logging
>    Affects Versions: 1.10.6
>            Reporter: Robert Conroy
>            Assignee: Robert Conroy
>            Priority: Minor
>
> With sensors using {{reschedule}} mode and {{S3TaskHandler}} for logging, the 
> task instance log accumulates duplicate messages. I believe this happens 
> because the contents of the local log file are appended to what's already 
> in S3. The local log file may contain log messages that have already been 
> uploaded to S3 if the task is sent back to a worker that has already 
> processed a poke for that task instance.
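
The mechanism described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not Airflow code: the remote log is modeled as a list of lines, each worker keeps one local file per task instance, and every run appends the worker's whole local file to the remote log on close. The function name and worker labels are hypothetical.

```python
from typing import Dict, List


def simulate_remote_log(worker_sequence: List[str]) -> List[str]:
    """Return the remote log lines after one poke per entry in worker_sequence.

    Assumption for illustration: on close, the full local file of the worker
    that ran the poke is appended to the remote log.
    """
    local_files: Dict[str, List[str]] = {}  # worker name -> local log lines
    remote: List[str] = []
    for poke_number, worker in enumerate(worker_sequence, start=1):
        local = local_files.setdefault(worker, [])
        local.append(f"poke {poke_number}")
        # Upload on close: everything in the local file is appended.
        remote.extend(local)
    return remote


# The third poke returns to worker "a", whose local file still holds "poke 1".
print(simulate_remote_log(["a", "b", "a"]))
# Each poke lands on a fresh worker, so nothing is uploaded twice.
print(simulate_remote_log(["a", "b", "c"]))
```

In this model, duplication appears only when a poke lands on a worker that already holds lines from an earlier poke, which matches the condition given in the report.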



