Fokko commented on issue #6370: AIRFLOW-5701: Don't clear xcom explicitly 
before execution
URL: https://github.com/apache/airflow/pull/6370#issuecomment-546582724
 
 
   The idea started with PR https://github.com/apache/airflow/pull/6210
   
   We needed to store state because we now have a sensor that allows you to 
give back the slot and poke again later on. However, sometimes you need to 
keep state. For example, when you are poking a Dataproc operator and waiting 
for it to get started, you want to keep the ID of the operator. I've suggested 
storing this in xcom. Xcom was initially intended to share data between 
operators/sensors, but in this case I think it would be great to also use it 
to share state between operator/sensor instances. It is already in there, and 
you can also look at it through the GUI. If you clear the xcom fields upfront, 
then the state will have been wiped before the next instance of the 
operator/sensor runs.
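   
   To make the problem concrete, here is a minimal sketch, not Airflow code. The dict `xcom_store`, the `poke` and `simulate` functions, and the `job-N` IDs are all hypothetical stand-ins for the xcom table and a rescheduled sensor.

```python
import itertools

# Hypothetical in-memory stand-in for the xcom table (not Airflow's API).
xcom_store = {}
_job_ids = itertools.count(1)


def poke(task_key, clear_upfront):
    """Simulate one poke of a rescheduled sensor; return the ID it works with."""
    if clear_upfront:
        # Clearing before execution wipes whatever the previous poke stored.
        xcom_store.pop(task_key, None)
    job_id = xcom_store.get(task_key)
    if job_id is None:
        # No stored state: start something new and remember its ID.
        job_id = "job-{}".format(next(_job_ids))
        xcom_store[task_key] = job_id
    return job_id


def simulate(clear_upfront, pokes=3):
    """Run several pokes of the same task and collect the IDs each one saw."""
    xcom_store.clear()
    return [poke("wait_for_cluster", clear_upfront) for _ in range(pokes)]
```

   With `clear_upfront=True` every poke sees a fresh ID, so the sensor never finds the thing it started earlier. With `clear_upfront=False` all pokes see the same ID.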
   
   I've poked Bas and we've had a discussion about why this was in there, and 
what the implications are of doing an upsert of the xcom fields instead of 
clearing them upfront. We could not come up with an obvious reason. To give 
some history, it was initially added 4 years ago: 
https://github.com/apache/airflow/commit/f238f1d614061573fca48817cbf5314c772d12d2.
 And then it was moved to later in the task process: 
https://github.com/apache/airflow/pull/1951 
   
   Personally I don't expect the user to notice a lot since it will only keep 
the xcom values a little longer.
   
   
![image](https://user-images.githubusercontent.com/1134248/67616669-cc5c2480-f7db-11e9-86d1-89dfcdca7a3a.png)
   
   So there is no async part in there AFAIK, but the clearing of the xcom is 
delayed until it is upserted (atomic :-).
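   
   As an illustration of what I mean by clearing on upsert, here is a rough sketch using the standard-library `sqlite3` module. The table layout and the `make_db`, `xcom_upsert` and `xcom_get` helpers are assumptions for the example, not Airflow's actual schema or code.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a simplified, hypothetical xcom table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE xcom (task_id TEXT, key TEXT, value TEXT)")
    return conn


def xcom_upsert(conn, task_id, key, value):
    """Replace the old value with the new one inside a single transaction."""
    # "with conn" commits if both statements succeed and rolls back on error,
    # so a committed state never has the old value deleted without the new
    # value written.
    with conn:
        conn.execute(
            "DELETE FROM xcom WHERE task_id = ? AND key = ?", (task_id, key)
        )
        conn.execute(
            "INSERT INTO xcom (task_id, key, value) VALUES (?, ?, ?)",
            (task_id, key, value),
        )


def xcom_get(conn, task_id, key):
    """Return the stored value, or None if nothing has been written yet."""
    row = conn.execute(
        "SELECT value FROM xcom WHERE task_id = ? AND key = ?", (task_id, key)
    ).fetchone()
    return row[0] if row else None
```

   The old value stays readable until the task writes a new one, which is the "a little longer" mentioned above.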

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
