TJaniF commented on issue #48852:
URL: https://github.com/apache/airflow/issues/48852#issuecomment-2782391721

   > We are now using a client-server model and basically sending a lot of data 
over the network for this.
   > 
   > To give an estimate, your data: `big_list = [[i for i in range(100)] for _ 
in range(1000)]` will be roughly 380KB. It shouldn't usually be an issue to do 
this, but I think this is a valid enough case to use custom backends?
   > 
   
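   As a sanity check on that estimate, the serialized size of the example payload can be measured directly. This assumes plain JSON serialization (the default XCom backend stores JSON; pickling would give a different number):

```python
import json

# The example payload from the quoted estimate.
big_list = [[i for i in range(100)] for _ in range(1000)]

# Size of the JSON-serialized payload in bytes.
size_bytes = len(json.dumps(big_list).encode("utf-8"))
print(f"{size_bytes / 1024:.0f} KB")  # roughly 380 KB
```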
   It is definitely a situation where we would recommend a custom XCom backend 
once the pipeline is moved to production (and we will in this case; the 
pipeline where I noticed the issue is for a tutorial). 
   
   But many users will pass XComs of this size without a custom backend in 
their existing DAGs (for example, with a larger metadata DB and frequent 
cleanup this would work fine in production, even if it is not best practice). 
So it would be quite a breaking change that we'd have to call out, and one that 
is tricky to assess across many pipelines: users likely won't know the size of 
every XCom they pass, so they can't easily tell whether they'd be over the 
threshold. It would be tedious for them to assess whether they need to make a 
change, and requiring custom XCom backends for local development would be a 
burden... 
   
   So, if it is possible, I'd strongly favor not limiting the size of XComs 
that can be passed (beyond the DB-level limits), to keep parity with 
Airflow 2.10. 
   
   If it is not possible, I'd need the size cutoff as soon as possible to put 
into upgrade checklists as a breaking change. And I'd vote for an error message 
in the task logs explaining that the issue was an XCom that was too large and 
asking the user to please use a custom XCom backend. 
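   To illustrate the kind of workaround being recommended, here is a minimal, standalone sketch of size-aware XCom offloading. This is not Airflow's API: a real backend would subclass `airflow.models.xcom.BaseXCom` (overriding `serialize_value` / `deserialize_value`) and typically write to object storage; the threshold, function names, and the local-filesystem store below are all hypothetical, for illustration only:

```python
import json
import os
import tempfile
import uuid

# Hypothetical cutoff; Airflow does not define this value.
MAX_INLINE_XCOM_BYTES = 100 * 1024  # 100 KB

def serialize_xcom(value):
    """Keep small values inline; offload large ones to a file.

    Mimics what a custom XCom backend might do, using the local
    filesystem in place of object storage for illustration.
    """
    payload = json.dumps(value)
    if len(payload.encode("utf-8")) <= MAX_INLINE_XCOM_BYTES:
        return {"inline": payload}
    path = os.path.join(tempfile.gettempdir(), f"xcom-{uuid.uuid4()}.json")
    with open(path, "w") as f:
        f.write(payload)
    return {"ref": path}

def deserialize_xcom(stored):
    """Resolve either an inline value or a file reference."""
    if "inline" in stored:
        return json.loads(stored["inline"])
    with open(stored["ref"]) as f:
        return json.load(f)
```

   With a backend along these lines, the ~380KB list above would be offloaded transparently and only a small reference would cross the wire.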
   
   cc: @cmarteepants 
   

