naruto-lgtm commented on PR #68662:
URL: https://github.com/apache/airflow/pull/68662#issuecomment-4750934236

   On when this fires: it's a small set of control-flow exceptions that carry 
data across the worker/supervisor boundary, not arbitrary exceptions thrown for 
their own sake. AirflowRescheduleException round-trips the reschedule_date back 
to where the reschedule is recorded; TaskDeferred, TaskAwaitingInput, 
DownstreamTasksSkipped and XComNotFound serialize their payloads over the 
Execution API; and KeyError/AttributeError show up nested inside other 
serialized results. That's the BASE/AIRFLOW_EXC_SER path. So it's 
reconstructing data, not really needing to throw an arbitrary class from 
another process.
   
   Stepping back though, I think your instinct is right that this isn't a vuln 
in the security-model sense. The blob only comes from the metadata DB or the 
Execution API, both behind auth, and anyone who can tamper with it has more 
direct execution paths already. So this is defense-in-depth, not a reported 
vulnerability.
   
   The two tighter fixes don't hold up either: because the base 
AirflowException.serialize() is inherited, the encode side accepts any 
AirflowException subclass, so the import set is open-ended (providers and 
user-defined exceptions). That's why an allowlist or an airflow.* prefix can't 
work without breaking legitimate round-trips, as you noted.
   
   I've dropped the prefix commit and left just the minimal post-import 
backstop that rejects a resolved object that isn't an exception type. If you'd 
rather treat the serialized store as trusted and close this, I'm fine with that 
too. Your call.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to