AjayKumbham commented on PR #55645:
URL: https://github.com/apache/airflow/pull/55645#issuecomment-3289467834

   Hi @eladkal, totally get your concern. Let me break this down:
   
   The actual problem: Airflow 3 passes task data as JSON strings on the 
container command line. Spark containers (and others) have entrypoints that 
re-evaluate those arguments through a shell, which strips the quotes and 
breaks the JSON.
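   A quick way to see the failure mode (a made-up JSON payload, using Python's `shlex`, which mimics POSIX shell word splitting and quote removal):

   ```python
   import shlex

   json_arg = '{"task_id": "run_me", "conf": {"key": "value"}}'

   # A shell-based entrypoint re-parses its arguments before exec'ing
   # the real command; word splitting plus quote removal mangles JSON.
   mangled = " ".join(shlex.split(json_arg))

   print(mangled)  # {task_id: run_me, conf: {key: value}}
   ```

   An exec-style entrypoint that passes argv through untouched delivers the string intact, which is why the direct-exec executors never hit this.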
   
   Why I'm fixing it here: I checked the other executors - AWS Lambda/ECS/Batch 
all use the same --json-string approach but don't have this issue because they 
execute directly without shell processing. This is specifically a Kubernetes 
container problem.
   
   The workload_to_command_args() function is Kubernetes-only, so fixing it 
here doesn't affect other executors. By switching to --json-path (writing to a 
temp file), we sidestep the shell processing entirely.
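   The idea can be sketched like this (a minimal illustration, not Airflow's actual `workload_to_command_args()` implementation; the runner command name is a hypothetical placeholder):

   ```python
   import json
   import tempfile

   def workload_to_command_args(workload: dict) -> list[str]:
       # Serialize the workload to a temp file and pass only the path.
       # A plain file path has no quotes for a shell entrypoint to strip,
       # so the JSON payload survives untouched.
       with tempfile.NamedTemporaryFile(
           mode="w", suffix=".json", delete=False
       ) as f:
           json.dump(workload, f)
       # "airflow-task-runner" is a stand-in for the real entrypoint.
       return ["airflow-task-runner", "--json-path", f.name]
   ```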
   
   Why it's broader than Spark: Any operator with complex JSON configs would 
hit this with shell-based container entrypoints. SparkSubmitOperator just 
happened to be where users first noticed it because Spark configs are 
JSON-heavy.
   
   The fix is surgical - it only changes how KubernetesExecutor passes data to 
containers, which is exactly where this container-specific workaround belongs.
   
   I did use AI assistance to polish the code and PR description, but the 
debugging, root cause analysis, and solution approach are mine. Happy to 
clarify any technical details!
   

