Prasanth345 commented on issue #54143:
URL: https://github.com/apache/airflow/issues/54143#issuecomment-3273248042

   Hi, I’ve been looking into this feature request, and my thought is to 
leverage the existing **`extra` JSON field on the `Log` table** instead of 
introducing a schema change. The idea would be:
   
   - On each task attempt, insert a `Log` row with 
`event="ti.worker_metadata"`.  
   - Store the worker/pod/hostname (and other useful metadata) in the `extra` 
JSON field.  
   - This makes the data **queryable via the metadata DB** and avoids changes 
to the `task_instance` table.  
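
   To make the proposal concrete, here is a minimal sketch of the row I have in mind. It uses only the standard library and does not touch Airflow itself; the helper names (`build_worker_metadata`, `build_log_row`) and the keys inside `extra` are hypothetical, and reading the pod name from the `HOSTNAME` environment variable is an assumption about the Kubernetes setup.

```python
import json
import os
import socket


def build_worker_metadata(try_number):
    """Collect host details for one task attempt (key names are illustrative)."""
    return {
        "hostname": socket.gethostname(),
        # Assumption: inside a Kubernetes container HOSTNAME usually holds
        # the pod name; it is None when the variable is not set.
        "pod_name": os.environ.get("HOSTNAME"),
        "pid": os.getpid(),
        "try_number": try_number,
    }


def build_log_row(dag_id, task_id, run_id, try_number):
    """Return the column values for the proposed event-log row as a plain dict."""
    return {
        "event": "ti.worker_metadata",
        "dag_id": dag_id,
        "task_id": task_id,
        "run_id": run_id,
        "try_number": try_number,
        # The extra field carries the metadata as a JSON string.
        "extra": json.dumps(build_worker_metadata(try_number)),
    }


row = build_log_row("example_dag", "extract", "manual_1", 1)
print(row["event"], row["extra"])
```

   In the real change these values would be handed to the `Log` model rather than kept in a dict; how exactly that happens depends on where the hook ends up living.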
   
   This seems to align well with the user story: support engineers would be 
able to query the event log for a task attempt and see which pod/host executed 
it.  
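
   As a rough illustration of that lookup, the sketch below uses an in-memory SQLite table as a stand-in for the event log. The table layout, the sample rows, and the `find_worker` helper are all made up for this example and are not the actual Airflow schema or API.

```python
import json
import sqlite3

# Stand-in for the event log table; columns are chosen for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE log (id INTEGER PRIMARY KEY, event TEXT, dag_id TEXT, "
    "task_id TEXT, run_id TEXT, try_number INTEGER, extra TEXT)"
)

# Two attempts of the same task, executed on different (hypothetical) workers.
attempts = [
    ("ti.worker_metadata", "example_dag", "extract", "manual_1", 1,
     json.dumps({"hostname": "worker-a"})),
    ("ti.worker_metadata", "example_dag", "extract", "manual_1", 2,
     json.dumps({"hostname": "worker-b"})),
]
conn.executemany(
    "INSERT INTO log (event, dag_id, task_id, run_id, try_number, extra) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    attempts,
)


def find_worker(conn, dag_id, task_id, run_id, try_number):
    """Return the decoded extra payload for one task attempt, or None."""
    found = conn.execute(
        "SELECT extra FROM log WHERE event = ? AND dag_id = ? "
        "AND task_id = ? AND run_id = ? AND try_number = ?",
        ("ti.worker_metadata", dag_id, task_id, run_id, try_number),
    ).fetchone()
    return json.loads(found[0]) if found else None


print(find_worker(conn, "example_dag", "extract", "manual_1", 2))
```

   Filtering on `event` plus the task attempt keys keeps the query simple and avoids relying on database-specific JSON functions.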
   
   Before I start wiring this in, I’d like to confirm:  
   
   - Does using `Log.extra` for this purpose sound reasonable?  
   - If so, where in the code would be the best place to insert this `Log` row 
so that it is written **once per task attempt**, by the actual worker (e.g. in the 
Celery worker task wrapper, in `StandardTaskRunner`, or somewhere else)?  
   
   Appreciate your guidance so I can get started in the right place.  
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
