On Wed, 18 May 2022 at 16:44, Howard Yoo <[email protected]> wrote:
> 2. The reason I ended up implementing span_json is that the scheduler, 
> which submits the tasks to be processed, and the worker, which picks them 
> up from the queue (implemented in Airflow's meta database), need some way 
> to share the current span. It looked like the worker always fetches the 
> dagrun or task instance via the database, so in my POC it was necessary to 
> have a means of persisting the current span in the database tables. Since 
> dagrun and task instances have nothing for storing spans, I had to 
> implement a method to convert the span objects into JSON and store them.
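As a concrete sketch of what that persistence could look like without a custom span object format: the active span context can be flattened into a W3C `traceparent` string and stored as JSON in an existing column. This is stdlib-only and the function names are made up for illustration, not anything from the POC.

```python
import json

def span_to_json(trace_id: int, span_id: int, sampled: bool = True) -> str:
    """Encode a span context as JSON holding a W3C traceparent string."""
    # traceparent layout: version-traceid(32 hex)-parentid(16 hex)-flags
    traceparent = f"00-{trace_id:032x}-{span_id:016x}-{'01' if sampled else '00'}"
    return json.dumps({"traceparent": traceparent})

def span_from_json(payload: str) -> tuple:
    """Decode a stored payload back to (trace_id, span_id, sampled)."""
    version, trace_hex, span_hex, flags = (
        json.loads(payload)["traceparent"].split("-")
    )
    return int(trace_hex, 16), int(span_hex, 16), flags == "01"
```

The worker side would feed the decoded ids into whatever tracing library is in use to continue the trace; OpenTelemetry's text-map propagators accept exactly this carrier shape.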

But isn't the span uniquely identified by the task instance and attempt number?

I would think that you've already got all the information without
persisting any additional data.
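To illustrate the point: if the span really is identified by the task instance plus its attempt number, both ids can be derived deterministically on either side instead of being persisted. A minimal sketch, assuming (dag_id, run_id, task_id, try_number) is the identifying tuple:

```python
import hashlib

def deterministic_ids(dag_id: str, run_id: str, task_id: str, try_number: int):
    """Derive (trace_id, span_id) from task-instance identity alone."""
    # One trace per dag run; one span per task attempt within it.
    trace_key = f"{dag_id}:{run_id}".encode()
    span_key = f"{dag_id}:{run_id}:{task_id}:{try_number}".encode()
    trace_id = int.from_bytes(hashlib.sha256(trace_key).digest()[:16], "big")
    span_id = int.from_bytes(hashlib.sha256(span_key).digest()[:8], "big")
    return trace_id, span_id
```

Scheduler and worker computing the same ids from the same row would land on the same span without any extra column.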

> 3. Yes, I believe the logs will be included in the scope of the AIP, even 
> at the draft stage (at least that is what I hope). However, they may be 
> implemented after the initial implementation of metrics and traces.

To me, what is most important, and what would motivate me personally to
help out here right now, is getting task execution logs (both
worker-based and asynchronous, i.e. deferred tasks) out using
OpenTelemetry. That's because right now the logging story is somewhat
broken if you're using deferred tasks, and distributed logging is really
the only fix as far as I can tell.

Cheers
