Hi Malthe, 1. You are right - those were merely added to those 'counter' metrics to see if populating helpful attributes alongside of them would be feasible and also helpful or not - and does not represent the final form of the implementation. I would hope the actual implementation of this proposal would best leverage the OpenTelemetry's context abstraction.
2. So, the reason why I ended up implementing span_json was that between the scheduler who submits the tasks to be processed, and the worker that needs to pick them up from the queue (which is implemented in meta database of airflow) - needs to get the current span in some way. It looked like every time the worker gets dagrun or task instance it does so via databases, so in my POC, it was necessary to have means to persist the current 'span' in the database tables. Well, dagrun and task instances do not have anything related to storing spans, so had to implement some method to convert the span objects into json and store them. 3. Yes, I believe the logs will be included into the scope of AIP, even in the draft stage (at least that is what I hope). However, it may be implemented following the initial implementation of metrics and traces. Howard On Wed, May 18, 2022 at 5:57 AM Malthe <[email protected]> wrote: > On Fri, 25 Mar 2022 at 20:40, Howard Yoo <[email protected]> wrote: > > I am pleased to announce the start of the discussion for the new AIP > draft that was recently been published: > https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-49+OpenTelemetry+Support+for+Apache+Airflow > > Some comments: > > 1. The optional attributes added to the "incr", "desc", etc. methods > seem wrong to me. It should be possible to specify this using some > context abstraction in OpenTelemetry – that is, to more precisely > define some scoped contexts within which the metrics or traces are > logged. > > 2. I don't understand the need for `span_json` – it should be possible > to pull these details from the existing objects (i.e. dagrun, task > instance etc). > > 3. I think logging should be included even if it is in draft > specification. We need distributed logging due to deferred tasks (work > is shared between multiple worker task execution sessions, async > executions). I doubt that logging will change much. > > Cheers >
