I think once we get some of the people who commented (Elad/Malthe) to confirm that their comments were addressed, and maybe some other voices of support, it could actually be ready for a Voting attempt :). I'd wait with it until after the Summit, however (as with a few other discussions we are having now).
J

On Sun, May 22, 2022 at 4:53 PM Howard Yoo <howard...@gmail.com> wrote:
>
> But isn't the span uniquely identified by the task instance and attempt
> number?
> --> True, but the thing about OpenTelemetry is that span ID and trace ID are
> in UUID format (and have to be), so unless we devise a way to uniquely map
> the task instance and attempt number into UUID format (I guess we could kind
> of do it, technically?), that identification cannot be used directly, hence
> the need to persist the span information somewhere for later retrieval and
> 'ending' it when the dag run or task instance ends.
>
> Yes, the only problem with the logging part of OpenTelemetry right now is
> that logging was the latest addition to it, and thus will be subject to
> many changes and additions. This AIP does guarantee, however, that it will
> include the logging feature, and according to the opentelemetry docs, logging
> will be designed and implemented in a way that tries to encompass the
> majority of existing logging structures and schemes, since the project
> understands that logging is a well-established practice.
>
> Would be more than happy to get this AIP approved and to address the
> logging part as well. We are still waiting for it to get voted on!
>
> Howard
>
> On Sat, May 21, 2022 at 2:22 AM Malthe <mbo...@gmail.com> wrote:
>>
>> On Wed, 18 May 2022 at 16:44, Howard Yoo <howard...@gmail.com> wrote:
>> > 2. So, the reason why I ended up implementing span_json was that the
>> > scheduler, which submits the tasks to be processed, and the worker, which
>> > picks them up from the queue (implemented in Airflow's metadata
>> > database), need to share the current span in some way. It
>> > looked like every time the worker gets a dagrun or task instance it does
>> > so via the database, so in my POC it was necessary to have a means to
>> > persist the current 'span' in the database tables. Well, dagrun and task
>> > instances do not have anything related to storing spans, so I had to
>> > implement some method to convert the span objects into JSON and store them.
>>
>> But isn't the span uniquely identified by the task instance and attempt
>> number?
>>
>> I would think that you've already got all the information without
>> persisting any additional data.
>>
>> > 3. Yes, I believe the logs will be included in the scope of the AIP, even
>> > in the draft stage (at least that is what I hope). However, they may be
>> > implemented following the initial implementation of metrics and traces.
>>
>> To me, what is most important, and what right now would motivate me
>> personally to help out here, is getting task execution logs (both
>> worker-based and asynchronous, deferred tasks) out using OpenTelemetry.
>> That's because right now there is a bit of a broken logging story if
>> you're using deferred tasks, and distributed logging is really the only
>> fix as far as I can tell.
>>
>> Cheers