I agree! :-) Looking forward to the Airflow Summit next week!
Howard

On Sun, May 22, 2022 at 10:29 AM Jarek Potiuk <ja...@potiuk.com> wrote:

> I think once some of the people who commented (Elad/Malthe) confirm
> that their comments were addressed, and maybe we hear some other voices
> of support, it could be ready for a Voting attempt actually :). I'd
> wait with it until after the Summit, though (as with a few other
> discussions we are having now).
>
> J
>
> On Sun, May 22, 2022 at 4:53 PM Howard Yoo <howard...@gmail.com> wrote:
> >
> > But isn't the span uniquely identified by the task instance and attempt
> > number?
> > --> True, but the thing about OpenTelemetry is that the span ID and trace
> > ID are fixed-size identifiers (a 16-byte trace ID and an 8-byte span ID),
> > and have to be, so unless we devise a way to map the task instance and
> > attempt number into that format (I guess we could kind of do it,
> > technically?), those identifiers cannot be used directly. Hence the need
> > to persist the span information somewhere for later retrieval, so that
> > the span can be 'ended' when the dag run or task instance ends.
> >
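To make the "we could kind of do it, technically?" idea concrete, here is a minimal sketch of deriving the IDs deterministically instead of persisting them. It assumes a task attempt is keyed by (dag_id, run_id, task_id, try_number) and hashes that key down to the 16-byte trace ID and 8-byte span ID sizes OpenTelemetry uses. The function names are hypothetical, the hashing scheme is an assumption and not something the AIP proposes, and wiring such IDs into an OpenTelemetry SDK is not covered here.

```python
import hashlib


def _hash_to_int(parts, num_bytes):
    """Hash the key parts and keep the first num_bytes as an integer."""
    key = "\x00".join(parts).encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # An all-zero ID is invalid in OpenTelemetry, so fall back to 1.
    return int.from_bytes(digest[:num_bytes], "big") or 1


def derive_trace_id(dag_id, run_id):
    """One trace per dag run: 16 bytes derived from (dag_id, run_id)."""
    return _hash_to_int([dag_id, run_id], 16)


def derive_span_id(dag_id, run_id, task_id, try_number):
    """One span per task attempt: 8 bytes derived from the full key."""
    return _hash_to_int([dag_id, run_id, task_id, str(try_number)], 8)


def ids_as_hex(dag_id, run_id, task_id, try_number):
    """Return (trace_id, span_id) as zero-padded lowercase hex strings."""
    trace_id = derive_trace_id(dag_id, run_id)
    span_id = derive_span_id(dag_id, run_id, task_id, try_number)
    return format(trace_id, "032x"), format(span_id, "016x")
```

With this scheme the scheduler and the worker could each recompute the same IDs from data they already hold. The trade-off is that the IDs are no longer random, and 8 bytes of a hash leaves a small chance of collisions between spans.
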
> > Yes, the only problem with the logging part of OpenTelemetry right now
> > is that logging is the most recent addition to it, and will therefore be
> > subject to many changes and additions. This AIP does guarantee, however,
> > that it will include the logging feature, and according to the
> > OpenTelemetry docs, logging will be designed and implemented in a way
> > that tries to encompass the majority of existing logging structures and
> > schemes, since the project understands that logging is a well-established
> > practice.
> >
> > Would be more than happy to get this AIP approved and to get going on
> > the logging part as well. We are still waiting for it to be voted on!
> >
> > Howard
> >
> > On Sat, May 21, 2022 at 2:22 AM Malthe <mbo...@gmail.com> wrote:
> >>
> >> On Wed, 18 May 2022 at 16:44, Howard Yoo <howard...@gmail.com> wrote:
> >> > 2. So, the reason why I ended up implementing span_json was that the
> >> > worker, which needs to pick up the tasks that the scheduler submits to
> >> > the queue (which is implemented in the meta database of Airflow), needs
> >> > to get the current span in some way. It looked like every time the
> >> > worker gets a dagrun or task instance it does so via the database, so
> >> > in my POC it was necessary to have a means to persist the current
> >> > 'span' in the database tables. Well, dagrun and task instances do not
> >> > have anything related to storing spans, so I had to implement some
> >> > method to convert the span objects into JSON and store them.
> >>
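As an illustration of the span_json idea described above, the following is a minimal sketch of turning the identifying parts of a span into a JSON string that could sit in a database column, and of restoring them on the worker side. The SpanRef class, its field names, and the helper functions are all hypothetical. They are not the POC's actual code and not an OpenTelemetry API, and a real implementation would likely need to carry more than these three fields.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SpanRef:
    """Hypothetical stand-in for the parts of a span worth persisting."""
    trace_id: int       # 16-byte trace ID held as an integer
    span_id: int        # 8-byte span ID held as an integer
    trace_flags: int = 1  # 1 is assumed here to mean "sampled"


def span_ref_to_json(ref):
    """Serialize a SpanRef to JSON, with the IDs as zero-padded hex."""
    return json.dumps({
        "trace_id": format(ref.trace_id, "032x"),
        "span_id": format(ref.span_id, "016x"),
        "trace_flags": ref.trace_flags,
    })


def span_ref_from_json(text):
    """Rebuild a SpanRef from the JSON written by span_ref_to_json."""
    data = json.loads(text)
    return SpanRef(
        trace_id=int(data["trace_id"], 16),
        span_id=int(data["span_id"], 16),
        trace_flags=int(data["trace_flags"]),
    )
```

The scheduler would write the JSON next to the dagrun or task instance row, and the worker would read it back to continue or end the same span.
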
> >> But isn't the span uniquely identified by the task instance and attempt
> >> number?
> >>
> >> I would think that you've already got all the information without
> >> persisting any additional data.
> >>
> >> > 3. Yes, I believe the logs will be included in the scope of the AIP,
> >> > even in the draft stage (at least that is what I hope). However, they
> >> > may be implemented after the initial implementation of metrics and
> >> > traces.
> >>
> >> To me what is the most important and what right now would motivate me
> >> personally in helping out here is to get task execution logs (both
> >> worker-based and asynchronous, i.e. deferred tasks) out using OpenTelemetry.
> >> That's because right now there is a bit of a broken logging story if
> >> you're using deferred tasks and distributed logging is really the only
> >> fix as far as I can tell.
> >>
> >> Cheers
>
