I think once some of the people who commented (Elad/Malthe) confirm
that their comments were addressed, and maybe we hear some other voices
of support, it could actually be ready for a vote :). I'd wait with
that until after the Summit, however (as with a few other discussions
we are having now).

J

On Sun, May 22, 2022 at 4:53 PM Howard Yoo <howard...@gmail.com> wrote:
>
> But isn't the span uniquely identified by the task instance and attempt 
> number?
> --> True, but the thing about OpenTelemetry is that the trace ID and span ID 
> have to be fixed-size identifiers (16 bytes and 8 bytes respectively), so 
> unless we devise a way to map the task instance and attempt number uniquely 
> onto that format (I guess we could kind of do it, technically?), that 
> identification cannot be used directly. Hence the need to persist the span 
> information somewhere for later retrieval, and for 'ending' the span when the 
> dag run or task instance ends.
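
One way the "we could kind of do it, technically" idea might look is sketched below. OpenTelemetry trace IDs are 16 bytes and span IDs are 8 bytes, so the sketch hashes the task instance key into values of those sizes. This is only an illustration under assumptions: the function names and the choice of key fields (dag_id, run_id, task_id, try_number) are hypothetical and are not taken from the AIP or the POC.

```python
import hashlib
from typing import Tuple


def _digest(*parts: str) -> bytes:
    # Join the key parts with a separator that is unlikely to occur in IDs,
    # so that ("a", "bc") and ("ab", "c") do not hash to the same value.
    key = "\x1f".join(parts).encode("utf-8")
    return hashlib.sha256(key).digest()


def derive_trace_id(dag_id: str, run_id: str) -> int:
    # Hypothetical helper: one trace per dag run, 128 bits wide.
    # An all-zero ID is invalid in OpenTelemetry, hence the "or 1" fallback.
    return int.from_bytes(_digest("trace", dag_id, run_id)[:16], "big") or 1


def derive_span_id(dag_id: str, run_id: str, task_id: str, try_number: int) -> int:
    # Hypothetical helper: one span per task attempt, 64 bits wide.
    raw = _digest("span", dag_id, run_id, task_id, str(try_number))
    return int.from_bytes(raw[:8], "big") or 1


def format_ids(trace_id: int, span_id: int) -> Tuple[str, str]:
    # Render the IDs as the lowercase hex strings commonly seen in trace tooling.
    return format(trace_id, "032x"), format(span_id, "016x")
```

Because the IDs are recomputed from data the scheduler and worker both already have, nothing extra would need to be stored just to find the span again. The sketch does not cover other span state (start time, attributes), which may still need to be kept somewhere.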
>
> Yes, the only problem with the logging part of OpenTelemetry right now is 
> that logging was the latest addition to it, and thus will be subject to 
> many changes and additions. This AIP does guarantee, however, that it will 
> include the logging feature, and according to the OpenTelemetry docs, logging 
> will be designed and implemented in a way that tries to encompass the 
> majority of existing logging structures and schemes, since the project 
> understands that logging is a well-established practice.
>
> I would be more than happy to get this AIP approved and then go on to address 
> the logging part as well. We are still waiting for it to be voted on!
>
> Howard
>
> On Sat, May 21, 2022 at 2:22 AM Malthe <mbo...@gmail.com> wrote:
>>
>> On Wed, 18 May 2022 at 16:44, Howard Yoo <howard...@gmail.com> wrote:
>> > 2. So, the reason I ended up implementing span_json was that between 
>> > the scheduler, which submits the tasks to be processed, and the worker, 
>> > which needs to pick them up from the queue (which is implemented in the 
>> > meta database of Airflow), the current span has to be passed along in some 
>> > way. It looked like every time the worker gets a dagrun or task instance 
>> > it does so via the database, so in my POC it was necessary to have a means 
>> > to persist the current 'span' in the database tables. Well, dagrun and 
>> > task instances do not have anything related to storing spans, so I had to 
>> > implement some method to convert the span objects into JSON and store them.
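
A minimal sketch of the kind of round trip described above: the scheduler serializes the identifying parts of a span to JSON, stores the string in a database column, and the worker later reads it back. The actual span_json format used in the POC is not shown in this thread, so the StoredSpanContext class, its fields, and both helper functions are hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class StoredSpanContext:
    # Hypothetical record of what would need to survive the trip
    # through the metadata database.
    trace_id: str            # 32 lowercase hex characters (16 bytes)
    span_id: str             # 16 lowercase hex characters (8 bytes)
    trace_flags: int = 1     # 1 means "sampled" in W3C Trace Context
    start_time_ns: int = 0   # start time, so the span can be ended later


def to_span_json(ctx: StoredSpanContext) -> str:
    # Serialize to a JSON string suitable for a text column.
    return json.dumps(asdict(ctx), sort_keys=True)


def from_span_json(raw: str) -> StoredSpanContext:
    # Rebuild the record on the worker side from the stored string.
    return StoredSpanContext(**json.loads(raw))
```

For example, `from_span_json(to_span_json(ctx))` returns a record equal to `ctx`, which is the property the scheduler-to-worker handoff relies on.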
>>
>> But isn't the span uniquely identified by the task instance and attempt 
>> number?
>>
>> I would think that you've already got all the information without
>> persisting any additional data.
>>
>> > 3. Yes, I believe the logs will be included in the scope of the AIP, even 
>> > in the draft stage (at least that is what I hope). However, they may be 
>> > implemented after the initial implementation of metrics and traces.
>>
>> To me, what is most important, and what right now would motivate me
>> personally to help out here, is getting task execution logs (both
>> worker-based and asynchronous, i.e. deferred tasks) out using OpenTelemetry.
>> That's because right now there is a bit of a broken logging story if
>> you're using deferred tasks, and distributed logging is really the only
>> fix as far as I can tell.
>>
>> Cheers
