Splendid. Thanks, Gengliang.

Mich Talebzadeh,
Dad | Technologist | Solutions Architect | Engineer
London
United Kingdom


View my LinkedIn profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* The information provided is correct to the best of my
knowledge but of course cannot be guaranteed. It is essential to note
that, as with any advice, "one test result is worth one-thousand
expert opinions" (Wernher von Braun
<https://en.wikipedia.org/wiki/Wernher_von_Braun>).


On Sat, 9 Mar 2024 at 18:10, Gengliang Wang <ltn...@gmail.com> wrote:

> Hi Mich,
>
> Thanks for your suggestions. I agree that we should avoid confusion with
> Spark Structured Streaming.
>
> So, I'll go with "Structured Logging Framework for Apache Spark". This
> keeps the standard term "Structured Logging" and distinguishes it from
> "Structured Streaming" clearly.
>
> Thanks for helping shape this!
>
> Best,
> Gengliang
>
> On Sat, Mar 2, 2024 at 12:19 PM Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
>> Hi Gengliang,
>>
>> Thanks for taking the initiative to improve the Spark logging system.
>> Transitioning to structured logs seems like a worthwhile way to make Spark
>> jobs easier to analyze and troubleshoot, and hopefully to ease future
>> integration with cloud logging systems. While "Structured Spark Logging"
>> sounds good, I was wondering if we could consider an alternative name.
>> Since we already use "Spark Structured Streaming", there might be some
>> initial confusion with the terminology. I must confess that was my own
>> first reaction.
>>
>> Here are a few alternative names I came up with, if I may:
>>
>>    - Spark Log Schema Initiative
>>    - Centralized Logging with Structured Data for Spark
>>    - Enhanced Spark Logging with Queryable Format
>>
>> These options all highlight the key aspects of your proposal, namely
>> schema, centralized logging, and queryability, and might be even clearer
>> for everyone at first glance.
>>
>> Cheers
>>
>> On Fri, 1 Mar 2024 at 10:07, Gengliang Wang <ltn...@gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> I propose to enhance our logging system by transitioning to structured
>>> logs. This initiative is designed to tackle the challenges of analyzing
>>> distributed logs from drivers, workers, and executors by allowing them to
>>> be queried using a fixed schema. The goal is to improve the informativeness
>>> and accessibility of logs, making it significantly easier to diagnose
>>> issues.
>>>
>>> Key benefits include:
>>>
>>>    - Clarity and queryability of distributed log files.
>>>    - Continued support for log4j, allowing users to switch back to
>>>    traditional text logging if preferred.
>>>
>>> The improvement will simplify debugging and enhance productivity without
>>> disrupting existing logging practices. The implementation is estimated to
>>> take around 3 months.
>>>
>>> *SPIP*:
>>> https://docs.google.com/document/d/1rATVGmFLNVLmtxSpWrEceYm7d-ocgu8ofhryVs4g3XU/edit?usp=sharing
>>> *JIRA*: SPARK-47240 <https://issues.apache.org/jira/browse/SPARK-47240>
>>>
>>> Your comments and feedback would be greatly appreciated.
>>>
>>
