Splendid. Thanks, Gengliang.

Mich Talebzadeh,
Dad | Technologist | Solutions Architect | Engineer
London
United Kingdom

view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

https://en.everybodywiki.com/Mich_Talebzadeh

*Disclaimer:* The information provided is correct to the best of my
knowledge but of course cannot be guaranteed. It is essential to note
that, as with any advice, quote "one test result is worth one-thousand
expert opinions (Wernher von Braun
<https://en.wikipedia.org/wiki/Wernher_von_Braun>)".

On Sat, 9 Mar 2024 at 18:10, Gengliang Wang <ltn...@gmail.com> wrote:

> Hi Mich,
>
> Thanks for your suggestions. I agree that we should avoid confusion with
> Spark Structured Streaming.
>
> So, I'll go with "Structured Logging Framework for Apache Spark". This
> keeps the standard term "Structured Logging" and distinguishes it from
> "Structured Streaming" clearly.
>
> Thanks for helping shape this!
>
> Best,
> Gengliang
>
> On Sat, Mar 2, 2024 at 12:19 PM Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
>> Hi Gengliang,
>>
>> Thanks for taking the initiative to improve the Spark logging system.
>> Transitioning to structured logs seems like a worthy way to enhance the
>> ability to analyze and troubleshoot Spark jobs and, hopefully, future
>> integration with cloud logging systems. While "Structured Spark Logging"
>> sounds good, I was wondering if we could consider an alternative name.
>> Since we already use "Spark Structured Streaming", there might be a slight
>> initial confusion with the terminology. I must confess it was my initial
>> reaction, so to speak.
>>
>> Here are a few alternative names I came up with, if I may:
>>
>>    - Spark Log Schema Initiative
>>    - Centralized Logging with Structured Data for Spark
>>    - Enhanced Spark Logging with Queryable Format
>>
>> These options all highlight the key aspects of your proposal, namely
>> schema, centralized logging and queryability, and might be even clearer
>> for everyone at first glance.
>>
>> Cheers
>>
>> Mich Talebzadeh
>>
>> On Fri, 1 Mar 2024 at 10:07, Gengliang Wang <ltn...@gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> I propose to enhance our logging system by transitioning to structured
>>> logs. This initiative is designed to tackle the challenges of analyzing
>>> distributed logs from drivers, workers, and executors by allowing them to
>>> be queried using a fixed schema. The goal is to improve the informativeness
>>> and accessibility of logs, making it significantly easier to diagnose
>>> issues.
>>>
>>> Key benefits include:
>>>
>>>    - Clarity and queryability of distributed log files.
>>>    - Continued support for log4j, allowing users to switch back to
>>>    traditional text logging if preferred.
>>>
>>> The improvement will simplify debugging and enhance productivity without
>>> disrupting existing logging practices. The implementation is estimated to
>>> take around 3 months.
>>>
>>> *SPIP*:
>>> https://docs.google.com/document/d/1rATVGmFLNVLmtxSpWrEceYm7d-ocgu8ofhryVs4g3XU/edit?usp=sharing
>>> *JIRA*: SPARK-47240 <https://issues.apache.org/jira/browse/SPARK-47240>
>>>
>>> Your comments and feedback would be greatly appreciated.
>>>
>>