Re: [PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]
gengliangwang commented on PR #46386: URL: https://github.com/apache/spark/pull/46386#issuecomment-2099078293 @dongjoon-hyun Thanks, I will add a configuration about the unification. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]
dongjoon-hyun commented on PR #46386: URL: https://github.com/apache/spark/pull/46386#issuecomment-2098985001 WDYT, @gengliangwang ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]
dongjoon-hyun commented on PR #46386: URL: https://github.com/apache/spark/pull/46386#issuecomment-2098984770 I understand your point, @mridulm . Do you use the following syntax? ``` spark.sparkContext.setLocalProperty("mdc." + name, "value") ``` Initially, I thought it's okay because MDC is not used by default and this PR is for Apache Spark 4.0.0. However, if there is any concern on the plain logging part, I agree with @mridulm because MDC has been supported since Apache Spark 3.1.0. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]
mridulm commented on PR #46386: URL: https://github.com/apache/spark/pull/46386#issuecomment-2097607873 So if I understood right, this as a backwardly incompatible change done for consistency of naming for Structured logging ? If yes, IMO it is preferable to revert this and not break existing users. +CC @dongjoon-hyun as well for thoughts. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]
gengliangwang commented on PR #46386: URL: https://github.com/apache/spark/pull/46386#issuecomment-2097446465 @mridulm As the task name MDC is frequently showing in the logs; I would say this is necessary for the new logging framework. After the renaming, the MDC names are consistent and simpler to use. Otherwise, it will be `executor_id`/`worker_id`/`task_id`/etc and an odd `mdc.taskName`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]
mridulm commented on PR #46386: URL: https://github.com/apache/spark/pull/46386#issuecomment-2097430754 Is this change strictly necessary would be the question ... if it is, we can evaluate it in that context. If not and is a nice to have, it is better not to make breaking changes which can impact users. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]
gengliangwang commented on PR #46386: URL: https://github.com/apache/spark/pull/46386#issuecomment-2097287397 > That change was not visible to end users, as there was not release made - right ? Yes you are right. Are you ok with the changes in this PR? If not, please let me know your preference and I can check with @cloud-fan to see if we still have time to make further changes. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]
mridulm commented on PR #46386: URL: https://github.com/apache/spark/pull/46386#issuecomment-2097182888 That change was not visible to end users, as there was not release made - right ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]
gengliangwang commented on PR #46386: URL: https://github.com/apache/spark/pull/46386#issuecomment-2096535872 > Wouldn't this not impact/break existing log config files where users are customizing the template ? Yes, and this is documented on `docs/configuration.md`. If necessary, I can add this to the migration guide as well. BTW, the key was changed from `taskName` to `mdc.taskName` in https://github.com/apache/spark/pull/28801. So this is not the first time it has been changed, and hopefully, we can stay with `task_name` this time. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]
mridulm commented on PR #46386: URL: https://github.com/apache/spark/pull/46386#issuecomment-2094831806 Wouldn't this not impact/break existing log config files where users are customizing the template ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]
dongjoon-hyun commented on PR #46386: URL: https://github.com/apache/spark/pull/46386#issuecomment-2094521040 Merged to master for Apache Spark 4.0.0-preview. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]
dongjoon-hyun closed pull request #46386: [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` URL: https://github.com/apache/spark/pull/46386 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]
gengliangwang commented on PR #46386: URL: https://github.com/apache/spark/pull/46386#issuecomment-2094434379 I suggest having this merged before the Spark 4.0 preview release. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]
gengliangwang opened a new pull request, #46386: URL: https://github.com/apache/spark/pull/46386 ### What changes were proposed in this pull request? Currently there are two MDC keys for task name: * `mdc.taskName`, which is introduced in https://github.com/apache/spark/pull/28801. Before the change, it was `taskName`. * `task_name`: introduce from the structured logging framework project. To make the MDC keys unified, this PR renames the `mdc.taskName` as `task_name` ### Why are the changes needed? 1. Make the MDC names consistent 2. Minor upside: this will allow users to query task names with `SELECT * FROM logs where context.task_name = ...`. Otherwise, querying with `context.mdc.task_name` will result in an analysis exception. Users will have to query with `context['mdc.task_name']` ### Does this PR introduce _any_ user-facing change? No really. The MDC key is used by developers for debugging purpose. ### How was this patch tested? Manual test ### Was this patch authored or co-authored using generative AI tooling? No -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org