Re: [PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]

2024-05-07 Thread via GitHub


gengliangwang commented on PR #46386:
URL: https://github.com/apache/spark/pull/46386#issuecomment-2099078293

   @dongjoon-hyun Thanks, I will add a configuration about the unification.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]

2024-05-07 Thread via GitHub


dongjoon-hyun commented on PR #46386:
URL: https://github.com/apache/spark/pull/46386#issuecomment-2098985001

   WDYT, @gengliangwang ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]

2024-05-07 Thread via GitHub


dongjoon-hyun commented on PR #46386:
URL: https://github.com/apache/spark/pull/46386#issuecomment-2098984770

   I understand your point, @mridulm . Do you use the following syntax?
   ```
   spark.sparkContext.setLocalProperty("mdc." + name, "value")
   ```
   
   Initially, I thought it's okay because MDC is not used by default and this 
PR is for Apache Spark 4.0.0.
   
   However, if there is any concern on the plain logging part, I agree with 
@mridulm because MDC has been supported since Apache Spark 3.1.0.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]

2024-05-07 Thread via GitHub


mridulm commented on PR #46386:
URL: https://github.com/apache/spark/pull/46386#issuecomment-2097607873

   So if I understood right, this as a backwardly incompatible change done for 
consistency of naming for Structured logging ?
   If yes, IMO it is preferable to revert this and not break existing users.
   
   +CC @dongjoon-hyun as well for thoughts.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]

2024-05-06 Thread via GitHub


gengliangwang commented on PR #46386:
URL: https://github.com/apache/spark/pull/46386#issuecomment-2097446465

   @mridulm As the task name MDC is frequently showing in the logs; I would say 
this is necessary for the new logging framework. After the renaming, the MDC 
names are consistent and simpler to use. Otherwise, it will be 
`executor_id`/`worker_id`/`task_id`/etc and an odd `mdc.taskName`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]

2024-05-06 Thread via GitHub


mridulm commented on PR #46386:
URL: https://github.com/apache/spark/pull/46386#issuecomment-2097430754

   Is this change strictly necessary would be the question ... if it is, we can 
evaluate it in that context.
   If not and is a nice to have, it is better not to make breaking changes 
which can impact users.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]

2024-05-06 Thread via GitHub


gengliangwang commented on PR #46386:
URL: https://github.com/apache/spark/pull/46386#issuecomment-2097287397

   > That change was not visible to end users, as there was not release made - 
right ?
   
   Yes you are right. 
   Are you ok with the changes in this PR? If not, please let me know your 
preference and I can check with @cloud-fan to see if we still have time to make 
further changes.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]

2024-05-06 Thread via GitHub


mridulm commented on PR #46386:
URL: https://github.com/apache/spark/pull/46386#issuecomment-2097182888

   That change was not visible to end users, as there was not release made - 
right ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]

2024-05-06 Thread via GitHub


gengliangwang commented on PR #46386:
URL: https://github.com/apache/spark/pull/46386#issuecomment-2096535872

   > Wouldn't this not impact/break existing log config files where users are 
customizing the template ?
   
   Yes, and this is documented on `docs/configuration.md`. If necessary, I can 
add this to the migration guide as well.
   BTW, the key was changed from `taskName` to `mdc.taskName` in 
https://github.com/apache/spark/pull/28801. So this is not the first time it 
has been changed, and hopefully, we can stay with `task_name` this time.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]

2024-05-05 Thread via GitHub


mridulm commented on PR #46386:
URL: https://github.com/apache/spark/pull/46386#issuecomment-2094831806

   Wouldn't this not impact/break existing log config files where users are 
customizing the template ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]

2024-05-04 Thread via GitHub


dongjoon-hyun commented on PR #46386:
URL: https://github.com/apache/spark/pull/46386#issuecomment-2094521040

   Merged to master for Apache Spark 4.0.0-preview.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]

2024-05-04 Thread via GitHub


dongjoon-hyun closed pull request #46386: [SPARK-48131][Core] Unify MDC key 
`mdc.taskName` and `task_name`
URL: https://github.com/apache/spark/pull/46386


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]

2024-05-04 Thread via GitHub


gengliangwang commented on PR #46386:
URL: https://github.com/apache/spark/pull/46386#issuecomment-2094434379

   I suggest having this merged before the Spark 4.0 preview release.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[PR] [SPARK-48131][Core] Unify MDC key `mdc.taskName` and `task_name` [spark]

2024-05-04 Thread via GitHub


gengliangwang opened a new pull request, #46386:
URL: https://github.com/apache/spark/pull/46386

   
   
   ### What changes were proposed in this pull request?
   
   Currently there are two MDC keys for task name:
   * `mdc.taskName`, which is introduced in 
https://github.com/apache/spark/pull/28801. Before the change, it was 
`taskName`.
   * `task_name`: introduce from the structured logging framework project. 
   
   To make the MDC keys unified, this PR renames the `mdc.taskName` as 
`task_name`
   
   ### Why are the changes needed?
   
   1. Make the MDC names consistent
   2. Minor upside: this will allow users to query task names with `SELECT * 
FROM logs where context.task_name = ...`.  Otherwise, querying with 
`context.mdc.task_name` will result in an analysis exception. Users will have 
to query with `context['mdc.task_name']`
   
   ### Does this PR introduce _any_ user-facing change?
   
   No really. The MDC key is used by developers for debugging purpose.
   
   ### How was this patch tested?
   
   Manual test
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org