dongjoon-hyun commented on PR #52027: URL: https://github.com/apache/spark/pull/52027#issuecomment-3237864277
It seems that you are underestimating the usage of `spark.hadoop.fs.gs.application.name.suffix`. As I mentioned above, it gives users the flexibility to generate IDs in a more fine-grained way, prefixing them with company-, organization-, and team-specific identifiers, which is far more flexible than a single hard-coded prefix.

> When no custom identifier is set: All GCS requests from Spark will be tagged with the default apache-spark/<version> identifier. This allows users to filter their GCS logs for "apache-spark" to see a consolidated view of all storage operations originating from their Spark infrastructure. This is particularly useful for understanding Spark's overall storage footprint or diagnosing version-specific connector issues, which is a benefit that doesn't exist today without manual configuration.

```
spark.hadoop.fs.gs.application.name.suffix="google-finance-teamX-spark4.0.0-app-20250829-..."
spark.hadoop.fs.gs.application.name.suffix="google-ads-teamY-spark3.5.6-app-20250829-..."
```

Given the above, the following could be interpreted as limiting that flexibility to a single hard-coded scheme.

> When a custom identifier is set: The main benefit here is automating a best practice. While a user can manually add the Spark version to their identifier, this change ensures it's done consistently and automatically across all jobs. This prevents fragmentation where some users remember to add the version and others don't, leading to more standardized and reliable logs.
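To make the trade-off concrete, here is a minimal sketch (not the actual GCS connector code) of how the reported application name could be composed under the behavior described above: a hard-coded `apache-spark/<version>` base concatenated with the user-provided `fs.gs.application.name.suffix`. The function name and composition rule are illustrative assumptions, not the connector's real implementation.

```python
# Assumed hard-coded base identifier added by the proposed change.
DEFAULT_APP_NAME = "apache-spark"

def application_name(spark_version: str, suffix: str = "") -> str:
    """Compose the identifier tagged onto GCS requests (illustrative only).

    The base is always "apache-spark/<version>"; any user-configured
    fs.gs.application.name.suffix value is appended verbatim.
    """
    base = f"{DEFAULT_APP_NAME}/{spark_version}"
    return base + suffix if suffix else base

# No custom identifier: every request is tagged apache-spark/<version>.
print(application_name("4.0.0"))
# -> apache-spark/4.0.0

# A fine-grained, team-specific suffix as in the examples above.
print(application_name("4.0.0", "-google-finance-teamX-app-20250829"))
# -> apache-spark/4.0.0-google-finance-teamX-app-20250829
```

Under this sketch, users who already encode company, team, and date into the suffix would see the hard-coded base prepended to every identifier, which is the flexibility concern raised above.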
