[jira] [Commented] (SPARK-5847) Allow for configuring MetricsSystem's use of app ID to namespace all metrics

2016-09-09 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15476885#comment-15476885
 ] 

Apache Spark commented on SPARK-5847:
-------------------------------------

User 'AnthonyTruchet' has created a pull request for this issue:
https://github.com/apache/spark/pull/15023

> Allow for configuring MetricsSystem's use of app ID to namespace all metrics
> ----------------------------------------------------------------------
>
> Key: SPARK-5847
> URL: https://issues.apache.org/jira/browse/SPARK-5847
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 1.2.1
>Reporter: Ryan Williams
>Assignee: Mark Grover
>Priority: Minor
> Fix For: 2.1.0
>
>
> {{MetricsSystem}} [currently prepends the app ID to all 
> metrics|https://github.com/apache/spark/blob/c51ab37faddf4ede23243058dfb388e74a192552/core/src/main/scala/org/apache/spark/metrics/MetricsSystem.scala#L131].
> When reading Spark metrics in Graphite, I've found this not always 
> desirable. Graphite is designed to track a mostly-unchanging set of metrics 
> over time; it allocates large zeroed-out files for each metric it sees, and 
> [by default rate-limits itself from creating many of 
> these|https://github.com/graphite-project/carbon/blob/79158ffde5949b4056eb7fdb5e9b6b583fe21ea4/conf/carbon.conf.example#L61-L68].
> App-ID namespacing means that Graphite allocates disk space for every 
> "metric" of every job it sees, when in reality some metrics correspond to 
> the same quantity across jobs (e.g. driver JVM stats).
> Some common Spark usage flows would be better modeled by namespacing 
> metrics by {{spark.app.name}}, so that successive runs of a given job would 
> share metrics from a storage perspective, and so that aspects of a job's 
> performance could be monitored over time / across many runs.
> There's likely no one-size-fits-all solution here, so I'd propose letting 
> users specify in the metrics config file whether they'd like metrics 
> namespaced by {{spark.app.id}}, {{spark.app.name}}, or some other config 
> param.
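> To make the proposal concrete, a configuration along these lines could
> select the namespace (the exact property names below are illustrative
> assumptions, sketching the kind of knob being proposed rather than a
> confirmed final API; the issue's "Fix For: 2.1.0" work landed a
> {{spark.metrics.namespace}} setting in this spirit):
>
> {code}
> # spark-defaults.conf -- illustrative sketch; treat exact keys as assumptions.
> # Namespace metrics by application name instead of the auto-generated app ID,
> # so successive runs of the same job write to the same Graphite paths:
> spark.app.name            my-etl-job
> spark.metrics.namespace   ${spark.app.name}
>
> # Metrics would then land under e.g.:
> #   my-etl-job.driver.jvm.heap.used
> # instead of a fresh tree per run, e.g.:
> #   app-20160909121530-0001.driver.jvm.heap.used
> {code}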



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5847) Allow for configuring MetricsSystem's use of app ID to namespace all metrics

2016-07-19 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384739#comment-15384739
 ] 

Apache Spark commented on SPARK-5847:
-------------------------------------

User 'markgrover' has created a pull request for this issue:
https://github.com/apache/spark/pull/14270







[jira] [Commented] (SPARK-5847) Allow for configuring MetricsSystem's use of app ID to namespace all metrics

2015-02-16 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14323340#comment-14323340
 ] 

Apache Spark commented on SPARK-5847:
-------------------------------------

User 'ryan-williams' has created a pull request for this issue:
https://github.com/apache/spark/pull/4632



