Re: Accessing SparkConf in metrics sink

2016-03-16 Thread Pete Robbins
So the answer to my previous question (does that work in an executor?) is NO.

It looks like I could use SparkEnv.get.conf, but its scaladoc warns:

 * NOTE: This is not intended for external use. This is exposed for Shark
 *       and may be made private in a future release.
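
For reference, a minimal sketch of what using it would look like, assuming
the internal API keeps its current shape (the spark.* keys below are the
ones Spark normally sets, but treat them as assumptions):

import org.apache.spark.SparkEnv

// Sketch only: SparkEnv is internal API, per the scaladoc note above,
// and may change or be made private in a future release.
val conf = SparkEnv.get.conf
val appName    = conf.get("spark.app.name", "unknown")
val appId      = conf.get("spark.app.id", "unknown")
val executorId = conf.get("spark.executor.id", "driver")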



Re: Accessing SparkConf in metrics sink

2016-03-16 Thread Pete Robbins
OK thanks. Does that work in an executor?


Re: Accessing SparkConf in metrics sink

2016-03-16 Thread Reynold Xin
SparkConf is not a singleton.

However, SparkContext in almost all cases is. So you can use
SparkContext.getOrCreate().getConf
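
For example, a minimal sketch (note that getOrCreate() will create a
brand-new SparkContext if none is running, so it only makes sense where a
context already exists):

import org.apache.spark.SparkContext

// getConf returns a copy of the context's SparkConf,
// so changes to the copy don't affect the running context.
val conf = SparkContext.getOrCreate().getConf
val appName = conf.get("spark.app.name", "unknown")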


Accessing SparkConf in metrics sink

2016-03-16 Thread Pete Robbins
I'm writing a metrics sink and reporter to push metrics to Elasticsearch.
An example format of a metric in JSON:

{
 "timestamp": "2016-03-15T16:11:19.314+",
 "hostName": "10.192.0.87"
 "applicationName": "My application",
 "applicationId": "app-20160315093931-0003",
 "executorId": "17",
 "executor_threadpool_completeTasks": 20
}

To correlate the metrics I want the timestamp, hostName, applicationId,
executorId and applicationName.

Currently I am extracting the applicationId and executorId from the metric
name, as MetricsSystem prepends these to the name. As the sink is
instantiated without the SparkConf, I cannot determine the applicationName.
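
A sketch of that extraction, assuming the appId.executorId.sourceName
format that MetricsSystem currently prepends (the helper name is mine):

// e.g. "app-20160315093931-0003.17.executor.threadpool.completeTasks"
// ->  ("app-20160315093931-0003", "17", "executor.threadpool.completeTasks")
def splitMetricName(name: String): (String, String, String) =
  name.split("\\.", 3) match {
    case Array(appId, executorId, metric) => (appId, executorId, metric)
    case _ => ("unknown", "unknown", name)
  }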

Another proposed change, https://issues.apache.org/jira/browse/SPARK-10610,
would also require me to access the SparkConf to get the
applicationId/executorId.

So, is the SparkConf a singleton, and could there be a Utils method for
accessing it? Instantiating a SparkConf myself will not pick up the appName
etc., as these are set via methods on the conf.

I'm trying to write this without modifying any Spark code by just using a
definition in the metrics properties to load my sink.
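
For context, the loading side is just a metrics.properties entry along
these lines (the sink class name is hypothetical, as are the host/port
keys; custom sinks are usually declared under org.apache.spark.metrics.sink
because the Sink trait is private[spark]):

# conf/metrics.properties
*.sink.elasticsearch.class=org.apache.spark.metrics.sink.ElasticsearchSink
*.sink.elasticsearch.host=10.192.0.87
*.sink.elasticsearch.port=9200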

Cheers,