Re: Can we remove private[spark] from Metrics Source and Sink traits?

2016-03-22 Thread Steve Loughran

On 19 Mar 2016, at 16:16, Pete Robbins wrote:


There are several open JIRAs to add new Sinks:

OpenTSDB https://issues.apache.org/jira/browse/SPARK-12194
StatsD https://issues.apache.org/jira/browse/SPARK-11574


StatsD is nicely easy to test: either listen on a (localhost, port) pair, or
simply create a socket and inject it into the sink for the test run.
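
Roughly, for the first approach: an untested sketch using only JDK sockets, no
Spark classes, with a purely illustrative metric name and StatsD line format.

import java.net.{DatagramPacket, DatagramSocket, InetAddress}
import java.nio.charset.StandardCharsets.UTF_8

object StatsdListenerSketch {
  def main(args: Array[String]): Unit = {
    // Bind a UDP listener on an ephemeral localhost port; the sink under test
    // would be configured to point at this host/port.
    val listener = new DatagramSocket(0, InetAddress.getLoopbackAddress)
    val port = listener.getLocalPort

    // Stand-in for the sink: push one StatsD-style gauge line at the listener.
    val sender = new DatagramSocket()
    val line = "spark.driver.jvm.heap.used:42|g".getBytes(UTF_8)
    sender.send(new DatagramPacket(line, line.length, InetAddress.getLoopbackAddress, port))

    // A real test would assert on what arrived rather than print it.
    val buf = new Array[Byte](1024)
    val packet = new DatagramPacket(buf, buf.length)
    listener.receive(packet)
    println(new String(packet.getData, 0, packet.getLength, UTF_8))

    sender.close()
    listener.close()
  }
}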


Kafka https://issues.apache.org/jira/browse/SPARK-13392

Some have PRs dating from 2015, so I'm assuming there is no desire to integrate
these into core Spark. Opening up the Sink/Source interfaces would at least
allow these to live somewhere such as spark-packages without having to pollute
the o.a.s namespace.
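
For what it's worth, wiring a packaged sink in is just a metrics.properties
entry along these lines (the class name is purely illustrative, and period/unit
are whatever options the sink chooses to read):

# conf/metrics.properties
*.sink.elasticsearch.class=org.apache.spark.metrics.sink.ElasticsearchSink
*.sink.elasticsearch.period=10
*.sink.elasticsearch.unit=seconds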


On Sat, 19 Mar 2016 at 13:05, Gerard Maas wrote:

+1

On Mar 19, 2016 08:33, "Pete Robbins" wrote:
This seems to me to be unnecessarily restrictive. These are very useful
extension points for adding third-party sources and sinks.

I intend to make an Elasticsearch sink available on spark-packages, but this
will require a single class, the sink, to be in the org.apache.spark package
tree. I could submit the package as a PR to the Spark codebase, and I'd be
happy to do that, but it could equally well be a completely separate add-on.
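
For reference, this is roughly the shape a third-party sink is forced into
today: in the current codebase the Sink trait is just start()/stop()/report(),
and sinks are instantiated reflectively with a (Properties, MetricRegistry,
SecurityManager) constructor. The ElasticsearchSink below is only an
illustrative skeleton, not the actual package:

package org.apache.spark.metrics.sink

import java.util.Properties

import com.codahale.metrics.MetricRegistry

import org.apache.spark.SecurityManager

// Has to live in the org.apache.spark tree purely because Sink is private[spark].
private[spark] class ElasticsearchSink(
    val property: Properties,
    val registry: MetricRegistry,
    securityMgr: SecurityManager) extends Sink {

  override def start(): Unit = { /* start a scheduled reporter over `registry` */ }

  override def stop(): Unit = { /* stop the reporter */ }

  override def report(): Unit = { /* one-off flush, e.g. on shutdown */ }
}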

There are similar issues with writing a third-party metrics source, which may
not be of interest to the community at large and so would probably not warrant
inclusion in the Spark codebase.
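
Same story on the source side: the Source trait is just a name plus a
Dropwizard MetricRegistry, but both it and the MetricsSystem you would register
the source with are private[spark]. A hypothetical sketch, names illustrative:

package org.apache.spark.metrics.source

import com.codahale.metrics.{Gauge, MetricRegistry}

// Again only compiles from inside the org.apache.spark tree while
// Source stays private[spark].
private[spark] class MyLibrarySource extends Source {

  override val sourceName: String = "mylibrary"

  override val metricRegistry: MetricRegistry = new MetricRegistry

  // Illustrative gauge; a real source would wire this to a live value.
  metricRegistry.register(MetricRegistry.name("requests", "inFlight"), new Gauge[Int] {
    override def getValue: Int = 0
  })
}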

Any thoughts?


