Re: Capability matrix question

2016-03-23 Thread Robert Bradshaw
+1 to Metric too.

Sounds like there's consensus on renaming to something, likely
[P]Metric. I created https://issues.apache.org/jira/browse/BEAM-147 to
track the actual work.

On Wed, Mar 23, 2016 at 1:56 PM, Dan Halperin wrote:
> +1 @Amit =>  -1 to Counter but +1 to Metric.
>
> On Wed, Mar 23, 2016 at 1:43 PM, Amit Sela  wrote:
>
>> IMHO Counters just count. Metrics measure things, so I think metrics
>> sounds better. Accumulators and Aggregators would have been good as well if
>> they weren't so overloaded.
>> Those are just my thoughts here, though.
>>
>> On Wed, Mar 23, 2016 at 10:38 PM Robert Bradshaw wrote:
>>
>> > +1 to renaming this. [P]Counter is another option.
>> >
>> > On Wed, Mar 23, 2016 at 9:12 AM, Kenneth Knowles wrote:
>> > > +1 to considering "metric" / PMetric / etc.
>> > >
>> > > On Wed, Mar 23, 2016 at 8:09 AM, Amit Sela wrote:
>> > >
>> > >> How about "PMetric" ?
>> > >>
>> > >> On Wed, Mar 23, 2016, 16:53 Frances Perry  wrote:
>> > >>
>> > >>>
>> > > Perhaps I'm unclear on what an “Aggregator” is. I assumed that a line
>> > > such as the following:
>> > >
>> > > PCollection> meanByName =
>> > > dataPoints.apply(Mean.perKey());
>> > >
>> > > …would be considered an Aggregator, since it applies a mean aggregation
>> > > over a window. Is that correct, with respect to the Beam terminology? If
>> > > not, what would an example of an Aggregator be?
>> > >
>> > 
>> > >>> Ah, we may have some slightly confusing terminology here.
>> > >>>
>> > >>> In that code snippet you are using a PTransform (Mean.perKey) to combine
>> > >>> a PCollection using the Mean CombineFn
>> > >>> <https://github.com/apache/incubator-beam/blob/c199f085473cfcd79014d0a022b5ce3fdd4863ec/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/Combine.java#L359>.
>> > >>> An Aggregator
>> > >>> <https://github.com/apache/incubator-beam/blob/211e76abf9ba34c35ef13cca279cbeefdad7c406/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/Aggregator.java#L54>
>> > >>> takes a CombineFn and applies it continuously within a DoFn. So it's more
>> > >>> analogous to a 'counter'. You can see an example of aggregators in
>> > >>> DebuggingWordCount
>> > >>> <https://github.com/apache/incubator-beam/blob/master/examples/src/main/java/com/google/cloud/dataflow/examples/DebuggingWordCount.java#L129>.
>> > >>>
>> > >>> We never really used the term *aggregation* to refer to a general set of
>> > >>> PTransforms until we started describing things to the community. But it is
>> > >>> a useful word, so we've ended up in a bit of a confusing state. Maybe we
>> > >>> should consider renaming Aggregator? Something like "metric" might be
>> > >>> clearer.
>> > >>>
>> > >>>
>> >
>>


Re: Capability matrix question

2016-03-23 Thread Jean-Baptiste Onofré

+1 to Metric

Regards
JB

On 03/23/2016 09:56 PM, Dan Halperin wrote:

+1 @Amit =>  -1 to Counter but +1 to Metric.

--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


Re: Capability matrix question

2016-03-23 Thread Amit Sela
IMHO Counters just count. Metrics measure things, so I think metrics
sounds better. Accumulators and Aggregators would have been good as well if
they weren't so overloaded.
Those are just my thoughts here, though.

On Wed, Mar 23, 2016 at 10:38 PM Robert Bradshaw wrote:

> +1 to renaming this. [P]Counter is another option.
>
> On Wed, Mar 23, 2016 at 9:12 AM, Kenneth Knowles wrote:
> > +1 to considering "metric" / PMetric / etc.
> >
> > On Wed, Mar 23, 2016 at 8:09 AM, Amit Sela  wrote:
> >
> >> How about "PMetric" ?
> >>
> >> On Wed, Mar 23, 2016, 16:53 Frances Perry  wrote:
> >>
> >>>
> > Perhaps I'm unclear on what an “Aggregator” is. I assumed that a line
> > such as the following:
> >
> > PCollection> meanByName =
> > dataPoints.apply(Mean.perKey());
> >
> > …would be considered an Aggregator, since it applies a mean aggregation
> > over a window. Is that correct, with respect to the Beam terminology? If
> > not, what would an example of an Aggregator be?
> >
> 
> >>> Ah, we may have some slightly confusing terminology here.
> >>>
> >>> In that code snippet you are using a PTransform (Mean.perKey) to combine
> >>> a PCollection using the Mean CombineFn
> >>> <https://github.com/apache/incubator-beam/blob/c199f085473cfcd79014d0a022b5ce3fdd4863ec/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/Combine.java#L359>.
> >>> An Aggregator
> >>> <https://github.com/apache/incubator-beam/blob/211e76abf9ba34c35ef13cca279cbeefdad7c406/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/Aggregator.java#L54>
> >>> takes a CombineFn and applies it continuously within a DoFn. So it's more
> >>> analogous to a 'counter'. You can see an example of aggregators in
> >>> DebuggingWordCount
> >>> <https://github.com/apache/incubator-beam/blob/master/examples/src/main/java/com/google/cloud/dataflow/examples/DebuggingWordCount.java#L129>.
> >>>
> >>> We never really used the term *aggregation* to refer to a general set of
> >>> PTransforms until we started describing things to the community. But it is
> >>> a useful word, so we've ended up in a bit of a confusing state. Maybe we
> >>> should consider renaming Aggregator? Something like "metric" might be
> >>> clearer.
> >>>
> >>>
>


Re: Capability matrix question

2016-03-23 Thread Kenneth Knowles
+1 to considering "metric" / PMetric / etc.

On Wed, Mar 23, 2016 at 8:09 AM, Amit Sela  wrote:

> How about "PMetric" ?
>
> On Wed, Mar 23, 2016, 16:53 Frances Perry  wrote:
>
>>
>>> Perhaps I'm unclear on what an “Aggregator” is. I assumed that a line
>>> such as the following:
>>>
>>> PCollection> meanByName =
>>> dataPoints.apply(Mean.perKey());
>>>
>>> …would be considered an Aggregator, since it applies a mean aggregation
>>> over a window. Is that correct, with respect to the Beam terminology? If
>>> not, what would an example of an Aggregator be?
>>>
>> Ah, we may have some slightly confusing terminology here.
>>
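>> In that code snippet you are using a PTransform (Mean.perKey) to combine
>> a PCollection using the Mean CombineFn
>> <https://github.com/apache/incubator-beam/blob/c199f085473cfcd79014d0a022b5ce3fdd4863ec/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/Combine.java#L359>.
>> An Aggregator
>> <https://github.com/apache/incubator-beam/blob/211e76abf9ba34c35ef13cca279cbeefdad7c406/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/Aggregator.java#L54>
>> takes a CombineFn and applies it continuously within a DoFn. So it's more
>> analogous to a 'counter'. You can see an example of aggregators in
>> DebuggingWordCount
>> <https://github.com/apache/incubator-beam/blob/master/examples/src/main/java/com/google/cloud/dataflow/examples/DebuggingWordCount.java#L129>.
>>
>> We never really used the term *aggregation* to refer to a general set of
>> PTransforms until we started describing things to the community. But it is
>> a useful word, so we've ended up in a bit of a confusing state. Maybe we
>> should consider renaming Aggregator? Something like "metric" might be
>> clearer.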
>>
>>
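
For anyone reading this thread in the archive, the distinction drawn in the
quoted explanation can be sketched in plain Java with no SDK on the classpath.
This is only an illustration under stated assumptions, not the Beam or Dataflow
API: SimpleCombineFn, SimpleAggregator, MeanFn, SumFn, point, meanPerKey and
dropEmpty are all hypothetical names made up for the sketch. It uses the same
combine-function shape in two ways: once as the step whose result is the output
data (roughly the role of Mean.perKey), and once as a tally updated on the side
while elements are processed (roughly the role of an Aggregator inside a DoFn).

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java sketch; none of these types are the real Beam/Dataflow classes.
public class AggregatorVsCombine {

    // Hypothetical stand-in for a CombineFn: fold inputs into an accumulator.
    interface SimpleCombineFn<I, A, O> {
        A createAccumulator();
        A addInput(A accumulator, I input);
        O extractOutput(A accumulator);
    }

    // Mean as a combine function; the accumulator holds {sum, count}.
    static final class MeanFn implements SimpleCombineFn<Double, double[], Double> {
        public double[] createAccumulator() { return new double[] {0.0, 0.0}; }
        public double[] addInput(double[] acc, Double input) {
            acc[0] += input;
            acc[1] += 1;
            return acc;
        }
        public Double extractOutput(double[] acc) {
            return acc[1] == 0 ? Double.NaN : acc[0] / acc[1];
        }
    }

    // Sum as a combine function, used below as a simple counter.
    static final class SumFn implements SimpleCombineFn<Long, long[], Long> {
        public long[] createAccumulator() { return new long[] {0L}; }
        public long[] addInput(long[] acc, Long input) { acc[0] += input; return acc; }
        public Long extractOutput(long[] acc) { return acc[0]; }
    }

    // Hypothetical stand-in for an Aggregator: a combine function applied
    // continuously on the side while elements are being processed.
    static final class SimpleAggregator<I, A, O> {
        private final SimpleCombineFn<I, A, O> fn;
        private A acc;
        SimpleAggregator(SimpleCombineFn<I, A, O> fn) {
            this.fn = fn;
            this.acc = fn.createAccumulator();
        }
        void addValue(I value) { acc = fn.addInput(acc, value); }
        O current() { return fn.extractOutput(acc); }
    }

    // Helper to build a keyed data point.
    static Map.Entry<String, Double> point(String key, double value) {
        return new AbstractMap.SimpleEntry<String, Double>(key, value);
    }

    // Combine style: the per-key mean IS the output of the step.
    static Map<String, Double> meanPerKey(List<Map.Entry<String, Double>> dataPoints) {
        MeanFn fn = new MeanFn();
        Map<String, double[]> accs = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : dataPoints) {
            double[] acc = accs.computeIfAbsent(e.getKey(), k -> fn.createAccumulator());
            fn.addInput(acc, e.getValue());
        }
        Map<String, Double> means = new LinkedHashMap<>();
        for (Map.Entry<String, double[]> e : accs.entrySet()) {
            means.put(e.getKey(), fn.extractOutput(e.getValue()));
        }
        return means;
    }

    // DoFn style: elements flow through; the aggregator only tallies on the side.
    static List<String> dropEmpty(List<String> words,
                                  SimpleAggregator<Long, long[], Long> emptyLines) {
        List<String> kept = new ArrayList<>();
        for (String w : words) {
            if (w.isEmpty()) {
                emptyLines.addValue(1L);
            } else {
                kept.add(w);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(meanPerKey(Arrays.asList(
                point("a", 1.0), point("a", 3.0), point("b", 5.0))));
        SimpleAggregator<Long, long[], Long> emptyLines = new SimpleAggregator<>(new SumFn());
        List<String> kept = dropEmpty(Arrays.asList("beam", "", "flink", ""), emptyLines);
        System.out.println(kept + " with " + emptyLines.current() + " empty lines counted");
    }
}
```

In the sketch, meanPerKey returns the aggregated values as its result, whereas
dropEmpty returns the filtered elements and only reports the tally through the
aggregator object. That side-channel role is why names such as "counter" or
"metric" came up in the thread as less confusing than "Aggregator".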