Re: What is the "Keyed State" in the capability matrix?

2016-06-24 Thread Kenneth Knowles
Hi Shen,

The row refers to the ability for a DoFn in a ParDo to access per-key (and
window) state cells that persist beyond the lifetime of an element or
bundle. This is a feature that was in the later stages of design when the
Beam code was donated. Hence it is a row in the matrix, but even the Beam Model
column says "no", pending a public design proposal & consensus. Most
runners already have a similar capability at a low level; this feature
refers to exposing it in a nice way for users.
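
As a rough illustration only (this is not the Beam API, which is still
being designed), per-key-and-window state can be pictured as a map of
cells that outlives any single element or bundle. The names
KeyedStateSketch, CellId and processElement below are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical model of per-key-and-window state cells; not the Beam API. */
final class KeyedStateSketch {

    /** Identifies one state cell: a (key, window) pair. */
    static final class CellId {
        final String key;
        final String window;

        CellId(String key, String window) {
            this.key = key;
            this.window = window;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof CellId)) {
                return false;
            }
            CellId other = (CellId) o;
            return key.equals(other.key) && window.equals(other.window);
        }

        @Override
        public int hashCode() {
            return Objects.hash(key, window);
        }
    }

    // Stand-in for runner-managed storage that persists beyond an element or bundle.
    private final Map<CellId, Long> cells = new HashMap<>();

    /**
     * Processes one element: reads the cell for (key, window), adds the value,
     * writes the cell back, and returns the running sum for that cell.
     */
    long processElement(String key, String window, long value) {
        CellId id = new CellId(key, window);
        long updated = cells.getOrDefault(id, 0L) + value;
        cells.put(id, updated);
        return updated;
    }
}
```

Calling processElement("a", "w1", 2) and later processElement("a", "w1", 3)
returns 2 and then 5, while the same key in a different window starts
from an empty cell. Nothing here is global: each cell is scoped to one
key and one window.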

I have a design doc that I'm busily revising to make sense for the whole
community. I will send the doc to this list and add it to our technical
docs folder as soon as I can get it ready. You can follow BEAM-25 [1] if
you like, too.

Kenn

[1] https://issues.apache.org/jira/browse/BEAM-25


On Fri, Jun 24, 2016 at 10:56 AM, Shen Li <cs.she...@gmail.com> wrote:

> Hi,
>
> There is a "Keyed State" row in the  "What is being computed" section of
> the capability matrix. What does the "Keyed State" refer to? Is it a global
> key-value store?
>
> (
>
> http://beam.incubator.apache.org/beam/capability/2016/03/17/capability-matrix.html
> )
>
> Thanks,
>
> Shen
>


Re: Testing and the Capability Matrix

2016-06-14 Thread Aljoscha Krettek
@Thomas Completely agree; this is also how it is currently handled in the
Flink runner. I was talking about the presentation of the capability
matrix on the website: whether we should have separate columns for Flink
Stream/Batch and Spark Stream/Batch (and possibly other runners in the
future).

On Tue, 14 Jun 2016 at 18:57 Thomas Groh <tg...@google.com.invalid> wrote:

> It is also worth noting that this document is a snapshot rather than the
> long-term plan. As the SDK evolves, the annotations will almost certainly
> change with it (and will certainly expand).
>
> +Aljoscha
>
> For streaming/batch execution separation, this is better served by
> configuration in the runner's build (e.g. specifying two separate
> executions in the pom.xml, one for streaming and one for batch). Given that
> the tests live in a separate module from the runner, this is similar to how
> RunnableOnService tests are currently executed by all of the runners.
>
> For sink, I think given the current implementations of sink there isn't a
> huge need; however, most sinks should be annotated with some form of
> superclass (although the implementation of sink requires side inputs, so
> this is also worth considering).
>
> +jb
>
> These would live on the tests proper, yes.
>
> On Sun, Jun 12, 2016 at 11:05 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
>
> > Hi Thomas,
> >
> > it looks good to me.
> >
> > Just curious: the proposed annotations will be directly in the Java SDK
> > Test jar right ?
> >
> > Thanks,
> > Regards
> > JB
> >
> >
> > On 06/11/2016 01:34 AM, Thomas Groh wrote:
> >
> >> Hey Beamers!
> >>
> >> We have a lovely Capability Matrix (
> >> http://beam.incubator.apache.org/capability-matrix/) which describes
> what
> >> runners can do, and what's in the model. However, right now we only have
> >> one way to specify that a test is useful to be executed in a runner, the
> >> RunnableOnService category.
> >>
> >> I've worked on a document to expand the number of annotations to be more
> >> in
> >> line with the capability matrix, which should help runner writers test
> >> more
> >> precisely with regards to the Beam model. The document is located at
> >>
> >>
> https://docs.google.com/document/d/1fICxq32t9yWn9qXhmT07xpclHeHX2VlUyVtpi2WzzGM/edit?usp=sharing
> >> ,
> >> and I've added edit access for all of our committers.
> >>
> >> Feel free to take a look and leave any comments you may have,
> >>
> >> Thanks,
> >>
> >> Thomas
> >>
> >>
> > --
> > Jean-Baptiste Onofré
> > jbono...@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
> >
>


Re: Testing and the Capability Matrix

2016-06-14 Thread Thomas Groh
It is also worth noting that this document is a snapshot rather than the
long-term plan. As the SDK evolves, the annotations will almost certainly
change with it (and will certainly expand).

+Aljoscha

For streaming/batch execution separation, this is better served by
configuration in the runner's build (e.g. specifying two separate
executions in the pom.xml, one for streaming and one for batch). Given that
the tests live in a separate module from the runner, this is similar to how
RunnableOnService tests are currently executed by all of the runners.
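
A minimal sketch of how category-based selection could work, so that each
build execution includes or excludes tests by capability. This is plain
Java written for illustration: the Category annotation here is a local
stand-in rather than JUnit's, UsesKeyedState is a hypothetical category
name, and only RunnableOnService comes from the current code.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of selecting tests by capability category. */
final class CapabilityCategoriesSketch {

    /** Local stand-in for a test category annotation. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Category {
        Class<?>[] value();
    }

    /** Existing catch-all category named in the thread. */
    interface RunnableOnService {}

    /** Hypothetical finer-grained category for one capability-matrix row. */
    interface UsesKeyedState extends RunnableOnService {}

    /** Example tests; the bodies are empty because only the tags matter here. */
    static final class ExampleTests {
        @Category(RunnableOnService.class)
        public void testParDo() {}

        @Category(UsesKeyedState.class)
        public void testStatefulParDo() {}

        public void testLocalOnly() {}
    }

    /** Returns the sorted names of tests tagged with `include` but with none of `exclude`. */
    static List<String> selectTests(Class<?> testClass, Class<?> include, Class<?>... exclude) {
        List<String> selected = new ArrayList<>();
        for (Method method : testClass.getDeclaredMethods()) {
            Category category = method.getAnnotation(Category.class);
            if (category == null) {
                continue; // untagged tests are never selected for a runner
            }
            if (matchesAny(category.value(), include) && !matchesAny(category.value(), exclude)) {
                selected.add(method.getName());
            }
        }
        Collections.sort(selected);
        return selected;
    }

    // A declared category matches if it is the wanted category or a subtype of it.
    private static boolean matchesAny(Class<?>[] declared, Class<?>... wanted) {
        for (Class<?> d : declared) {
            for (Class<?> w : wanted) {
                if (w.isAssignableFrom(d)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

A runner without keyed state would include RunnableOnService and exclude
UsesKeyedState, which selects only testParDo. A runner could define one
such include/exclude pair per execution (for example, one for streaming
and one for batch).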

For sink, I think given the current implementations of sink there isn't a
huge need; however, most sinks should be annotated with some form of
superclass (although the implementation of sink requires side inputs, so
this is also worth considering).

+jb

These would live on the tests proper, yes.

On Sun, Jun 12, 2016 at 11:05 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
wrote:

> Hi Thomas,
>
> it looks good to me.
>
> Just curious: the proposed annotations will be directly in the Java SDK
> Test jar right ?
>
> Thanks,
> Regards
> JB
>
>
> On 06/11/2016 01:34 AM, Thomas Groh wrote:
>
>> Hey Beamers!
>>
>> We have a lovely Capability Matrix (
>> http://beam.incubator.apache.org/capability-matrix/) which describes what
>> runners can do, and what's in the model. However, right now we only have
>> one way to specify that a test is useful to be executed in a runner, the
>> RunnableOnService category.
>>
>> I've worked on a document to expand the number of annotations to be more
>> in
>> line with the capability matrix, which should help runner writers test
>> more
>> precisely with regards to the Beam model. The document is located at
>>
>> https://docs.google.com/document/d/1fICxq32t9yWn9qXhmT07xpclHeHX2VlUyVtpi2WzzGM/edit?usp=sharing
>> ,
>> and I've added edit access for all of our committers.
>>
>> Feel free to take a look and leave any comments you may have,
>>
>> Thanks,
>>
>> Thomas
>>
>>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>


Re: Capability matrix question

2016-03-23 Thread Robert Bradshaw
+1 to Metric too.

Sounds like there's consensus on renaming to something, likely
[P]Metric. I created https://issues.apache.org/jira/browse/BEAM-147 to
track the actual work.

On Wed, Mar 23, 2016 at 1:56 PM, Dan Halperin wrote:
> +1 @Amit =>  -1 to Counter but +1 to Metric.
>
> On Wed, Mar 23, 2016 at 1:43 PM, Amit Sela  wrote:
>
>> IMHO Counters just count..  Metrics measure things, so I think metrics
>> sounds better. Accumulators and Aggregators would have been good as well if
>> they weren't so overloaded.
>> That's just my thoughts here though..
>>
>> On Wed, Mar 23, 2016 at 10:38 PM Robert Bradshaw
>>  wrote:
>>
>> > +1 to renaming this. [P]Counter is another option.
>> >
>> > On Wed, Mar 23, 2016 at 9:12 AM, Kenneth Knowles wrote:
>> > > +1 to considering "metric" / PMetric / etc.
>> > >
>> > > On Wed, Mar 23, 2016 at 8:09 AM, Amit Sela 
>> wrote:
>> > >
>> > >> How about "PMetric" ?
>> > >>
>> > >> On Wed, Mar 23, 2016, 16:53 Frances Perry  wrote:
>> > >>
>> > >>>
>> > >>>> Perhaps I'm unclear on what an “Aggregator” is. I assumed that a line
>> > >>>> such as the following:
>> > >>>>
>> > >>>> PCollection> meanByName =
>> > >>>> dataPoints.apply(Mean.perKey());
>> > >>>>
>> > >>>> …would be considered an Aggregator, since it applies a mean aggregation
>> > >>>> over a window. Is that correct, with respect to the Beam terminology? If
>> > >>>> not, what would an example of an Aggregator be?
>> > >>>
>> > >>> Ah, we may have some slightly confusing terminology here.
>> > >>>
>> > >>> In that code snippet you are using a PTransform (Mean.perKey) to
>> > combine
>> > >>> a PCollection using the Mean CombineFn
>> > >>> <
>> >
>> https://github.com/apache/incubator-beam/blob/c199f085473cfcd79014d0a022b5ce3fdd4863ec/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/Combine.java#L359
>> > >.
>> > >>> An Aggregator
>> > >>> <
>> >
>> https://github.com/apache/incubator-beam/blob/211e76abf9ba34c35ef13cca279cbeefdad7c406/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/Aggregator.java#L54
>> > >
>> > >>> takes a CombineFn and applies it continuously within a DoFn. So it's
>> > more
>> > >>> analogous to a 'counter'. You can see an example of aggregators in
>> > >>> DebuggingWordCount
>> > >>> <
>> >
>> https://github.com/apache/incubator-beam/blob/master/examples/src/main/java/com/google/cloud/dataflow/examples/DebuggingWordCount.java#L129
>> > >
>> > >>> .
>> > >>>
>> > >>> We never really used the term *aggregation *to refer to a general set
>> > of
>> > >>> PTransforms until we started describing things to the community. But
>> > it is
>> > >>> a useful word, so we've ended up in a bit of confusing state. Maybe
>> we
>> > >>> should consider renaming Aggregator? Something like "metric" might be
>> > >>> clearer.
>> > >>>
>> > >>>
>> >
>>


Re: Capability matrix question

2016-03-23 Thread Jean-Baptiste Onofré

+1 to Metric

Regards
JB

On 03/23/2016 09:56 PM, Dan Halperin wrote:
> +1 @Amit =>  -1 to Counter but +1 to Metric.
>
> On Wed, Mar 23, 2016 at 1:43 PM, Amit Sela wrote:
>
>> IMHO Counters just count..  Metrics measure things, so I think metrics
>> sounds better. Accumulators and Aggregators would have been good as well if
>> they weren't so overloaded.
>> That's just my thoughts here though..
>>
>> On Wed, Mar 23, 2016 at 10:38 PM Robert Bradshaw wrote:
>>
>>> +1 to renaming this. [P]Counter is another option.
>>>
>>> On Wed, Mar 23, 2016 at 9:12 AM, Kenneth Knowles wrote:
>>>
>>>> +1 to considering "metric" / PMetric / etc.
>>>>
>>>> On Wed, Mar 23, 2016 at 8:09 AM, Amit Sela wrote:
>>>>
>>>>> How about "PMetric" ?
>>>>>
>>>>> On Wed, Mar 23, 2016, 16:53 Frances Perry wrote:
>>>>>
>>>>>>> Perhaps I'm unclear on what an “Aggregator” is. I assumed that a line
>>>>>>> such as the following:
>>>>>>>
>>>>>>> PCollection> meanByName =
>>>>>>> dataPoints.apply(Mean.perKey());
>>>>>>>
>>>>>>> …would be considered an Aggregator, since it applies a mean aggregation
>>>>>>> over a window. Is that correct, with respect to the Beam terminology? If
>>>>>>> not, what would an example of an Aggregator be?
>>>>>>
>>>>>> Ah, we may have some slightly confusing terminology here.
>>>>>>
>>>>>> In that code snippet you are using a PTransform (Mean.perKey) to combine
>>>>>> a PCollection using the Mean CombineFn
>>>>>> <https://github.com/apache/incubator-beam/blob/c199f085473cfcd79014d0a022b5ce3fdd4863ec/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/Combine.java#L359>.
>>>>>> An Aggregator
>>>>>> <https://github.com/apache/incubator-beam/blob/211e76abf9ba34c35ef13cca279cbeefdad7c406/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/Aggregator.java#L54>
>>>>>> takes a CombineFn and applies it continuously within a DoFn. So it's more
>>>>>> analogous to a 'counter'. You can see an example of aggregators in
>>>>>> DebuggingWordCount
>>>>>> <https://github.com/apache/incubator-beam/blob/master/examples/src/main/java/com/google/cloud/dataflow/examples/DebuggingWordCount.java#L129>.
>>>>>>
>>>>>> We never really used the term *aggregation* to refer to a general set of
>>>>>> PTransforms until we started describing things to the community. But it is
>>>>>> a useful word, so we've ended up in a bit of confusing state. Maybe we
>>>>>> should consider renaming Aggregator? Something like "metric" might be
>>>>>> clearer.

--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


Re: Capability matrix question

2016-03-23 Thread Amit Sela
IMHO Counters just count. Metrics measure things, so I think "Metric"
sounds better. Accumulators and Aggregators would have been good as well if
they weren't so overloaded.
Those are just my thoughts, though.

On Wed, Mar 23, 2016 at 10:38 PM Robert Bradshaw
 wrote:

> +1 to renaming this. [P]Counter is another option.
>
> On Wed, Mar 23, 2016 at 9:12 AM, Kenneth Knowles 
> wrote:
> > +1 to considering "metric" / PMetric / etc.
> >
> > On Wed, Mar 23, 2016 at 8:09 AM, Amit Sela  wrote:
> >
> >> How about "PMetric" ?
> >>
> >> On Wed, Mar 23, 2016, 16:53 Frances Perry  wrote:
> >>
> >>>
> >>>> Perhaps I'm unclear on what an “Aggregator” is. I assumed that a line
> >>>> such as the following:
> >>>>
> >>>> PCollection> meanByName =
> >>>> dataPoints.apply(Mean.perKey());
> >>>>
> >>>> …would be considered an Aggregator, since it applies a mean aggregation
> >>>> over a window. Is that correct, with respect to the Beam terminology? If
> >>>> not, what would an example of an Aggregator be?
> >>>
> >>> Ah, we may have some slightly confusing terminology here.
> >>>
> >>> In that code snippet you are using a PTransform (Mean.perKey) to
> combine
> >>> a PCollection using the Mean CombineFn
> >>> <
> https://github.com/apache/incubator-beam/blob/c199f085473cfcd79014d0a022b5ce3fdd4863ec/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/Combine.java#L359
> >.
> >>> An Aggregator
> >>> <
> https://github.com/apache/incubator-beam/blob/211e76abf9ba34c35ef13cca279cbeefdad7c406/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/Aggregator.java#L54
> >
> >>> takes a CombineFn and applies it continuously within a DoFn. So it's
> more
> >>> analogous to a 'counter'. You can see an example of aggregators in
> >>> DebuggingWordCount
> >>> <
> https://github.com/apache/incubator-beam/blob/master/examples/src/main/java/com/google/cloud/dataflow/examples/DebuggingWordCount.java#L129
> >
> >>> .
> >>>
> >>> We never really used the term *aggregation *to refer to a general set
> of
> >>> PTransforms until we started describing things to the community. But
> it is
> >>> a useful word, so we've ended up in a bit of confusing state. Maybe we
> >>> should consider renaming Aggregator? Something like "metric" might be
> >>> clearer.
> >>>
> >>>
>


Re: Capability matrix question

2016-03-23 Thread Kenneth Knowles
+1 to considering "metric" / PMetric / etc.

On Wed, Mar 23, 2016 at 8:09 AM, Amit Sela  wrote:

> How about "PMetric" ?
>
> On Wed, Mar 23, 2016, 16:53 Frances Perry  wrote:
>
>>
>>> Perhaps I'm unclear on what an “Aggregator” is. I assumed that a line
>>> such as the following:
>>>
>>> PCollection> meanByName =
>>> dataPoints.apply(Mean.perKey());
>>>
>>> …would be considered an Aggregator, since it applies a mean aggregation
>>> over a window. Is that correct, with respect to the Beam terminology? If
>>> not, what would an example of an Aggregator be?
>>>
>> Ah, we may have some slightly confusing terminology here.
>>
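>> In that code snippet you are using a PTransform (Mean.perKey) to combine
>> a PCollection using the Mean CombineFn
>> <https://github.com/apache/incubator-beam/blob/c199f085473cfcd79014d0a022b5ce3fdd4863ec/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/Combine.java#L359>.
>> An Aggregator
>> <https://github.com/apache/incubator-beam/blob/211e76abf9ba34c35ef13cca279cbeefdad7c406/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/Aggregator.java#L54>
>> takes a CombineFn and applies it continuously within a DoFn. So it's more
>> analogous to a 'counter'. You can see an example of aggregators in
>> DebuggingWordCount
>> <https://github.com/apache/incubator-beam/blob/master/examples/src/main/java/com/google/cloud/dataflow/examples/DebuggingWordCount.java#L129>.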
>>
>> We never really used the term *aggregation *to refer to a general set of
>> PTransforms until we started describing things to the community. But it is
>> a useful word, so we've ended up in a bit of confusing state. Maybe we
>> should consider renaming Aggregator? Something like "metric" might be
>> clearer.
>>
>>
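
To make the distinction in the quoted text concrete, here is a minimal
sketch in plain Java. It does not use the SDK: meanPerKey stands in for a
per-key combining transform whose result is pipeline data, while the
Aggregator class stands in for a counter-like side value updated from
inside element processing. All class and method names below are
hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BinaryOperator;

/** Hypothetical sketch contrasting a per-key combine with an aggregator. */
final class AggregatorVsCombineSketch {

    /** Per-key mean: the result is itself a collection of (key, mean) pairs. */
    static Map<String, Double> meanPerKey(List<Map.Entry<String, Double>> dataPoints) {
        Map<String, double[]> accumulators = new TreeMap<>(); // value = {sum, count}
        for (Map.Entry<String, Double> point : dataPoints) {
            double[] acc = accumulators.computeIfAbsent(point.getKey(), k -> new double[2]);
            acc[0] += point.getValue();
            acc[1] += 1;
        }
        Map<String, Double> means = new TreeMap<>();
        for (Map.Entry<String, double[]> e : accumulators.entrySet()) {
            means.put(e.getKey(), e.getValue()[0] / e.getValue()[1]);
        }
        return means;
    }

    /** Counter-like value: folds inputs with a combining function as elements go by. */
    static final class Aggregator {
        private final BinaryOperator<Long> combiner;
        private long value;

        Aggregator(long initial, BinaryOperator<Long> combiner) {
            this.value = initial;
            this.combiner = combiner;
        }

        void addValue(long input) {
            value = combiner.apply(value, input);
        }

        long get() {
            return value;
        }
    }

    /** Element-wise processing: emits non-empty lines and counts empty ones on the side. */
    static List<String> dropEmptyLines(List<String> lines, Aggregator emptyLines) {
        List<String> output = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                emptyLines.addValue(1L); // side metric; does not change the output
            } else {
                output.add(line);
            }
        }
        return output;
    }
}
```

The mean is something downstream steps consume as data, whereas the
aggregator's value is only observed alongside the processing, which is
why "metric" reads as a clearer name for it.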


Re: Capability Matrix

2016-03-21 Thread Tyler Akidau
Thanks, all!

On Fri, Mar 18, 2016 at 6:46 AM Amit Sela <amitsel...@gmail.com> wrote:

> Looks great!
> I think it's the best way to give a clear picture of capabilities for users
> and runner developers.
> And as always, Love the colours ;)
>
>
> On Fri, Mar 18, 2016 at 3:33 PM Kostas Kloudas <
> k.klou...@data-artisans.com>
> wrote:
>
> > Great to have an overview of the available
> > runners and a comprehensible visualization of
> > the features each one supports!
> >
> > Kostas
> >
> > > On Mar 18, 2016, at 11:32 AM, Maximilian Michels <m...@apache.org>
> wrote:
> > >
> > > Well done. The matrix provides a good basis for improving the existing
> > > runners. Moreover, new backends can use it to evaluate capabilities
> > > for creating a runner.
> > >
> > > On Fri, Mar 18, 2016 at 1:15 AM, Jean-Baptiste Onofré <j...@nanthrax.net
> >
> > wrote:
> > >> Catcha, thanks !
> > >>
> > >> Regards
> > >> JB
> > >>
> > >>
> > >> On 03/18/2016 12:51 AM, Frances Perry wrote:
> > >>>
> > >>> That's "partially". Check out the full matrix for complete details:
> > >>> http://beam.incubator.apache.org/capability-matrix/
> > >>>
> > >>> On Thu, Mar 17, 2016 at 4:50 PM, Jean-Baptiste Onofré <
> j...@nanthrax.net
> > >
> > >>> wrote:
> > >>>
> > >>>> Great job !
> > >>>>
> > >>>> By the way, when you use ~ in the matrix, does it mean that it works
> > only
> > >>>> in some cases (depending of the pipeline or transform) or it doesn't
> > work
> > >>>> as expected ? Just curious for the Aggregators and the meaning in
> the
> > >>>> Beam
> > >>>> Model.
> > >>>>
> > >>>> Thanks,
> > >>>> Regards
> > >>>> JB
> > >>>>
> > >>>>
> > >>>> On 03/18/2016 12:45 AM, Tyler Akidau wrote:
> > >>>>
> > >>>>> Just pushed the capability matrix and an attendant blog post to the
> > >>>>> site:
> > >>>>>
> > >>>>> - Blog post:
> > >>>>>
> > >>>>>
> > >>>>>
> >
> http://beam.incubator.apache.org/beam/capability/2016/03/17/capability-matrix.html
> > >>>>> - Matrix: http://beam.incubator.apache.org/capability-matrix/
> > >>>>>
> > >>>>> For those of you that want to keep the matrix up to date as your
> > runner
> > >>>>> evolves, you'll want to make updates in the
> > _data/capability-matrix.yml
> > >>>>> file:
> > >>>>>
> > >>>>>
> > >>>>>
> >
> https://github.com/apache/incubator-beam-site/blob/asf-site/_data/capability-matrix.yml
> > >>>>>
> > >>>>> Thanks to everyone for helping fill out the initial set of
> > capabilities!
> > >>>>> Looking forward to updates as things progress. :-)
> > >>>>>
> > >>>>> And thanks also to Max for moving all the website stuff to git!
> > >>>>>
> > >>>>> -Tyler
> > >>>>>
> > >>>>>
> > >>>>> On Sat, Mar 12, 2016 at 9:37 AM Tyler Akidau <taki...@google.com>
> > wrote:
> > >>>>>
> > >>>>> Thanks all! At this point, it looks like most all of the fields
> have
> > >>>>> been
> > >>>>>>
> > >>>>>> filled out. I'm in the process of migrating the spreadsheet
> > contents to
> > >>>>>> YAML within the website source, so I've revoked edit access from
> the
> > >>>>>> doc
> > >>>>>> to
> > >>>>>> keep things from changing while I'm doing that. If you have
> further
> > >>>>>> edits
> > >>>>>> to make, feel free to leave a comment, and I'll incorporate it
> into
> > the
> > >>>>>> YAML.
> > >>>>>>
> > >>>>>> -Tyler
> > >>>>>>
> > >>>>>>
> > >>>>>> On Thu, Mar 10, 2016

Re: Capability Matrix

2016-03-20 Thread Tyler Akidau
Just pushed the capability matrix and an attendant blog post to the site:

   - Blog post:
   
http://beam.incubator.apache.org/beam/capability/2016/03/17/capability-matrix.html
   - Matrix: http://beam.incubator.apache.org/capability-matrix/

For those of you that want to keep the matrix up to date as your runner
evolves, you'll want to make updates in the _data/capability-matrix.yml
file:
https://github.com/apache/incubator-beam-site/blob/asf-site/_data/capability-matrix.yml

Thanks to everyone for helping fill out the initial set of capabilities!
Looking forward to updates as things progress. :-)

And thanks also to Max for moving all the website stuff to git!

-Tyler


On Sat, Mar 12, 2016 at 9:37 AM Tyler Akidau <taki...@google.com> wrote:

> Thanks all! At this point, it looks like most all of the fields have been
> filled out. I'm in the process of migrating the spreadsheet contents to
> YAML within the website source, so I've revoked edit access from the doc to
> keep things from changing while I'm doing that. If you have further edits
> to make, feel free to leave a comment, and I'll incorporate it into the
> YAML.
>
> -Tyler
>
>
> On Thu, Mar 10, 2016 at 12:43 AM Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
>
>> Hi Tyler,
>>
>> good idea !
>>
>> I like it !
>>
>> Regards
>> JB
>>
>> On 03/09/2016 11:14 PM, Tyler Akidau wrote:
>> > I just filed BEAM-104 <https://issues.apache.org/jira/browse/BEAM-104>
>> > regarding publishing a capability matrix on the Beam website. We've
>> seeded
>> > the spreadsheet linked there (
>> >
>> https://docs.google.com/spreadsheets/d/1OM077lZBARrtUi6g0X0O0PHaIbFKCD6v0djRefQRE1I/edit
>> > )
>> > with an initial proposed set of capabilities, as well as descriptions
>> for
>> > the model and Cloud Dataflow. If folks for other runners (currently
>> Flink
>> > and Spark) could please make sure their columns are filled out as well,
>> > it'd be much appreciated. Also let us know if there are capabilities you
>> > think we've missed.
>> >
>> > Our hope is to get this up and published soon, since we've been getting
>> a
>> > lot of questions regarding runner capabilities, portability, etc.
>> >
>> > -Tyler
>> >
>>
>> --
>> Jean-Baptiste Onofré
>> jbono...@apache.org
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>
>


Re: Capability Matrix

2016-03-19 Thread Jean-Baptiste Onofré

Great job !

By the way, when you use ~ in the matrix, does it mean that it works
only in some cases (depending on the pipeline or transform), or that it
doesn't work as expected? Just curious about the Aggregators and their
meaning in the Beam Model.


Thanks,
Regards
JB

On 03/18/2016 12:45 AM, Tyler Akidau wrote:
> Just pushed the capability matrix and an attendant blog post to the site:
>
> - Blog post:
>   http://beam.incubator.apache.org/beam/capability/2016/03/17/capability-matrix.html
> - Matrix: http://beam.incubator.apache.org/capability-matrix/
>
> For those of you that want to keep the matrix up to date as your runner
> evolves, you'll want to make updates in the _data/capability-matrix.yml
> file:
> https://github.com/apache/incubator-beam-site/blob/asf-site/_data/capability-matrix.yml
>
> Thanks to everyone for helping fill out the initial set of capabilities!
> Looking forward to updates as things progress. :-)
>
> And thanks also to Max for moving all the website stuff to git!
>
> -Tyler
>
>
> On Sat, Mar 12, 2016 at 9:37 AM Tyler Akidau <taki...@google.com> wrote:
>
>> Thanks all! At this point, it looks like most all of the fields have been
>> filled out. I'm in the process of migrating the spreadsheet contents to
>> YAML within the website source, so I've revoked edit access from the doc to
>> keep things from changing while I'm doing that. If you have further edits
>> to make, feel free to leave a comment, and I'll incorporate it into the
>> YAML.
>>
>> -Tyler
>>
>>
>> On Thu, Mar 10, 2016 at 12:43 AM Jean-Baptiste Onofré <j...@nanthrax.net>
>> wrote:
>>
>>> Hi Tyler,
>>>
>>> good idea !
>>>
>>> I like it !
>>>
>>> Regards
>>> JB
>>>
>>> On 03/09/2016 11:14 PM, Tyler Akidau wrote:
>>>> I just filed BEAM-104 <https://issues.apache.org/jira/browse/BEAM-104>
>>>> regarding publishing a capability matrix on the Beam website. We've seeded
>>>> the spreadsheet linked there (
>>>> https://docs.google.com/spreadsheets/d/1OM077lZBARrtUi6g0X0O0PHaIbFKCD6v0djRefQRE1I/edit
>>>> )
>>>> with an initial proposed set of capabilities, as well as descriptions for
>>>> the model and Cloud Dataflow. If folks for other runners (currently Flink
>>>> and Spark) could please make sure their columns are filled out as well,
>>>> it'd be much appreciated. Also let us know if there are capabilities you
>>>> think we've missed.
>>>>
>>>> Our hope is to get this up and published soon, since we've been getting a
>>>> lot of questions regarding runner capabilities, portability, etc.
>>>>
>>>> -Tyler
>>>
>>> --
>>> Jean-Baptiste Onofré
>>> jbono...@apache.org
>>> http://blog.nanthrax.net
>>> Talend - http://www.talend.com

--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


Re: Capability Matrix

2016-03-19 Thread Maximilian Michels
Well done. The matrix provides a good basis for improving the existing
runners. Moreover, new backends can use it to evaluate capabilities
for creating a runner.

On Fri, Mar 18, 2016 at 1:15 AM, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
> Catcha, thanks !
>
> Regards
> JB
>
>
> On 03/18/2016 12:51 AM, Frances Perry wrote:
>>
>> That's "partially". Check out the full matrix for complete details:
>> http://beam.incubator.apache.org/capability-matrix/
>>
>> On Thu, Mar 17, 2016 at 4:50 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
>> wrote:
>>
>>> Great job !
>>>
>>> By the way, when you use ~ in the matrix, does it mean that it works only
>>> in some cases (depending of the pipeline or transform) or it doesn't work
>>> as expected ? Just curious for the Aggregators and the meaning in the
>>> Beam
>>> Model.
>>>
>>> Thanks,
>>> Regards
>>> JB
>>>
>>>
>>> On 03/18/2016 12:45 AM, Tyler Akidau wrote:
>>>
>>>> Just pushed the capability matrix and an attendant blog post to the
>>>> site:
>>>>
>>>>  - Blog post:
>>>>
>>>>
>>>> http://beam.incubator.apache.org/beam/capability/2016/03/17/capability-matrix.html
>>>>  - Matrix: http://beam.incubator.apache.org/capability-matrix/
>>>>
>>>> For those of you that want to keep the matrix up to date as your runner
>>>> evolves, you'll want to make updates in the _data/capability-matrix.yml
>>>> file:
>>>>
>>>>
>>>> https://github.com/apache/incubator-beam-site/blob/asf-site/_data/capability-matrix.yml
>>>>
>>>> Thanks to everyone for helping fill out the initial set of capabilities!
>>>> Looking forward to updates as things progress. :-)
>>>>
>>>> And thanks also to Max for moving all the website stuff to git!
>>>>
>>>> -Tyler
>>>>
>>>>
>>>> On Sat, Mar 12, 2016 at 9:37 AM Tyler Akidau <taki...@google.com> wrote:
>>>>
>>>> Thanks all! At this point, it looks like most all of the fields have
>>>> been
>>>>>
>>>>> filled out. I'm in the process of migrating the spreadsheet contents to
>>>>> YAML within the website source, so I've revoked edit access from the
>>>>> doc
>>>>> to
>>>>> keep things from changing while I'm doing that. If you have further
>>>>> edits
>>>>> to make, feel free to leave a comment, and I'll incorporate it into the
>>>>> YAML.
>>>>>
>>>>> -Tyler
>>>>>
>>>>>
>>>>> On Thu, Mar 10, 2016 at 12:43 AM Jean-Baptiste Onofré <j...@nanthrax.net>
>>>>> wrote:
>>>>>
>>>>> Hi Tyler,
>>>>>>
>>>>>>
>>>>>> good idea !
>>>>>>
>>>>>> I like it !
>>>>>>
>>>>>> Regards
>>>>>> JB
>>>>>>
>>>>>> On 03/09/2016 11:14 PM, Tyler Akidau wrote:
>>>>>>
>>>>>>> I just filed BEAM-104
>>>>>>> <https://issues.apache.org/jira/browse/BEAM-104>
>>>>>>> regarding publishing a capability matrix on the Beam website. We've
>>>>>>>
>>>>>> seeded
>>>>>>
>>>>>>> the spreadsheet linked there (
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> https://docs.google.com/spreadsheets/d/1OM077lZBARrtUi6g0X0O0PHaIbFKCD6v0djRefQRE1I/edit
>>>>>>
>>>>>>> )
>>>>>>> with an initial proposed set of capabilities, as well as descriptions
>>>>>>>
>>>>>> for
>>>>>>
>>>>>>> the model and Cloud Dataflow. If folks for other runners (currently
>>>>>>>
>>>>>> Flink
>>>>>>
>>>>>>> and Spark) could please make sure their columns are filled out as
>>>>>>> well,
>>>>>>> it'd be much appreciated. Also let us know if there are capabilities
>>>>>>> you
>>>>>>> think we've missed.
>>>>>>>
>>>>>>> Our hope is to get this up and published soon, since we've been
>>>>>>> getting
>>>>>>>
>>>>>> a
>>>>>>
>>>>>>> lot of questions regarding runner capabilities, portability, etc.
>>>>>>>
>>>>>>> -Tyler
>>>>>>>
>>>>>>>
>>>>>> --
>>>>>> Jean-Baptiste Onofré
>>>>>> jbono...@apache.org
>>>>>> http://blog.nanthrax.net
>>>>>> Talend - http://www.talend.com
>>>>>>
>>>>>>
>>>>>
>>>>
>>> --
>>> Jean-Baptiste Onofré
>>> jbono...@apache.org
>>> http://blog.nanthrax.net
>>> Talend - http://www.talend.com
>>>
>>
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com