subject:"\[DISCUSS\] PTransform.named vs. named apply"

Re: [DISCUSS] PTransform.named vs. named apply

2016-06-23 Thread Dan Halperin

A little late... but yes! +1

On Wed, Jun 22, 2016 at 11:13 PM, Aljoscha Krettek 
wrote:

> ±1 for the named apply
>
> On Thu, Jun 23, 2016, 07:07 Robert Bradshaw 
> wrote:
>
> > +1, I think it makes more sense to name the application of a transform
> > rather than the transform itself. (Still mulling on how best to do
> > this with Python...)
> >
> > On Wed, Jun 22, 2016 at 9:27 PM, Jean-Baptiste Onofré 
> > wrote:
> > > +1
> > >
> > > Regards
> > > JB
> > >
> > >
> > > On 06/23/2016 12:17 AM, Ben Chambers wrote:
> > >>
> > >> Based on a recent PR (
> https://github.com/apache/incubator-beam/pull/468)
> > I
> > >> was reminded of the confusion around the use of
> > >> .apply(transform.named(someName)) and .apply(someName, transform).
> This
> > is
> > >> one of things I’ve wanted to cleanup for a while. I’d like to propose
> a
> > >> path towards removing this redundancy.
> > >>
> > >> First, some background -- why are there two ways to name things? When
> we
> > >> added support for updating existing pipelines, we needed all
> > applications
> > >> to have unique user-provided names to allow diff’ing the pipelines. We
> > >> found a few problems with the first approach -- using .named() to
> > create a
> > >> new transform -- which led to the introduction of the named apply:
> > >>
> > >> 1. When receiving an error about an application not having a name, it
> is
> > >> not obvious that a name should be given to the *transform*
> > >> 2. When using .named() to construct a new transform either the type
> > >> information is lost or the composite transform has to override
> .named()
> > >>
> > >> We now generally suggest the use of .apply(someName, transform). It is
> > >> easier to use and doesn’t lead to as much confusion around PTransform
> > >> names
> > >> and PTransform application names.
> > >>
> > >> To that end, I'd like to propose the following changes to the code and
> > >> documentation:
> > >> 1. Replace the usage of .named(name) in all examples and composites
> with
> > >> the named-apply syntax.
> > >> 2. Replace .named(name) with a protected PTransform constructor which
> > >> takes
> > >> a default name. If not provided, the default name will be derived from
> > the
> > >> class of the PTransform.
> > >> 3. Use the protected constructor in composites (where appropriate) to
> > >> ensure that the default application has a reasonable name.
> > >>
> > >> Users will benefit from having a single way of naming applications
> while
> > >> building a pipeline. Any breakages due to the removal of .named should
> > be
> > >> easily fixed by either using the named application or by passing the
> > name
> > >> to the constructor of a composite.
> > >>
> > >> I’d like to hear any comments or opinions on this topic from the wider
> > >> community. Please let me know what you think!
> > >>
> > >> -- Ben
> > >>
> > >
> > > --
> > > Jean-Baptiste Onofré
> > > jbono...@apache.org
> > > http://blog.nanthrax.net
> > > Talend - http://www.talend.com
> >
>

Re: [DISCUSS] PTransform.named vs. named apply

2016-06-23 Thread Ben Chambers

FYI: Created https://issues.apache.org/jira/browse/BEAM-370 to track
changes made in this direction.

On Wed, Jun 22, 2016 at 11:13 PM Aljoscha Krettek 
wrote:

> ±1 for the named apply
>
> On Thu, Jun 23, 2016, 07:07 Robert Bradshaw 
> wrote:
>
> > +1, I think it makes more sense to name the application of a transform
> > rather than the transform itself. (Still mulling on how best to do
> > this with Python...)
> >
> > On Wed, Jun 22, 2016 at 9:27 PM, Jean-Baptiste Onofré 
> > wrote:
> > > +1
> > >
> > > Regards
> > > JB
> > >
> > >
> > > On 06/23/2016 12:17 AM, Ben Chambers wrote:
> > >>
> > >> Based on a recent PR (
> https://github.com/apache/incubator-beam/pull/468)
> > I
> > >> was reminded of the confusion around the use of
> > >> .apply(transform.named(someName)) and .apply(someName, transform).
> This
> > is
> > >> one of things I’ve wanted to cleanup for a while. I’d like to propose
> a
> > >> path towards removing this redundancy.
> > >>
> > >> First, some background -- why are there two ways to name things? When
> we
> > >> added support for updating existing pipelines, we needed all
> > applications
> > >> to have unique user-provided names to allow diff’ing the pipelines. We
> > >> found a few problems with the first approach -- using .named() to
> > create a
> > >> new transform -- which led to the introduction of the named apply:
> > >>
> > >> 1. When receiving an error about an application not having a name, it
> is
> > >> not obvious that a name should be given to the *transform*
> > >> 2. When using .named() to construct a new transform either the type
> > >> information is lost or the composite transform has to override
> .named()
> > >>
> > >> We now generally suggest the use of .apply(someName, transform). It is
> > >> easier to use and doesn’t lead to as much confusion around PTransform
> > >> names
> > >> and PTransform application names.
> > >>
> > >> To that end, I'd like to propose the following changes to the code and
> > >> documentation:
> > >> 1. Replace the usage of .named(name) in all examples and composites
> with
> > >> the named-apply syntax.
> > >> 2. Replace .named(name) with a protected PTransform constructor which
> > >> takes
> > >> a default name. If not provided, the default name will be derived from
> > the
> > >> class of the PTransform.
> > >> 3. Use the protected constructor in composites (where appropriate) to
> > >> ensure that the default application has a reasonable name.
> > >>
> > >> Users will benefit from having a single way of naming applications
> while
> > >> building a pipeline. Any breakages due to the removal of .named should
> > be
> > >> easily fixed by either using the named application or by passing the
> > name
> > >> to the constructor of a composite.
> > >>
> > >> I’d like to hear any comments or opinions on this topic from the wider
> > >> community. Please let me know what you think!
> > >>
> > >> -- Ben
> > >>
> > >
> > > --
> > > Jean-Baptiste Onofré
> > > jbono...@apache.org
> > > http://blog.nanthrax.net
> > > Talend - http://www.talend.com
> >
>

Re: [DISCUSS] PTransform.named vs. named apply

2016-06-23 Thread Aljoscha Krettek

±1 for the named apply

On Thu, Jun 23, 2016, 07:07 Robert Bradshaw 
wrote:

> +1, I think it makes more sense to name the application of a transform
> rather than the transform itself. (Still mulling on how best to do
> this with Python...)
>
> On Wed, Jun 22, 2016 at 9:27 PM, Jean-Baptiste Onofré 
> wrote:
> > +1
> >
> > Regards
> > JB
> >
> >
> > On 06/23/2016 12:17 AM, Ben Chambers wrote:
> >>
> >> Based on a recent PR (https://github.com/apache/incubator-beam/pull/468)
> I
> >> was reminded of the confusion around the use of
> >> .apply(transform.named(someName)) and .apply(someName, transform). This
> is
> >> one of things I’ve wanted to cleanup for a while. I’d like to propose a
> >> path towards removing this redundancy.
> >>
> >> First, some background -- why are there two ways to name things? When we
> >> added support for updating existing pipelines, we needed all
> applications
> >> to have unique user-provided names to allow diff’ing the pipelines. We
> >> found a few problems with the first approach -- using .named() to
> create a
> >> new transform -- which led to the introduction of the named apply:
> >>
> >> 1. When receiving an error about an application not having a name, it is
> >> not obvious that a name should be given to the *transform*
> >> 2. When using .named() to construct a new transform either the type
> >> information is lost or the composite transform has to override .named()
> >>
> >> We now generally suggest the use of .apply(someName, transform). It is
> >> easier to use and doesn’t lead to as much confusion around PTransform
> >> names
> >> and PTransform application names.
> >>
> >> To that end, I'd like to propose the following changes to the code and
> >> documentation:
> >> 1. Replace the usage of .named(name) in all examples and composites with
> >> the named-apply syntax.
> >> 2. Replace .named(name) with a protected PTransform constructor which
> >> takes
> >> a default name. If not provided, the default name will be derived from
> the
> >> class of the PTransform.
> >> 3. Use the protected constructor in composites (where appropriate) to
> >> ensure that the default application has a reasonable name.
> >>
> >> Users will benefit from having a single way of naming applications while
> >> building a pipeline. Any breakages due to the removal of .named should
> be
> >> easily fixed by either using the named application or by passing the
> name
> >> to the constructor of a composite.
> >>
> >> I’d like to hear any comments or opinions on this topic from the wider
> >> community. Please let me know what you think!
> >>
> >> -- Ben
> >>
> >
> > --
> > Jean-Baptiste Onofré
> > jbono...@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
>

Re: [DISCUSS] PTransform.named vs. named apply

2016-06-22 Thread Amit Sela

+1

On Thu, Jun 23, 2016 at 7:27 AM Jean-Baptiste Onofré 
wrote:

> +1
>
> Regards
> JB
>
> On 06/23/2016 12:17 AM, Ben Chambers wrote:
> > Based on a recent PR (https://github.com/apache/incubator-beam/pull/468)
> I
> > was reminded of the confusion around the use of
> > .apply(transform.named(someName)) and .apply(someName, transform). This
> is
> > one of things I’ve wanted to cleanup for a while. I’d like to propose a
> > path towards removing this redundancy.
> >
> > First, some background -- why are there two ways to name things? When we
> > added support for updating existing pipelines, we needed all applications
> > to have unique user-provided names to allow diff’ing the pipelines. We
> > found a few problems with the first approach -- using .named() to create
> a
> > new transform -- which led to the introduction of the named apply:
> >
> > 1. When receiving an error about an application not having a name, it is
> > not obvious that a name should be given to the *transform*
> > 2. When using .named() to construct a new transform either the type
> > information is lost or the composite transform has to override .named()
> >
> > We now generally suggest the use of .apply(someName, transform). It is
> > easier to use and doesn’t lead to as much confusion around PTransform
> names
> > and PTransform application names.
> >
> > To that end, I'd like to propose the following changes to the code and
> > documentation:
> > 1. Replace the usage of .named(name) in all examples and composites with
> > the named-apply syntax.
> > 2. Replace .named(name) with a protected PTransform constructor which
> takes
> > a default name. If not provided, the default name will be derived from
> the
> > class of the PTransform.
> > 3. Use the protected constructor in composites (where appropriate) to
> > ensure that the default application has a reasonable name.
> >
> > Users will benefit from having a single way of naming applications while
> > building a pipeline. Any breakages due to the removal of .named should be
> > easily fixed by either using the named application or by passing the name
> > to the constructor of a composite.
> >
> > I’d like to hear any comments or opinions on this topic from the wider
> > community. Please let me know what you think!
> >
> > -- Ben
> >
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>

Re: [DISCUSS] PTransform.named vs. named apply

2016-06-22 Thread Jean-Baptiste Onofré


+1

Regards
JB

On 06/23/2016 12:17 AM, Ben Chambers wrote:

Based on a recent PR (https://github.com/apache/incubator-beam/pull/468) I
was reminded of the confusion around the use of
.apply(transform.named(someName)) and .apply(someName, transform). This is
one of things I’ve wanted to cleanup for a while. I’d like to propose a
path towards removing this redundancy.

First, some background -- why are there two ways to name things? When we
added support for updating existing pipelines, we needed all applications
to have unique user-provided names to allow diff’ing the pipelines. We
found a few problems with the first approach -- using .named() to create a
new transform -- which led to the introduction of the named apply:

1. When receiving an error about an application not having a name, it is
not obvious that a name should be given to the *transform*
2. When using .named() to construct a new transform either the type
information is lost or the composite transform has to override .named()

We now generally suggest the use of .apply(someName, transform). It is
easier to use and doesn’t lead to as much confusion around PTransform names
and PTransform application names.

To that end, I'd like to propose the following changes to the code and
documentation:
1. Replace the usage of .named(name) in all examples and composites with
the named-apply syntax.
2. Replace .named(name) with a protected PTransform constructor which takes
a default name. If not provided, the default name will be derived from the
class of the PTransform.
3. Use the protected constructor in composites (where appropriate) to
ensure that the default application has a reasonable name.

Users will benefit from having a single way of naming applications while
building a pipeline. Any breakages due to the removal of .named should be
easily fixed by either using the named application or by passing the name
to the constructor of a composite.

I’d like to hear any comments or opinions on this topic from the wider
community. Please let me know what you think!

-- Ben



--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

Re: [DISCUSS] PTransform.named vs. named apply

2016-06-22 Thread Lukasz Cwik

+1 on your proposed solution

On Wed, Jun 22, 2016 at 3:17 PM, Ben Chambers  wrote:

> Based on a recent PR (https://github.com/apache/incubator-beam/pull/468) I
> was reminded of the confusion around the use of
> .apply(transform.named(someName)) and .apply(someName, transform). This is
> one of things I’ve wanted to cleanup for a while. I’d like to propose a
> path towards removing this redundancy.
>
> First, some background -- why are there two ways to name things? When we
> added support for updating existing pipelines, we needed all applications
> to have unique user-provided names to allow diff’ing the pipelines. We
> found a few problems with the first approach -- using .named() to create a
> new transform -- which led to the introduction of the named apply:
>
> 1. When receiving an error about an application not having a name, it is
> not obvious that a name should be given to the *transform*
> 2. When using .named() to construct a new transform either the type
> information is lost or the composite transform has to override .named()
>
> We now generally suggest the use of .apply(someName, transform). It is
> easier to use and doesn’t lead to as much confusion around PTransform names
> and PTransform application names.
>
> To that end, I'd like to propose the following changes to the code and
> documentation:
> 1. Replace the usage of .named(name) in all examples and composites with
> the named-apply syntax.
> 2. Replace .named(name) with a protected PTransform constructor which takes
> a default name. If not provided, the default name will be derived from the
> class of the PTransform.
> 3. Use the protected constructor in composites (where appropriate) to
> ensure that the default application has a reasonable name.
>
> Users will benefit from having a single way of naming applications while
> building a pipeline. Any breakages due to the removal of .named should be
> easily fixed by either using the named application or by passing the name
> to the constructor of a composite.
>
> I’d like to hear any comments or opinions on this topic from the wider
> community. Please let me know what you think!
>
> -- Ben
>

Re: [DISCUSS] PTransform.named vs. named apply

Re: [DISCUSS] PTransform.named vs. named apply

Re: [DISCUSS] PTransform.named vs. named apply

Re: [DISCUSS] PTransform.named vs. named apply

Re: [DISCUSS] PTransform.named vs. named apply

Re: [DISCUSS] PTransform.named vs. named apply

6 matches

Site Navigation

Mail list logo

Footer information