Re: [DISCUSS] PTransform.named vs. named apply
A little late... but yes! +1 On Wed, Jun 22, 2016 at 11:13 PM, Aljoscha Krettekwrote: > ±1 for the named apply > > On Thu, Jun 23, 2016, 07:07 Robert Bradshaw > wrote: > > > +1, I think it makes more sense to name the application of a transform > > rather than the transform itself. (Still mulling on how best to do > > this with Python...) > > > > On Wed, Jun 22, 2016 at 9:27 PM, Jean-Baptiste Onofré > > wrote: > > > +1 > > > > > > Regards > > > JB > > > > > > > > > On 06/23/2016 12:17 AM, Ben Chambers wrote: > > >> > > >> Based on a recent PR ( > https://github.com/apache/incubator-beam/pull/468) > > I > > >> was reminded of the confusion around the use of > > >> .apply(transform.named(someName)) and .apply(someName, transform). > This > > is > > >> one of things I’ve wanted to cleanup for a while. I’d like to propose > a > > >> path towards removing this redundancy. > > >> > > >> First, some background -- why are there two ways to name things? When > we > > >> added support for updating existing pipelines, we needed all > > applications > > >> to have unique user-provided names to allow diff’ing the pipelines. We > > >> found a few problems with the first approach -- using .named() to > > create a > > >> new transform -- which led to the introduction of the named apply: > > >> > > >> 1. When receiving an error about an application not having a name, it > is > > >> not obvious that a name should be given to the *transform* > > >> 2. When using .named() to construct a new transform either the type > > >> information is lost or the composite transform has to override > .named() > > >> > > >> We now generally suggest the use of .apply(someName, transform). It is > > >> easier to use and doesn’t lead to as much confusion around PTransform > > >> names > > >> and PTransform application names. > > >> > > >> To that end, I'd like to propose the following changes to the code and > > >> documentation: > > >> 1. Replace the usage of .named(name) in all examples and composites > with > > >> the named-apply syntax. > > >> 2. Replace .named(name) with a protected PTransform constructor which > > >> takes > > >> a default name. If not provided, the default name will be derived from > > the > > >> class of the PTransform. > > >> 3. Use the protected constructor in composites (where appropriate) to > > >> ensure that the default application has a reasonable name. > > >> > > >> Users will benefit from having a single way of naming applications > while > > >> building a pipeline. Any breakages due to the removal of .named should > > be > > >> easily fixed by either using the named application or by passing the > > name > > >> to the constructor of a composite. > > >> > > >> I’d like to hear any comments or opinions on this topic from the wider > > >> community. Please let me know what you think! > > >> > > >> -- Ben > > >> > > > > > > -- > > > Jean-Baptiste Onofré > > > jbono...@apache.org > > > http://blog.nanthrax.net > > > Talend - http://www.talend.com > > >
Re: [DISCUSS] PTransform.named vs. named apply
FYI: Created https://issues.apache.org/jira/browse/BEAM-370 to track changes made in this direction. On Wed, Jun 22, 2016 at 11:13 PM Aljoscha Krettekwrote: > ±1 for the named apply > > On Thu, Jun 23, 2016, 07:07 Robert Bradshaw > wrote: > > > +1, I think it makes more sense to name the application of a transform > > rather than the transform itself. (Still mulling on how best to do > > this with Python...) > > > > On Wed, Jun 22, 2016 at 9:27 PM, Jean-Baptiste Onofré > > wrote: > > > +1 > > > > > > Regards > > > JB > > > > > > > > > On 06/23/2016 12:17 AM, Ben Chambers wrote: > > >> > > >> Based on a recent PR ( > https://github.com/apache/incubator-beam/pull/468) > > I > > >> was reminded of the confusion around the use of > > >> .apply(transform.named(someName)) and .apply(someName, transform). > This > > is > > >> one of things I’ve wanted to cleanup for a while. I’d like to propose > a > > >> path towards removing this redundancy. > > >> > > >> First, some background -- why are there two ways to name things? When > we > > >> added support for updating existing pipelines, we needed all > > applications > > >> to have unique user-provided names to allow diff’ing the pipelines. We > > >> found a few problems with the first approach -- using .named() to > > create a > > >> new transform -- which led to the introduction of the named apply: > > >> > > >> 1. When receiving an error about an application not having a name, it > is > > >> not obvious that a name should be given to the *transform* > > >> 2. When using .named() to construct a new transform either the type > > >> information is lost or the composite transform has to override > .named() > > >> > > >> We now generally suggest the use of .apply(someName, transform). It is > > >> easier to use and doesn’t lead to as much confusion around PTransform > > >> names > > >> and PTransform application names. > > >> > > >> To that end, I'd like to propose the following changes to the code and > > >> documentation: > > >> 1. Replace the usage of .named(name) in all examples and composites > with > > >> the named-apply syntax. > > >> 2. Replace .named(name) with a protected PTransform constructor which > > >> takes > > >> a default name. If not provided, the default name will be derived from > > the > > >> class of the PTransform. > > >> 3. Use the protected constructor in composites (where appropriate) to > > >> ensure that the default application has a reasonable name. > > >> > > >> Users will benefit from having a single way of naming applications > while > > >> building a pipeline. Any breakages due to the removal of .named should > > be > > >> easily fixed by either using the named application or by passing the > > name > > >> to the constructor of a composite. > > >> > > >> I’d like to hear any comments or opinions on this topic from the wider > > >> community. Please let me know what you think! > > >> > > >> -- Ben > > >> > > > > > > -- > > > Jean-Baptiste Onofré > > > jbono...@apache.org > > > http://blog.nanthrax.net > > > Talend - http://www.talend.com > > >
Re: [DISCUSS] PTransform.named vs. named apply
±1 for the named apply On Thu, Jun 23, 2016, 07:07 Robert Bradshawwrote: > +1, I think it makes more sense to name the application of a transform > rather than the transform itself. (Still mulling on how best to do > this with Python...) > > On Wed, Jun 22, 2016 at 9:27 PM, Jean-Baptiste Onofré > wrote: > > +1 > > > > Regards > > JB > > > > > > On 06/23/2016 12:17 AM, Ben Chambers wrote: > >> > >> Based on a recent PR (https://github.com/apache/incubator-beam/pull/468) > I > >> was reminded of the confusion around the use of > >> .apply(transform.named(someName)) and .apply(someName, transform). This > is > >> one of things I’ve wanted to cleanup for a while. I’d like to propose a > >> path towards removing this redundancy. > >> > >> First, some background -- why are there two ways to name things? When we > >> added support for updating existing pipelines, we needed all > applications > >> to have unique user-provided names to allow diff’ing the pipelines. We > >> found a few problems with the first approach -- using .named() to > create a > >> new transform -- which led to the introduction of the named apply: > >> > >> 1. When receiving an error about an application not having a name, it is > >> not obvious that a name should be given to the *transform* > >> 2. When using .named() to construct a new transform either the type > >> information is lost or the composite transform has to override .named() > >> > >> We now generally suggest the use of .apply(someName, transform). It is > >> easier to use and doesn’t lead to as much confusion around PTransform > >> names > >> and PTransform application names. > >> > >> To that end, I'd like to propose the following changes to the code and > >> documentation: > >> 1. Replace the usage of .named(name) in all examples and composites with > >> the named-apply syntax. > >> 2. Replace .named(name) with a protected PTransform constructor which > >> takes > >> a default name. If not provided, the default name will be derived from > the > >> class of the PTransform. > >> 3. Use the protected constructor in composites (where appropriate) to > >> ensure that the default application has a reasonable name. > >> > >> Users will benefit from having a single way of naming applications while > >> building a pipeline. Any breakages due to the removal of .named should > be > >> easily fixed by either using the named application or by passing the > name > >> to the constructor of a composite. > >> > >> I’d like to hear any comments or opinions on this topic from the wider > >> community. Please let me know what you think! > >> > >> -- Ben > >> > > > > -- > > Jean-Baptiste Onofré > > jbono...@apache.org > > http://blog.nanthrax.net > > Talend - http://www.talend.com >
Re: [DISCUSS] PTransform.named vs. named apply
+1 On Thu, Jun 23, 2016 at 7:27 AM Jean-Baptiste Onofréwrote: > +1 > > Regards > JB > > On 06/23/2016 12:17 AM, Ben Chambers wrote: > > Based on a recent PR (https://github.com/apache/incubator-beam/pull/468) > I > > was reminded of the confusion around the use of > > .apply(transform.named(someName)) and .apply(someName, transform). This > is > > one of things I’ve wanted to cleanup for a while. I’d like to propose a > > path towards removing this redundancy. > > > > First, some background -- why are there two ways to name things? When we > > added support for updating existing pipelines, we needed all applications > > to have unique user-provided names to allow diff’ing the pipelines. We > > found a few problems with the first approach -- using .named() to create > a > > new transform -- which led to the introduction of the named apply: > > > > 1. When receiving an error about an application not having a name, it is > > not obvious that a name should be given to the *transform* > > 2. When using .named() to construct a new transform either the type > > information is lost or the composite transform has to override .named() > > > > We now generally suggest the use of .apply(someName, transform). It is > > easier to use and doesn’t lead to as much confusion around PTransform > names > > and PTransform application names. > > > > To that end, I'd like to propose the following changes to the code and > > documentation: > > 1. Replace the usage of .named(name) in all examples and composites with > > the named-apply syntax. > > 2. Replace .named(name) with a protected PTransform constructor which > takes > > a default name. If not provided, the default name will be derived from > the > > class of the PTransform. > > 3. Use the protected constructor in composites (where appropriate) to > > ensure that the default application has a reasonable name. > > > > Users will benefit from having a single way of naming applications while > > building a pipeline. Any breakages due to the removal of .named should be > > easily fixed by either using the named application or by passing the name > > to the constructor of a composite. > > > > I’d like to hear any comments or opinions on this topic from the wider > > community. Please let me know what you think! > > > > -- Ben > > > > -- > Jean-Baptiste Onofré > jbono...@apache.org > http://blog.nanthrax.net > Talend - http://www.talend.com >
Re: [DISCUSS] PTransform.named vs. named apply
+1 Regards JB On 06/23/2016 12:17 AM, Ben Chambers wrote: Based on a recent PR (https://github.com/apache/incubator-beam/pull/468) I was reminded of the confusion around the use of .apply(transform.named(someName)) and .apply(someName, transform). This is one of things I’ve wanted to cleanup for a while. I’d like to propose a path towards removing this redundancy. First, some background -- why are there two ways to name things? When we added support for updating existing pipelines, we needed all applications to have unique user-provided names to allow diff’ing the pipelines. We found a few problems with the first approach -- using .named() to create a new transform -- which led to the introduction of the named apply: 1. When receiving an error about an application not having a name, it is not obvious that a name should be given to the *transform* 2. When using .named() to construct a new transform either the type information is lost or the composite transform has to override .named() We now generally suggest the use of .apply(someName, transform). It is easier to use and doesn’t lead to as much confusion around PTransform names and PTransform application names. To that end, I'd like to propose the following changes to the code and documentation: 1. Replace the usage of .named(name) in all examples and composites with the named-apply syntax. 2. Replace .named(name) with a protected PTransform constructor which takes a default name. If not provided, the default name will be derived from the class of the PTransform. 3. Use the protected constructor in composites (where appropriate) to ensure that the default application has a reasonable name. Users will benefit from having a single way of naming applications while building a pipeline. Any breakages due to the removal of .named should be easily fixed by either using the named application or by passing the name to the constructor of a composite. I’d like to hear any comments or opinions on this topic from the wider community. Please let me know what you think! -- Ben -- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend - http://www.talend.com
Re: [DISCUSS] PTransform.named vs. named apply
+1 on your proposed solution On Wed, Jun 22, 2016 at 3:17 PM, Ben Chamberswrote: > Based on a recent PR (https://github.com/apache/incubator-beam/pull/468) I > was reminded of the confusion around the use of > .apply(transform.named(someName)) and .apply(someName, transform). This is > one of things I’ve wanted to cleanup for a while. I’d like to propose a > path towards removing this redundancy. > > First, some background -- why are there two ways to name things? When we > added support for updating existing pipelines, we needed all applications > to have unique user-provided names to allow diff’ing the pipelines. We > found a few problems with the first approach -- using .named() to create a > new transform -- which led to the introduction of the named apply: > > 1. When receiving an error about an application not having a name, it is > not obvious that a name should be given to the *transform* > 2. When using .named() to construct a new transform either the type > information is lost or the composite transform has to override .named() > > We now generally suggest the use of .apply(someName, transform). It is > easier to use and doesn’t lead to as much confusion around PTransform names > and PTransform application names. > > To that end, I'd like to propose the following changes to the code and > documentation: > 1. Replace the usage of .named(name) in all examples and composites with > the named-apply syntax. > 2. Replace .named(name) with a protected PTransform constructor which takes > a default name. If not provided, the default name will be derived from the > class of the PTransform. > 3. Use the protected constructor in composites (where appropriate) to > ensure that the default application has a reasonable name. > > Users will benefit from having a single way of naming applications while > building a pipeline. Any breakages due to the removal of .named should be > easily fixed by either using the named application or by passing the name > to the constructor of a composite. > > I’d like to hear any comments or opinions on this topic from the wider > community. Please let me know what you think! > > -- Ben >