Re: [DISCUSS] Using Verbs for Transforms

2016-11-01 Thread Neelesh Salian
Will do. Feel free to chime in if I miss anything.

On Tue, Nov 1, 2016 at 9:16 PM, Jesse Anderson 
wrote:

> @Neelesh Could you write an email to the user list explaining the change
> since it is a breaking change?
>
> On Tue, Nov 1, 2016 at 10:08 PM Neelesh Salian 
> wrote:
>
> > Thanks everyone. The PR was merged.
> >
> >
> > On Thu, Oct 27, 2016 at 11:51 AM, Neelesh Salian 
> > wrote:
> >
> > > Thanks everyone for all the inputs.
> > > It's really encouraging for a new contributor, as myself, to get valuable
> > > input and mentoring (like on this thread) and, in turn, help make the
> > > community better.
> > >
> > >
> > >
> > > On Thu, Oct 27, 2016 at 11:41 AM, Jean-Baptiste Onofré <j...@nanthrax.net>
> > > wrote:
> > >
> > >> You did well ! It's an interesting discussion we have and it's great to
> > >> have it on the mailing list (better than in Jira or PR comments IMHO).
> > >>
> > >> Thanks !
> > >>
> > >> Regards
> > >> JB
> > >>
> > >>
> > >>
> > >> On Oct 27, 2016, 20:39, at 20:39, Robert Bradshaw
> > >>  wrote:
> > >> >+1 to all Dan says.
> > >> >
> > >> >I only brought this up because it seemed new contributors (yay)
> > >> >jumping in and renaming a core transform based on "Something to
> > >> >consider" deserved a couple more eyeballs, but didn't intend for
> > >> >it to become a big deal.
> > >> >
> > >> >On Thu, Oct 27, 2016 at 11:03 AM, Dan Halperin
> > >> > wrote:
> > >> >> Folks, I don't think this needs to be a "vote". This is just not that
> > >> >> big a deal :). It is important to be transparent and have these
> > >> >> discussions on the list, which is why we brought it here from
> > >> >> GitHub/JIRA, but at the end of the day I hope that a small group of
> > >> >> committers and developers can assess "good enough" consensus for these
> > >> >> minor issues.
> > >> >>
> > >> >> Here's my assessment:
> > >> >> * We don't really have any rules about naming transforms. "Should be a
> > >> >> verb" is a sort of guiding principle inherited from the Google Flume
> > >> >> project from which Dataflow evolved, but honestly we violate this rule
> > >> >> for clarity all over the place. ("Values", for example).
> > >> >> * The "Big Data" community is significantly more familiar with the
> > >> >> concept of Distinct -- Jesse, who filed the original JIRA, is a good
> > >> >> example here.
> > >> >> * Finally, nobody feels very strongly. We could argue minor points of
> > >> >> each solution, but at the end of the day I don't think anyone wants to
> > >> >> block a change.
> > >> >>
> > >> >> Let's go with Distinct. It's important to align Beam with the open
> > >> >> source big data community. (And thanks Jesse, our newest (*tied)
> > >> >> committer, for pushing us in the right direction!)
> > >> >>
> > >> >> Jesse, can you please take charge of wrapping up the PR and merging it?
> > >> >>
> > >> >> Thanks!
> > >> >> Dan
> > >> >>
> > >> >> On Wed, Oct 26, 2016 at 11:12 PM, Jean-Baptiste Onofré wrote:
> > >> >>
> > >> >>> Just to clarify. Davor is right for a code modification change: -1
> > >> >>> means a veto.
> > >> >>> I meant that -1 is not a veto for a release vote.
> > >> >>>
> > >> >>> Anyway, even if it's not a formal code, we can have a discussion with
> > >> >>> "options" a,b and c.
> > >> >>>
> > >> >>> Regards
> > >> >>> JB
> > >> >>>
> > >> >>>
> > >> >>>
> > >> >>> On Oct 27, 2016, 06:48, at 06:48, Davor Bonaci wrote:
> > >> >>> >In terms of reaching a decision on any code or design changes,
> > >> >>> >including this one, I'd suggest going without formal votes. Voting
> > >> >>> >process for code modifications between choices A and B doesn't
> > >> >>> >necessarily end with a decision A or B -- a single (qualified) -1 vote
> > >> >>> >is a veto and cannot be overridden [1]. Said differently, the guideline
> > >> >>> >is that code changes should be made by consensus; not by one group
> > >> >>> >outvoting another. I'd like to avoid setting such precedent; we should
> > >> >>> >try to drive consensus, as opposed to attempting to outvote another
> > >> >>> >part of the community.
> > >> >>> >
> > >> >>> >In this particular case, we have had a great discussion. Many
> > >> >>> >contributors brought different perspectives. Consequently, some
> > >> >>> >opinions have been likely changed. At this point, someone should
> > >> >>> >summarize the arguments, try to critique them from a neutral
> > >> >>> >standpoint, and suggest a refined proposal that takes these
> > >> >>> >perspectives into account. If nobody objects in a short time, we
> > >> >>> >should consider this decided. [ I can certainly help here, but I'd
> > >> >>> >love to see somebody else do 

Re: [DISCUSS] Using Verbs for Transforms

2016-11-01 Thread Jesse Anderson
@Neelesh Could you write an email to the user list explaining the change
since it is a breaking change?

On Tue, Nov 1, 2016 at 10:08 PM Neelesh Salian  wrote:

> Thanks everyone. The PR was merged.
>
>
> On Thu, Oct 27, 2016 at 11:51 AM, Neelesh Salian 
> wrote:
>
> > Thanks everyone for all the inputs.
> > It's really encouraging for a new contributor, as myself, to get valuable
> > input and mentoring (like on this thread) and, in turn, help make the
> > community better.
> >
> >
> >
> > On Thu, Oct 27, 2016 at 11:41 AM, Jean-Baptiste Onofré 
> > wrote:
> >
> >> You did well ! It's an interesting discussion we have and it's great to
> >> have it on the mailing list (better than in Jira or PR comments IMHO).
> >>
> >> Thanks !
> >>
> >> Regards
> >> JB
> >>
> >>
> >>
> >> On Oct 27, 2016, 20:39, at 20:39, Robert Bradshaw
> >>  wrote:
> >> >+1 to all Dan says.
> >> >
> >> >I only brought this up because it seemed new contributors (yay)
> >> >jumping in and renaming a core transform based on "Something to
> >> >consider" deserved a couple more eyeballs, but didn't intend for
> >> >it to become a big deal.
> >> >
> >> >On Thu, Oct 27, 2016 at 11:03 AM, Dan Halperin
> >> > wrote:
> >> >> Folks, I don't think this needs to be a "vote". This is just not that
> >> >big a
> >> >> deal :). It is important to be transparent and have these discussions
> >> >on
> >> >> the list, which is why we brought it here from GitHub/JIRA, but at
> >> >the end
> >> >> of the day I hope that a small group of committers and developers can
> >> >> assess "good enough" consensus for these minor issues.
> >> >>
> >> >> Here's my assessment:
> >> >> * We don't really have any rules about naming transforms. "Should be
> >> >a
> >> >> verb" is a sort of guiding principle inherited from the Google Flume
> >> >> project from which Dataflow evolved, but honestly we violate this
> >> >rule for
> >> >> clarity all over the place. ("Values", for example).
> >> >> * The "Big Data" community is significantly more familiar with the
> >> >concept
> >> >> of Distinct -- Jesse, who filed the original JIRA, is a good example
> >> >here.
> >> >> * Finally, nobody feels very strongly. We could argue minor points of
> >> >each
> >> >> solution, but at the end of the day I don't think anyone wants to
> >> >block a
> >> >> change.
> >> >>
> >> >> Let's go with Distinct. It's important to align Beam with the open
> >> >source
> >> >> big data community. (And thanks Jesse, our newest (*tied) committer,
> >> >for
> >> >> pushing us in the right direction!)
> >> >>
> >> >> Jesse, can you please take charge of wrapping up the PR and merging
> >> >it?
> >> >>
> >> >> Thanks!
> >> >> Dan
> >> >>
> >> >> On Wed, Oct 26, 2016 at 11:12 PM, Jean-Baptiste Onofré
> >> >
> >> >> wrote:
> >> >>
> >> >>> Just to clarify. Davor is right for a code modification change: -1
> >> >means a
> >> >>> veto.
> >> >>> I meant that -1 is not a veto for a release vote.
> >> >>>
> >> >>> Anyway, even if it's not a formal code, we can have a discussion
> >> >with
> >> >>> "options" a,b and c.
> >> >>>
> >> >>> Regards
> >> >>> JB
> >> >>>
> >> >>>
> >> >>>
> >> >>> On Oct 27, 2016, 06:48, at 06:48, Davor Bonaci
> >> >
> >> >>> wrote:
> >> >>> >In terms of reaching a decision on any code or design changes,
> >> >>> >including
> >> >>> >this one, I'd suggest going without formal votes. Voting process
> >> >for
> >> >>> >code
> >> >>> >modifications between choices A and B doesn't necessarily end with
> >> >a
> >> >>> >decision A or B -- a single (qualified) -1 vote is a veto and
> >> >cannot be
> >> >>> >overridden [1]. Said differently, the guideline is that code
> >> >changes
> >> >>> >should
> >> >>> >be made by consensus; not by one group outvoting another. I'd like
> >> >to
> >> >>> >avoid
> >> >>> >setting such precedent; we should try to drive consensus, as
> >> >opposed to
> >> >>> >attempting to outvote another part of the community.
> >> >>> >
> >> >>> >In this particular case, we have had a great discussion. Many
> >> >>> >contributors
> >> >>> >brought different perspectives. Consequently, some opinions have
> >> >been
> >> >>> >likely changed. At this point, someone should summarize the
> >> >arguments,
> >> >>> >try
> >> >>> >to critique them from a neutral standpoint, and suggest a refined
> >> >>> >proposal
> >> >>> >that takes these perspectives into account. If nobody objects in a
> >> >>> >short
> >> >>> >time, we should consider this decided. [ I can certainly help here,
> >> >but
> >> >>> >I'd
> >> >>> >love to see somebody else do it! ]
> >> >>> >
> >> >>> >[1] http://www.apache.org/foundation/voting.html
> >> >>> >
> >> >>> >On Wed, Oct 26, 2016 at 7:35 AM, Ben Chambers
> >> >>> >
> >> >>> >wrote:
> >> >>> >
> >> >>> >> I also like Distinct since it doesn't make it sound like it
> >> >modifies
> >> >>> >any
> >> >>> >> underlying collection. RemoveDuplicates makes it sound like the
> >> >>> >duplicates
> >> >>> >> are removed, rather than a new PCollect

Re: [DISCUSS] Using Verbs for Transforms

2016-11-01 Thread Neelesh Salian
Thanks everyone. The PR was merged.


On Thu, Oct 27, 2016 at 11:51 AM, Neelesh Salian 
wrote:

> Thanks everyone for all the inputs.
> It's really encouraging for a new contributor, as myself, to get valuable
> input and mentoring (like on this thread) and, in turn, help make the
> community better.
>
>
>
> On Thu, Oct 27, 2016 at 11:41 AM, Jean-Baptiste Onofré 
> wrote:
>
>> You did well ! It's an interesting discussion we have and it's great to
>> have it on the mailing list (better than in Jira or PR comments IMHO).
>>
>> Thanks !
>>
>> Regards
>> JB
>>
>>
>>
>> On Oct 27, 2016, 20:39, at 20:39, Robert Bradshaw
>>  wrote:
>> >+1 to all Dan says.
>> >
>> >I only brought this up because it seemed new contributors (yay)
>> >jumping in and renaming a core transform based on "Something to
>> >consider" deserved a couple more eyeballs, but didn't intend for
>> >it to become a big deal.
>> >
>> >On Thu, Oct 27, 2016 at 11:03 AM, Dan Halperin
>> > wrote:
>> >> Folks, I don't think this needs to be a "vote". This is just not that
>> >big a
>> >> deal :). It is important to be transparent and have these discussions
>> >on
>> >> the list, which is why we brought it here from GitHub/JIRA, but at
>> >the end
>> >> of the day I hope that a small group of committers and developers can
>> >> assess "good enough" consensus for these minor issues.
>> >>
>> >> Here's my assessment:
>> >> * We don't really have any rules about naming transforms. "Should be
>> >a
>> >> verb" is a sort of guiding principle inherited from the Google Flume
>> >> project from which Dataflow evolved, but honestly we violate this
>> >rule for
>> >> clarity all over the place. ("Values", for example).
>> >> * The "Big Data" community is significantly more familiar with the
>> >concept
>> >> of Distinct -- Jesse, who filed the original JIRA, is a good example
>> >here.
>> >> * Finally, nobody feels very strongly. We could argue minor points of
>> >each
>> >> solution, but at the end of the day I don't think anyone wants to
>> >block a
>> >> change.
>> >>
>> >> Let's go with Distinct. It's important to align Beam with the open
>> >source
>> >> big data community. (And thanks Jesse, our newest (*tied) committer,
>> >for
>> >> pushing us in the right direction!)
>> >>
>> >> Jesse, can you please take charge of wrapping up the PR and merging
>> >it?
>> >>
>> >> Thanks!
>> >> Dan
>> >>
>> >> On Wed, Oct 26, 2016 at 11:12 PM, Jean-Baptiste Onofré
>> >
>> >> wrote:
>> >>
>> >>> Just to clarify. Davor is right for a code modification change: -1
>> >means a
>> >>> veto.
>> >>> I meant that -1 is not a veto for a release vote.
>> >>>
>> >>> Anyway, even if it's not a formal code, we can have a discussion
>> >with
>> >>> "options" a,b and c.
>> >>>
>> >>> Regards
>> >>> JB
>> >>>
>> >>>
>> >>>
>> >>> On Oct 27, 2016, 06:48, at 06:48, Davor Bonaci
>> >
>> >>> wrote:
>> >>> >In terms of reaching a decision on any code or design changes,
>> >>> >including
>> >>> >this one, I'd suggest going without formal votes. Voting process
>> >for
>> >>> >code
>> >>> >modifications between choices A and B doesn't necessarily end with
>> >a
>> >>> >decision A or B -- a single (qualified) -1 vote is a veto and
>> >cannot be
>> >>> >overridden [1]. Said differently, the guideline is that code
>> >changes
>> >>> >should
>> >>> >be made by consensus; not by one group outvoting another. I'd like
>> >to
>> >>> >avoid
>> >>> >setting such precedent; we should try to drive consensus, as
>> >opposed to
>> >>> >attempting to outvote another part of the community.
>> >>> >
>> >>> >In this particular case, we have had a great discussion. Many
>> >>> >contributors
>> >>> >brought different perspectives. Consequently, some opinions have
>> >been
>> >>> >likely changed. At this point, someone should summarize the
>> >arguments,
>> >>> >try
>> >>> >to critique them from a neutral standpoint, and suggest a refined
>> >>> >proposal
>> >>> >that takes these perspectives into account. If nobody objects in a
>> >>> >short
>> >>> >time, we should consider this decided. [ I can certainly help here,
>> >but
>> >>> >I'd
>> >>> >love to see somebody else do it! ]
>> >>> >
>> >>> >[1] http://www.apache.org/foundation/voting.html
>> >>> >
>> >>> >On Wed, Oct 26, 2016 at 7:35 AM, Ben Chambers
>> >>> >
>> >>> >wrote:
>> >>> >
>> >>> >> I also like Distinct since it doesn't make it sound like it
>> >modifies
>> >>> >any
>> >>> >> underlying collection. RemoveDuplicates makes it sound like the
>> >>> >duplicates
>> >>> >> are removed, rather than a new PCollection without duplicates
>> >being
>> >>> >> returned.
>> >>> >>
>> >>> >> On Wed, Oct 26, 2016, 7:36 AM Jean-Baptiste Onofré
>> >
>> >>> >> wrote:
>> >>> >>
>> >>> >> > Agree. It was more a transition proposal.
>> >>> >> >
>> >>> >> > Regards
>> >>> >> > JB
>> >>> >> >
>> >>> >> >
>> >>> >> >
>> >>> >> > On Oct 26, 2016, 08:31, at 08:31, Robert Bradshaw
>> >>> >> >  wrote:
>> >>> >> > >On Mon, Oct 24, 2016 at 11:02 PM, Jean-Baptiste Onofré

Re: [DISCUSS] Using Verbs for Transforms

2016-10-27 Thread Neelesh Salian
Thanks everyone for all the input.
It's really encouraging for a new contributor like myself to get valuable
input and mentoring (like on this thread) and, in turn, help make the
community better.



On Thu, Oct 27, 2016 at 11:41 AM, Jean-Baptiste Onofré 
wrote:

> You did well ! It's an interesting discussion we have and it's great to
> have it on the mailing list (better than in Jira or PR comments IMHO).
>
> Thanks !
>
> Regards
> JB
>
>
>
> On Oct 27, 2016, 20:39, at 20:39, Robert Bradshaw
>  wrote:
> >+1 to all Dan says.
> >
> >I only brought this up because it seemed new contributors (yay)
> >jumping in and renaming a core transform based on "Something to
> >consider" deserved a couple more eyeballs, but didn't intend for
> >it to become a big deal.
> >
> >On Thu, Oct 27, 2016 at 11:03 AM, Dan Halperin
> > wrote:
> >> Folks, I don't think this needs to be a "vote". This is just not that
> >big a
> >> deal :). It is important to be transparent and have these discussions
> >on
> >> the list, which is why we brought it here from GitHub/JIRA, but at
> >the end
> >> of the day I hope that a small group of committers and developers can
> >> assess "good enough" consensus for these minor issues.
> >>
> >> Here's my assessment:
> >> * We don't really have any rules about naming transforms. "Should be
> >a
> >> verb" is a sort of guiding principle inherited from the Google Flume
> >> project from which Dataflow evolved, but honestly we violate this
> >rule for
> >> clarity all over the place. ("Values", for example).
> >> * The "Big Data" community is significantly more familiar with the
> >concept
> >> of Distinct -- Jesse, who filed the original JIRA, is a good example
> >here.
> >> * Finally, nobody feels very strongly. We could argue minor points of
> >each
> >> solution, but at the end of the day I don't think anyone wants to
> >block a
> >> change.
> >>
> >> Let's go with Distinct. It's important to align Beam with the open
> >source
> >> big data community. (And thanks Jesse, our newest (*tied) committer,
> >for
> >> pushing us in the right direction!)
> >>
> >> Jesse, can you please take charge of wrapping up the PR and merging
> >it?
> >>
> >> Thanks!
> >> Dan
> >>
> >> On Wed, Oct 26, 2016 at 11:12 PM, Jean-Baptiste Onofré
> >
> >> wrote:
> >>
> >>> Just to clarify. Davor is right for a code modification change: -1
> >means a
> >>> veto.
> >>> I meant that -1 is not a veto for a release vote.
> >>>
> >>> Anyway, even if it's not a formal code, we can have a discussion
> >with
> >>> "options" a,b and c.
> >>>
> >>> Regards
> >>> JB
> >>>
> >>>
> >>>
> >>> On Oct 27, 2016, 06:48, at 06:48, Davor Bonaci
> >
> >>> wrote:
> >>> >In terms of reaching a decision on any code or design changes,
> >>> >including
> >>> >this one, I'd suggest going without formal votes. Voting process
> >for
> >>> >code
> >>> >modifications between choices A and B doesn't necessarily end with
> >a
> >>> >decision A or B -- a single (qualified) -1 vote is a veto and
> >cannot be
> >>> >overridden [1]. Said differently, the guideline is that code
> >changes
> >>> >should
> >>> >be made by consensus; not by one group outvoting another. I'd like
> >to
> >>> >avoid
> >>> >setting such precedent; we should try to drive consensus, as
> >opposed to
> >>> >attempting to outvote another part of the community.
> >>> >
> >>> >In this particular case, we have had a great discussion. Many
> >>> >contributors
> >>> >brought different perspectives. Consequently, some opinions have
> >been
> >>> >likely changed. At this point, someone should summarize the
> >arguments,
> >>> >try
> >>> >to critique them from a neutral standpoint, and suggest a refined
> >>> >proposal
> >>> >that takes these perspectives into account. If nobody objects in a
> >>> >short
> >>> >time, we should consider this decided. [ I can certainly help here,
> >but
> >>> >I'd
> >>> >love to see somebody else do it! ]
> >>> >
> >>> >[1] http://www.apache.org/foundation/voting.html
> >>> >
> >>> >On Wed, Oct 26, 2016 at 7:35 AM, Ben Chambers
> >>> >
> >>> >wrote:
> >>> >
> >>> >> I also like Distinct since it doesn't make it sound like it
> >modifies
> >>> >any
> >>> >> underlying collection. RemoveDuplicates makes it sound like the
> >>> >duplicates
> >>> >> are removed, rather than a new PCollection without duplicates
> >being
> >>> >> returned.
> >>> >>
> >>> >> On Wed, Oct 26, 2016, 7:36 AM Jean-Baptiste Onofré
> >
> >>> >> wrote:
> >>> >>
> >>> >> > Agree. It was more a transition proposal.
> >>> >> >
> >>> >> > Regards
> >>> >> > JB
> >>> >> >
> >>> >> >
> >>> >> >
> >>> >> > On Oct 26, 2016, 08:31, at 08:31, Robert Bradshaw
> >>> >> >  wrote:
> >>> >> > >On Mon, Oct 24, 2016 at 11:02 PM, Jean-Baptiste Onofré
> >>> >> > > wrote:
> >>> >> > >> And what about use RemoveDuplicates and create an alias
> >Distinct
> >>> >?
> >>> >> > >
> >>> >> > >I'd really like to avoid (long term) aliases--you end up
> >having to
> >>> >> > >document (and maintain) them both, an

Re: [DISCUSS] Using Verbs for Transforms

2016-10-27 Thread Jean-Baptiste Onofré
You did well! It's an interesting discussion, and it's great to have it
on the mailing list (better than in Jira or PR comments IMHO).

Thanks!

Regards
JB

On Oct 27, 2016, 20:39, at 20:39, Robert Bradshaw  
wrote:
>+1 to all Dan says.
>
>I only brought this up because it seemed new contributors (yay)
>jumping in and renaming a core transform based on "Something to
>consider" deserved a couple more eyeballs, but didn't intend for
>it to become a big deal.
>
>On Thu, Oct 27, 2016 at 11:03 AM, Dan Halperin
> wrote:
>> Folks, I don't think this needs to be a "vote". This is just not that
>big a
>> deal :). It is important to be transparent and have these discussions
>on
>> the list, which is why we brought it here from GitHub/JIRA, but at
>the end
>> of the day I hope that a small group of committers and developers can
>> assess "good enough" consensus for these minor issues.
>>
>> Here's my assessment:
>> * We don't really have any rules about naming transforms. "Should be
>a
>> verb" is a sort of guiding principle inherited from the Google Flume
>> project from which Dataflow evolved, but honestly we violate this
>rule for
>> clarity all over the place. ("Values", for example).
>> * The "Big Data" community is significantly more familiar with the
>concept
>> of Distinct -- Jesse, who filed the original JIRA, is a good example
>here.
>> * Finally, nobody feels very strongly. We could argue minor points of
>each
>> solution, but at the end of the day I don't think anyone wants to
>block a
>> change.
>>
>> Let's go with Distinct. It's important to align Beam with the open
>source
>> big data community. (And thanks Jesse, our newest (*tied) committer,
>for
>> pushing us in the right direction!)
>>
>> Jesse, can you please take charge of wrapping up the PR and merging
>it?
>>
>> Thanks!
>> Dan
>>
>> On Wed, Oct 26, 2016 at 11:12 PM, Jean-Baptiste Onofré
>
>> wrote:
>>
>>> Just to clarify. Davor is right for a code modification change: -1
>means a
>>> veto.
>>> I meant that -1 is not a veto for a release vote.
>>>
>>> Anyway, even if it's not a formal code, we can have a discussion
>with
>>> "options" a,b and c.
>>>
>>> Regards
>>> JB
>>>
>>>
>>>
>>> On Oct 27, 2016, 06:48, at 06:48, Davor Bonaci
>
>>> wrote:
>>> >In terms of reaching a decision on any code or design changes,
>>> >including
>>> >this one, I'd suggest going without formal votes. Voting process
>for
>>> >code
>>> >modifications between choices A and B doesn't necessarily end with
>a
>>> >decision A or B -- a single (qualified) -1 vote is a veto and
>cannot be
>>> >overridden [1]. Said differently, the guideline is that code
>changes
>>> >should
>>> >be made by consensus; not by one group outvoting another. I'd like
>to
>>> >avoid
>>> >setting such precedent; we should try to drive consensus, as
>opposed to
>>> >attempting to outvote another part of the community.
>>> >
>>> >In this particular case, we have had a great discussion. Many
>>> >contributors
>>> >brought different perspectives. Consequently, some opinions have
>been
>>> >likely changed. At this point, someone should summarize the
>arguments,
>>> >try
>>> >to critique them from a neutral standpoint, and suggest a refined
>>> >proposal
>>> >that takes these perspectives into account. If nobody objects in a
>>> >short
>>> >time, we should consider this decided. [ I can certainly help here,
>but
>>> >I'd
>>> >love to see somebody else do it! ]
>>> >
>>> >[1] http://www.apache.org/foundation/voting.html
>>> >
>>> >On Wed, Oct 26, 2016 at 7:35 AM, Ben Chambers
>>> >
>>> >wrote:
>>> >
>>> >> I also like Distinct since it doesn't make it sound like it
>modifies
>>> >any
>>> >> underlying collection. RemoveDuplicates makes it sound like the
>>> >duplicates
>>> >> are removed, rather than a new PCollection without duplicates
>being
>>> >> returned.
>>> >>
>>> >> On Wed, Oct 26, 2016, 7:36 AM Jean-Baptiste Onofré
>
>>> >> wrote:
>>> >>
>>> >> > Agree. It was more a transition proposal.
>>> >> >
>>> >> > Regards
>>> >> > JB
>>> >> >
>>> >> >
>>> >> >
>>> >> > On Oct 26, 2016, 08:31, at 08:31, Robert Bradshaw
>>> >> >  wrote:
>>> >> > >On Mon, Oct 24, 2016 at 11:02 PM, Jean-Baptiste Onofré
>>> >> > > wrote:
>>> >> > >> And what about use RemoveDuplicates and create an alias
>Distinct
>>> >?
>>> >> > >
>>> >> > >I'd really like to avoid (long term) aliases--you end up
>having to
>>> >> > >document (and maintain) them both, and it adds confusion as to
>>> >which
>>> >> > >one to use (especially if they ever diverge), and means
>searching
>>> >for
>>> >> > >one or the other yields half the results.
>>> >> > >
>>> >> > >> It doesn't break the API and would address both SQL users
>and
>>> >more
>>> >> > >"big data" users.
>>> >> > >>
>>> >> > >> My $0.01 ;)
>>> >> > >>
>>> >> > >> Regards
>>> >> > >> JB
>>> >> > >>
>>> >> > >>
>>> >> > >>
>>> >> > >> On Oct 24, 2016, 22:23, at 22:23, Dan Halperin
>>> >> > > wrote:
>>> >> > >>>I find "MakeDistinct" more confusing. My votes in dec

Re: [DISCUSS] Using Verbs for Transforms

2016-10-27 Thread Robert Bradshaw
+1 to all Dan says.

I only brought this up because it seemed that new contributors (yay)
jumping in and renaming a core transform based on "Something to
consider" deserved a couple more eyeballs, but I didn't intend for
it to become a big deal.

On Thu, Oct 27, 2016 at 11:03 AM, Dan Halperin
 wrote:
> Folks, I don't think this needs to be a "vote". This is just not that big a
> deal :). It is important to be transparent and have these discussions on
> the list, which is why we brought it here from GitHub/JIRA, but at the end
> of the day I hope that a small group of committers and developers can
> assess "good enough" consensus for these minor issues.
>
> Here's my assessment:
> * We don't really have any rules about naming transforms. "Should be a
> verb" is a sort of guiding principle inherited from the Google Flume
> project from which Dataflow evolved, but honestly we violate this rule for
> clarity all over the place. ("Values", for example).
> * The "Big Data" community is significantly more familiar with the concept
> of Distinct -- Jesse, who filed the original JIRA, is a good example here.
> * Finally, nobody feels very strongly. We could argue minor points of each
> solution, but at the end of the day I don't think anyone wants to block a
> change.
>
> Let's go with Distinct. It's important to align Beam with the open source
> big data community. (And thanks Jesse, our newest (*tied) committer, for
> pushing us in the right direction!)
>
> Jesse, can you please take charge of wrapping up the PR and merging it?
>
> Thanks!
> Dan
>
> On Wed, Oct 26, 2016 at 11:12 PM, Jean-Baptiste Onofré 
> wrote:
>
>> Just to clarify. Davor is right for a code modification change: -1 means a
>> veto.
>> I meant that -1 is not a veto for a release vote.
>>
>> Anyway, even if it's not a formal code, we can have a discussion with
>> "options" a,b and c.
>>
>> Regards
>> JB
>>
>>
>>
>> On Oct 27, 2016, 06:48, at 06:48, Davor Bonaci 
>> wrote:
>> >In terms of reaching a decision on any code or design changes,
>> >including this one, I'd suggest going without formal votes. Voting
>> >process for code modifications between choices A and B doesn't
>> >necessarily end with a decision A or B -- a single (qualified) -1 vote
>> >is a veto and cannot be overridden [1]. Said differently, the guideline
>> >is that code changes should be made by consensus; not by one group
>> >outvoting another. I'd like to avoid setting such precedent; we should
>> >try to drive consensus, as opposed to attempting to outvote another
>> >part of the community.
>> >
>> >In this particular case, we have had a great discussion. Many
>> >contributors brought different perspectives. Consequently, some
>> >opinions have been likely changed. At this point, someone should
>> >summarize the arguments, try to critique them from a neutral
>> >standpoint, and suggest a refined proposal that takes these
>> >perspectives into account. If nobody objects in a short time, we
>> >should consider this decided. [ I can certainly help here, but I'd
>> >love to see somebody else do it! ]
>> >
>> >[1] http://www.apache.org/foundation/voting.html
>> >
>> >On Wed, Oct 26, 2016 at 7:35 AM, Ben Chambers wrote:
>> >
>> >> I also like Distinct since it doesn't make it sound like it modifies
>> >> any underlying collection. RemoveDuplicates makes it sound like the
>> >> duplicates are removed, rather than a new PCollection without
>> >> duplicates being returned.
>> >>
>> >> On Wed, Oct 26, 2016, 7:36 AM Jean-Baptiste Onofré 
>> >> wrote:
>> >>
>> >> > Agree. It was more a transition proposal.
>> >> >
>> >> > Regards
>> >> > JB
>> >> >
>> >> >
>> >> >
>> >> > On Oct 26, 2016, 08:31, at 08:31, Robert Bradshaw
>> >> >  wrote:
>> >> > >On Mon, Oct 24, 2016 at 11:02 PM, Jean-Baptiste Onofré wrote:
>> >> > >> And what about use RemoveDuplicates and create an alias Distinct?
>> >> > >
>> >> > >I'd really like to avoid (long term) aliases--you end up having to
>> >> > >document (and maintain) them both, and it adds confusion as to which
>> >> > >one to use (especially if they ever diverge), and means searching for
>> >> > >one or the other yields half the results.
>> >> > >
>> >> > >> It doesn't break the API and would address both SQL users and more
>> >> > >> "big data" users.
>> >> > >>
>> >> > >> My $0.01 ;)
>> >> > >>
>> >> > >> Regards
>> >> > >> JB
>> >> > >>
>> >> > >>
>> >> > >>
>> >> > >> On Oct 24, 2016, 22:23, at 22:23, Dan Halperin wrote:
>> >> > >>>I find "MakeDistinct" more confusing. My votes in decreasing
>> >> > >>>preference:
>> >> > >>>
>> >> > >>>1. Keep `RemoveDuplicates` name, ensure that important keywords are in
>> >> > >>>the Javadoc. This reduces churn on our users and is honestly pretty
>> >> > >>>dang descriptive.
>> >> > >>>2. Rename to `Distinct`, which is clear if you're a SQL user and likely
>> >> > >>>less clear otherwise. This

Re: [DISCUSS] Using Verbs for Transforms

2016-10-27 Thread Jean-Baptiste Onofré
It sounds good to me.

So basically you did kind of vote by proposing a solution ;)

Regards
JB

On Oct 27, 2016, 20:04, at 20:04, Dan Halperin  
wrote:
>Folks, I don't think this needs to be a "vote". This is just not that
>big a
>deal :). It is important to be transparent and have these discussions
>on
>the list, which is why we brought it here from GitHub/JIRA, but at the
>end
>of the day I hope that a small group of committers and developers can
>assess "good enough" consensus for these minor issues.
>
>Here's my assessment:
>* We don't really have any rules about naming transforms. "Should be a
>verb" is a sort of guiding principle inherited from the Google Flume
>project from which Dataflow evolved, but honestly we violate this rule
>for
>clarity all over the place. ("Values", for example).
>* The "Big Data" community is significantly more familiar with the
>concept
>of Distinct -- Jesse, who filed the original JIRA, is a good example
>here.
>* Finally, nobody feels very strongly. We could argue minor points of
>each
>solution, but at the end of the day I don't think anyone wants to block
>a
>change.
>
>Let's go with Distinct. It's important to align Beam with the open
>source
>big data community. (And thanks Jesse, our newest (*tied) committer,
>for
>pushing us in the right direction!)
>
>Jesse, can you please take charge of wrapping up the PR and merging it?
>
>Thanks!
>Dan

Re: [DISCUSS] Using Verbs for Transforms

2016-10-27 Thread Jesse Anderson
Sure

On Thu, Oct 27, 2016, 8:04 PM Dan Halperin wrote:

> Folks, I don't think this needs to be a "vote". This is just not that big a
> deal :). It is important to be transparent and have these discussions on
> the list, which is why we brought it here from GitHub/JIRA, but at the end
> of the day I hope that a small group of committers and developers can
> assess "good enough" consensus for these minor issues.
>
> Here's my assessment:
> * We don't really have any rules about naming transforms. "Should be a
> verb" is a sort of guiding principle inherited from the Google Flume
> project from which Dataflow evolved, but honestly we violate this rule for
> clarity all over the place. ("Values", for example).
> * The "Big Data" community is significantly more familiar with the concept
> of Distinct -- Jesse, who filed the original JIRA, is a good example here.
> * Finally, nobody feels very strongly. We could argue minor points of each
> solution, but at the end of the day I don't think anyone wants to block a
> change.
>
> Let's go with Distinct. It's important to align Beam with the open source
> big data community. (And thanks Jesse, our newest (*tied) committer, for
> pushing us in the right direction!)
>
> Jesse, can you please take charge of wrapping up the PR and merging it?
>
> Thanks!
> Dan

Re: [DISCUSS] Using Verbs for Transforms

2016-10-27 Thread Dan Halperin
Folks, I don't think this needs to be a "vote". This is just not that big a
deal :). It is important to be transparent and have these discussions on
the list, which is why we brought it here from GitHub/JIRA, but at the end
of the day I hope that a small group of committers and developers can
assess "good enough" consensus for these minor issues.

Here's my assessment:
* We don't really have any rules about naming transforms. "Should be a
verb" is a sort of guiding principle inherited from the Google Flume
project from which Dataflow evolved, but honestly we violate this rule for
clarity all over the place. ("Values", for example).
* The "Big Data" community is significantly more familiar with the concept
of Distinct -- Jesse, who filed the original JIRA, is a good example here.
* Finally, nobody feels very strongly. We could argue minor points of each
solution, but at the end of the day I don't think anyone wants to block a
change.

Let's go with Distinct. It's important to align Beam with the open source
big data community. (And thanks Jesse, our newest (*tied) committer, for
pushing us in the right direction!)

Jesse, can you please take charge of wrapping up the PR and merging it?

Thanks!
Dan

On Wed, Oct 26, 2016 at 11:12 PM, Jean-Baptiste Onofré wrote:

> Just to clarify. Davor is right for a code modification change: -1 means a
> veto.
> I meant that -1 is not a veto for a release vote.
>
> Anyway, even if it's not a formal vote, we can have a discussion with
> "options" a, b and c.
>
> Regards
> JB

Re: [DISCUSS] Using Verbs for Transforms

2016-10-26 Thread Jean-Baptiste Onofré
Just to clarify. Davor is right for a code modification change: -1 means a veto.
I meant that -1 is not a veto for a release vote.

Anyway, even if it's not a formal vote, we can have a discussion with "options"
a, b and c.

Regards
JB

On Oct 27, 2016, 06:48, at 06:48, Davor Bonaci wrote:
>In terms of reaching a decision on any code or design changes, including
>this one, I'd suggest going without formal votes. Voting process for code
>modifications between choices A and B doesn't necessarily end with a
>decision A or B -- a single (qualified) -1 vote is a veto and cannot be
>overridden [1]. Said differently, the guideline is that code changes should
>be made by consensus; not by one group outvoting another. I'd like to avoid
>setting such precedent; we should try to drive consensus, as opposed to
>attempting to outvote another part of the community.
>
>In this particular case, we have had a great discussion. Many contributors
>brought different perspectives. Consequently, some opinions have been
>likely changed. At this point, someone should summarize the arguments, try
>to critique them from a neutral standpoint, and suggest a refined proposal
>that takes these perspectives into account. If nobody objects in a short
>time, we should consider this decided. [ I can certainly help here, but I'd
>love to see somebody else do it! ]
>
>[1] http://www.apache.org/foundation/voting.html

Re: [DISCUSS] Using Verbs for Transforms

2016-10-26 Thread Jean-Baptiste Onofré
A -1 vote doesn't necessarily mean a veto. For instance it's not really 
possible to veto a release vote.

Anyway, whether we call it a vote or a discussion, I think a formal summary of the
different proposed approaches is a good thing.

My $0.01 ;)

Regards
JB

On Oct 27, 2016, 06:48, at 06:48, Davor Bonaci wrote:
>In terms of reaching a decision on any code or design changes, including
>this one, I'd suggest going without formal votes. Voting process for code
>modifications between choices A and B doesn't necessarily end with a
>decision A or B -- a single (qualified) -1 vote is a veto and cannot be
>overridden [1]. Said differently, the guideline is that code changes should
>be made by consensus; not by one group outvoting another. I'd like to avoid
>setting such precedent; we should try to drive consensus, as opposed to
>attempting to outvote another part of the community.
>
>In this particular case, we have had a great discussion. Many contributors
>brought different perspectives. Consequently, some opinions have been
>likely changed. At this point, someone should summarize the arguments, try
>to critique them from a neutral standpoint, and suggest a refined proposal
>that takes these perspectives into account. If nobody objects in a short
>time, we should consider this decided. [ I can certainly help here, but I'd
>love to see somebody else do it! ]
>
>[1] http://www.apache.org/foundation/voting.html

Re: [DISCUSS] Using Verbs for Transforms

2016-10-26 Thread Davor Bonaci
In terms of reaching a decision on any code or design changes, including
this one, I'd suggest going without formal votes. Voting process for code
modifications between choices A and B doesn't necessarily end with a
decision A or B -- a single (qualified) -1 vote is a veto and cannot be
overridden [1]. Said differently, the guideline is that code changes should
be made by consensus; not by one group outvoting another. I'd like to avoid
setting such a precedent; we should try to drive consensus, as opposed to
attempting to outvote another part of the community.

In this particular case, we have had a great discussion. Many contributors
brought different perspectives. Consequently, some opinions have likely
changed. At this point, someone should summarize the arguments, try
to critique them from a neutral standpoint, and suggest a refined proposal
that takes these perspectives into account. If nobody objects in a short
time, we should consider this decided. [ I can certainly help here, but I'd
love to see somebody else do it! ]

[1] http://www.apache.org/foundation/voting.html

On Wed, Oct 26, 2016 at 7:35 AM, Ben Chambers wrote:

> I also like Distinct since it doesn't make it sound like it modifies any
> underlying collection. RemoveDuplicates makes it sound like the duplicates
> are removed, rather than a new PCollection without duplicates being
> returned.

Re: [DISCUSS] Using Verbs for Transforms

2016-10-26 Thread Ben Chambers
I also like Distinct since it doesn't make it sound like it modifies any
underlying collection. RemoveDuplicates makes it sound like the duplicates
are removed, rather than a new PCollection without duplicates being
returned.
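
To make that concrete, here is a minimal sketch in plain Java. It uses only
java.util and is not Beam's API; the class name DistinctSketch and the helper
distinct() are hypothetical names chosen for illustration. The point is only
that a Distinct-style operation leaves its input alone and hands back a new
collection, much as a transform applied to an immutable PCollection does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; this is not a Beam transform.
class DistinctSketch {

    // Returns a NEW list with the first occurrence of each element,
    // in encounter order. The input list is never modified.
    static <T> List<T> distinct(List<T> input) {
        Set<T> seen = new LinkedHashSet<>(input);
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        // An unmodifiable input makes it clear nothing is "removed" in place.
        List<String> words =
            Collections.unmodifiableList(Arrays.asList("a", "b", "a", "c", "b"));

        List<String> unique = distinct(words);

        System.out.println(words);   // prints [a, b, a, c, b]
        System.out.println(unique);  // prints [a, b, c]
    }
}
```

A name like "RemoveDuplicates" could suggest that `words` itself shrinks; in
this sketch it cannot, because the input is unmodifiable and only the returned
list is deduplicated.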

On Wed, Oct 26, 2016, 7:36 AM Jean-Baptiste Onofré wrote:

> Agree. It was more a transition proposal.
>
> Regards
> JB
>
> On Oct 26, 2016, 08:31, at 08:31, Robert Bradshaw wrote:
> >On Mon, Oct 24, 2016 at 11:02 PM, Jean-Baptiste Onofré wrote:
> >> And what about using RemoveDuplicates and creating an alias Distinct?
> >
> >I'd really like to avoid (long term) aliases--you end up having to
> >document (and maintain) them both, and it adds confusion as to which
> >one to use (especially if they ever diverge), and means searching for
> >one or the other yields half the results.
> >
> >> It doesn't break the API and would address both SQL users and more
> >> "big data" users.
> >>
> >> My $0.01 ;)
> >>
> >> Regards
> >> JB
> >>
> >> On Oct 24, 2016, 22:23, at 22:23, Dan Halperin wrote:
> >>>I find "MakeDistinct" more confusing. My votes in decreasing
> >>>preference:
> >>>
> >>>1. Keep `RemoveDuplicates` name, ensure that important keywords are in
> >>>the Javadoc. This reduces churn on our users and is honestly pretty dang
> >>>descriptive.
> >>>2. Rename to `Distinct`, which is clear if you're a SQL user and likely
> >>>less clear otherwise. This is a backwards-incompatible API change, so we
> >>>should do it before we go stable.
> >>>
> >>>I am not super strong that 1 > 2, but I am very strong that
> >>>"Distinct" >>> "MakeDistinct" and "RemoveDuplicates" >>> "AvoidDuplicate".
> >>>
> >>>Dan
> >>>
> >>>On Mon, Oct 24, 2016 at 10:12 AM, Kenneth Knowles wrote:
> >>>
> >>>> The precedent that we use verbs has many exceptions. We have
> >>>> ApproximateQuantiles, Values, Keys, WithTimestamps, and I would even
> >>>> include Sum (at least when I read it).
> >>>>
> >>>> Historical note: the predilection towards verbs is from the Google Style
> >>>> Guide for Java method names
> >>>> <...2.3-method-names>,
> >>>> which states "Method names are typically verbs or verb phrases". But even
> >>>> in Google code there are lots of exceptions when it makes sense, like
> >>>> Guava's
> >>>> Iterables.any(), Iterables.all(), Iterables.toArray(), the entire
> >>>> Predicates module, etc. Just an aside; Beam isn't Google code. I suggest we
> >>>> use our judgment rather than a policy.
> >>>>
> >>>> I think "Distinct" is one of those exceptions. It is a standard widespread
> >>>> name and also reads better as an adjective. I prefer it, but also don't
> >>>> care strongly enough to change it or to change it back :-)
> >>>>
> >>>> If we must have a verb, I like it as-is more than MakeDistinct and
> >>>> AvoidDuplicate.
> >>>>
> >>>> On Mon, Oct 24, 2016 at 9:46 AM Jesse Anderson wrote:
> >>>>
> >>>> > My original thought for this change was that Crunch uses the class name
> >>>> > Distinct. SQL also uses the keyword distinct.
> >>>> >
> >>>> > Maybe the rule should be changed to adjectives or verbs depending on the
> >>>> > context.
> >>>> >
> >>>> > Using a verb to describe this class really doesn't connote what the class
> >>>> > does as succinctly as the adjective.
> >>>> >
> >>>> > On Mon, Oct 24, 2016 at 9:40 AM Neelesh Salian wrote:
> >>>> >
> >>>> > > Hello,
> >>>> > >
> >>>> > > First of all, thank you to Daniel, Robert and Jesse for their review on
> >>>> > > this: https://issues.apache.org/jira/browse/BEAM-239
> >>>> > >
> >>>> > > A point that came up was using verbs explicitly for Transforms.
> >>>> > > Here is the PR: https://github.com/apache/incubator-beam/pull/1164
> >>>> > >
> >>>> > > Posting it to help understand if we have a consensus for it and if yes, we
> >>>> > > could perhaps document it for future changes.
> >>>> > >
> >>>> > > Thank you.
> >>>> > >
> >>>> > > --
> >>>> > > Neelesh Srinivas Salian
> >>>> > > Engineer
> >>>> > >
> >>>> >
> >>>>
>


Re: [DISCUSS] Using Verbs for Transforms

2016-10-25 Thread Jean-Baptiste Onofré
Agree. It was more a transition proposal.

Regards
JB

On Oct 26, 2016, 08:31, at 08:31, Robert Bradshaw  
wrote:
>On Mon, Oct 24, 2016 at 11:02 PM, Jean-Baptiste Onofré
> wrote:
>> And what about using RemoveDuplicates and creating an alias Distinct?
>
>I'd really like to avoid (long term) aliases--you end up having to
>document (and maintain) them both, and it adds confusion as to which
>one to use (especially if they ever diverge), and means searching for
>one or the other yields half the results.
>
>> It doesn't break the API and would address both SQL users and more
>"big data" users.
>>
>> My $0.01 ;)
>>
>> Regards
>> JB
>>
>> ⁣
>>
>> On Oct 24, 2016, 22:23, at 22:23, Dan Halperin
> wrote:
>>>I find "MakeDistinct" more confusing. My votes in decreasing
>>>preference:
>>>
>>>1. Keep `RemoveDuplicates` name, ensure that important keywords are
>in
>>>the
>>>Javadoc. This reduces churn on our users and is honestly pretty dang
>>> descriptive.
>>>2. Rename to `Distinct`, which is clear if you're a SQL user and
>likely
>>>less clear otherwise. This is a backwards-incompatible API change, so
>>>we
>>>should do it before we go stable.
>>>
>>>I am not super strong that 1 > 2, but I am very strong that
>>>"Distinct" >>> "MakeDistinct" and "RemoveDuplicates" >>> "AvoidDuplicate".
>>>
>>>Dan
>>>
>>>On Mon, Oct 24, 2016 at 10:12 AM, Kenneth Knowles
>>>
>>>wrote:
>>>
 The precedent that we use verbs has many exceptions. We have
 ApproximateQuantiles, Values, Keys, WithTimestamps, and I would
>even
 include Sum (at least when I read it).

 Historical note: the predilection towards verbs is from the Google
>>>Style
 Guide for Java method names

>>>,
 which states "Method names are typically verbs or verb phrases".
>But
>>>even
 in Google code there are lots of exceptions when it makes sense,
>like
 Guava's
 Iterables.any(), Iterables.all(), Iterables.toArray(), the entire
 Predicates module, etc. Just an aside; Beam isn't Google code. I
>>>suggest we
 use our judgment rather than a policy.

 I think "Distinct" is one of those exceptions. It is a standard
>>>widespread
 name and also reads better as an adjective. I prefer it, but also
>>>don't
 care strongly enough to change it or to change it back :-)

 If we must have a verb, I like it as-is more than MakeDistinct and
 AvoidDuplicate.

 On Mon, Oct 24, 2016 at 9:46 AM Jesse Anderson
>>>
 wrote:

 > My original thought for this change was that Crunch uses the
>class
>>>name
 > Distinct. SQL also uses the keyword distinct.
 >
 > Maybe the rule should be changed to adjectives or verbs depending
>>>on the
 > context.
 >
 > Using a verb to describe this class really doesn't connote what
>the
>>>class
 > does as succinctly as the adjective.
 >
 > On Mon, Oct 24, 2016 at 9:40 AM Neelesh Salian
>>>
 > wrote:
 >
 > > Hello,
 > >
 > > First of all, thank you to Daniel, Robert and Jesse for their
>>>review on
 > > this: https://issues.apache.org/jira/browse/BEAM-239
 > >
 > > A point that came up was using verbs explicitly for Transforms.
 > > Here is the PR:
>>>https://github.com/apache/incubator-beam/pull/1164
 > >
 > > Posting it to help understand if we have a consensus for it and
>>>if yes,
 > we
 > > could perhaps document it for future changes.
 > >
 > > Thank you.
 > >
 > > --
 > > Neelesh Srinivas Salian
 > > Engineer
 > >
 >



Re: [DISCUSS] Using Verbs for Transforms

2016-10-25 Thread Robert Bradshaw
On Mon, Oct 24, 2016 at 11:02 PM, Jean-Baptiste Onofré  
wrote:
> And what about using RemoveDuplicates and creating an alias Distinct?

I'd really like to avoid (long term) aliases--you end up having to
document (and maintain) them both, and it adds confusion as to which
one to use (especially if they ever diverge), and means searching for
one or the other yields half the results.
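
To make the maintenance cost concrete, here is a minimal sketch of what
such an alias could look like. Everything in it is hypothetical:
AliasSketch and its nested RemoveDuplicates and Distinct classes are
stand-ins that work on plain in-memory lists, not the actual Beam
transforms.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are not Beam classes.
final class AliasSketch {

  /** The original name: drops repeated elements, keeping first occurrences in order. */
  static final class RemoveDuplicates {
    static <T> List<T> apply(List<T> input) {
      // LinkedHashSet removes duplicates while preserving insertion order.
      return new ArrayList<>(new LinkedHashSet<>(input));
    }
  }

  /** The alias: pure delegation, yet it still needs its own docs and tests. */
  static final class Distinct {
    static <T> List<T> apply(List<T> input) {
      return RemoveDuplicates.apply(input);
    }
  }

  public static void main(String[] args) {
    List<Integer> input = List.of(1, 2, 2, 3, 1);
    System.out.println(RemoveDuplicates.apply(input)); // prints [1, 2, 3]
    System.out.println(Distinct.apply(input));         // prints [1, 2, 3]
  }
}
```

Even in this toy form there are two entry points to document and keep
in sync, and callers end up split between the two names.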

> It doesn't break the API and would address both SQL users and more "big data" 
> users.
>
> My $0.01 ;)
>
> Regards
> JB
>
> ⁣
>
> On Oct 24, 2016, 22:23, at 22:23, Dan Halperin  
> wrote:
>>I find "MakeDistinct" more confusing. My votes in decreasing
>>preference:
>>
>>1. Keep `RemoveDuplicates` name, ensure that important keywords are in
>>the
>>Javadoc. This reduces churn on our users and is honestly pretty dang
>> descriptive.
>>2. Rename to `Distinct`, which is clear if you're a SQL user and likely
>>less clear otherwise. This is a backwards-incompatible API change, so
>>we
>>should do it before we go stable.
>>
>>I am not super strong that 1 > 2, but I am very strong that
>>"Distinct" >>> "MakeDistinct" and "RemoveDuplicates" >>> "AvoidDuplicate".
>>
>>Dan
>>
>>On Mon, Oct 24, 2016 at 10:12 AM, Kenneth Knowles
>>
>>wrote:
>>
>>> The precedent that we use verbs has many exceptions. We have
>>> ApproximateQuantiles, Values, Keys, WithTimestamps, and I would even
>>> include Sum (at least when I read it).
>>>
>>> Historical note: the predilection towards verbs is from the Google
>>Style
>>> Guide for Java method names
>>>
>>,
>>> which states "Method names are typically verbs or verb phrases". But
>>even
>>> in Google code there are lots of exceptions when it makes sense, like
>>> Guava's
>>> Iterables.any(), Iterables.all(), Iterables.toArray(), the entire
>>> Predicates module, etc. Just an aside; Beam isn't Google code. I
>>suggest we
>>> use our judgment rather than a policy.
>>>
>>> I think "Distinct" is one of those exceptions. It is a standard
>>widespread
>>> name and also reads better as an adjective. I prefer it, but also
>>don't
>>> care strongly enough to change it or to change it back :-)
>>>
>>> If we must have a verb, I like it as-is more than MakeDistinct and
>>> AvoidDuplicate.
>>>
>>> On Mon, Oct 24, 2016 at 9:46 AM Jesse Anderson
>>
>>> wrote:
>>>
>>> > My original thought for this change was that Crunch uses the class
>>name
>>> > Distinct. SQL also uses the keyword distinct.
>>> >
>>> > Maybe the rule should be changed to adjectives or verbs depending
>>on the
>>> > context.
>>> >
>>> > Using a verb to describe this class really doesn't connote what the
>>class
>>> > does as succinctly as the adjective.
>>> >
>>> > On Mon, Oct 24, 2016 at 9:40 AM Neelesh Salian
>>
>>> > wrote:
>>> >
>>> > > Hello,
>>> > >
>>> > > First of all, thank you to Daniel, Robert and Jesse for their
>>review on
>>> > > this: https://issues.apache.org/jira/browse/BEAM-239
>>> > >
>>> > > A point that came up was using verbs explicitly for Transforms.
>>> > > Here is the PR:
>>https://github.com/apache/incubator-beam/pull/1164
>>> > >
>>> > > Posting it to help understand if we have a consensus for it and
>>if yes,
>>> > we
>>> > > could perhaps document it for future changes.
>>> > >
>>> > > Thank you.
>>> > >
>>> > > --
>>> > > Neelesh Srinivas Salian
>>> > > Engineer
>>> > >
>>> >
>>>


Re: [DISCUSS] Using Verbs for Transforms

2016-10-25 Thread Jesse Anderson
A recap of options for RemoveDuplicates:

   - Leave the name as is and update the JavaDocs
   - Rename to Distinct
   - Rename to MakeDistinct
   - Rename to Deduplicate



On Wed, Oct 26, 2016 at 8:10 AM Jean-Baptiste Onofré 
wrote:

> OK. No problem.
>
> Regards
> JB
>
> ⁣​
>
> On Oct 26, 2016, 07:56, at 07:56, Kenneth Knowles 
> wrote:
> >To be clear: I am not saying that I think the discussion has concluded.
> >I
> >think we should give some more time for different time zone rotations
> >to
> >occur. I just meant to say that if it does come to a vote, I'd prefer
> >to
> >keep it focused rather than generalizing.
> >
> >On Tue, Oct 25, 2016 at 10:51 PM Kenneth Knowles 
> >wrote:
> >
> >> I'd prefer to keep the vote focused on this rename, not a general
> >policy.
> >>
> >> On Tue, Oct 25, 2016 at 10:26 PM Jean-Baptiste Onofré
> >
> >> wrote:
> >>
> >> Yes I would start a formal vote with the three proposals: descriptive
> >> verb, adjective, verbs + adjective.
> >>
> >> Regards
> >> JB
> >>
> >> ⁣​
> >>
> >> On Oct 26, 2016, 07:16, at 07:16, Jesse Anderson
> >
> >> wrote:
> >> >We need to make a decision on this so Neelesh can finish his commit.
> >> >Should
> >> >we take a vote or something?
> >> >
> >> >On Tue, Oct 25, 2016, 7:55 AM Jean-Baptiste Onofré 
> >> >wrote:
> >> >
> >> >> Sounds good to me.
> >> >>
> >> >> ⁣​
> >> >>
> >> >> On Oct 24, 2016, 19:11, at 19:11, je...@smokinghand.com wrote:
> >> >> >I prefer MakeDistinct if we have to make it a verb.
> >> >>
> >>
> >>
>


Re: [DISCUSS] Using Verbs for Transforms

2016-10-25 Thread Jean-Baptiste Onofré
OK. No problem.

Regards
JB

On Oct 26, 2016, 07:56, at 07:56, Kenneth Knowles  
wrote:
>To be clear: I am not saying that I think the discussion has concluded.
>I
>think we should give some more time for different time zone rotations
>to
>occur. I just meant to say that if it does come to a vote, I'd prefer
>to
>keep it focused rather than generalizing.
>
>On Tue, Oct 25, 2016 at 10:51 PM Kenneth Knowles 
>wrote:
>
>> I'd prefer to keep the vote focused on this rename, not a general
>policy.
>>
>> On Tue, Oct 25, 2016 at 10:26 PM Jean-Baptiste Onofré
>
>> wrote:
>>
>> Yes I would start a formal vote with the three proposals: descriptive
>> verb, adjective, verbs + adjective.
>>
>> Regards
>> JB
>>
>> ⁣​
>>
>> On Oct 26, 2016, 07:16, at 07:16, Jesse Anderson
>
>> wrote:
>> >We need to make a decision on this so Neelesh can finish his commit.
>> >Should
>> >we take a vote or something?
>> >
>> >On Tue, Oct 25, 2016, 7:55 AM Jean-Baptiste Onofré 
>> >wrote:
>> >
>> >> Sounds good to me.
>> >>
>> >> ⁣​
>> >>
>> >> On Oct 24, 2016, 19:11, at 19:11, je...@smokinghand.com wrote:
>> >> >I prefer MakeDistinct if we have to make it a verb.
>> >>
>>
>>


Re: [DISCUSS] Using Verbs for Transforms

2016-10-25 Thread Jean-Baptiste Onofré
And what about using RemoveDuplicates and creating an alias Distinct?

It doesn't break the API and would address both SQL users and more "big data" 
users.

My $0.01 ;)

Regards
JB

On Oct 24, 2016, 22:23, at 22:23, Dan Halperin  
wrote:
>I find "MakeDistinct" more confusing. My votes in decreasing
>preference:
>
>1. Keep `RemoveDuplicates` name, ensure that important keywords are in
>the
>Javadoc. This reduces churn on our users and is honestly pretty dang
> descriptive.
>2. Rename to `Distinct`, which is clear if you're a SQL user and likely
>less clear otherwise. This is a backwards-incompatible API change, so
>we
>should do it before we go stable.
>
>I am not super strong that 1 > 2, but I am very strong that
>"Distinct" >>> "MakeDistinct" and "RemoveDuplicates" >>> "AvoidDuplicate".
>
>Dan
>
>On Mon, Oct 24, 2016 at 10:12 AM, Kenneth Knowles
>
>wrote:
>
>> The precedent that we use verbs has many exceptions. We have
>> ApproximateQuantiles, Values, Keys, WithTimestamps, and I would even
>> include Sum (at least when I read it).
>>
>> Historical note: the predilection towards verbs is from the Google
>Style
>> Guide for Java method names
>>
>,
>> which states "Method names are typically verbs or verb phrases". But
>even
>> in Google code there are lots of exceptions when it makes sense, like
>> Guava's
>> Iterables.any(), Iterables.all(), Iterables.toArray(), the entire
>> Predicates module, etc. Just an aside; Beam isn't Google code. I
>suggest we
>> use our judgment rather than a policy.
>>
>> I think "Distinct" is one of those exceptions. It is a standard
>widespread
>> name and also reads better as an adjective. I prefer it, but also
>don't
>> care strongly enough to change it or to change it back :-)
>>
>> If we must have a verb, I like it as-is more than MakeDistinct and
>> AvoidDuplicate.
>>
>> On Mon, Oct 24, 2016 at 9:46 AM Jesse Anderson
>
>> wrote:
>>
>> > My original thought for this change was that Crunch uses the class
>name
>> > Distinct. SQL also uses the keyword distinct.
>> >
>> > Maybe the rule should be changed to adjectives or verbs depending
>on the
>> > context.
>> >
>> > Using a verb to describe this class really doesn't connote what the
>class
>> > does as succinctly as the adjective.
>> >
>> > On Mon, Oct 24, 2016 at 9:40 AM Neelesh Salian
>
>> > wrote:
>> >
>> > > Hello,
>> > >
>> > > First of all, thank you to Daniel, Robert and Jesse for their
>review on
>> > > this: https://issues.apache.org/jira/browse/BEAM-239
>> > >
>> > > A point that came up was using verbs explicitly for Transforms.
>> > > Here is the PR:
>https://github.com/apache/incubator-beam/pull/1164
>> > >
>> > > Posting it to help understand if we have a consensus for it and
>if yes,
>> > we
>> > > could perhaps document it for future changes.
>> > >
>> > > Thank you.
>> > >
>> > > --
>> > > Neelesh Srinivas Salian
>> > > Engineer
>> > >
>> >
>>


Re: [DISCUSS] Using Verbs for Transforms

2016-10-25 Thread Kenneth Knowles
To be clear: I am not saying that I think the discussion has concluded. I
think we should give some more time for different time zone rotations to
occur. I just meant to say that if it does come to a vote, I'd prefer to
keep it focused rather than generalizing.

On Tue, Oct 25, 2016 at 10:51 PM Kenneth Knowles  wrote:

> I'd prefer to keep the vote focused on this rename, not a general policy.
>
> On Tue, Oct 25, 2016 at 10:26 PM Jean-Baptiste Onofré 
> wrote:
>
> Yes I would start a formal vote with the three proposals: descriptive
> verb, adjective, verbs + adjective.
>
> Regards
> JB
>
> ⁣​
>
> On Oct 26, 2016, 07:16, at 07:16, Jesse Anderson 
> wrote:
> >We need to make a decision on this so Neelesh can finish his commit.
> >Should
> >we take a vote or something?
> >
> >On Tue, Oct 25, 2016, 7:55 AM Jean-Baptiste Onofré 
> >wrote:
> >
> >> Sounds good to me.
> >>
> >> ⁣​
> >>
> >> On Oct 24, 2016, 19:11, at 19:11, je...@smokinghand.com wrote:
> >> >I prefer MakeDistinct if we have to make it a verb.
> >>
>
>


Re: [DISCUSS] Using Verbs for Transforms

2016-10-25 Thread Kenneth Knowles
I'd prefer to keep the vote focused on this rename, not a general policy.

On Tue, Oct 25, 2016 at 10:26 PM Jean-Baptiste Onofré 
wrote:

> Yes I would start a formal vote with the three proposals: descriptive
> verb, adjective, verbs + adjective.
>
> Regards
> JB
>
> ⁣​
>
> On Oct 26, 2016, 07:16, at 07:16, Jesse Anderson 
> wrote:
> >We need to make a decision on this so Neelesh can finish his commit.
> >Should
> >we take a vote or something?
> >
> >On Tue, Oct 25, 2016, 7:55 AM Jean-Baptiste Onofré 
> >wrote:
> >
> >> Sounds good to me.
> >>
> >> ⁣​
> >>
> >> On Oct 24, 2016, 19:11, at 19:11, je...@smokinghand.com wrote:
> >> >I prefer MakeDistinct if we have to make it a verb.
> >>
>


Re: [DISCUSS] Using Verbs for Transforms

2016-10-25 Thread Jean-Baptiste Onofré
Yes, I would start a formal vote with the three proposals: descriptive verb,
adjective, verb + adjective.

Regards
JB

On Oct 26, 2016, 07:16, at 07:16, Jesse Anderson  wrote:
>We need to make a decision on this so Neelesh can finish his commit.
>Should
>we take a vote or something?
>
>On Tue, Oct 25, 2016, 7:55 AM Jean-Baptiste Onofré 
>wrote:
>
>> Sounds good to me.
>>
>> ⁣​
>>
>> On Oct 24, 2016, 19:11, at 19:11, je...@smokinghand.com wrote:
>> >I prefer MakeDistinct if we have to make it a verb.
>>


Re: [DISCUSS] Using Verbs for Transforms

2016-10-25 Thread Jesse Anderson
We need to make a decision on this so Neelesh can finish his commit. Should
we take a vote or something?

On Tue, Oct 25, 2016, 7:55 AM Jean-Baptiste Onofré  wrote:

> Sounds good to me.
>
> ⁣​
>
> On Oct 24, 2016, 19:11, at 19:11, je...@smokinghand.com wrote:
> >I prefer MakeDistinct if we have to make it a verb.
>


Re: [DISCUSS] Using Verbs for Transforms

2016-10-24 Thread Jean-Baptiste Onofré
Sounds good to me.

On Oct 24, 2016, 19:11, at 19:11, je...@smokinghand.com wrote:
>I prefer MakeDistinct if we have to make it a verb.


Re: [DISCUSS] Using Verbs for Transforms

2016-10-24 Thread Robert Bradshaw
On Mon, Oct 24, 2016 at 8:52 PM, Robert Bradshaw  wrote:
> On Mon, Oct 24, 2016 at 10:12 AM, Kenneth Knowles
>  wrote:
>> The precedent that we use verbs has many exceptions. We have
>> ApproximateQuantiles, Values, Keys, WithTimestamps, and I would even
>> include Sum (at least when I read it).
>
> True.
>
>> Historical note: the predilection towards verbs is from the Google Style
>> Guide for Java method names
>> <https://google.github.io/styleguide/javaguide.html#s5.2.3-method-names>,
>> which states "Method names are typically verbs or verb phrases". But even
>> in Google code there are lots of exceptions when it makes sense, like Guava's
>> Iterables.any(), Iterables.all(), Iterables.toArray(), the entire
>> Predicates module, etc. Just an aside; Beam isn't Google code. I suggest we
>> use our judgment rather than a policy.
>
> Yes, we should favor what flows well. Verbs often do, but...

On this note, however, the first attempt at trigger builders was
developed to be "fluent" and read like English sentences, but in
retrospect it was needlessly verbose.

>> I think "Distinct" is one of those exceptions. It is a standard widespread
>> name and also reads better as an adjective. I prefer it, but also don't
>> care strongly enough to change it or to change it back :-)
>>
>> If we must have a verb, I like it as-is more than MakeDistinct and
>> AvoidDuplicate.
>
> I much prefer "Distinct" to the other options forcing it to be
> verb-like (despite being the one to bring this up). My (weak)
> preference is to leave RemoveDuplicates with better documentation, but
> Distinct could be fine.
>


Re: [DISCUSS] Using Verbs for Transforms

2016-10-24 Thread Robert Bradshaw
On Mon, Oct 24, 2016 at 10:12 AM, Kenneth Knowles
 wrote:
> The precedent that we use verbs has many exceptions. We have
> ApproximateQuantiles, Values, Keys, WithTimestamps, and I would even
> include Sum (at least when I read it).

True.

> Historical note: the predilection towards verbs is from the Google Style
> Guide for Java method names
> <https://google.github.io/styleguide/javaguide.html#s5.2.3-method-names>,
> which states "Method names are typically verbs or verb phrases". But even
> in Google code there are lots of exceptions when it makes sense, like Guava's
> Iterables.any(), Iterables.all(), Iterables.toArray(), the entire
> Predicates module, etc. Just an aside; Beam isn't Google code. I suggest we
> use our judgment rather than a policy.

Yes, we should favor what flows well. Verbs often do, but...

> I think "Distinct" is one of those exceptions. It is a standard widespread
> name and also reads better as an adjective. I prefer it, but also don't
> care strongly enough to change it or to change it back :-)
>
> If we must have a verb, I like it as-is more than MakeDistinct and
> AvoidDuplicate.

I much prefer "Distinct" to the other options forcing it to be
verb-like (despite being the one to bring this up). My (weak)
preference is to leave RemoveDuplicates with better documentation, but
Distinct could be fine.

> On Mon, Oct 24, 2016 at 9:46 AM Jesse Anderson 
> wrote:
>
>> My original thought for this change was that Crunch uses the class name
>> Distinct. SQL also uses the keyword distinct.
>>
>> Maybe the rule should be changed to adjectives or verbs depending on the
>> context.
>>
>> Using a verb to describe this class really doesn't connote what the class
>> does as succinctly as the adjective.
>>
>> On Mon, Oct 24, 2016 at 9:40 AM Neelesh Salian 
>> wrote:
>>
>> > Hello,
>> >
>> > First of all, thank you to Daniel, Robert and Jesse for their review on
>> > this: https://issues.apache.org/jira/browse/BEAM-239
>> >
>> > A point that came up was using verbs explicitly for Transforms.
>> > Here is the PR: https://github.com/apache/incubator-beam/pull/1164
>> >
>> > Posting it to help understand if we have a consensus for it and if yes,
>> we
>> > could perhaps document it for future changes.
>> >
>> > Thank you.
>> >
>> > --
>> > Neelesh Srinivas Salian
>> > Engineer
>> >
>>


Re: [DISCUSS] Using Verbs for Transforms

2016-10-24 Thread Jesse Anderson
That's how the mainframe programmers I've dealt with refer to it. I agree
with Dan. We should either not change the name or change it to Distinct.
It's just not worth the effort otherwise.

On Mon, Oct 24, 2016, 3:10 PM Eugene Kirpichov 
wrote:

> $0.02: Deduplicate? (lends to extensions like Deduplicate.by(some key
> extractor function))
>
> On Mon, Oct 24, 2016 at 1:22 PM Dan Halperin 
> wrote:
>
> > I find "MakeDistinct" more confusing. My votes in decreasing preference:
> >
> > 1. Keep `RemoveDuplicates` name, ensure that important keywords are in
> the
> > Javadoc. This reduces churn on our users and is honestly pretty dang
> >  descriptive.
> > 2. Rename to `Distinct`, which is clear if you're a SQL user and likely
> > less clear otherwise. This is a backwards-incompatible API change, so we
> > should do it before we go stable.
> >
> > I am not super strong that 1 > 2, but I am very strong that
> > "Distinct" >>> "MakeDistinct" and "RemoveDuplicates" >>> "AvoidDuplicate".
> >
> > Dan
> >
> > On Mon, Oct 24, 2016 at 10:12 AM, Kenneth Knowles  >
> > wrote:
> >
> > > The precedent that we use verbs has many exceptions. We have
> > > ApproximateQuantiles, Values, Keys, WithTimestamps, and I would even
> > > include Sum (at least when I read it).
> > >
> > > Historical note: the predilection towards verbs is from the Google
> Style
> > > Guide for Java method names
> > > <
> https://google.github.io/styleguide/javaguide.html#s5.2.3-method-names
> > >,
> > > which states "Method names are typically verbs or verb phrases". But
> even
> > > in Google code there are lots of exceptions when it makes sense, like
> > > Guava's
> > > Iterables.any(), Iterables.all(), Iterables.toArray(), the entire
> > > Predicates module, etc. Just an aside; Beam isn't Google code. I
> suggest
> > we
> > > use our judgment rather than a policy.
> > >
> > > I think "Distinct" is one of those exceptions. It is a standard
> > widespread
> > > name and also reads better as an adjective. I prefer it, but also don't
> > > care strongly enough to change it or to change it back :-)
> > >
> > > If we must have a verb, I like it as-is more than MakeDistinct and
> > > AvoidDuplicate.
> > >
> > > On Mon, Oct 24, 2016 at 9:46 AM Jesse Anderson 
> > > wrote:
> > >
> > > > My original thought for this change was that Crunch uses the class
> name
> > > > Distinct. SQL also uses the keyword distinct.
> > > >
> > > > Maybe the rule should be changed to adjectives or verbs depending on
> > the
> > > > context.
> > > >
> > > > Using a verb to describe this class really doesn't connote what the
> > class
> > > > does as succinctly as the adjective.
> > > >
> > > > On Mon, Oct 24, 2016 at 9:40 AM Neelesh Salian  >
> > > > wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > First of all, thank you to Daniel, Robert and Jesse for their
> review
> > on
> > > > > this: https://issues.apache.org/jira/browse/BEAM-239
> > > > >
> > > > > A point that came up was using verbs explicitly for Transforms.
> > > > > Here is the PR: https://github.com/apache/incubator-beam/pull/1164
> > > > >
> > > > > Posting it to help understand if we have a consensus for it and if
> > yes,
> > > > we
> > > > > could perhaps document it for future changes.
> > > > >
> > > > > Thank you.
> > > > >
> > > > > --
> > > > > Neelesh Srinivas Salian
> > > > > Engineer
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] Using Verbs for Transforms

2016-10-24 Thread Eugene Kirpichov
$0.02: Deduplicate? (lends itself to extensions like Deduplicate.by(some key
extractor function))
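
A rough sketch of the kind of extension meant here, for illustration
only. DeduplicateSketch and its by(...) method are hypothetical names
operating on an in-memory list; they are not an existing Beam API, and
a real transform would have to deal with PCollections, coders and
windowing.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch: not Beam's API. Operates on in-memory lists only.
final class DeduplicateSketch {
  private DeduplicateSketch() {}

  /** Keeps the first element seen for each extracted key, preserving input order. */
  static <T, K> List<T> by(List<T> input, Function<? super T, ? extends K> keyFn) {
    Set<K> seen = new HashSet<>();
    List<T> out = new ArrayList<>();
    for (T element : input) {
      // Set.add returns false when the key was already present.
      if (seen.add(keyFn.apply(element))) {
        out.add(element);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> words = List.of("apple", "avocado", "banana", "blueberry", "cherry");
    // Deduplicate by first letter: one word per initial survives.
    System.out.println(by(words, w -> w.charAt(0))); // prints [apple, banana, cherry]
  }
}
```

The point is only that a verb name composes naturally with a "by"
qualifier; passing the identity function as the key extractor gives
plain duplicate removal.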

On Mon, Oct 24, 2016 at 1:22 PM Dan Halperin 
wrote:

> I find "MakeDistinct" more confusing. My votes in decreasing preference:
>
> 1. Keep `RemoveDuplicates` name, ensure that important keywords are in the
> Javadoc. This reduces churn on our users and is honestly pretty dang
>  descriptive.
> 2. Rename to `Distinct`, which is clear if you're a SQL user and likely
> less clear otherwise. This is a backwards-incompatible API change, so we
> should do it before we go stable.
>
> I am not super strong that 1 > 2, but I am very strong that "Distinct" >>>
> "MakeDistinct" and "RemoveDuplicates" >>> "AvoidDuplicate".
>
> Dan
>
> On Mon, Oct 24, 2016 at 10:12 AM, Kenneth Knowles 
> wrote:
>
> > The precedent that we use verbs has many exceptions. We have
> > ApproximateQuantiles, Values, Keys, WithTimestamps, and I would even
> > include Sum (at least when I read it).
> >
> > Historical note: the predilection towards verbs is from the Google Style
> > Guide for Java method names
> > <https://google.github.io/styleguide/javaguide.html#s5.2.3-method-names>,
> > which states "Method names are typically verbs or verb phrases". But even
> > in Google code there are lots of exceptions when it makes sense, like
> > Guava's
> > Iterables.any(), Iterables.all(), Iterables.toArray(), the entire
> > Predicates module, etc. Just an aside; Beam isn't Google code. I suggest
> we
> > use our judgment rather than a policy.
> >
> > I think "Distinct" is one of those exceptions. It is a standard
> widespread
> > name and also reads better as an adjective. I prefer it, but also don't
> > care strongly enough to change it or to change it back :-)
> >
> > If we must have a verb, I like it as-is more than MakeDistinct and
> > AvoidDuplicate.
> >
> > On Mon, Oct 24, 2016 at 9:46 AM Jesse Anderson 
> > wrote:
> >
> > > My original thought for this change was that Crunch uses the class name
> > > Distinct. SQL also uses the keyword distinct.
> > >
> > > Maybe the rule should be changed to adjectives or verbs depending on
> the
> > > context.
> > >
> > > Using a verb to describe this class really doesn't connote what the
> class
> > > does as succinctly as the adjective.
> > >
> > > On Mon, Oct 24, 2016 at 9:40 AM Neelesh Salian 
> > > wrote:
> > >
> > > > Hello,
> > > >
> > > > First of all, thank you to Daniel, Robert and Jesse for their review
> on
> > > > this: https://issues.apache.org/jira/browse/BEAM-239
> > > >
> > > > A point that came up was using verbs explicitly for Transforms.
> > > > Here is the PR: https://github.com/apache/incubator-beam/pull/1164
> > > >
> > > > Posting it to help understand if we have a consensus for it and if
> yes,
> > > we
> > > > could perhaps document it for future changes.
> > > >
> > > > Thank you.
> > > >
> > > > --
> > > > Neelesh Srinivas Salian
> > > > Engineer
> > > >
> > >
> >
>


Re: [DISCUSS] Using Verbs for Transforms

2016-10-24 Thread Dan Halperin
I find "MakeDistinct" more confusing. My votes in decreasing preference:

1. Keep `RemoveDuplicates` name, ensure that important keywords are in the
Javadoc. This reduces churn on our users and is honestly pretty dang
descriptive.
2. Rename to `Distinct`, which is clear if you're a SQL user and likely
less clear otherwise. This is a backwards-incompatible API change, so we
should do it before we go stable.

I am not super strong that 1 > 2, but I am very strong that "Distinct" >>>
"MakeDistinct" and "RemoveDuplicates" >>> "AvoidDuplicate".

Dan

On Mon, Oct 24, 2016 at 10:12 AM, Kenneth Knowles 
wrote:

> The precedent that we use verbs has many exceptions. We have
> ApproximateQuantiles, Values, Keys, WithTimestamps, and I would even
> include Sum (at least when I read it).
>
> Historical note: the predilection towards verbs is from the Google Style
> Guide for Java method names
> <https://google.github.io/styleguide/javaguide.html#s5.2.3-method-names>,
> which states "Method names are typically verbs or verb phrases". But even
> in Google code there are lots of exceptions when it makes sense, like
> Guava's
> Iterables.any(), Iterables.all(), Iterables.toArray(), the entire
> Predicates module, etc. Just an aside; Beam isn't Google code. I suggest we
> use our judgment rather than a policy.
>
> I think "Distinct" is one of those exceptions. It is a standard widespread
> name and also reads better as an adjective. I prefer it, but also don't
> care strongly enough to change it or to change it back :-)
>
> If we must have a verb, I like it as-is more than MakeDistinct and
> AvoidDuplicate.
>
> On Mon, Oct 24, 2016 at 9:46 AM Jesse Anderson 
> wrote:
>
> > My original thought for this change was that Crunch uses the class name
> > Distinct. SQL also uses the keyword distinct.
> >
> > Maybe the rule should be changed to adjectives or verbs depending on the
> > context.
> >
> > Using a verb to describe this class really doesn't connote what the class
> > does as succinctly as the adjective.
> >
> > On Mon, Oct 24, 2016 at 9:40 AM Neelesh Salian 
> > wrote:
> >
> > > Hello,
> > >
> > > First of all, thank you to Daniel, Robert and Jesse for their review on
> > > this: https://issues.apache.org/jira/browse/BEAM-239
> > >
> > > A point that came up was using verbs explicitly for Transforms.
> > > Here is the PR: https://github.com/apache/incubator-beam/pull/1164
> > >
> > > Posting it to help understand if we have a consensus for it and if yes,
> > we
> > > could perhaps document it for future changes.
> > >
> > > Thank you.
> > >
> > > --
> > > Neelesh Srinivas Salian
> > > Engineer
> > >
> >
>


Re: [DISCUSS] Using Verbs for Transforms

2016-10-24 Thread Kenneth Knowles
The precedent that we use verbs has many exceptions. We have
ApproximateQuantiles, Values, Keys, WithTimestamps, and I would even
include Sum (at least when I read it).

Historical note: the predilection towards verbs is from the Google Style
Guide for Java method names
<https://google.github.io/styleguide/javaguide.html#s5.2.3-method-names>,
which states "Method names are typically verbs or verb phrases". But even
in Google code there are lots of exceptions when it makes sense, like Guava's
Iterables.any(), Iterables.all(), Iterables.toArray(), the entire
Predicates module, etc. Just an aside; Beam isn't Google code. I suggest we
use our judgment rather than a policy.

I think "Distinct" is one of those exceptions. It is a standard widespread
name and also reads better as an adjective. I prefer it, but also don't
care strongly enough to change it or to change it back :-)
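
As one more data point for "standard widespread name": the JDK's own
java.util.stream.Stream calls this operation distinct(). A minimal
example follows; the wrapper class StreamDistinctExample and its
method distinctWords are names made up for this illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustration only: shows the JDK's adjective-style name for this operation.
final class StreamDistinctExample {
  private StreamDistinctExample() {}

  static List<String> distinctWords(List<String> words) {
    // On an ordered stream, distinct() keeps the first occurrence of each equal element.
    return words.stream().distinct().collect(Collectors.toList());
  }

  public static void main(String[] args) {
    System.out.println(distinctWords(List.of("a", "b", "a", "c", "b"))); // prints [a, b, c]
  }
}
```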

If we must have a verb, I like it as-is more than MakeDistinct and
AvoidDuplicate.

On Mon, Oct 24, 2016 at 9:46 AM Jesse Anderson 
wrote:

> My original thought for this change was that Crunch uses the class name
> Distinct. SQL also uses the keyword distinct.
>
> Maybe the rule should be changed to adjectives or verbs depending on the
> context.
>
> Using a verb to describe this class really doesn't connote what the class
> does as succinctly as the adjective.
>
> On Mon, Oct 24, 2016 at 9:40 AM Neelesh Salian 
> wrote:
>
> > Hello,
> >
> > First of all, thank you to Daniel, Robert and Jesse for their review on
> > this: https://issues.apache.org/jira/browse/BEAM-239
> >
> > A point that came up was using verbs explicitly for Transforms.
> > Here is the PR: https://github.com/apache/incubator-beam/pull/1164
> >
> > Posting it to help understand if we have a consensus for it and if yes,
> we
> > could perhaps document it for future changes.
> >
> > Thank you.
> >
> > --
> > Neelesh Srinivas Salian
> > Engineer
> >
>


Re: [DISCUSS] Using Verbs for Transforms

2016-10-24 Thread Neelesh Salian
Thanks JB and Jesse.
Would something like "MakeDistinct" or "AvoidDuplicate" sound better?
I can make the name change and the Javadoc update in one go.

Having it documented can be super helpful.

On Mon, Oct 24, 2016 at 9:58 AM, Jean-Baptiste Onofré 
wrote:

> It could make sense. However, we should keep it informative. I mean that
> Distinct could appear confusing to some users compared to AvoidDuplicate or
> whatever.
>
> Regards
> JB
>
> ⁣​
>
> On Oct 24, 2016, 18:46, at 18:46, Jesse Anderson 
> wrote:
> >My original thought for this change was that Crunch uses the class name
> >Distinct. SQL also uses the keyword distinct.
> >
> >Maybe the rule should be changed to adjectives or verbs depending on
> >the
> >context.
> >
> >Using a verb to describe this class really doesn't connote what the
> >class
> >does as succinctly as the adjective.
> >
> >On Mon, Oct 24, 2016 at 9:40 AM Neelesh Salian 
> >wrote:
> >
> >> Hello,
> >>
> >> First of all, thank you to Daniel, Robert and Jesse for their review on
> >> this: https://issues.apache.org/jira/browse/BEAM-239
> >>
> >> A point that came up was using verbs explicitly for Transforms.
> >> Here is the PR: https://github.com/apache/incubator-beam/pull/1164
> >>
> >> Posting it to help understand if we have a consensus for it and if yes, we
> >> could perhaps document it for future changes.
> >>
> >> Thank you.
> >>
> >> --
> >> Neelesh Srinivas Salian
> >> Engineer
> >>
>



-- 
Neelesh Srinivas Salian
Customer Operations Engineer


Re: [DISCUSS] Using Verbs for Transforms

2016-10-24 Thread Jean-Baptiste Onofré
It could make sense. However, we should keep it informative. I mean that
"Distinct" could appear confusing to some users compared to AvoidDuplicate or
whatever.

Regards
JB

On Oct 24, 2016, at 18:46, Jesse Anderson wrote:
>My original thought for this change was that Crunch uses the class name
>Distinct. SQL also uses the keyword distinct.
>
>Maybe the rule should be changed to adjectives or verbs depending on the
>context.
>
>Using a verb to describe this class really doesn't connote what the class
>does as succinctly as the adjective.
>
>On Mon, Oct 24, 2016 at 9:40 AM Neelesh Salian 
>wrote:
>
>> Hello,
>>
>> First of all, thank you to Daniel, Robert and Jesse for their review on
>> this: https://issues.apache.org/jira/browse/BEAM-239
>>
>> A point that came up was using verbs explicitly for Transforms.
>> Here is the PR: https://github.com/apache/incubator-beam/pull/1164
>>
>> Posting it to help understand if we have a consensus for it and if yes, we
>> could perhaps document it for future changes.
>>
>> Thank you.
>>
>> --
>> Neelesh Srinivas Salian
>> Engineer
>>


Re: [DISCUSS] Using Verbs for Transforms

2016-10-24 Thread Jesse Anderson
My original thought for this change was that Crunch uses the class name
Distinct. SQL also uses the keyword distinct.

Maybe the rule should be changed to adjectives or verbs depending on the
context.

Using a verb to describe this class really doesn't connote what the class
does as succinctly as the adjective.

On Mon, Oct 24, 2016 at 9:40 AM Neelesh Salian  wrote:

> Hello,
>
> First of all, thank you to Daniel, Robert and Jesse for their review on
> this: https://issues.apache.org/jira/browse/BEAM-239
>
> A point that came up was using verbs explicitly for Transforms.
> Here is the PR: https://github.com/apache/incubator-beam/pull/1164
>
> Posting it to help understand if we have a consensus for it and if yes, we
> could perhaps document it for future changes.
>
> Thank you.
>
> --
> Neelesh Srinivas Salian
> Engineer
>


Re: [DISCUSS] Using Verbs for Transforms

2016-10-24 Thread Jean-Baptiste Onofré
Hi

Transforms are already expressed mostly with verbs. I agree with your proposal,
or at least with having the semantics fully documented.

By the way, we have to be careful when renaming, as it could break the API.

Regards
JB

On Oct 24, 2016, at 18:40, Neelesh Salian wrote:
>Hello,
>
>First of all, thank you to Daniel, Robert and Jesse for their review on
>this: https://issues.apache.org/jira/browse/BEAM-239
>
>A point that came up was using verbs explicitly for Transforms.
>Here is the PR: https://github.com/apache/incubator-beam/pull/1164
>
>Posting it to help understand if we have a consensus for it and if yes, we
>could perhaps document it for future changes.
>
>Thank you.
>
>-- 
>Neelesh Srinivas Salian
>Engineer