I also like Distinct since it doesn't make it sound like it modifies any
underlying collection. RemoveDuplicates makes it sound like the duplicates
are removed, rather than a new PCollection without duplicates being
returned.
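
To illustrate the naming point, here is a minimal sketch. It assumes plain
Java streams as a stand-in for a PCollection and does not use the Beam API;
the class name `DistinctSketch` and its `distinct` helper are made up for
illustration. The operation hands back a new collection and leaves its
input untouched, which is what the adjective suggests:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class DistinctSketch {
    // Hypothetical helper (not Beam API): returns a NEW list holding the
    // distinct elements in first-seen order; the input list is not modified.
    static <T> List<T> distinct(List<T> input) {
        return input.stream().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>(Arrays.asList("a", "b", "a", "c", "b"));
        List<String> unique = distinct(words);

        System.out.println(words);  // prints [a, b, a, c, b] (input unchanged)
        System.out.println(unique); // prints [a, b, c]
    }
}
```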

On Wed, Oct 26, 2016, 7:36 AM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:

> Agree. It was more a transition proposal.
>
> Regards
> JB
>
> On Oct 26, 2016, 08:31, at 08:31, Robert Bradshaw
> <rober...@google.com.INVALID> wrote:
> >On Mon, Oct 24, 2016 at 11:02 PM, Jean-Baptiste Onofré
> ><j...@nanthrax.net> wrote:
> >> And what about using RemoveDuplicates and creating an alias Distinct?
> >
> >I'd really like to avoid (long term) aliases--you end up having to
> >document (and maintain) them both, and it adds confusion as to which
> >one to use (especially if they ever diverge), and means searching for
> >one or the other yields half the results.
> >
> >> It doesn't break the API and would address both SQL users and more
> >> "big data" users.
> >>
> >> My $0.01 ;)
> >>
> >> Regards
> >> JB
> >>
> >> On Oct 24, 2016, 22:23, at 22:23, Dan Halperin
> ><dhalp...@google.com.INVALID> wrote:
> >>>I find "MakeDistinct" more confusing. My votes in decreasing
> >>>preference:
> >>>
> >>>1. Keep `RemoveDuplicates` name, ensure that important keywords are in
> >>>the Javadoc. This reduces churn on our users and is honestly pretty dang
> >>>descriptive.
> >>>2. Rename to `Distinct`, which is clear if you're a SQL user and likely
> >>>less clear otherwise. This is a backwards-incompatible API change, so we
> >>>should do it before we go stable.
> >>>
> >>>I am not super strong that 1 > 2, but I am very strong that
> >>>"Distinct" >>> "MakeDistinct" and "RemoveDuplicates" >>> "AvoidDuplicate".
> >>>
> >>>Dan
> >>>
> >>>On Mon, Oct 24, 2016 at 10:12 AM, Kenneth Knowles
> >>><k...@google.com.invalid>
> >>>wrote:
> >>>
> >>>> The precedent that we use verbs has many exceptions. We have
> >>>> ApproximateQuantiles, Values, Keys, WithTimestamps, and I would even
> >>>> include Sum (at least when I read it).
> >>>>
> >>>> Historical note: the predilection towards verbs is from the Google Style
> >>>> Guide for Java method names
> >>>> <https://google.github.io/styleguide/javaguide.html#s5.2.3-method-names>,
> >>>> which states "Method names are typically verbs or verb phrases". But even
> >>>> in Google code there are lots of exceptions when it makes sense, like
> >>>> Guava's Iterables.any(), Iterables.all(), Iterables.toArray(), the entire
> >>>> Predicates module, etc. Just an aside; Beam isn't Google code. I suggest we
> >>>> use our judgment rather than a policy.
> >>>>
> >>>> I think "Distinct" is one of those exceptions. It is a standard widespread
> >>>> name and also reads better as an adjective. I prefer it, but also don't
> >>>> care strongly enough to change it or to change it back :-)
> >>>>
> >>>> If we must have a verb, I like it as-is more than MakeDistinct and
> >>>> AvoidDuplicate.
> >>>>
> >>>> On Mon, Oct 24, 2016 at 9:46 AM Jesse Anderson
> >>><je...@smokinghand.com>
> >>>> wrote:
> >>>>
> >>>> > My original thought for this change was that Crunch uses the class name
> >>>> > Distinct. SQL also uses the keyword distinct.
> >>>> >
> >>>> > Maybe the rule should be changed to adjectives or verbs depending on the
> >>>> > context.
> >>>> >
> >>>> > Using a verb to describe this class really doesn't connote what the class
> >>>> > does as succinctly as the adjective.
> >>>> >
> >>>> > On Mon, Oct 24, 2016 at 9:40 AM Neelesh Salian
> >>><nsal...@cloudera.com>
> >>>> > wrote:
> >>>> >
> >>>> > > Hello,
> >>>> > >
> >>>> > > First of all, thank you to Daniel, Robert and Jesse for their review on
> >>>> > > this: https://issues.apache.org/jira/browse/BEAM-239
> >>>> > >
> >>>> > > A point that came up was using verbs explicitly for Transforms.
> >>>> > > Here is the PR: https://github.com/apache/incubator-beam/pull/1164
> >>>> > >
> >>>> > > Posting it to help understand if we have a consensus for it and if yes,
> >>>> > > we could perhaps document it for future changes.
> >>>> > >
> >>>> > > Thank you.
> >>>> > >
> >>>> > > --
> >>>> > > Neelesh Srinivas Salian
> >>>> > > Engineer
> >>>> > >
> >>>> >
> >>>>
>
