I also like Distinct since it doesn't make it sound like it modifies any underlying collection. RemoveDuplicates makes it sound like the duplicates are removed, rather than a new PCollection without duplicates being returned.
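To make that concrete, here is a rough plain-Java sketch of the semantics I have in mind. It is only an analogy and not Beam's API: `DistinctSketch` and its `distinct` helper are names made up for this mail, and a `List` stands in for a PCollection.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

class DistinctSketch {
    // Hypothetical helper (not part of Beam): builds and returns a NEW list
    // holding each element once, in first-seen order. The input is not modified.
    static <T> List<T> distinct(List<T> input) {
        return new ArrayList<>(new LinkedHashSet<>(input));
    }

    public static void main(String[] args) {
        List<String> words = List.of("a", "b", "a", "c", "b");
        List<String> unique = distinct(words);

        System.out.println(words);  // prints [a, b, a, c, b] (input unchanged)
        System.out.println(unique); // prints [a, b, c]
    }
}
```

Read as an adjective, "distinct" describes the returned collection, which is what happens here. "RemoveDuplicates" reads as if `words` itself were being edited.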
On Wed, Oct 26, 2016, 7:36 AM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:

> Agree. It was more a transition proposal.
>
> Regards
> JB
>
> On Oct 26, 2016, 08:31, at 08:31, Robert Bradshaw
> <rober...@google.com.INVALID> wrote:
> >On Mon, Oct 24, 2016 at 11:02 PM, Jean-Baptiste Onofré
> ><j...@nanthrax.net> wrote:
> >> And what about use RemoveDuplicates and create an alias Distinct ?
> >
> >I'd really like to avoid (long term) aliases--you end up having to
> >document (and maintain) them both, and it adds confusion as to which
> >one to use (especially if they ever diverge), and means searching for
> >one or the other yields half the results.
> >
> >> It doesn't break the API and would address both SQL users and more
> >> "big data" users.
> >>
> >> My $0.01 ;)
> >>
> >> Regards
> >> JB
> >>
> >> On Oct 24, 2016, 22:23, at 22:23, Dan Halperin
> >> <dhalp...@google.com.INVALID> wrote:
> >>>I find "MakeDistinct" more confusing. My votes in decreasing
> >>>preference:
> >>>
> >>>1. Keep `RemoveDuplicates` name, ensure that important keywords are
> >>>in the Javadoc. This reduces churn on our users and is honestly
> >>>pretty dang descriptive.
> >>>2. Rename to `Distinct`, which is clear if you're a SQL user and
> >>>likely less clear otherwise. This is a backwards-incompatible API
> >>>change, so we should do it before we go stable.
> >>>
> >>>I am not super strong that 1 > 2, but I am very strong that
> >>>"Distinct" >>> "MakeDistinct" and "RemoveDuplicates" >>>
> >>>"AvoidDuplicate".
> >>>
> >>>Dan
> >>>
> >>>On Mon, Oct 24, 2016 at 10:12 AM, Kenneth Knowles
> >>><k...@google.com.invalid> wrote:
> >>>
> >>>> The precedent that we use verbs has many exceptions. We have
> >>>> ApproximateQuantiles, Values, Keys, WithTimestamps, and I would
> >>>> even include Sum (at least when I read it).
> >>>>
> >>>> Historical note: the predilection towards verbs is from the Google
> >>>> Style Guide for Java method names
> >>>> <https://google.github.io/styleguide/javaguide.html#s5.2.3-method-names>,
> >>>> which states "Method names are typically verbs or verb phrases".
> >>>> But even in Google code there are lots of exceptions when it makes
> >>>> sense, like Guava's Iterables.any(), Iterables.all(),
> >>>> Iterables.toArray(), the entire Predicates module, etc. Just an
> >>>> aside; Beam isn't Google code. I suggest we use our judgment
> >>>> rather than a policy.
> >>>>
> >>>> I think "Distinct" is one of those exceptions. It is a standard
> >>>> widespread name and also reads better as an adjective. I prefer
> >>>> it, but also don't care strongly enough to change it or to change
> >>>> it back :-)
> >>>>
> >>>> If we must have a verb, I like it as-is more than MakeDistinct and
> >>>> AvoidDuplicate.
> >>>>
> >>>> On Mon, Oct 24, 2016 at 9:46 AM Jesse Anderson
> >>>> <je...@smokinghand.com> wrote:
> >>>>
> >>>> > My original thought for this change was that Crunch uses the
> >>>> > class name Distinct. SQL also uses the keyword distinct.
> >>>> >
> >>>> > Maybe the rule should be changed to adjectives or verbs
> >>>> > depending on the context.
> >>>> >
> >>>> > Using a verb to describe this class really doesn't connote what
> >>>> > the class does as succinctly as the adjective.
> >>>> >
> >>>> > On Mon, Oct 24, 2016 at 9:40 AM Neelesh Salian
> >>>> > <nsal...@cloudera.com> wrote:
> >>>> >
> >>>> > > Hello,
> >>>> > >
> >>>> > > First of all, thank you to Daniel, Robert and Jesse for their
> >>>> > > review on this:
> >>>> > > https://issues.apache.org/jira/browse/BEAM-239
> >>>> > >
> >>>> > > A point that came up was using verbs explicitly for Transforms.
> >>>> > > Here is the PR:
> >>>> > > https://github.com/apache/incubator-beam/pull/1164
> >>>> > >
> >>>> > > Posting it to help understand if we have a consensus for it
> >>>> > > and if yes, we could perhaps document it for future changes.
> >>>> > >
> >>>> > > Thank you.
> >>>> > >
> >>>> > > --
> >>>> > > Neelesh Srinivas Salian
> >>>> > > Engineer