Agreed that filter is perhaps unintuitive, though the Scala collections API has
both "filter" and "filterNot", which together provide context that makes it
more intuitive.
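
To illustrate the pairing on plain Scala collections, here is a minimal sketch. The object name FilterDemo and the sample values are made up for illustration; only filter and filterNot come from the standard library.

```scala
object FilterDemo {
  val nums: Seq[Int] = Seq(1, 2, 3, 4, 5, 6)
  val isEven: Int => Boolean = n => n % 2 == 0

  // filter KEEPS the elements for which the predicate returns true.
  val kept: Seq[Int] = nums.filter(isEven)

  // filterNot KEEPS the elements for which the predicate returns false.
  val dropped: Seq[Int] = nums.filterNot(isEven)

  def main(args: Array[String]): Unit = {
    println(kept)    // List(2, 4, 6)
    println(dropped) // List(1, 3, 5)
  }
}
```

Seen side by side, the two names make it clearer that a true predicate means "keep" for filter.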


And yes, the change could be made via added methods that don't break the existing API.


Still, overall I would be -1 on this unless a significant proportion of users
would find that it adds value.




Actually, adding "filterNot", while not that necessary, would make more sense in
my view.
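
As a rough sketch of how a "filterNot" could be added without touching an existing API, the snippet below uses an implicit class. Dataset here is a hypothetical stand-in type that only offers filter; it is not Spark's RDD, and DatasetOps is likewise an illustrative name.

```scala
object FilterNotSketch {
  // Hypothetical stand-in for a collection type that only offers filter.
  final case class Dataset[A](items: Seq[A]) {
    def filter(p: A => Boolean): Dataset[A] = Dataset(items.filter(p))
  }

  // Adds filterNot as an extension method; the existing filter is unchanged.
  implicit class DatasetOps[A](ds: Dataset[A]) {
    def filterNot(p: A => Boolean): Dataset[A] = ds.filter(a => !p(a))
  }

  def main(args: Array[String]): Unit = {
    val ds = Dataset(Seq(1, 2, 3, 4))
    println(ds.filterNot(_ % 2 == 0).items) // List(1, 3)
  }
}
```

Existing callers of filter are unaffected, since the new method is purely additive.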









On Thu, Feb 27, 2014 at 3:56 PM, Bertrand Dechoux <decho...@gmail.com>
wrote:

> I understand the explanation but I had to try. However, the change could be
> made without breaking anything but that's another story.
> Regards
> Bertrand
> Bertrand Dechoux
> On Thu, Feb 27, 2014 at 2:05 PM, Nick Pentreath
> <nick.pentre...@gmail.com> wrote:
>> filter comes from the Scala collection method "filter". I'd say it's best
>> to keep in line with the Scala collections API, as Spark has done with RDDs
>> generally (map, flatMap, take etc), so that it is easier and natural for
>> developers to apply the same thinking for Scala (parallel) collections to
>> Spark RDDs.
>>
>> Plus, such an API change would be a major breaking one and IMO not a good
>> idea at this stage.
>>
>> def filter(p: (A) => Boolean): Seq[A]
>>
>> Selects all elements of this sequence which satisfy a predicate.
>>
>> p: the predicate used to test elements.
>>
>> returns: a new sequence consisting of all elements of this sequence that
>> satisfy the given predicate p. The order of the elements is preserved.
>>
>>
>> On Thu, Feb 27, 2014 at 2:36 PM, Bertrand Dechoux <decho...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> It might seem like a trivial issue, but even though it is something of a
>>> standard name, filter() is not really explicit about how it works.
>>> Sure, it makes sense to provide a filter function, but what happens when it
>>> returns true? Is the current element removed or kept? It is not really
>>> obvious.
>>>
>>> Has another name already been discussed? It could be keep() or remove().
>>> But take() could also be reused: instead of a number, it could accept
>>> the filter function.
>>>
>>>  Regards
>>>
>>> Bertrand
>>>
>>
>>
