You could short-circuit the filtering within the iterator function
supplied to mapPartitions.
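
A minimal sketch of that idea in Python, assuming the data is sorted in
ascending order within each partition and the condition is of the form
"value below a cutoff", so it holds for a prefix of each partition and
then fails for everything after. THRESHOLD, keep, and
filter_sorted_partition are hypothetical names chosen for illustration,
not anything from your job:

```python
from itertools import takewhile

# Hypothetical cutoff; stands in for whatever the real filter condition is.
THRESHOLD = 100


def keep(x):
    """The filter condition. Assumed to be true for a prefix of the sorted data."""
    return x < THRESHOLD


def filter_sorted_partition(iterator):
    """Yield elements of one partition until the first one that fails keep().

    takewhile stops pulling from the underlying iterator at the first
    failing element, so the rest of the partition is never examined.
    """
    return takewhile(keep, iterator)


# With PySpark this would be applied per partition, for example:
#   small_rdd = sorted_rdd.mapPartitions(filter_sorted_partition)
```

Each partition still gets its own task, but a partition that lies
entirely past the cutoff stops after looking at its first element. This
only gives the same result as filter when the condition never becomes
true again after it first fails within a partition.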


On Sunday, November 3, 2013, Xiang Huo wrote:

> Hi all,
>
> I am trying to filter a smaller RDD out of a large RDD, and the large
> one is sorted. Is there any way to make the filter method stop checking
> every element in the RDD and instead drop all remaining elements as soon
> as one element fails the filter condition? Because the large data set is
> sorted, once one element fails the condition, none of the following
> elements can meet it. But checking them one by one takes a relatively
> long time. Is there any way to save time on this part?
>
> Thanks,
>
> Xiang
>
> --
> Xiang Huo
> Department of Computer Science
> University of Illinois at Chicago (UIC)
> Chicago, Illinois
> US
> Email: huoxiang5...@gmail.com
>            or xh...@uic.edu
>
