A join seems to me the proper approach, followed by keying the fits by KeyID and using combineByKey to choose the best fit per key. I am implementing that now and will report on performance.
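
To make the dataflow concrete, here is a minimal sketch in plain Java collections rather than Spark, so it runs without a cluster. Everything in it is an assumption for illustration: the `Item` and `Fit` classes, the integer `shape` field, and the `score` function are hypothetical stand-ins for the real key, lock, and fit-quality types. The first step mirrors the join on bin id; the second mirrors keying the fits by KeyID and keeping the best one, which in Spark would be the job of combineByKey (or reduceByKey).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BinJoinSketch {
    /** A key or a lock: the bin it falls in, its id, and a hypothetical shape value. */
    static final class Item {
        final int bin;
        final String id;
        final int shape;
        Item(int bin, String id, int shape) {
            this.bin = bin;
            this.id = id;
            this.shape = shape;
        }
    }

    /** One candidate pairing of a key with a lock from the same bin. */
    static final class Fit {
        final String keyId;
        final String lockId;
        final double score;
        Fit(String keyId, String lockId, double score) {
            this.keyId = keyId;
            this.lockId = lockId;
            this.score = score;
        }
    }

    /** Hypothetical scoring function: higher is better, 0 is a perfect match. */
    static double score(Item key, Item lock) {
        return -Math.abs(key.shape - lock.shape);
    }

    /** Returns the best-scoring fit for each key id that shares a bin with at least one lock. */
    static Map<String, Fit> bestFits(List<Item> keys, List<Item> locks) {
        // Step 1: group locks by bin (stands in for partitioning and joining on bin id).
        Map<Integer, List<Item>> locksByBin = new HashMap<>();
        for (Item lock : locks) {
            locksByBin.computeIfAbsent(lock.bin, b -> new ArrayList<>()).add(lock);
        }

        // Step 2: score every key/lock pair within a bin, keyed by key id,
        // keeping only the higher-scoring fit (stands in for combineByKey).
        Map<String, Fit> best = new HashMap<>();
        for (Item key : keys) {
            List<Item> candidates = locksByBin.get(key.bin);
            if (candidates == null) {
                continue; // no lock in this bin, so the inner join drops the key
            }
            for (Item lock : candidates) {
                Fit fit = new Fit(key.id, lock.id, score(key, lock));
                best.merge(key.id, fit, (a, b) -> a.score >= b.score ? a : b);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Item> keys = new ArrayList<>();
        keys.add(new Item(1, "k1", 10));
        keys.add(new Item(2, "k2", 50));

        List<Item> locks = new ArrayList<>();
        locks.add(new Item(1, "l1", 12));
        locks.add(new Item(1, "l2", 30));
        locks.add(new Item(2, "l3", 49));

        for (Fit fit : bestFits(keys, locks).values()) {
            System.out.println(fit.keyId + " -> " + fit.lockId + " (score " + fit.score + ")");
        }
    }
}
```

Note that the pairwise scoring happens only within a bin, so the cost is driven by the largest bin rather than by the total number of keys times locks.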

On Fri, Oct 31, 2014 at 11:56 AM, Sonal Goyal <sonalgoy...@gmail.com> wrote:

> Does the following help?
>
> JavaPairRDD<bin,key> join with JavaPairRDD<bin,lock>
>
> If you partition both RDDs by the bin id, I think you should be able to
> get what you want.
>
> Best Regards,
> Sonal
> Nube Technologies <http://www.nubetech.co>
>
> <http://in.linkedin.com/in/sonalgoyal>
>
>> On Fri, Oct 31, 2014 at 5:44 PM, Steve Lewis <lordjoe2...@gmail.com>
>> wrote:
>>
>>> The original problem is in biology but the following captures the CS
>>> issues, Assume ...