A join seems to me the proper approach, followed by keying the fits by KeyID
and using combineByKey to choose the best fit for each key.
I am implementing that now and will report on performance.
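
A minimal sketch of the "choose the best" step, written against plain Java collections rather than Spark so that it runs standalone. The Fit class, its fields, and the BestFit helper are hypothetical names, and "best" is assumed to mean the highest score. The three functions mirror the createCombiner, mergeValue, and mergeCombiners roles that combineByKey takes; this is an illustration of the idea, not the actual implementation.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;
import java.util.function.Function;

// Hypothetical holder for one scored key/lock pairing.
class Fit {
    final String keyId;
    final String lockId;
    final double score;

    Fit(String keyId, String lockId, double score) {
        this.keyId = keyId;
        this.lockId = lockId;
        this.score = score;
    }
}

class BestFit {
    // The three roles combineByKey asks for; the combiner type is Fit itself.
    static final Function<Fit, Fit> CREATE = fit -> fit;
    static final BiFunction<Fit, Fit, Fit> MERGE_VALUE =
            (best, next) -> next.score > best.score ? next : best;
    static final BinaryOperator<Fit> MERGE_COMBINERS =
            (a, b) -> b.score > a.score ? b : a;

    // Local stand-in for what happens inside one partition: fold fits by keyId.
    static Map<String, Fit> bestPerKey(List<Fit> fits) {
        Map<String, Fit> best = new HashMap<>();
        for (Fit fit : fits) {
            Fit current = best.get(fit.keyId);
            best.put(fit.keyId,
                    current == null ? CREATE.apply(fit) : MERGE_VALUE.apply(current, fit));
        }
        return best;
    }

    // Merge the per-partition results, as mergeCombiners would across partitions.
    static Map<String, Fit> mergePartitions(Map<String, Fit> left, Map<String, Fit> right) {
        Map<String, Fit> merged = new HashMap<>(left);
        right.forEach((keyId, fit) -> merged.merge(keyId, fit, MERGE_COMBINERS));
        return merged;
    }
}
```

Because only one Fit per key is carried forward, the amount of data shuffled per key stays constant no matter how many locks a key fits.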
On Fri, Oct 31, 2014 at 11:56 AM, Sonal Goyal sonalgoy...@gmail.com wrote:
Does the following help?
The original problem is in biology, but the following captures the CS
issues. Assume I have a large number of locks and a large number of keys.
There is a scoring function between keys and locks and a key that fits a
lock will have a high score. There may be many keys fitting one lock and a
key
Hi Steve,
Are you talking about sequence alignment?
--
FG
On Fri, Oct 31, 2014 at 5:44 PM, Steve Lewis lordjoe2...@gmail.com
wrote:
The original problem is in biology, but the following captures the CS
issues. Assume I have a large number of locks and a large number of keys.
There is a
Does the following help?
JavaPairRDD<bin,key> join with JavaPairRDD<bin,lock>
If you partition both RDDs by the bin id, I think you should be able to get
what you want.
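
To make the suggestion concrete, here is a small sketch of what joining on the bin id produces, using plain Java collections so it runs without a cluster. BinJoin and joinByBin are hypothetical names, and keys and locks are assumed to be plain strings tagged with an integer bin. In Spark the corresponding steps would be partitionBy on both pair RDDs followed by join; this only illustrates the resulting candidate pairs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BinJoin {
    // Inner join of (bin, key) with (bin, lock) on the bin id.
    // Each result is a {key, lock} pair; only pairs sharing a bin are produced.
    static List<String[]> joinByBin(List<Map.Entry<Integer, String>> keysByBin,
                                    List<Map.Entry<Integer, String>> locksByBin) {
        // Index the locks by bin so each key only meets locks from its own bin.
        Map<Integer, List<String>> locks = new HashMap<>();
        for (Map.Entry<Integer, String> e : locksByBin) {
            locks.computeIfAbsent(e.getKey(), bin -> new ArrayList<>()).add(e.getValue());
        }
        List<String[]> candidates = new ArrayList<>();
        for (Map.Entry<Integer, String> e : keysByBin) {
            for (String lock : locks.getOrDefault(e.getKey(), new ArrayList<>())) {
                candidates.add(new String[] {e.getValue(), lock});
            }
        }
        return candidates;
    }
}
```

Only the candidate pairs that come out of the join need to be scored, which is the point of binning: keys never get compared with locks from other bins.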
Best Regards,
Sonal
Nube Technologies http://www.nubetech.co
http://in.linkedin.com/in/sonalgoyal
On Fri, Oct 31, 2014 at