Depending on how you intend to use this, the following might suggest a more substantial speed-up. I'll first recap what we've seen so far to provide a basis for my timings, then sketch out an unfinished idea for potentially speeding up the process.

   findSDM=: +/@:="1/~          NB. Tarmo Veskioja's original
   findSDMvc=: +/"1@:(="1/~)    NB. Victor Cerovski's
   findSDMrm=: +/@:(=/~"1)      NB. Raul Miller's

   (10) 6!:2 'findSDM t2'
5.68061
   (10) 6!:2 'findSDMvc t2'
4.32129
   (10) 6!:2 'findSDMrm T2' [ T2=: |:t2
4.21779
NB. So, the two suggestions are both a little bit better on my machine.

NB. A preliminary idea for speeding up process by reducing amount
NB. of data processed per invocation by grouping "like" items:
   <.%:#t2                       NB. Try to scale as square root of number of records...
70
   refpts=: 70 40 ?@$ 5          NB. Random reference points...
   $keys=. +/+/"1 refpts="1/t2   NB. Group by similarity to reference points
5000
   $findSDM&.>keys </. t2
136
NB. This gives matches within groups - a partial, approximate solution...
   $&.>findSDM&.>keys </. t2
+-----+-----+-------+-------+-----+-----+-----+-----...
|57 57|73 73|107 107|116 116|76 76|29 29|74 74|65 65...
+-----+-----+-------+-------+-----+-----+-----+-----...

NB. Combining these ideas:
   findSDMdhm=: 3 : 'refpts;findSDM &.> (+/+/"1 (refpts=: ((<.%:#y),1{$y)?@$5)="1/y) </. y'
   (10) 6!:2 'findSDMdhm t2'
0.150275

--
Devon McCormick, CFA
^me^ at acm.
org is my preferred e-mail