Sorry about the previous message. I tried an old mail address to avoid the
company's warnings and disclosures, but the formatting came out a mess. This is
a second try.

My view is that if a concise version is not competitive then it is J's
fault ;) ...

T2=. 5000 40 $ ?.200000#5
fsdm0=: +/@:="1/~ NB. Tarmo Veskioja's original
fsdm1=: +/"1@:(="1/~) NB. Victor Cerovski's
fsdm2=: +/@:(=/~"1)@:|: NB. Raul Miller's
fsdm3=. +/ .= |:
st=. 7!:2@:] , 6!:2
1 st'fsdm0 T2'
135333632 14.5976951
1 st'fsdm1 T2'
1.20796045e9 11.6368667
1 st'fsdm2 T2'
1.24256371e9 11.2936644
1 st'fsdm3 T2'
136652544 8.26036541
(fsdm0 -: fsdm3)T2
1

… but so far it is competitive.
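
For anyone following along, here is a minimal sketch of what these verbs compute. The 3-by-4 table t below is made up for illustration and is not part of the timings above. Each entry (i,j) of the result counts the positions at which records i and j agree; fsdm3 gets there as a matrix product of the table with its own transpose, using = in place of * (so it reads as t +/ .= |: t).

```j
   NB. hypothetical toy data: 3 records, 4 attributes each
   t=. 3 4 $ 0 1 2 3  0 1 0 3  4 1 2 0
   fsdm0=: +/@:="1/~   NB. row-by-row table of match counts
   fsdm3=: +/ .= |:    NB. hook: t (+/ .=) |: t
   fsdm3 t
4 3 2
3 4 1
2 1 4
   (fsdm0 -: fsdm3) t
1
```

Records 0 and 1 agree in three positions, records 1 and 2 in only one, and the diagonal is just the record length.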
> From: jmq...@hotmail.com
> To: i...@jsoftware.com; programming@jsoftware.com
> Date: Sun, 20 May 2012 01:20:28 -0400
> Subject: [Jprogramming] [Jprogramming] Any faster way of computing a
> similarity distance matrix ?
>
>
>
> My view, in general, is that if a concise version is not competitive then it
> is J's fault ;) ...
>
> T2=. 5000 40 $ ?.200000#5
> fsdm0=: +/@:="1/~ NB. Tarmo Veskioja's original
> fsdm1=: +/"1@:(="1/~) NB. Victor Cerovski's
> fsdm2=: +/@:(=/~"1)@:|: NB. Raul Miller's
> fsdm3=. +/ .= |:
> st=. 7!:2@:] , 6!:2
> 1 st'fsdm0 T2'
> 135333632 14.5976951
> 1 st'fsdm1 T2'
> 1.20796045e9 11.6368667
> 1 st'fsdm2 T2'
> 1.24256371e9 11.2936644
> 1 st'fsdm3 T2'
> 136652544 8.26036541
> (fsdm0 -: fsdm3)T2
> 1
>
> … but so far it is competitive.
>
> On Thu, May 17, 2012 at 1:44 PM, Devon McCormick <devon...@gmail.com> wrote:
> > Depending on how you intend to use this, the following might suggest a
> > more substantial speed-up. I'll first re-cap what we've seen so far
> > to provide a basis for my timings, then I'll sketch out an unfinished
> > idea for potentially speeding up the process.
> >
> > findSDM=: +/@:="1/~ NB. Tarmo Veskioja's original
> > findSDMvc=: +/"1@:(="1/~) NB. Victor Cerovski's
> > findSDMrm=: +/@:(=/~"1) NB. Raul Miller's
> >
> > (10) 6!:2 'findSDM t2'
> > 5.68061
> > (10) 6!:2 'findSDMvc t2'
> > 4.32129
> > (10) 6!:2 'findSDMrm T2' [ T2=: |:t2
> > 4.21779
> >
> > NB. So, the two suggestions are both a little bit better on my machine.
> >
> > NB. A preliminary idea for speeding up process by reducing amount
> > NB. of data processed per invocation by grouping "like" items:
> >
> > <.%:#t2 NB. Try to scale as square root of number of records...
> > 70
> > refpts=: 70 40 ?@$ 5 NB. Random reference points...
> > $keys=. +/+/"1 refpts="1/t2 NB. Group by similarity to reference points
> > 5000
> > $findSDM&.>keys </. t2
> > 136
> >
> > NB. This gives matches within groups - a partial, approximate solution...
> > $&.>findSDM&.>keys </. t2
> > +-----+-----+-------+-------+-----+-----+-----+-----...
> > |57 57|73 73|107 107|116 116|76 76|29 29|74 74|65 65...
> > +-----+-----+-------+-------+-----+-----+-----+-----...
> >
> > NB. Combining these ideas:
> > findSDMdhm=: 3 : 'refpts;findSDM &.> (+/+/"1 (refpts=: ((<.%:#y),1{$y)?@$5)="1/y) </. y'
> > (10) 6!:2 'findSDMdhm t2'
> > 0.150275
> >
> > --
> > Devon McCormick, CFA
> > ^me^ at acm.
> > org is my
> > preferred e-mail
> > ----------------------------------------------------------------------
> > For information about J forums see http://www.jsoftware.com/forums.htm
> >