Re: [Jprogramming] Any faster way of computing a similarity distance matrix ?

Jose Quintana Sat, 19 May 2012 22:37:34 -0700
Sorry about the previous message.  I am trying an old mail address to avoid the 
company's warnings and disclosures, but the formatting was a mess.  This is a 
second try.

My view is that if a concise version is not competitive then it is J's 
fault ;) ...

   T2=. 5000 40 $ ?.200000#5
   
   fsdm0=: +/@:="1/~       NB. Tarmo Veskioja's original
   fsdm1=: +/"1@:(="1/~)   NB. Victor Cerovski's
   fsdm2=: +/@:(=/~"1)@:|: NB. Raul Miller's
   
   fsdm3=. +/ .= |:        NB. hook: y +/ .= |: y (matrix product with = in place of *)
   
   st=. 7!:2@:] , 6!:2
   
   1 st'fsdm0 T2'
135333632 14.5976951
   1 st'fsdm1 T2'
1.20796045e9 11.6368667
   1 st'fsdm2 T2'
1.24256371e9 11.2936644
   1 st'fsdm3 T2'
136652544 8.26036541
   
   (fsdm0 -: fsdm3)T2
1

... but so far it is competitive.
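
To see why fsdm3 agrees with fsdm0, here is a small sketch on a made-up 4-by-3 table.  The names t, sdmA and sdmB are illustrative only and do not come from the thread.  The point is that `+/ .=` is the usual inner product with `=` standing in for `*`, so applying it between the table and its transpose counts, for every pair of rows, the positions where they match.

```j
   NB. Hypothetical data: 4 records with 3 attributes each
   t=. 4 3 $ 0 1 2  0 1 1  2 2 2  0 1 2

   sdmA=. +/@:="1/~   NB. same definition as fsdm0: row-by-row table of match counts
   sdmB=. +/ .= |:    NB. same definition as fsdm3: hook, i.e. y +/ .= |: y

   sdmB t
3 2 1 3
2 3 0 2
1 0 3 1
3 2 1 3
   (sdmA -: sdmB) t
1
```

The diagonal is the record length (every row matches itself in all 3 positions), and rows 0 and 3 are identical, so their entries are also 3.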
> From: [email protected]
> To: [email protected]; [email protected]
> Date: Sun, 20 May 2012 01:20:28 -0400
> Subject: [Jprogramming] [Jprogramming] Any faster way of computing a 
> similarity distance matrix ?
> 
> 
>  
> My view, in general, is that if a concise version is not competitive then it 
> is J's fault ;) ...
>
>    T2=. 5000 40 $ ?.200000#5
>
>    fsdm0=: +/@:="1/~       NB. Tarmo Veskioja's original
>    fsdm1=: +/"1@:(="1/~)   NB. Victor Cerovski's
>    fsdm2=: +/@:(=/~"1)@:|: NB. Raul Miller's
>
>    fsdm3=. +/ .= |:
>
>    st=. 7!:2@:] , 6!:2
>
>    1 st'fsdm0 T2'
> 135333632 14.5976951
>    1 st'fsdm1 T2'
> 1.20796045e9 11.6368667
>    1 st'fsdm2 T2'
> 1.24256371e9 11.2936644
>    1 st'fsdm3 T2'
> 136652544 8.26036541
>
>    (fsdm0 -: fsdm3)T2
> 1
>
> ... but so far it is competitive.
>
> On Thu, May 17, 2012 at 1:44 PM, Devon McCormick <[email protected]> wrote:
> > Depending on how you intend to use this, the following might suggest a
> > more substantial speed-up.  I'll  first re-cap what we've seen so far
> > to provide a basis for my timings, then I'll sketch out an unfinished
> > idea for potentially speeding up the process.
> >
> >   findSDM=: +/@:="1/~        NB. Tarmo Veskioja's original
> >   findSDMvc=: +/"1@:(="1/~)  NB. Victor Cerovski's
> >   findSDMrm=: +/@:(=/~"1)    NB. Raul Miller's
> >
> >   (10) 6!:2 'findSDM t2'
> > 5.68061
> >   (10) 6!:2 'findSDMvc t2'
> > 4.32129
> >   (10) 6!:2 'findSDMrm T2' [ T2=: |:t2
> > 4.21779
> >
> > NB. So, the two suggestions are both a little bit better on my machine.
> >
> > NB. A preliminary idea for speeding up process by reducing amount
> > NB. of data processed per invocation by grouping "like" items:
> >
> >   <.%:#t2                         NB. Try to scale as square root of number of records...
> > 70
> >   refpts=: 70 40 ?@$ 5            NB. Random reference points...
> >   $keys=. +/+/"1 refpts="1/t2     NB. Group by similarity to reference points
> > 5000
> >   $findSDM&.>keys </. t2
> > 136
> >
> > NB. This gives matches within groups - a partial, approximate solution...
> >   $&.>findSDM&.>keys </. t2
> > +-----+-----+-------+-------+-----+-----+-----+-----...
> > |57 57|73 73|107 107|116 116|76 76|29 29|74 74|65 65...
> > +-----+-----+-------+-------+-----+-----+-----+-----...
> >
> > NB. Combining these ideas:
> >   findSDMdhm=: 3 : 'refpts;findSDM &.> (+/+/"1 (refpts=: ((<.%:#y),1{$y)?@$5)="1/y) </. y'
> >   (10) 6!:2 'findSDMdhm t2'
> > 0.150275
> >
> > --
> > Devon McCormick, CFA
> > ^me^ at acm.
> > org is my
> > preferred e-mail
> > ----------------------------------------------------------------------
> > For information about J forums see http://www.jsoftware.com/forums.htm      