Re: [Perldl] Multidimensional scaling

Jean Véronis Sun, 03 Feb 2013 21:59:27 -0800

On 3 Feb 2013, at 14:20, Stefan Evert <[email protected]> wrote:


> 
>> Yes, apparently I am having trouble reading today!  I saw "Principal 
>> Coordinates Analysis" on the MDS wikipedia page, and my brain interpreted it 
>> as "Principal Components Analysis", and I went in the completely wrong 
>> direction.  Sorry about that.
> 
> 
> Classical MDS is basically the same as PCA.  It embeds data points in a 
> sufficiently high-dimensional space to reproduce the distance matrix and then 
> performs a truncated PCA to the required number of dimensions.
> 
> However, iterative non-metric MDS algorithms will usually give much better 
> results.
> 
> I'm afraid I can't offer any help with the Perl/R bridge.  In some of my 
> own software, I use the Expect module to communicate interactively with R 
> (which is indeed very slow).  I tried Statistics::R once, but that was even 
> slower.  The RSPerl interface is much faster, but with every OS or R upgrade 
> I had to spend several days hacking it to get it to work again, and I haven't 
> been able to do so on Mac OS X for the last three or four years.

I have some hopes for Rserve (http://www.rforge.net/Rserve/). There is a 
recent attempt to build a client for Perl 
(https://github.com/djun-kim/Statistics--RserveClient). The implementation is 
quite young, but the developer is active and responsive. We'll see.

> 
> Jean, in your application the bottleneck will be to upload the distance 
> matrix to R and (perhaps less critically) get back the MDS vectors, right?  
> If you're willing to put in the extra work, you can speed up communication a 
> lot by exchanging data through external files.  Text files are quite fine 
> (using scan()/write() in R), but you could also try a SQLite database, which 
> has excellent support in both R and Perl.

You're right, the difficulty is the size of the data (mainly uploading the 
matrix). At the moment, the text file solution is the one I use. I haven't 
tried SQLite (I suspect that writing the matrix to the database could be 
slow), but I am considering Redis, which could be fast, and for which an R 
client exists (https://github.com/bwlewis/rredis). Linux named pipes are 
another option, as simple as text files, but probably much faster. I am going 
to experiment.
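
To make the named-pipe idea concrete, here is a minimal sketch in Python 
(used only for illustration; the same pattern applies from Perl). It writes a 
distance matrix as whitespace-separated text into a FIFO while a reader 
consumes it. The reader is a stand-in for the R side, which I assume could 
read the same stream with scan(). The file name dist.fifo and all function 
names are hypothetical, and os.mkfifo is POSIX-only.

```python
import os
import tempfile
import threading


def write_matrix(path, matrix):
    """Write a matrix as whitespace-separated text, one row per line."""
    # On a FIFO, open() blocks until a reader has opened the other end.
    with open(path, "w") as out:
        for row in matrix:
            out.write(" ".join(repr(float(x)) for x in row) + "\n")


def read_matrix(path):
    """Read the matrix back; stand-in for the consumer (e.g. the R side)."""
    # On a FIFO, iteration ends when the writer closes its end.
    with open(path) as src:
        return [[float(tok) for tok in line.split()]
                for line in src if line.strip()]


def roundtrip_through_fifo(matrix):
    """Send a matrix through a named pipe: writer thread, reader in caller."""
    workdir = tempfile.mkdtemp()
    fifo = os.path.join(workdir, "dist.fifo")  # hypothetical file name
    os.mkfifo(fifo)  # POSIX only
    try:
        writer = threading.Thread(target=write_matrix, args=(fifo, matrix))
        writer.start()
        received = read_matrix(fifo)
        writer.join()
        return received
    finally:
        os.remove(fifo)
        os.rmdir(workdir)
```

Since the data never touches the disk, the only cost beyond the pipe itself 
is formatting and parsing the numbers as text, which is the same as in the 
plain text file approach.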

Thanks for your comments, in any case!
--j


> 
> Best regards,
> Stefan Evert
> 
> [ [email protected] | www.stefan-evert.de ]
> 
> 
> 
> 
> _______________________________________________
> Perldl mailing list
> [email protected]
> http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
