2013/9/29 Tasos Ventouris <[email protected]>:
> Thank you for your answer. I checked it with many documents, both totally
> different and similar ones. You can see an example of the text I used
> here: https://dl.dropboxusercontent.com/u/37124455/documents.txt
>
> Another script I wrote with only tf-idf shows me 69% similarity on those
> documents.
Then I guess you should really try using more documents. LSA typically shines when the number of documents is on the order of 10k-1M.
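
For reference, here is a minimal sketch of how the two similarity measures could be compared side by side. It is not the script from this thread: the toy documents, the helper name compare_similarities, and n_components=2 are assumptions made for illustration, and LSA is approximated here with TruncatedSVD applied to the tf-idf matrix. With so few documents the LSA numbers are not meaningful, which is the point made above.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def compare_similarities(docs, n_components=2, random_state=0):
    """Return (tfidf_sim, lsa_sim): pairwise cosine similarity matrices.

    Hypothetical helper for illustration; n_components is an assumed
    value and has to be tuned on a real corpus.
    """
    # Sparse tf-idf matrix, one row per document.
    tfidf = TfidfVectorizer().fit_transform(docs)
    tfidf_sim = cosine_similarity(tfidf)

    # LSA: project the tf-idf rows onto a low-rank latent space.
    svd = TruncatedSVD(n_components=n_components, random_state=random_state)
    lsa = svd.fit_transform(tfidf)
    lsa_sim = cosine_similarity(lsa)
    return tfidf_sim, lsa_sim


# Toy corpus (made up for this sketch), far smaller than what LSA needs.
docs = [
    "the cat sat on the mat",
    "a cat lay on a rug",
    "stock prices fell sharply today",
    "markets and stock prices dropped",
]
tfidf_sim, lsa_sim = compare_similarities(docs)
print(tfidf_sim.round(2))
print(lsa_sim.round(2))
```

Running the same function on a corpus of thousands of documents, with a larger n_components, would be the fairer test of LSA against plain tf-idf.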
