Peyman,
Yes, what Ted said. Please take the 0.6 release. Also try ssvd; it may
benefit you in some regards compared to Lanczos.

-d

On Sun, Feb 19, 2012 at 10:34 AM, Peyman Mohajerian wrote:
> Hi Dmitriy & Others,
>
> Dmitriy, thanks for your previous response.
> I have a follow-up question about my LSA project. I have managed to
> upload 1,500 documents from two different newsgroups (one about
> graphics and one about atheism,
> http://people.csail.mit.edu/jrennie/20Newsgroups/) to Solr. However, my
> LanczosSolver in Mahout 0.4 does not find any eigenvalues (there are
> eigenvectors, as you can see in the logs below).
> The only thing I'm doing differently from lsa4solr
> (https://github.com/algoriffic/lsa4solr) is that I'm not using the
> 'Summary' field but rather the actual 'text' field in Solr. I'm
> assuming the issue is that the Summary field already removes the noise
> and makes the clustering work, and the raw index data does not do that.
> Am I correct, or are there other potential explanations? For the desired
> rank I'm using values between 10 and 100 and looking for between 2 and
> 10 clusters (different values for different trials), but the same
> result always comes out: no clusters found.
> If my issue is related to not having summarization done, how can that
> be done in Solr? I wasn't able to find a Summary field in Solr.
>
> Thanks
> Peyman
>
>
> Feb 19, 2012 3:25:20 AM org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve
> INFO: Lanczos iteration complete - now to diagonalize the tri-diagonal auxiliary matrix.
> Feb 19, 2012 3:25:20 AM org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve
> INFO: Eigenvector 0 found with eigenvalue 0.0
> Feb 19, 2012 3:25:20 AM org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve
> INFO: Eigenvector 1 found with eigenvalue 0.0
> Feb 19, 2012 3:25:20 AM org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve
> INFO: Eigenvector 2 found with eigenvalue 0.0
> Feb 19, 2012 3:25:20 AM org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve
> INFO: Eigenvector 3 found with eigenvalue 0.0
> Feb 19, 2012 3:25:20 AM org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve
> INFO: Eigenvector 4 found with eigenvalue 0.0
> Feb 19, 2012 3:25:20 AM org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve
> INFO: Eigenvector 5 found with eigenvalue 0.0
> Feb 19, 2012 3:25:20 AM org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve
> INFO: Eigenvector 6 found with eigenvalue 0.0
> Feb 19, 2012 3:25:20 AM org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve
> INFO: Eigenvector 7 found with eigenvalue 0.0
> Feb 19, 2012 3:25:20 AM org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve
> INFO: Eigenvector 8 found with eigenvalue 0.0
> Feb 19, 2012 3:25:20 AM org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve
> INFO: Eigenvector 9 found with eigenvalue 0.0
> Feb 19, 2012 3:25:20 AM org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve
> INFO: Eigenvector 10 found with eigenvalue 0.0
> Feb 19, 2012 3:25:20 AM org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve
> INFO: LanczosSolver finished.
>
>
> On Sun, Jan 1, 2012 at 10:06 PM, Dmitriy Lyubimov wrote:
>> In Mahout, an LSA pipeline is possible with the seqdirectory,
>> seq2sparse and ssvd commands. The nuances are understanding the
>> dictionary format and the LLR analysis of n-grams, and perhaps using a
>> slightly better lemmatizer than the default one.
>>
>> With the indexing part you are on your own at this point.
>> On Jan 1, 2012 2:28 PM, "Peyman Mohajerian" wrote:
>>
>>> Hi Guys,
>>>
>>> I'm interested in this work:
>>>
>>> http://www.ccri.com/blog/2010/4/2/latent-semantic-analysis-in-solr-using-clojure.html
>>>
>>> I looked at some of the comments and noticed that there was interest
>>> in incorporating it into Mahout, back in 2010. I'm also having issues
>>> running this code due to dependencies on an older version of Mahout.
>>>
>>> I was wondering if LSA is now directly available in Mahout? Also, if I
>>> upgrade to the latest Mahout, would this Clojure code work?
>>>
>>> Thanks
>>> Peyman
>>>
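
A note on the log quoted above, offered as an assumption rather than a diagnosis confirmed anywhere in this thread: eigenvalues that are all exactly 0.0 are what any decomposition returns for a matrix with no nonzero entries. So before tuning the rank or the number of clusters, it may be worth checking that the term-document matrix built from the 'text' field is not empty. The sketch below is plain Python and does not call Mahout or Solr. The function names (nonzero_count, top_singular_value) and the toy matrices are hypothetical and exist only to illustrate the symptom.

```python
import math

def nonzero_count(rows):
    """Count the nonzero entries of a dense matrix given as a list of rows."""
    return sum(1 for row in rows for value in row if value != 0)

def top_singular_value(rows, iterations=50):
    """Estimate the largest singular value by power iteration on A^T A.

    Illustrative only: this is not the Lanczos or SSVD algorithm that
    Mahout implements, just the simplest method that shows the symptom.
    """
    if not rows or not rows[0]:
        return 0.0
    n_cols = len(rows[0])
    # Uniform start vector; a nonnegative count matrix has a nonnegative
    # leading singular vector, so this start is not orthogonal to it.
    v = [1.0 / math.sqrt(n_cols)] * n_cols
    sigma = 0.0
    for _ in range(iterations):
        u = [sum(a * x for a, x in zip(row, v)) for row in rows]  # u = A v
        w = [sum(rows[i][j] * u[i] for i in range(len(rows)))     # w = A^T u
             for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            # A v vanished, which is what happens for an all-zero matrix.
            return 0.0
        v = [x / norm for x in w]
        sigma = math.sqrt(norm)  # ||A^T A v|| approaches sigma^2
    return sigma

# Hypothetical toy data: rows are documents, columns are term counts.
empty_matrix = [[0, 0, 0], [0, 0, 0]]
toy_matrix = [[2, 1, 0], [0, 1, 3]]

if __name__ == "__main__":
    for name, matrix in (("empty", empty_matrix), ("toy", toy_matrix)):
        print(name, nonzero_count(matrix), top_singular_value(matrix))
```

If the real matrix turns out to contain nonzero entries, this explanation does not apply and the cause lies elsewhere.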
