Peyman,

Yes, what Ted said. Please use the 0.6 release. Also try ssvd; it may
benefit you in some regards compared to Lanczos.
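
For reference, here is a rough sketch of how the seqdirectory, seq2sparse and
ssvd steps could be chained. It only builds and prints the command lines and
runs nothing. The directory names, the helper name lsa_pipeline_commands and
the default rank/oversampling values are placeholders made up for
illustration, and the tfidf-vectors subdirectory is what I would expect
seq2sparse to write with its default weighting, so please check the paths
and options against your Mahout version.

```python
import shlex


def lsa_pipeline_commands(text_dir, work_dir, rank=50, oversampling=15):
    """Return the three Mahout CLI invocations as argument lists.

    Nothing is executed here. All paths and default values are
    illustrative assumptions, not recommended settings.
    """
    seq_dir = f"{work_dir}/seqfiles"   # hypothetical output of seqdirectory
    vec_dir = f"{work_dir}/vectors"    # hypothetical output of seq2sparse
    svd_dir = f"{work_dir}/ssvd"       # hypothetical output of ssvd
    return [
        # 1. Convert a directory of text files into SequenceFiles.
        ["mahout", "seqdirectory", "-i", text_dir, "-o", seq_dir],
        # 2. Turn the SequenceFiles into sparse term vectors.
        ["mahout", "seq2sparse", "-i", seq_dir, "-o", vec_dir],
        # 3. Run stochastic SVD on the (assumed) TF-IDF vectors;
        #    -k is the decomposition rank, -p the oversampling parameter.
        ["mahout", "ssvd", "-i", f"{vec_dir}/tfidf-vectors", "-o", svd_dir,
         "-k", str(rank), "-p", str(oversampling)],
    ]


if __name__ == "__main__":
    for cmd in lsa_pipeline_commands("20news-text", "lsa-work"):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Printing the commands first makes it easy to review them before running each
step by hand.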

-d

On Sun, Feb 19, 2012 at 10:34 AM, Peyman Mohajerian <[email protected]> wrote:
> Hi Dmitriy & Others,
>
> Dmitriy thanks for your previous response.
> I have a follow-up question about my LSA project. I have managed to
> upload 1,500 documents from two different newsgroups (one about
> graphics and one about atheism,
> http://people.csail.mit.edu/jrennie/20Newsgroups/) to Solr. However, my
> LanczosSolver in Mahout.4 does not find any nonzero eigenvalues (it does
> report eigenvectors, as you can see in the logs below).
> The only thing I'm doing differently from
> (https://github.com/algoriffic/lsa4solr) is that I'm not using the
> 'Summary' field but rather the actual 'text' field in Solr. I'm
> assuming the issue is that the Summary field already removes the noise and
> makes the clustering work, while the raw index data does not. Am I
> correct, or are there other potential explanations? For the desired
> rank I'm using values between 10 and 100 and looking for between 2 and 10
> clusters (different values for different trials), but the result is
> always the same: no clusters found.
> If my issue is related to not having summarization done, how can that
> be done in Solr? I wasn't able to find a Summary field in Solr.
>
> Thanks
> Peyman
>
>
> Feb 19, 2012 3:25:20 AM
> org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve
> INFO: Lanczos iteration complete - now to diagonalize the tri-diagonal
> auxiliary matrix.
> Feb 19, 2012 3:25:20 AM
> org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve
> INFO: Eigenvector 0 found with eigenvalue 0.0
> Feb 19, 2012 3:25:20 AM
> org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve
> INFO: Eigenvector 1 found with eigenvalue 0.0
> Feb 19, 2012 3:25:20 AM
> org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve
> INFO: Eigenvector 2 found with eigenvalue 0.0
> Feb 19, 2012 3:25:20 AM
> org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve
> INFO: Eigenvector 3 found with eigenvalue 0.0
> Feb 19, 2012 3:25:20 AM
> org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve
> INFO: Eigenvector 4 found with eigenvalue 0.0
> Feb 19, 2012 3:25:20 AM
> org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve
> INFO: Eigenvector 5 found with eigenvalue 0.0
> Feb 19, 2012 3:25:20 AM
> org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve
> INFO: Eigenvector 6 found with eigenvalue 0.0
> Feb 19, 2012 3:25:20 AM
> org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve
> INFO: Eigenvector 7 found with eigenvalue 0.0
> Feb 19, 2012 3:25:20 AM
> org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve
> INFO: Eigenvector 8 found with eigenvalue 0.0
> Feb 19, 2012 3:25:20 AM
> org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve
> INFO: Eigenvector 9 found with eigenvalue 0.0
> Feb 19, 2012 3:25:20 AM
> org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve
> INFO: Eigenvector 10 found with eigenvalue 0.0
> Feb 19, 2012 3:25:20 AM
> org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve
> INFO: LanczosSolver finished.
>
>
> On Sun, Jan 1, 2012 at 10:06 PM, Dmitriy Lyubimov <[email protected]> wrote:
>> In Mahout, an LSA pipeline is possible with the seqdirectory, seq2sparse and ssvd
>> commands. The nuances are understanding the dictionary format and the LLR analysis of
>> n-grams, and perhaps using a slightly better lemmatizer than the default one.
>>
>> With the indexing part, you are on your own at this point.
>> On Jan 1, 2012 2:28 PM, "Peyman Mohajerian" <[email protected]> wrote:
>>
>>> Hi Guys,
>>>
>>> I'm interested in this work:
>>>
>>> http://www.ccri.com/blog/2010/4/2/latent-semantic-analysis-in-solr-using-clojure.html
>>>
>>> I looked at some of the comments and noticed that there was interest
>>> in incorporating it into Mahout, back in 2010. I'm also having issues
>>> running this code due to dependencies on an older version of Mahout.
>>>
>>> I was wondering if LSA is now directly available in Mahout? Also if I
>>> upgrade to the latest Mahout would this Clojure code work?
>>>
>>> Thanks
>>> Peyman
>>>
