Re: Latent Semantic Analysis for Document Categorization

2015-03-30 Thread Ted Dunning
to discover topics better Chirag Nagpal Department of Computer Engineering Army Institute of Technology, Pune From: He

Re: Latent Semantic Analysis for Document Categorization

2015-03-30 Thread Hersheeta Chandankar
That way you will be able to discover topics better Chirag Nagpal Department of Computer Engineering Army Institute of Technology, Pune From: Hersheeta Chandankar

Latent Semantic Analysis for Document Categorization

2015-03-30 Thread Hersheeta Chandankar
Hi Ted, Thank you for a quick reply. It would be of great help if you could please explain what kind of 'linking information between documents' I should look for.

Re: Latent Semantic Analysis for Document Categorization

2015-03-26 Thread Ted Dunning
Chirag Nagpal Department of Computer Engineering Army Institute of Technology, Pune From: Hersheeta Chandankar Sent: Thursday, March 26, 2015 6:25 PM To: user@mahout.apache.org

Re: Latent Semantic Analysis for Document Categorization

2015-03-26 Thread Hersheeta Chandankar
neering Army Institute of Technology, Pune From: Hersheeta Chandankar Sent: Thursday, March 26, 2015 6:25 PM To: user@mahout.apache.org Subject: Latent Semantic Analysis for Document Categorization Hi, I'

Re: Latent Semantic Analysis for Document Categorization

2015-03-26 Thread David Starina
able to discover topics better Chirag Nagpal Department of Computer Engineering Army Institute of Technology, Pune From: Hersheeta Chandankar Sent: Thursday, March 26, 2015 6:25 PM To: user@mahout.apache.org Subject:

Re: Latent Semantic Analysis for Document Categorization

2015-03-26 Thread 3316 Chirag Nagpal
topics better Chirag Nagpal Department of Computer Engineering Army Institute of Technology, Pune From: Hersheeta Chandankar Sent: Thursday, March 26, 2015 6:25 PM To: user@mahout.apache.org Subject: Latent Semantic Analysis for Document Categorization Hi

Latent Semantic Analysis for Document Categorization

2015-03-26 Thread Hersheeta Chandankar
in Mahout, which has given a good accuracy of 70%-75%. But I would still like to improve the accuracy by retrieving the semantic dependencies between words of the documents. I've read about Latent Semantic Analysis (LSA), which creates a term-document matrix and subjects it to mathematical

Re: Latent Semantic Analysis

2012-06-05 Thread Ted Dunning
for the third time, in the context of LSA, a faster and hence perhaps better alternative to Lanczos is SSVD. Is there any specific reason

Re: Latent Semantic Analysis

2012-06-04 Thread Dmitriy Lyubimov
bimov <dlie...@gmail.com> wrote: for the third time, in the context of LSA, a faster and hence perhaps

Re: Latent Semantic Analysis

2012-06-04 Thread Dmitriy Lyubimov
Hi Guys, Per your advice I did upgrade to Mahout 0.6 and did a bunch of API changes and in the meantime realized I had a bug wit

Re: Latent Semantic Analysis

2012-06-04 Thread Peyman Mohajerian
g the below error now, in the context of some other Mahout algorithm there was a mention of '/tmp' vs '/_tmp'

Re: Latent Semantic Analysis

2012-04-06 Thread Dmitriy Lyubimov
SEVERE: java.util.NoSuchElementException at com.google.common.c

Re: Latent Semantic Analysis

2012-04-06 Thread Peyman Mohajerian
org.apache.mahout.math.decomposer.lanczos.LanczosSolver.solve(LanczosSolver.java:104) at lsa4solr.mahout_matrix$decompose_svd.invoke(mahout_matrix.clj:165)

Re: Latent Semantic Analysis

2012-04-05 Thread Dmitriy Lyubimov
On Mon, Feb 20, 2012 at 10:38 AM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:

Re: Latent Semantic Analysis

2012-04-05 Thread Dmitriy Lyubimov
at 10:38 AM, Dmitriy Lyubimov <dlie...@gmail.com> wrote: Peyman,

Re: Latent Semantic Analysis

2012-04-05 Thread Peyman Mohajerian
wrote: Hi Dmitriy & Others, Dmitriy, thanks for your previous response. I have a follow-up question to my LSA

Re: Latent Semantic Analysis

2012-04-05 Thread Dmitriy Lyubimov
However my LanczosSolver in Mahout 0.4 does not find any eigenvalues (there are eigenvectors, as you see in the follow-up logs). The only things I'm doing di

Re: Latent Semantic Analysis

2012-04-05 Thread Dmitriy Lyubimov
y field already removes the noise and makes the clustering work and the raw index data does not do that, am I correct or are there other potential explanations? For the desired rank I'

Re: Latent Semantic Analysis

2012-04-05 Thread Dmitriy Lyubimov
result comes out, no clusters found. If my issue is related to not having summarization done, how can that be done in Solr? I wasn't able to find a Summary field in Solr.

Re: Latent Semantic Analysis

2012-04-05 Thread Peyman Mohajerian
Eigenvector 0 found with eigenvalue 0.0 Feb 19, 2012 3:25:20 AM org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve INFO: Eigenvector 1 found with eigenvalue 0.0 Fe

Re: Latent Semantic Analysis

2012-04-05 Thread Dmitriy Lyubimov
mahout.math.decomposer.lanczos.LanczosSolver solve INFO: Eigenvector 3 found with eigenvalue 0.0 Feb 19, 2012 3:25:20 AM org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve INFO: Eigenvector 4 found wi

Re: Latent Semantic Analysis

2012-04-05 Thread Dmitriy Lyubimov
0.0 Feb 19, 2012 3:25:20 AM org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve INFO: Eigenvector 6 found with eigenvalue 0.0 Feb 19, 2012 3:25:20 AM org.apache.mahout.mat

Re: Latent Semantic Analysis

2012-04-05 Thread Peyman Mohajerian
org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve INFO: Eigenvector 7 found with eigenvalue 0.0 Feb 19, 2012 3:25:20 AM org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve INFO: Eigenvector 8

Re: Latent Semantic Analysis

2012-02-26 Thread Dmitriy Lyubimov
eb 19, 2012 3:25:20 AM org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve INFO: Eigenvector 10 found with eigenvalue 0.0 Feb 19, 2012 3:25:20 AM org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve INFO: LanczosSolver

Re: Latent Semantic Analysis

2012-02-26 Thread Peyman Mohajerian
osSolver solve INFO: Eigenvector 10 found with eigenvalue 0.0 Feb 19, 2012 3:25:20 AM org.apache.mahout.math.decomposer.lanczos.LanczosSolver solve INFO: LanczosSolver finished. On Sun, Jan 1, 2012 at 10:06 PM, Dmitriy Lyubimov wrote:

Re: Latent Semantic Analysis

2012-02-20 Thread Dmitriy Lyubimov
ommands. Nuances are understanding the dictionary format and LLR analysis of n-grams, and perhaps using a slightly better lemmatizer than the default one. With the indexing part you are on your own at this point. On Jan 1, 2012 2:28 PM, "Peyman Mohajerian" wrote

Re: Latent Semantic Analysis

2012-02-19 Thread Ted Dunning
mat and LLR analysis of n-grams, and perhaps using a slightly better lemmatizer than the default one. With the indexing part you are on your own at this point. On Jan 1, 2012 2:28 PM, "Peyman Mohajerian" wrote: Hi Guy

Re: Latent Semantic Analysis

2012-02-19 Thread Peyman Mohajerian
PM, "Peyman Mohajerian" wrote: Hi Guys, I'm interested in this work: http://www.ccri.com/blog/2010/4/2/latent-semantic-analysis-in-solr-using-clojure.html I looked at some of the comments and noticed that there wa

Re: Latent Semantic Analysis

2012-01-01 Thread Dmitriy Lyubimov
PM, "Peyman Mohajerian" wrote: Hi Guys, I'm interested in this work: http://www.ccri.com/blog/2010/4/2/latent-semantic-analysis-in-solr-using-clojure.html I looked at some of the comments and noticed that there was interest in incorporating

Latent Semantic Analysis

2012-01-01 Thread Peyman Mohajerian
Hi Guys, I'm interested in this work: http://www.ccri.com/blog/2010/4/2/latent-semantic-analysis-in-solr-using-clojure.html I looked at some of the comments and noticed that there was interest in incorporating it into Mahout, back in 2010. I'm also having issues running this c