Clustering with Lucene?

2011-04-26 Thread vivek sar
Hi, I've been researching clustering with Lucene. Here is what I've found so far: 1) Lucene clustering with Carrot2 - http://download.carrot2.org/head/manual/#section.getting-started.lucene - but this seems suitable only for smaller indexes (few hundred documents

Clustering with Lucene

2005-10-17 Thread msftblows
Hi All- I have seen an example using carrot2 for clustering, but have not really played with it that much. Does anyone have a good example of using clustering with Lucene...has anyone attempted to do it with carrot2 or something else? I was initially going to do a faceted search...which

Document clustering with Lucene

2008-05-14 Thread Supheakmungkol SARIN
Dear all, I'd like to do document clustering using full-text with Lucene. In other words, I would like to group similar documents into their respective groups. I searched the mailing list and found that there are two approaches. The first method is to represent the one document as query and sear

Re: Clustering with Lucene?

2011-04-26 Thread Dawid Weiss
Still, there are plenty of algorithms and preprocessing options to consider, so if you provide more background somebody may push you in the right direction. Dawid On Tue, Apr 26, 2011 at 1:49 PM, vivek sar wrote: > Hi, > >  I've been researching about clustering with Lucene. Here

Re: Clustering with Lucene?

2011-04-26 Thread vivek sar
; > On Tue, Apr 26, 2011 at 1:49 PM, vivek sar wrote: >> Hi, >> >>  I've been researching about clustering with Lucene. Here is what >> I've found so far, >> >> 1) Lucene clustering with Carrot2 - >> http://download.carrot2.org/head/manual/#section.

Re: Clustering with Lucene?

2011-04-26 Thread Dawid Weiss
> 1) We index around 20 fields, of that we want to have grouping option > for five of them. For ex., user can search on name of the city and we > should have option to group by products available in that city (and > vice-versa). > Are these fields strictly defined or free text? Because if they are

Re: Clustering with Lucene?

2011-04-26 Thread vivek sar
Thanks Dawid. I was trying to give an example, but this is not exactly our text. Our fields include things like "user name", "IP Address", "Application Name", "Port 3", "Byte Count" - all network-related stuff. So, if a user searches on a certain IP address then we would need to group the result by u

Re: Clustering with Lucene?

2011-04-26 Thread Dawid Weiss
They may not be dictionary words, but there is a limited number of term entries and they seem regular. Your inquiries indicate you need a faceting feature (or even an SQL-like set of queries backed up by a fast index...), probably with some pruning. Clustering is an unsupervised process that attempts to
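To illustrate the faceting idea mentioned in this reply, here is a minimal sketch in plain Python. It does not use Lucene; the record layout, the field names (`ip`, `user`, `app`) and the helper `facet_counts` are all hypothetical and only show what "group the results of a search by a field with a limited set of values" amounts to.

```python
from collections import Counter

def facet_counts(hits, field):
    """Count how many matching records carry each value of `field`."""
    return Counter(hit[field] for hit in hits if field in hit)

# Hypothetical network records; real data would come from the index.
records = [
    {"ip": "10.0.0.1", "user": "alice", "app": "ssh"},
    {"ip": "10.0.0.1", "user": "bob", "app": "http"},
    {"ip": "10.0.0.2", "user": "alice", "app": "http"},
    {"ip": "10.0.0.1", "user": "alice", "app": "http"},
]

# "Search" on one IP address, then group the hits by two other fields.
hits = [r for r in records if r["ip"] == "10.0.0.1"]
by_user = facet_counts(hits, "user")
by_app = facet_counts(hits, "app")
```

Unlike clustering, nothing here is discovered: the groups are simply the distinct values already stored in the chosen field.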

Re: Clustering with Lucene

2005-10-17 Thread Stanislaw Osinski
lly > played with it that much. Does anyone have a good example of using > clustering with Lucene...has anyone attempted to do it with carrot2 or > something else? > > I was initially going to do a faceted search...which would be much > simpler, but my taxonomies are not built

Re: Document clustering with Lucene

2008-05-15 Thread Otis Gospodnetic
008 11:23:45 PM > Subject: Document clustering with Lucene > > Dear all, > > I'd like to do document clustering using full-text with Lucene. In other > words, > I would like to group similar documents in their respective groups. I > searched > the mailing lis

Re: Document clustering with Lucene

2008-05-16 Thread Grant Ingersoll
ngkol SARIN <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Wednesday, May 14, 2008 11:23:45 PM Subject: Document clustering with Lucene Dear all, I'd like to do document clustering using full-text with Lucene. In other words, I would like to group similar documents in their r

Re: Document clustering with Lucene

2008-05-17 Thread Supheakmungkol SARIN
he term vectors. Anyway, thank you Otis and Grant for your suggestions. I appreciate them. Regards, Supheakmungkol - Original Message From: Grant Ingersoll <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Friday, May 16, 2008 7:22:39 PM Subject: Re: Document clustering wit

Re: Document clustering with Lucene

2008-05-17 Thread Grant Ingersoll
On May 17, 2008, at 1:15 PM, Supheakmungkol SARIN wrote: You're right. I want to cluster precisely the documents that are already in the index. I don't know much about the Mahout project, but it seems that it doesn't help much. What I want is simply to group together similar documents
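As a rough illustration of "grouping together similar documents" from term vectors, here is a small sketch in plain Python. It is not the approach proposed in this thread and does not call Lucene or Mahout. The whitespace tokenizer, the greedy single-pass grouping and the 0.5 threshold are all assumptions chosen for brevity.

```python
import math
from collections import Counter

def term_vector(text):
    """Term-frequency vector from naive whitespace tokenization (assumption)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def greedy_clusters(docs, threshold=0.5):
    """Assign each document to the first cluster whose seed is similar enough."""
    clusters = []  # list of (seed vector, member indices)
    for i, text in enumerate(docs):
        vec = term_vector(text)
        for seed, members in clusters:
            if cosine(seed, vec) >= threshold:
                members.append(i)
                break
        else:
            # No existing cluster is close enough: start a new one.
            clusters.append((vec, [i]))
    return [members for _, members in clusters]

docs = [
    "lucene index search",
    "lucene search index speed",
    "cooking pasta recipe",
    "pasta recipe cooking time",
]
groups = greedy_clusters(docs)
```

The result depends on document order and on the threshold, so this is only a starting point; with an existing index, the term vectors would be read from it instead of being recomputed from raw text.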