Hi All and Marc,

There is the carrot project :
http://www.cs.put.poznan.pl/dweiss/carrot/

The carrot system consists of webservices that can easily be fed by a lucene
resultlist. You simply have to create a JSP that creates this XML file and
create a custom process and input component. The input component
for lucene could look like:

<?xml version="1.0" encoding="UTF-8"?>
<service xmlns      =
"http://www.dawidweiss.com/projects/carrot/componentDescriptor"; framework  =
"Carrot2">
    <component id               = "carrot2.input.lucene"
               type             = "input"
               serviceURL       = "http://localhost/weblucene/c2.jsp";
               infoURL          = "http://localhost/weblucene/";
    />
</service>

The c2.jsp file simply has to translate a resultlist into an XLM file such
as:
<searchresult>
    <document id="1">
 <title>...</title>
 <weight>1.0</weight>
 <url>http://...</url>
 <summary>sum 1</summary>
 <snippet>snip 2</snippet>
    </document>
    <document id="2">
 <title>...</title>
 <weight>1.0</weight>
 <url>http://...</url>
 <summary>sum 2</summary>
 <snippet>snip 2</snippet>
    </document>
</searchresult>

Feed this into the carrot system, and you will get a nice clustered
result list. The amazing part is of this clustering mechanism is that
the cluster labels are incredible, their great!

Then there is a open source project called Classifier4J that can
be used for classification, the oposite of clustering. These other
open source projects are a great addition to the Lucene system.

I hope this helps...

Marc, what are you building?? Maybe we can help!

Kind regards,

Maurits


----- Original Message ----- 
From: "marc" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Tuesday, November 11, 2003 5:15 PM
Subject: Document Clustering


Hi,

does anyone have any sample code/documentation available for doing document
based clustering using lucene?

Thanks,
Marc



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to