maurits van wijland wrote:
Hi All and Marc,
There is the Carrot project: http://www.cs.put.poznan.pl/dweiss/carrot/
The Carrot system consists of web services that can easily be fed with a Lucene result list. You simply have to create a JSP that renders the result list as an XML file, and then create a custom process and an input component. The input component for Lucene could look like:
<?xml version="1.0" encoding="UTF-8"?>
<service xmlns="http://www.dawidweiss.com/projects/carrot/componentDescriptor"
         framework="Carrot2">
  <component id="carrot2.input.lucene"
             type="input"
             serviceURL="http://localhost/weblucene/c2.jsp"
             infoURL="http://localhost/weblucene/" />
</service>
The c2.jsp file simply has to translate a result list into an XML file such as:

<searchresult>
  <document id="1">
    <title>...</title>
    <weight>1.0</weight>
    <url>http://...</url>
    <summary>sum 1</summary>
    <snippet>snip 1</snippet>
  </document>
  <document id="2">
    <title>...</title>
    <weight>1.0</weight>
    <url>http://...</url>
    <summary>sum 2</summary>
    <snippet>snip 2</snippet>
  </document>
</searchresult>
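To make the translation step concrete, here is a rough sketch in plain Java of what the core of such a c2.jsp might do. It is only an illustration under assumptions: the class name SearchResultXml, the Result holder and the toXml and escape methods are hypothetical names, not part of Carrot or Lucene, and in a real page the field values would be read from the Lucene result list instead of being hard-coded.

```java
import java.util.ArrayList;
import java.util.List;

public class SearchResultXml {

    /** Hypothetical holder for one hit; a real JSP would fill it from the Lucene result list. */
    public static final class Result {
        final String title;
        final double weight;
        final String url;
        final String summary;
        final String snippet;

        public Result(String title, double weight, String url, String summary, String snippet) {
            this.title = title;
            this.weight = weight;
            this.url = url;
            this.summary = summary;
            this.snippet = snippet;
        }
    }

    /** Escapes the XML special characters so field text cannot break the markup. */
    public static String escape(String s) {
        if (s == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Builds the <searchresult> document, numbering the documents from 1 in list order. */
    public static String toXml(List<Result> results) {
        StringBuilder xml = new StringBuilder();
        xml.append("<searchresult>\n");
        int id = 1;
        for (Result r : results) {
            xml.append("  <document id=\"").append(id++).append("\">\n");
            xml.append("    <title>").append(escape(r.title)).append("</title>\n");
            xml.append("    <weight>").append(r.weight).append("</weight>\n");
            xml.append("    <url>").append(escape(r.url)).append("</url>\n");
            xml.append("    <summary>").append(escape(r.summary)).append("</summary>\n");
            xml.append("    <snippet>").append(escape(r.snippet)).append("</snippet>\n");
            xml.append("  </document>\n");
        }
        xml.append("</searchresult>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for real search hits.
        List<Result> results = new ArrayList<>();
        results.add(new Result("First hit", 1.0, "http://example.org/1", "sum 1", "snip 1"));
        results.add(new Result("Second hit", 1.0, "http://example.org/2", "sum 2", "snip 2"));
        System.out.print(toXml(results));
    }
}
```

The escaping matters because titles and snippets taken from indexed documents often contain characters such as & or <, which would otherwise make the XML invalid.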
Feed this into the Carrot system and you will get a nicely clustered result list. The amazing part of this clustering mechanism is the cluster labels: they are great!
Then there is an open source project called Classifier4J that can be used for classification, the opposite of clustering. These other open source projects are a great addition to the Lucene system.
I hope this helps...
Marc, what are you building?? Maybe we can help!
Kind regards,
Maurits
----- Original Message -----
From: "marc" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Tuesday, November 11, 2003 5:15 PM
Subject: Document Clustering
Hi,
Does anyone have any sample code/documentation available for doing document-based clustering using Lucene?
Thanks, Marc
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
-- day time: www.media-style.com spare time: www.text-mining.org | www.weta-group.net