Categorization typically assigns documents to a node in a pre-defined taxonomy.

In clustering, however, the 'structure' is emergent: the clusters (which are 
analogous to taxonomy nodes) are not defined in advance but are created 
dynamically from the content of the documents at hand.
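
To make the contrast concrete, here is a rough sketch in Python. Everything in 
it is made up for illustration: the two-node TAXONOMY, the keyword sets, the 
sample documents, and the 0.2 similarity threshold are all assumptions, and the 
greedy single-pass grouping is just one simple way to cluster, not how any 
particular tool does it.

```python
# Toy illustration only: taxonomy, keywords, documents and threshold are
# hypothetical values chosen to show the difference, not real data.

def tokens(text):
    """Lower-case a document and return its set of distinct words."""
    return set(text.lower().split())

def jaccard(a, b):
    """Jaccard similarity between two word sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Categorization: the nodes exist before any document is seen.
TAXONOMY = {
    "search": {"lucene", "index", "query"},
    "cooking": {"recipe", "oven", "flour"},
}

def categorize(doc):
    """Assign a document to the pre-defined node sharing the most keywords."""
    words = tokens(doc)
    return max(TAXONOMY, key=lambda node: len(words & TAXONOMY[node]))

# Clustering: no nodes are given; groups emerge from the documents themselves.
def cluster(docs, threshold=0.2):
    """Greedy single-pass clustering: join the first cluster whose first
    document is similar enough, otherwise start a new cluster."""
    clusters = []
    for doc in docs:
        words = tokens(doc)
        for group in clusters:
            if jaccard(words, tokens(group[0])) >= threshold:
                group.append(doc)
                break
        else:
            clusters.append([doc])
    return clusters

docs = [
    "lucene index query tuning",
    "lucene query parser index",
    "bread recipe flour oven",
    "oven recipe for flour tortillas",
]
labels = [categorize(d) for d in docs]   # one pre-defined node per document
groups = cluster(docs)                   # unnamed groups found in the data
```

Note that categorize() can only ever return a name that is already in TAXONOMY, 
while cluster() returns groups that have no names at all and whose number 
depends on the documents and the threshold.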


-----Original Message-----
From: petite_abeille [mailto:[EMAIL PROTECTED]
Sent: Tuesday, November 11, 2003 10:50 AM
To: Lucene Users List
Subject: Re: Document Clustering


Hi Otis,

On Nov 11, 2003, at 16:41, Otis Gospodnetic wrote:

> How is document clustering different/related to text categorization?

Not that I'm an expert in any of this, but clustering is a much more 
"holistic" approach than categorization. Usually, categorization is 
understood as a more precise endeavor (e.g. dmoz.org), while clustering 
is much more "fuzzy" and non-deterministic. Both try to achieve the 
same goal though. So perhaps this is just a question of jargon.

I'm confident that the owner of this site could help shed some light 
on the finer points of clustering vs categorization:

http://www.lissus.com/resources/index.htm

Cheers,

PA.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

