Categorization typically assigns documents to a node in a pre-defined taxonomy.
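
To make the "pre-defined taxonomy" point concrete, here is a minimal sketch of categorization in Python. The taxonomy, its keywords, and the `categorize` helper are all made up for illustration; real categorizers usually rely on trained classifiers rather than raw keyword overlap. The thing to notice is that the set of possible answers is fixed before any document is seen.

```python
# Hypothetical taxonomy: node name -> keywords describing it (illustrative only).
TAXONOMY = {
    "search": {"index", "query", "lucene", "ranking"},
    "databases": {"sql", "table", "transaction", "index"},
}


def categorize(text, taxonomy=TAXONOMY):
    """Return the taxonomy node whose keywords overlap the document the most."""
    words = set(text.lower().split())
    # The candidate labels are exactly the taxonomy's nodes, fixed up front;
    # a document can never be assigned to a node that was not defined beforehand.
    return max(taxonomy, key=lambda node: len(words & taxonomy[node]))


print(categorize("lucene builds an inverted index to answer a query"))
```
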
For clustering, however, the categorization 'structure' is emergent... i.e. the clusters (which are analogous to taxonomy nodes) are created dynamically based on the content of the documents at hand.

-----Original Message-----
From: petite_abeille [mailto:[EMAIL PROTECTED]
Sent: Tuesday, November 11, 2003 10:50 AM
To: Lucene Users List
Subject: Re: Document Clustering

Hi Otis,

On Nov 11, 2003, at 16:41, Otis Gospodnetic wrote:

> How is document clustering different/related to text categorization?

Not that I'm an expert in any of this, but clustering is a much more "holistic" approach than categorization. Usually, categorization is understood as a more precise endeavor (e.g. dmoz.org), while clustering is much more "fuzzy" and non-deterministic. Both try to achieve the same goal though. So perhaps this is just a question of jargon.

I'm confident that the owner of this site could help bring some light on the finer point of clustering vs categorization:

http://www.lissus.com/resources/index.htm

Cheers,

PA.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]