On 5-dec-2006, at 7:01, chad savage wrote:
I'm doing some research on how to classify documents into pre-defined categories.

On the basis of what? The most appropriate technique depends on the type of documents and the type of categories. For instance, are the documents structured (e.g. all XML using a common definition) or unstructured (e.g. HTML from the web)? Are you looking to place documents in a large hierarchical category system, or is it a simple binary decision (e.g. 'Spam' or 'No spam')?
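
To make the simple binary case concrete, here is a minimal sketch of a 'Spam' / 'No spam' classifier for unstructured text. It assumes a naive Bayes model with add-one smoothing, which is only one of many possible techniques and not necessarily the right one for your documents. The function names (tokenize, train, classify) and the tiny training set are hypothetical, made up for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace splitting is good enough for a sketch.
    return text.lower().split()


def train(labeled_docs):
    """Build per-label word counts from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word -> occurrences
    doc_counts = Counter()              # label -> number of documents
    for text, label in labeled_docs:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocab


def classify(text, model):
    """Return the label with the highest naive Bayes log-score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        # Log prior: fraction of training documents carrying this label.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            # Add-one smoothing so unseen words do not zero out the score.
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data, for illustration only.
training = [
    ("win cash now", "Spam"),
    ("earn cash with surveys", "Spam"),
    ("nutch crawl configuration question", "No spam"),
    ("how to classify documents with nutch", "No spam"),
]
model = train(training)
print(classify("earn cash now", model))
print(classify("nutch crawl question", model))
```

A large hierarchical category system or structured XML input would likely call for a different setup (for example, features taken from specific fields, or one classifier per level of the hierarchy), which is why the questions above matter.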

If you know what you want and what it's called, it should be relatively easy to find information and scientific papers about it.

--
Regards,

Eelco Lempsink

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general