Hi Gail,

I've used Naive Bayes classification [1] to accomplish similar things in the past. The method can be used to sort blocks of text into predefined categories (your taxonomy) based on word frequencies (in the items that come from automatic feeds). It's a pretty popular approach for filtering spam out of inboxes, but it can be used much more generally and with as many categories as you'd like. Implementation tends to follow these steps:
1. Set up a classifier and categories [2]
2. Train the classifier with sample content for each of your categories
3. Test the classifier with additional sample content to make sure it's working reasonably well
4. Refine over time

A nice reference implementation might be POPFile [3]. POPFile sorts emails into categories you define and then refine by letting it know when it's made a mistake. The Wikipedia page on Naive Bayes can lead you to other methods or you might consider a more advanced solution like SPSS's Predictive Text Analytics [4].

Sincerely,
Joseph Dombroski

[1] http://en.wikipedia.org/wiki/Naive_Bayesian_classification
[2] Many programming languages have libraries to make this easier. You can also find software that will help you set these up. I did a quick search for feed classifiers and found the service http://rss.knownews.net as well as some software at http://the.taoofmac.com/space/blog/2006/11/04
[3] http://getpopfile.org/
[4] http://www.spss.com/text_mining_for_clementine/

Posted from the new ixda.org
http://www.ixda.org/discuss?post=38635
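
P.S. To make steps 1 to 3 above a bit more concrete, here is a minimal sketch of a word-frequency Naive Bayes classifier in plain Python. It is only an illustration, not how POPFile or any particular library does it: the class name NaiveBayesClassifier, its methods, the category names, and the sample sentences are all made up for this example, and the add-one (Laplace) smoothing is my own assumption about how to handle words a category has never seen. A real setup would need far more training content per category.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesClassifier:
    """Hypothetical minimal classifier; names are illustrative only."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> training items seen
        self.vocabulary = set()                  # every word seen in training

    def train(self, category, text):
        """Step 2: record the word frequencies of one sample item."""
        words = tokenize(text)
        self.word_counts[category].update(words)
        self.doc_counts[category] += 1
        self.vocabulary.update(words)

    def scores(self, text):
        """Return a log-probability score for each trained category."""
        words = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        result = {}
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            # Prior: share of training items that belong to this category.
            log_prob = math.log(n_docs / total_docs)
            for word in words:
                # Add-one smoothing (an assumption) keeps unseen words
                # from driving the probability to zero.
                log_prob += math.log((counts[word] + 1) / (total_words + vocab_size))
            result[category] = log_prob
        return result

    def classify(self, text):
        """Pick the category with the highest score."""
        if not self.doc_counts:
            raise ValueError("train the classifier before classifying")
        category_scores = self.scores(text)
        return max(category_scores, key=category_scores.get)


# Step 1: set up the classifier; categories are created as you train them.
classifier = NaiveBayesClassifier()

# Step 2: train with sample content for each category (made-up samples).
classifier.train("sports", "the team won the match with a late goal")
classifier.train("sports", "the coach praised the players after the game")
classifier.train("finance", "the bank raised interest rates again")
classifier.train("finance", "stock markets fell as investors sold shares")

# Step 3: test with content the classifier has not seen.
print(classifier.classify("players scored a goal in the game"))
print(classifier.classify("investors worry about interest rates"))
```

Step 4 (refining over time) would amount to calling train() again with the correct category whenever the classifier gets an item wrong, which is roughly the feedback loop POPFile exposes to its users.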