I have been digging through the user lists for Solr and Nutch, as well as
reading lots of blogs, etc.  I have yet to find a clear answer (maybe there
is none).

I am trying to find the best way ahead in choosing a technology that will
allow me to use a large taxonomy for classifying structured and
unstructured data, and then display those classifications as facets to the
user during search.

There seem to be several approaches, some of which encode the taxonomy
terms into the index at index time, but I have seen no mention of HOW
to extract those terms from the text in the first place.  Some sort of text
classification software, I am assuming.  If that is true, are there any good
open source engines that can classify text against a taxonomy?
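For reference, the index-time encoding I keep seeing described prefixes each taxonomy path with its depth, so that each level can be faceted on independently.  This is just a sketch of my own understanding, not tied to any specific patch, and the function name is mine:

```python
def taxonomy_tokens(path):
    """Encode a taxonomy path as depth-prefixed facet tokens.

    "Science/Physics/Optics" ->
    ["0/Science", "1/Science/Physics", "2/Science/Physics/Optics"]

    Each token carries its depth and the full path down to that level,
    so a single multi-valued field can hold the whole hierarchy.
    """
    parts = path.split("/")
    return ["%d/%s" % (i, "/".join(parts[: i + 1]))
            for i in range(len(parts))]
```

At index time, every category assigned to a document would be expanded into these tokens and stored in one multi-valued facet field.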

The other approach seems to be the two patches being developed for Solr 3.0,
SOLR-792 and SOLR-64.  Again, I think you would still need some sort of
engine to produce the category information that gets added at index time.
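If the depth-prefixed encoding is what those patches build on, I would guess the drill-down side amounts to a prefix filter over the facet tokens (something like Solr's facet.prefix).  Again, just my own sketch; the real patches may work differently, and the names here are made up:

```python
def drilldown(index_tokens, selected_path):
    """Return the facet tokens one level below the selected category.

    index_tokens: all depth-prefixed tokens present in the facet field.
    selected_path: the category the user clicked, e.g. "Science".
    """
    # A path with N slashes sits at depth N; its children sit at N + 1.
    child_depth = selected_path.count("/") + 1
    prefix = "%d/%s/" % (child_depth, selected_path)
    return sorted(t for t in index_tokens if t.startswith(prefix))

# Example: clicking "Science" should surface its immediate children.
tokens = ["0/Science", "0/Art",
          "1/Science/Physics", "1/Science/Biology",
          "2/Science/Physics/Optics"]
children = drilldown(tokens, "Science")
```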

I have also seen some interesting literature on using Drupal and the Solr
module.  

My current architecture uses Nutch (1.2) for crawling, solrindex for indexing
(Solr 1.4.1), and Ajax Solr for my UI.

I have also seen some talk in webinars, etc. from Lucid Imagination about
upcoming development on "Native Taxonomy Facets".  Any idea where that
development stands?

I have to use the most stable version of Solr/Nutch/Lucene possible for my
implementation, because, unfortunately, once I choose, going back will be
next to impossible for years to come!

Thanks!




-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Taxonomy-and-Faceting-tp2028442p2028442.html
Sent from the Solr - User mailing list archive at Nabble.com.
