BTW I don't remember anyone on the Nutch list suggesting you to use Carrot for this (see : http://search-lucene.com/?q=luan+carrot) or classifying at querying time
What I suggested in http://search-lucene.com/m/JWZTj1q4lB92 was about classifying during the parsing or indexing and generating a field for Lucene or SOLR. As Otis pointed out you can of course use SOLR for faceting. Since you will be using Nutch anyway, you might as well avoid an external DB just for storing the results of the classification and just keep the labels e.g. in the parse metadata Julien -- DigitalPebble Ltd Open Source Solutions for Text Engineering http://www.digitalpebble.com On 9 August 2010 00:16, Luan Cestari <luan.cest...@gmail.com> wrote: > > Lucene developers, > > We’ve been working on a undergraduate project to the college about changing > Apache Nutch (that uses Lucene do index it’s web pages) to include a > category filter, and we are having problems about the query part. We want > to > develop an application with a good performance, so we thought that here > would be the best place to ask this kind of question. The idea is that the > user can search pages stored for only a category. So the number of results > found should display the number of pages that actually is classified in > that > category. > > The problem is about how to add to the Lucene indexes the category > information, and how filter the search on that. We tried to look on the > Nutch mailing-list (Nabble) about that and asked some help, but people from > there think that we should use some plug-in like Carrot, that get like 100 > of pages and classify it in the query time. We are not very confident that > it’s the best solution. We thought in other two different ideas: #1 To > classify those pages and store that information on a DB and in the query > time filter the result that DB to filter the result. #2 Use different index > servers, one for each category and one to search without filtering by > category. > > We have seen on this project http://search-lucene.com/ that there are > pre-defined categories. We think that this should be classified at indexing > time, as we wanted. > > Do you have any other idea about how to do that? > > Sincerely, > > Daniel Costa Gimenes & Luan Cestari > Undergraduate students of University Center of FEI > Brazil > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Using-categories-with-Lucene-tp1049232p1049232.html > Sent from the Lucene - Java Users mailing list archive at Nabble.com. > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > >