I'm using Nutch 1.0.
My subcollections.xml config file is configured like this:

<?xml version="1.0" encoding="UTF-8"?>
<subcollections>
    <subcollection>
        <name>sub1</name>
        <id>sub1</id>
        <whitelist>
            http://www.apache.org/
        </whitelist>
        <blacklist />
    </subcollection>
    <subcollection>
        <name>sub2</name>
        <id>sub2</id>
        <whitelist>
            http://www.mysql.com/
        </whitelist>
        <blacklist />
    </subcollection>
    <subcollection>
        <name>sub3</name>
        <id>sub3</id>
        <whitelist>
            http://www.redhat.com/
        </whitelist>
        <blacklist />
    </subcollection>
</subcollections>


After indexing, and after making sure the subcollection plugin was activated in nutch-site.xml,
I checked the index with Luke.
The subcollection field was populated as expected, with the values sub1, sub2 and sub3.
The problem is that any search restricted to a subcollection
returns zero results in Luke.
The command line gives the same result:
./bin/nutch org.apache.nutch.searcher.NutchBean "subcollection:sub1 apache"
Total hits: 0
A normal search works, and following the explain link on its results shows the correct subcollection value, but any search of the form subcollection:sub1 followed by text returns no results.
Could this be a bug?
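
For reference, the plugin is activated through the plugin.includes property in nutch-site.xml. The fragment below is only a sketch of that setting: the other plugin names are an assumption based on the stock Nutch 1.0 default list and may not match my actual file exactly. The relevant part is the added "subcollection" entry.

```xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>plugin.includes</name>
    <!-- Assumed stock plugin list; "subcollection" is appended to activate the plugin -->
    <value>protocol-http|urlfilter-regex|parse-(text|html|js)|index-(basic|anchor)|query-(basic|site|url)|response-(json|xml)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)|subcollection</value>
  </property>
</configuration>
```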
