0.8 has a subcollection plugin. It can add a subcollection id to a set of URLs, and you can then limit searches to specific subcollections. Is that what you're after?

--
Sami Siren

Stefan Neufeind (JIRA) wrote:

[ http://issues.apache.org/jira/browse/NUTCH-271?page=comments#action_12422226 ]

Stefan Neufeind commented on NUTCH-271:
---------------------------------------

Does anybody have an existing demo plugin for this, one that reads URL prefixes from a 
file and adds certain tags whenever a URL matches? I don't yet fully see 
how to do it "the elegant way" :-)

Meta-data per URL/site/section
------------------------------

               Key: NUTCH-271
               URL: http://issues.apache.org/jira/browse/NUTCH-271
           Project: Nutch
        Issue Type: New Feature
  Affects Versions: 0.7.2
          Reporter: Stefan Neufeind

We need to index sites and attach additional meta-data tags to them. Afaik this 
is not yet possible, or is there a "workaround" I don't see? What I have in mind is 
assigning meta-tags per start URL, indexing only content below that URL, and being able 
to limit searches by those meta-tags. E.g.
http://www.example1.com/something1/   -> meta-tag "companybranch1"
http://www.example2.com/something2/   -> meta-tag "companybranch2"
http://www.example3.com/something3/   -> meta-tag "companybranch1"
http://www.example4.com/something4/   -> meta-tag "companybranch3"
Search for everything in companybranch1, or across companybranch1 and companybranch3, or similar.
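
To make the request concrete, here is a minimal sketch of the prefix-to-tag idea in Python. It is not Nutch code and does not use the subcollection plugin's API; the names PREFIX_TAGS, tag_for_url and filter_by_tags are hypothetical, and the table simply repeats the example URLs above. The longest-prefix rule is an assumption for the case where prefixes overlap.

```python
from typing import Dict, Iterable, List, Optional, Set

# Hypothetical prefix-to-tag table, copied from the example URLs above.
PREFIX_TAGS: Dict[str, str] = {
    "http://www.example1.com/something1/": "companybranch1",
    "http://www.example2.com/something2/": "companybranch2",
    "http://www.example3.com/something3/": "companybranch1",
    "http://www.example4.com/something4/": "companybranch3",
}


def tag_for_url(url: str, prefix_tags: Dict[str, str] = PREFIX_TAGS) -> Optional[str]:
    """Return the tag of the longest prefix that matches url, or None."""
    matches = [prefix for prefix in prefix_tags if url.startswith(prefix)]
    if not matches:
        # The URL lies outside every configured start URL.
        return None
    return prefix_tags[max(matches, key=len)]


def filter_by_tags(urls: Iterable[str], wanted: Set[str]) -> List[str]:
    """Keep only the URLs whose tag is one of the wanted tags."""
    return [url for url in urls if tag_for_url(url) in wanted]


if __name__ == "__main__":
    pages = [
        "http://www.example1.com/something1/a.html",
        "http://www.example2.com/something2/b.html",
        "http://www.example4.com/something4/c.html",
        "http://www.example1.com/elsewhere/d.html",
    ]
    # Limit a "search" to companybranch1 and companybranch3.
    for page in filter_by_tags(pages, {"companybranch1", "companybranch3"}):
        print(page, tag_for_url(page))
```

In a real index the tag would be stored as a field at indexing time and the restriction applied as a query clause, rather than filtering a list of URLs after the fact.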

