[ http://issues.apache.org/jira/browse/NUTCH-271?page=comments#action_12412436 ]

Gal Nitzan commented on NUTCH-271:
----------------------------------

Sorry for the short comment.

Actually, the meta tags functionality is already available in version 0.8,
along with a CrawlDatum object.

You can build the required functionality just by developing plugins for
parsing, indexing, and querying.
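
As a rough illustration of the lookup such an indexing plugin would need, here
is a minimal sketch in plain Java. It does not use any Nutch API: the class
UrlTagger and its methods add and tagFor are hypothetical names made up for
this example. It only shows the idea of mapping a start URL prefix to a tag,
as in the issue description; a real plugin would call something like this
from its indexing step and store the result as an extra field.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of Nutch: maps start-URL prefixes to tags.
final class UrlTagger {
    private final Map<String, String> prefixToTag = new LinkedHashMap<>();

    // Register a start URL prefix and the tag to attach to pages below it.
    UrlTagger add(String prefix, String tag) {
        prefixToTag.put(prefix, tag);
        return this;
    }

    // Return the tag of the longest registered prefix that the URL starts
    // with, or an empty Optional if the URL is below none of them.
    Optional<String> tagFor(String url) {
        String best = null;
        for (String prefix : prefixToTag.keySet()) {
            if (url.startsWith(prefix)
                    && (best == null || prefix.length() > best.length())) {
                best = prefix;
            }
        }
        return best == null
                ? Optional.empty()
                : Optional.of(prefixToTag.get(best));
    }
}
```

With the prefixes from the issue registered, a page such as
http://www.example1.com/something1/page.html would get "companybranch1", and
a URL outside all start URLs would get no tag. Limiting a search to one
branch then comes down to querying on that extra field.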

HTH.

> Meta-data per URL/site/section
> ------------------------------
>
>          Key: NUTCH-271
>          URL: http://issues.apache.org/jira/browse/NUTCH-271
>      Project: Nutch
>         Type: New Feature

>     Versions: 0.7.2
>     Reporter: Stefan Neufeind

>
> We have the need to index sites and attach additional meta-data-tags to them. 
> Afaik this is not yet possible, or is there a "workaround" I don't see? What 
> I think of is using meta-tags per start-url, only indexing content below that 
> URL, and have the ability to limit searches upon those meta-tags. E.g.
> http://www.example1.com/something1/   -> meta-tag "companybranch1"
> http://www.example2.com/something2/   -> meta-tag "companybranch2"
> http://www.example3.com/something3/   -> meta-tag "companybranch1"
> http://www.example4.com/something4/   -> meta-tag "companybranch3"
> search for everything in companybranch1 or across 1 and 3 or similar
