[ https://issues.apache.org/jira/browse/NUTCH-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Cooper-Ellis updated NUTCH-1872:
-----------------------------------------
    Attachment: urlmeta_propagation.diff

> enables control over how injected metadata is propagated
> --------------------------------------------------------
>
>                 Key: NUTCH-1872
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1872
>             Project: Nutch
>          Issue Type: New Feature
>            Reporter: Jonathan Cooper-Ellis
>            Priority: Minor
>         Attachments: urlmeta_propagation.diff
>
>
> This builds on NUTCH-655 and NUTCH-855, giving users some control over
> which outlinks receive injected metadata. A new configuration property,
> "urlmeta.rule", has been introduced, with a default value of "all".
> The value "all" indicates that "urlmeta.tags" should be propagated to all
> outlinks. The other options are:
>   "host"   - propagate only to outlinks with the same host as the URL
>              with which the metadata was injected
>   "domain" - the same, except matching on the domain rather than the host
>   "prefix" - treat the injected URL as a prefix, so metadata is only
>              propagated to URLs that extend the injected URL
> I would appreciate feedback on whether you think this is a useful feature,
> and on whether it is implemented properly.
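
To make the four "urlmeta.rule" values concrete, here is a minimal Python sketch of the matching logic as described above. It is not the code in urlmeta_propagation.diff (the patch itself is not quoted in this message). The function names are hypothetical, the "domain" check is simplified to the last two host labels instead of a proper public-suffix lookup, and raising an error on an unknown rule is an assumption about behaviour the description does not specify.

```python
from urllib.parse import urlsplit


def _host(url):
    """Lower-cased host part of a URL, or '' if there is none."""
    return (urlsplit(url).hostname or "").lower()


def _domain(url):
    """Simplified 'domain': the last two labels of the host.

    Assumption for illustration only; a real implementation would
    consult a public-suffix list (e.g. for hosts under co.uk).
    """
    return ".".join(_host(url).split(".")[-2:])


def should_propagate(rule, injected_url, outlink):
    """Decide whether injected metadata should be copied to an outlink.

    rule         -- value of the "urlmeta.rule" property
    injected_url -- URL the metadata was originally injected with
    outlink      -- candidate outlink URL
    """
    if rule == "all":
        return True
    if rule == "host":
        return _host(injected_url) == _host(outlink)
    if rule == "domain":
        return _domain(injected_url) == _domain(outlink)
    if rule == "prefix":
        return outlink.startswith(injected_url)
    # Assumption: reject unknown values rather than silently defaulting.
    raise ValueError("unknown urlmeta.rule value: %r" % (rule,))
```

With a seed of http://www.example.com/docs/, the "host" rule in this sketch would accept http://www.example.com/about but not http://blog.example.com/, "domain" would accept both, and "prefix" would accept only URLs under /docs/.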