[ https://issues.apache.org/jira/browse/NUTCH-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554232#comment-16554232 ]
Markus Jelsma edited comment on NUTCH-2612 at 7/24/18 1:24 PM:
---------------------------------------------------------------

Updated patch:
* logging when a hostname is processed
* added note of this feature to usage text
* sitemaps_from_hostdb to sitemaps_from_hostname

was (Author: markus17):
Updated patch:
* logging when a hostname is processed
* added note of this feature to usage text

> Support for sitemap processing by hostname
> ------------------------------------------
>
>                 Key: NUTCH-2612
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2612
>             Project: Nutch
>          Issue Type: Improvement
>          Components: sitemap
>    Affects Versions: 1.14
>            Reporter: Markus Jelsma
>            Assignee: Markus Jelsma
>            Priority: Major
>             Fix For: 1.16
>
>         Attachments: NUTCH-2612.patch, NUTCH-2612.patch, NUTCH-2612.patch
>
>
> Add support to sitemap processor for processing just hostnames. Similar to
> the mapper eating sitemap URL's, but then with BaseRobotRules finding the
> sitemap URL's itself.
> Will upload patch soon.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)