[ https://issues.apache.org/jira/browse/NUTCH-1574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Antoinette updated NUTCH-1574:
------------------------------

    Description: I am looking for a fix to prevent indexing the list of files crawled via http(s) protocol. For example: I have 10 files in a directory. Nutch finds and Solr indexes 11, the first being a list of the other 10 files.  (was: I am looking for a fix to prevent crawling of parent directories via http(s) protocol. NUTCH-407/905 only seems to cover file protocol.)

> Crawling parent directories for http(s) protocol
> -------------------------------------------------
>
>                 Key: NUTCH-1574
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1574
>             Project: Nutch
>          Issue Type: Bug
>   Affects Versions: 1.6
>            Reporter: Antoinette
>
> I am looking for a fix to prevent indexing the list of files crawled via
> http(s) protocol. For example: I have 10 files in a directory. Nutch finds
> and Solr indexes 11, the first being a list of the other 10 files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira