Sebastian Nagel created NUTCH-1718:
--------------------------------------

             Summary: update description of property http.robots.agent
                 Key: NUTCH-1718
                 URL: https://issues.apache.org/jira/browse/NUTCH-1718
             Project: Nutch
          Issue Type: Bug
          Components: fetcher
    Affects Versions: 2.2.1, 2.2, 1.7
            Reporter: Sebastian Nagel
            Priority: Trivial
             Fix For: 2.3, 1.8

The description of property http.robots.agents in nutch-default.xml recommends adding a '*' to the list of agent names. This will cause the same problem as described in NUTCH-1715. The description should be updated, including its statement about "order of precedence": since NUTCH-1031, precedence is dictated only by the ordering of user agents in robots.txt.

{code:xml}
<property>
  <name>http.robots.agents</name>
  <value>*</value>
  <description>The agent strings we'll look for in robots.txt files,
  comma-separated, in decreasing order of precedence. You should
  put the value of http.agent.name as the first agent name, and keep the
  default * at the end of the list. E.g.: BlurflDev,Blurfl,*
  </description>
</property>
{code}
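
A possible updated entry, as a sketch only: the wording is an assumption drawn from the two points raised in this issue (no '*' in the list, precedence decided by robots.txt). The empty default value is likewise hypothetical and is not the committed fix.

{code:xml}
<property>
  <name>http.robots.agents</name>
  <!-- Assumption: empty default; this issue does not state the new default value -->
  <value></value>
  <description>The agent strings we'll look for in robots.txt files,
  comma-separated. Do not add the wildcard * to this list (see NUTCH-1715).
  The order of names in this list does not set precedence: since NUTCH-1031
  it is decided only by the order of user agents in robots.txt.
  E.g.: BlurflDev,Blurfl
  </description>
</property>
{code}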