[ https://issues.apache.org/jira/browse/NUTCH-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885650#comment-13885650 ]
Tejas Patil commented on NUTCH-1718:
------------------------------------

Hi [~someuser77],

Yup. I am waiting for folks to comment on whether that addition is fine. If it is, I will go ahead and update the description of this jira.

> update description of property http.robots.agent
> ------------------------------------------------
>
>                 Key: NUTCH-1718
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1718
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 1.7, 2.2, 2.2.1
>            Reporter: Sebastian Nagel
>            Priority: Trivial
>             Fix For: 2.3, 1.8
>
>         Attachments: NUTCH-1718-trunk.v1.patch
>
>
> The description of property http.robots.agents in nutch-default.xml recommends
> adding a '*' to the list of agent names. This will cause the same problem as
> described in NUTCH-1715. The description should be updated. The same applies to
> the "order of precedence", which since NUTCH-1031 is dictated only by the
> ordering of user agents in robots.txt.
> {code:xml}
> <property>
>   <name>http.robots.agents</name>
>   <value>*</value>
>   <description>The agent strings we'll look for in robots.txt files,
>   comma-separated, in decreasing order of precedence. You should
>   put the value of http.agent.name as the first agent name, and keep the
>   default * at the end of the list. E.g.: BlurflDev,Blurfl,*
>   </description>
> </property>
> {code}

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)