Sebastian Nagel created NUTCH-1718:
--------------------------------------

             Summary: update description of property http.robots.agent
                 Key: NUTCH-1718
                 URL: https://issues.apache.org/jira/browse/NUTCH-1718
             Project: Nutch
          Issue Type: Bug
          Components: fetcher
    Affects Versions: 2.2.1, 2.2, 1.7
            Reporter: Sebastian Nagel
            Priority: Trivial
             Fix For: 2.3, 1.8


The description of property http.robots.agents in nutch-default.xml recommends 
adding a '*' to the list of agent names. This will cause the same problem as 
described in NUTCH-1715, so the description should be updated. The statement 
about "order of precedence" also needs revising: since NUTCH-1031, precedence 
is dictated only by the ordering of user agents in robots.txt.
The current entry:
{code:xml}
<property>
  <name>http.robots.agents</name>
  <value>*</value>
  <description>The agent strings we'll look for in robots.txt files,
  comma-separated, in decreasing order of precedence. You should
  put the value of http.agent.name as the first agent name, and keep the
  default * at the end of the list. E.g.: BlurflDev,Blurfl,*
  </description>
</property>
{code}
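
For illustration only, a revised entry might look roughly like the sketch below. The description wording is hypothetical and is not the committed fix. It reuses the example agent names from the current description and only reflects the two points raised above: no '*' in the list, and precedence decided by robots.txt rather than by the order of names in this property.
{code:xml}
<property>
  <name>http.robots.agents</name>
  <!-- Example names taken from the current description; note there is no '*' entry. -->
  <value>BlurflDev,Blurfl</value>
  <!-- Hypothetical wording, an assumption for illustration only. -->
  <description>The agent strings we'll look for in robots.txt files,
  comma-separated. Do not add * to the list. Which rules apply is
  determined by the ordering of user agents in robots.txt, not by the
  order of the names given here. E.g.: BlurflDev,Blurfl
  </description>
</property>
{code}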



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
