[ https://issues.apache.org/jira/browse/NUTCH-854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888469#action_12888469 ]
Pham Tuan Minh commented on NUTCH-854:
--------------------------------------
My idea is that we define standard attributes for Nutch to work with. If users
want to customize how their web site (data source) is crawled, they define
their own attributes in nutch-site.xml to override the defaults.
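As a minimal sketch of such an override (assuming the shipped defaults live in
conf/nutch-default.xml; the agent name "my-site-crawler" is only an
illustrative value, not one from this issue), a user would list in
conf/nutch-site.xml just the properties they want to change:
-------------
<?xml version="1.0"?>
<configuration>
<!-- Illustrative value; replaces the default http.agent.name -->
<property>
<name>http.agent.name</name>
<value>my-site-crawler</value>
</property>
</configuration>
-------------
Properties that are not listed there keep their default values.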
> Define standard attributes with values and explanation to configuration
> files in conf directory
> ------------------------------------------------------------------------------------------------
>
> Key: NUTCH-854
> URL: https://issues.apache.org/jira/browse/NUTCH-854
> Project: Nutch
> Issue Type: Improvement
> Environment: Windows XP SP3, Cygwin, JDK 1.6.20, Ant 1.8.1
> Reporter: Pham Tuan Minh
> Fix For: 2.0
>
>
> It would make Nutch easier to use if every configuration file in the conf
> directory defined standard attributes with values and explanations. For
> example, nutch-site.xml.template currently contains no attributes and no
> explanation; we should define them.
> -------------
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> <!-- site-specific property overrides in this file. -->
> <configuration>
> <!-- Agent name -->
> <property>
> <name>http.agent.name</name>
> <value>nutch-solr-integration</value>
> </property>
> <!-- Maximum number of URLs per host in a single fetchlist -->
> <property>
> <name>generate.max.per.host</name>
> <value>100</value>
> </property>
> <!-- Plugins used by this site -->
> <property>
> <name>plugin.includes</name>
> <value>protocol-http|urlfilter-regex|parse-tika|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
> </property>
> </configuration>
> -------------
> Thanks,
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.