[jira] [Commented] (NUTCH-2148) Review and update mapred --> mapreduce config params in crawl script

Hudson (JIRA) Wed, 21 Oct 2015 21:54:44 -0700

    [ https://issues.apache.org/jira/browse/NUTCH-2148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14968554#comment-14968554 ]


Hudson commented on NUTCH-2148:
-------------------------------

SUCCESS: Integrated in Nutch-trunk #3292 (See 
[https://builds.apache.org/job/Nutch-trunk/3292/])
NUTCH-2148 Review and update mapred --> mapreduce config params in crawl script 
(lewismc: [http://svn.apache.org/viewvc/nutch/trunk/?view=rev&rev=1709943])
* trunk/CHANGES.txt
* trunk/src/bin/crawl


> Review and update mapred --> mapreduce config params in crawl script
> --------------------------------------------------------------------
>
>                 Key: NUTCH-2148
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2148
>             Project: Nutch
>          Issue Type: New Feature
>          Components: bin
>    Affects Versions: 1.10, 2.3.1
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>             Fix For: 1.11
>
>         Attachments: NUTCH-2148.patch, NUTCH-2148v2.patch
>
>
> Configuration parameters inside of $NUTCH_HOME/src/bin/crawl currently include
> {code}
> commonOptions="-D mapred.reduce.tasks=$numTasks -D mapred.child.java.opts=-Xmx1000m -D mapred.reduce.tasks.speculative.execution=false -D mapred.map.tasks.speculative.execution=false -D mapred.compress.map.output=true"
> {code}
> as well as
> {code}
>   skipRecordsOptions="-D mapred.skip.attempts.to.start.skipping=2 -D mapred.skip.map.max.skip.records=1"
>   __bin_nutch parse $commonOptions $skipRecordsOptions "$CRAWL_PATH"/segments/$SEGMENT
> {code}
> In all honesty, this should have been addressed as part of the upgrade to 
> Hadoop 2.4.0. Whoops.
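
For reference, a minimal sketch of how the options quoted above might look with Hadoop 2 property names. The new names are taken from Hadoop's published table of deprecated properties, not from the attached patches, so the committed change may differ. The value of numTasks is a placeholder for illustration. As far as I know mapred.child.java.opts is not on the deprecated list, so it is left unchanged here.

```shell
#!/usr/bin/env bash
# Sketch only: Hadoop 2 property names per Hadoop's deprecated-properties
# table; this is an assumption about the fix, not a copy of NUTCH-2148.patch.

numTasks=2   # placeholder value for illustration

# mapred.reduce.tasks                       -> mapreduce.job.reduces
# mapred.reduce.tasks.speculative.execution -> mapreduce.reduce.speculative
# mapred.map.tasks.speculative.execution    -> mapreduce.map.speculative
# mapred.compress.map.output                -> mapreduce.map.output.compress
# mapred.child.java.opts is kept as-is (not listed as deprecated; the
# per-task alternatives are mapreduce.map.java.opts / mapreduce.reduce.java.opts).
commonOptions="-D mapreduce.job.reduces=$numTasks \
-D mapred.child.java.opts=-Xmx1000m \
-D mapreduce.reduce.speculative=false \
-D mapreduce.map.speculative=false \
-D mapreduce.map.output.compress=true"

# mapred.skip.attempts.to.start.skipping -> mapreduce.task.skip.start.attempts
# mapred.skip.map.max.skip.records       -> mapreduce.map.skip.maxrecords
skipRecordsOptions="-D mapreduce.task.skip.start.attempts=2 \
-D mapreduce.map.skip.maxrecords=1"

# Print the assembled option strings so they can be inspected before use.
echo "$commonOptions"
echo "$skipRecordsOptions"
```

The backslash-newline pairs inside the double quotes are line continuations, so each variable still expands to a single line of -D options, as in the original script.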



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
