[jira] [Created] (NUTCH-1303) Fetcher to skip queues for URLS getting repeated exceptions, based on percentage

2012-03-06 Thread behnam nikbakht (Created) (JIRA)
Fetcher to skip queues for URLS getting repeated exceptions, based on percentage Key: NUTCH-1303 URL: https://issues.apache.org/jira/browse/NUTCH-1303 Project: Nutch

[jira] [Updated] (NUTCH-366) Merge URLFilters and URLNormalizers

2012-03-06 Thread Chris A. Mattmann (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-366: Labels: gsoc2012 (was: ) > Merge URLFilters and URLNormalizers

Re: Apply to solve issue

2012-03-06 Thread Mattmann, Chris A (388J)
Hi Yang, I'd be willing to mentor this project. I tagged with GSOC, so it's now eligible on the ASF ComDev list for a project. Please contact d...@community.apache.org to get the info on how to apply. Here is some of it: http://community.apache.org/gsoc.html I'd be happy to mentor: https://i

[jira] [Commented] (NUTCH-366) Merge URLFilters and URLNormalizers

2012-03-06 Thread Chris A. Mattmann (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224020#comment-13224020 ] Chris A. Mattmann commented on NUTCH-366: I'd favor option #1 here, especially if

[jira] [Commented] (NUTCH-1299) LinkRank inverter to ignore records without Node

2012-03-06 Thread Hudson (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223964#comment-13223964 ] Hudson commented on NUTCH-1299: Integrated in Nutch-trunk #1779 (See [https://builds.apach

[jira] [Commented] (NUTCH-1298) Pass numTasks to FetcherJob

2012-03-06 Thread Hudson (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223957#comment-13223957 ] Hudson commented on NUTCH-1298: Integrated in Nutch-nutchgora #185 (See [https://builds.ap

[jira] [Commented] (NUTCH-902) Add all necessary files and configuration so that nutch can be used with different backends out-of-the-box

2012-03-06 Thread Hudson (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223956#comment-13223956 ] Hudson commented on NUTCH-902: Integrated in Nutch-nutchgora #185 (See [https://builds.apach

[jira] [Commented] (NUTCH-1302) nutchgora job failures should be noticed by submitter

2012-03-06 Thread Hudson (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223955#comment-13223955 ] Hudson commented on NUTCH-1302: Integrated in Nutch-nutchgora #185 (See [https://builds.ap

[jira] [Commented] (NUTCH-1253) Incompatible neko and xerces versions

2012-03-06 Thread Lewis John McGibbney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223504#comment-13223504 ] Lewis John McGibbney commented on NUTCH-1253: Hi Ferdy, the patches I attache

[jira] [Commented] (NUTCH-1299) LinkRank inverter to ignore records without Node

2012-03-06 Thread Hudson (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223485#comment-13223485 ] Hudson commented on NUTCH-1299: Integrated in nutch-trunk-maven #184 (See [https://builds.

[jira] [Resolved] (NUTCH-1299) LinkRank inverter to ignore records without Node

2012-03-06 Thread Markus Jelsma (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma resolved NUTCH-1299. Resolution: Fixed. Superfluous records are now ignored, but a WARN message is written to the log.

[jira] [Updated] (NUTCH-1290) crawlId not supported by all Tools

2012-03-06 Thread Mathijs Homminga (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mathijs Homminga updated NUTCH-1290: Attachment: NUTCH-1290.patch This patch modifies the following files in order to support cr

Re: Apply to solve issue

2012-03-06 Thread Lewis John Mcgibbney
Hi Yang, I think this would be a missed opportunity if we didn't take you up on this offer. I can only assume that the development community is short on time, which is why no one has replied to this thread. Is there any reason that you wish to attack this particular issue? Without providing justifi

[jira] [Updated] (NUTCH-1299) LinkRank inverter to ignore records without Node

2012-03-06 Thread Markus Jelsma (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1299: Priority: Major (was: Critical) Summary: LinkRank inverter to ignore records without Node (

[jira] [Updated] (NUTCH-1302) nutchgora job failures should be noticed by submitter

2012-03-06 Thread Ferdy Galema (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy Galema updated NUTCH-1302: Attachment: NUTCH-1302.patch > nutchgora job failures should be noticed by submitter

[jira] [Closed] (NUTCH-1302) nutchgora job failures should be noticed by submitter

2012-03-06 Thread Ferdy Galema (Closed) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy Galema closed NUTCH-1302. Resolution: Fixed > nutchgora job failures should be noticed by submitter

[jira] [Commented] (NUTCH-1302) nutchgora job failures should be noticed by submitter

2012-03-06 Thread Ferdy Galema (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223231#comment-13223231 ] Ferdy Galema commented on NUTCH-1302: NutchJob is a nice wrapper of Hadoop's Job, so

[jira] [Updated] (NUTCH-1299) NPE in LinkRank inverter

2012-03-06 Thread Markus Jelsma (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1299: Attachment: NUTCH-1299-1.5-2.patch. New patch logs warning with proper error message.

[jira] [Commented] (NUTCH-1290) crawlId not supported by all Tools

2012-03-06 Thread Mathijs Homminga (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223222#comment-13223222 ] Mathijs Homminga commented on NUTCH-1290: Actually, the IndexerReducer is only us

[jira] [Created] (NUTCH-1302) nutchgora job failures should be noticed by submitter

2012-03-06 Thread Ferdy Galema (Created) (JIRA)
nutchgora job failures should be noticed by submitter - Key: NUTCH-1302 URL: https://issues.apache.org/jira/browse/NUTCH-1302 Project: Nutch Issue Type: Bug Reporter: Ferdy Gale

[jira] [Updated] (NUTCH-1299) NPE in LinkRank inverter

2012-03-06 Thread Markus Jelsma (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1299: Attachment: NUTCH-1299-1.5-1.patch Most likely solution is to check whether a LoopSet enters the

[jira] [Updated] (NUTCH-1299) NPE in LinkRank inverter

2012-03-06 Thread Markus Jelsma (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1299: Patch Info: Patch Available > NPE in LinkRank inverter

[jira] [Updated] (NUTCH-1301) Index job resume switch to resume a failed job

2012-03-06 Thread Dan Rosher (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dan Rosher updated NUTCH-1301: Attachment: NUTCH-1301.patch > Index job resume switch to resume a failed job

[jira] [Created] (NUTCH-1301) Index job resume switch to resume a failed job

2012-03-06 Thread Dan Rosher (Created) (JIRA)
Index job resume switch to resume a failed job -- Key: NUTCH-1301 URL: https://issues.apache.org/jira/browse/NUTCH-1301 Project: Nutch Issue Type: Improvement Components: indexer Affe

[jira] [Commented] (NUTCH-1067) Configure minimum throughput for fetcher

2012-03-06 Thread behnam nikbakht (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223157#comment-13223157 ] behnam nikbakht commented on NUTCH-1067: I cannot understand why disable the thre

[jira] [Closed] (NUTCH-1298) Pass numTasks to FetcherJob

2012-03-06 Thread Ferdy Galema (Closed) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy Galema closed NUTCH-1298. Resolution: Fixed > Pass numTasks to FetcherJob

[jira] [Closed] (NUTCH-1289) In distributed mode URL's are not partitioned

2012-03-06 Thread Ferdy Galema (Closed) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy Galema closed NUTCH-1289. Resolution: Fixed > In distributed mode URL's are not partitioned

[jira] [Commented] (NUTCH-902) Add all necessary files and configuration so that nutch can be used with different backends out-of-the-box

2012-03-06 Thread Ferdy Galema (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223132#comment-13223132 ] Ferdy Galema commented on NUTCH-902: I just committed a minor change to the sql mapping

[jira] [Commented] (NUTCH-1298) Pass numTasks to FetcherJob

2012-03-06 Thread Ferdy Galema (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223105#comment-13223105 ] Ferdy Galema commented on NUTCH-1298: I'm not certain what the status and support of