Fetcher to skip queues for URLS getting repeated exceptions, based on percentage
Key: NUTCH-1303
URL: https://issues.apache.org/jira/browse/NUTCH-1303
Project: Nutch
[
https://issues.apache.org/jira/browse/NUTCH-366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris A. Mattmann updated NUTCH-366:
Labels: gsoc2012 (was: )
> Merge URLFilters and URLNormalizers
> --
Hi Yang,
I'd be willing to mentor this project. I tagged with GSOC, so it's now eligible
on the ASF ComDev list for a project. Please contact d...@community.apache.org
to get the info on how to apply. Here is some of it:
http://community.apache.org/gsoc.html
I'd be happy to mentor:
https://i
[
https://issues.apache.org/jira/browse/NUTCH-366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224020#comment-13224020
]
Chris A. Mattmann commented on NUTCH-366:
-
I'd favor option #1 here, especially if
[
https://issues.apache.org/jira/browse/NUTCH-1299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223964#comment-13223964
]
Hudson commented on NUTCH-1299:
---
Integrated in Nutch-trunk #1779 (See
[https://builds.apach
[
https://issues.apache.org/jira/browse/NUTCH-1298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223957#comment-13223957
]
Hudson commented on NUTCH-1298:
---
Integrated in Nutch-nutchgora #185 (See
[https://builds.ap
[
https://issues.apache.org/jira/browse/NUTCH-902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223956#comment-13223956
]
Hudson commented on NUTCH-902:
--
Integrated in Nutch-nutchgora #185 (See
[https://builds.apach
[
https://issues.apache.org/jira/browse/NUTCH-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223955#comment-13223955
]
Hudson commented on NUTCH-1302:
---
Integrated in Nutch-nutchgora #185 (See
[https://builds.ap
[
https://issues.apache.org/jira/browse/NUTCH-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223504#comment-13223504
]
Lewis John McGibbney commented on NUTCH-1253:
-
Hi Ferdy, the patches I attache
[
https://issues.apache.org/jira/browse/NUTCH-1299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223485#comment-13223485
]
Hudson commented on NUTCH-1299:
---
Integrated in nutch-trunk-maven #184 (See
[https://builds.
[
https://issues.apache.org/jira/browse/NUTCH-1299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma resolved NUTCH-1299.
--
Resolution: Fixed
Superfluous records are now ignored but a WARN message is written to the log.
[
https://issues.apache.org/jira/browse/NUTCH-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mathijs Homminga updated NUTCH-1290:
Attachment: NUTCH-1290.patch
This patch modifies the following files in order to support cr
Hi Yang,
I think this would be a missed opportunity if we didn't take you up on this offer.
I can only assume that the development community are short on time, hence why
no-one has replied to this thread.
Is there any reason that you wish to attack this particular issue? Without
providing justifi
[
https://issues.apache.org/jira/browse/NUTCH-1299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-1299:
-
Priority: Major (was: Critical)
Summary: LinkRank inverter to ignore records without Node (
[
https://issues.apache.org/jira/browse/NUTCH-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema updated NUTCH-1302:
Attachment: NUTCH-1302.patch
> nutchgora job failures should be noticed by submitter
>
[
https://issues.apache.org/jira/browse/NUTCH-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema closed NUTCH-1302.
---
Resolution: Fixed
> nutchgora job failures should be noticed by submitter
> -
[
https://issues.apache.org/jira/browse/NUTCH-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223231#comment-13223231
]
Ferdy Galema commented on NUTCH-1302:
-
NutchJob is a nice wrapper of Hadoop's Job, so
[
https://issues.apache.org/jira/browse/NUTCH-1299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-1299:
-
Attachment: NUTCH-1299-1.5-2.patch
New patch logs warning with proper error message.
[
https://issues.apache.org/jira/browse/NUTCH-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223222#comment-13223222
]
Mathijs Homminga commented on NUTCH-1290:
-
Actually, the IndexerReducer is only us
nutchgora job failures should be noticed by submitter
-
Key: NUTCH-1302
URL: https://issues.apache.org/jira/browse/NUTCH-1302
Project: Nutch
Issue Type: Bug
Reporter: Ferdy Gale
[
https://issues.apache.org/jira/browse/NUTCH-1299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-1299:
-
Attachment: NUTCH-1299-1.5-1.patch
Most likely solution is to check whether a LoopSet enters the
[
https://issues.apache.org/jira/browse/NUTCH-1299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-1299:
-
Patch Info: Patch Available
> NPE in LinkRank inverter
>
>
>
[
https://issues.apache.org/jira/browse/NUTCH-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dan Rosher updated NUTCH-1301:
--
Attachment: NUTCH-1301.patch
> Index job resume switch to resume a failed job
> ---
Index job resume switch to resume a failed job
--
Key: NUTCH-1301
URL: https://issues.apache.org/jira/browse/NUTCH-1301
Project: Nutch
Issue Type: Improvement
Components: indexer
Affe
[
https://issues.apache.org/jira/browse/NUTCH-1067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223157#comment-13223157
]
behnam nikbakht commented on NUTCH-1067:
I can not understand why disable the thre
[
https://issues.apache.org/jira/browse/NUTCH-1298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema closed NUTCH-1298.
---
Resolution: Fixed
> Pass numTasks to FetcherJob
> ---
>
>
[
https://issues.apache.org/jira/browse/NUTCH-1289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema closed NUTCH-1289.
---
Resolution: Fixed
> In distributed mode URL's are not partitioned
> -
[
https://issues.apache.org/jira/browse/NUTCH-902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223132#comment-13223132
]
Ferdy Galema commented on NUTCH-902:
I just committed a minor change to the sql mapping
[
https://issues.apache.org/jira/browse/NUTCH-1298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223105#comment-13223105
]
Ferdy Galema commented on NUTCH-1298:
-
I'm not certain what the status and support of