Markus Jelsma created NUTCH-1932:
Summary: Automatically remove orphaned pages
Key: NUTCH-1932
URL: https://issues.apache.org/jira/browse/NUTCH-1932
Project: Nutch
Issue Type: New Feature
[
https://issues.apache.org/jira/browse/NUTCH-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-1932:
Attachment: NUTCH-1932.patch
Dirty patch!
Automatically remove orphaned pages
[
https://issues.apache.org/jira/browse/NUTCH-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-1930:
Fix Version/s: 2.3.1
Fetcher erases Markers for certain URLs / documents
[
https://issues.apache.org/jira/browse/NUTCH-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-1933:
Attachment: NUTCH-selenium-trunk.patch
Patch for trunk
nutch-selenium plugin
Lewis John McGibbney created NUTCH-1934:
Summary: Refactor Fetcher in trunk
Key: NUTCH-1934
URL: https://issues.apache.org/jira/browse/NUTCH-1934
Project: Nutch
Issue Type:
[
https://issues.apache.org/jira/browse/NUTCH-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14306165#comment-14306165
]
stack commented on NUTCH-1935:
What did you have ulimit set to? See 'Limits on Number of
[
https://issues.apache.org/jira/browse/NUTCH-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14306189#comment-14306189
]
yuanyun.cn commented on NUTCH-1935:
Thanks, stack.
The limit is 4096.
cat
Moving to Hadoop 2.x?
On 4 February 2015 at 14:42, Lewis John Mcgibbney lewis.mcgibb...@gmail.com
wrote:
Hi Folks,
Does anyone have any good ideas for GSoC?
Seb mentioned moving Nutch towards Spark so potentially a pluggable
runtime execution engine abstraction?
I am currently working on
Dear Wiki user,
You have subscribed to a wiki page or wiki category on Nutch Wiki for change
notification.
The FrontPage page has been changed by LewisJohnMcgibbney:
https://wiki.apache.org/nutch/FrontPage?action=diff&rev1=293&rev2=294
* NutchMeetUps - Records of previous Nutch community
[
https://issues.apache.org/jira/browse/NUTCH-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-1934:
Attachment: (was: NUTCH-1934.patch)
Refactor Fetcher in trunk
[
https://issues.apache.org/jira/browse/NUTCH-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-1934:
Attachment: NUTCH-1934.patch
Patch for trunk.
Some early observations:
* Existing
[
https://issues.apache.org/jira/browse/NUTCH-827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-827:
Fix Version/s: (was: 1.11)
1.10
HTTP POST Authentication
[
https://issues.apache.org/jira/browse/NUTCH-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-1934:
Patch Info: Patch Available
Refactor Fetcher in trunk
[
https://issues.apache.org/jira/browse/NUTCH-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-1934:
Attachment: NUTCH-1934.patch
Refactor Fetcher in trunk
Dear Wiki user,
You have subscribed to a wiki page or wiki category on Nutch Wiki for change
notification.
The ContributorsGroup page has been changed by LewisJohnMcgibbney:
https://wiki.apache.org/nutch/ContributorsGroup?action=diff&rev1=18&rev2=19
* ArthurCinader
* MaziyarBoustani
*
Hi Folks,
Does anyone have any good ideas for GSoC?
Seb mentioned moving Nutch towards Spark so potentially a pluggable runtime
execution engine abstraction?
I am currently working on a lot of security and authentication related work
so I would possibly be tempted to overhaul and improve that
Dear Wiki user,
You have subscribed to a wiki page or wiki category on Nutch Wiki for change
notification.
The FrontPage page has been changed by LewisJohnMcgibbney:
https://wiki.apache.org/nutch/FrontPage?action=diff&rev1=292&rev2=293
* NutchMeetUps - Records of previous Nutch community
Dear Wiki user,
You have subscribed to a wiki page or wiki category on Nutch Wiki for change
notification.
The AdvancedAjaxInteraction page has been changed by LewisJohnMcgibbney:
https://wiki.apache.org/nutch/AdvancedAjaxInteraction
New page:
= AdvancedAjaxInteraction =
This page provides