I'm a newcomer learning about Nutch.
I have several questions:
1. URLNormalizer is a kind of ExtensionPoint. But why does it implement
Pluggable as the other extension points do? Furthermore, does any difference
exist between URLNormalizer and the other ExtensionPoints leading
the
I'm going to specify my own regex-normalize.xml. Is there any tutorial for
writing regex-normalize.xml?
Waiting for your help, and thank you.
[
https://issues.apache.org/jira/browse/NUTCH-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091682#comment-13091682
]
Stephan Grotz commented on NUTCH-990:
-
Same here - been trying to fetch https pages
[
https://issues.apache.org/jira/browse/NUTCH-990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche reopened NUTCH-990:
-
protocol-httpclient fails with short pages
--
Resending your messages every hour won't get you more answers - quite the
opposite
On 26 August 2011 09:28, Kaiwii Ho kaiwi...@gmail.com wrote:
I'm a newcomer learning about Nutch.
I have several questions:
1. URLNormalizer is a kind of ExtensionPoint. But why does it implement the
[
https://issues.apache.org/jira/browse/NUTCH-990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-990.
-
Resolution: Fixed
Fix Version/s: (was: 1.3)
1.4
A patch has been
[
https://issues.apache.org/jira/browse/NUTCH-937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091740#comment-13091740
]
Radim Kolar commented on NUTCH-937:
---
We should stick with Hadoop 0.20.203.0, not CDH, and
[
https://issues.apache.org/jira/browse/NUTCH-937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091747#comment-13091747
]
Julien Nioche commented on NUTCH-937:
-
@Radim : Nutch is based on the Apache
Dear Wiki user,
You have subscribed to a wiki page or wiki category on Nutch Wiki for change
notification.
The FrontPage page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/FrontPage?action=diff&rev1=222&rev2=223
* StrategicGoals
* IndexStructure
*
The Archive and Legacy page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/Archive%20and%20Legacy?action=diff&rev1=15&rev2=16
=== Development and Old Nutch 2.0 ===
The MapReduce page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/MapReduce?action=diff&rev1=7&rev2=8
+ = How Map and Reduce operations are actually carried out =
+ ==
The IndexStructure page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/IndexStructure?action=diff&rev1=3&rev2=4
||type|| NO || UnTokenized
[
https://issues.apache.org/jira/browse/NUTCH-386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091846#comment-13091846
]
Lewis John McGibbney commented on NUTCH-386:
What is the position with this
The IndexStructure page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/IndexStructure?action=diff&rev1=4&rev2=5
||lang|| YES || UnTokenized
The IndexStructure page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/IndexStructure?action=diff&rev1=5&rev2=6
||segment || YES ||
The IndexStructure page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/IndexStructure?action=diff&rev1=6&rev2=7
||lang|| YES || UnTokenized
The IndexStructure page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/IndexStructure?action=diff&rev1=7&rev2=8
The index structure formed after indexing is shown
The FrontPage page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/FrontPage?action=diff&rev1=224&rev2=225
* MultiLingualSupport - ''In development''.
*
See https://builds.apache.org/job/Nutch-trunk/1586/
--
[...truncated 986 lines...]
A src/plugin/subcollection/src/java/org/apache/nutch/collection/CollectionManager.java
A