Build failed in Jenkins: Nutch-trunk #1586

2011-08-26 Thread Apache Jenkins Server
See -- [...truncated 986 lines...] A src/plugin/subcollection/src/java/org/apache/nutch/collection/CollectionManager.java A src/plugin/subcollection/src/java/org/apache/nutch/collection/pack

[jira] [Commented] (NUTCH-937) When nutch is run on hadoop > 0.20.2 (or cdh) it will not find plugins because MapReduce will not unpack plugin/ directory from the job's pack (due to MAPREDUCE-967)

2011-08-26 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092064#comment-13092064 ] Lewis John McGibbney commented on NUTCH-937: This is not strictly true, Nutch d

[jira] [Commented] (NUTCH-937) When nutch is run on hadoop > 0.20.2 (or cdh) it will not find plugins because MapReduce will not unpack plugin/ directory from the job's pack (due to MAPREDUCE-967)

2011-08-26 Thread Radim Kolar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092059#comment-13092059 ] Radim Kolar commented on NUTCH-937: nutch-1.4 contains hadoop-core 0.20.2. If nutch 1.4

[Nutch Wiki] Trivial Update of "FrontPage" by LewisJohnMcgibbney

2011-08-26 Thread Apache Wiki
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification. The "FrontPage" page has been changed by LewisJohnMcgibbney: http://wiki.apache.org/nutch/FrontPage?action=diff&rev1=224&rev2=225 * MultiLingualSupport - ''In development''. * Fixi

[Nutch Wiki] Trivial Update of "FrontPage" by LewisJohnMcgibbney

2011-08-26 Thread Apache Wiki
The "FrontPage" page has been changed by LewisJohnMcgibbney: http://wiki.apache.org/nutch/FrontPage?action=diff&rev1=223&rev2=224 * [[Image_Search_Design]] * [[NutchOSGi]] * Str

[Nutch Wiki] Trivial Update of "IndexStructure" by LewisJohnMcgibbney

2011-08-26 Thread Apache Wiki
The "IndexStructure" page has been changed by LewisJohnMcgibbney: http://wiki.apache.org/nutch/IndexStructure?action=diff&rev1=7&rev2=8 The index structure formed after indexing is

[Nutch Wiki] Trivial Update of "IndexStructure" by LewisJohnMcgibbney

2011-08-26 Thread Apache Wiki
The "IndexStructure" page has been changed by LewisJohnMcgibbney: http://wiki.apache.org/nutch/IndexStructure?action=diff&rev1=6&rev2=7 ||lang|| YES || UnTokenize

[Nutch Wiki] Trivial Update of "IndexStructure" by LewisJohnMcgibbney

2011-08-26 Thread Apache Wiki
The "IndexStructure" page has been changed by LewisJohnMcgibbney: http://wiki.apache.org/nutch/IndexStructure?action=diff&rev1=5&rev2=6 ||segment || YES || No

[Nutch Wiki] Trivial Update of "IndexStructure" by LewisJohnMcgibbney

2011-08-26 Thread Apache Wiki
The "IndexStructure" page has been changed by LewisJohnMcgibbney: http://wiki.apache.org/nutch/IndexStructure?action=diff&rev1=4&rev2=5 ||lang|| YES || UnTokenize

[jira] [Commented] (NUTCH-386) Plugin to index categories by url rules

2011-08-26 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091846#comment-13091846 ] Lewis John McGibbney commented on NUTCH-386: What is the position with this one

[Nutch Wiki] Trivial Update of "IndexStructure" by LewisJohnMcgibbney

2011-08-26 Thread Apache Wiki
The "IndexStructure" page has been changed by LewisJohnMcgibbney: http://wiki.apache.org/nutch/IndexStructure?action=diff&rev1=3&rev2=4 ||type|| NO || UnTokenize

[Nutch Wiki] Trivial Update of "MapReduce" by LewisJohnMcgibbney

2011-08-26 Thread Apache Wiki
The "MapReduce" page has been changed by LewisJohnMcgibbney: http://wiki.apache.org/nutch/MapReduce?action=diff&rev1=7&rev2=8 + = How Map and Reduce operations are actually carried out =

[Nutch Wiki] Trivial Update of "Archive and Legacy" by LewisJohnMcgibbney

2011-08-26 Thread Apache Wiki
The "Archive and Legacy" page has been changed by LewisJohnMcgibbney: http://wiki.apache.org/nutch/Archive%20and%20Legacy?action=diff&rev1=15&rev2=16 === Development and Old Nutch 2.0

[Nutch Wiki] Trivial Update of "FrontPage" by LewisJohnMcgibbney

2011-08-26 Thread Apache Wiki
The "FrontPage" page has been changed by LewisJohnMcgibbney: http://wiki.apache.org/nutch/FrontPage?action=diff&rev1=222&rev2=223 * StrategicGoals * IndexStructure * [[Getting_S

[no subject]

2011-08-26 Thread gaurav bagga

[jira] [Commented] (NUTCH-937) When nutch is run on hadoop > 0.20.2 (or cdh) it will not find plugins because MapReduce will not unpack plugin/ directory from the job's pack (due to MAPREDUCE-967)

2011-08-26 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091747#comment-13091747 ] Julien Nioche commented on NUTCH-937: @Radim: Nutch is based on the Apache distributi

[jira] [Commented] (NUTCH-937) When nutch is run on hadoop > 0.20.2 (or cdh) it will not find plugins because MapReduce will not unpack plugin/ directory from the job's pack (due to MAPREDUCE-967)

2011-08-26 Thread Radim Kolar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091740#comment-13091740 ] Radim Kolar commented on NUTCH-937: we should stick with hadoop 0.20.203.0 not CDH and m

[jira] [Resolved] (NUTCH-990) protocol-httpclient fails with short pages

2011-08-26 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Nioche resolved NUTCH-990. Resolution: Fixed. Fix Version/s: 1.4 (was: 1.3). A patch has been com

Re: Why URLNormalizer doesn't implement the Pluggable?

2011-08-26 Thread Julien Nioche
Resending your messages every hour won't get you more answers - quite the opposite. On 26 August 2011 09:28, Kaiwii Ho wrote: > > I'm a newcomer learning about Nutch. > I have several questions: > 1. URLNormalizer is a kind of ExtensionPoint, but why doesn't it implement > Pluggable as othe

[jira] [Reopened] (NUTCH-990) protocol-httpclient fails with short pages

2011-08-26 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Nioche reopened NUTCH-990: > protocol-httpclient fails with short pages

[jira] [Commented] (NUTCH-990) protocol-httpclient fails with short pages

2011-08-26 Thread Stephan Grotz (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091682#comment-13091682 ] Stephan Grotz commented on NUTCH-990: Same here - been trying to fetch https pages thr

Are there any tutorial for writing regex-normalize.xml?

2011-08-26 Thread Kaiwii Ho
I'm going to write my own regex-normalize.xml. Is there a tutorial for writing regex-normalize.xml? Thanks in advance for your help.

Why URLNormalizer doesn't implement the Pluggable?

2011-08-26 Thread Kaiwii Ho
I'm a newcomer learning about Nutch. I have several questions: 1. URLNormalizer is a kind of ExtensionPoint, but why doesn't it implement Pluggable as the other extension points do? Furthermore, does any difference exist between URLNormalizer and the other ExtensionPoints leading the URLNo