[Nutch Wiki] Trivial Update of "NutchScoring" by LewisJohnMcgibbney

2014-09-20 Thread Apache Wiki
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification. The "NutchScoring" page has been changed by LewisJohnMcgibbney: https://wiki.apache.org/nutch/NutchScoring?action=diff&rev1=4&rev2=5 == What Scoring is... what it means in Nutch

Re: Review Request 9119: Create SegmentContentDumperTool for easily extracting out file contents from SegmentDirs

2014-09-20 Thread Chris Mattmann
> On Sept. 9, 2014, 11:40 p.m., Lewis McGibbney wrote:
> > ./trunk/src/java/org/apache/nutch/tools/FileDumper.java, line 101
> >
> > When I change the Text() class to use the UTF8() class, I get the
> > following

Re: Review Request 9119: Create SegmentContentDumperTool for easily extracting out file contents from SegmentDirs

2014-09-20 Thread Chris Mattmann
> On Sept. 10, 2014, 1:24 a.m., Julien Le Dem wrote:
> > ./trunk/src/java/org/apache/nutch/tools/FileDumper.java, line 45
> >
> > this should be in the scope of the main method.
> > If you wanted to write unit te

[jira] [Created] (NUTCH-1843) Upgrade to Gora 0.5

2014-09-20 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created NUTCH-1843:
    Summary: Upgrade to Gora 0.5
    Key: NUTCH-1843
    URL: https://issues.apache.org/jira/browse/NUTCH-1843
    Project: Nutch
    Issue Type: Improvement

[Nutch Wiki] Trivial Update of "NutchScoring" by LewisJohnMcgibbney

2014-09-20 Thread Apache Wiki
The "NutchScoring" page has been changed by LewisJohnMcgibbney: https://wiki.apache.org/nutch/NutchScoring?action=diff&rev1=5&rev2=6 == Examples == * NewScoring -- New stable pagera

[jira] [Commented] (NUTCH-1526) Create SegmentContentDumperTool for easily extracting out file contents from SegmentDirs

2014-09-20 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142346#comment-14142346 ] Chris A. Mattmann commented on NUTCH-1526: -- OK guys I addressed all the comments

[jira] [Commented] (NUTCH-1526) Create SegmentContentDumperTool for easily extracting out file contents from SegmentDirs

2014-09-20 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142347#comment-14142347 ] Chris A. Mattmann commented on NUTCH-1526: -- Committed to trunk in r1626517. Thank

[jira] [Commented] (NUTCH-1526) Create SegmentContentDumperTool for easily extracting out file contents from SegmentDirs

2014-09-20 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142348#comment-14142348 ] Chris A. Mattmann commented on NUTCH-1526: -- Note: you can now run ./bin/nutch dum

[jira] [Resolved] (NUTCH-1526) Create SegmentContentDumperTool for easily extracting out file contents from SegmentDirs

2014-09-20 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved NUTCH-1526. Resolution: Fixed. Thanks to Lewis, Markus, JulienN and Julien Le Dem for the feedback.

[jira] [Updated] (NUTCH-1844) testresources/testcrawl not referenced anywhere in code.

2014-09-20 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-1844: - Description: While working on NUTCH-1526 in Review Board http://reviews.apache.org/r9119/

[jira] [Created] (NUTCH-1844) testresources/testcrawl not referenced anywhere in code.

2014-09-20 Thread Chris A. Mattmann (JIRA)
Chris A. Mattmann created NUTCH-1844:
    Summary: testresources/testcrawl not referenced anywhere in code.
    Key: NUTCH-1844
    URL: https://issues.apache.org/jira/browse/NUTCH-1844
    Project: Nutch

[jira] [Updated] (NUTCH-1844) testresources/testcrawl not referenced anywhere in code.

2014-09-20 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated NUTCH-1844: - Description: While working on NUTCH-1526 in Review Board https://reviews.apache.org/r/9119

[jira] [Comment Edited] (NUTCH-1526) Create SegmentContentDumperTool for easily extracting out file contents from SegmentDirs

2014-09-20 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142349#comment-14142349 ] Chris A. Mattmann edited comment on NUTCH-1526 at 9/21/14 6:07 AM: -