[jira] [Commented] (NUTCH-1135) Fix TestGoraStorage for Nutchgora

2011-10-13 Thread Lewis John McGibbney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126826#comment-13126826 ] Lewis John McGibbney commented on NUTCH-1135: - First and foremost, I must say

Re: injector in nutch-1.4

2011-10-13 Thread Radim Kolar
Let me know if anybody has got the injector to work in the 1.4 branch. I have Hadoop 0.20.204.0 and can't make it insert a single URL.

Re: injector in nutch-1.4

2011-10-13 Thread Markus Jelsma
Hi, This is most likely a URL filter issue. Check all URL filters. There's also a test program for URL filtering. Try it out. http://wiki.apache.org/nutch/CommandLineOptions Cheers, ps. Moved to user@nutch as it's more appropriate there. > I have problems with running injector in nutch-1.4 o

injector in nutch-1.4

2011-10-13 Thread Radim Kolar
I have problems with running injector in nutch-1.4 on Hadoop; the same command with nutch-1.3 works fine. As you can see, the list of URLs is loaded from HDFS correctly (Map input records=66906), but there are no records in the map output. Could it be a problem with broken filtering? ponto:(crawler)runtime/depl

[jira] [Updated] (NUTCH-1135) Fix TestGoraStorage for Nutchgora

2011-10-13 Thread Ferdy (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy updated NUTCH-1135: - Attachment: NUTCH-1135-v1.patch This patch is a rewrite of the test. I thought it was the right thing to do beca

[jira] [Commented] (NUTCH-1098) better url-normalizer basic

2011-10-13 Thread Radim Kolar (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126594#comment-13126594 ] Radim Kolar commented on NUTCH-1098: Patch is good. I will add replace high bit chars

[jira] [Issue Comment Edited] (NUTCH-1001) bin/nutch fetch/parse handle crawl/segments directory

2011-10-13 Thread Gabriele Kahlout (Issue Comment Edited) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126413#comment-13126413 ] Gabriele Kahlout edited comment on NUTCH-1001 at 10/13/11 7:47 AM: -

[jira] [Commented] (NUTCH-1001) bin/nutch fetch/parse handle crawl/segments directory

2011-10-13 Thread Gabriele Kahlout (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126413#comment-13126413 ] Gabriele Kahlout commented on NUTCH-1001: - Hi Lewis, Regarding indentation I can