[jira] [Commented] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events

2015-10-05 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14943786#comment-14943786 ] Chris A. Mattmann commented on NUTCH-2132: -- True Julien, but that locks us into using Kibana,

[jira] [Commented] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events

2015-10-05 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14943757#comment-14943757 ] Julien Nioche commented on NUTCH-2132: -- Looking at it from a slightly different angle, couldn't you

[jira] [Commented] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events

2015-10-05 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14943856#comment-14943856 ] Julien Nioche commented on NUTCH-2132: -- bq. but that locks us into using Kibana, etc. Ideally one

[jira] [Commented] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events

2015-10-05 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14943880#comment-14943880 ] Chris A. Mattmann commented on NUTCH-2132: -- Hey Julien, yeah to be honest we thought about using

Re: Redirection in nutch

2015-10-05 Thread Taichi Ho
Thanks for pointing out. But it seems I can't get your patch to work directly with git apply. Ended up creating my own version of patch. On Sun, Oct 4, 2015 at 11:35 AM Sebastian Nagel wrote: > Hi, > > yes, this is a bug which has been fixed in the commit you

Re: Nutch not recognizing html pages/images retrieved via php

2015-10-05 Thread Julien Nioche
Hi, what happens is that parse-tika is used by default but doesn't know what to do with that mime type. You can edit parse-plugins.xml and add an entry to map the mime type to the html parser. Obviously you'll need parse-html to be
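
For illustration only, a mapping of the kind described above might look like the fragment below in conf/parse-plugins.xml. This is a sketch, not the exact entry from the thread: the preview does not show the mime type in question, so `application/x-example` is a hypothetical placeholder to be replaced with the type Nutch actually reports for the affected pages.

```xml
<!-- Hypothetical mime type: replace "application/x-example" with the
     type reported for the pages that parse-tika fails to handle. -->
<mimeType name="application/x-example">
  <!-- Route documents of this type to the HTML parser plugin. -->
  <plugin id="parse-html" />
</mimeType>
```

The element goes inside the existing `<parse-plugins>` root, alongside the other `<mimeType>` entries.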

[jira] [Commented] (NUTCH-2123) Seed List REST API returns Text but headers indicate/require JSON

2015-10-05 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14944359#comment-14944359 ] Hudson commented on NUTCH-2123: --- SUCCESS: Integrated in Nutch-trunk #3286 (See

[jira] [Assigned] (NUTCH-2123) Seed List REST API returns Text but headers indicate/require JSON

2015-10-05 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann reassigned NUTCH-2123: Assignee: Chris A. Mattmann > Seed List REST API returns Text but headers

[jira] [Work started] (NUTCH-2123) Seed List REST API returns Text but headers indicate/require JSON

2015-10-05 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-2123 started by Chris A. Mattmann. > Seed List REST API returns Text but headers indicate/require JSON >

[jira] [Resolved] (NUTCH-2123) Seed List REST API returns Text but headers indicate/require JSON

2015-10-05 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved NUTCH-2123. -- Resolution: Fixed - fixed thanks guys! {noformat} [chipotle:~/tmp/nutch1.11] mattmann%

[jira] [Commented] (NUTCH-2108) Add a function to the selenium interactive plugin interface to do multiple manipulation of driver and then return the data

2015-10-05 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14944426#comment-14944426 ] Chris A. Mattmann commented on NUTCH-2108: -- see my comments on Github, please +1 commit. > Add a

Re: Redirection in nutch

2015-10-05 Thread Sebastian Nagel
Hi, > Ended up creating my own version of patch. Not a problem, since it's a small patch :) These commands should do the work: (1) patch -p0 Thanks for pointing out. But it seems I can't get your patch to work directly > with git apply. Ended > up creating my own version of patch. > > >

[jira] [Commented] (NUTCH-2124) redirect following same link again and again , max redirect exceed and went db_gone

2015-10-05 Thread Yogendra Kumar Soni (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14943382#comment-14943382 ] Yogendra Kumar Soni commented on NUTCH-2124: Hello Sebastian, applied the patch, problem is

[jira] [Commented] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events

2015-10-05 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14943658#comment-14943658 ] Chris A. Mattmann commented on NUTCH-2132: -- Great comments, Seb, agree. > Publisher/Subscriber

[jira] [Commented] (NUTCH-2124) redirect following same link again and again , max redirect exceed and went db_gone

2015-10-05 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14943609#comment-14943609 ] Sebastian Nagel commented on NUTCH-2124: I've tested the patch with the mentioned URL as only seed

[jira] [Commented] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events

2015-10-05 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14943637#comment-14943637 ] Sebastian Nagel commented on NUTCH-2132: No question, this is a significant improvement over

[jira] [Commented] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events

2015-10-05 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14943625#comment-14943625 ] Chris A. Mattmann commented on NUTCH-2132: -- Right now here are a few comments: h2. Turn these

[jira] [Comment Edited] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events

2015-10-05 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14943625#comment-14943625 ] Chris A. Mattmann edited comment on NUTCH-2132 at 10/5/15 4:42 PM: ---