[jira] [Commented] (NUTCH-2152) CommonCrawl dump via Service endpoint

2015-10-29 Thread Aron Ahmadia (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14981124#comment-14981124 ] Aron Ahmadia commented on NUTCH-2152: Nutch crashed completely. > CommonCrawl dump via Service

[jira] [Commented] (NUTCH-2152) CommonCrawl dump via Service endpoint

2015-10-29 Thread Aron Ahmadia (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14981110#comment-14981110 ] Aron Ahmadia commented on NUTCH-2152: Calling dump before the segments have been generated causes

[jira] [Commented] (NUTCH-2152) CommonCrawl dump via Service endpoint

2015-10-29 Thread Aron Ahmadia (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14981130#comment-14981130 ] Aron Ahmadia commented on NUTCH-2152: Sure, or you could throw a 400 response and not accept the job

[jira] [Commented] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events

2015-10-29 Thread Aron Ahmadia (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14981198#comment-14981198 ] Aron Ahmadia commented on NUTCH-2132: Do you log the errors that you're catching here? I'd rather

[jira] [Commented] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events

2015-10-28 Thread Aron Ahmadia (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978902#comment-14978902 ] Aron Ahmadia commented on NUTCH-2132: [~sujenshah] - I'm reviewing this again now. One issue I'm

[jira] [Commented] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events

2015-10-28 Thread Aron Ahmadia (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978938#comment-14978938 ] Aron Ahmadia commented on NUTCH-2132: I'm observing crashes when fetcher.publisher is set to false.

[jira] [Commented] (NUTCH-2153) Nutch REST API (DB) uses POST instead of GET to request

2015-10-28 Thread Aron Ahmadia (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978841#comment-14978841 ] Aron Ahmadia commented on NUTCH-2153: If it's asynchronous, use a POST and return a crawldb_job

[jira] [Commented] (NUTCH-2153) Nutch REST API (DB) uses POST instead of GET to request

2015-10-28 Thread Aron Ahmadia (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978764#comment-14978764 ] Aron Ahmadia commented on NUTCH-2153: The API from https://wiki.apache.org/nutch/Nutch_1.X_RESTAPI:

[jira] [Commented] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events

2015-10-28 Thread Aron Ahmadia (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978945#comment-14978945 ] Aron Ahmadia commented on NUTCH-2132: got it. > Publisher/Subscriber model for Nutch to emit events

[jira] [Commented] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events

2015-10-28 Thread Aron Ahmadia (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978944#comment-14978944 ] Aron Ahmadia commented on NUTCH-2132: I think the protection belongs in public void

[jira] [Created] (NUTCH-2154) Nutch REST API (DB) suffering NullPointerException

2015-10-28 Thread Aron Ahmadia (JIRA)
Aron Ahmadia created NUTCH-2154: Summary: Nutch REST API (DB) suffering NullPointerException Key: NUTCH-2154 URL: https://issues.apache.org/jira/browse/NUTCH-2154 Project: Nutch Issue Type:

[jira] [Commented] (NUTCH-2154) Nutch REST API (DB) suffering NullPointerException

2015-10-28 Thread Aron Ahmadia (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978824#comment-14978824 ] Aron Ahmadia commented on NUTCH-2154: Looks like it's assumed that "args" is passed in to the REST

[jira] [Commented] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events

2015-10-28 Thread Aron Ahmadia (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978908#comment-14978908 ] Aron Ahmadia commented on NUTCH-2132: Also, the vice-versa situation is important as well. Can I

[jira] [Commented] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events

2015-10-26 Thread Aron Ahmadia (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14974909#comment-14974909 ] Aron Ahmadia commented on NUTCH-2132: Another comment. The API exposes the following configuration

[jira] [Commented] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events

2015-10-26 Thread Aron Ahmadia (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975090#comment-14975090 ] Aron Ahmadia commented on NUTCH-2132: I tried to reconfigure the queue using the REST API: cc =

[jira] [Commented] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events

2015-10-26 Thread Aron Ahmadia (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975372#comment-14975372 ] Aron Ahmadia commented on NUTCH-2132: Discussed with Sujen, our current plan is to provide an option

[jira] [Commented] (NUTCH-2149) REST endpoint to read Nutch sequence files

2015-10-26 Thread Aron Ahmadia (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14974207#comment-14974207 ] Aron Ahmadia commented on NUTCH-2149: Hi Sujen and Chris. I think I'm missing something in this API.

[jira] [Commented] (NUTCH-2086) Nutch 1.X Webui

2015-10-04 Thread Aron Ahmadia (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14942796#comment-14942796 ] Aron Ahmadia commented on NUTCH-2086: I've updated the nutch-python interface to match trunk:

[jira] [Created] (NUTCH-2123) Seed List REST API returns Text but headers indicate/require JSON

2015-09-24 Thread Aron Ahmadia (JIRA)
Aron Ahmadia created NUTCH-2123: Summary: Seed List REST API returns Text but headers indicate/require JSON Key: NUTCH-2123 URL: https://issues.apache.org/jira/browse/NUTCH-2123 Project: Nutch

[jira] [Updated] (NUTCH-2098) Add null SeedUrl constructor

2015-09-17 Thread Aron Ahmadia (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aron Ahmadia updated NUTCH-2098: Labels: memex newbie (was: newbie) > Add null SeedUrl constructor

[jira] [Commented] (NUTCH-2011) Endpoint to support realtime JSON output from the fetcher

2015-09-17 Thread Aron Ahmadia (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14805002#comment-14805002 ] Aron Ahmadia commented on NUTCH-2011: What's the status on the implementation of this endpoint? This

[jira] [Updated] (NUTCH-2098) Add null SeedUrl constructor

2015-09-15 Thread Aron Ahmadia (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aron Ahmadia updated NUTCH-2098: Attachment: 0001-Default-SeedURL-constructor.patch > Add null SeedUrl constructor

[jira] [Created] (NUTCH-2098) Add null SeedUrl constructor

2015-09-15 Thread Aron Ahmadia (JIRA)
Aron Ahmadia created NUTCH-2098: Summary: Add null SeedUrl constructor Key: NUTCH-2098 URL: https://issues.apache.org/jira/browse/NUTCH-2098 Project: Nutch Issue Type: Bug