[jira] [Comment Edited] (NUTCH-1679) UpdateDb using batchId, link may override crawled page.

2014-07-21 Thread Alexander Kingson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069567#comment-14069567 ] Alexander Kingson edited comment on NUTCH-1679 at 7/22/14 12:46 AM:

[jira] [Comment Edited] (NUTCH-1679) UpdateDb using batchId, link may override crawled page.

2014-07-21 Thread Alexander Kingson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069567#comment-14069567 ] Alexander Kingson edited comment on NUTCH-1679 at 7/22/14 12:15 AM:

[jira] [Commented] (NUTCH-1679) UpdateDb using batchId, link may override crawled page.

2014-07-21 Thread Alexander Kingson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069567#comment-14069567 ] Alexander Kingson commented on NUTCH-1679: Hi, I was suggesting to close the da

[jira] [Updated] (NUTCH-1821) Nutch Crawl class for EMR

2014-07-21 Thread Luis Lopez (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luis Lopez updated NUTCH-1821: Description: Hi all, Some of us are using Amazon EMR to deploy/run Nutch and from what I've been read

[jira] [Created] (NUTCH-1821) Nutch Crawl class for EMR

2014-07-21 Thread Luis Lopez (JIRA)
Luis Lopez created NUTCH-1821:
Summary: Nutch Crawl class for EMR
Key: NUTCH-1821
URL: https://issues.apache.org/jira/browse/NUTCH-1821
Project: Nutch
Issue Type: Wish
Affects Versions: 1.6

[jira] [Created] (NUTCH-1820) remove field "orig" which duplicates "id"

2014-07-21 Thread Sebastian Nagel (JIRA)
Sebastian Nagel created NUTCH-1820:
Summary: remove field "orig" which duplicates "id"
Key: NUTCH-1820
URL: https://issues.apache.org/jira/browse/NUTCH-1820
Project: Nutch
Issue Type: Bug

[jira] [Commented] (NUTCH-1708) use same id when indexing and deleting redirects

2014-07-21 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068374#comment-14068374 ] Julien Nioche commented on NUTCH-1708: I like the approach and this would be the bes

Re: Problems running some ant targets on recent trunk

2014-07-21 Thread Julien Nioche
It's actually a bit more twisted than that: see
https://issues.apache.org/jira/browse/NUTCH-1818
This separation of the test and runtime dependencies has actually been very good for exposing inconsistencies in the way the existing build worked. The issue should be solved now, thanks for reporting