[jira] [Updated] (NUTCH-1556) enabling updatedb to accept batchId
[ https://issues.apache.org/jira/browse/NUTCH-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nguyen Manh Tien updated NUTCH-1556:
------------------------------------

    Attachment: NUTCH-1556-batchId.patch

batchId is not set in currentJob because we set batchId after creating currentJob; I changed it to set batchId first.

> enabling updatedb to accept batchId
> -----------------------------------
>
>                 Key: NUTCH-1556
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1556
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 2.2
>            Reporter: kaveh minooie
>             Fix For: 2.3
>
>         Attachments: NUTCH-1556-batchId.patch, NUTCH-1556-v2.patch, NUTCH-1556-v3.patch, NUTCH-1556.patch
>
>
> So the idea here is to be able to run updatedb and fetch for different batchIds simultaneously. I put together a patch. It seems to be working (it does skip the rows that do not match the batchId), but I am worried about whether and how it might affect the sorting in the reduce part. Anyway, check it out.
> It also changes the command line usage to this:
> Usage: DbUpdaterJob ( | -all) [-crawlId ]

--
This message was sent by Atlassian JIRA
(v6.1#6144)
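The ordering problem described in the comment above can be illustrated with a small sketch. It assumes the job object takes a copy of its configuration when it is created, which is how Hadoop's Job is documented to treat the Configuration passed to it; whether that is exactly what happened in DbUpdaterJob is not confirmed by this thread. The class SnapshotJobSketch and the key name sketch.batch.id are hypothetical and are not part of Nutch or Hadoop.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a job that snapshots its configuration when created. */
final class SnapshotJobSketch {
    // Hypothetical key name, used only for this illustration.
    static final String BATCH_ID_KEY = "sketch.batch.id";

    private final Map<String, String> conf;

    SnapshotJobSketch(Map<String, String> source) {
        // The job keeps its own copy, so later changes to 'source' are not seen.
        this.conf = new HashMap<>(source);
    }

    String get(String key) {
        return conf.get(key);
    }

    /** Sets the batch id AFTER creating the job: the job does not see it. */
    static String batchIdSeenWhenSetAfter(String batchId) {
        Map<String, String> conf = new HashMap<>();
        SnapshotJobSketch job = new SnapshotJobSketch(conf);
        conf.put(BATCH_ID_KEY, batchId);
        return job.get(BATCH_ID_KEY);
    }

    /** Sets the batch id BEFORE creating the job: the job sees it. */
    static String batchIdSeenWhenSetFirst(String batchId) {
        Map<String, String> conf = new HashMap<>();
        conf.put(BATCH_ID_KEY, batchId);
        SnapshotJobSketch job = new SnapshotJobSketch(conf);
        return job.get(BATCH_ID_KEY);
    }

    public static void main(String[] args) {
        System.out.println("set after creation: " + batchIdSeenWhenSetAfter("batch-1"));
        System.out.println("set before creation: " + batchIdSeenWhenSetFirst("batch-1"));
    }
}
```

Under that assumption, moving the assignment ahead of job creation is the whole fix, which matches the change described in the comment.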
[jira] [Updated] (NUTCH-1556) enabling updatedb to accept batchId
[ https://issues.apache.org/jira/browse/NUTCH-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kaveh minooie updated NUTCH-1556:
---------------------------------

    Attachment: NUTCH-1556-v3.patch

There are typos (fetch instead of update) in v2 :)

> enabling updatedb to accept batchId
> -----------------------------------
>
>                 Key: NUTCH-1556
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1556
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 2.2
>            Reporter: kaveh minooie
>             Fix For: 2.3
>
>         Attachments: NUTCH-1556.patch, NUTCH-1556-v2.patch, NUTCH-1556-v3.patch
>
>
> So the idea here is to be able to run updatedb and fetch for different batchIds simultaneously. I put together a patch. It seems to be working (it does skip the rows that do not match the batchId), but I am worried about whether and how it might affect the sorting in the reduce part. Anyway, check it out.
> It also changes the command line usage to this:
> Usage: DbUpdaterJob ( | -all) [-crawlId ]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (NUTCH-1556) enabling updatedb to accept batchId
[ https://issues.apache.org/jira/browse/NUTCH-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

lufeng updated NUTCH-1556:
--------------------------

    Attachment: NUTCH-1556-v2.patch

New patch, merged with issue 1632.

> enabling updatedb to accept batchId
> -----------------------------------
>
>                 Key: NUTCH-1556
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1556
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 2.2
>            Reporter: kaveh minooie
>             Fix For: 2.3
>
>         Attachments: NUTCH-1556.patch, NUTCH-1556-v2.patch
>
>
> So the idea here is to be able to run updatedb and fetch for different batchIds simultaneously. I put together a patch. It seems to be working (it does skip the rows that do not match the batchId), but I am worried about whether and how it might affect the sorting in the reduce part. Anyway, check it out.
> It also changes the command line usage to this:
> Usage: DbUpdaterJob ( | -all) [-crawlId ]
[jira] [Updated] (NUTCH-1556) enabling updatedb to accept batchId
[ https://issues.apache.org/jira/browse/NUTCH-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lewis John McGibbney updated NUTCH-1556:
----------------------------------------

    Fix Version/s: 2.2

> enabling updatedb to accept batchId
> -----------------------------------
>
>                 Key: NUTCH-1556
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1556
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 2.2
>            Reporter: kaveh minooie
>             Fix For: 2.2
>
>         Attachments: NUTCH-1556.patch
>
>
> So the idea here is to be able to run updatedb and fetch for different batchIds simultaneously. I put together a patch. It seems to be working (it does skip the rows that do not match the batchId), but I am worried about whether and how it might affect the sorting in the reduce part. Anyway, check it out.
> It also changes the command line usage to this:
> Usage: DbUpdaterJob ( | -all) [-crawlId ]
[jira] [Updated] (NUTCH-1556) enabling updatedb to accept batchId
[ https://issues.apache.org/jira/browse/NUTCH-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kaveh minooie updated NUTCH-1556:
---------------------------------

    Description:
So the idea here is to be able to run updatedb and fetch for different batchIds simultaneously. I put together a patch. It seems to be working (it does skip the rows that do not match the batchId), but I am worried about whether and how it might affect the sorting in the reduce part. Anyway, check it out.
It also changes the command line usage to this:
Usage: DbUpdaterJob ( | -all) [-crawlId ]

  was:
So the idea here is to be able to run updatedb and fetch for different batchIds simultaneously. I put together a patch. It seems to be working (it does skip the rows that do not match the batchId), but I am worried about whether and how it might affect the sorting in the reduce part. Anyway, check it out.

> enabling updatedb to accept batchId
> -----------------------------------
>
>                 Key: NUTCH-1556
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1556
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 2.2
>            Reporter: kaveh minooie
>         Attachments: NUTCH-1556.patch
>
>
> So the idea here is to be able to run updatedb and fetch for different batchIds simultaneously. I put together a patch. It seems to be working (it does skip the rows that do not match the batchId), but I am worried about whether and how it might affect the sorting in the reduce part. Anyway, check it out.
> It also changes the command line usage to this:
> Usage: DbUpdaterJob ( | -all) [-crawlId ]
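The row-skipping behaviour mentioned in the description can be sketched roughly as follows. This is only an illustration and not the attached patch: the Row class, the select method and the handling of rows without a batch id are all hypothetical, and the sketch assumes the elided first alternative in the usage line is a batch id, with -all meaning that every row is processed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of selecting rows for an update run by batch id. */
final class BatchFilterSketch {
    // Marker taken from the usage line in the issue description.
    static final String ALL = "-all";

    /** Hypothetical row: a URL plus the batch id it was last marked with (may be null). */
    static final class Row {
        final String url;
        final String batchId;

        Row(String url, String batchId) {
            this.url = url;
            this.batchId = batchId;
        }
    }

    /** A row is accepted if all rows were requested or its batch id matches exactly. */
    static boolean accepts(String requested, String rowBatchId) {
        if (ALL.equals(requested)) {
            return true;
        }
        // Rows without a batch id never match a specific request (an assumption of this sketch).
        return requested.equals(rowBatchId);
    }

    /** Returns the URLs of the rows an update run for 'requested' would process. */
    static List<String> select(List<Row> rows, String requested) {
        List<String> selected = new ArrayList<>();
        for (Row row : rows) {
            if (accepts(requested, row.batchId)) {
                selected.add(row.url);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("http://a.example/", "batch-1"),
                new Row("http://b.example/", "batch-2"),
                new Row("http://c.example/", null));
        System.out.println(select(rows, "batch-1"));
        System.out.println(select(rows, ALL));
    }
}
```

Filtering at this stage only decides which rows enter the job; it says nothing about the concern raised in the description regarding sorting in the reduce part.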
[jira] [Updated] (NUTCH-1556) enabling updatedb to accept batchId
[ https://issues.apache.org/jira/browse/NUTCH-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kaveh minooie updated NUTCH-1556:
---------------------------------

    Attachment: NUTCH-1556.patch

> enabling updatedb to accept batchId
> -----------------------------------
>
>                 Key: NUTCH-1556
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1556
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 2.2
>            Reporter: kaveh minooie
>         Attachments: NUTCH-1556.patch
>
>
> So the idea here is to be able to run updatedb and fetch for different batchIds simultaneously. I put together a patch. It seems to be working (it does skip the rows that do not match the batchId), but I am worried about whether and how it might affect the sorting in the reduce part. Anyway, check it out.