[jira] [Updated] (NUTCH-1714) Nutch 2.x upgrade to Gora 0.4
[ https://issues.apache.org/jira/browse/NUTCH-1714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alparslan Avcı updated NUTCH-1714:
----------------------------------
    Attachment: NUTCH-1714v6.patch

Hi [~jnioche],

First, sorry for the late reply, and thanks for your comments!

bq. The code has changed since the last patch and we are now getting :
bq. This is due to status.getArgs() returning null.

I have fixed these in the [new patch|^NUTCH-1714v6.patch] I added.

bq. I presume you added the methods mentioned in NUTCH-1709 by hand after generating the classes automatically?

Yes, as you guessed, I changed them by hand after generating the classes automatically.

bq. WebTableReader should also remove the dirty field in processDumpJob

I have also fixed this in the new patch.

{quote}
* the Generator marks 50K entries with GENERATE_MARK but the Fetcher shows only 49,461 as Map Input Records (and the same number as Reduce input records) => looks like we are not getting all the records we should be getting. I dumped the content of the table pre-fetching and it contains the right number of entries i.e. 50K
* The Generator displayed 'generated batch id: 1399626659-15643 containing 0 URLs' but as I just explained it marked 50K entries correctly
* The dump of the webtable contains 'markers: org.apache.gora.persistency.impl.DirtyMapWrapper@eb173c'. It should display the values correctly.
{quote}

I will look into these issues as soon as possible.

Thanks again!

> Nutch 2.x upgrade to Gora 0.4
> -----------------------------
>
>                 Key: NUTCH-1714
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1714
>             Project: Nutch
>          Issue Type: Improvement
>            Reporter: Alparslan Avcı
>            Assignee: Alparslan Avcı
>             Fix For: 2.3
>
>         Attachments: NUTCH-1714.patch, NUTCH-1714_NUTCH-1714_v2_v3.patch,
> NUTCH-1714v2.patch, NUTCH-1714v4.patch, NUTCH-1714v5.patch, NUTCH-1714v6.patch
>
>
> Nutch upgrade for GORA_94 branch has to be implemented. We can discuss the
> details in this issue.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Updated] (NUTCH-1714) Nutch 2.x upgrade to Gora 0.4
[ https://issues.apache.org/jira/browse/NUTCH-1714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alparslan Avcı updated NUTCH-1714:
----------------------------------
    Attachment: NUTCH-1714v5.patch

Hi [~jnioche],

I have uploaded a new patch that also fixes the problem with the _"./nutch readdb -crawlId MYCRAWLIDHERE -stats"_ command. Would you please test it again?

Thanks!
[jira] [Updated] (NUTCH-1714) Nutch 2.x upgrade to Gora 0.4
[ https://issues.apache.org/jira/browse/NUTCH-1714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lewis John McGibbney updated NUTCH-1714:
----------------------------------------
    Summary: Nutch 2.x upgrade to Gora 0.4  (was: Nutch 2.x upgrade to use GORA_94 branch)
[jira] [Updated] (NUTCH-1714) Nutch 2.x upgrade to Gora 0.4
[ https://issues.apache.org/jira/browse/NUTCH-1714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lewis John McGibbney updated NUTCH-1714:
----------------------------------------
    Fix Version/s: 2.3