[jira] [Updated] (NUTCH-1714) Nutch 2.x upgrade to Gora 0.4

2014-05-13 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/NUTCH-1714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alparslan Avcı updated NUTCH-1714:
--

Attachment: NUTCH-1714v6.patch

Hi [~jnioche],

First, sorry for the late answer, and thanks for your comments!

bq. The code has changed since the last patch and we are now getting :
bq. This is due to status.getArgs() returning null.

I have fixed these in the [new patch|^NUTCH-1714v6.patch] I added.

bq. I presume you added the methods mentioned in NUTCH-1709 by hand after 
generating the classes automatically?
Yes, as you guessed, I changed them by hand after generating the classes automatically.

bq. WebTableReader should also remove the dirty field in processDumpJob
I have also fixed this in the new patch I added.
{quote}
* the Generator marks 50K entries with GENERATE_MARK but the Fetcher shows only 
49,461 as Map Input Records (and the same number as Reduce input records) => 
looks like we are not getting all the records we should be getting. I dumped 
the content of the table pre-fetching and it contains the right number of 
entries i.e. 50K
* The Generator displayed 'generated batch id: 1399626659-15643 containing 0 
URLs' but as I just explained it marked 50K entries correctly
* The dump of the webtable contains 'markers:   
org.apache.gora.persistency.impl.DirtyMapWrapper@eb173c'. It should display the 
values correctly.
{quote}
I will look into these issues as soon as possible. Thanks again!

> Nutch 2.x upgrade to Gora 0.4
> -
>
> Key: NUTCH-1714
> URL: https://issues.apache.org/jira/browse/NUTCH-1714
> Project: Nutch
> Issue Type: Improvement
> Reporter: Alparslan Avcı
> Assignee: Alparslan Avcı
> Fix For: 2.3
>
> Attachments: NUTCH-1714.patch, NUTCH-1714_NUTCH-1714_v2_v3.patch, 
> NUTCH-1714v2.patch, NUTCH-1714v4.patch, NUTCH-1714v5.patch, NUTCH-1714v6.patch
>
>
> Nutch upgrade for GORA_94 branch has to be implemented. We can discuss the 
> details in this issue.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (NUTCH-1714) Nutch 2.x upgrade to Gora 0.4

2014-05-01 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/NUTCH-1714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alparslan Avcı updated NUTCH-1714:
--

Attachment: NUTCH-1714v5.patch

Hi [~jnioche],
I have uploaded a new patch that also fixes the problem in the
_"./nutch readdb -crawlId MYCRAWLIDHERE -stats"_ command.
Could you please test it again? Thanks!

> Nutch 2.x upgrade to Gora 0.4
> -
>
> Key: NUTCH-1714
> URL: https://issues.apache.org/jira/browse/NUTCH-1714
> Project: Nutch
> Issue Type: Improvement
> Reporter: Alparslan Avcı
> Assignee: Alparslan Avcı
> Fix For: 2.3
>
> Attachments: NUTCH-1714.patch, NUTCH-1714_NUTCH-1714_v2_v3.patch, 
> NUTCH-1714v2.patch, NUTCH-1714v4.patch, NUTCH-1714v5.patch
>
>
> Nutch upgrade for GORA_94 branch has to be implemented. We can discuss the 
> details in this issue.





[jira] [Updated] (NUTCH-1714) Nutch 2.x upgrade to Gora 0.4

2014-04-30 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-1714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated NUTCH-1714:


Summary: Nutch 2.x upgrade to Gora 0.4  (was: Nutch 2.x upgrade to use 
GORA_94 branch)

> Nutch 2.x upgrade to Gora 0.4
> -
>
> Key: NUTCH-1714
> URL: https://issues.apache.org/jira/browse/NUTCH-1714
> Project: Nutch
> Issue Type: Improvement
> Reporter: Alparslan Avcı
> Assignee: Alparslan Avcı
> Fix For: 2.3
>
> Attachments: NUTCH-1714.patch, NUTCH-1714_NUTCH-1714_v2_v3.patch, 
> NUTCH-1714v2.patch, NUTCH-1714v4.patch
>
>
> Nutch upgrade for GORA_94 branch has to be implemented. We can discuss the 
> details in this issue.





[jira] [Updated] (NUTCH-1714) Nutch 2.x upgrade to Gora 0.4

2014-04-30 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-1714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated NUTCH-1714:


Fix Version/s: 2.3



