A couple of points:

1. You used tabs.
2. You left some unnecessary comments in the source; the bug history is 
already in JIRA and the commit logs.
3. Why no addition to the test case?
4. The issue could have been iterated on in JIRA a bit further, so that all of 
these could have been caught before the commit.

--
  Sami Siren




Chris A. Mattmann (JIRA) wrote:
>      [ http://issues.apache.org/jira/browse/NUTCH-406?page=all ]
> 
> Chris A. Mattmann closed NUTCH-406.
> -----------------------------------
> 
> 
> Patch applied to trunk:
> 
> http://svn.apache.org/viewvc?view=rev&revision=478619
> 
> 
> 
> 
>> Metadata tries to write null values
>> -----------------------------------
>>
>>                 Key: NUTCH-406
>>                 URL: http://issues.apache.org/jira/browse/NUTCH-406
>>             Project: Nutch
>>          Issue Type: Bug
>>    Affects Versions: 0.9.0
>>            Reporter: Doğacan Güney
>>         Assigned To: Chris A. Mattmann
>>             Fix For: 0.9.0
>>
>>         Attachments: NUTCH-406.patch, NUTCH-406.patch
>>
>>
>> During parsing, some URLs (especially PDFs, it seems) may create
>> <some_key, null> pairs in ParseData's parseMeta.
>> When Metadata.write() tries to write such a pair, it causes an NPE.
>> Stack trace will be something like this:
>>         at org.apache.hadoop.io.Text.encode(Text.java:373)
>>         at org.apache.hadoop.io.Text.encode(Text.java:354)
>>         at org.apache.hadoop.io.Text.writeString(Text.java:394)
>>         at org.apache.nutch.metadata.Metadata.write(Metadata.java:214)
>> I can consistently reproduce this using the following url:
>> http://www.efesbev.com/corporate_governance/pdf/MergerAgreement.pdf
> 
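
To make the failure mode in the quoted issue concrete, here is a minimal sketch in plain Java of a multi-valued metadata container whose write() skips null values before serializing. This is not the committed patch (see the revision linked above for that) and it does not use Nutch or Hadoop classes: NullSafeMetadataSketch, add() and serialize() are hypothetical names, and DataOutput.writeUTF stands in for Text.writeString, which similarly fails on a null argument according to the stack trace.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a multi-valued metadata container; not Nutch's Metadata. */
public class NullSafeMetadataSketch {
    private final Map<String, List<String>> values = new LinkedHashMap<>();

    /** Adds a value under a name; null values are accepted here, as a parser might produce them. */
    public void add(String name, String value) {
        values.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    /** Writes only non-null values, so the counts written always match the data that follows. */
    public void write(DataOutput out) throws IOException {
        // Filter first: the entry and value counts must be known before anything is written.
        Map<String, List<String>> clean = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : values.entrySet()) {
            List<String> nonNull = new ArrayList<>();
            for (String v : e.getValue()) {
                if (v != null) {
                    nonNull.add(v);
                }
            }
            if (!nonNull.isEmpty()) {
                clean.put(e.getKey(), nonNull);
            }
        }
        out.writeInt(clean.size());
        for (Map.Entry<String, List<String>> e : clean.entrySet()) {
            out.writeUTF(e.getKey());
            out.writeInt(e.getValue().size());
            for (String v : e.getValue()) {
                out.writeUTF(v); // would throw NullPointerException if v were null
            }
        }
    }

    /** Convenience helper: serializes the metadata into a byte array. */
    public static byte[] serialize(NullSafeMetadataSketch m) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            m.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }
}
```

A regression test along the lines of point 3 could add a <some_key, null> pair, call serialize(), and check that no exception is thrown and that the output matches that of the same metadata without the null pair.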


_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
