[ 
http://issues.apache.org/jira/browse/NUTCH-135?page=comments#action_12360025 ] 

Stefan Groschupf commented on NUTCH-135:
----------------------------------------

Andrzej, that is easy to add to the ContentProperties object, and sure, I can do 
that. However, I would first love to get an OK for this patch before I invest 
more time in it, since I have spent too much time writing stuff just for the 
issue archive. 
As soon as this patch is in the sources, I will write a small new patch (as Doug 
suggested, doing it in small steps) to solve NUTCH-3

> http header meta data are case insensitive in the real world (e.g. 
> Content-Type or content-type)
> ------------------------------------------------------------------------------------------------
>
>          Key: NUTCH-135
>          URL: http://issues.apache.org/jira/browse/NUTCH-135
>      Project: Nutch
>         Type: Bug
>   Components: fetcher
>     Versions: 0.7, 0.7.1
>     Reporter: Stefan Groschupf
>     Priority: Critical
>      Fix For: 0.8-dev, 0.7.2-dev
>  Attachments: contentProperties_patch.txt
>
> As described in issue NUTCH-133, some webservers return HTTP header metadata 
> whose keys do not follow the standard capitalization.
> This has many negative side effects. For example, querying the content type 
> from the metadata returns null even when the webserver did return a content 
> type, because the key is not in the standard form (e.g. it is lower case). 
> This also affects the PDF parser, which queries the content length, etc.
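
The lookup problem described above can be illustrated with a minimal sketch. It assumes a hypothetical CaseInsensitiveHeaders class, which is not the ContentProperties class from the attached patch, and it relies only on java.util.TreeMap ordered by String.CASE_INSENSITIVE_ORDER so that keys differing only in case resolve to the same entry.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not the ContentProperties class from the patch.
class CaseInsensitiveHeaders {
    // A TreeMap ordered by String.CASE_INSENSITIVE_ORDER treats keys that
    // differ only in case ("Content-Type", "content-type") as the same key.
    private final Map<String, String> headers =
            new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    public void put(String name, String value) {
        headers.put(name, value);
    }

    // Returns the stored value regardless of the case used by the server,
    // or null if no such header was stored.
    public String get(String name) {
        return headers.get(name);
    }

    public static void main(String[] args) {
        CaseInsensitiveHeaders h = new CaseInsensitiveHeaders();
        // Simulate a webserver that sends non-standard capitalization.
        h.put("content-type", "application/pdf");
        h.put("CONTENT-LENGTH", "1024");
        System.out.println(h.get("Content-Type"));
        System.out.println(h.get("Content-Length"));
    }
}
```

With a plain HashMap in place of the TreeMap, both lookups in main would return null, which is the behaviour the issue describes for the content type and content length queries.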



