Hi,
 
when parsing HTML pages fetched via HTTP with either the html or the
nekohtml generator, the German umlauts (äöüÄÖÜ) and other non-ASCII
characters are not decoded correctly.
 
I have checked the page. It doesn't specify an encoding in its HTML header
(plain HTML - maybe 3.2), but the page is delivered with the correct
encoding (UTF-8) in the HTTP response header. Apparently neither the html
generator nor nekohtml respects this setting.
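
To illustrate the behaviour I would expect, here is a minimal sketch in
plain Java. It is not Cocoon code: the class HeaderCharset and its methods
are made up for this example, and the fallback to ISO-8859-1 is my
assumption, based on the HTTP/1.1 default for text/* content.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only (not part of Cocoon or NekoHTML).
final class HeaderCharset {

    // Returns the charset named in a Content-Type header value such as
    // "text/html; charset=UTF-8". Falls back to ISO-8859-1 (assumed default)
    // when the header is missing or names no usable charset.
    static Charset charsetOf(String contentType) {
        if (contentType != null) {
            for (String param : contentType.split(";")) {
                String p = param.trim();
                // Parameter names are case-insensitive, so ignore case here.
                if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                    String name = p.substring(8).trim().replace("\"", "");
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        break; // unknown or malformed name: use the fallback
                    }
                }
            }
        }
        return StandardCharsets.ISO_8859_1;
    }

    // Decodes the raw response body with the charset taken from the header.
    static String decode(byte[] body, String contentType) {
        return new String(body, charsetOf(contentType));
    }
}
```

With the header value "text/html; charset=UTF-8", decode() would return the
umlauts intact. Decoding the same bytes with the fallback charset gives
garbled text, which looks like what I am seeing.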
 
I've tried to edit /WEB-INF/tidy.properties and /WEB-INF/neko.properties,
but with no success.
 
Is there a workaround? Any help would be very much appreciated - I must fix
this encoding.
 
Kind regards,
Christian Schlichtherle
-- 
Schlichtherle IT Services
Wittelsbacherstr. 10a
10707 Berlin
 
Tel: +49 (0) 30 / 34 35 29 29
Mobil: +49 (0) 173 / 27 12 470
mailto:[EMAIL PROTECTED]
http://www.schlichtherle.de
 
