Rick Measham <[EMAIL PROTECTED]> writes:
>That being the case, I grab the charset and use Encode's decode function
>to turn it into 'perl's internal format' .. which in 5.8.5 is utf8
>right?

As it happens the answer is "maybe", but as it is the _internal_ form it is none of your business ;-) - so pretend you know nothing about how it all works and convert the internal form to UTF-8 explicitly. (But this will be efficient if the string is internally in that form :-))

>I then store that in the db.

When you get it back from the db you need to convert it from UTF-8 to perl's internal form. Again this is trivial.

>
>However it's not working.
>
>Does that mean that the encoding of the actual characters on the page is
>not in the charset in the meta tag?

Quite possibly - do you mean the chars in the headers or the body?

>Or am I missing some piece of the
>puzzle?
>
>A random example page would be
>http://www.reitsport-schill.de/index1053542873.html
>
>This page is in German and *says* the charset is ISO-8859-1. However the
>characters with the umlauts are displaying as unknown chars in a page
>tagged as utf8.
>
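
For what it's worth, the round trip described above (decode from the declared charset, encode to UTF-8 explicitly before storing, decode from UTF-8 after fetching) might look roughly like this. It is only a sketch: it assumes the bytes really are ISO-8859-1 as the meta tag claims, $raw and the sample string are made up for illustration, and the db calls are left out.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical input: "Mueller" with a u-umlaut, as ISO-8859-1 bytes.
# Assumption: the page really is in the charset it declares.
my $raw = "M\xFCller";

# Bytes in the declared charset -> perl's internal character string.
my $chars = decode('ISO-8859-1', $raw);

# Internal form -> explicit UTF-8 bytes; this is what goes into the db.
my $utf8_bytes = encode('UTF-8', $chars);

# ... store $utf8_bytes, later fetch it again ...

# UTF-8 bytes from the db -> perl's internal form again.
my $round_trip = decode('UTF-8', $utf8_bytes);
```

If the page is not actually in the charset it declares, the first decode is where things go wrong, so that is the step to check first.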