Correction: I misrepresented Sven. What he said was that Zinc would not look inside the HTML <head> node to determine the encoding. It would of course use the encoding information in the HTTP headers, if any.
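
To make the distinction concrete, here is a minimal sketch in Python. It is not Zinc code, and it makes no claim about how Zinc is implemented. The function names are hypothetical, and the ISO-8859-1 fallback is an assumption made for the illustration. It only shows what "use the HTTP header, ignore the HTML <head>" amounts to: the charset comes from the Content-Type header value, and the body bytes are decoded with it without inspecting the markup.

```python
from email.message import Message


def charset_from_content_type(content_type, default="iso-8859-1"):
    """Return the charset parameter of a Content-Type header value.

    Falls back to `default` when the header carries no charset
    (the fallback value is an assumption for this sketch).
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset(default)


def decode_body(body, content_type):
    """Decode raw body bytes using only the HTTP header information.

    Nothing inside the body (such as a <meta> tag in <head>) is consulted.
    """
    return body.decode(charset_from_content_type(content_type), errors="replace")


# The same UTF-8 bytes, decoded under two different header declarations.
body = "perché".encode("utf-8")
print(decode_body(body, "text/html; charset=utf-8"))
print(decode_body(body, "text/html; charset=iso-8859-1"))
```

The second call shows the kind of damage that appears when UTF-8 bytes are read under an 8859-1 declaration: the accented letter comes out as two unrelated characters.
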
Peter Kenny wrote
> Henry
>
> Thanks for the explanations. It's a bit clearer now. I'm still not sure
> about how ZnUrl>>retrieveContents manages to decode correctly in this case;
> I'm sure I recall Sven saying it didn't (and in his view shouldn't) look at
> the HTTP declarations in the header. There is also the mystery of how the
> string reader in the XML-Parser package (XMLURI>>get) does the same trick,
> when it is presumably what XMLHTMLParser>>parseURL: uses and fails.
>
> However, all these are second order problems. It all begins because the
> Corriere web site does strange things with encoding, including using a UTF8
> character in a page coded with 8859-1, as Paul pointed out. In any case,
> reading the page as a string and then parsing it solves my problem, so I
> shall stick to that as a standard procedure. Most importantly, I don't think
> there is any indication of a problem in the XML package for Monty to worry
> about.
>
> Thanks again
>
> Peter
>
> --
> Sent from: http://forum.world.st/Pharo-Smalltalk-Users-f1310670.html