On 27 July 2018 at 18:39, Offray Vladimir Luna Cárdenas <offray.l...@mutabit.com> wrote:
> Hi,
>
> I was ready to show a friend the Pharo web capabilities with the
> classical "myString asUrl retrieveContents", but the friend gave me a
> url that contains non Latin characters[1] and then I got an
> ZnInvalidUTF8 error.
>
> [1]
> http://www.bidchance.com/freesearch.do?&filetype=&channel=&currentpage=1&searchtype=zb&queryword=%BF%A6%CA%B2&displayStyle=&pstate=&field=&leftday=&province=&bidfile=&project=&heshi=&recommend=&field=&jing=&starttime=&endtime=&attachment=
>
> How can I process web addresses in Pharo that contain non latin
> characters like the one in [1]?

Just some blind digging...

A few levels down the stack is a call equivalent to...

    x := '%BF%A6%CA%B2'.
    ZnPercentEncoder new decode: x.

which fails with the same error. In #decode we have...

    bytes := #[191 166 202 178].

and browsing around I discovered a useful method...

    encoder := ZnCharacterEncoder detectEncoding: bytes.
    "==> a ZnSimplifiedByteEncoder('iso88591' strict)"

Now the following works...

    (ZnPercentEncoder new characterEncoder: encoder) decode: x.

So maybe that helps explain it, but I don't know how to join the dots
to make it work out of the box with "asUrl retrieveContents".

cheers -ben
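
P.S. For anyone who wants to try this in a Playground, the steps from this message can be collected into one snippet. It is only a sketch: it uses the same calls as above, the byte literal is copied by hand rather than computed, and the variable names utf8Attempt and decoded are mine. The detected encoder is a guess from the bytes alone, so the decoded string may not be the text the site intended if the query word was encoded with some other charset.

```smalltalk
"Playground sketch; the byte literal is copied from the debugger, not computed."
| encoded bytes utf8Attempt encoder decoded |
encoded := '%BF%A6%CA%B2'.
bytes := #[191 166 202 178].

"The default percent decoder assumes UTF-8 and signals ZnInvalidUTF8 here.
 Catch the error so the snippet runs to the end; utf8Attempt ends up nil."
utf8Attempt := [ ZnPercentEncoder new decode: encoded ]
	on: ZnInvalidUTF8
	do: [ :error | nil ].

"Guess an encoder from the raw bytes, then decode with it explicitly."
encoder := ZnCharacterEncoder detectEncoding: bytes.
decoded := (ZnPercentEncoder new characterEncoder: encoder) decode: encoded.
decoded
```

This only covers decoding the one query value by hand. It does not change what "asUrl" does when it parses the full address.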