Hi Udo,

With a URL/URI there are two representations: the external one (the way they are written) and the internal one (what is really meant). ZnUrl follows this distinction.

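A minimal sketch of that distinction, assuming a Pharo image with Zinc loaded and that #pathSegments answers the decoded segments as it does in the Zinc versions I know; the variable name url is just for illustration:

```smalltalk
| url |
"Parse the external, percent-encoded representation."
url := 'http://myhost/path/with/umlaut/%C3%A4%C3%B6%C3%BC.txt' asZnUrl.

"Internal representation: the last path segment should hold the decoded characters."
url pathSegments last.   "expected: 'äöü.txt'"

"Printing goes back to the external, percent-encoded representation."
url printString.         "'http://myhost/path/with/umlaut/%C3%A4%C3%B6%C3%BC.txt'"
```

Inspecting the two results side by side in a Playground shows which form you are looking at.
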
When you say #asUrl (or #asZnUrl) you are actually parsing an external string representation. When doing so, percent decoding is done by ZnPercentEncoder. This class is strict, in that it does not allow non-safe, non-ASCII characters in its input. AFAIK this is correct, but I can imagine a less strict interpretation (like the URL input box of a browser would allow). If you have a reading of the specs that says otherwise I would be very interested.

To avoid doing the encoding yourself, construct the URL from its parts explicitly, like this:

  ZnUrl new
    scheme: #http;
    host: 'myhost';
    addPathSegments: #('path' 'with' 'umlaut' 'äöü.txt');
    yourself.

  => http://myhost/path/with/umlaut/%C3%A4%C3%B6%C3%BC.txt

Class comments and unit tests should help. There is also this draft:

  http://stfx.eu/EnterprisePharo/Zinc-Encoding-Meta/

HTH,

Sven

PS: Incidentally, this does work

  'http://myhost/path/with/umlaut/äöü.txt' asFileReference asUrl.

because #asFileReference works differently.

> On 02 Dec 2014, at 23:32, Udo Schneider <udo.schnei...@homeaddress.de> wrote:
>
> All,
>
> What's the expected behavior with non-ASCII characters in URLs? Let's say I
> want to access a file named "äöü.txt". My assumption was that Zinc takes
> care of the UTF-8 -> 7bit (ASCII) -> Escape encoding. But there is either
> something I don't understand or some manual steps I'm missing.
>
> The "straightforward" way doesn't work:
> 'http://myhost/path/with/umlaut/äöü.txt' asUrl.
> "ZnCharacterEncodingError: ASCII character expected"
>
> Although the actual encoding seems to be able to handle it (ignoring the
> escaped slashes for the moment):
> 'http://myhost/path/with/umlaut/äöü.txt' urlEncoded.
> "'http%3A%2F%2Fmyhost%2Fpath%2Fwith%2Fumlaut%2F%C3%A4%C3%B6%C3%BC.txt'"
>
> Creating a URL from already escaped characters works as well:
> 'http://myhost/path/with/umlaut/%C3%A4%C3%B6%C3%BC.txt' asUrl.
> "http://myhost/path/with/umlaut/%C3%A4%C3%B6%C3%BC.txt"
>
> As does the decoding of such a URL:
> 'http://myhost/path/with/umlaut/%C3%A4%C3%B6%C3%BC.txt' urlDecoded.
> "'http://myhost/path/with/umlaut/äöü.txt'"
>
> At the moment I'm manually encoding UTF-8 characters in path segments
> before trying to build the URL. But is this the correct way?
>
> Best Regards,
>
> Udo