2015-01-22 19:21 GMT+01:00 Karl Berry <k...@freefriends.org>:

> >     characters which should be replaced with entities in \url commands.
>
> I think that's right.
>
> >     this declaration is used only when `url-il2-pl` command line option is
> >     used. not all special characters are declared. now, the problem is
> >     with lualatex, as it is unicode engine, it reports invalid utf-8
> >     sequence even if it doesn't use url-encoders at all.
> >
> >     as it is unlikely that anybody uses still latin2 encoding and special
> >     characters in urls at the same time, and given that list of these
> >     escaped special characters isn't comprehensive anyway, maybe we should
> >     take that away? because it causes compile errors every time tex4ht is
> >     used with lualatex.
>
> Oh.  Clearly we need to solve it somehow, but I don't much like the idea
> of getting rid of functionality, even something as obscure and probably
> little-used as this.  Plenty of people still use Latin N encodings, and
> there is an active TeX community in Poland -- I surmise that's who asked
> for that option in the first place.
>

OK. clearly someone needed it, as it is the only configuration provided
for any input encoding for the url-encoder.

> LuaTeX can certainly read files in any encoding, including plain bytes,
> not just UTF-8.  I'm afraid I don't have any recipes at hand, though
> it seems like it should be doable.
>

it is possible to use a luatex callback to convert the file to utf8 on
the fly as it is read. I did that when I tried to use callbacks to write
html directly from LuaLaTeX:

https://github.com/michal-h21/lua4ht/blob/math/l4patchlatin1.lua

> But a simpler idea comes to mind: how about replacing the problematic
> characters with TeX's ^^xx notation?  I'm not sure if the conversions
> will happen at the right time, given that \url is changing everything
> around anyway, but we can wait and see if anyone notices.  At least it
> would go through at the input level and is one step beyond just deleting it.
>
> Another idea is to move that chunk of input to a separate file, which
> only gets read when that option is in effect.

that might be the best solution?

thanks,

Michal

>
> Thanks,
> Karl
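
P.S. for reference, a minimal sketch of what the ^^xx idea would involve. It is only an illustration, not anything that exists in tex4ht: the function name `to_caret_notation` and the sample string are made up here. It rewrites every byte above 0x7f as ^^xx with lowercase hex digits, so the resulting text is pure ASCII and should no longer trigger the invalid utf-8 error at the input level. Note that LuaTeX reads ^^b1 as character code 0xB1 rather than as a Latin 2 byte, so whether the url-encoder still sees what it expects is untested.

```python
def to_caret_notation(raw: bytes) -> str:
    """Return ASCII text in which each byte >= 0x80 is written as ^^xx."""
    out = []
    for b in raw:
        if b >= 0x80:
            # TeX only accepts lowercase hex digits in the ^^xx form.
            out.append("^^%02x" % b)
        else:
            out.append(chr(b))
    return "".join(out)


# Hypothetical sample: two Polish letters encoded as ISO-8859-2 bytes.
sample = "ął".encode("iso-8859-2")
print(to_caret_notation(sample))
```

The same function could be run once over the chunk containing the Latin 2 declarations; plain ASCII input passes through unchanged.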