Hello!

I'm sorry for the delay.

On 2019-02-22 Mario Blättermann wrote:
> On Thursday, 21 February 2019, 18:38:06 CET, Lasse Collin wrote:
> > On 2019-02-17 Mario Blättermann wrote:
> > > It would be nice if xz would be integrated into a global
> > > translation platform.
> >
> > Benno Schulenberg asked me about this in 2016. I didn't want to
> > think about it at that moment and then it was forgotten. :-/ Let's
> > try again now.
> >
> He's CCed from now on.

:-) The xz-devel list only allows subscribers to post, which you
probably already noticed. This is a bit inconvenient but it keeps spam
away. I still get the rejected messages in my inbox.

> > I worry that it's not that simple. My experience is that I need to
> > look through the translations because most have had some errors in
> > aligning columns in --help and --list outputs. In some cases it has
> > taken several tries until a translator has gotten it correctly
> > done.
> >
> > There is debug/translation.bash to see the translations in action,
> > and there are instructions in README section 4. Multiple
> > translators having similar problems suggests that there's a problem
> > in my code or instructions, but I don't know how to improve.
> >
> In some translations, the --help output is split into two gettext
> messages: the option itself and the corresponding description. This
> way, translators don't have to bother with indentations, tab widths
> and so on. But this behavior I haven't found very often.
> Unfortunately, I don't have any coding skills, that's why I won't be
> able to help you.

I think some GNU packages use "argp" for the --help output where the
messages are split as you described. argp can be convenient and I
understand why translators may like it too. On the other hand, raw
strings give translators more control over how --help is shown (e.g.
they can change the column where the description starts for all
messages) which might be useful in some (rare) cases.

argp is not in POSIX. argp is available in gnulib so it isn't too hard
to add it into a package. The gnulib implementation is under LGPLv3+.
xz is public domain because LZMA SDK is; I didn't want to use a more
restrictive license than the original compression code does. Thus I
don't want to use argp in xz. (There is GNU getopt_long in xz but it's
not a big problem because a compatible enough version is available on
many OSes, including all BSDs.)
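
For illustration, the split approach could look roughly like the
following Python sketch. It is not how argp or xz works: HELP_ENTRIES,
format_help(), and its parameters are made up for this example, and it
assumes every character takes exactly one terminal column.

```python
import textwrap

# Hypothetical (option, description) pairs. In a real program each
# string would be a separate gettext message, so a translator never
# has to count spaces or tabs.
HELP_ENTRIES = [
    ("-z, --compress", "force compression"),
    ("-d, --decompress", "force decompression"),
    ("-k, --keep", "keep (don't delete) input files"),
]


def format_help(entries, indent=2, gap=2, width=79):
    """Lay out option/description pairs in two aligned columns.

    Assumption: len() equals the number of terminal columns, which
    is not true for all Unicode text.
    """
    # The description column starts after the longest option string.
    desc_col = indent + max(len(opt) for opt, _ in entries) + gap
    lines = []
    for opt, desc in entries:
        wrapped = textwrap.wrap(desc, width - desc_col) or [""]
        first = (" " * indent + opt).ljust(desc_col) + wrapped[0]
        lines.append(first.rstrip())
        # Continuation lines are indented to the description column.
        lines.extend(" " * desc_col + more for more in wrapped[1:])
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_help(HELP_ENTRIES))
```

With this layout the description column is computed by the program, so
changing it for all messages is a one-line change instead of an edit to
every translated string.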
Obviously argp isn't the only way to split the --help messages. I
haven't searched for other ready-made solutions though because so far I
haven't had much interest in this.

A bit more off-topic but I post it here anyway in case someone finds it
interesting or even has knowledge and energy to improve the relevant
argp code:

Splitting the strings in --help works perfectly only if the library is
sophisticated enough. Things are simple in US-ASCII, ISO-8859-*, and
such character sets, but nowadays UTF-8 is the most common. In UTF-8 a
single Unicode code point can use 1-4 bytes and each code point may use
0, 1, or 2 columns in a terminal. If these things aren't handled
properly, the --help output won't look perfect.

I tested GNU tar --help under de_DE (ISO-8859-1) and de_DE.UTF-8. I
think argp uses bytes to calculate string lengths and thus gets it
wrong under a UTF-8 locale:

      --group-map=DATEI      DATEI benutzen, um GIDs und Namen der Besitzer
                             abzubilden
      --mode=ÄNDERUNGEN     den (symbolischen) Modus ÄNDERUNGEN für
                             hinzugefügte Dateien erzwingen

"den (symbolischen)" is misaligned because the Ä in ÄNDERUNGEN is two
bytes and argp thinks it takes two columns of space, while in reality
those two bytes use only one column. With the ISO-8859-1 locale the
alignment is correct.

The same problem causes line-wrapping to happen too early. ISO-8859-1
version first (converted to UTF-8 for email), then UTF-8:

      --xattrs-include=MASKE das Einschluss-Muster für xattr-Schlüssel angeben

      --xattrs-include=MASKE das Einschluss-Muster für xattr-Schlüssel
                             angeben

  -P, --absolute-names       führende »/«-Zeichen in den Dateinamen erhalten

  -P, --absolute-names       führende „/“-Zeichen in den Dateinamen
                             erhalten

These aren't the translators' fault, but they still make the translated
program look slightly sloppy.

> > I wonder should a few experienced translators look at this first so
> > that possible problems at my side can be fixed.
> > It doesn't sound great if I get 30 new translations and 25 need
> > similar fixes and I need to explain them to each translator
> > separately.
> >
> Once a new translation arrives (assuming the TP robot sends it to
> this list) I will have a look at it.

I'm not sure if I understood correctly. If you meant that the TP would
send the ready-made translations to xz-devel, I guess it's a problem
due to only subscribers being able to post to xz-devel. I had thought
the translations could be sent directly to me but now I'm unsure if
that is flexible enough.

Benno Schulenberg wrote:
> For the --help output, I wouldn't worry much about the alignment; it's
> much more important that the translation is clear and grammatically
> correct.

I agree that correct language is much more important than the
alignment. However, I think it's way easier to get the alignment right
than to make a good translation, so if the hard part is done, it would
be nice if the easy part gets done too. :-)

> For the --list output... I've looked at the Dutch output
> of xz-5.2.2 (that's installed on my machine) and it is... quite
> misaligned. Not looking good.

Oh. :-( If so, it's my fault too as I thought I had checked them before
committing.

> Maybe have a look at df in coreutils. It used to have problems with
> alignment of the column headers too, but they changed things so that
> each column header is translated separately and they are aligned
> automatically. Or maybe have a look at util-linux -- I think it has
> a mechanism/library to create properly aligned tables.

Thanks! I quickly looked at df and I see it has code that handles the
various issues in getting the alignment right. :-) I think I cannot use
that code in xz for license reasons, but on the other hand I don't need
such fancy features in xz either, I think. There already is some
multibyte-aware code in xz because some languages use fancy characters
for thousand separators, and those need to be handled correctly to get
the alignment right.
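
To make the bytes-versus-columns issue concrete, here is a rough Python
sketch (not code from xz, df, or util-linux). display_width(),
pad_to_width(), and format_row() are hypothetical helpers; the width
rules only approximate what a C function such as wcwidth() would
report, and the sample strings are invented.

```python
import unicodedata


def display_width(text):
    """Approximate the number of terminal columns that text occupies.

    Assumption: combining marks and format characters take zero
    columns, East Asian wide/fullwidth characters take two, and
    everything else takes one.
    """
    width = 0
    for ch in text:
        if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
            continue  # zero-width character
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width


def pad_to_width(text, columns):
    """Pad text with spaces until it fills the given number of columns."""
    return text + " " * max(0, columns - display_width(text))


def format_row(cells, widths):
    """Join separately translated cells into one aligned row."""
    padded = (pad_to_width(c, w) for c, w in zip(cells, widths))
    return "  ".join(padded).rstrip()


if __name__ == "__main__":
    word = "\u00c4NDERUNGEN"  # the German placeholder with a capital A-umlaut
    print(len(word.encode("utf-8")), "bytes,", display_width(word), "columns")
    print(format_row([word, "x"], [12, 1]))
```

Padding by display_width() instead of by byte count is what keeps a
column aligned when a translated header or a thousand separator
contains multibyte characters.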
Splitting the strings for --list is much easier than for --help
(without an external library) as --list doesn't need word wrapping.
Perhaps I should look if it is easy enough to change --list to separate
strings, or at least part of it. I suppose splitting even a few strings
should make translations easier and less error prone.

> If that is too much work, then adding a translator instruction (hint)
> as a comment before the relevant string might help a bit. Normally
> one then adds --add-comments=TRANSLATORS to the invocation of
> xgettext.

There already are TRANSLATORS-comments and they show up in xz.pot too.

> On 01-03-19 at 21:18, Mario Blättermann wrote:
> > XZ now could be added now to the TP.
>
> Okay for me. But I need a direct request from Lasse,

From other emails I understood that the situation of the existing
translations and translators is clear now, thus I can now ask that XZ
Utils becomes part of the TP. I think version 5.2.4 is a decent
starting point:

    https://tukaani.org/xz/xz-5.2.4.tar.xz

5.2.5 will probably have no changes in translatable strings, unless I
suddenly split the strings in --list, but probably I don't want to rush
that into a bug fix release because I fear regressions.

I plan to add this to README, I hope it's good enough:

    The translations are handled via the Translation Project. If you
    wish to help translate xz, please join the Translation Project:

        http://translationproject.org/html/translators.html

> plus whether he wants to receive a notification when a translator
> uploads an update, and if yes on which email address.

I would like an email to my personal address (not xz-devel) with a URL
to the translation.

> And whether he wants the translators to have signed a disclaimer
> (normally only required for GNU software,
> https://translationproject.org/html/whydisclaim.html).

The original strings are in the public domain and the translations
should be too (strictly speaking: as far as PD is legally possible).
I think it's enough if this is written in the .po files like it is in
the existing .po files. That is, I don't request any physical papers or
such things.

Thanks to everyone involved for helping with this!

-- 
Lasse Collin  |  IRC: Larhzu @ IRCnet & Freenode