On Wed, Jan 5, 2022 at 6:46 PM Alexandre Gacon <alexandre.ga...@gmail.com> wrote:
> ---------------------------
> In order to understand the following explanation, keep in mind that:
>
>    - UTF-8 is the encoding that properly preserves all non-ASCII,
>    non-latin1 characters
>    - ISO-8859-1 (aka latin1) is an ASCII-based encoding that contains all
>    the ASCII characters plus some additional ones used in the Latin alphabet
>    (e.g. é, è, etc.)

Probably key to understanding the rest: latin1 and ISO-8859-1 are the same
<https://en.wikipedia.org/wiki/ISO/IEC_8859-1> (it confused me at first).

>    - us-ascii is the standard encoding for electronic communication and,
>    as we already mentioned, a subset of the latin1 encoding.
>
> After the new tests regarding the retaining of the encoding of the file
> given in the ticket, we noticed the following:
>
>    - If a non-latin1, non-ascii character exists in the translation
>    (UTF-8 characters) then the final translation file will contain the UTF-8
>    escaped corresponding characters (e.g. \u0420 corresponds to some Cyrillic
>    letter).

Ok, so Transifex won't support the Wicket ".utf8.properties" convention, and
will just escape chars so that the file can be encoded in ISO-8859-1 instead.

>    - In our case, the latin1 character wasn’t part of the translated
>    strings but part of the structure of the file, at the template of the file.
>    This means that we don’t want to change it to the UTF-8 escaped character.

I don't understand what "the structure of the file" as opposed to "part of the
translated strings" means. Maybe the latin1 character was in a key rather than
in a value? Or maybe in a comment.

>    - But on the other hand, the library that we are using in order to
>    integrate github with transifex is not supporting latin1 but UTF-8 so when
>    a non-ascii character appears it converts the whole file to the best
>    encoding that can represent that character. In our case that is UTF-8.
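To make the escaping described in the bullets above concrete, here is a minimal Python sketch. It is not Transifex's code and not what their GitHub library does; the function names `fits_latin1` and `escape_non_ascii` are made up for the illustration. It only assumes the Java properties convention that `\uXXXX` is a Unicode escape holding one UTF-16 code unit.

```python
def fits_latin1(text: str) -> bool:
    """Return True if every character can be encoded in ISO-8859-1 (latin1)."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


def escape_non_ascii(text: str) -> str:
    """Rewrite every non-ASCII character as a Java-style \\uXXXX escape."""
    pieces = []
    for ch in text:
        if ord(ch) < 128:
            pieces.append(ch)  # plain us-ascii, kept as-is
            continue
        # \uXXXX escapes are UTF-16 code units, so a character outside the
        # Basic Multilingual Plane becomes two escapes (a surrogate pair).
        utf16 = ch.encode("utf-16-be")
        for i in range(0, len(utf16), 2):
            unit = int.from_bytes(utf16[i:i + 2], "big")
            pieces.append("\\u{:04X}".format(unit))
    return "".join(pieces)
```

With this sketch, "é" fits in latin1 while a Cyrillic letter does not, and escaping either one yields pure us-ascii text (`\u00E9` for "é", `\u0420` for the Cyrillic "Р"), which is the "us-ascii plus escapes" form discussed in this thread.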
It seems they have a technical limitation: they can either do us-ascii or
escaped UTF-8, but do not support latin1 (ISO-8859-1).

> In order to preserve the us-ascii encoding (not the latin1) in github one
> must make sure that the source keys and the comments of the file do not
> contain any non ascii characters.

Seems that we can either use only us-ascii chars (and encode anything else,
including accented letters, using UTF-8 escape codes), or maybe go fully UTF-8?
Regardless, it seems ISO-8859-1 is simply out of the equation?

> ---------------------------
>
> In case something wasn't clear, what this means is that because the source
> file had a latin1 character (é) even though the translations for the
> strings did not, this character was kept as-is (not escaped) as part of the
> "template". Therefore, the translation files sent back to GitHub are being
> encoded with UTF-8 by the library being used. We do not think we can do
> anything about this, unfortunately. So, the translation files for the Java
> Properties file format must be retrieved from Transifex directly instead of
> using the GitHub integration.

I believe the "é" character was added in a comment, as an attempt to force
Transifex to use ISO-8859-1? And Transifex is simply incapable of doing that?

Hum... well, Wicket does not really care and will support translation files
made of us-ascii with UTF-8 escapes fine, I believe, but translators that are
doing direct commits, rather than going through Transifex, might be less than
pleased.
I believe Jody at one point mentioned a different platform, but I cannot
remember which one that is.

Thinking out loud, I see two avenues ahead:

   - Put up with Transifex limitations
   - Try to extract the good work present in Transifex once, and then
   migrate to another translation system, if you can find one that works
   better for translators

Cheers
Andrea

==
GeoServer Professional Services from the experts!
Visit http://bit.ly/gs-services-us for more information.
==

Ing.
Andrea Aime
@geowolf
Technical Lead

GeoSolutions Group
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 333 8128928

https://www.geosolutionsgroup.com/
http://twitter.com/geosolutions_it
_______________________________________________
Geoserver-users mailing list
Geoserver-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-users