Re: Fwd: [libreoffice-l10n] Help text for MIDB
Thanks a lot! I've submitted the suggestions to Gerrit, where anyone is welcome to comment on them:
https://gerrit.libreoffice.org/#/c/14092/

Stanislav

On 20.1.2015 at 02:25, Jesper Hertel wrote:
> Here are my suggestions for examples for MIDB, LEFTB, RIGHTB and LENB.
>
> I actually made a spreadsheet in LibreOffice Calc and tested each expression to be absolutely sure of the results. The spreadsheet I made can be found at [1]. I made it using the English (US) user interface and locale.
>
> [1]: http://www49.zippyshare.com/v/YbkWBbkZ/file.html
>
> It turned out that invalid requests (half DBCS characters) actually do *not* result in empty strings but rather in a *space character*. Hence these suggested examples and explanations.
>
> The return values are the *actual* return values of the expressions shown and were therefore *not* typed by hand (check the spreadsheet if you want to see how). Note the rather subtle spaces returned.
>
> MIDB("中国",1,0) returns "" (0 bytes is always an empty string).
> MIDB("中国",1,1) returns " " (1 byte is only half a DBCS character and therefore the result is a space character).
> MIDB("中国",1,2) returns "中" (2 bytes constitute one complete DBCS character).
> MIDB("中国",1,3) returns "中 " (3 bytes constitute one and a half DBCS characters; the last byte results in a space character).
> MIDB("中国",1,4) returns "中国" (4 bytes constitute two complete DBCS characters).
> MIDB("中国",2,1) returns " " (byte position 2 is not at the beginning of a character in a DBCS string; 1 space character is returned).
> MIDB("中国",2,2) returns "  " (byte position 2 points to the last half of the first character in the DBCS string; the 2 bytes asked for therefore constitute the last half of the first character and the first half of the second character in the string; 2 space characters are therefore returned).
> MIDB("中国",2,3) returns " 国" (byte position 2 is not at the beginning of a character in a DBCS string; a space character is returned for byte position 2).
> MIDB("中国",3,1) returns " " (byte position 3 is at the beginning of a character in a DBCS string, but 1 byte is only half a DBCS character and a space character is therefore returned instead).
> MIDB("中国",3,2) returns "国" (byte position 3 is at the beginning of a character in a DBCS string, and 2 bytes constitute one DBCS character).
> MIDB("office",2,3) returns "ffi" (byte position 2 is at the beginning of a character in a non-DBCS string, and 3 bytes of a non-DBCS string constitute 3 characters).
>
> LEFTB("中国",1) returns " " (1 byte is only half a DBCS character and a space character is returned instead).
> LEFTB("中国",2) returns "中" (2 bytes constitute one complete DBCS character).
> LEFTB("中国",3) returns "中 " (3 bytes constitute one and a half DBCS characters; the last character returned is therefore a space character).
> LEFTB("中国",4) returns "中国" (4 bytes constitute two complete DBCS characters).
> LEFTB("office",3) returns "off" (3 non-DBCS characters each consisting of 1 byte).
>
> RIGHTB("中国",1) returns " " (1 byte is only half a DBCS character and a space character is returned instead).
> RIGHTB("中国",2) returns "国" (2 bytes constitute one complete DBCS character).
> RIGHTB("中国",3) returns " 国" (3 bytes constitute one half DBCS character and one whole DBCS character; a space is returned for the first half).
> RIGHTB("中国",4) returns "中国" (4 bytes constitute two complete DBCS characters).
> RIGHTB("office",3) returns "ice" (3 non-DBCS characters each consisting of 1 byte).
>
> LENB("中") returns "2" (1 DBCS character consisting of 2 bytes).
> LENB("中国") returns "4" (2 DBCS characters each consisting of 2 bytes).
> LENB("office") returns "6" (6 non-DBCS characters each consisting of 1 byte).
>
> If anyone else is curious, "中国" means China in Chinese – according to Google Translate :-).
>
> Jesper

-- 
To unsubscribe e-mail to: l10n+unsubscr...@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/l10n/
All messages sent to this list will be publicly archived and cannot be deleted
Re: Re: Fwd: [libreoffice-l10n] Help text for MIDB
2015-01-20 13:18 GMT+01:00 Kevin Suo:
> On 2015-01-20 at 09:25, "Jesper Hertel" wrote:
> > Here are my suggestions for examples for MIDB, LEFTB, RIGHTB and LENB.
> Good job!

Thanks!

> > If anyone else is curious, "中国" means China in Chinese – according to Google Translate :-).
> Google Translate is 100% right.

:-)
Re: Fwd: [libreoffice-l10n] Help text for MIDB
I won't pretend I understood the Chinese and Japanese cases. However, it seems to me that ALL of this, or at least the most representative parts, should go into the Help, in all languages; possibly not under the specific Basic function but in a separate subsection ("Handling multi-byte encodings"?). This shouldn't be considered a "duplicate" of the relevant standards, but an explanation of what is actually implemented in LO.

On 01/20/2015 04:32 PM, Naruhiko Ogasawara wrote:
> ...
> I wonder if HELP should describe such a detail, though.

-Yury
Re: Re: Fwd: [libreoffice-l10n] Help text for MIDB
Hi Kevin, Jesper, *

Sorry, I haven't been able to follow this whole discussion, so just a short comment.

Basically, Japanese characters can be expressed as double-byte characters, as in Chinese, but some Japanese characters use 4 bytes (a "surrogate pair") rather than 2, such as "𠀋" (U+2000B). I know it's a trivial example:

A1 = "𠀋𠀋"
B1 = MIDB(A1,1,1) returns ""
B1 = MIDB(A1,1,2) returns "(*)"
B1 = MIDB(A1,1,3) returns "(*)"
B1 = MIDB(A1,1,4) returns "𠀋"
B1 = MIDB(A1,1,5) returns "𠀋(*)"
B1 = MIDB(A1,1,6) returns "𠀋(*)"
B1 = MIDB(A1,1,7) returns "𠀋(*)"
B1 = MIDB(A1,1,8) returns "𠀋𠀋"

(*) is a special character that means the font has no glyph for that code point.

I wonder if HELP should describe such a detail, though.

Regards,
-- 
Naruhiko Ogasawara (naru...@gmail.com)
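For readers unfamiliar with surrogate pairs, here is a small Python sketch that inspects the UTF-16 encoding of "𠀋" (U+2000B). It only looks at the encoding; it is not how Calc implements MIDB, and the idea that the "(*)" results above correspond to one half of such a pair is an assumption. The helper names utf16_units and is_surrogate are made up for this illustration.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def is_surrogate(unit):
    """True if the code unit is one half of a surrogate pair."""
    return 0xD800 <= unit <= 0xDFFF


ch = "\U0002000B"  # the character "𠀋" from the example above
units = utf16_units(ch)
print([hex(u) for u in units])               # ['0xd840', '0xdc0b']: 2 units = 4 bytes
print(all(is_surrogate(u) for u in units))   # True: neither half is a character by itself
print([hex(u) for u in utf16_units("中")])    # ['0x4e2d']: 1 unit = 2 bytes
```

Neither half of the pair is a displayable character on its own, which would be consistent with a "no glyph" box appearing whenever only part of the pair is requested.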
Re: Re: Fwd: [libreoffice-l10n] Help text for MIDB
On 2015-01-20 at 09:25, "Jesper Hertel" wrote:
> Here are my suggestions for examples for MIDB, LEFTB, RIGHTB and LENB.

Good job!

> If anyone else is curious, "中国" means China in Chinese – according to Google Translate :-).

Google Translate is 100% right.

Kevin Suo
Re: Fwd: [libreoffice-l10n] Help text for MIDB
2015-01-19 21:30 GMT+01:00 Stanislav Horáček:
> Hi,
>
> I agree that these examples are really useful. Could you also provide some
> examples for the other functions dealing with DBCS (LEFTB, RIGHTB, LENB)?
> If so, I will add them to the Help text.

Hi Stanislav and others,

Here are my suggestions for examples for MIDB, LEFTB, RIGHTB and LENB.

I actually made a spreadsheet in LibreOffice Calc and tested each expression to be absolutely sure of the results. The spreadsheet I made can be found at [1]. I made it using the English (US) user interface and locale.

[1]: http://www49.zippyshare.com/v/YbkWBbkZ/file.html

It turned out that invalid requests (half DBCS characters) actually do *not* result in empty strings but rather in a *space character*. Hence these suggested examples and explanations.

The return values are the *actual* return values of the expressions shown and were therefore *not* typed by hand (check the spreadsheet if you want to see how). Note the rather subtle spaces returned.

MIDB("中国",1,0) returns "" (0 bytes is always an empty string).
MIDB("中国",1,1) returns " " (1 byte is only half a DBCS character and therefore the result is a space character).
MIDB("中国",1,2) returns "中" (2 bytes constitute one complete DBCS character).
MIDB("中国",1,3) returns "中 " (3 bytes constitute one and a half DBCS characters; the last byte results in a space character).
MIDB("中国",1,4) returns "中国" (4 bytes constitute two complete DBCS characters).
MIDB("中国",2,1) returns " " (byte position 2 is not at the beginning of a character in a DBCS string; 1 space character is returned).
MIDB("中国",2,2) returns "  " (byte position 2 points to the last half of the first character in the DBCS string; the 2 bytes asked for therefore constitute the last half of the first character and the first half of the second character in the string; 2 space characters are therefore returned).
MIDB("中国",2,3) returns " 国" (byte position 2 is not at the beginning of a character in a DBCS string; a space character is returned for byte position 2).
MIDB("中国",3,1) returns " " (byte position 3 is at the beginning of a character in a DBCS string, but 1 byte is only half a DBCS character and a space character is therefore returned instead).
MIDB("中国",3,2) returns "国" (byte position 3 is at the beginning of a character in a DBCS string, and 2 bytes constitute one DBCS character).
MIDB("office",2,3) returns "ffi" (byte position 2 is at the beginning of a character in a non-DBCS string, and 3 bytes of a non-DBCS string constitute 3 characters).

LEFTB("中国",1) returns " " (1 byte is only half a DBCS character and a space character is returned instead).
LEFTB("中国",2) returns "中" (2 bytes constitute one complete DBCS character).
LEFTB("中国",3) returns "中 " (3 bytes constitute one and a half DBCS characters; the last character returned is therefore a space character).
LEFTB("中国",4) returns "中国" (4 bytes constitute two complete DBCS characters).
LEFTB("office",3) returns "off" (3 non-DBCS characters each consisting of 1 byte).

RIGHTB("中国",1) returns " " (1 byte is only half a DBCS character and a space character is returned instead).
RIGHTB("中国",2) returns "国" (2 bytes constitute one complete DBCS character).
RIGHTB("中国",3) returns " 国" (3 bytes constitute one half DBCS character and one whole DBCS character; a space is returned for the first half).
RIGHTB("中国",4) returns "中国" (4 bytes constitute two complete DBCS characters).
RIGHTB("office",3) returns "ice" (3 non-DBCS characters each consisting of 1 byte).

LENB("中") returns "2" (1 DBCS character consisting of 2 bytes).
LENB("中国") returns "4" (2 DBCS characters each consisting of 2 bytes).
LENB("office") returns "6" (6 non-DBCS characters each consisting of 1 byte).

If anyone else is curious, "中国" means China in Chinese – according to Google Translate :-).

Jesper
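The rule these examples follow can be sketched in a few lines of Python. This is only a model that reproduces the results listed above, not LibreOffice's actual implementation. It assumes that ASCII characters occupy 1 byte and every other character occupies 2 bytes, which fits "中国" and "office" but not 4-byte characters. The function names lenb, midb, leftb and rightb simply mirror the Calc functions, and char_width is a helper made up for this illustration.

```python
def char_width(ch):
    """Assumed byte width: 1 for ASCII, 2 for anything else (DBCS-style)."""
    return 1 if ord(ch) < 0x80 else 2


def lenb(text):
    """Length of text in bytes under the assumption above."""
    return sum(char_width(ch) for ch in text)


def midb(text, start, count):
    """Return count bytes of text from 1-based byte position start.

    A character that is only partly covered by the requested byte range
    contributes one space per covered byte, as observed in the spreadsheet.
    """
    first, last = start, start + count - 1
    out = []
    pos = 1  # byte position of the current character's first byte
    for ch in text:
        width = char_width(ch)
        end = pos + width - 1
        covered = max(0, min(end, last) - max(pos, first) + 1)
        out.append(ch if covered == width else " " * covered)
        pos += width
    return "".join(out)


def leftb(text, count):
    """The first count bytes of text."""
    return midb(text, 1, count)


def rightb(text, count):
    """The last count bytes of text."""
    return midb(text, lenb(text) - count + 1, count)


print(repr(midb("中国", 2, 2)))    # '  ' (two half characters, two spaces)
print(repr(rightb("中国", 3)))     # ' 国'
print(lenb("office"))              # 6
```

Under these assumptions the sketch gives the same strings as every MIDB, LEFTB, RIGHTB and LENB example in the list, including the results that consist only of spaces.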
Re: Fwd: [libreoffice-l10n] Help text for MIDB
Hi,

I agree that these examples are really useful. Could you also provide some examples for the other functions dealing with DBCS (LEFTB, RIGHTB, LENB)? If so, I will add them to the Help text.

Thanks!
Stanislav

On 19.1.2015 at 16:11, Jesper Hertel wrote:
> 2015-01-19 15:32 GMT+01:00 Kevin Suo:
> > A1 = "中国"
> > B1 = MIDB(A1,1,1) returns ""
> > B1 = MIDB(A1,1,2) returns "中"
> > B1 = MIDB(A1,1,3) returns "中"
> > B1 = MIDB(A1,1,4) returns "中国"
>
> Thanks for the examples, Kevin! I was afraid they wouldn't go through the mailing list system, which is why I didn't supply any. But yours are even better than the ones I would have thought of providing.
>
> > I think it is better left up to the localizers to translate this help text
> > according to their needs; for example, the Japanese team may show how this
> > works with Japanese characters.
>
> I agree that the specific translation is up to the localizers. But even people using a non-DBCS user interface language, such as English or Danish, might want to use that function and know what it is about and how to use it; they could work with Japanese or another DBCS language without having the user interface in that language.
>
> So I still believe the English text could be improved, both regarding the sentence mentioned earlier and by adding several actual DBCS examples similar to the good ones you provided. Maybe just worded and expanded like this, to show that the position argument is also counted in bytes and not in character positions:
>
> MIDB("中国",1,1) returns "" (1 byte is only half a character and it is therefore discarded).
> MIDB("中国",1,2) returns "中" (2 bytes are one complete character).
> MIDB("中国",1,3) returns "中" (3 bytes are one and a half characters; the last byte is discarded).
> MIDB("中国",1,4) returns "中国" (4 bytes are two complete characters).
> MIDB("中国",2,1) returns "" (byte position 2 is not at the beginning of a character).
> MIDB("中国",2,2) returns "" (byte position 2 is not at the beginning of a character).
> MIDB("中国",3,1) returns "" (byte position 3 is at the beginning of a character, but 1 byte is only half a character and is therefore discarded).
> MIDB("中国",3,2) returns "国".
>
> And yes, I do believe that this rather large number of examples is necessary to make it completely clear how this rather technical function works, and that the Help should be the place for such an explanation.
>
> Whether my explanations in parentheses are understandable or relevant I don't know. They are an attempt to explain what is happening to not-so-technical users, but also to technical users who want to be sure they have understood it correctly.
>
> Jesper
Fwd: [libreoffice-l10n] Help text for MIDB
2015-01-19 15:32 GMT+01:00 Kevin Suo:
> A1 = "中国"
> B1 = MIDB(A1,1,1) returns ""
> B1 = MIDB(A1,1,2) returns "中"
> B1 = MIDB(A1,1,3) returns "中"
> B1 = MIDB(A1,1,4) returns "中国"

Thanks for the examples, Kevin! I was afraid they wouldn't go through the mailing list system, which is why I didn't supply any. But yours are even better than the ones I would have thought of providing.

> I think it is better left up to the localizers to translate this help text
> according to their needs; for example, the Japanese team may show how this
> works with Japanese characters.

I agree that the specific translation is up to the localizers. But even people using a non-DBCS user interface language, such as English or Danish, might want to use that function and know what it is about and how to use it; they could work with Japanese or another DBCS language without having the user interface in that language.

So I still believe the English text could be improved, both regarding the sentence mentioned earlier and by adding several actual DBCS examples similar to the good ones you provided. Maybe just worded and expanded like this, to show that the position argument is also counted in bytes and not in character positions:

MIDB("中国",1,1) returns "" (1 byte is only half a character and it is therefore discarded).
MIDB("中国",1,2) returns "中" (2 bytes are one complete character).
MIDB("中国",1,3) returns "中" (3 bytes are one and a half characters; the last byte is discarded).
MIDB("中国",1,4) returns "中国" (4 bytes are two complete characters).
MIDB("中国",2,1) returns "" (byte position 2 is not at the beginning of a character).
MIDB("中国",2,2) returns "" (byte position 2 is not at the beginning of a character).
MIDB("中国",3,1) returns "" (byte position 3 is at the beginning of a character, but 1 byte is only half a character and is therefore discarded).
MIDB("中国",3,2) returns "国".

And yes, I do believe that this rather large number of examples is necessary to make it completely clear how this rather technical function works, and that the Help should be the place for such an explanation.

Whether my explanations in parentheses are understandable or relevant I don't know. They are an attempt to explain what is happening to not-so-technical users, but also to technical users who want to be sure they have understood it correctly.

Jesper
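To make the point about byte positions versus character positions concrete, here is a short Python sketch that lists which character each byte position falls in. It assumes 1 byte per ASCII character and 2 bytes for any other character, as in the examples above; byte_map is a name made up for this illustration and has nothing to do with LibreOffice's code.

```python
def byte_map(text):
    """List (byte_position, character, part) for every byte of text.

    Assumption for illustration: ASCII characters occupy 1 byte and all
    other characters occupy 2 bytes.
    """
    rows = []
    pos = 1  # byte positions are 1-based, like the MIDB position argument
    for ch in text:
        if ord(ch) < 0x80:
            rows.append((pos, ch, "whole"))
            pos += 1
        else:
            rows.append((pos, ch, "first half"))
            rows.append((pos + 1, ch, "second half"))
            pos += 2
    return rows


for row in byte_map("中国"):
    print(row)
# (1, '中', 'first half')
# (2, '中', 'second half')
# (3, '国', 'first half')
# (4, '国', 'second half')
```

Byte positions 1 and 3 start a character, while position 2 points into the middle of "中", which is why a request starting at position 2 cannot return that character whole.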