On Thu, May 5, 2016 at 12:09 AM, DFS <nos...@dfs.com> wrote:
> On 5/3/2016 11:28 PM, Steven D'Aprano wrote:
>> [ lengthy piece about text, Unicode, and letter case ]
>
> Linguist much?

As an English-only speaker who writes code that needs to be used
around the world, you end up accruing tidbits of language and text
trivia in the form of edge cases that you need to remember to test.
Among them:

* Turkish dotless and dotted i
* Greek medial and final sigma
* German eszett
* Hebrew and Arabic right-to-left text
* Chinese non-BMP characters
* Combining characters (e.g. diacritics starting at U+0300)
* Noncharacters, e.g. U+FFFE
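
A rough sketch of how several of these show up when tested in Python 3 (assuming CPython 3.3 or later, where strings are indexed by code point). The sample strings and the helper name edge_case_samples are illustrative choices for this sketch, not taken from Steven's post:

```python
import unicodedata


def edge_case_samples():
    """Return a dict mapping each edge case to True if Python behaves as noted."""
    astral = "\U00020000"   # a CJK ideograph outside the BMP
    decomposed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT
    return {
        # Turkish: str.lower()/upper() ignore locale, so 'I' never maps to
        # dotless U+0131, and dotted capital U+0130 becomes 'i' + U+0307.
        "turkish_plain": "I".lower() == "i" and "\u0131".upper() == "I",
        "turkish_dotted": "\u0130".lower() == "i\u0307",
        # Greek: capital sigma lowercases to final sigma U+03C2 at the end
        # of a word, and to medial sigma U+03C3 when it stands alone.
        "greek_final": "\u039f\u0394\u039f\u03a3".lower() == "\u03bf\u03b4\u03bf\u03c2",
        "greek_lone": "\u03a3".lower() == "\u03c3",
        # German: eszett uppercases to two letters and casefolds to 'ss'.
        "eszett_upper": "\u00df".upper() == "SS",
        "eszett_fold": "stra\u00dfe".casefold() == "strasse",
        # Hebrew and Arabic letters carry right-to-left bidi classes.
        "hebrew_rtl": unicodedata.bidirectional("\u05d0") == "R",
        "arabic_rtl": unicodedata.bidirectional("\u0627") == "AL",
        # Non-BMP: one code point, but two UTF-16 code units (four bytes).
        "astral_len": len(astral) == 1 and len(astral.encode("utf-16-le")) == 4,
        # Combining: two code points that NFC composes into U+00E9.
        "combining": len(decomposed) == 2
        and unicodedata.normalize("NFC", decomposed) == "\u00e9",
        # Noncharacter: U+FFFE is permanently unassigned (category Cn).
        "nonchar": unicodedata.category("\ufffe") == "Cn",
    }


if __name__ == "__main__":
    for name, ok in edge_case_samples().items():
        print(name, "ok" if ok else "UNEXPECTED")
```

None of this covers rendering or locale-aware case mapping; for the Turkish case in particular, the standard str methods will not do the right thing on their own.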

And then a post like Steven's basically comes from pulling up all
those from your memory, and maybe doing a spot of quick testing and/or
research to get some explanatory details. You don't have to be a
linguist, necessarily - just a competent debugger.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list
