Re: Unicode script

2016-12-17 Thread MRAB
//www.unicode.org/reports/tr24/ http://www.unicode.org/Public/UCD/latest/ucd/Scripts.txt http://www.unicode.org/Public/UCD/latest/ucd/ScriptExtensions.txt Interestingly, there's issue 6331 "Add unicode script info to the unicode database". Looks like it didn't make it into Pyth

Re: Unicode script

2016-12-15 Thread MRAB
24/ http://www.unicode.org/Public/UCD/latest/ucd/Scripts.txt http://www.unicode.org/Public/UCD/latest/ucd/ScriptExtensions.txt Interestingly, there's issue 6331 "Add unicode script info to the unicode database". Looks like it didn't make it into Python 3.6. https://bugs.python.org/i

Re: Unicode script

2016-12-15 Thread Terry Reedy
test/ucd/Scripts.txt http://www.unicode.org/Public/UCD/latest/ucd/ScriptExtensions.txt Interestingly, there's issue 6331 "Add unicode script info to the unicode database". Looks like it didn't make it into Python 3.6. https://bugs.python.org/issue6331 Opened in 2009 with patch a

Re: Unicode script

2016-12-15 Thread Terry Reedy
On 12/15/2016 11:53 AM, Steve D'Aprano wrote: Suppose I have a Unicode character, and I want to determine the script or scripts it belongs to. For example: U+0033 DIGIT THREE "3" belongs to the script "COMMON"; U+0061 LATIN SMALL LETTER A "a" belongs to the script "LATIN"; U+03BE GREEK SMALL LE

Re: Unicode script

2016-12-15 Thread MRAB
de.org/Public/UCD/latest/ucd/ScriptExtensions.txt Interestingly, there's issue 6331 "Add unicode script info to the unicode database". Looks like it didn't make it into Python 3.6.

Re: Unicode script

2016-12-15 Thread Joel Goldstick
I think this might be what you want: https://docs.python.org/3/howto/unicode.html#unicode-properties On Thu, Dec 15, 2016 at 11:53 AM, Steve D'Aprano wrote: > Suppose I have a Unicode character, and I want to determine the script or > scripts it belongs to. > > For example: > > U+0033 DIGIT THREE
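For context, here is a minimal sketch of what the standard-library unicodedata module does expose for the three example characters: the character name and the general category. Neither of these is the script property asked about; the loop and the choice of characters are only for illustration.

```python
import unicodedata

# The three characters from the original question.
examples = ("3", "a", "\u03be")

for ch in examples:
    # name() and category() are long-standing unicodedata functions;
    # they return the character name and the general category,
    # which are not the same thing as the script property.
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
```

The name sometimes starts with the script (as in GREEK SMALL LETTER XI), but that is a naming convention rather than a property lookup, and it does not help for characters such as DIGIT THREE.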

Re: Unicode script

2016-12-15 Thread eryk sun
On Thu, Dec 15, 2016 at 4:53 PM, Steve D'Aprano wrote: > Suppose I have a Unicode character, and I want to determine the script or > scripts it belongs to. > > For example: > > U+0033 DIGIT THREE "3" belongs to the script "COMMON"; > U+0061 LATIN SMALL LETTER A "a" belongs to the script "LATIN"; >

Unicode script

2016-12-15 Thread Steve D'Aprano
Suppose I have a Unicode character, and I want to determine the script or scripts it belongs to. For example: U+0033 DIGIT THREE "3" belongs to the script "COMMON"; U+0061 LATIN SMALL LETTER A "a" belongs to the script "LATIN"; U+03BE GREEK SMALL LETTER XI "ξ" belongs to the script "GREEK". Is
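
One possible approach, sketched under some assumptions: parse data in the format of the Scripts.txt file linked elsewhere in this thread and look the code point up in the resulting ranges. The function names parse_scripts and script_of are hypothetical, and SAMPLE is a handful of illustrative lines in the Scripts.txt layout (code point or range, semicolon, script name, trailing comment), not a verbatim copy of the file. In practice the full file would be read in place of SAMPLE.

```python
import bisect

# Illustrative lines in the Scripts.txt layout (assumption: a tiny
# hand-written stand-in for the real file; comments are simplified).
SAMPLE = """\
# code point(s) ; script # comment
0030..0039    ; Common # DIGIT ZERO..DIGIT NINE
0061..007A    ; Latin # LATIN SMALL LETTER A..LATIN SMALL LETTER Z
00AA          ; Latin # FEMININE ORDINAL INDICATOR
03B1..03C9    ; Greek # illustrative range covering GREEK SMALL LETTER XI
"""


def parse_scripts(text):
    """Return a sorted list of (first, last, script) tuples."""
    ranges = []
    for line in text.splitlines():
        # Drop the trailing comment and skip blank or comment-only lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        codes, script = (field.strip() for field in line.split(";"))
        # A line holds either a single code point or "first..last".
        first, _, last = codes.partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start
        ranges.append((start, end, script))
    ranges.sort()
    return ranges


def script_of(char, ranges, default="Unknown"):
    """Look up the script of a single character by binary search."""
    cp = ord(char)
    # Find the last range whose start is <= cp.
    i = bisect.bisect_right(ranges, (cp, 0x110000)) - 1
    if i >= 0 and ranges[i][0] <= cp <= ranges[i][1]:
        return ranges[i][2]
    return default


ranges = parse_scripts(SAMPLE)
for ch in ("3", "a", "\u03be"):
    print(f"U+{ord(ch):04X}", script_of(ch, ranges))
```

This only covers the single-valued Script property. A character that belongs to several scripts would need the ScriptExtensions.txt data as well, which this sketch does not handle.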