Re: Question about Perl5 extended UTF-8 design

2015-11-05 Thread Ilya Zakharevich
On Thu, Nov 05, 2015 at 08:57:16AM -0700, Karl Williamson wrote: Several of us are wondering about the reason for reserving bits for the extended UTF-8 in perl5. I'm asking you because you are the apparent author of the commits that did this. To start, the INTERNAL REPRESENTATION of Perl’s

Re: Combined Yorùbá characters with dot below and tonal diacritics

2015-04-12 Thread Ilya Zakharevich
On Sun, Apr 12, 2015 at 07:07:01AM +0200, Philippe Verdy wrote: 1) the characters required do not all exist as precomposed characters thus Microsoft's dead key sequences will not work for Yoruba. (As I said in my other email, the conclusion is wrong.) It's effectively a good catch that

Re: Combined Yorùbá characters with dot below and tonal diacritics

2015-04-12 Thread Ilya Zakharevich
On Sun, Apr 12, 2015 at 08:46:31AM +0200, Philippe Verdy wrote: Well that CPAN doc page is also full of junk, with considerations about a particular layout design for extended Latin, that should have been placed on a separate page for that layout. If you think it is junk, please write a

Re: Combined Yorùbá characters with dot below and tonal diacritics

2015-04-11 Thread Ilya Zakharevich
On Sat, Apr 11, 2015 at 01:19:23AM +0100, Luis de la Orden wrote: Thanks for challenging my understanding of dead keys. I have a layout in my Mac that works like a charm to write Yorùbá, Portuguese and Spanish with the UK layout. I am having trouble with the Windows layout and should have

Re: Combined Yorùbá characters with dot below and tonal diacritics

2015-04-11 Thread Ilya Zakharevich
On Sun, Apr 12, 2015 at 01:06:51PM +1000, Andrew Cunningham wrote: The problem with the approach documented below is twofold: 1) the characters required do not all exist as precomposed characters thus Microsoft's dead key sequences will not work for Yoruba. As I explained in my mail, this is

Re: Terms for rotations

2014-11-10 Thread Ilya Zakharevich
On Fri, Nov 07, 2014 at 02:39:58PM -0800, Garth Wallace wrote: I'm leaning towards turned, left rotated, and right rotated for the cardinal orientations, … Please keep in mind that left/right are especially bad terms to describe rotations. When you rotate the

Re: Terms for rotations

2014-11-10 Thread Ilya Zakharevich
On Mon, Nov 10, 2014 at 06:30:37PM +, Whistler, Ken wrote: Could the characters SWR2 to SWR8 be applied to chess symbols or should new rotation modifiers be created for them? They aren't currently defined to do so -- and there is certainly a danger in opening up the applicability to

Re: Rotations, SignWriting, and Mr Potato Head

2014-11-10 Thread Ilya Zakharevich
Oops, I forgot to update the subject, AND made a misprint. On Mon, Nov 10, 2014 at 02:01:09PM -0800, I wrote: See, for example, the Mr Potato Head font http://www.unicode.org/mail-arch/unicode-ml/y2014-m09/0003.html ; using the same principles, one could encode most (all?) of the hand

Re: Terms for rotations

2014-11-10 Thread Ilya Zakharevich
On Tue, Nov 11, 2014 at 12:43:05AM +0100, Jean-François Colson wrote: (I believe that people associate left ↔ counterclockwise etc only because for many shapes, visually, the bottom is just a pedestal for the top. So you “grab” the shape “on top”.) Look at this picture:

Re: Request for Information

2014-07-26 Thread Ilya Zakharevich
On Sat, Jul 26, 2014 at 09:26:21AM -0600, Doug Ewell wrote: It's a bit like the locale collections (CLDR is not alone here) that specify a single date format for an entire country, as if all Americans only ever write a short date as m/dd/yy and anyone who uses a different format is employing

Re: Unicode ranges with baseline/x-height/X-height

2014-05-15 Thread Ilya Zakharevich
On Thu, May 15, 2014 at 04:21:23PM +0600, Christopher Fynn wrote: Indic scripts generally have a hanging base Sure. And many mathematical symbols should have a “math-centerline base”. However, the font files I’m working with do not have the information about where these extra baselines are

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-25 Thread Ilya Zakharevich
On Wed, Apr 23, 2014 at 06:15:44PM -0700, Asmus Freytag wrote: On 4/23/2014 4:41 PM, Ilya Zakharevich wrote: GREED) Given any close-delimiter marked as “non-matching”, its pre-context does not contain any open-delimiter which could match it. Here pre-context

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-23 Thread Ilya Zakharevich
On Tue, Apr 22, 2014 at 09:06:27AM -0700, Asmus Freytag wrote: if you read UAX#9, the way the algorithm works is by pushing openers on a stack, then, on finding the first closer, going down the stack and attempting to locate a match, then, on finding a match, discarding any enclosed openers,
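The stack procedure quoted above can be sketched in a few lines of Python. This is only a minimal illustration of the description as quoted, not the normative UAX#9 rule: the function name `pair_brackets` and the three ASCII bracket pairs are assumptions made for the example, and details such as the stack size limit and canonically equivalent brackets are left out.

```python
def pair_brackets(text, pairs=None):
    """Return a sorted list of (opener_index, closer_index) pairs.

    Hypothetical helper: a simplified model of the stack-based pairing
    described in the quoted message, using ASCII brackets only.
    """
    if pairs is None:
        pairs = {"(": ")", "[": "]", "{": "}"}
    closers = set(pairs.values())
    stack = []   # entries are (expected_closer, index_of_opener)
    found = []
    for i, ch in enumerate(text):
        if ch in pairs:
            # Opener: remember which closer it is waiting for.
            stack.append((pairs[ch], i))
        elif ch in closers:
            # Closer: walk down the stack looking for a matching opener.
            for depth in range(len(stack) - 1, -1, -1):
                if stack[depth][0] == ch:
                    found.append((stack[depth][1], i))
                    # Discard the matched opener and any openers it encloses.
                    del stack[depth:]
                    break
            # If nothing matched, the closer stays unpaired and the
            # stack is left untouched.
    return sorted(found)


print(pair_brackets("a(b[c)d]"))
```

In the string `a(b[c)d]` the `)` pairs with the `(`, which discards the enclosed `[`; the final `]` then finds nothing on the stack and stays unmatched. Overlapping brackets of this kind are the non-hierarchical case raised elsewhere in the thread.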

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-23 Thread Ilya Zakharevich
On Wed, Apr 23, 2014 at 09:21:04AM -0700, Asmus Freytag wrote: a parsing is good if it satisfies all conditions below: 0) Some delimiters in the string are marked as “non-matching”; the rest is broken into disjoint “matched” pairs; MATCH) A “matched” pair consists of an

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-23 Thread Ilya Zakharevich
On Wed, Apr 23, 2014 at 06:25:53PM +0300, Eli Zaretskii wrote: I see nothing in your definition that is significantly different from our attempts. It does feel more complex, mainly because you have many more conditions, combining which in one's mind might not be easy at first reading.

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-22 Thread Ilya Zakharevich
On Mon, Apr 21, 2014 at 11:25:05PM -0700, Asmus Freytag wrote: On 4/21/2014 8:32 PM, Ilya Zakharevich wrote: On Mon, Apr 21, 2014 at 06:08:12PM -0700, Asmus Freytag wrote: Here's the text I supplied, with numbers added for discussion. It definitely needs some editing, but the point

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-22 Thread Ilya Zakharevich
On Tue, Apr 22, 2014 at 07:08:56PM +0300, Eli Zaretskii wrote: Sorry, I do not see any definition here. Just a collection of words which looks like a definition, but only locally… Any definition is just a collection of words, of course. Can you tell what is missing from this collection

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-21 Thread Ilya Zakharevich
On Mon, Apr 21, 2014 at 02:44:14PM -0700, Asmus Freytag wrote: On 4/21/2014 1:54 PM, Philippe Verdy wrote: My intent was not to demonstrate a bug in the algorithm, I have not even claimed that, but to make sure that (less common) usages of paired brackets that do not obey a pure hierarchy

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-21 Thread Ilya Zakharevich
On Mon, Apr 21, 2014 at 06:08:12PM -0700, Asmus Freytag wrote: Here's the text I supplied, with numbers added for discussion. It definitely needs some editing, but the point of the exercise would be to see what: 1. A bracket pair is a pair of characters consisting of an opening

23AF HORIZONTAL LINE EXTENSION: glyph or variation selector?

2014-04-02 Thread Ilya Zakharevich
Current (and 7.0.0-tobe) versions do not say much: 23AF HORIZONTAL LINE EXTENSION * used for extension of arrows x (vertical line extension - 23D0) If it is intended to be a variation selector (possibly prepended instead of appended!), then using it with ⇒ should give longer

HORIZONTAL SCAN LINEs

2014-04-02 Thread Ilya Zakharevich
The current version (and 7.0.0-tobe) describe them as: @ Scan lines for terminal graphics @+ The scan line numbers here refer to old, low-resolution technology for terminals, with only 9 scan lines per fixed-size character glyph. Even-numbered scan lines

Re: Emoji [And crash in the Web interface to the mailing list]

2014-04-02 Thread Ilya Zakharevich
On Wed, Apr 02, 2014 at 10:00:08AM -0700, James Lin wrote: Everyone can guess what are the following emoji that are used frequently in Japan: What makes you think so? I would not have the slightest clue what the intended meaning is… ヽ( ̄д ̄;)ノ - worried [I removed the rest since they crash the

Re: FYI: More emoji from Chrome

2014-04-01 Thread Ilya Zakharevich
On Tue, Apr 01, 2014 at 09:01:39AM +0200, Mark Davis ☕️ wrote: More emoji from Chrome: http://chrome.blogspot.ch/2014/04/a-faster-mobiler-web-with-emoji.html with video: https://www.youtube.com/watch?v=G3NXNnoGr3Y I do not know… The demos leave me completely unimpressed: emoji — by their

Re: How to remove accents while conforming to language standards?

2013-11-01 Thread Ilya Zakharevich
On Fri, Nov 01, 2013 at 07:32:44PM +0200, Jukka K. Korpela wrote: 2013-11-01 17:37, Jennifer Wong wrote: I would like to ask for advice on removing accents from characters. To address first the question you ask in the Subject line, “How to remove accents while conforming to language
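For readers who want to see what mechanical accent removal looks like, here is a minimal Python sketch using the standard `unicodedata` module. It is not necessarily what anyone in the thread recommends, and the function name `strip_accents` is made up for the example. It decomposes each character (NFD) and drops the combining marks, which is lossy and ignores language-specific conventions.

```python
import unicodedata


def strip_accents(text):
    """Hypothetical helper: decompose to NFD, then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns 0 for characters that are not
    # combining marks, so only base characters are kept.
    return "".join(ch for ch in decomposed if unicodedata.combining(ch) == 0)


print(strip_accents("Yorùbá"))
print(strip_accents("Søren"))
```

Letters such as ø have no canonical decomposition, so this approach leaves them unchanged; handling them needs a per-language transliteration table, which is where conformance to language standards actually comes in.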

Re: How to remove accents while conforming to language standards?

2013-11-01 Thread Ilya Zakharevich
On Sat, Nov 02, 2013 at 01:07:44AM +0100, Pierpaolo Bernardi wrote: On Sat, Nov 2, 2013 at 12:33 AM, Ilya Zakharevich nospam-ab...@ilyaz.org wrote: Given that the initial question was more or less explicitly formulated as “how to minimize the losses?”, The initial question

Re: New arrows?

2013-09-20 Thread Ilya Zakharevich
On Fri, Sep 20, 2013 at 02:34:19PM +0200, Jean-François Colson wrote: Hello Trying to solve a 2 × 2 × 2 Rubik’s cube, I was looking for some help on the web when I found a handful of unknown arrows. I had never seen them before, but I understood their meaning at first sight. Since they are

Re: Origin of Ellipsis (was: RE: Empty set)

2013-09-15 Thread Ilya Zakharevich
On Sun, Sep 15, 2013 at 09:21:47PM +0200, Philippe Verdy wrote: If there's something to do now (given it is no longer used in CJK contexts), it's to strongly recommend that fonts map them to exactly the same glyph as the one obtained by aligning three periods in a row without any additional

Re: Origin of Ellipsis and double spacing after a sentence.

2013-09-14 Thread Ilya Zakharevich
On Sat, Sep 14, 2013 at 08:19:54PM +0100, Michael Everson wrote: And as a book designer and publisher, I think that having large spaces after a full stop is both unnecessary and vulgar. As a book consumer, I know that having somewhat larger space after end-of-sentence is a MUST (at least for

Re: Empty set

2013-09-12 Thread Ilya Zakharevich
On Thu, Sep 12, 2013 at 09:06:54PM +0300, Jukka K. Korpela wrote: And below the university level Germans write { }, which I like better. The notation { } is quite correct. IMO, in math texts the correctness is significantly less important than being not ambiguous. (It is practically

Re: Can a single text document use multiple character encodings?

2013-08-30 Thread Ilya Zakharevich
On Wed, Aug 28, 2013 at 07:07:23PM +, Costello, Roger L. wrote: For example, can some text be encoded as UTF-8 while other text is encoded as UTF-16 - within the same document? I think it is a very interesting question. A Perl program is (obviously) a text document. On the other hand,

Re: Ways to show Unicode contents on Windows?

2013-07-29 Thread Ilya Zakharevich
On Wed, Jul 10, 2013 at 04:24:36AM +, Murray Sargent wrote: Ilya asked, Are there any other ways to show Unicode on Windows? You can download Unibook (http://www.unicode.org/unibook/) and set up your fonts for the ranges. That's the way The Unicode Standard code charts are displayed

Re: Ways to show Unicode contents on Windows?

2013-07-29 Thread Ilya Zakharevich
On Fri, Jul 19, 2013 at 03:19:44AM +, Peter Constable wrote: Why would one NEED to upgrade the OS to use Old Italic? You can't expect an OS like Windows XP to support Old Italic characters that weren't even defined in Unicode at the time it shipped. Yes I can. And did you not notice

Re: Ways to show Unicode contents on Windows?

2013-07-29 Thread Ilya Zakharevich
On Fri, Jul 19, 2013 at 08:28:07AM +0100, Richard Wordingham wrote: Just in case: do you realize that out-of-BMP must be specified via LIGATURES section? Yes, for 'character' read UTF-16 code element. Even worse, you can't use dead keys outside the BMP, which prevents one using MSKLC for
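The remark that "character" here means a UTF-16 code element can be made concrete with a small Python sketch. The function name `utf16_code_units` is an assumption made for the example; the arithmetic is the standard UTF-16 surrogate-pair encoding. A code point outside the BMP becomes two 16-bit code units, which is presumably why a tool that handles one code unit per key treats it as a multi-unit "ligature".

```python
def utf16_code_units(code_point):
    """Hypothetical helper: return the UTF-16 code units for one code point."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if code_point < 0x10000:
        # BMP: a single 16-bit code unit.
        return [code_point]
    # Supplementary planes: split the 20-bit offset into two 10-bit halves.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)
    low = 0xDC00 + (offset & 0x3FF)
    return [high, low]


# U+10300 OLD ITALIC LETTER A lies outside the BMP.
print([hex(u) for u in utf16_code_units(0x10300)])
```

For U+10300 this gives the pair D800 DF00, while a BMP character such as U+00E9 stays a single unit.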

Re: symbols/codepoints for necessity and possibility in modal logic

2013-07-29 Thread Ilya Zakharevich
On Fri, Jul 19, 2013 at 04:00:26AM -0700, Stephan Stiller wrote: I'd consider U+22C4 DIAMOND OPERATOR as wrong because it is used as a binary operator I'm thinking that binary infix symbol and unary prefix symbol might be appropriate terminology too. which has a very different spacing than

Re: Ways to show Unicode contents on Windows?

2013-07-18 Thread Ilya Zakharevich
On Wed, Jul 17, 2013 at 12:04:10AM +0100, Richard Wordingham wrote: (LCID); I don't see any way to check what the general .klc file format is - the format seemed very delicate when I had to edit it by hand, at least, not for the SMP. I wonder whether this link is relevant to what you discuss:

Re: Ways to show Unicode contents on Windows?

2013-07-18 Thread Ilya Zakharevich
On Fri, Jul 12, 2013 at 04:46:23AM +, Peter Constable wrote: Your Tai Tham situation is, of course, exceptional. For a lot of users, though, if they would only update their XP machines to even Windows 7, if not Windows 8.1, they'd find a lot of characters they've been missing are well

Re: Ways to show Unicode contents on Windows?

2013-07-10 Thread Ilya Zakharevich
On Wed, Jul 10, 2013 at 05:15:51AM +, Murray Sargent wrote: A bulk approach works. The hyperlink gives full instructions on how to set up the fonts. You can customize it by changing the fonts listed in default.cfl. Thanks. ——

Ways to show Unicode contents on Windows?

2013-07-09 Thread Ilya Zakharevich
I put some info on the subject into http://search.cpan.org/~ilyaz/UI-KeyboardLayout/lib/UI/KeyboardLayout.pm#There_is_no_way_to_show_Unicode_contents_on_Windows Are there any other ways to show Unicode on Windows? (Here I mean showing “real” Unicode, not the subset of Unicode Microsoft decided

Re: Ways to show Unicode contents on Windows?

2013-07-09 Thread Ilya Zakharevich
On Wed, Jul 10, 2013 at 04:24:36AM +, Murray Sargent wrote: Ilya asked, Are there any other ways to show Unicode on Windows? You can download Unibook (http://www.unicode.org/unibook/) and set up your fonts for the ranges. That's the way The Unicode Standard code charts are displayed

Re: Latvian and Marshallese Ad Hoc Report (cedilla and comma below)

2013-06-26 Thread Ilya Zakharevich
On Thu, Jun 20, 2013 at 09:27:49AM +0100, Michael Everson wrote: On 19 Jun 2013, at 18:24, Richard Wordingham richard.wording...@ntlworld.com wrote: The X11 restriction of one character per key stroke is not so easy to get round. Get them to fix X11. It looks like you think that X11 is

Re: Proposed Update UTS #18, Unicode Regular Expressions

2013-04-21 Thread Ilya Zakharevich
On Fri, Apr 19, 2013 at 01:29:07PM -0700, announceme...@unicode.org wrote: UTS #18, Unicode Regular Expressions, is being updated to bring it into alignment with Unicode 6.3. [This comment is not on the updates, but on the base text of #18.] Sec3.2 says: For example, an implementation

F A A on this mailing list

2013-02-10 Thread Ilya Zakharevich
[This is my first posting to this newsgroup] As part of a separate project, I gathered a collection of Fabulously Attractive Answers on this mailing list.

On European keyboard

2013-02-10 Thread Ilya Zakharevich
[This is my second posting to this newsgroup] I think some people may find this keyboard layout useful — although its documentation is somewhat uneven nowadays. It is the principal example of possibilities of the toolset UI::KeyboardLayout for designing quality keyboard layouts.