> Date: Tue, 11 Oct 2011 10:53:39 +0900
> From: "Martin J. Dürst" <due...@it.aoyama.ac.jp>
> CC: li bo <libo....@gmail.com>, unicode@unicode.org
>
> I might add here that 'break a line' in the Bidi algorithm is done
> before actual reordering (which is done line-by-line), but after
> calculating all the levels.

Please be aware that this separation of the UBA into phases makes no sense at all in the context of the Emacs display engine. The UBA is written from the POV of batch processing of a block of text -- you pass in a string in logical order, and receive a reordered string in return. The UBA describes the processing as a series of phases, each one of which is completed for all the characters in the block of text before the next phase begins.

By contrast, the Emacs display engine examines the text to display one character at a time. For each character, it loads the necessary display and typeface information, and then decides whether the character will fit on the display line. Then it examines the next character, and so on. It should be clear that processing characters one by one completely disrupts the subdivision of the UBA into phases that examine more than that single character, let alone decisions about where to break the line, because reordering can no longer be done "line by line".

Let me give you just one example: if a character should be mirrored, you cannot decide whether it fits on the display line until _after_ you know what its mirrored glyph looks like. But mirroring is only resolved at a very late stage of reordering, so if you want to reorder _after_ breaking into display lines, you will have to back up and reconsider that decision after reordering, which will slow you down.

Given these considerations, it is small wonder that the UBA implementation inside Emacs is _very_ different from the description in UAX#9. Therefore, the subdivision into line-level and higher-level phases makes very little sense here, since the implementation needed to produce an identical result while performing significant surgery on the algorithm description. In effect, the UBA implementation in Emacs treats UAX#9 as a set of requirements, not as a high-level description of the implementation.
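
To make the mirroring example concrete, here is a toy sketch in Python. It is not the Emacs code and not a UBA implementation: the names `MIRRORED`, `GLYPH_WIDTH`, `resolved_glyphs` and `fill_line` are made up for illustration, the embedding levels are simply handed in by the caller, no reordering is performed, and the glyph widths are invented. It only shows that the fit decision has to use the width of the glyph that will actually be displayed, so mirroring must already be resolved when each character is measured.

```python
# Toy model only: none of these names come from Emacs or from UAX#9.

# A tiny, hypothetical subset of mirrored pairs.
MIRRORED = {"(": ")", ")": "(", "[": "]", "]": "[", "<": ">", ">": "<"}

# Invented glyph widths in columns; a real engine would ask the font.
# ")" is made wider than "(" so that mirroring changes the fit decision.
GLYPH_WIDTH = {"(": 1, ")": 2}


def glyph_width(glyph):
    """Width of a glyph in columns (1 unless listed in GLYPH_WIDTH)."""
    return GLYPH_WIDTH.get(glyph, 1)


def resolved_glyphs(text, levels):
    """Yield (glyph, width) one character at a time.

    Assumption: `levels` holds an already-resolved embedding level per
    character. A character at an odd (right-to-left) level that has a
    mirrored counterpart is displayed with the mirrored glyph.
    """
    for ch, level in zip(text, levels):
        glyph = MIRRORED.get(ch, ch) if level % 2 == 1 else ch
        yield glyph, glyph_width(glyph)


def fill_line(text, levels, max_width):
    """Consume glyphs one by one until the next one would not fit."""
    line, used = [], 0
    for glyph, width in resolved_glyphs(text, levels):
        if used + width > max_width:
            break  # this glyph starts the next display line
        line.append(glyph)
        used += width
    return "".join(line), used


# Same text and same line width; only the levels differ.
ltr = fill_line("a(b", [0, 0, 0], 3)
rtl = fill_line("a(b", [1, 1, 1], 3)
```

In this sketch the left-to-right case fits all three characters, while the right-to-left case stops after two, because the mirrored parenthesis is wider. Had the line been broken using the unmirrored widths, the decision would have to be revisited once mirroring was resolved.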