Users just type what gives them the correct appearance. (In Arabic, that
infamously includes typing the wrong character, because it happens to
look correct and is on your regional keyboard.)
The question then is "what software processes are unavoidable and known
to interfere with this user choice" for arrows in a bidirectional context?
Any proposal would need to be based on a careful vetting of scenarios,
like the one given below (where "->" is turned into an arrow character)
to see whether there's enough of an issue there and whether addressing
it with character coding is the right answer (or perhaps the only answer).
Without a solid body of evidence of where the current approach is
failing for lack of a solution that requires new characters, the issue
remains stuck in the category of "good idea": something that looks like
it might be useful, but with obvious complications that would make it a
terrible idea unless these are outweighed by real, practical and
demonstrable gains (and for which no other alternative exists).
Even then, the problem with encoding duplicate characters based on
layout properties is that "users just type what gives them the correct
appearance" at the time they enter the character. The only context a
user has is the text being typed. If that happens to give the correct
direction, a user wouldn't know to shift to a different character, just
in case the context might change.
If replacing "->" by an arrow character can change its direction, isn't
it up to the autocorrect software to analyze the bidi context and select
the correct arrow? The rule should be to select whatever substitution
gives the same appearance (direction) as what the user would see for the
string they typed.
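
As a rough illustration of that rule, here is a minimal Python sketch. It
is not how any existing autocorrect works; the function names and the
"base" parameter are hypothetical. It is also a deliberate simplification
of the Unicode Bidi algorithm: it only looks at the nearest strong
characters (classes L, R, AL) on either side, falls back to an assumed
base direction when they disagree, and ignores numbers, embeddings,
isolates and overrides.

```python
import unicodedata

def strong_direction(ch):
    """Return 'rtl' or 'ltr' for a strong character, else None."""
    bidi_class = unicodedata.bidirectional(ch)
    if bidi_class in ("R", "AL"):
        return "rtl"
    if bidi_class == "L":
        return "ltr"
    return None

def nearest_strong(chars, default):
    """Direction of the first strong character in chars, or default."""
    for ch in chars:
        direction = strong_direction(ch)
        if direction is not None:
            return direction
    return default

def replace_ascii_arrows(text, base="ltr"):
    """Replace "->" and "<-" with the arrow that keeps the visual direction.

    `base` is the assumed paragraph direction (hypothetical parameter).
    """
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in ("->", "<-"):
            before = nearest_strong(reversed(text[:i]), base)
            after = nearest_strong(text[i + 2:], base)
            # Simplified neutral resolution: same direction on both
            # sides wins, otherwise use the base direction.
            run = before if before == after else base
            # "->" appears to point right in an LTR run and left in an
            # RTL run; "<-" is the opposite.
            points_right = (pair == "->") == (run == "ltr")
            out.append("\u2192" if points_right else "\u2190")
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

With this sketch, "A->B" becomes "A→B", while "א->ב" becomes "א←ב",
which should display with the same direction as the ASCII original did
in its right-to-left run. Where the neighbors disagree, the result
depends entirely on the assumed base direction.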
A./
On 4/8/2025 9:33 AM, Mark E. Shoulson via Unicode wrote:
My initial reaction on reading the subject was "*eyeroll* like we need
MORE arrow characters!" But then again, there is some point to these
arrows (sorry). I do feel like there are already _so many_ arrow
characters that duplicating all the ones with a horizontal component
to have a mirrored version would be a bit much, but there does seem to
be some utility in what is being proposed here. Naturally, this makes
me think, "well, how about we just make a _few_ such duplicates?" but
that's a slippery slope and will only lead to people protesting "But
there's a mirrored →, why can't I have a mirrored ⇰???" Not sure what
the best answer is. (Unless maybe mirrored characters were a Bad Idea
to start with?)
Here's a possibly disastrous idea: arrows mirror when they are within
the domain of a Directional Override character (U+202D, U+202E). This
would entail creating a new category of character which is subject to
this optional mirroring behavior, which then might be applied to other
characters (hmm, like some emoji, to get people running to the left or
something?) and I get the feeling that anything that touches the BiDi
algorithm might just be asking for trouble.
A similar[ly bad] idea might be to have markup-type characters,
something like <MIRRORED SELECTOR> or some such, to indicate that an
attached character should be mirrored (or a pair of them that indicate
direction).
I don't even want to know about handling this in TTB contexts...
~mark
On 4/8/25 10:34 AM, NeatNit via Unicode wrote:
Hi, I hope this is the right place to bring this up. I could not find
any discussions on this other than the document I quote.
Quick intro: characters with the property Bidi_Mirrored=Yes will be
visually mirrored within RTL text, such as Hebrew or Arabic. An easy
example is the Greater Than symbol: A>B and א>ב.
Arrow characters do not have this property: A→B but א→ב.
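
For anyone who wants to check the property values themselves, a small
Python sketch using the standard unicodedata module follows. The helper
name is_bidi_mirrored is made up for this example; the values reported
come from whatever Unicode version the Python build ships with.

```python
import unicodedata

def is_bidi_mirrored(ch: str) -> bool:
    """True if the character has Bidi_Mirrored=Yes in Python's Unicode data."""
    return unicodedata.mirrored(ch) == 1

# Compare two mirrored symbols with the two basic horizontal arrows.
for ch in (">", "(", "\u2192", "\u2190"):
    value = "Yes" if is_bidi_mirrored(ch) else "No"
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: Bidi_Mirrored={value}")
```

This should report Yes for GREATER-THAN SIGN and LEFT PARENTHESIS, and
No for RIGHTWARDS ARROW and LEFTWARDS ARROW.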
I've found this discrepancy mentioned in this document:
https://www.unicode.org/L2/L2022/22026r-non-bidi-mirroring.pdf
    "In particular, arrow and arrow-like characters each often has a
    mirror character. One could argue that they should have had the
    Bidi_Mirrored=Yes property value, but they don’t, and cannot now
    get that."
Even if it weren't for Unicode's stability policies, there are two
distinct usages of arrow symbols:
1. To indicate directions, e.g. "Turn left (←) and then right (→)" - in
   this case the arrow refers to the physical direction and should not
   be mirrored in RTL. The existing arrow characters serve this purpose
   well: "פנה שמאלה (←) ואז ימינה (→)"
2. As an operator: "Convert A->B and assign C<-D" - in this case the
   arrow direction should be mirrored if it appears in RTL text.
   Currently this can only be emulated with ASCII "->" as I've just
   demonstrated. Result: "המר א->ב וקבע ג<-ד".
Therefore I think there should be new characters, "Forward Arrow" and
"Backward Arrow", to serve the latter case. They would use the same
glyphs as existing arrows, but have the Bidi_Mirrored=Yes property.
Please let me know if this is likely to happen, and what I would have
to do to make a proper proposal. And if any of you are convinced
enough that you would like to make a proposal on my behalf, you are
welcome to do so!
The same reasoning can be applied to many other characters besides
these basic arrows. At minimum, all arrow and arrow-like characters
should be included. I haven't made a thorough search to find all
affected characters, at least not yet.
Note that some software, such as the Discourse forum software,
converts "->" to "→" in user content, obviously unaware of this issue.
These proposed bidi-mirrored arrow characters would be an appropriate
replacement in such cases. Today, that simple search-and-replace must
be replaced with parsing the text using the full Unicode Bidi
algorithm to select the correct arrow, and even then some cases would
be impossible to determine without knowing the base direction or more
context which is not always available.
Awaiting your comments.
Thanks,
Nitai