Max Nikulin <maniku...@gmail.com> writes:

>>> The good point in your patch is that \- is still work as shy hyphen
>>> (that, by the way, may be used in some cases instead of zero width
>>> space: *intra*\-word). On the other hand I have managed to find a case
>>> when your approach is not ideal:
>>>
>>> *\--scratch\--*
>>>
>>> <p>
>>> <b>­-scratch</b></p>
>>
>> Well. I think that it is impossible to use the same escape construct to
>> both force emphasis and escape it.
>
> Let's articulate the problem as follows: when some characters ("*". "/".
> etc.) besides used literally are overloaded with 2 additional roles that
> are start emphasis group and terminate emphasis group, in addition to
> lightweight markup heuristics, it is necessary to provide a way to
> disambiguate which of 3 roles is associated with particular character.
>
> "Activate" and "deactivate" characters or entities for emphasis markers
> are alternative and perhaps not so clear terms have used before.
>
> The advantage of zero width space is that "[:space:]" is part of
> PREMATCH and POSTMATCH (outer) regexps in
> `org-emphasis-regexp-components' and "[:space:]" is forbidden at the
> inner borders of emphasized span of text. The latter is mostly
> meaningful, however I am unsure if bold space has the same width as
> regular one, and space in fixed width font is certainly distinct.
>
> The problem with the "\--" entity is that it is not handled properly at
> the start of emphasis region. It neither disables emphasis nor parsed as
> complete entity, instead it becomes combination of "\-" shy hyphen and
> literal "-".
>
> Unsure if it can be solved consistently. Possible ways:
> - It addition to space-like (in respect to current regexp) entity add
>   another one that acts as a part of word, but like "\--" stripped from
>   output. Likely it should be accompanied by more changes in the parser
>   and regexps.
> - Provide some new explicit syntax for literal character, start of
>   emphasis group, end of emphasis group.
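To make the PREMATCH/POSTMATCH/border point concrete, here is a rough
Python model of the rule for the "*" marker only. It is a sketch, not
the actual Org parser: the PRE and POST character sets are what I
believe the defaults of `org-emphasis-regexp-components' to be (treat
them as an assumption), U+200B is added to the whitespace class by hand
because Python's \s does not match it, and `bold_spans' is a
hypothetical helper name.

```python
import re

ZWSP = "\u200b"
# Stand-in for Emacs "[:space:]"; zero width space is added explicitly.
SPACE = r"\s" + ZWSP

# Assumed default outer character sets (besides whitespace).
PRE_CHARS = "-('\"{"
POST_CHARS = "-.,:!?;'\")}\\["


def _cls(extra=""):
    """Character class: whitespace plus the given literal characters."""
    return "[" + SPACE + re.escape(extra) + "]"


BOLD = re.compile("".join([
    "(?:^|" + _cls(PRE_CHARS) + ")",   # outer left: line start or PRE char
    r"\*",
    "(?!" + _cls() + ")",              # inner left border: no whitespace
    "(.+?)",
    "(?<!" + _cls() + ")",             # inner right border: no whitespace
    r"\*",
    "(?=$|" + _cls(POST_CHARS) + ")",  # outer right: line end or POST char
]))


def bold_spans(text):
    """Return the contents of every span this model treats as bold."""
    return [m.group(1) for m in BOLD.finditer(text)]
```

In this model "intra*word*s" yields no bold span, while the same text
with a zero width space on each outer side of the markers yields "word",
and "* not bold*" is rejected because of the inner border rule.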
\-- was not parsed in your example because entities cannot be directly
followed by a letter (see 12.4 Special Symbols). You need

*\--{}scratch\--*

Concerning the 3 listed roles of the *_/+ markup, I propose to simplify
the problem a bit and not try to make \-- serve as a proper escape
symbol. Instead, we can document the already existing quoting entities:

("slash" "/" nil "/" "/" "/" "/")
("plus" "+" nil "+" "+" "+" "+")
("under" "\\_" nil "_" "_" "_" "_")
("equal" "=" nil "=" "=" "=" "=")
("star" "\\star" t "*" "*" "*" "⋆")

Then, your example is better written as

\star{}scratch\star

\-- may work better between markup, not inside.

> Concerning zero width space workaround, I may be wrong, but Nicolas
> might consider using U+200B zero width space as the escape character for
> itself: single one is filtered out during export, double zero width
> space becomes single character. (I do not like this kind of "white
> space" programming language".)

This is too complex, IMHO. If desired, we can again go the entity road
and introduce \zws entity. Note that we already have

("nbsp" "~" nil " " " " " " " ")
("ensp" "\\hspace*{.5em}" nil " " " " " " " ")
("emsp" "\\hspace*{1em}" nil " " " " " " " ")
("thinsp" "\\hspace*{.2em}" nil " " " " " " " ")

Generally, it is a good idea to advertise entities in the manual.
Zero-width space is not only limited; in some places it is impossible
to use, e.g. in tables when you want to quote "|". The only solution
there is using the \vert or \vbar entity.

> Another question is whether U+2060 word
> joiner (or some other character) should be added either as alternative
> to zero width space or to allow = verbatim = fixed width text
> surrounded by fixed width spaces.

This particular example is tricky. If we put an escape symbol _inside_
the verbatim, it is never possible to know if the user intends to use
that symbol literally or not. But non-space before/after
opening/closing markup char is hard-coded and changing it is fragile.
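Going back to the entity rule from the top of this message (an entity
name cannot be directly followed by a letter, hence the {}), the
behaviour can be sketched in Python roughly as follows. This is only an
illustration under assumptions: the table holds just the last (UTF-8)
field of the five tuples listed above, the regexp is a simplification of
what Org really uses, and `expand' is a hypothetical name.

```python
import re

# Last field (UTF-8 replacement) of the quoting entities listed above.
ENTITIES = {"slash": "/", "plus": "+", "under": "_", "equal": "=", "star": "⋆"}

# A name is a run of letters terminated either by an explicit "{}"
# (which is consumed) or by anything that is not a letter.
ENTITY_RE = re.compile(r"\\([A-Za-z]+)(?:\{\}|(?![A-Za-z]))")


def expand(text):
    """Replace known entities; leave unknown names untouched."""
    def repl(match):
        name = match.group(1)
        if name in ENTITIES:
            return ENTITIES[name]
        return match.group(0)
    return ENTITY_RE.sub(repl, text)
```

With this model, \star{}scratch\star has both entities replaced, while
\starscratch is read as one unknown name "starscratch" and is left
untouched, which is the same kind of failure as in the \--scratch case.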
Instead of using some kind of "escape" symbol here, I suggest turning to
the idea about inline special blocks. We can introduce a more verbose
markup that will allow spaces inside at the beginning/end of the
contents.

https://orgmode.org/list/87a6b8pbhg....@posteo.net
Manuel Macías [ML:Org mode] (2022) About 'inline special blocks'

Instead of using the tricky *bold text*, we may allow _*{bold text}*_ or
something similar, with _name{...}name_ being inline special block.

Best,
Ihor