Hi Aaron,

On Tue, 2014-09-23 at 14:15 -0400, Aaron Ecay wrote:
> org-emphasis-regexp-components is known to be a wart. You can search
> for posts on the mailing list. Some people are trying to figure out how
> to get rid of it. (You can search in particular for Nicolas Goaziou’s
> posts...) Here’s one thread where you can see the lay of the land:
> <http://mid.gmane.org/87zjl6ktu2....@gmail.com>.

Thank you for the background info!

> All that to say, the longer-term solution is to figure out some radically
> different approach. In the meantime though, if you can provide a list of
> characters (by unicode name and/or code point) that you think should be
> added to that variable, someone might be able to add them.

I guess the straightforward way of defining white-space would be just using the set of characters with the Unicode property WSpace=Y, and this would be what «[:space:]», «\s», etc., should be expected to match on Unicode-based locales. I’m supplying a list of code-points below, for convenience.

I agree though that defining what counts as «white space» within the confines of org-mode is putting the cart before the horse. I’ll try to ascertain whether the Emacs implementation of «[:space:]» really only does 8-bit spaces, and if so I’ll see whether I can poke someone on the Emacs bug tracker about this.

Best regards,

T.

──────────────────────────────────────────────────────────────────────
List of Unicode white-space

Below is the list of characters with the property White_Space set, taken from the Unicode 7.0.0 character database. This includes line-breaking white-space such as «line feed». If these are not relevant, one can use the subset of space separators (Zs; these do not include control characters such as Tab) and control chars (Cc).

0009..000D    ; White_Space # Cc   [5] <control-0009>..<control-000D>
0020          ; White_Space # Zs       SPACE
0085          ; White_Space # Cc       <control-0085>
00A0          ; White_Space # Zs       NO-BREAK SPACE
1680          ; White_Space # Zs       OGHAM SPACE MARK
2000..200A    ; White_Space # Zs  [11] EN QUAD..HAIR SPACE
2028          ; White_Space # Zl       LINE SEPARATOR
2029          ; White_Space # Zp       PARAGRAPH SEPARATOR
202F          ; White_Space # Zs       NARROW NO-BREAK SPACE
205F          ; White_Space # Zs       MEDIUM MATHEMATICAL SPACE
3000          ; White_Space # Zs       IDEOGRAPHIC SPACE
──────────────────────────────────────────────────────────────────────
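
For anyone who wants to experiment with the list above outside Emacs, here is a minimal sketch in Python that turns the code-point ranges into an explicit regular-expression character class. The names WHITE_SPACE_RANGES, white_space_chars, white_space_class and WS_RE are made up for this illustration; none of this is org-mode or Emacs code, and it says nothing about how «[:space:]» is actually implemented there.

```python
import re

# Code-point ranges copied from the White_Space list above (Unicode 7.0.0).
# Each entry is an inclusive (first, last) pair.
WHITE_SPACE_RANGES = [
    (0x0009, 0x000D),  # control characters: tab, LF, VT, FF, CR
    (0x0020, 0x0020),  # SPACE
    (0x0085, 0x0085),  # <control-0085>
    (0x00A0, 0x00A0),  # NO-BREAK SPACE
    (0x1680, 0x1680),  # OGHAM SPACE MARK
    (0x2000, 0x200A),  # EN QUAD..HAIR SPACE
    (0x2028, 0x2028),  # LINE SEPARATOR
    (0x2029, 0x2029),  # PARAGRAPH SEPARATOR
    (0x202F, 0x202F),  # NARROW NO-BREAK SPACE
    (0x205F, 0x205F),  # MEDIUM MATHEMATICAL SPACE
    (0x3000, 0x3000),  # IDEOGRAPHIC SPACE
]


def white_space_chars(ranges=WHITE_SPACE_RANGES):
    """Expand the inclusive ranges into a flat list of characters."""
    return [chr(cp) for first, last in ranges for cp in range(first, last + 1)]


def white_space_class(ranges=WHITE_SPACE_RANGES):
    """Build a character class such as [\\u0009-\\u000D\\u0020...] as pattern text."""
    parts = []
    for first, last in ranges:
        if first == last:
            parts.append("\\u%04X" % first)
        else:
            parts.append("\\u%04X-\\u%04X" % (first, last))
    return "[" + "".join(parts) + "]"


# Compiled pattern matching exactly one character from the list.
WS_RE = re.compile(white_space_class())

if __name__ == "__main__":
    print(len(white_space_chars()), "characters in the list")
    # Split on a NARROW NO-BREAK SPACE and an EM SPACE.
    print(WS_RE.split("a\u202fb\u2003c"))
```

Spelling the ranges out, rather than relying on a regex engine's own «\s», keeps the set identical to the list regardless of how a given engine defines white space.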