Re: [O] Embedded LaTeX does not work with Unicode quotes

2014-11-13 Thread Nicolas Goaziou
Hello,

Florian Beck f...@miszellen.de writes:

> Nick Dokos ndo...@gmail.com writes:
>
>> punctuation in the syntax tables. Look for org-latex-regexps in
>> org.el
>
> The line in question is
>
> #+BEGIN_SRC emacs-lisp
> ("$" "\\([^$]\\|^\\)\\(\\(\\$\\([^ \t\r\n,;.$][^$\n\r]*?\\(\n[^$\n\r]*?\\)\\{0,2\\}[^ \t\r\n,.$]\\)\\$\\)\\)\\([- \t.,?;:'\")\000]\\|$\\)" 2 nil)
> #+END_SRC
>
> It's probably not too hard to see that the culprit is the bunch of
> punctuation characters towards the end. Indeed if you change .,?;:'\"
> to .,?;:'\"” -- that solves the OP's problem. However, it might be even
> better to use a more general syntax, [:punct:], which matches all
> punctuation (as we want). So:
>
> #+BEGIN_SRC emacs-lisp
> ("$" "\\([^$]\\|^\\)\\(\\(\\$\\([^ \t\r\n,;.$][^$\n\r]*?\\(\n[^$\n\r]*?\\)\\{0,2\\}[^ \t\r\n,.$]\\)\\$\\)\\)\\([- \t[:punct:]\000]\\|$\\)" 2 nil)
> #+END_SRC

Actually this variable is hardly used throughout the Org code base. See
org-element-latex-fragment-parser instead (which has the same problem
anyway).

Also, according to the Elisp manual, [:punct:] is not ideal either:

  `[:punct:]'
   This matches any punctuation character.  (At present, for multibyte
   characters, it matches anything that has non-word syntax.)

There is also "\s.", which matches any character whose syntax class is
punctuation.
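
To see how the two constructs treat a given character, one can evaluate
something like the following in *scratch*. This is only a sketch: it
runs in a temporary buffer with the standard syntax table, which may
differ from the table in an Org buffer, so the results are a hint rather
than a verdict.

#+BEGIN_SRC emacs-lisp
(with-temp-buffer
  (insert "”")                       ; U+201D, the character from the report
  (goto-char (point-min))
  (list (looking-at-p "[[:punct:]]") ; the character class
        (looking-at-p "\\s.")        ; the punctuation syntax class
        ;; Syntax class of the character, as a one-character string.
        (string (char-syntax (char-after)))))
#+END_SRC

Replacing ” with other candidates (say, a letter or a digit) shows
where the two constructs start to disagree, which is one way to collect
the false positives mentioned here.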

Anyway, it might be better to know exactly what kind of false positives
we want to avoid.


Regards,

-- 
Nicolas Goaziou



Re: [O] Embedded LaTeX does not work with Unicode quotes

2014-11-12 Thread Marcin Borkowski

On 2014-11-12, at 07:05, Nick Dokos wrote:

> Marcin Borkowski mb...@wmi.amu.edu.pl writes:
>
>> Hi list,
>>
>> I have this: „$n\eps\le b$”, and it seems not to be recognized as a
>> LaTeX fragment.  The manual says:
>>
>> To avoid conflicts with currency specifications, single `$' characters
>> are only recognized as math delimiters if the enclosed text contains at
>> most two line breaks, is directly attached to the `$' characters with no
>> whitespace in between, and if the closing `$' is followed by whitespace,
>> punctuation or a dash.
>>
>> When I C-u C-x = on the closing quote, I get
>>
>> ...
>> syntax: . which means: punctuation
>> ...
>>
>> so I don't know why it is not recognized as punctuation.  Consequently,
>> it is exported verbatim (with `\$') into LaTeX, and also (obviously) C-c
>> C-x C-l does not fontify it.  When I change ” into " (the ASCII #x22
>> quote), everything is ok.
>
> The $...$ construct is recognized by a regexp which, while complicated,
> is not complicated enough to recognize everything that's marked
> punctuation in the syntax tables. Look for org-latex-regexps in org.el
> (and note that the regexp for $ is about twice as long as the next
> longest regexp - the one for begin). The others (for \(...\), \[...\]
> and $$..$$) are fairly trivial.
>
>> My questions:
>>
>> 1. Isn't it a bug?
>
> Yes, probably - but looking at the regexp, I cringe: I don't want to even
> try deciphering it, let alone change it - life's too short...

Ah, regex.  I have no more questions...

>> 2. If not, what can I do in my config so that it is recognized
>> properly?
>>
>> PS. I just recalled that using \(...\) should help, and indeed it does.
>> Still, I'm curious about the answer to my questions (now that I
>> remembered a workaround, especially #1).
>
> That is indeed the best solution.

Yep.

Thanks!

-- 
Marcin Borkowski
http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski
Adam Mickiewicz University



Re: [O] Embedded LaTeX does not work with Unicode quotes

2014-11-12 Thread Florian Beck
Nick Dokos ndo...@gmail.com writes:

> punctuation in the syntax tables. Look for org-latex-regexps in
> org.el

The line in question is

#+BEGIN_SRC emacs-lisp
("$" "\\([^$]\\|^\\)\\(\\(\\$\\([^ \t\r\n,;.$][^$\n\r]*?\\(\n[^$\n\r]*?\\)\\{0,2\\}[^ \t\r\n,.$]\\)\\$\\)\\)\\([- \t.,?;:'\")\000]\\|$\\)" 2 nil)
#+END_SRC

It's probably not too hard to see that the culprit is the bunch of
punctuation characters towards the end. Indeed if you change .,?;:'\"
to .,?;:'\"” -- that solves the OP's problem. However, it might be even
better to use a more general syntax, [:punct:], which matches all
punctuation (as we want). So:

#+BEGIN_SRC emacs-lisp
("$" "\\([^$]\\|^\\)\\(\\(\\$\\([^ \t\r\n,;.$][^$\n\r]*?\\(\n[^$\n\r]*?\\)\\{0,2\\}[^ \t\r\n,.$]\\)\\$\\)\\)\\([- \t[:punct:]\000]\\|$\\)" 2 nil)
#+END_SRC
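
A quick way to try the idea without touching org.el is to compare just
the tails of the two regexps. What follows is only a sketch: it keeps
the closing `$' plus the trailing character class and drops the rest of
the pattern, and the names old-tail, new-tail and text are made up for
the illustration.

#+BEGIN_SRC emacs-lisp
(let ((old-tail "\\$\\([- \t.,?;:'\")\000]\\|$\\)")    ; explicit list
      (new-tail "\\$\\([- \t[:punct:]\000]\\|$\\)")    ; [:punct:] instead
      (text "„$n\\eps\\le b$”"))
  ;; Each element is the position of a match, or nil if there is none.
  (list (string-match-p old-tail text)
        (string-match-p new-tail text)))
#+END_SRC

If the analysis above is right, the first element should be nil and the
second should be a position, namely that of the closing `$'.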


-- 
Florian Beck




[O] Embedded LaTeX does not work with Unicode quotes

2014-11-11 Thread Marcin Borkowski
Hi list,

I have this: „$n\eps\le b$”, and it seems not to be recognized as a
LaTeX fragment.  The manual says:


To avoid conflicts with currency specifications, single `$' characters
are only recognized as math delimiters if the enclosed text contains at
most two line breaks, is directly attached to the `$' characters with no
whitespace in between, and if the closing `$' is followed by whitespace,
punctuation or a dash.


When I C-u C-x = on the closing quote, I get


             position: 54465 of 108125 (50%), restriction: 52496-56766, column: 152
            character: ” (displayed as ”) (codepoint 8221, #o20035, #x201d)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0x201D
               syntax: .  which means: punctuation
             category: .:Base, c:Chinese, h:Korean, j:Japanese
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #xE2 #x80 #x9D
            file code: #xE2 #x80 #x9D (encoded by coding system utf-8-unix)
              display: by this font (glyph code)
    xft:-unknown-Ubuntu Mono-normal-normal-normal-*-17-*-*-*-m-0-iso10646-1 (#x71)

Character code properties: customize what to show
  name: RIGHT DOUBLE QUOTATION MARK
  old-name: DOUBLE COMMA QUOTATION MARK
  general-category: Pf (Punctuation, Final quote)
  decomposition: (8221) ('”')

There are text properties here:
  fontified            t

so I don't know why it is not recognized as punctuation.  Consequently,
it is exported verbatim (with `\$') into LaTeX, and also (obviously) C-c
C-x C-l does not fontify it.  When I change ” into " (the ASCII #x22
quote), everything is ok.

My questions:

1. Isn't it a bug?

2. If not, what can I do in my config so that it is recognized
properly?

PS. I just recalled that using \(...\) should help, and indeed it does.
Still, I'm curious about the answer to my questions (now that I
remembered a workaround, especially #1).
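
For anyone who wants to check which of the two spellings their Org
recognizes, here is a rough sketch using the org-element parser. It
assumes an Org version that ships org-element, and the sample text is
just the fragment from above written both ways.

#+BEGIN_SRC emacs-lisp
(require 'org-element)

(with-temp-buffer
  (org-mode)
  ;; The same formula with $...$ and with \(...\), both inside „...” quotes.
  (insert "„$n\\eps\\le b$” and „\\(n\\eps\\le b\\)”")
  ;; Collect the source text of every LaTeX fragment the parser finds.
  (org-element-map (org-element-parse-buffer) 'latex-fragment
    (lambda (fragment) (org-element-property :value fragment))))
#+END_SRC

With the behaviour described in this message, only the \(...\) version
should show up in the resulting list; the outcome may differ between
Org versions.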

TIA,

-- 
Marcin Borkowski
http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski
Adam Mickiewicz University