[O] Smart Quotes Exporting (Was: Re: (no subject))

Mark E. Shoulson Thu, 31 May 2012 16:26:58 -0700

Sorry for messing up the thread subject header; I think I misused gmane's posting.


On 05/31/2012 09:38 AM, Nicolas Goaziou wrote:

Hello,


Mark Shoulson<[email protected]>  writes:

+(defvar org-e-html-quote-replacements
+  '(("fr" "« " " »" "‘" "’" "’")
+    ("en" "“" "”" "‘" "’" "’")
+    ("de" "„" "“" "‚" "‘" "’"))

A docstring will be required for this variable. It should be
a defcustom.

Oh, certainly; they're all a disaster. I think I said that in the writeup at the top. This is just proof of concept, nothing is in the right place, nothing is properly documented. They have to be defcustoms, there needs to be a good :type in the defcustom as well as a proper docstring. You'll get no argument from me about the lack (or inaccuracy) of docstrings and such. I hadn't gotten that far yet. I said the patch was only if you wanted to tinker with the development as this progresses.

+(defun org-e-latex--quotation-marks (text info)
+  (org-export-quotation-marks text info org-e-latex-quote-replacements))
+  ;; (mapc (lambda(l)
+  ;;     (let ((start 0))
+  ;;       (while (setq start (string-match (car l) text start))
+  ;;         (let ((new-quote (concat (match-string 1 text) (cdr l))))
+  ;;           (setq text (replace-match new-quote  t t text))))))
+  ;;   (cdr (or (assoc (plist-get info :language) org-e-latex-quotes)
+  ;;            ;; Falls back on English.
+  ;;            (assoc "en" org-e-latex-quotes))))
+  ;; text)
Use `org-e-latex-quote-replacements' directly in code then.


Not sure I understand this comment.

+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; Probably a defcustom eventually.
+
+;; Each element of this consists of: car=language code, cdr=list of
+;; double-quote-open-regexp, double-quote-close-regexp,
+;; single-quote-open-regexp, single-quote-close-regexp,&optional
+;; single-apostrophe regexp?
+;; Just about all will be the same anyway, so mostly language DEFAULT.
+
+;; For testing purposes, poorly-designed at first.
+(defvar org-export-quotes-regexps
+  '((DEFAULT
+      "\\(?:\\s-\\|[[(]\\|^\\)\\(\"\\)\\w"
+      "\\(?:\\S-\\)\\(\"\\)\\s-"
+      "\\(?:\\s-\\|(\\|^\\)\\('\\)\\w"
+      "\\w\\('\\)\\(?:\\s-\\|\\s.\\|$\\)"
+      "\\w\\('\\)\\w")))

I'm not sure this variable can be used for both the buffer and the
export engine. Export back-ends will only see chunks of the paragraph.

For example, in the following text,

   He crossed the Rubicon and said: "/Alea jacta est./"

Plain text translators will see three strings:

   1. "He crossed the Rubicon and said: \""
   2. "Alea jacta est."
   3. "\""

In case 1, you have an opening quote with nothing after it. In case 3,
you have a closing quote with nothing before or after it. Plain regexps
can't help here.
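
To make the point above concrete, here is a minimal sketch that runs the double-quote regexps from the DEFAULT entry against the three chunks listed. The helper name `my/quote-regexp-hits' is hypothetical and exists only for this illustration; the regexps are copied from the patch. It assumes the usual syntax table, in which letters are word constituents.

```emacs-lisp
;; Hypothetical helper, for illustration only: for each chunk, report
;; where the open-double and close-double regexps match (or nil).
(defun my/quote-regexp-hits (chunks)
  "Return a list of (OPEN-MATCH CLOSE-MATCH) positions, one per chunk."
  (let ((open-re "\\(?:\\s-\\|[[(]\\|^\\)\\(\"\\)\\w")
        (close-re "\\(?:\\S-\\)\\(\"\\)\\s-"))
    (mapcar (lambda (s)
              (list (string-match-p open-re s)
                    (string-match-p close-re s)))
            chunks)))

(my/quote-regexp-hits
 '("He crossed the Rubicon and said: \""
   "Alea jacta est."
   "\""))
;; => ((nil nil) (nil nil) (nil nil))
```

Neither regexp matches any chunk: the opening quote has no word character after it, and the lone closing quote has nothing on either side.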

The only solution I can think of is to do quote substitutions in
paragraphs within the parse tree before they reach the translators (i.e.
with `org-export-filter-parse-tree-functions').

That's the only way to know if "\"" is an opening or a closing quote,
for example. The current approach won't work.

Hm. OK, this may indeed be (a) a problem and (b) an indication that I really don't understand the process as I thought I did... ... ... Ah. So when the "plain" text is being exported, the exporter passes along the text in chunks as divided up by the formatting. So string #2 is broken out from the others due to its being in italics. That is indeed an issue. Moreover, I never even properly considered the effects of formatting characters (as opposed to punctuation) right next to the quote-marks, even if this weren't a problem.

So... there's the filter-parse-tree-functions hook, which gets applied within the parse tree... so a back-end can add a function to that list which looks over the parse tree and watches for these border cases (and also the ones within ordinary strings). Looks like it's going to be tough to work in any flexibility to define further per-language or per-backend cleverness to handle anything beyond the "canonical set" of open-double, close-double, open-single, close-single, and mid-word.

To be sure, anything we do will most assuredly fail even on some fairly reasonable input, in which case the users are pretty much on their own and will have to do things the hard way. And I could use that as the answer here, that, "well, it'll work only within plain-text strings" (and I might possibly still have to use that answer), but I would rather include the situations you bring up in the supported set and not throw up my hands at it. So, yes, will look at that.

+  (let* ((start 0)
+        (regexps
+         (cdr
+          (or
+           (assoc (plist-get info :language)
+                  org-export-quotes-regexps)
+           (assoc 'DEFAULT org-export-quotes-regexps))))

Use `assq' instead of `assoc' in the second case.


Good call.
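
For the record, a small sketch of why the distinction matters. The alist and the function name `my/lookup-demo' are made up for this example. `assq' compares keys with `eq', which is the right test for a symbol key like DEFAULT; string keys such as "fr" need `assoc', which compares with `equal'.

```emacs-lisp
;; Hypothetical table mixing a symbol key and a string key, as in the patch.
(defvar my/demo-table
  '((DEFAULT "open" "close")
    ("fr" "ouvrir" "fermer")))

(defun my/lookup-demo ()
  "Return the results of three lookups in `my/demo-table'."
  (list
   ;; Symbol key: `eq' comparison is enough (and cheaper).
   (assq 'DEFAULT my/demo-table)
   ;; String key: needs `equal', hence `assoc'.
   (assoc "fr" my/demo-table)
   ;; A fresh string is not `eq' to the one stored in the table.
   (assq (copy-sequence "fr") my/demo-table)))

(my/lookup-demo)
;; => ((DEFAULT "open" "close") ("fr" "ouvrir" "fermer") nil)
```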

+        (subs (cdr (or (assoc (plist-get info :language)
+                              replacements)
+                       (assoc "en" replacements))))
+        (quotes (pairlis regexps subs)))
+    (mapc (lambda (p)
+           (let ((re (car p))
+                 (su (cdr p)))
+             (while (setq start (string-match re text start))
+               (setq text (replace-match su t t text 1)))))

Use `replace-regexp-in-string' instead.

   (replace-regexp-in-string (car p) (cdr p) text t t 1)

I'd been looking at other functions that didn't have that available; thanks for pointing me at it.
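
So the loop would presumably collapse to something like the sketch below. The function name `my/smart-quotes-sketch' and the PAIRS argument are placeholders for this illustration, not what the patch will end up using; the regexps are the ones from the DEFAULT entry. The last argument (1) tells `replace-regexp-in-string' to replace only the first subexpression, i.e. the quote character itself, and leave the surrounding context alone.

```emacs-lisp
;; Hypothetical sketch: apply each (REGEXP . REPLACEMENT) pair in turn,
;; replacing only subexpression 1 of every match.
(defun my/smart-quotes-sketch (text pairs)
  "Return TEXT with every pair in PAIRS applied in order."
  (dolist (p pairs text)
    (setq text (replace-regexp-in-string (car p) (cdr p) text t t 1))))

(my/smart-quotes-sketch
 "don't \"stop\" now"
 '(("\\w\\('\\)\\w" . "’")                          ; mid-word apostrophe
   ("\\(?:\\s-\\|[[(]\\|^\\)\\(\"\\)\\w" . "“")     ; opening double quote
   ("\\(?:\\S-\\)\\(\"\\)\\s-" . "”")))             ; closing double quote
;; => "don’t “stop” now"
```

This still only works within a single plain-text string, so it does nothing for the chunking problem discussed above.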


~mark
