Rick Frankel <r...@rickster.com> writes:

> On 25.04.2013 17:20, Eric Abrahamsen wrote:
>> Who knew this would turn out to be such a fraught issue! All I wanted
>> was that little green checkmark from the W3C...
>>
>> Here's what I think should be an acceptable final patch. I dropped the
>> CDATA mess, and came up with a slightly different implementation for
>> handling self-closing tags. It's maybe a little /bulkier/ than the
>> previous implementation, but not so hacky, and may continue to be
>> useful in the future. There's also a documentation patch.
>
> Overall, looks good, but again, i would _strongly_ argue that html5
> should generate valid xhtml.
> If it doesn't, it will really break my post-processing workflow...
>
> Therefore, `org-html-close-tag' should check that the doctype is not a
> flavor of html4 rather than a flavor of xhtml. An alternative would be
> to add ("xhtml5" . "<!DOCTYPE html>") to the doctype alist, and the
> appropriate testing for being html5 and xhtml.
>
> See the discussions of polyglot markup @
> http://en.wikipedia.org/wiki/Polyglot_markup
> and
> http://www.w3.org/TR/2011/WD-html-polyglot-20110405/#dfn-polyglot-markup
> for the rationale.
Ah, those were interesting links; I hadn't considered those issues.

Luckily, your second option was a three-line change to the existing
patch: using "xhtml5" now produces the same output as "html5", except
that self-closing tags are self-closed, and there's an xmlns declaration
in the <html> element. Best of all worlds, I hope.

E
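To make the html5/xhtml5 difference concrete, here is a minimal standalone sketch of the tag-closing logic. The `demo-` function names are hypothetical stand-ins so the snippet can be evaluated without loading ox-html; they mirror `org-html-xhtml-p' and `org-html-close-tag' from the patch below, and the INFO plists are hand-written for illustration rather than taken from a real export.

```elisp
;; Hypothetical stand-ins for `org-html-xhtml-p' and `org-html-close-tag'.
;; INFO is assumed to be a plist carrying the doctype under :html-doctype.

(defun demo-xhtml-p (info)
  "Return non-nil if the doctype in INFO names an XHTML flavor."
  (let ((dt (downcase (plist-get info :html-doctype))))
    (string-match-p "xhtml" dt)))

(defun demo-close-tag (tag attr info)
  "Return TAG with ATTR, self-closed only when INFO is an XHTML flavor.
ATTR is either nil or a string that already starts with a space."
  (concat "<" tag (or attr "")
          (if (demo-xhtml-p info) " />" ">")))

;; Same tag, different doctypes:
(demo-close-tag "br" nil '(:html-doctype "html5"))        ; => "<br>"
(demo-close-tag "br" nil '(:html-doctype "xhtml5"))       ; => "<br />"
(demo-close-tag "hr" nil '(:html-doctype "xhtml-strict")) ; => "<hr />"
(demo-close-tag "meta" " charset=\"utf-8\""
                '(:html-doctype "html4-strict"))
;; => "<meta charset=\"utf-8\">"
```

Since the check is a plain substring match on "xhtml", a full XHTML doctype string set by hand would also be treated as XHTML under this sketch.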
From 12472f7fe52848a011cc218e36b01416cfa6c146 Mon Sep 17 00:00:00 2001
From: Eric Abrahamsen <e...@ericabrahamsen.net>
Date: Fri, 26 Apr 2013 10:04:47 -0700
Subject: [PATCH 11/11] ox-html.el: Export to various flavors of (X)HTML

lisp/ox-html.el (org-html-doctype-alist): New variable holding an
alist of (X)HTML doctypes
(org-html-xhtml-p): New function
(org-html-html5-p): New function
(org-html-close-tag): New function

Significant changes to `org-html-format-inline-image',
`org-html--build-meta-info', `org-html--build-head',
`org-html--build-pre/postamble', `org-html-template',
`org-html-horizontal-rule', `org-html-format-list-item',
`org-html-line-break', `org-html-table', and `org-html-verse-block'.

doc/org.texi: Document the above
---
 doc/org.texi    |  43 ++++++++++++-
 lisp/ox-html.el | 188 +++++++++++++++++++++++++++++++++++++-------------------
 2 files changed, 166 insertions(+), 65 deletions(-)

diff --git a/doc/org.texi b/doc/org.texi
index 3f2d1b8..0815c49 100644
--- a/doc/org.texi
+++ b/doc/org.texi
@@ -596,6 +596,7 @@ Exporting
 HTML export

 * HTML Export commands::        How to invoke HTML export
+* HTML doctypes::               Org can export to various (X)HTML flavors
 * HTML preamble and postamble::  How to insert a preamble and a postamble
 * Quoting HTML tags::           Using direct HTML in Org mode
 * Links in HTML export::        How links will be interpreted and formatted
@@ -10959,6 +10960,7 @@ language, but with additional support for tables.

 @menu
 * HTML Export commands::        How to invoke HTML export
+* HTML doctypes::               Org can export to various (X)HTML flavors
 * HTML preamble and postamble::  How to insert a preamble and a postamble
 * Quoting HTML tags::           Using direct HTML in Org mode
 * Links in HTML export::        How links will be interpreted and formatted
@@ -10970,7 +10972,7 @@ language, but with additional support for tables.
 * JavaScript support::          Info and Folding in a web browser
 @end menu

-@node HTML Export commands, HTML preamble and postamble, HTML export, HTML export
+@node HTML Export commands, HTML doctypes, HTML export, HTML export
 @subsection HTML export commands

 @table @kbd
@@ -10998,7 +11000,44 @@ Export to a temporary buffer. Do not create a file.
 @c @noindent
 @c creates two levels of headings and does the rest as items.

-@node HTML preamble and postamble, Quoting HTML tags, HTML Export commands, HTML export
+@node HTML doctypes, HTML preamble and postamble, HTML Export commands, HTML export
+@subsection HTML doctypes
+@vindex org-html-doctype
+@vindex org-html-doctype-alist
+
+Org can export to various (X)HTML flavors.
+
+Setting the variable @var{org-html-doctype} allows you to export to different
+(X)HTML variants. The exported HTML will be adjusted according to the syntax
+requirements of that variant. You can either set this variable to a doctype
+string directly, in which case the exporter will try to adjust the syntax
+automatically, or you can use a ready-made doctype. The ready-made options
+are:
+
+@itemize
+@item
+``html4-strict''
+@item
+``html4-transitional''
+@item
+``html4-frameset''
+@item
+``xhtml-strict''
+@item
+``xhtml-transitional''
+@item
+``xhtml-frameset''
+@item
+``xhtml-11''
+@item
+``html5''
+@item
+``xhtml5''
+@end itemize
+
+See the variable @var{org-html-doctype-alist} for details. The default is ``xhtml-strict''.
+
+@node HTML preamble and postamble, Quoting HTML tags, HTML doctypes, HTML export
 @subsection HTML preamble and postamble
 @vindex org-html-preamble
 @vindex org-html-postamble
diff --git a/lisp/ox-html.el b/lisp/ox-html.el
index ef7d15a..8223a18 100644
--- a/lisp/ox-html.el
+++ b/lisp/ox-html.el
@@ -143,6 +143,27 @@
 (defvar org-html--pre/postamble-class "status"
   "CSS class used for pre/postamble")

+(defconst org-html-doctype-alist
+  '(("html4-strict" . "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01//EN\"
+\"http://www.w3.org/TR/html4/strict.dtd\">")
+    ("html4-transitional" . "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\"
+\"http://www.w3.org/TR/html4/loose.dtd\">")
+    ("html4-frameset" . "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01 Frameset//EN\"
+\"http://www.w3.org/TR/html4/frameset.dtd\">")
+
+    ("xhtml-strict" . "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\"
+\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">")
+    ("xhtml-transitional" . "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\"
+\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">")
+    ("xhtml-frameset" . "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Frameset//EN\"
+\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd\">")
+    ("xhtml-11" . "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.1//EN\"
+\"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd\">")
+
+    ("html5" . "<!DOCTYPE html>")
+    ("xhtml5" . "<!DOCTYPE html>"))
+  "An alist mapping (x)html flavors to specific doctypes.")
+
 (defconst org-html-special-string-regexps
   '(("\\\\-" . "­") ; shy
     ("---\\([^-]\\)" . "—\\1") ; mdash
@@ -748,7 +769,9 @@ in all modes you want. Then, use the command
   '(:border "2" :cellspacing "0" :cellpadding "6" :rules "groups" :frame "hsides")
   "Default attributes and values which will be used in table tags.
 This is a plist where attributes are symbols, starting with
-colons, and values are strings."
+colons, and values are strings.
+
+When exporting to HTML5, these values will be disregarded."
   :group 'org-export-html
   :version "24.4"
   :package-version '(Org . "8.0")
@@ -856,7 +879,9 @@ CSS classes, then this prefix can be very useful."
   "The extension for exported HTML files.
 %s will be replaced with the charset of the exported file.
 This may be a string, or an alist with export extensions
-and corresponding declarations."
+and corresponding declarations.
+
+This declaration only applies when exporting to XHTML."
   :group 'org-export-html
   :type '(choice
           (string :tag "Single declaration")
@@ -872,8 +897,7 @@ Use utf-8 as the default value."
   :package-version '(Org . "8.0")
   :type 'coding-system)

-(defcustom org-html-doctype
-  "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">"
+(defcustom org-html-doctype "xhtml-strict"
   "Document type definition to use for exported HTML files.
 Can be set with the in-buffer HTML_DOCTYPE property or for
 publishing, with :html-doctype."
@@ -962,7 +986,8 @@ You can also customize this for each buffer, using something like
           (const :format " " mathml) (boolean))))

 (defcustom org-html-mathjax-template
-  "<script type=\"text/javascript\" src=\"%PATH\">
+  "<script type=\"text/javascript\" src=\"%PATH\"></script>
+<script type=\"text/javascript\">
 <!--/*--><![CDATA[/*><!--*/
     MathJax.Hub.Config({
         // Only one of the two following lines, depending on user settings
@@ -1035,7 +1060,7 @@ precedence over this variable."
   '(("en" "<p class=\"author\">Author: %a (%e)</p>
 <p class=\"date\">Date: %d</p>
 <p class=\"creator\">%c</p>
-<p class=\"xhtml-validation\">%v</p>"))
+<p class=\"validation\">%v</p>"))
   "Alist of languages and format strings for the HTML postamble.

 The first element of each list is the language code, as used for
@@ -1060,7 +1085,7 @@ like that: \"%%\"."
           :value-type (string :tag "Format string")))

 (defcustom org-html-validation-link
-  "<a href=\"http://validator.w3.org/check?uri=referer\">Validate XHTML 1.0</a>"
+  "<a href=\"http://validator.w3.org/check?uri=referer\">Validate</a>"
   "Link to HTML validation service."
   :group 'org-export-html
   :type 'string)
@@ -1240,6 +1265,18 @@ CSS classes, then this prefix can be very useful."
 ;;; Internal Functions

+(defun org-html-xhtml-p (info)
+  (let ((dt (downcase (plist-get info :html-doctype))))
+    (string-match-p "xhtml" dt)))
+
+(defun org-html-html5-p (info)
+  (let ((dt (downcase (plist-get info :html-doctype))))
+    (member dt '("html5" "xhtml5" "<!doctype html>"))))
+
+(defun org-html-close-tag (tag attr info)
+  (concat "<" tag (or attr "")
+          (if (org-html-xhtml-p info) " />" ">")))
+
 (defun org-html--make-attribute-string (attributes)
   "Return a list of attributes, as a string.
 ATTRIBUTES is a plist where values are either strings or nil. An
@@ -1253,7 +1290,7 @@ attributes with a nil value will be omitted from the result."
                            "\"" "&quot;" (org-html-encode-plain-text item))))
                 (setcar output (format "%s=\"%s\"" key value))))))))

-(defun org-html-format-inline-image (src &optional
+(defun org-html-format-inline-image (src info &optional
                                          caption label attr standalone-p)
   "Format an inline image from SRC.
 CAPTION, LABEL and ATTR are optional arguments providing the
@@ -1262,6 +1299,7 @@ When STANDALONE-P is t, wrap the <img.../> into a <div>...</div>."
   (let* ((id (if (not label) ""
                (format " id=\"%s\"" (org-export-solidify-link-text label))))
          (attr (concat attr
+                       (format " src=\"%s\"" src)
                        (cond
                         ((string-match "\\<alt=" (or attr "")) "")
                         ((string-match "^ltxpng/" src)
@@ -1273,12 +1311,12 @@ When STANDALONE-P is t, wrap the <img.../> into a <div>...</div>."
                                    (file-name-nondirectory src)))))))
     (cond
      (standalone-p
-      (let ((img (format "<img src=\"%s\" %s/>" src attr)))
+      (let ((img (org-html-close-tag "img" attr info)))
         (format "\n<div%s class=\"figure\">%s%s\n</div>"
                 id (format "\n<p>%s</p>" img)
                 (if (and caption (not (string= caption "")))
                     (format "\n<p>%s</p>" caption) ""))))
-     (t (format "<img src=\"%s\" %s/>" src (concat attr id))))))
+     (t (org-html-close-tag "img" (concat attr id) info)))))

 (defun org-html--textarea-block (element)
   "Transcode ELEMENT into a textarea block.
@@ -1437,7 +1475,11 @@ INFO is a plist used as a communication channel."
                           (cons 'plain-text org-element-all-objects)
                           'identity info))))))
          (description (plist-get info :description))
-         (keywords (plist-get info :keywords)))
+         (keywords (plist-get info :keywords))
+         (charset (or (and org-html-coding-system
+                           (fboundp 'coding-system-get)
+                           (coding-system-get org-html-coding-system 'mime-charset))
+                      "iso-8859-1")))
     (concat
      (format "<title>%s</title>\n" title)
      (format
@@ -1445,21 +1487,24 @@ INFO is a plist used as a communication channel."
       (format-time-string
        (concat "<!-- " org-html-metadata-timestamp-format " -->\n"))))
      (format
-      "<meta http-equiv=\"Content-Type\" content=\"text/html;charset=%s\"/>\n"
-      (or (and org-html-coding-system
-               (fboundp 'coding-system-get)
-               (coding-system-get org-html-coding-system 'mime-charset))
-          "iso-8859-1"))
-     (format "<meta name=\"generator\" content=\"Org-mode\"/>\n")
+      (if (org-html-html5-p info)
+          (org-html-close-tag "meta" " charset=\"%s\"" info)
+        (org-html-close-tag
+         "meta" " http-equiv=\"Content-Type\" content=\"text/html;charset=%s\"" info))
+      charset) "\n"
+     (org-html-close-tag "meta" " name=\"generator\" content=\"Org-mode\"" info) "\n"
      (and (org-string-nw-p author)
-          (format "<meta name=\"author\" content=\"%s\"/>\n"
-                  (funcall protect-string author)))
+          (concat (org-html-close-tag "meta" (format " name=\"author\" content=\"%s\""
+                                                     (funcall protect-string author)) info)
+                  "\n"))
      (and (org-string-nw-p description)
-          (format "<meta name=\"description\" content=\"%s\"/>\n"
-                  (funcall protect-string description)))
+          (concat (org-html-close-tag "meta" (format " name=\"description\" content=\"%s\""
+                                                     (funcall protect-string description)) info)
+                  "\n"))
      (and (org-string-nw-p keywords)
-          (format "<meta name=\"keywords\" content=\"%s\"/>\n"
-                  (funcall protect-string keywords))))))
+          (concat (org-html-close-tag "meta" (format " name=\"keywords\" content=\"%s\""
+                                                     (funcall protect-string keywords)) info)
+                  "\n")))))

 (defun org-html--build-head (info)
   "Return information for the <head>..</head> of the HTML output.
@@ -1472,8 +1517,10 @@ INFO is a plist used as a communication channel."
     (org-element-normalize-string (plist-get info :html-head-extra))
     (when (and (plist-get info :html-htmlized-css-url)
                (eq org-html-htmlize-output-type 'css))
-      (format "<link rel=\"stylesheet\" href=\"%s\" type=\"text/css\" />\n"
-              (plist-get info :html-htmlized-css-url)))
+      (org-html-close-tag "link"
+                          (format " rel=\"stylesheet\" href=\"%s\" type=\"text/css\""
+                                  (plist-get info :html-htmlized-css-url))
+                          info))
     (when (plist-get info :html-head-include-scripts) org-html-scripts))))

 (defun org-html--build-mathjax-config (info)
@@ -1570,7 +1617,7 @@ communication channel."
                          (format-time-string org-html-metadata-timestamp-format)))
                 (when (plist-get info :with-creator)
                   (format "<p class=\"creator\">%s</p>\n" creator))
-                (format "<p class=\"xhtml-validation\">%s</p>\n"
+                (format "<p class=\"validation\">%s</p>\n"
                         validation-link))))
              (t (format-spec
                  (or (cadr (assoc
@@ -1612,23 +1659,31 @@ holding export options."
 CONTENTS is the transcoded contents string. INFO is a plist
 holding export options."
   (concat
-   (format
-    (or (and (stringp org-html-xml-declaration)
-             org-html-xml-declaration)
-        (cdr (assoc (plist-get info :html-extension)
-                    org-html-xml-declaration))
-        (cdr (assoc "html" org-html-xml-declaration))
-
-        "")
-    (or (and org-html-coding-system
-             (fboundp 'coding-system-get)
-             (coding-system-get org-html-coding-system 'mime-charset))
-        "iso-8859-1"))
-   "\n"
-   (plist-get info :html-doctype)
+   (when (org-html-xhtml-p info)
+     (format "%s\n"
+             (format (or (and (stringp org-html-xml-declaration)
+                              org-html-xml-declaration)
+                         (cdr (assoc (plist-get info :html-extension)
+                                     org-html-xml-declaration))
+                         (cdr (assoc "html" org-html-xml-declaration))
+
+                         "")
+                     (or (and org-html-coding-system
+                              (fboundp 'coding-system-get)
+                              (coding-system-get org-html-coding-system 'mime-charset))
+                         "iso-8859-1"))))
+   (let* ((dt (plist-get info :html-doctype))
+          (dt-cons (assoc dt org-html-doctype-alist)))
+     (if dt-cons
+         (cdr dt-cons)
+       dt))
    "\n"
-   (format "<html xmlns=\"http://www.w3.org/1999/xhtml\" lang=\"%s\" xml:lang=\"%s\">\n"
-           (plist-get info :language) (plist-get info :language))
+   (concat "<html"
+           (when (org-html-xhtml-p info)
+             (format
+              " xmlns=\"http://www.w3.org/1999/xhtml\" lang=\"%s\" xml:lang=\"%s\""
+              (plist-get info :language) (plist-get info :language)))
+           ">\n")
    "<head>\n"
    (org-html--build-meta-info info)
    (org-html--build-head info)
@@ -2179,7 +2234,7 @@ holding contextual information."
         ;; Build the real contents of the sub-tree.
         (let* ((type (if numberedp 'ordered 'unordered))
                (itemized-body (org-html-format-list-item
-                               contents type nil nil full-text)))
+                               contents type nil info nil full-text)))
           (concat
            (and (org-export-first-sibling-p headline info)
                 (org-html-begin-plain-list type))
@@ -2239,7 +2294,7 @@ holding contextual information."
 (defun org-html-horizontal-rule (horizontal-rule contents info)
   "Transcode an HORIZONTAL-RULE object from Org to HTML.
 CONTENTS is nil. INFO is a plist holding contextual information."
-  "<hr/>")
+  (org-html-close-tag "hr" nil info))

 ;;;; Inline Src Block

@@ -2275,8 +2330,9 @@ holding contextual information."
         (org-html-format-headline--wrap
          inlinetask info format-function :contents contents)))
      ;; Otherwise, use a default template.
-     (t (format "<div class=\"inlinetask\">\n<b>%s</b><br/>\n%s</div>"
+     (t (format "<div class=\"inlinetask\">\n<b>%s</b>%s\n%s</div>"
                 (org-html-format-headline--wrap inlinetask info)
+                (org-html-close-tag "br" nil info)
                 contents))))

 ;;;; Italic
@@ -2296,11 +2352,12 @@ contextual information."
     (trans "<code>[-]</code>")
     (t "")))

-(defun org-html-format-list-item (contents type checkbox
+(defun org-html-format-list-item (contents type checkbox info
                                              &optional term-counter-id
                                              headline)
   "Format a list item into HTML."
-  (let ((checkbox (concat (org-html-checkbox checkbox) (and checkbox " "))))
+  (let ((checkbox (concat (org-html-checkbox checkbox) (and checkbox " ")))
+        (br (org-html-close-tag "br" nil info)))
     (concat
      (case type
        (ordered
@@ -2308,13 +2365,13 @@ contextual information."
                (extra (if counter (format " value=\"%s\"" counter) "")))
           (concat
            (format "<li%s>" extra)
-           (when headline (concat headline "<br/>")))))
+           (when headline (concat headline br)))))
        (unordered
         (let* ((id term-counter-id)
                (extra (if id (format " id=\"%s\"" id) "")))
           (concat
            (format "<li%s>" extra)
-           (when headline (concat headline "<br/>")))))
+           (when headline (concat headline br)))))
        (descriptive
         (let* ((term term-counter-id))
           (setq term (or term "(no term)"))
@@ -2340,7 +2397,7 @@ contextual information."
          (tag (let ((tag (org-element-property :tag item)))
                 (and tag (org-export-data tag info)))))
     (org-html-format-list-item
-     contents type checkbox (or tag counter))))
+     contents type checkbox info (or tag counter))))

 ;;;; Keyword

@@ -2399,7 +2456,7 @@ CONTENTS is nil. INFO is a plist holding contextual information."
          (when (and formula-link
                     (string-match "file:\\([^]]*\\)" formula-link))
            (org-html-format-inline-image
-            (match-string 1 formula-link) caption label attr t))))
+            (match-string 1 formula-link) info caption label attr t))))
       (t latex-frag))))

 ;;;; Latex Fragment
@@ -2418,7 +2475,7 @@ CONTENTS is nil. INFO is a plist holding contextual information."
          (when (and formula-link
                     (string-match "file:\\([^]]*\\)" formula-link))
            (org-html-format-inline-image
-            (match-string 1 formula-link)))))
+            (match-string 1 formula-link) info))))
       (t latex-frag))))

 ;;;; Line Break
@@ -2426,7 +2483,7 @@ CONTENTS is nil. INFO is a plist holding contextual information."
 (defun org-html-line-break (line-break contents info)
   "Transcode a LINE-BREAK object from Org to HTML.
 CONTENTS is nil. INFO is a plist holding contextual information."
-  "<br/>\n")
+  (concat (org-html-close-tag "br" nil info) "\n"))

 ;;;; Link

@@ -2451,7 +2508,7 @@ Inline images can have these attributes:
          (label (org-element-property :name parent)))
     ;; Return proper string, depending on DISPOSITION.
     (org-html-format-inline-image
-     path caption label
+     path info caption label
      (org-html--make-attribute-string
       (org-export-read-attribute :attr_html parent))
      (org-html-standalone-image-p link info))))
@@ -2769,7 +2826,8 @@ contextual information."
     (when (plist-get info :preserve-breaks)
       (setq output
             (replace-regexp-in-string
-             "\\(\\\\\\\\\\)?[ \t]*\n" "<br/>\n" output)))
+             "\\(\\\\\\\\\\)?[ \t]*\n"
+             (concat (org-html-close-tag "br" nil info) "\n") output)))
     ;; Return value.
     output))

@@ -3044,11 +3102,12 @@ contextual information."
   (let* ((label (org-element-property :name table))
          (caption (org-export-get-caption table))
          (attributes
-          (org-html--make-attribute-string
-           (org-combine-plists
-            (and label (list :id (org-export-solidify-link-text label)))
-            (plist-get info :html-table-attributes)
-            (org-export-read-attribute :attr_html table))))
+          (if (org-html-html5-p info) ""
+            (org-html--make-attribute-string
+             (org-combine-plists
+              (and label (list :id (org-export-solidify-link-text label)))
+              (plist-get info :html-table-attributes)
+              (org-export-read-attribute :attr_html table)))))
          (alignspec
           (if (and (boundp 'org-html-format-table-no-css)
                    org-html-format-table-no-css)
@@ -3066,7 +3125,9 @@ contextual information."
                               table-cell info)
                     "\n<colgroup>")
                   ;; Add a column. Also specify it's alignment.
-                  (format "\n<col %s/>" (format alignspec alignment))
+                  (format "\n%s"
+                          (org-html-close-tag
+                           "col" (concat " " (format alignspec alignment)) info))
                   ;; End a colgroup?
                   (when (org-export-table-cell-ends-colgroup-p
                          table-cell info)
@@ -3128,9 +3189,10 @@ contextual information."
   ;; Replace each newline character with line break. Also replace
   ;; each blank line with a line break.
   (setq contents (replace-regexp-in-string
-                  "^ *\\\\\\\\$" "<br/>\n"
+                  "^ *\\\\\\\\$" (format "%s\n" (org-html-close-tag "br" nil info))
                   (replace-regexp-in-string
-                   "\\(\\\\\\\\\\)?[ \t]*\n" " <br/>\n" contents)))
+                   "\\(\\\\\\\\\\)?[ \t]*\n"
+                   (format "%s\n" (org-html-close-tag "br" nil info)) contents)))
   ;; Replace each white space at beginning of a line with a
   ;; non-breaking space.
   (while (string-match "^[ \t]+" contents)
--
1.8.2.1