I'm starting a new thread for this, since the previous discussion was buried in a thread about something tangential.
I've attached what I think is a working patch that lets ox-html export to different flavors of (X)HTML. It works via `org-html-doctype': in addition to setting it to a doctype string directly, you can now set it to one of the following names (the default is xhtml-strict):

  html4-strict
  html4-transitional
  html4-frameset
  xhtml-strict
  xhtml-transitional
  xhtml-framset
  xhtml-11
  html5

The doctype will be set correctly, and the HTML output will be adjusted to conform to the requirements of that doctype, with No Errors Whatsoever™. The results validate, anyway...

I'm not proud of some of the implementation (the handling of self-closing vs non-self-closing tags is ugly, and I wish org-html-html5-p and org-html-xhtml-p were variables, not functions), but there it is, and it seems to work.

If this is deemed okay I'll send a version of the patch with a proper commit message and updated documentation. Once that's done I've got another patch that builds on this one, allowing you to use things like <canvas> and <video>. Whee!

Eric
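A minimal usage sketch, assuming the patch below is applied as posted. The flavor names and the :html-doctype property come from the patch and the `org-html-doctype' docstring; the project name "my-site" and the directories are hypothetical placeholders.

```elisp
;; Globally, e.g. in an init file: pick one of the flavor names
;; listed in `org-html-doctype-alist'.
(setq org-html-doctype "html5")

;; A literal doctype string should still be accepted, as before.
;; (setq org-html-doctype "<!DOCTYPE html>")

;; Per publishing project, via the :html-doctype property.
;; "my-site" and both directories are hypothetical.
(setq org-publish-project-alist
      '(("my-site"
         :base-directory "~/org/"
         :publishing-directory "~/public_html/"
         :publishing-function org-html-publish-to-html
         :html-doctype "xhtml-transitional")))
```

For a single buffer, the docstring suggests the equivalent is an in-buffer line such as `#+HTML_DOCTYPE: html5`.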
From 6ab61bbd573b7625e23e33f439aa2c579880cf56 Mon Sep 17 00:00:00 2001
From: Eric Abrahamsen <e...@ericabrahamsen.net>
Date: Fri, 19 Apr 2013 15:39:40 +0800
Subject: [PATCH] Export to different flavors of (x)html.

---
 lisp/ox-html.el | 209 +++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 140 insertions(+), 69 deletions(-)

diff --git a/lisp/ox-html.el b/lisp/ox-html.el
index 54c6a45..ede983d 100644
--- a/lisp/ox-html.el
+++ b/lisp/ox-html.el
@@ -143,6 +143,26 @@
 (defvar org-html--pre/postamble-class "status"
   "CSS class used for pre/postamble")
 
+(defconst org-html-doctype-alist
+  '(("html4-strict" . "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01//EN\"
+\"http://www.w3.org/TR/html4/strict.dtd\">")
+    ("html4-transitional" . "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\"
+\"http://www.w3.org/TR/html4/loose.dtd\">")
+    ("html4-frameset" . "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01 Frameset//EN\"
+\"http://www.w3.org/TR/html4/frameset.dtd\">")
+
+    ("xhtml-strict" . "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\"
+\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">")
+    ("xhtml-transitional" . "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\"
+\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">")
+    ("xhtml-framset" . "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Frameset//EN\"
+\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd\">")
+    ("xhtml-11" . "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.1//EN\"
+\"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd\">")
+
+    ("html5" . "<!DOCTYPE html>"))
+  "An alist mapping (x)html flavors to specific doctypes.")
+
 (defconst org-html-special-string-regexps
   '(("\\\\-" . "­") ; shy
     ("---\\([^-]\\)" . "—\\1") ; mdash
@@ -150,6 +170,10 @@
     ("\\.\\.\\." . "…")) ; hellip
   "Regular expressions for special string conversion.")
 
+(defconst org-html-cdata-regexp
+  "\\(<!--/\\*--><!\\[CDATA\\[/\\*><!--\\*/\n\\|/\\*\\]\\]>\\*/\\{1,3\\}-->\n\\)"
+  "Regexp used to strip escape tags from script blocks")
+
 (defconst org-html-scripts
   "<script type=\"text/javascript\">
 /*
@@ -748,7 +772,9 @@ in all modes you want. Then, use the command
   '(:border "2" :cellspacing "0" :cellpadding "6" :rules "groups" :frame "hsides")
   "Default attributes and values which will be used in table tags.
 This is a plist where attributes are symbols, starting with
-colons, and values are strings."
+colons, and values are strings.
+
+When exporting to HTML5, these values will be disregarded."
   :group 'org-export-html
   :version "24.4"
   :package-version '(Org . "8.0")
@@ -856,7 +882,9 @@ CSS classes, then this prefix can be very useful."
   "The extension for exported HTML files.
 %s will be replaced with the charset of the exported file.
 This may be a string, or an alist with export extensions
-and corresponding declarations."
+and corresponding declarations.
+
+This declaration only applies when exporting to XHTML."
   :group 'org-export-html
   :type '(choice
           (string :tag "Single declaration")
@@ -872,8 +900,7 @@ Use utf-8 as the default value."
   :package-version '(Org . "8.0")
   :type 'coding-system)
 
-(defcustom org-html-doctype
-  "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">"
+(defcustom org-html-doctype "xhtml-strict"
   "Document type definition to use for exported HTML files.
 Can be set with the in-buffer HTML_DOCTYPE property or for
 publishing, with :html-doctype."
@@ -962,7 +989,8 @@ You can also customize this for each buffer, using something like
                              (const :format " " mathml) (boolean))))
 
 (defcustom org-html-mathjax-template
-  "<script type=\"text/javascript\" src=\"%PATH\">
+  "<script type=\"text/javascript\" src=\"%PATH\"></script>
+<script type=\"text/javascript\">
 <!--/*--><![CDATA[/*><!--*/
     MathJax.Hub.Config({
         // Only one of the two following lines, depending on user settings
@@ -1035,7 +1063,7 @@ precedence over this variable."
   '(("en" "<p class=\"author\">Author: %a (%e)</p>
 <p class=\"date\">Date: %d</p>
 <p class=\"creator\">%c</p>
-<p class=\"xhtml-validation\">%v</p>"))
+<p class=\"validation\">%v</p>"))
   "Alist of languages and format strings for the HTML postamble.
 
 The first element of each list is the language code, as used for
@@ -1060,7 +1088,7 @@ like that: \"%%\"."
                 :value-type (string :tag "Format string")))
 
 (defcustom org-html-validation-link
-  "<a href=\"http://validator.w3.org/check?uri=referer\">Validate XHTML 1.0</a>"
+  "<a href=\"http://validator.w3.org/check?uri=referer\">Validate</a>"
   "Link to HTML validation service."
   :group 'org-export-html
   :type 'string)
@@ -1240,6 +1268,19 @@ CSS classes, then this prefix can be very useful."
 
 ;;; Internal Functions
 
+(defun org-html-xhtml-p (info)
+  (let ((dt (downcase (plist-get info :html-doctype))))
+    (string-match-p "xhtml" dt)))
+
+(defun org-html-html5-p (info)
+  (let ((dt (downcase (plist-get info :html-doctype))))
+    (member dt '("html5" "<!doctype html>"))))
+
+(defun org-html-close-tag (info)
+  (if (org-html-xhtml-p info)
+      " />"
+    ">"))
+
 (defun org-html--make-attribute-string (attributes)
   "Return a list of attributes, as a string.
 ATTRIBUTES is a plist where values are either strings or nil. An
@@ -1253,7 +1294,7 @@ attributes with a nil value will be omitted from the result."
                                 "\"" "&quot;" (org-html-encode-plain-text item))))
                   (setcar output (format "%s=\"%s\"" key value))))))))
 
-(defun org-html-format-inline-image (src &optional
+(defun org-html-format-inline-image (src &optional info
                                          caption label attr standalone-p)
   "Format an inline image from SRC.
 CAPTION, LABEL and ATTR are optional arguments providing the
@@ -1273,12 +1314,14 @@ When STANDALONE-P is t, wrap the <img.../> into a <div>...</div>."
                                      (file-name-nondirectory src)))))))
     (cond
      (standalone-p
-      (let ((img (format "<img src=\"%s\" %s/>" src attr)))
+      (let ((img (format "<img src=\"%s\" %s%s" src attr
+                         (org-html-close-tag info))))
         (format "\n<div%s class=\"figure\">%s%s\n</div>"
                 id (format "\n<p>%s</p>" img)
                 (if (and caption (not (string= caption "")))
                     (format "\n<p>%s</p>" caption) ""))))
-     (t (format "<img src=\"%s\" %s/>" src (concat attr id))))))
+     (t (format "<img src=\"%s\" %s%s" src (concat attr id)
+                (org-html-close-tag info))))))
 
 (defun org-html--textarea-block (element)
   "Transcode ELEMENT into a textarea block.
@@ -1437,7 +1480,8 @@ INFO is a plist used as a communication channel."
                            (cons 'plain-text org-element-all-objects)
                            'identity info))))))
          (description (plist-get info :description))
-         (keywords (plist-get info :keywords)))
+         (keywords (plist-get info :keywords))
+         (close (org-html-close-tag info)))
     (concat
      (format "<title>%s</title>\n" title)
      (format
@@ -1445,36 +1489,45 @@ INFO is a plist used as a communication channel."
       (format-time-string
        (concat "<!-- " org-html-metadata-timestamp-format " -->\n"))))
      (format
-      "<meta http-equiv=\"Content-Type\" content=\"text/html;charset=%s\"/>\n"
+      (if (org-html-html5-p info)
+          "<meta charset=\"%s\">\n"
+        "<meta http-equiv=\"Content-Type\" content=\"text/html;charset=%s\"%s\n")
       (or (and org-html-coding-system
                (fboundp 'coding-system-get)
                (coding-system-get org-html-coding-system 'mime-charset))
-          "iso-8859-1"))
-     (format "<meta name=\"generator\" content=\"Org-mode\"/>\n")
+          "iso-8859-1") close)
+     (format "<meta name=\"generator\" content=\"Org-mode\"%s\n" close)
      (and (org-string-nw-p author)
-          (format "<meta name=\"author\" content=\"%s\"/>\n"
-                  (funcall protect-string author)))
+          (format "<meta name=\"author\" content=\"%s\"%s\n"
+                  (funcall protect-string author) close))
      (and (org-string-nw-p description)
-          (format "<meta name=\"description\" content=\"%s\"/>\n"
-                  (funcall protect-string description)))
+          (format "<meta name=\"description\" content=\"%s\"%s\n"
+                  (funcall protect-string description) close))
      (and (org-string-nw-p keywords)
-          (format "<meta name=\"keywords\" content=\"%s\"/>\n"
-                  (funcall protect-string keywords))))))
+          (format "<meta name=\"keywords\" content=\"%s\"%s\n"
+                  (funcall protect-string keywords) close)))))
 
 (defun org-html--build-head (info)
   "Return information for the <head>..</head> of the HTML output.
 INFO is a plist used as a communication channel."
-  (org-element-normalize-string
-   (concat
-    (when (plist-get info :html-head-include-default-style)
-      (org-element-normalize-string org-html-style-default))
-    (org-element-normalize-string (plist-get info :html-head))
-    (org-element-normalize-string (plist-get info :html-head-extra))
-    (when (and (plist-get info :html-htmlized-css-url)
-               (eq org-html-htmlize-output-type 'css))
-      (format "<link rel=\"stylesheet\" href=\"%s\" type=\"text/css\" />\n"
-              (plist-get info :html-htmlized-css-url)))
-    (when (plist-get info :html-head-include-scripts) org-html-scripts))))
+  (let ((head-string
+         (org-element-normalize-string
+          (concat
+           (when (plist-get info :html-head-include-default-style)
+             (org-element-normalize-string org-html-style-default))
+           (org-element-normalize-string (plist-get info :html-head))
+           (org-element-normalize-string (plist-get info :html-head-extra))
+           (when (and (plist-get info :html-htmlized-css-url)
+                      (eq org-html-htmlize-output-type 'css))
+             (format "<link rel=\"stylesheet\" href=\"%s\" type=\"text/css\" %s\n"
+                     (plist-get info :html-htmlized-css-url)
+                     (org-html-close-tag info)))
+           (when (plist-get info :html-head-include-scripts)
+             org-html-scripts)))))
+    (when (org-html-html5-p info)
+      (setq head-string
+            (replace-regexp-in-string org-html-cdata-regexp "" head-string)))
+    head-string))
 
 (defun org-html--build-mathjax-config (info)
   "Insert the user setup into the mathjax template.
@@ -1507,6 +1560,10 @@ INFO is a plist used as a communication channel."
         (setq template (replace-match yes t t template)))
       (if (string-match ":MMLNO:" template)
           (setq template (replace-match no t t template)))
+      ;; remove CDATA escapes for html5
+      (when (org-html-html5-p info)
+        (setq template (replace-regexp-in-string
+                        org-html-cdata-regexp "" template)))
       ;; Return the modified template.
       (org-element-normalize-string template))))
 
@@ -1570,7 +1627,7 @@ communication channel."
                  (format-time-string org-html-metadata-timestamp-format)))
               (when (plist-get info :with-creator)
                 (format "<p class=\"creator\">%s</p>\n" creator))
-              (format "<p class=\"xhtml-validation\">%s</p>\n"
+              (format "<p class=\"validation\">%s</p>\n"
                       validation-link))))
             (t (format-spec
                 (or (cadr (assoc
@@ -1612,23 +1669,31 @@ holding export options."
 CONTENTS is the transcoded contents string. INFO is a plist
 holding export options."
   (concat
-   (format
-    (or (and (stringp org-html-xml-declaration)
-             org-html-xml-declaration)
-        (cdr (assoc (plist-get info :html-extension)
-                    org-html-xml-declaration))
-        (cdr (assoc "html" org-html-xml-declaration))
-
-        "")
-    (or (and org-html-coding-system
-             (fboundp 'coding-system-get)
-             (coding-system-get org-html-coding-system 'mime-charset))
-        "iso-8859-1"))
-   "\n"
-   (plist-get info :html-doctype)
+   (when (org-html-xhtml-p info)
+     (format "%s\n"
+             (format (or (and (stringp org-html-xml-declaration)
+                              org-html-xml-declaration)
+                         (cdr (assoc (plist-get info :html-extension)
+                                     org-html-xml-declaration))
+                         (cdr (assoc "html" org-html-xml-declaration))
+
+                         "")
+                     (or (and org-html-coding-system
+                              (fboundp 'coding-system-get)
+                              (coding-system-get org-html-coding-system 'mime-charset))
+                         "iso-8859-1"))))
+   (let* ((dt (plist-get info :html-doctype))
+          (dt-cons (assoc dt org-html-doctype-alist)))
+     (if dt-cons
+         (cdr dt-cons)
+       dt))
    "\n"
-   (format "<html xmlns=\"http://www.w3.org/1999/xhtml\" lang=\"%s\" xml:lang=\"%s\">\n"
-           (plist-get info :language) (plist-get info :language))
+   (concat "<html"
+           (when (org-html-xhtml-p info)
+             (format
+              " xmlns=\"http://www.w3.org/1999/xhtml\" lang=\"%s\" xml:lang=\"%s\""
+              (plist-get info :language) (plist-get info :language)))
+           ">\n")
    "<head>\n"
    (org-html--build-meta-info info)
    (org-html--build-head info)
@@ -2179,7 +2244,7 @@ holding contextual information."
       ;; Build the real contents of the sub-tree.
       (let* ((type (if numberedp 'ordered 'unordered))
              (itemized-body (org-html-format-list-item
-                             contents type nil nil full-text)))
+                             contents type nil info nil full-text)))
         (concat
          (and (org-export-first-sibling-p headline info)
               (org-html-begin-plain-list type))
@@ -2239,7 +2304,7 @@ holding contextual information."
 (defun org-html-horizontal-rule (horizontal-rule contents info)
   "Transcode an HORIZONTAL-RULE object from Org to HTML.
 CONTENTS is nil. INFO is a plist holding contextual information."
-  "<hr/>")
+  (format "<hr%s" (org-html-close-tag info)))
 
 ;;;; Inline Src Block
 
@@ -2275,8 +2340,9 @@ holding contextual information."
             (org-html-format-headline--wrap
              inlinetask info format-function :contents contents)))
      ;; Otherwise, use a default template.
-     (t (format "<div class=\"inlinetask\">\n<b>%s</b><br/>\n%s</div>"
+     (t (format "<div class=\"inlinetask\">\n<b>%s</b><br%s\n%s</div>"
                 (org-html-format-headline--wrap inlinetask info)
+                (org-html-close-tag info)
                 contents))))
 
 ;;;; Italic
@@ -2296,11 +2362,12 @@ contextual information."
     (trans "<code>[-]</code>")
     (t "")))
 
-(defun org-html-format-list-item (contents type checkbox
+(defun org-html-format-list-item (contents type checkbox info
                                            &optional term-counter-id
                                            headline)
   "Format a list item into HTML."
-  (let ((checkbox (concat (org-html-checkbox checkbox) (and checkbox " "))))
+  (let ((checkbox (concat (org-html-checkbox checkbox) (and checkbox " ")))
+        (br (format "<br%s" (org-html-close-tag info))))
     (concat
      (case type
        (ordered
@@ -2308,13 +2375,13 @@ contextual information."
               (extra (if counter (format " value=\"%s\"" counter) "")))
          (concat
           (format "<li%s>" extra)
-          (when headline (concat headline "<br/>")))))
+          (when headline (concat headline br)))))
        (unordered
         (let* ((id term-counter-id)
                (extra (if id (format " id=\"%s\"" id) "")))
           (concat
            (format "<li%s>" extra)
-           (when headline (concat headline "<br/>")))))
+           (when headline (concat headline br)))))
        (descriptive
         (let* ((term term-counter-id))
           (setq term (or term "(no term)"))
@@ -2340,7 +2407,7 @@ contextual information."
          (tag (let ((tag (org-element-property :tag item)))
                 (and tag (org-export-data tag info)))))
     (org-html-format-list-item
-     contents type checkbox (or tag counter))))
+     contents type checkbox info (or tag counter))))
 
 ;;;; Keyword
 
@@ -2399,7 +2466,7 @@ CONTENTS is nil. INFO is a plist holding contextual information."
          (when (and formula-link
                     (string-match "file:\\([^]]*\\)" formula-link))
            (org-html-format-inline-image
-            (match-string 1 formula-link) caption label attr t))))
+            (match-string 1 formula-link) info caption label attr t))))
       (t latex-frag))))
 
 ;;;; Latex Fragment
@@ -2418,7 +2485,7 @@ CONTENTS is nil. INFO is a plist holding contextual information."
          (when (and formula-link
                     (string-match "file:\\([^]]*\\)" formula-link))
            (org-html-format-inline-image
-            (match-string 1 formula-link)))))
+            (match-string 1 formula-link) info))))
       (t latex-frag))))
 
 ;;;; Line Break
@@ -2426,7 +2493,7 @@ CONTENTS is nil. INFO is a plist holding contextual information."
 (defun org-html-line-break (line-break contents info)
   "Transcode a LINE-BREAK object from Org to HTML.
 CONTENTS is nil. INFO is a plist holding contextual information."
-  "<br/>\n")
+  (format "<br%s\n" (org-html-close-tag info)))
 
 ;;;; Link
 
@@ -2451,7 +2518,7 @@ Inline images can have these attributes:
          (label (org-element-property :name parent)))
     ;; Return proper string, depending on DISPOSITION.
     (org-html-format-inline-image
-     path caption label
+     path info caption label
      (org-html--make-attribute-string
       (org-export-read-attribute :attr_html parent))
      (org-html-standalone-image-p link info))))
@@ -2772,7 +2839,8 @@ contextual information."
     (when (plist-get info :preserve-breaks)
       (setq output
             (replace-regexp-in-string
-             "\\(\\\\\\\\\\)?[ \t]*\n" "<br/>\n" output)))
+             "\\(\\\\\\\\\\)?[ \t]*\n"
+             (format "<br%s\n" (org-html-close-tag info)) output)))
     ;; Return value.
     output))
 
@@ -3047,11 +3115,12 @@ contextual information."
      (let* ((label (org-element-property :name table))
             (caption (org-export-get-caption table))
             (attributes
-             (org-html--make-attribute-string
-              (org-combine-plists
-               (and label (list :id (org-export-solidify-link-text label)))
-               (plist-get info :html-table-attributes)
-               (org-export-read-attribute :attr_html table))))
+             (if (org-html-html5-p info) ""
+               (org-html--make-attribute-string
+                (org-combine-plists
+                 (and label (list :id (org-export-solidify-link-text label)))
+                 (plist-get info :html-table-attributes)
+                 (org-export-read-attribute :attr_html table)))))
             (alignspec
              (if (and (boundp 'org-html-format-table-no-css)
                       org-html-format-table-no-css)
@@ -3069,7 +3138,8 @@ contextual information."
                              table-cell info)
                         "\n<colgroup>")
                       ;; Add a column. Also specify it's alignment.
-                      (format "\n<col %s/>" (format alignspec alignment))
+                      (format "\n<col %s%s" (format alignspec alignment)
+                              (org-html-close-tag info))
                       ;; End a colgroup?
                       (when (org-export-table-cell-ends-colgroup-p
                              table-cell info)
@@ -3131,9 +3201,10 @@ contextual information."
     ;; Replace each newline character with line break. Also replace
     ;; each blank line with a line break.
     (setq contents (replace-regexp-in-string
-                    "^ *\\\\\\\\$" "<br/>\n"
+                    "^ *\\\\\\\\$" (format "<br%s\n" (org-html-close-tag info))
                     (replace-regexp-in-string
-                     "\\(\\\\\\\\\\)?[ \t]*\n" " <br/>\n" contents)))
+                     "\\(\\\\\\\\\\)?[ \t]*\n"
+                     (format " <br%s\n" (org-html-close-tag info)) contents)))
     ;; Replace each white space at beginning of a line with a
     ;; non-breaking space.
     (while (string-match "^[ \t]+" contents)
-- 
1.8.2.1