Dear HTML editors,

Here is a small update to my previous e-mail [http://lists.w3.org/Archives/Public/www-html-editor/2006JulSep/0015.html] suggesting better XML Schema patterns for some XHTML Modularization 1.1 datatypes in [http://www.w3.org/TR/2006/WD-xhtml-modularization-20060705/SCHEMA/xhtml-datatypes-1.xsd].
An updated "xhtml-datatypes-1.xsd" with my proposals is also attached to
this e-mail.
ContentType: "A media type, as per [RFC2045]"
------------
- Short version:
<xs:pattern value="[^/ ;,=]+/[^/ ;,=]+(;\s*[^/ ;,=]+=([^/ ;,=]+|"([^"\\]|\\\\|\\")*"))*"/>
The update concerns the "quoted string" part (as per RFC 2822, without the
optional comments):
"([^"\\]|\\\\|\\")*"
Details:
"        # opening quotation mark
(        #
[^"\\]   ## any character except a quotation mark " or a backslash \
|        ## or
\\\\     ## an escaped backslash \\
|        ## or
\\"      ## an escaped quotation mark \"
)*       # the content of a quoted string can
         # be 0 or more characters
"        # closing quotation mark
- Long version:
<xs:pattern value="([xX][-.][!#$%&'*+-.0-9A-Z\\^_`a-z{|}~]+|[a-zA-Z]{4,})/([xX][-.][!#$%&'*+-.0-9A-Z\\^_`a-z{|}~]+|[a-zA-Z0-9._+-]+)(;\s*[!#$%&'*+-.0-9A-Z\\^_`a-z{|}~]+=([!#$%&'*+-.0-9A-Z\\^_`a-z{|}~]+|"([^"\\]|\\\\|\\")*"))*"/>
In addition to the update to the "quoted string" part reported above, a
backslash escape was missing from my token definition (as per
RFC 2045):
[!#$%&'*+-.0-9A-Z\\^_`a-z{|}~]+
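The fragments above can be tried outside a schema validator with a minimal Python sketch. It rests on two assumptions: that Python's "re" module reads these particular constructs the same way an XML Schema processor does (the "\s" class is broader in Python), and that re.fullmatch stands in for the implicit anchoring of xs:pattern. The names QUOTED, TOKEN_CLASS, LONG and is_content_type are illustrative only and do not appear in the schema.

```python
import re

# Quoted-string fragment, copied from the pattern above.
QUOTED = r'"([^"\\]|\\\\|\\")*"'

# Token character class, copied from the pattern above.
TOKEN_CLASS = r"[!#$%&'*+-.0-9A-Z\\^_`a-z{|}~]"

# Long version, reassembled from the two fragments.
LONG = "".join([
    r"([xX][-.]", TOKEN_CLASS, r"+|[a-zA-Z]{4,})",
    r"/",
    r"([xX][-.]", TOKEN_CLASS, r"+|[a-zA-Z0-9._+-]+)",
    r"(;\s*", TOKEN_CLASS, r"+=(", TOKEN_CLASS, r"+|", QUOTED, r"))*",
])


def is_quoted(text: str) -> bool:
    """True if the whole text matches the quoted-string fragment."""
    return re.fullmatch(QUOTED, text) is not None


def is_token(text: str) -> bool:
    """True if the whole text is one or more token-class characters."""
    return re.fullmatch(TOKEN_CLASS + "+", text) is not None


def is_content_type(text: str) -> bool:
    """True if the whole text matches the reassembled long version."""
    return re.fullmatch(LONG, text) is not None


if __name__ == "__main__":
    samples = [
        'text/html',
        'application/xhtml+xml',
        'text/html; charset=utf-8',
        'text/plain; title="a; b"',
        'text/plain; title="a "b" c"',
        'text',
    ]
    for sample in samples:
        print(sample, "->", is_content_type(sample))
```

In this sketch the quoted-string fragment accepts \\ and \" inside the quotes but rejects a backslash followed by any other character. Note also that Python reads "+-." inside the character class as a range from "+" to ".", so the class as written accepts a comma as well as the hyphen.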
Cordially,
Alexandre
http://alexandre.alapetite.net
