[ http://issues.apache.org/jira/browse/XALANJ-2109?page=all ]
Brian Minchau updated XALANJ-2109:
----------------------------------
Fix Version: 2.7
(was: CurrentCVS)
> \r\n in an HTML attribute is incorrectly output as \r\r\n
> -----------------------------------------------------------
>
> Key: XALANJ-2109
> URL: http://issues.apache.org/jira/browse/XALANJ-2109
> Project: XalanJ2
> Type: Bug
> Components: Serialization
> Reporter: Brian Minchau
> Assignee: Brian Minchau
> Fix For: 2.7
> Attachments: ToHTMLStream.2109.patch.txt
>
> The serializer assumes that a single \n should be expanded to the system's end
> of line sequence. This is OK for text nodes, but not correct for HTML
> attributes. The reasons follow.
> Input XML document:
> <?xml version="1.0"?>
> <input
> data="xxx yyy"
> type="hidden"
> name="data.stuff" />
> Stylesheet:
> <?xml version="1.0"?>
> <xsl:stylesheet
> xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> version="1.0" >
> <xsl:output method="html" />
> <xsl:template match="input|br">
> <xsl:copy>
> <xsl:copy-of select="@*"/>
> <xsl:apply-templates/>
> </xsl:copy>
> </xsl:template>
> </xsl:stylesheet>
> There are four stages of processing:
> A) what is in the input XML document
> B) what is presented to Xalan by the XML parser
> C) what is written out by the Xalan processor
> D) what is interpreted by a browser or user agent.
> The output produced by stage C) by Xalan is this:
> <input data="xxx
> yyy" type="hidden" name="data.stuff">
> To indicate that more clearly, the value for the attribute 'data'
> written out on Windows is this:
> "xxx\r\r\nyyy"
> and on other operating systems the value written out is this:
> "xxx\r\nyyy"
> Current processing of the attribute by Xalan is this:
> - write out the \r as is
> - consider the \n a normalized end of line sequence produced by
> the XML parser from stage A) and write it out
> in stage C) as the system end of line
> sequence, either \r\n or just \n depending on the operating system.
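The processing described above can be sketched roughly as follows. This is an illustration only, not the code in ToHTMLStream or the attached patch; the class and method names (AttrNewlineSketch, writeAttrCurrent, writeAttrProposed) are hypothetical, and the line separator is passed in as a parameter so that both platforms can be simulated on one machine.

```java
class AttrNewlineSketch {
    // Behavior described in the report: \r is written as is, and every \n
    // is expanded to the platform end-of-line sequence.
    static String writeAttrCurrent(String value, String lineSep) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char ch = value.charAt(i);
            if (ch == '\n') {
                out.append(lineSep);
            } else {
                out.append(ch);
            }
        }
        return out.toString();
    }

    // Behavior the report asks for: leave \n in an HTML attribute value alone,
    // so the output does not depend on the platform.
    static String writeAttrProposed(String value, String lineSep) {
        return value;
    }

    // Make control characters visible when printing.
    static String escape(String s) {
        return s.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        String parsed = "xxx\r\nyyy"; // value handed to Xalan at stage B)
        System.out.println(escape(writeAttrCurrent(parsed, "\r\n")));  // xxx\r\r\nyyy
        System.out.println(escape(writeAttrCurrent(parsed, "\n")));    // xxx\r\nyyy
        System.out.println(escape(writeAttrProposed(parsed, "\r\n"))); // xxx\r\nyyy
    }
}
```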
> The HTML recommendation, at
> http://www.w3.org/TR/html401/types.html#h-6.2
> says this about stage D) :
> <<
> User agents should interpret attribute values as follows:
> 1. Replace character entities with characters,
> 2. Ignore line feeds,
> 3. Replace each carriage return or tab with a single space.
> >>
> Xalan's output on Windows OS by stage C) of "xxx\r\r\nyyy" would be
> interpreted
> as "xxx  yyy" (two spaces) by a browser at stage D). Bullet 2 means that the
> browser would ignore the \n, and bullet 3 means that it would
> interpret \r\r as two spaces.
> Xalan's output from stage C) on other operating systems
> of "xxx\r\nyyy" would be interpreted as "xxx yyy" (one space) by a browser
> at stage D). That is one space fewer between "xxx" and "yyy".
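To make the difference concrete, here is a minimal sketch of the stage D) interpretation, assuming bullets 2 and 3 are applied to every character of the raw attribute value. Bullet 1 (character entity replacement) is left out because neither Xalan output contains entities. The names UaAttrSketch and interpret are hypothetical; this is not how any particular browser is implemented.

```java
class UaAttrSketch {
    // Apply the quoted HTML 4.01 rules to a raw attribute value:
    // line feeds are ignored, each carriage return or tab becomes one space.
    static String interpret(String raw) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char ch = raw.charAt(i);
            if (ch == '\n') {
                continue;            // bullet 2: ignore line feeds
            }
            if (ch == '\r' || ch == '\t') {
                out.append(' ');     // bullet 3: CR or tab becomes a space
            } else {
                out.append(ch);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Xalan output on Windows: two spaces between xxx and yyy
        System.out.println("[" + interpret("xxx\r\r\nyyy") + "]"); // [xxx  yyy]
        // Xalan output on other operating systems: one space
        System.out.println("[" + interpret("xxx\r\nyyy") + "]");   // [xxx yyy]
    }
}
```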
> Since the browser interpretation differs depending on which OS
> we are running on, this is a bug. We shouldn't normalize
> the \n in the attribute value to the system end of line sequence.
> We should leave it alone, thus producing this output by stage C) on all
> operating systems:
> "xxx\r\nyyy"
> I ran this through Saxon 6.5.3 and its output was:
> <input data="xxx
> yyy" type="hidden" name="data.stuff">
> When a browser interprets Saxon's output it would apply
> bullet 1 and interpret a single newline character between "xxx" and "yyy".
> It is not clear whether bullets 1, 2 and 3 quoted from the HTML recommendation
> apply in sequence, or whether just one of them applies. If just one of them
> applies, the browser might interpret Saxon's 'data' attribute value as
> "xxx\nyyy". On the other hand, if one applies bullet 1 followed by bullet 2,
> then Saxon's 'data' attribute value is interpreted as "xxxyyy". Either way,
> Xalan's output differs from Saxon's in a way that is significant to a
> browser or user agent.