     [ http://issues.apache.org/jira/browse/XALANJ-2109?page=all ]

Brian Minchau closed XALANJ-2109:
---------------------------------
> \r\n in an HTML attribute is incorrectly output as \r\r\n
> -----------------------------------------------------------
>
>          Key: XALANJ-2109
>          URL: http://issues.apache.org/jira/browse/XALANJ-2109
>      Project: XalanJ2
>         Type: Bug
>   Components: Serialization
>     Reporter: Brian Minchau
>     Assignee: Brian Minchau
>      Fix For: 2.7
>  Attachments: ToHTMLStream.2109.patch.txt
>
> The serializer assumes that a single \n should be expanded to the system's
> end of line sequence. This is OK for text nodes, but not correct for HTML
> attributes. The reasons follow.
>
> Input XML document:
>   <?xml version="1.0"?>
>   <input
>     data="xxx yyy"
>     type="hidden"
>     name="data.stuff" />
>
> Stylesheet:
>   <?xml version="1.0"?>
>   <xsl:stylesheet
>     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>     version="1.0" >
>     <xsl:output method="html" />
>     <xsl:template match="input|br">
>       <xsl:copy>
>         <xsl:copy-of select="@*"/>
>         <xsl:apply-templates/>
>       </xsl:copy>
>     </xsl:template>
>   </xsl:stylesheet>
>
> There are four stages of processing:
>   A) what is in the input XML document
>   B) what is presented to Xalan by the XML parser
>   C) what is written out by the Xalan processor
>   D) what is interpreted by a browser or user agent.
>
> The output produced at stage C) by Xalan is this:
> <input data="xxx
> yyy" type="hidden" name="data.stuff">
>
> To show that more clearly, the value for the attribute 'data'
> written out on Windows is this:
>   "xxx\r\r\nyyy"
> and on other operating systems the value written out is this:
>   "xxx\r\nyyy"
>
> Current processing of the attribute by Xalan is this:
>   - write out the \r as is
>   - consider the \n a normalized end of line sequence produced by
>     the XML parser from stage A) and write it out
>     in stage C) as the system end of line
>     sequence, either \r\n or just \n depending on the operating system.
>
> The HTML recommendation, at
> http://www.w3.org/TR/html401/types.html#h-6.2
> says this about stage D):
> <<
> User agents should interpret attribute values as follows:
>   1. Replace character entities with characters,
>   2. Ignore line feeds,
>   3. Replace each carriage return or tab with a single space.
> >>
>
> Xalan's output on Windows at stage C) of "xxx\r\r\nyyy" would be
> interpreted as "xxx  yyy" by a browser at stage D). Bullet 2 means that
> the browser would ignore the \n, and bullet 3 means that it would
> interpret \r\r as two spaces.
>
> Xalan's output from stage C) on other operating systems
> of "xxx\r\nyyy" would be interpreted as "xxx yyy" by a browser at stage D).
> This is one less space between "xxx" and "yyy".
>
> Since the browser interpretation differs depending on which OS
> we are running on, this is a bug: we shouldn't normalize
> the \n in the attribute value to the system end of line sequence.
> We should leave it alone, thus producing this output at stage C) on all
> operating systems:
>   "xxx\r\nyyy"
>
> I ran this through Saxon 6.5.3 and its output was:
> <input data="xxx
> yyy" type="hidden" name="data.stuff">
>
> When a browser interprets Saxon's output it would apply
> bullet 1 and interpret a single newline character between "xxx" and "yyy".
> It is not clear whether bullets 1, 2 and 3 quoted from the HTML
> recommendation apply in sequence, or if just one of them applies. If just
> one of them applies, the browser might interpret Saxon's 'data' attribute
> value as "xxx\nyyy". On the other hand, if one applies bullet 1 followed by
> bullet 2, then Saxon's 'data' attribute value is interpreted as "xxxyyy".
> Either way, Xalan's output is different from Saxon's in a way that is
> significant to a browser or user agent.
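
The stage C) and stage D) behaviour described in the report can be modelled in a few lines of Java. This is only a sketch for checking the arithmetic of the argument, not Xalan's serializer code: the class and method names (AttrNewlineDemo, serializeCurrent, serializeProposed, interpretAsUserAgent) are hypothetical, the line separator is passed in explicitly so both platforms can be simulated on one machine, and the user-agent model applies only bullets 2 and 3 of the quoted HTML 4.01 rules to every character.

```java
public class AttrNewlineDemo {

    // Hypothetical model of the behaviour the report calls "current":
    // '\r' is copied as is, and each '\n' is expanded to the platform
    // end of line sequence (passed in as lineSep for the simulation).
    static String serializeCurrent(String value, String lineSep) {
        StringBuilder out = new StringBuilder();
        for (char c : value.toCharArray()) {
            if (c == '\n') {
                out.append(lineSep);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Hypothetical model of the proposed behaviour: the attribute value
    // is left alone, so the result no longer depends on the platform.
    static String serializeProposed(String value) {
        return value;
    }

    // Assumed reading of HTML 4.01 section 6.2, bullets 2 and 3 only:
    // drop every line feed, turn every carriage return or tab into a space.
    static String interpretAsUserAgent(String written) {
        StringBuilder out = new StringBuilder();
        for (char c : written.toCharArray()) {
            if (c == '\n') {
                continue;
            }
            out.append(c == '\r' || c == '\t' ? ' ' : c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String parsed = "xxx\r\nyyy"; // value seen by the processor at stage B)

        String windows = serializeCurrent(parsed, "\r\n"); // "xxx\r\r\nyyy"
        String other = serializeCurrent(parsed, "\n");     // "xxx\r\nyyy"

        // Current behaviour: two spaces on Windows, one space elsewhere.
        System.out.println("[" + interpretAsUserAgent(windows) + "]");
        System.out.println("[" + interpretAsUserAgent(other) + "]");

        // Proposed behaviour: one space regardless of platform.
        System.out.println("[" + interpretAsUserAgent(serializeProposed(parsed)) + "]");
    }
}
```

Under these assumptions the first line printed contains two spaces between "xxx" and "yyy" and the other two contain one, which is the platform-dependent difference the report identifies as the bug.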
