[ http://issues.apache.org/jira/browse/XMLBEANS-274?page=all ]
Peter Rodgers updated XMLBEANS-274:
-----------------------------------
Summary: Over zealous whitespace cropping after parsing entity like
&amp; (was: Over zelous whitespace cropping after parsing entity like &amp;)
Description:
When whitespace stripping is specified, the parser does not detect XML entities
such as &amp; and strips the whitespace following each entity.
For example,
<root>dog &amp; cat</root>
is parsed as
<root>dog &amp;cat</root>
The cause of the problem is the stripLeft() method in
org.apache.xmlbeans.impl.store.CharUtil.
Below is a fixed version of the method. It checks for the ';' character that
terminates an entity; when the character before the text is ';', the whitespace
that follows is significant and must be preserved.
Note that this code does not fix the third branch, where the iteration is a
for loop over _charIter.
public Object stripLeft ( Object src, int off, int cch )
{
    assert isValid( src, off, cch );

    if (cch > 0)
    {
        if (src instanceof char[])
        {
            char[] chars = (char[]) src;

            // Fix for &amp; etc: keep whitespace that directly follows ';'.
            // The off > 0 check keeps off - 1 inside the buffer.
            while ( cch > 0 && isWhiteSpace( chars[ off ] )
                    && !( off > 0 && chars[ off - 1 ] == ';' ) )
                { cch--; off++; }
        }
        else if (src instanceof String)
        {
            String s = (String) src;

            // Fix for &amp; etc: same check as the char[] branch.
            while ( cch > 0 && isWhiteSpace( s.charAt( off ) )
                    && !( off > 0 && s.charAt( off - 1 ) == ';' ) )
                { cch--; off++; }
        }
        else
        {
            // Not covered by the fix: this branch does not check for ';'.
            int count = 0;

            for ( _charIter.init( src, off, cch ) ; _charIter.hasNext() ; count++ )
                if (!isWhiteSpace( _charIter.next() ))
                    break;

            _charIter.release();

            off += count;
        }
    }

    if (cch == 0)
    {
        _offSrc = 0;
        _cchSrc = 0;

        return null;
    }

    _offSrc = off;
    _cchSrc = cch;

    return src;
}
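The check in the String branch can be tried in isolation. The following is only a minimal standalone sketch, not XMLBeans code: the class StripLeftSketch, its stripLeft(String, int, int) signature (which returns the new offset rather than updating _offSrc and _cchSrc), and the local isWhiteSpace are hypothetical stand-ins. It assumes, as the fix above does, that the ';' ending the entity is still the character immediately before the run being stripped.

```java
public final class StripLeftSketch {

    private StripLeftSketch() {}

    /** Stand-in for CharUtil.isWhiteSpace: space, tab, line feed, carriage return. */
    public static boolean isWhiteSpace(char ch) {
        return ch == ' ' || ch == '\t' || ch == '\n' || ch == '\r';
    }

    /**
     * Returns the offset of the first character kept after left-stripping
     * the run s[off .. off + cch).
     */
    public static int stripLeft(String s, int off, int cch) {
        // Whitespace directly after ';' is treated as significant and kept.
        // Checking once is enough: after one whitespace character has been
        // skipped, the preceding character is whitespace, never ';'.
        if (off > 0 && s.charAt(off - 1) == ';') {
            return off;
        }
        int end = off + cch;
        while (off < end && isWhiteSpace(s.charAt(off))) {
            off++;
        }
        return off;
    }

    public static void main(String[] args) {
        String text = "dog &amp; cat";
        int run = text.indexOf(';') + 1;            // run starts at " cat"
        int kept = stripLeft(text, run, text.length() - run);
        System.out.println("after entity: [" + text.substring(kept) + "]");

        String plain = "  cat";                     // no entity before the run
        int stripped = stripLeft(plain, 0, plain.length());
        System.out.println("plain run:    [" + plain.substring(stripped) + "]");
    }
}
```

Because a run that starts at offset 0 has no preceding character, the sketch only looks at off - 1 when off > 0.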
was:
When white space stripping is specified the parser does not detect XML entities
such as &amp; and strips the whitespace following each entity.
For example
<root>dog &amp; cat</root>
is parsed as
<root>doc &amp;cat</root>
> Over zealous whitespace cropping after parsing entity like &amp;
> ----------------------------------------------------------------
>
> Key: XMLBEANS-274
> URL: http://issues.apache.org/jira/browse/XMLBEANS-274
> Project: XMLBeans
> Type: Bug
> Versions: Version 2.1
> Environment: All
> Reporter: Peter Rodgers