[
https://issues.apache.org/jira/browse/XALANJ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17810472#comment-17810472
]
Joe Kesselman commented on XALANJ-2725:
---------------------------------------
TL;DR: I may be fooling myself in how I'm invoking the tests, so don't panic
yet...
Xalan's test cases are in a separate package,
[https://github.com/apache/xalan-test/], since they are intended to be a (mostly)
reusable suite that could also be applied to other XPath/XSLT 1.0 processors.
We've merged new tests into that to exercise the encoding changes.
To build the test suite, check it out and from its top-level directory run
{code:java}
./build.xx jar extensions.classes{code}
where xx is .bat or .sh as appropriate for your system (Windows or Unixoid).
The intent is that running these tests in isolation:
{code:java}
java \
  -Djavax.xml.transform.TransformerFactory=org.apache.xalan.processor.TransformerFactoryImpl \
  -cp "../xalan-java/build/*:java/build/*:tools/*" \
  org.apache.qetest.trax.ToXMLStreamTest \
  -inputDir tests/api -outputDir results-api -goldDir tests/api-gold/

java \
  -Djavax.xml.transform.TransformerFactory=org.apache.xalan.processor.TransformerFactoryImpl \
  -cp "../xalan-java/build/*:java/build/*:tools/*" \
  org.apache.qetest.trax.ToHTMLStreamTest \
  -inputDir tests/api -outputDir results-api -goldDir tests/api-gold/

java \
  -Djavax.xml.transform.TransformerFactory=org.apache.xalan.processor.TransformerFactoryImpl \
  -cp "../xalan-java/build/*:java/build/*:tools/*" \
  org.apache.qetest.trax.stream.StreamResultAPITest \
  -inputDir tests/api -outputDir results-api -goldDir tests/api-gold/{code}
or running them as part of the test framework:
{code:java}
./build.xx apitest{code}
should both run the new tests successfully.
{_}+*However*+{_}, I'm seeing some weirdness in that right now. When running
against what should be the most recent code, with the changes merged, *build
apitest* reports that these tests pass, whereas running them from the java
command line does not report them as passing.
I suspect this may be because the command line is using the default Xerces
rather than setting the parser factory; investigating. The other option would
be that *apitest* is failing to set TransformerFactory and is running with the
shadowed version that ships with Java, but my tests under a debugger suggested
that wasn't the case.
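One quick way to check both suspicions is to ask JAXP which implementations it actually resolves under a given classpath and set of system properties. The following is only a diagnostic sketch, not part of the xalan-test suite; the class name WhichFactories is made up for illustration, and it uses only the standard javax.xml factory lookups.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;

// Hypothetical helper (not in xalan-test): reports which JAXP
// implementations are picked up in the current JVM.
class WhichFactories {
    /** Returns the concrete class names of the resolved transformer, SAX, and DOM factories. */
    static String[] resolvedFactories() {
        return new String[] {
            TransformerFactory.newInstance().getClass().getName(),
            SAXParserFactory.newInstance().getClass().getName(),
            DocumentBuilderFactory.newInstance().getClass().getName()
        };
    }

    public static void main(String[] args) {
        String[] names = resolvedFactories();
        System.out.println("TransformerFactory:     " + names[0]);
        System.out.println("SAXParserFactory:       " + names[1]);
        System.out.println("DocumentBuilderFactory: " + names[2]);
    }
}
```

Running it with the same -D and -cp arguments as the test commands should show whether the JDK's built-in implementations or the ones from the build directories are being selected.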
> Possible buffer-boundary issue when serializing surrogate pairs
> ---------------------------------------------------------------
>
> Key: XALANJ-2725
> URL: https://issues.apache.org/jira/browse/XALANJ-2725
> Project: XalanJ2
> Issue Type: Improvement
> Security Level: No security risk; visible to anyone(Ordinary problems in
> Xalan projects. Anybody can view the issue.)
> Components: Serialization
> Reporter: Joe Kesselman
> Assignee: Joe Kesselman
> Priority: Major
> Labels: Surrogates, escaping, unicode, utf
> Attachments: astral-chars-split-buffer.patch
>
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> XALANJ-2419 addressed a case where "astral" Unicode characters, requiring a
> surrogate pair (two UTF-16 units), were not being serialized correctly. We
> have a proposed fix for that.
> There is reported to still be an edge case when a surrogate pair which
> crosses buffer boundaries might not be handled correctly. [~maxfortun]
> offered what looks like a reasonable proposed fix
> (https://github.com/maxfortun/xalan-j/blob/a9bd5591d9f8a523548aeec091e886b64c691628/src/org/apache/xml/serializer/ToStream.java#L1607),
> but in my testing this was not serializing the surrogate pairs correctly,
> causing regression on the tests XALANJ-2419 introduced. I don't know whether
> that's because we're taking multiple paths through
> But the edge case does appear to be real, and if so we will need some such
> solution.
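To illustrate the edge case described above, here is a minimal, self-contained sketch of how a fixed-size chunk boundary can land between the two UTF-16 units of a surrogate pair, and one way to avoid that by backing the cut up by one unit. This is not the ToStream code and not the proposed patch; the class name SurrogateChunks and the chunk method are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: not Xalan's serializer logic.
class SurrogateChunks {
    /**
     * Splits text into chunks of at most bufSize UTF-16 units, never cutting
     * between a high surrogate and the low surrogate that follows it.
     */
    static List<String> chunk(String text, int bufSize) {
        if (bufSize < 2) {
            throw new IllegalArgumentException("bufSize must be at least 2");
        }
        List<String> out = new ArrayList<>();
        int start = 0;
        while (start < text.length()) {
            int end = Math.min(start + bufSize, text.length());
            // If the cut would separate a surrogate pair, end the chunk one unit earlier.
            if (end < text.length()
                    && Character.isHighSurrogate(text.charAt(end - 1))
                    && Character.isLowSurrogate(text.charAt(end))) {
                end--;
            }
            out.add(text.substring(start, end));
            start = end;
        }
        return out;
    }

    public static void main(String[] args) {
        // U+1F600 needs two UTF-16 units, so this string is 6 units long.
        String text = "abc" + new String(Character.toChars(0x1F600)) + "d";
        for (String piece : chunk(text, 4)) {
            System.out.println(piece.length() + " units, "
                    + piece.codePointCount(0, piece.length()) + " code points");
        }
    }
}
```

With a buffer size of 4, a naive split would end the first chunk on the high surrogate; the check above moves the pair intact into the next chunk.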