[
https://issues.apache.org/jira/browse/XALANJ-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823775#comment-17823775
]
Lorenzo Dalla Vecchia commented on XALANJ-2613:
-----------------------------------------------
I have encountered this issue, too.
The target file is written, but all URL-unsafe characters are percent-encoded,
creating the wrong file name.
I have fixed the bug by adding URL decoding and rebuilding Xalan. Please see
the attached patch [^URL-encoding-fix.diff].
> TransformerIdentityImpl doesn't properly handle file URIs with
> percent-encoded Unicode characters
> -------------------------------------------------------------------------------------------------
>
> Key: XALANJ-2613
> URL: https://issues.apache.org/jira/browse/XALANJ-2613
> Project: XalanJ2
> Issue Type: Bug
> Security Level: No security risk; visible to anyone (Ordinary problems in
> Xalan projects. Anybody can view the issue.)
> Components: transformation
> Affects Versions: 2.7.2
> Environment: I tested on the following system:
> $ cat /etc/centos-release
> CentOS Linux release 7.4.1708 (Core)
> $ uname -a
> Linux jjmdeskvm.informatica.com 3.10.0-693.17.1.el7.x86_64 #1 SMP Thu Jan 25
> 20:13:58 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
> $ env | grep -E '^LANG'
> LANG=en_US.UTF-8
> $ env | grep -E '^LC'
> $
> Reporter: Joshua Maurice
> Assignee: Steven J. Hathaway
> Priority: Major
> Fix For: The Latest Development Code
>
> Attachments: Repro.java, URL-encoding-fix.diff, runtest.sh
>
>
> When using Xalan, and javax.xml.transform.Transformer, with a
> javax.xml.transform.stream.StreamResult constructed from a java.io.File
> object that contains Unicode characters, the Transformer will create an
> output file with the wrong file path.
> I have attached a minimal repro: a short Java file and a short bash script
> that compiles and runs the test and prints a few relevant environment
> details.
>
> The cause of the bug is this:
> When constructing a StreamResult object by passing a File object to the
> constructor, the StreamResult object saves a string representation of the URI
> object created from the File object. This string representation of the URI is
> properly formatted, which means that the individual path elements of the path
> of the URI are properly percent-encoded. The Xalan TransformerImpl class
> calls getSystemId on StreamResult to get this string representation of the
> URI, and it simply strips off the leading "file://" prefix, and uses the
> remainder to create a FileOutputStream object. However, the remainder of the
> string is the result of URI percent-encoding, and as such, it is not suitable
> for directly passing to FileOutputStream. Instead, the code here must use a
> URI utility to properly interpret the URI string, and to undo the
> percent-encoding, to obtain a string that is suitable for creating a
> FileOutputStream object.
> When the file path contains only ASCII characters, percent-encoding does
> nothing, which means that the code works with ASCII. However, as soon as any
> other Unicode character is part of the file path, then it breaks by writing
> to the wrong file path.
> Because the write to the wrong file path may silently succeed, this may
> have security implications.
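
For illustration, here is a minimal Java sketch of the difference described in
the issue text above. It is not the attached patch and not Xalan's actual
code: the class name SystemIdToPath, both method names, and the sample system
ID are hypothetical, and the exact form of the string returned by
StreamResult.getSystemId() may differ between JDK versions. The sketch
contrasts stripping the "file://" prefix with parsing the string through
java.net.URI, whose getPath() returns the path with percent-encoding undone.

```java
import java.net.URI;

public class SystemIdToPath {

    // Approach described in the report: drop the "file://" prefix and use
    // the remainder as a file name. Percent-escapes are left untouched.
    static String naivePath(String systemId) {
        return systemId.substring("file://".length());
    }

    // Alternative: parse the system ID as a URI. URI.getPath() decodes
    // percent-escaped octets (interpreted as UTF-8) in the path component.
    static String decodedPath(String systemId) {
        return URI.create(systemId).getPath();
    }

    public static void main(String[] args) {
        // Hypothetical system ID for a file named "caf\u00e9.xml" under /tmp.
        String systemId = "file:///tmp/caf%C3%A9.xml";

        // Keeps the escapes, so a file literally named "caf%C3%A9.xml"
        // would be created.
        System.out.println(naivePath(systemId));

        // Yields the intended name containing U+00E9.
        System.out.println(decodedPath(systemId));
    }
}
```

On platforms where the separator is not "/", constructing the file with
new java.io.File(URI) may be the better choice, since it also converts the
decoded path to the local path syntax.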