[
https://issues.apache.org/jira/browse/SSHD-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17290857#comment-17290857
]
Thomas Wolf commented on SSHD-1131:
-----------------------------------
That looks like a problem with that Tectia SFTP client. If you google, you'll
find several mentions that it has problems with UTF-8, including a [mention
from 2021|https://intra.kth.se/en/it/arbeta-pa-distans/unix/encoding-1.71788].
Per the [internet draft on
SFTP|https://tools.ietf.org/html/draft-ietf-secsh-filexfer-13#page-15], file
name encoding in SFTP is UTF-8 unless the server indicates otherwise. AFAIK an
Apache MINA sshd server doesn't indicate otherwise, so the client is supposed
to send UTF-8. According to the same IETF draft, "All clients MUST support
UTF-8 filenames."
Apparently Tectia doesn't. In ISO-8859-1, "ØÆ¤" is 0xD8 0xC6 0xA4. If it sends
that and the server expects this to be UTF-8, the server will use \uFFFD\u01A4
("�Ƥ") as the file name. (0xD8 would start a two-byte UTF-8 sequence, but the
next byte, 0xC6, is not a continuation byte, so 0xD8 is replaced by the
replacement character �; 0xC6 0xA4 happens to be a valid UTF-8 sequence for Ƥ.)
When the server sends that file name back in a directory listing, it'll encode
this as UTF-8 again, sending the bytes 0xEF 0xBF 0xBD 0xC6 0xA4 (0xEF 0xBF 0xBD
is UTF-8 for \uFFFD). The Tectia client again takes that as ISO-8859-1, so
those five bytes would be displayed as "ï¿½Æ¤".
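To make the byte-level steps above easy to reproduce, here is a minimal sketch in plain Java. It uses only the JDK's charset support and is not Apache MINA sshd code; the class and method names are made up for illustration, and it assumes the server decodes leniently (malformed input replaced by U+FFFD), as described above.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Apache MINA sshd.
final class FilenameRoundTrip {

    // Formats bytes as e.g. "0xD8 0xC6 0xA4".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("0x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // What a client puts on the wire if it encodes the name as ISO-8859-1.
    static byte[] clientSends(String name) {
        return name.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Lenient UTF-8 decoding: malformed input is replaced by U+FFFD.
    static String serverDecodes(byte[] wire) {
        return new String(wire, StandardCharsets.UTF_8);
    }

    // The stored name is encoded as UTF-8 again for a directory listing.
    static byte[] serverSendsBack(String stored) {
        return stored.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // U+00D8 U+00C6 U+00A4: the three characters from the example.
        byte[] wire = clientSends("\u00D8\u00C6\u00A4");
        String stored = serverDecodes(wire);
        byte[] listing = serverSendsBack(stored);

        System.out.println(hex(wire));                      // 0xD8 0xC6 0xA4
        System.out.println(stored.equals("\uFFFD\u01A4"));  // true
        System.out.println(hex(listing));                   // 0xEF 0xBF 0xBD 0xC6 0xA4
    }
}
```

The original three characters cannot be recovered from the stored name: the first byte was already replaced by U+FFFD when the server decoded it.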
So, file a bug at ssh.com against the Tectia SFTP client.
BTW, ISO-8859-8 is Latin/Hebrew. ISO-8859-4 would be Latin/Nordic.
I don't think reconfiguring the server because of a broken client is a good
idea. I also don't think decoding of received filenames should be made more lenient
(like, try UTF-8 strictly and, if it fails, fall back to ISO-8859-1); that would
only hide such problems. What _could_ be done in Apache MINA sshd is perhaps
converting UTF-8 strictly and failing the operation if the bytes for the
filename are not valid UTF-8. (With an exception message saying something like
"refusing to perform operation XYZ; buggy client did not send valid UTF-8 file
name".)
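As a rough sketch of what "converting UTF-8 strictly" could look like, the JDK's CharsetDecoder can be told to report malformed input instead of replacing it. This is only an illustration under that assumption, not how Apache MINA sshd currently decodes file names; the class name, method names, and the operation name in the message are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of Apache MINA sshd.
final class StrictUtf8 {

    // Decodes raw bytes as UTF-8 and throws instead of substituting U+FFFD.
    static String decodeStrict(byte[] raw) throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(raw))
                .toString();
    }

    // Convenience check without a checked exception.
    static boolean isValidUtf8(byte[] raw) {
        try {
            decodeStrict(raw);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The ISO-8859-1 bytes from the example; not valid UTF-8.
        byte[] fromClient = {(byte) 0xD8, (byte) 0xC6, (byte) 0xA4};
        try {
            System.out.println("file name: " + decodeStrict(fromClient));
        } catch (CharacterCodingException e) {
            // "XYZ" stands in for whatever operation was requested.
            System.out.println("refusing to perform operation XYZ; "
                    + "client did not send a valid UTF-8 file name");
        }
    }
}
```

With the bytes from the example, the decoder reports the malformed input and the operation can be rejected, rather than silently creating a file under a different name.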
> Filenames containing Norwegian Characters not allowed over SFTP in Tectia
> client
> --------------------------------------------------------------------------------
>
> Key: SSHD-1131
> URL: https://issues.apache.org/jira/browse/SSHD-1131
> Project: MINA SSHD
> Issue Type: Bug
> Reporter: Susmit Sarkar
> Priority: Blocker
> Labels: mina, sshd
> Attachments: image-2021-02-24-15-38-39-337.png
>
>
> Hi,
> Filenames containing Norwegian, Spanish, French, and German Characters not
> allowed over SFTP in Tectia client, names are getting messed up
> How can I override the decoding charset in *SshServer* :
> DecodingCharset is present for SftpClient:
> client.setNameDecodingCharset(Charset.forName("ISO-8859-8"));
> by default it uses UTF-8
> The SftpClient exposes a get/setNameDecodingCharset method which enables the
> user to modify the charset - even while the SFTP session is in progress
> After the file is uploaded the Norwegian characters in the filename are
> replaced with non-readable characters.
> Can we configure the SshServer to override the decoding charset from the
> default UTF-8? Is there any API to set the SFTP CharsetEncoding (Set the
> default encoding for filenames in SFTP sessions)
> !image-2021-02-24-15-38-39-337.png!