[ https://issues.apache.org/jira/browse/SSHD-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17290857#comment-17290857 ]
Thomas Wolf commented on SSHD-1131:
-----------------------------------

That looks like a problem with the Tectia SFTP client. If you google, you'll find several mentions that it has problems with UTF-8, including a [mention from 2021|https://intra.kth.se/en/it/arbeta-pa-distans/unix/encoding-1.71788].

Per the [internet draft on SFTP|https://tools.ietf.org/html/draft-ietf-secsh-filexfer-13#page-15], file name encoding in SFTP is UTF-8 unless the server indicates otherwise. AFAIK an Apache MINA sshd server doesn't indicate otherwise, so the client is supposed to send UTF-8. According to the same IETF draft, "All clients MUST support UTF-8 filenames." Apparently Tectia doesn't.

In ISO-8859-1, "ØÆ¤" is 0xD8 0xC6 0xA4. If the client sends that and the server expects UTF-8, the server will use \uFFFD\u01A4 ("�Ƥ") as the file name. (0xD8 is a UTF-8 lead byte, but here it is not followed by a continuation byte, so it is invalid and gets replaced by the replacement character �; 0xC6 0xA4 happens to be a valid UTF-8 sequence for Ƥ.) When the server sends that file name back in a directory listing, it encodes it as UTF-8 again, sending the bytes 0xEF 0xBF 0xBD 0xC6 0xA4 (0xEF 0xBF 0xBD is UTF-8 for \uFFFD). The Tectia client again takes that as ISO-8859-1 and shows ï¿½Æ¤.

So, file a bug at ssh.com against the Tectia SFTP client.

BTW, ISO-8859-8 is Latin/Hebrew. ISO-8859-4 would be Latin/Nordic.

I don't think reconfiguring the server because of a broken client is a good idea. I also don't think reading received filenames should be made more lenient (for example, trying UTF-8 strictly and falling back to ISO-8859-1 if that fails); that would only hide such problems.

What _could_ be done in Apache MINA sshd is perhaps to decode UTF-8 strictly and fail the operation if the bytes of the filename are not valid UTF-8 (with an exception message saying something like "refusing to perform operation XYZ; buggy client did not send valid UTF-8 file name").
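The byte-level round trip described in this comment can be reproduced with plain JDK classes. The following is only a sketch: the class and helper names (FilenameEncodingDemo, decodeLenient, decodeStrict, hex, escape) are made up for illustration and are not part of Apache MINA sshd, and the strict variant merely shows what "converting UTF-8 strictly" might look like, not how the server actually decodes file names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; none of these names exist in Apache MINA sshd.
class FilenameEncodingDemo {

    /** Lenient decoding: malformed UTF-8 input is replaced by U+FFFD. */
    static String decodeLenient(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    /** Strict decoding: malformed UTF-8 input raises CharacterCodingException. */
    static String decodeStrict(byte[] raw) throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return decoder.decode(ByteBuffer.wrap(raw)).toString();
    }

    /** Formats bytes as space-separated upper-case hex, e.g. "D8 C6 A4". */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    /** Formats each UTF-16 code unit as a backslash-u escape, independent of console encoding. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            sb.append(String.format("\\u%04X", (int) c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "ØÆ¤", written with escapes so the source file encoding does not matter.
        String original = "\u00D8\u00C6\u00A4";

        // 1. The client encodes the name as ISO-8859-1 instead of UTF-8.
        byte[] sent = original.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println("sent:   " + hex(sent));

        // 2. The server decodes those bytes as UTF-8, leniently.
        String stored = decodeLenient(sent);
        System.out.println("stored: " + escape(stored));

        // 3. The server encodes the stored name as UTF-8 for a directory listing.
        byte[] listed = stored.getBytes(StandardCharsets.UTF_8);
        System.out.println("listed: " + hex(listed));

        // 4. The client decodes the listing as ISO-8859-1 again.
        String shown = new String(listed, StandardCharsets.ISO_8859_1);
        System.out.println("shown:  " + escape(shown));

        // A strict decoder would reject the name in step 2 instead.
        try {
            decodeStrict(sent);
            System.out.println("strict: accepted");
        } catch (CharacterCodingException e) {
            System.out.println("strict: rejected (" + e + ")");
        }
    }
}
```

With a strict decoder the failure surfaces at upload time, where the server could refuse the operation with a clear message, rather than later as a garbled directory listing.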
> Filenames containing Norwegian Characters not allowed over SFTP in Tectia client
> --------------------------------------------------------------------------------
>
>                 Key: SSHD-1131
>                 URL: https://issues.apache.org/jira/browse/SSHD-1131
>             Project: MINA SSHD
>          Issue Type: Bug
>            Reporter: Susmit Sarkar
>            Priority: Blocker
>              Labels: mina, sshd
>         Attachments: image-2021-02-24-15-38-39-337.png
>
>
> Hi,
> Filenames containing Norwegian, Spanish, French, and German Characters not allowed over SFTP in Tectia client, names are getting messed up
> How can I override the decoding charset in *SshServer* :
> DecodingCharset is present for SftpClient:
> client.setNameDecodingCharset(Charset.forName("ISO-8859-8"));
> by default it uses UTF-8
> The SftpClient exposes a get/setNameDecodingCharset method which enables the user to modify the charset - even while the SFTP session is in progress
> After the file is uploaded the Norwegian characters in the filename are replaced with non-readable characters.
> Can we configure the SshServer to override the decoding charset from the default UTF-8? Is there any API to set the SFTP CharsetEncoding (Set the default encoding for filenames in SFTP sessions)
> !image-2021-02-24-15-38-39-337.png!