[ 
https://issues.apache.org/jira/browse/SSHD-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17290857#comment-17290857
 ] 

Thomas Wolf commented on SSHD-1131:
-----------------------------------

That looks like a problem with that Tectia SFTP client. If you google, you'll 
find several mentions that it has problems with UTF-8, including a [mention 
from 2021|https://intra.kth.se/en/it/arbeta-pa-distans/unix/encoding-1.71788].

Per the [internet draft on 
SFTP|https://tools.ietf.org/html/draft-ietf-secsh-filexfer-13#page-15] file 
name encoding in SFTP is UTF-8 unless the server indicates otherwise. AFAIK an 
Apache MINA sshd server doesn't indicate otherwise, so the client is supposed 
to send UTF-8. According to the same IETF draft, "All clients MUST support 
UTF-8 filenames."

Apparently Tectia doesn't. In ISO-8859-1, "ØÆ¤" is 0xD8 0xC6 0xA4. If it sends 
that and the server expects this to be UTF-8, the server will use \uFFFD\u01A4 
("�Ƥ") as file name. (0xD8 starts a two-byte UTF-8 sequence, but 0xC6 is not a 
continuation byte, so 0xD8 is malformed and is replaced by the replacement 
character �; 0xC6 0xA4 happens to be a valid UTF-8 sequence for Ƥ). 
When the server sends that file name back in a directory listing, it'll encode 
this as UTF-8 again, sending the bytes 0xEF 0xBF 0xBD 0xC6 0xA4 (0xEF 0xBF 0xBD 
is UTF-8 for \uFFFD). The Tectia client again takes that as ISO-8859-1 and 
shows ï¿½Æ¤.
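The round trip described above can be reproduced with plain JDK classes. The following is only a minimal sketch, not code from Apache MINA sshd: the names MojibakeDemo, decodeUtf8Leniently, hex and escape are made up for illustration, and it assumes the server decodes leniently in the same way as new String(bytes, UTF_8), which substitutes \uFFFD for malformed input.

```java
import java.nio.charset.StandardCharsets;

final class MojibakeDemo {

    /** Lenient UTF-8 decoding: malformed bytes are replaced by U+FFFD. */
    static String decodeUtf8Leniently(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    /** Formats bytes as "0xD8 0xC6 ..." for display. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("0x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    /** Formats a string as \\uXXXX escapes so console encoding does not matter. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            sb.append(String.format("\\u%04X", (int) c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The file name from the report, written with escapes so that the
        // encoding of this source file is irrelevant.
        String original = "\u00D8\u00C6\u00A4";

        // Step 1: the client (wrongly) sends ISO-8859-1 bytes.
        byte[] sent = original.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println("sent:      " + hex(sent));      // 0xD8 0xC6 0xA4

        // Step 2: the server interprets those bytes as UTF-8.
        String onServer = decodeUtf8Leniently(sent);
        System.out.println("on server: " + escape(onServer));

        // Step 3: the server sends the name back, encoded as UTF-8.
        byte[] listed = onServer.getBytes(StandardCharsets.UTF_8);
        System.out.println("listed:    " + hex(listed));    // 0xEF 0xBF 0xBD 0xC6 0xA4

        // Step 4: the client interprets the listing as ISO-8859-1 again.
        String shown = new String(listed, StandardCharsets.ISO_8859_1);
        System.out.println("shown:     " + escape(shown));
    }
}
```

The original name is lost at step 2: once 0xD8 has been replaced by \uFFFD, no later re-encoding on either side can recover it.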

So, file a bug at ssh.com against the Tectia SFTP client.

BTW, ISO-8859-8 is Latin/Hebrew. ISO-8859-4 would be Latin-4 (North European).

I don't think reconfiguring the server because of a broken client is a good 
idea. I also don't think decoding of received file names should be made more 
lenient (for example, trying UTF-8 strictly and falling back to ISO-8859-1 if 
that fails); that would only hide such problems. What _could_ be done in 
Apache MINA sshd is perhaps 
converting UTF-8 strictly and failing the operation if the bytes for the 
filename are not valid UTF-8. (With an exception message saying something like 
"refusing to perform operation XYZ; buggy client did not send valid UTF-8 file 
name".)
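As a rough illustration of what such strict conversion could look like, here is a sketch using the JDK's CharsetDecoder with CodingErrorAction.REPORT. It is not existing Apache MINA sshd code: the class StrictUtf8Names, the method decodeFileName and its operation parameter are hypothetical, and where such a check would hook into the server is left open.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class StrictUtf8Names {

    /**
     * Decodes a file name strictly as UTF-8. Instead of silently inserting
     * U+FFFD for malformed input, the decoder reports the error and the
     * operation is refused.
     *
     * @param raw       file name bytes as received from the client
     * @param operation hypothetical label for the operation, used in the message
     */
    static String decodeFileName(byte[] raw, String operation) throws IOException {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(raw))
                    .toString();
        } catch (CharacterCodingException e) {
            throw new IOException("refusing to perform operation " + operation
                    + "; client did not send valid UTF-8 file name", e);
        }
    }

    public static void main(String[] args) {
        // ISO-8859-1 bytes for the name from the report: not valid UTF-8.
        byte[] fromBrokenClient = {(byte) 0xD8, (byte) 0xC6, (byte) 0xA4};
        try {
            decodeFileName(fromBrokenClient, "open");
            System.out.println("accepted");
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With this approach a misbehaving client gets an explicit failure at upload time rather than a silently mangled file name on the server.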

> Filenames containing Norwegian Characters not allowed over SFTP in Tectia 
> client
> --------------------------------------------------------------------------------
>
>                 Key: SSHD-1131
>                 URL: https://issues.apache.org/jira/browse/SSHD-1131
>             Project: MINA SSHD
>          Issue Type: Bug
>            Reporter: Susmit Sarkar
>            Priority: Blocker
>              Labels: mina, sshd
>         Attachments: image-2021-02-24-15-38-39-337.png
>
>
> Hi,
> Filenames containing Norwegian, Spanish, French, and German Characters not 
> allowed over SFTP in Tectia client, names are getting messed up
> How can I override the decoding charset in *SshServer* :
> DecodingCharset is present for SftpClient:
> client.setNameDecodingCharset(Charset.forName("ISO-8859-8"));
> by default it uses UTF-8
> The SftpClient exposes a get/setNameDecodingCharset method which enables the 
> user to modify the charset - even while the SFTP session is in progress 
> After the file is uploaded the Norwegian characters in the filename are 
> replaced with non-readable characters.
> Can we configure the SshServer to override the decoding charset from the 
> default UTF-8? Is there any API to set the SFTP CharsetEncoding (Set the 
> default encoding for filenames in SFTP sessions)
> !image-2021-02-24-15-38-39-337.png!



