I am now fairly sure the problem comes from either lftp or the FTP server itself.

I ran the same script as before on two different folders. On one folder it works as is, and the output is valid UTF-8 (as it used to be, and this folder does contain accented characters). On the other one (the one I was using when I started this thread), the output has to be decoded as latin-1.

Both folders are on the same FTP server, and launching lftp with "debug 10" shows that it apparently sends the command telling the FTP server to use UTF-8 (which seems to be confirmed by the `ftp:charset` and `file:charset` options).

I do not have any problems when using Filezilla or lftp directly, I guess because they (or the shell) handle the encoding conversion.

Thanks

On 07/01/2016 20:47, Alexander Lukyanov wrote:
> Can you check the locale under subprocess.run? Can the subprocess itself
> recode the output?
>
> Thu, 7 Jan 2016, 21:52, Phyks <webmas...@phyks.me>:
>
>> I did try to `set file:charset UTF-8` without any success.
>>
>> From the shell from which I run my Python script:
>>
>> ```
>> % locale charmap
>> UTF-8
>>
>> % locale
>> LANG=fr_FR.UTF-8
>> LC_CTYPE="fr_FR.UTF-8"
>> LC_NUMERIC="fr_FR.UTF-8"
>> LC_TIME="fr_FR.UTF-8"
>> LC_COLLATE="fr_FR.UTF-8"
>> LC_MONETARY="fr_FR.UTF-8"
>> LC_MESSAGES="fr_FR.UTF-8"
>> LC_PAPER="fr_FR.UTF-8"
>> LC_NAME="fr_FR.UTF-8"
>> LC_ADDRESS="fr_FR.UTF-8"
>> LC_TELEPHONE="fr_FR.UTF-8"
>> LC_MEASUREMENT="fr_FR.UTF-8"
>> LC_IDENTIFICATION="fr_FR.UTF-8"
>> LC_ALL=
>> ```
>>
>> Which should be ok AFAIK, no?
>>
>> In case it can be useful for anyone, here is the full script:
>> https://gist.github.com/Phyks/4e4c65fcd12d600374a7. Problem is with the
>> call at line 33.
>>
>>
>> On 07/01/2016 19:42, Alexander Lukyanov wrote:
>>> Probably you run lftp with a latin-1 locale. Try to set LC_ALL=en_US.UTF-8
>>> (for example). Alternatively you can force a local character set in lftp by
>>> "set file:charset UTF-8".
>>>
>>> 2016-01-07 20:04 GMT+03:00 Phyks <webmas...@phyks.me>:
>>>
>>>> Hi,
>>>>
>>>> I am using lftp from a Python script, calling it through
>>>> `subprocess.run`. Everything should be UTF-8 (shell, lftp is sending
>>>> OPTS UTF-8 to the server) and my Python script is using UTF-8, sends
>>>> UTF-8 to lftp on the stdin and expect UTF-8 on its stdout.
>>>>
>>>> But for some unknown reason, it seems that lftp is returning latin-1
>>>> encoding, instead of UTF-8, for the directory listings.
>>>>
>>>> Are you aware of such a problem? I did not find any way to force lftp to
>>>> use UTF-8.
>>>>
>>>> Thanks!
>>>>
>>>>
>>>> _______________________________________________
>>>> lftp mailing list
>>>> lftp@uniyar.ac.ru
>>>> http://univ.uniyar.ac.ru/mailman/listinfo/lftp
>>>>
>>>>
>>>
>>>
>>> --
>>> Alexander.
>>>
>>
>> _______________________________________________
>> lftp mailing list
>> lftp@uniyar.ac.ru
>> http://univ.uniyar.ac.ru/mailman/listinfo/lftp
>>
>
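
For anyone hitting the same mixed-encoding listings, here is a rough sketch of the two workarounds discussed in this thread: forcing a UTF-8 locale for the lftp subprocess, and decoding its output as UTF-8 with a latin-1 fallback. The helper names (`decode_listing`, `utf8_env`, `run_lftp`) are made up for illustration and are not taken from the gist, and it assumes the `en_US.UTF-8` locale is installed on the machine.

```python
import os
import subprocess


def decode_listing(raw: bytes) -> str:
    """Decode lftp output, preferring UTF-8 and falling back to latin-1."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # latin-1 maps every byte to a code point, so this never raises.
        return raw.decode("latin-1")


def utf8_env(locale_name: str = "en_US.UTF-8") -> dict:
    """Copy of the current environment with LC_ALL forced to a UTF-8 locale.

    Assumption: locale_name is actually available on the system.
    """
    env = dict(os.environ)
    env["LC_ALL"] = locale_name
    return env


def run_lftp(commands: str) -> str:
    """Hypothetical wrapper: send commands to lftp on stdin, decode stdout."""
    result = subprocess.run(
        ["lftp"],
        input=commands.encode("utf-8"),
        stdout=subprocess.PIPE,
        env=utf8_env(),
    )
    return decode_listing(result.stdout)
```

The fallback is only a heuristic: a latin-1 byte sequence that happens to be valid UTF-8 would be decoded as UTF-8, although that is unlikely for ordinary accented file names.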