Hi, I finally managed to narrow down the problem, but I am not sure of the correct way to solve it…
Let's say I am running (in lftp):

```
mirror --dry-run --delete -c --use-pget-n=6 --parallel=4 photos/gallery/2014/2014-10\ Octobre /tmp
```

It returns:

```
pget -n 6 -O "/tmp/(2014-10-15) Ã%89cocups" ftp://…
```

The problem comes from the partly URL-encoded string. The Ã character (byte 0xC3) is left as-is instead of being URL-encoded as %C3, while the next byte is encoded as %89. When I try to decode it with Python, I get something like "\xc3%89", which is not a valid UTF-8 sequence (it should be \xc3\x89).

I guess I could work around it on my side, but I feel that this is either a bug in lftp or a misuse of lftp on my part.

Thanks!

On 09/01/2016 01:57, Phyks wrote:
> I am now pretty sure the problem comes either from the lftp side or
> from the FTP server itself.
>
> I ran the same script as before on two different folders. On one
> folder it works as is, and the output is valid UTF-8 (as it used to
> be, and this folder does contain accented characters). On the other
> one (the one I was using when I started this thread), the output has
> to be parsed as latin-1.
>
> Both folders are on the same FTP server, and launching lftp with
> "debug 10" shows that it apparently sends the command telling the FTP
> server to use UTF-8 (which seems to be confirmed by the `ftp:charset`
> and `file:charset` options).
>
> I do not have any problem when using FileZilla or lftp directly, I
> guess because they (or the shell) handle the encoding conversion.
>
> Thanks
>
> On 07/01/2016 20:47, Alexander Lukyanov wrote:
>> Can you check the locale under subprocess.run? Can the subprocess itself
>> recode the output?
>>
>> Thu, 7 Jan 2016, 21:52, Phyks <webmas...@phyks.me>:
>>
>>> I did try to `set file:charset UTF-8` without any success.
>>>
>>> From the shell from which I run my Python script:
>>>
>>> ```
>>> % locale charmap
>>> UTF-8
>>>
>>> % locale
>>> LANG=fr_FR.UTF-8
>>> LC_CTYPE="fr_FR.UTF-8"
>>> LC_NUMERIC="fr_FR.UTF-8"
>>> LC_TIME="fr_FR.UTF-8"
>>> LC_COLLATE="fr_FR.UTF-8"
>>> LC_MONETARY="fr_FR.UTF-8"
>>> LC_MESSAGES="fr_FR.UTF-8"
>>> LC_PAPER="fr_FR.UTF-8"
>>> LC_NAME="fr_FR.UTF-8"
>>> LC_ADDRESS="fr_FR.UTF-8"
>>> LC_TELEPHONE="fr_FR.UTF-8"
>>> LC_MEASUREMENT="fr_FR.UTF-8"
>>> LC_IDENTIFICATION="fr_FR.UTF-8"
>>> LC_ALL=
>>> ```
>>>
>>> Which should be OK AFAIK, no?
>>>
>>> In case it can be useful for anyone, here is the full script:
>>> https://gist.github.com/Phyks/4e4c65fcd12d600374a7. The problem is
>>> with the call at line 33.
>>>
>>> On 07/01/2016 19:42, Alexander Lukyanov wrote:
>>>> Probably you run lftp with a latin-1 locale. Try to set
>>>> LC_ALL=en_US.UTF-8 (for example). Alternatively you can force a
>>>> local character set in lftp with "set file:charset UTF-8".
>>>>
>>>> 2016-01-07 20:04 GMT+03:00 Phyks <webmas...@phyks.me>:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am using lftp from a Python script, calling it through
>>>>> `subprocess.run`. Everything should be UTF-8 (the shell is, and
>>>>> lftp sends OPTS UTF-8 to the server): my Python script uses UTF-8,
>>>>> sends UTF-8 to lftp on its stdin and expects UTF-8 on its stdout.
>>>>>
>>>>> But for some unknown reason, it seems that lftp is returning
>>>>> latin-1 instead of UTF-8 for the directory listings.
>>>>>
>>>>> Are you aware of such a problem? I did not find any way to force
>>>>> lftp to use UTF-8.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> _______________________________________________
>>>>> lftp mailing list
>>>>> lftp@uniyar.ac.ru
>>>>> http://univ.uniyar.ac.ru/mailman/listinfo/lftp
>>>>
>>>> --
>>>> Alexander.
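
For reference, a minimal sketch of the kind of Python-side workaround mentioned at the top of this message. It assumes that lftp printed the raw byte 0xC3 as the Latin-1 character Ã and left the next byte percent-encoded as %89; the function name `decode_partly_quoted` is hypothetical and is not part of lftp or of the script linked in the thread. It does not explain why lftp produces such output, it only recovers the intended name:

```python
from urllib.parse import unquote_to_bytes


def decode_partly_quoted(name: str) -> str:
    """Recover a UTF-8 name from a partly percent-encoded string.

    Assumption: every non-ASCII character in `name` is really a raw byte
    shown as Latin-1 (e.g. 'Ã' stands for byte 0xC3). Characters outside
    Latin-1 would raise UnicodeEncodeError here.
    """
    # Map each character back to the single byte it stands for.
    raw = name.encode("latin-1")
    # Expand the %XX escapes at the byte level, then decode the whole
    # byte string as UTF-8.
    return unquote_to_bytes(raw).decode("utf-8")


print(decode_partly_quoted("(2014-10-15) Ã%89cocups"))
# prints: (2014-10-15) Écocups
```

Passing the `str` directly to `unquote_to_bytes` would not work here, because it encodes non-ASCII characters as UTF-8 first and would turn Ã into two bytes; that is why the sketch goes through Latin-1 bytes.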