Hi,

I finally managed to narrow down the problem, but I am not sure of the
correct way to solve it…

Let's say I run the following in lftp:
```
mirror --dry-run --delete -c --use-pget-n=6 --parallel=4 photos/gallery/2014/2014-10\ Octobre /tmp
```

It prints:
```
pget -n 6 -O "/tmp/(2014-10-15) Ã%89cocups" ftp://…
```

The problem comes from the partly URL-encoded string. There is a Ã
character which is left as is instead of being URL-encoded as %C3. When I
try to decode it with Python, I get something like "\xc3%89", which is not
a valid UTF-8 sequence (it should be \xc3\x89).
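
A minimal sketch of a possible workaround in Python, assuming the output was read as latin-1 (so that Ã stands for the raw byte 0xC3) and that the intended name is UTF-8. The helper name `decode_partial_percent` is hypothetical; it is not part of lftp or of the script discussed in this thread.

```python
from urllib.parse import unquote_to_bytes


def decode_partial_percent(name: str) -> str:
    """Decode a name where some UTF-8 bytes are raw and others are %XX-escaped.

    Assumption: `name` was decoded as latin-1, so each character maps
    one-to-one to an original byte.
    """
    raw = name.encode("latin-1")       # back to the original bytes, e.g. b"\xc3%89"
    unescaped = unquote_to_bytes(raw)  # %89 -> b"\x89"; raw bytes are left untouched
    return unescaped.decode("utf-8")   # b"\xc3\x89" -> "É"


# "\u00c3" is the Ã character from the lftp output above.
print(decode_partial_percent("(2014-10-15) \u00c3%89cocups"))
# prints: (2014-10-15) Écocups
```

This only applies when the string really was read as latin-1: if it contains characters outside that range, `encode("latin-1")` raises `UnicodeEncodeError`.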

I guess I could work around it in my script, but I feel that this is
either a bug or a misuse of lftp on my side.

Thanks!


On 09/01/2016 01:57, Phyks wrote:
> I am now fairly sure the problem comes either from the lftp side or
> from the FTP server itself.
> 
> Actually, I ran the same script as before on two different folders.
> On one folder, it works as is, and the output is proper UTF-8 (as it
> used to be, and this folder does contain some accented characters).
> On the other one (the one I was using when I started this thread),
> the output has to be parsed as latin-1.
> 
> Both folders are on the same FTP server, and launching lftp with
> "debug 10" shows that it apparently sends the command telling the FTP
> server to use UTF-8 (which seems to be confirmed by the `ftp:charset`
> and `file:charset` options).
> 
> I do not have any problems when using FileZilla or lftp directly,
> I guess because they (or the shell) handle the encoding conversion.
> 
> Thanks
> 
> On 07/01/2016 20:47, Alexander Lukyanov wrote:
>> Can you check the locale under subprocess.run? Can the subprocess itself
>> recode the output?
>>
>> Thu, 7 Jan 2016, 21:52, Phyks <webmas...@phyks.me>:
>>
>>> I did try to `set file:charset UTF-8` without any success.
>>>
>>> From the shell from which I run my Python script:
>>>
>>> ```
>>> % locale charmap
>>> UTF-8
>>>
>>> % locale
>>> LANG=fr_FR.UTF-8
>>> LC_CTYPE="fr_FR.UTF-8"
>>> LC_NUMERIC="fr_FR.UTF-8"
>>> LC_TIME="fr_FR.UTF-8"
>>> LC_COLLATE="fr_FR.UTF-8"
>>> LC_MONETARY="fr_FR.UTF-8"
>>> LC_MESSAGES="fr_FR.UTF-8"
>>> LC_PAPER="fr_FR.UTF-8"
>>> LC_NAME="fr_FR.UTF-8"
>>> LC_ADDRESS="fr_FR.UTF-8"
>>> LC_TELEPHONE="fr_FR.UTF-8"
>>> LC_MEASUREMENT="fr_FR.UTF-8"
>>> LC_IDENTIFICATION="fr_FR.UTF-8"
>>> LC_ALL=
>>> ```
>>>
>>> Which should be ok AFAIK, no?
>>>
>>> In case it can be useful to anyone, here is the full script:
>>> https://gist.github.com/Phyks/4e4c65fcd12d600374a7. The problem is with
>>> the call at line 33.
>>>
>>>
>>> On 07/01/2016 19:42, Alexander Lukyanov wrote:
>>>> Probably you run lftp with a latin-1 locale. Try to set
>>>> LC_ALL=en_US.UTF-8 (for example). Alternatively you can force a local
>>>> character set in lftp by "set file:charset UTF-8".
>>>>
>>>> 2016-01-07 20:04 GMT+03:00 Phyks <webmas...@phyks.me>:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am using lftp from a Python script, calling it through
>>>>> `subprocess.run`. Everything should be UTF-8 (shell, lftp is sending
>>>>> OPTS UTF-8 to the server) and my Python script uses UTF-8, sends
>>>>> UTF-8 to lftp on stdin and expects UTF-8 on its stdout.
>>>>>
>>>>> But for some unknown reason, it seems that lftp is returning latin-1
>>>>> encoding, instead of UTF-8, for the directory listings.
>>>>>
>>>>> Are you aware of such a problem? I did not find any way to force lftp to
>>>>> use UTF-8.
>>>>>
>>>>> Thanks!
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> lftp mailing list
>>>>> lftp@uniyar.ac.ru
>>>>> http://univ.uniyar.ac.ru/mailman/listinfo/lftp
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>>    Alexander.
>>>>
>>>
>>
