Re: Determining "file system encoding" from Python

Manuel Jacob Mon, 29 Jun 2020 04:37:03 -0700

On 2020-06-29 12:11, Manuel Jacob wrote:

Hi,


In a Python application, I want to convert a path (as native Unix
bytes) to a file URL (and later probably also convert other paths
between the "file system encoding" and UTF-8). There are functions for this in the
Subversion binding. However, for the sake of being able to deal with
the familiar Python exceptions, I’d like to do the decoding/encoding
in Python. For that, I need to find out the encoding that Subversion
uses for converting UTF-8 to the "file system encoding".

Subversion seems to use the encoding returned by
apr_os_locale_encoding(), which is however not exposed by the Python
bindings.

lib = ctypes.CDLL(libsvn._core.__file__)
lib.apr_os_locale_encoding.argtypes = [ctypes.c_void_p]
lib.apr_os_locale_encoding.restype = ctypes.c_char_p
with util.with_lc_ctype():

I forgot to mention what `with util.with_lc_ctype()` does. It calls
`setlocale(LC_CTYPE, '')` before the block and restores the previous
setting after the block. I put it around all calls to the Subversion
bindings to ensure that Subversion works correctly while locale-dependent
str methods on Python 2 stay unchanged.

    es = lib.apr_os_locale_encoding(int(svn.core.application_pool.this))

fsencoding = codecs.lookup(es).name
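
For reference, a helper with the behaviour of `util.with_lc_ctype()` could look roughly like the sketch below. This is not the actual code from my `util` module, just a minimal version that assumes nothing beyond the standard `locale` and `contextlib` modules:

```python
import contextlib
import locale


@contextlib.contextmanager
def with_lc_ctype():
    """Temporarily switch LC_CTYPE to the locale named by the environment."""
    # Passing no second argument only queries the current setting.
    saved = locale.setlocale(locale.LC_CTYPE)
    # An empty string selects the user's locale from the environment.
    # This raises locale.Error if the environment names an unknown locale.
    locale.setlocale(locale.LC_CTYPE, "")
    try:
        yield
    finally:
        # Restore the earlier setting even if the block raised.
        locale.setlocale(locale.LC_CTYPE, saved)
```

Restoring in a `finally` clause means an exception raised by a Subversion call does not leave the process in the changed locale.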

Is there an easier way? I could emulate what apr_os_locale_encoding()
does, which, on the systems supported by Python, is calling
nl_langinfo() and falling back to ISO-8859-1. Is it reasonable to
assume that this logic will stay? Or, asked differently, which approach
is least likely to stop yielding the "file system encoding": the
ctypes code, or using nl_langinfo() (falling back to ISO-8859-1)?
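
To make the second option concrete, here is a rough sketch of the emulation. The function name is made up for illustration, and it assumes that LC_CTYPE has already been set from the environment (for example inside the `with_lc_ctype()` block) and that APR keeps the logic described above:

```python
import codecs
import locale


def emulate_apr_os_locale_encoding():
    """Guess the encoding APR would report, as a normalized Python codec name."""
    # locale.nl_langinfo() and locale.CODESET are not available on every
    # platform (for example Windows), so look them up defensively.
    nl_langinfo = getattr(locale, "nl_langinfo", None)
    codeset = getattr(locale, "CODESET", None)
    name = ""
    if nl_langinfo is not None and codeset is not None:
        name = nl_langinfo(codeset)
    if not name:
        # Fallback described above for an empty or unavailable codeset.
        name = "ISO-8859-1"
    # Raises LookupError if Python has no codec for the reported name.
    return codecs.lookup(name).name
```

Unlike the ctypes version, this never asks APR itself, so it would silently go wrong if APR changed how it picks the encoding.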

Thanks,
Manuel
