It depends a little on whether you are using embedded mode or daemon mode
of mod_wsgi, or whether you are using mod_wsgi-express.

When you are not using mod_wsgi-express, the Python interpreter embedded in
Apache should by default inherit the system default locale. From memory,
this is often the C or POSIX locale rather than any variant of UTF-8,
because Linux distros don't necessarily do sane things, although this may
have changed.
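
If you want to see what the processes actually inherited, a small WSGI
application along these lines can report it. This is only a sketch: the
describe_locale() helper is a name made up for illustration, and the output
will differ between embedded mode, daemon mode and mod_wsgi-express.

```python
import locale
import sys


def describe_locale():
    """Collect the locale-related settings of the current process."""
    return {
        # Passing None queries the current setting without changing it.
        'LC_CTYPE': locale.setlocale(locale.LC_CTYPE, None),
        # Encoding Python uses by default when opening files in text mode.
        'preferred_encoding': locale.getpreferredencoding(False),
        # Encoding used for file names.
        'filesystem_encoding': sys.getfilesystemencoding(),
    }


def application(environ, start_response):
    """Minimal WSGI application that returns the settings as plain text."""
    lines = ['%s: %s' % (key, value) for key, value in describe_locale().items()]
    data = ('\n'.join(lines) + '\n').encode('utf-8')
    start_response('200 OK', [
        ('Content-Type', 'text/plain; charset=utf-8'),
        ('Content-Length', str(len(data))),
    ])
    return [data]
```

Loading this as the WSGI script file under each mode should show whether
you are ending up with C/POSIX or a UTF-8 locale.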

Whatever language/locale Apache calculates for a specific HTTP request
based on its own rules makes no difference.

If you are using daemon mode of mod_wsgi you can use the lang/locale option
to the WSGIDaemonProcess directive to explicitly set it for those processes.

https://modwsgi.readthedocs.io/en/master/configuration-directives/WSGIDaemonProcess.html#lang
https://modwsgi.readthedocs.io/en/master/configuration-directives/WSGIDaemonProcess.html#locale

I can't remember whether there is an easy way of overriding it for embedded
mode, other than setting it in systemd or whatever other startup files
start Apache. I don't think there is, so it is governed by what the Apache
process inherits from the system. You can possibly use Python functions to
change it after the process has started, but that may be too late for stuff
which is already imported.
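
As a rough sketch of the "use Python functions" route, something like the
following could go at the very top of the WSGI script file. The
try_set_locale() helper and the candidate list are made up for
illustration. locale.setlocale() changes the setting for the whole process
and is not thread safe, and values Python worked out at interpreter startup
may not follow the change.

```python
import locale


def try_set_locale(candidates, category=locale.LC_ALL):
    """Apply the first locale in candidates that the system accepts.

    Returns the name that was applied, or None if none of them worked.
    """
    for name in candidates:
        try:
            locale.setlocale(category, name)
        except locale.Error:
            # This locale is not available on the system, try the next one.
            continue
        return name
    return None


# Hypothetical usage, before anything else is imported:
#   try_set_locale(['en_US.UTF-8', 'C.UTF-8'])
```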

If you are using mod_wsgi-express, it tries to set things itself to a sane
value if not set by the --locale command line option.

Bit of a description about it in:

https://github.com/GrahamDumpleton/mod_wsgi/blob/f54eadd6da8e3da0faccd497d4165de435b97242/docs/release-notes/version-4.4.3.rst#features-changed

*  The behaviour of the --locale option to mod_wsgi-express has changed.
Previously if this option was not defined, then both of the locales
en_US.UTF-8 and C.UTF-8 have at times been hardwired as the default locale.
These locales are though not always present. As a consequence, a new
algorithm is now used.  If the --locale option is supplied, the argument
will be used as the locale. If no argument is supplied, the default locale
for the executing mod_wsgi-express process will be used. If that however is
C or POSIX, then an attempt will be made to use either the en_US.UTF-8 or
C.UTF-8 locales and if that is not possible only then fallback to the
default locale of the mod_wsgi-express process.  In other words, unless you
override the default language locale, an attempt is made to use an English
language locale with UTF-8 encoding.*

So the wisest thing to do if you have a special requirement is to set the
--locale option.
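
The selection rules quoted from the release notes can be sketched roughly
as below. This is not the actual mod_wsgi-express code; choose_locale(),
its parameters and the order in which the two fallbacks are tried are
assumptions made for illustration.

```python
# Fallbacks named in the release notes; the order tried here is an assumption.
FALLBACK_LOCALES = ('en_US.UTF-8', 'C.UTF-8')


def choose_locale(option, process_default, is_available):
    """Pick a locale following the rules described in the release notes.

    option          -- value given to --locale, or None if not supplied
    process_default -- default locale of the mod_wsgi-express process
    is_available    -- callable reporting whether a locale can be used
    """
    if option:
        # An explicit --locale always wins.
        return option
    if process_default not in ('C', 'POSIX'):
        # A real default locale is used as is.
        return process_default
    for candidate in FALLBACK_LOCALES:
        if is_available(candidate):
            return candidate
    # Neither fallback exists, so stay with C or POSIX.
    return process_default
```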

If you force mod_wsgi-express into embedded mode though, it possibly just
inherits whatever the parent shell is using again. I can't remember whether
mod_wsgi-express also tries to set it in the parent process so that it is
inherited by the child processes.

As to the initial WSGI script file, it is not a module import and so any
special language encoding definition in a magic header of the file is
ignored and it should just use whatever the Python lang/locale is set to.

If you need such a thing to be honoured then don't put your real code in
the WSGI script file and instead hold your project code in a distinct
Python package structure and import modules from it in the WSGI script file.
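
To illustrate the difference, here is a small standalone sketch. It
assumes, going by the traceback in your message, that the script is read in
text mode as UTF-8 and then passed to compile(). The file name and helper
names are made up. The same Latin-1 file fails when read that way but loads
fine when it goes through the normal import machinery, which does honour
the coding comment.

```python
import importlib.util
import os
import tempfile

# Latin-1 encoded source with a PEP 263 coding comment. The byte sequence
# \xe2\x28\xa1 is not valid UTF-8.
SOURCE = b"# coding: latin1\nvalue = '\xe2\x28\xa1'\n"


def load_as_text(path):
    """Decode the file as UTF-8 first, so the coding comment is never used."""
    with open(path, encoding='utf-8') as fp:
        code = compile(fp.read(), path, 'exec')
    namespace = {}
    exec(code, namespace)
    return namespace['value']


def load_as_module(path, name='latin1_demo'):
    """Load the file through importlib, which reads bytes and honours PEP 263."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module.value


def demo():
    """Return the outcome of both loading strategies for the same file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'latin1_demo.py')
        with open(path, 'wb') as fp:
            fp.write(SOURCE)
        try:
            load_as_text(path)
            text_result = 'decoded'
        except UnicodeDecodeError:
            text_result = 'UnicodeDecodeError'
        return text_result, load_as_module(path)
```

So if the WSGI script file itself only contains ASCII and an import of the
real application module, the encoding of that module is handled by Python
in the usual way.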

Not sure if this answers your question or not. My memory is very murky
about some of this stuff, especially what happens in embedded mode.

Graham

On Fri, 22 Mar 2024 at 12:31, Lucas Thode <thode...@gmail.com> wrote:

> What determines which encoding mod_wsgi uses when it reads WSGI scripts:
> Apache's configured locale (which for me is en_us.UTF8), or something
> else?  (I ask about this because mod_wsgi appears to do low-level manual
> hackery when reading wsgi script files instead of going through importlib
> or runpy, which means that it can't handle a zipapp or even something that
> uses a PEP 263 magic comment to convey encoding information, the latter
> making it impossible to "wrap" a zipapp with a loader shim even unless
> something else gives.)
>
> Minimized example (works when you run it using python3 breaks.py, breaks
> with the errors below if you try to load it using `mod_wsgi-express
> start-server breaks.py` using a mod_wsgi-express installed into a venv with
> pip install), note that you will have to save breaks.py as
> latin1/iso-8859-1 to cause this to break):
>
> $ cat breaks.py
> # coding: latin1
> import sys
> from wsgiref.simple_server import make_server
>
> def application(environ, start_response):
>     start_response('200 OK', [('Content-Type', 'text/plain')])
>     message = 'It works!\n'
>     version = 'Python v' + sys.version.split()[0] + '\n'
>     response = '\n'.join([message, version])
>     return [response.encode()]
>
> def main():
>     with make_server('', 8100, application) as httpd:
>         httpd.serve_forever()
>
> blow_up_unicode = 'â(¡' # \xe2\x28\xa1
>
> if __name__ == '__main__':
>     main()
>
> Errors it generates when run under mod_wsgi-express:
> [Thu Mar 21 20:12:56.942439 2024] [wsgi:error] [pid 3288289:tid 140356515776384] mod_wsgi (pid=3288289): Failed to exec Python script file '/tmp/mod_wsgi-localhost:8000:1000/handler.wsgi'.
> [Thu Mar 21 20:12:56.942486 2024] [wsgi:error] [pid 3288289:tid 140356515776384] mod_wsgi (pid=3288289): Exception occurred processing WSGI script '/tmp/mod_wsgi-localhost:8000:1000/handler.wsgi'.
> [Thu Mar 21 20:12:56.943223 2024] [wsgi:error] [pid 3288289:tid 140356515776384] Traceback (most recent call last):
> [Thu Mar 21 20:12:56.943329 2024] [wsgi:error] [pid 3288289:tid 140356515776384]   File "/tmp/mod_wsgi-localhost:8000:1000/handler.wsgi", line 90, in <module>
> [Thu Mar 21 20:12:56.943335 2024] [wsgi:error] [pid 3288289:tid 140356515776384]     handler = mod_wsgi.server.ApplicationHandler(entry_point,
> [Thu Mar 21 20:12:56.943337 2024] [wsgi:error] [pid 3288289:tid 140356515776384]               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> [Thu Mar 21 20:12:56.943345 2024] [wsgi:error] [pid 3288289:tid 140356515776384]   File "/home/lucas/wsgizip/lib/python3.11/site-packages/mod_wsgi/server/__init__.py", line 1475, in __init__
> [Thu Mar 21 20:12:56.943348 2024] [wsgi:error] [pid 3288289:tid 140356515776384]     code = compile(fp.read(), entry_point, 'exec',
> [Thu Mar 21 20:12:56.943350 2024] [wsgi:error] [pid 3288289:tid 140356515776384]                    ^^^^^^^^^
> [Thu Mar 21 20:12:56.943356 2024] [wsgi:error] [pid 3288289:tid 140356515776384]   File "<frozen codecs>", line 322, in decode
> [Thu Mar 21 20:12:56.943371 2024] [wsgi:error] [pid 3288289:tid 140356515776384] UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe2 in position 458: invalid continuation byte
>
> --
> You received this message because you are subscribed to the Google Groups
> "modwsgi" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to modwsgi+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/modwsgi/18b16e2e-4e3c-49f7-84af-7351e4619687n%40googlegroups.com
> <https://groups.google.com/d/msgid/modwsgi/18b16e2e-4e3c-49f7-84af-7351e4619687n%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
