[ceph-users] Re: Debian 12 (bookworm) / Reef 18.2.1 problems

2024-01-18 Thread Max Carrara
On 1/17/24 20:49, Chris Palmer wrote:
> 
> 
> On 17/01/2024 16:11, kefu chai wrote:
>>
>>
>> On Tue, Jan 16, 2024 at 12:11 AM Chris Palmer  wrote:
>>
>>     Updates on both problems:
>>
>>     Problem 1
>>     --
>>
>>     The bookworm/reef cephadm package needs updating to accommodate the
>>     last change in /usr/share/doc/adduser/NEWS.Debian.gz:
>>
>>        System user home defaults to /nonexistent if --home is not specified.
>>        Packages that call adduser to create system accounts should explicitly
>>        specify a location for /home (see Lintian check
>>        maintainer-script-lacks-home-in-adduser).
>>
>>     i.e. when creating the cephadm user as a system user it needs to
>>     explicitly specify the expected home directory of /home/cephadm.
>>
>>
>> Hi Chris, thank you for the bug report and the suggestion. Could you please
>> file a tracker ticket, so we can track and backport the related fixes? I just
>> created https://github.com/ceph/ceph/pull/55218 in the hope of alleviating the
>> problem.
> 
> I've created issue https://tracker.ceph.com/issues/64069 for this.
> 
>>
>>     A workaround is to manually create the user+directory before
>>     installing ceph.
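
For illustration, the workaround above could be scripted roughly as follows.
This is only a sketch: the helper name `cephadm_adduser_argv` is made up, the
user name and /home/cephadm come from the report above, and the shell and the
remaining `adduser` options are assumptions rather than what the cephadm
package's maintainer script actually does. The code only builds the command
line; it does not run it.

```python
import shlex


def cephadm_adduser_argv(user="cephadm", home="/home/cephadm", shell="/bin/bash"):
    """Build an adduser command that sets the home directory explicitly.

    Hypothetical helper: user name and home come from the report above; the
    shell and the other options are assumptions, not the package's script.
    """
    return [
        "adduser",
        "--system",      # create a system account
        "--home", home,  # explicit home instead of the /nonexistent default
        "--shell", shell,
        "--group",       # also create a group with the same name
        user,
    ]


if __name__ == "__main__":
    # Print the command for review; it would have to be run as root
    # before installing ceph.
    print(shlex.join(cephadm_adduser_argv()))
```

Passing `--home` explicitly is the point here: without it, newer adduser
versions fall back to /nonexistent for system users.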
>>
>>
>>     Problem 2
>>     --
>>
>>     This is a complex set of interactions that prevents many mgr modules
>>     (including dashboard) from running. It is NOT Debian-specific and will
>>     eventually bite other distributions as well. At the moment Ceph PR54710
>>     looks the most promising fix (full or partial). Detail is spread across
>>     the following:
>>
>>     https://github.com/pyca/cryptography/issues/9016
>>     https://github.com/ceph/ceph/pull/54710
>>     https://tracker.ceph.com/issues/63529
>>     https://forum.proxmox.com/threads/ceph-warning-post-upgrade-to-v8.129371/page-5
>>     https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1055212
>>     https://github.com/pyca/bcrypt/issues/694
>>
>>
>> IIUC, a backport of https://github.com/ceph/ceph/pull/54710 to reef would
>> address this issue, am I right?
>>
>>
> 
> Unfortunately I think this may be part of a much bigger MGR problem. My 
> understanding of the relevant background is:
> 
>  * MGR modules use python subinterpreters for isolation between modules.
>  * Several modules (including but not limited to dashboard & restful)
>    use python3-cryptography for hashing and TLS (and possibly other
>    things).
>  * python3-cryptography delegates some crypto functions to Rust
>    functions. These include bcrypt and TLS-related functions.
>  * python3-cryptography uses PyO3 to invoke Rust functions.
>  * PyO3 does not support being used by subinterpreters. In the past
>    this has been allowed but was actually unsafe. Now PyO3 throws an
>    exception when it detects multiple initialisations.
> 
> So it appears that the MGR use of these functions has always been unsafe, and 
> is now forbidden.
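
To make the failure mode described above easier to picture, here is a toy
model in plain Python. It is not PyO3's code and does not use real
subinterpreters: the class `OncePerProcessExtension` and the function
`load_modules` are invented for illustration, and the load order of the two
module names is an assumption. It only mimics the behaviour described in this
thread, namely process-wide state that accepts one initialisation and raises
an ImportError on any further one.

```python
import threading


class OncePerProcessExtension:
    """Toy stand-in for a native extension that refuses a second init."""

    _lock = threading.Lock()
    _initialized = False  # process-wide, shared by every simulated interpreter

    @classmethod
    def initialize(cls, interpreter_name):
        with cls._lock:
            if cls._initialized:
                # Same wording as the error seen in the MGR log.
                raise ImportError(
                    "PyO3 modules may only be initialized once per "
                    "interpreter process"
                )
            cls._initialized = True
        return f"initialized in {interpreter_name}"


def load_modules(names):
    """Initialize the extension once per simulated MGR module, in order."""
    results = {}
    for name in names:
        try:
            results[name] = OncePerProcessExtension.initialize(name)
        except ImportError as exc:
            results[name] = f"failed: {exc}"
    return results


if __name__ == "__main__":
    for module, outcome in load_modules(["restful", "dashboard"]).items():
        print(f"{module}: {outcome}")
```

Whichever module imports the extension first succeeds; every later module in
the same process fails, which matches the symptom of some MGR modules loading
and others not.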
> 
> PR54710 identified that the code necessary for the bcrypt hashing used during
> authentication could easily be written in a small amount of native Python,
> thus avoiding the whole PyO3 area altogether.
> However, there was a note in the discussions that you also had to disable TLS.
> And it only applied to the dashboard. My stacktrace below shows the exception
> during TLS initialisation.
> 
> As PyO3 updates are adopted in other Linux distributions this is likely to
> break a number of MGR modules. As there does not seem to be any subinterpreter
> support in PyO3 coming soon, the only option may be to completely eliminate
> use of python3-cryptography from all MGR modules. (It is possible MGR modules
> may also use other python3 modules that use PyO3 to invoke Rust.)
> 
> Unfortunately for us, we didn't find this until we had upgraded all MONs in a
> cluster to reef, at which point we can't downgrade them to quincy. And we
> can't upgrade the MGR. As a temporary measure (this cluster had
> MON/MGR/MDS/RGW colocated on 2 hosts) we have added another bookworm host
> running a reef MON to ensure we can maintain quorum. We are not sure whether
> it is safe to upgrade the other components (OSD, MDS, RGW) while the MGR
> remains at quincy.
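
As a side note on the quorum remark above: Ceph monitors need a strict
majority to form a quorum, so the arithmetic behind adding a third MON can be
sketched as below. The function name `mon_quorum` is made up for this example.

```python
def mon_quorum(total_mons):
    """Return (MONs needed for a majority, MONs that may fail)."""
    if total_mons < 1:
        raise ValueError("need at least one MON")
    needed = total_mons // 2 + 1  # strict majority
    return needed, total_mons - needed


if __name__ == "__main__":
    for count in (2, 3):
        needed, can_fail = mon_quorum(count)
        print(f"{count} MONs: quorum needs {needed}, tolerates {can_fail} failure(s)")
```

With 2 MONs both must be up, whereas with 3 MONs one may be down, which is why
the extra host helps while the others are being worked on.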
> 
> 

Hi there,
glad to see that this is getting some more attention. I'm the one who submitted
the bug report regarding PyO3 and the Ceph MGR [0] a while ago.

Everything you've mentioned is correct - Ceph is using a rare sub-interpreter
model for the MGR in order to juggle all the different MGR modules.
Theoretically, it should be possible to start a thread with one interpreter for
each module instead, but that would definitely be anything but a trivial rewrite
on Ceph's side.

Side note: If anyone reading this wishes to contribute to PyO3 and help
implement sub-interpreter support, you can join me over on GitHub, where I
created a tracking issue for this problem some time ago. [1] I'm now finally
able to allocate more 

[ceph-users] Re: ceph-dashboard python warning with new pyo3 0.17 lib (debian12)

2023-10-11 Thread Max Carrara
On 9/5/23 16:53, Max Carrara wrote:
> Hello there,
> 
> could you perhaps provide some more information on how (or where) this
> got fixed? It doesn't seem to be fixed yet on the latest Ceph Quincy
> and Reef versions, but maybe I'm mistaken. I've provided some more
> context regarding this below, in case that helps.
> 
> 
> On Ceph Quincy 17.2.6 I'm encountering the following error when trying
> to enable the dashboard (so, the same error that was posted above):
> 
>   root@ceph-01:~# ceph --version
>   ceph version 17.2.6 (810db68029296377607028a6c6da1ec06f5a2b27) quincy (stable)
> 
>   root@ceph-01:~# ceph mgr module enable dashboard
>   Error ENOENT: module 'dashboard' reports that it cannot run on the active manager daemon: PyO3 modules may only be initialized once per interpreter process (pass --force to force enablement)
> 
> I was then able to find this Python traceback in the systemd journal:
> 
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: 2023-09-04T18:39:51.438+0200 7fecdc91e000 -1 mgr[py] Traceback (most recent call last):
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/usr/share/ceph/mgr/dashboard/__init__.py", line 60, in <module>
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: from .module import Module, StandbyModule  # noqa: F401
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: ^
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/usr/share/ceph/mgr/dashboard/module.py", line 30, in <module>
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: from .controllers import Router, json_error_page
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/usr/share/ceph/mgr/dashboard/controllers/__init__.py", line 1, in <module>
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: from ._api_router import APIRouter
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/usr/share/ceph/mgr/dashboard/controllers/_api_router.py", line 1, in <module>
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: from ._router import Router
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/usr/share/ceph/mgr/dashboard/controllers/_router.py", line 7, in <module>
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: from ._base_controller import BaseController
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/usr/share/ceph/mgr/dashboard/controllers/_base_controller.py", line 11, in <module>
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: from ..services.auth import AuthManager, JwtManager
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/usr/share/ceph/mgr/dashboard/services/auth.py", line 12, in <module>
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: import jwt
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/lib/python3/dist-packages/jwt/__init__.py", line 1, in <module>
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: from .api_jwk import PyJWK, PyJWKSet
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/lib/python3/dist-packages/jwt/api_jwk.py", line 6, in <module>
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: from .algorithms import get_default_algorithms
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/lib/python3/dist-packages/jwt/algorithms.py", line 6, in <module>
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: from .utils import (
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/lib/python3/dist-packages/jwt/utils.py", line 7, in <module>
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: from cryptography.hazmat.primitives.asymmetric.ec import EllipticCurve
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/lib/python3/dist-packages/cryptography/hazmat/primitives/asymmetric/ec.py", line 11, in <module>
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: from cryptography.hazmat._oid import ObjectIdentifier
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/lib/python3/dist-packages/cryptography/hazmat/_oid.py", line 7, in <module>
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: from cryptography.hazmat.bindings._rust import (
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: ImportError: PyO3 modules may only be initialized once per interpreter process
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: 2023-09-04T18:39:51.438+0200 7fecdc91e000 -1 mgr[py] Class not found in module 'dashboard'
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: 2023-09-04T18:39:51.438+0200 7fecdc91e000 -1 mgr[py] Error loading module 'dashboard': (2) No such file or directory
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: 2023-09-04T18:39:51.470+0200 7fecdc91e000 -1 mgr[py] Module progress has missing NOTIFY_TYPES member
>   Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: 2023-09-04T18:

[ceph-users] Re: ceph-dashboard python warning with new pyo3 0.17 lib (debian12)

2023-09-05 Thread Max Carrara
Hello there,

could you perhaps provide some more information on how (or where) this
got fixed? It doesn't seem to be fixed yet on the latest Ceph Quincy
and Reef versions, but maybe I'm mistaken. I've provided some more
context regarding this below, in case that helps.


On Ceph Quincy 17.2.6 I'm encountering the following error when trying
to enable the dashboard (so, the same error that was posted above):

  root@ceph-01:~# ceph --version
  ceph version 17.2.6 (810db68029296377607028a6c6da1ec06f5a2b27) quincy (stable)

  root@ceph-01:~# ceph mgr module enable dashboard
  Error ENOENT: module 'dashboard' reports that it cannot run on the active manager daemon: PyO3 modules may only be initialized once per interpreter process (pass --force to force enablement)

I was then able to find this Python traceback in the systemd journal:

  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: 2023-09-04T18:39:51.438+0200 7fecdc91e000 -1 mgr[py] Traceback (most recent call last):
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/usr/share/ceph/mgr/dashboard/__init__.py", line 60, in <module>
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: from .module import Module, StandbyModule  # noqa: F401
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: ^
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/usr/share/ceph/mgr/dashboard/module.py", line 30, in <module>
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: from .controllers import Router, json_error_page
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/usr/share/ceph/mgr/dashboard/controllers/__init__.py", line 1, in <module>
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: from ._api_router import APIRouter
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/usr/share/ceph/mgr/dashboard/controllers/_api_router.py", line 1, in <module>
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: from ._router import Router
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/usr/share/ceph/mgr/dashboard/controllers/_router.py", line 7, in <module>
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: from ._base_controller import BaseController
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/usr/share/ceph/mgr/dashboard/controllers/_base_controller.py", line 11, in <module>
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: from ..services.auth import AuthManager, JwtManager
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/usr/share/ceph/mgr/dashboard/services/auth.py", line 12, in <module>
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: import jwt
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/lib/python3/dist-packages/jwt/__init__.py", line 1, in <module>
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: from .api_jwk import PyJWK, PyJWKSet
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/lib/python3/dist-packages/jwt/api_jwk.py", line 6, in <module>
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: from .algorithms import get_default_algorithms
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/lib/python3/dist-packages/jwt/algorithms.py", line 6, in <module>
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: from .utils import (
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/lib/python3/dist-packages/jwt/utils.py", line 7, in <module>
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: from cryptography.hazmat.primitives.asymmetric.ec import EllipticCurve
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/lib/python3/dist-packages/cryptography/hazmat/primitives/asymmetric/ec.py", line 11, in <module>
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: from cryptography.hazmat._oid import ObjectIdentifier
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]:   File "/lib/python3/dist-packages/cryptography/hazmat/_oid.py", line 7, in <module>
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: from cryptography.hazmat.bindings._rust import (
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: ImportError: PyO3 modules may only be initialized once per interpreter process
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: 2023-09-04T18:39:51.438+0200 7fecdc91e000 -1 mgr[py] Class not found in module 'dashboard'
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: 2023-09-04T18:39:51.438+0200 7fecdc91e000 -1 mgr[py] Error loading module 'dashboard': (2) No such file or directory
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: 2023-09-04T18:39:51.470+0200 7fecdc91e000 -1 mgr[py] Module progress has missing NOTIFY_TYPES member
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: 2023-09-04T18:39:51.502+0200 7fecdc91e000 -1 mgr[py] Module iostat has missing NOTIFY_TYPES member
  Sep 04 18:39:51 ceph-01 ceph-mgr[15669]: 2023-09-04T18:39:51.502+0200 7fecdc91e000 -1 log_channel(cluster) log [ERR] : Failed to load ceph-mgr modules: dashboard


As the traceback above reveals, the dashboard uses `PyJWT`, which in
turn uses `cryptography`, and `cryptography` uses `PyO3`.
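
In case it helps anyone reading similar journal output, here is a rough sketch
of how that chain can be pulled out of the `File "..."` lines of a traceback.
The function name `import_chain` is made up, the three sample lines are
abbreviated from the traceback above, and the rule that the package name
follows the `mgr` or `dist-packages` path component is an assumption that
happens to fit these paths.

```python
import re

# Matches the 'File "<path>", line <n>' part of a traceback line,
# regardless of any journal prefix in front of it.
FRAME_RE = re.compile(r'File "(?P<path>[^"]+)", line (?P<line>\d+)')


def import_chain(journal_lines):
    """Return the package names seen in the traceback frames, in order."""
    chain = []
    for text in journal_lines:
        match = FRAME_RE.search(text)
        if not match:
            continue
        parts = match.group("path").split("/")
        # Assumption: the package is the path component right after
        # 'mgr' (Ceph MGR modules) or 'dist-packages' (distro packages).
        for marker in ("mgr", "dist-packages"):
            if marker in parts:
                index = parts.index(marker) + 1
                if index < len(parts):
                    package = parts[index]
                    if not chain or chain[-1] != package:
                        chain.append(package)
                break
    return chain


sample = [
    'ceph-mgr[15669]:   File "/usr/share/ceph/mgr/dashboard/services/auth.py", line 12, in <module>',
    'ceph-mgr[15669]:   File "/lib/python3/dist-packages/jwt/__init__.py", line 1, in <module>',
    'ceph-mgr[15669]:   File "/lib/python3/dist-packages/cryptography/hazmat/_oid.py", line 7, in <module>',
]

if __name__ == "__main__":
    print(" -> ".join(import_chain(sample)))
```

For the sample lines this yields dashboard, jwt and cryptography, in that
order; the final hop into PyO3 is not a Python frame and only shows up in the
ImportError message itself.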

That led me to an issue[0] regarding this on `cryptography`'s side;
the Ceph Dashboard is apparently not the only thing that's affected
by this.

As it turns out, the maintainer of the Ceph AUR package has also
recently stumbled