[ceph-users] Re: Debian 12 (bookworm) / Reef 18.2.1 problems

2024-01-12 Thread Chris Palmer

More info on problem 2:

When starting the dashboard, the mgr seems to try to initialise cephadm, 
which in turn uses python crypto libraries that lead to the python error:


$ ceph crash info 2024-01-12T11:10:03.938478Z_2263d2c8-8120-417e-84bc-bb01f5d81e52
{
    "backtrace": [
        "  File \"/usr/share/ceph/mgr/cephadm/__init__.py\", line 1, in <module>\n    from .module import CephadmOrchestrator",
        "  File \"/usr/share/ceph/mgr/cephadm/module.py\", line 15, in <module>\n    from cephadm.service_discovery import ServiceDiscovery",
        "  File \"/usr/share/ceph/mgr/cephadm/service_discovery.py\", line 20, in <module>\n    from cephadm.ssl_cert_utils import SSLCerts",
        "  File \"/usr/share/ceph/mgr/cephadm/ssl_cert_utils.py\", line 8, in <module>\n    from cryptography import x509",
        "  File \"/lib/python3/dist-packages/cryptography/x509/__init__.py\", line 6, in <module>\n    from cryptography.x509 import certificate_transparency",
        "  File \"/lib/python3/dist-packages/cryptography/x509/certificate_transparency.py\", line 10, in <module>\n    from cryptography.hazmat.bindings._rust import x509 as rust_x509",
        "ImportError: PyO3 modules may only be initialized once per interpreter process"
    ],
    "ceph_version": "18.2.1",
    "crash_id": "2024-01-12T11:10:03.938478Z_2263d2c8-8120-417e-84bc-bb01f5d81e52",
    "entity_name": "mgr.x01",
    "mgr_module": "cephadm",
    "mgr_module_caller": "PyModule::load_subclass_of",
    "mgr_python_exception": "ImportError",
    "os_id": "12",
    "os_name": "Debian GNU/Linux 12 (bookworm)",
    "os_version": "12 (bookworm)",
    "os_version_id": "12",
    "process_name": "ceph-mgr",
    "stack_sig": "7815ad73ced094695056319d1241bf7847da19b4b0dfee7a216407b59a7e3d84",
    "timestamp": "2024-01-12T11:10:03.938478Z",
    "utsname_hostname": "x01.xxx.xxx",
    "utsname_machine": "x86_64",
    "utsname_release": "6.1.0-17-amd64",
    "utsname_sysname": "Linux",
    "utsname_version": "#1 SMP PREEMPT_DYNAMIC Debian 6.1.69-1 (2023-12-30)"
}
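
The crash report is plain JSON, so the import chain and the final exception can be pulled out programmatically. The following is a minimal sketch, assuming the output of "ceph crash info" has been captured as text. The function name summarize_crash and the shortened sample (adapted from the report above) are illustrative and not part of Ceph.

```python
import json


def summarize_crash(crash: dict) -> dict:
    """Return the failing mgr module, the final error line and the files in the import chain."""
    backtrace = crash.get("backtrace", [])
    import_chain = []
    for entry in backtrace:
        stripped = entry.strip()
        # Traceback frames look like: File "<path>", line N, in ...
        if stripped.startswith('File "'):
            import_chain.append(stripped.split('"')[1])
    error = backtrace[-1].strip() if backtrace else ""
    return {
        "mgr_module": crash.get("mgr_module"),
        "error": error,
        "import_chain": import_chain,
    }


# Shortened sample: two frames and the error line adapted from the report above.
SAMPLE = r'''
{
    "backtrace": [
        "  File \"/usr/share/ceph/mgr/cephadm/ssl_cert_utils.py\", line 8, in <module>\n    from cryptography import x509",
        "  File \"/lib/python3/dist-packages/cryptography/x509/__init__.py\", line 6, in <module>\n    from cryptography.x509 import certificate_transparency",
        "ImportError: PyO3 modules may only be initialized once per interpreter process"
    ],
    "ceph_version": "18.2.1",
    "mgr_module": "cephadm"
}
'''

if __name__ == "__main__":
    summary = summarize_crash(json.loads(SAMPLE))
    print(summary["mgr_module"], "->", summary["error"])
    for path in summary["import_chain"]:
        print("  via", path)
```

Reading the chain from top to bottom shows that the cephadm module fails while importing cryptography.x509, before any cephadm code runs.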


On 12/01/2024 12:39, Chris Palmer wrote:
I was delighted to see the native Debian 12 (bookworm) packages turn 
up in Reef 18.2.1.


We currently run a number of ceph clusters on Debian11 (bullseye) / 
Quincy 17.2.7. These are not cephadm-managed.


I have attempted to upgrade a test cluster, and it is not going well. 
As Quincy only supports bullseye and Reef only supports bookworm, we are 
reinstalling from bare metal. However, I don't think either of these 
two problems is related to that.


Problem 1
--

A simple "apt install ceph" goes most of the way, then errors with

Setting up cephadm (18.2.1-1~bpo12+1) ...
usermod: unlocking the user's password would result in a passwordless account.
You should set a password with usermod -p to unlock this user's password.
mkdir: cannot create directory ‘/home/cephadm/.ssh’: No such file or directory
dpkg: error processing package cephadm (--configure):
 installed cephadm package post-installation script subprocess returned error exit status 1
dpkg: dependency problems prevent configuration of ceph-mgr-cephadm:
 ceph-mgr-cephadm depends on cephadm; however:
  Package cephadm is not configured yet.

dpkg: error processing package ceph-mgr-cephadm (--configure):
 dependency problems - leaving unconfigured


The two cephadm-related packages are then left in an error state, 
which apt tries to continue each time it is run.


The cephadm user has a home directory of /nonexistent; however, the 
cephadm post-installation (--configure) script is trying to use 
/home/cephadm (as it was on Quincy/bullseye).


As we aren't using cephadm, we decided to keep going, since the other 
packages were actually installed, and to deal with the package state later.


Problem 2
---

I upgraded 2/3 monitor nodes without any other problems, and (for the 
moment) removed the other Quincy monitor prior to rebuild.


I then shut down the remaining Quincy manager and attempted to start 
the Reef manager. Although the manager is running, "ceph mgr services" 
shows it is only providing the restful service and not the dashboard 
service. The log file has many occurrences of the following error:


ImportError: PyO3 modules may only be initialized once per interpreter 
process


and ceph -s reports "Module 'dashboard' has failed dependency: PyO3 
modules may only be initialized once per interpreter process".



Questions
---

1. Have the Reef/bookworm packages ever been tested in a non-cephadm 
environment?
2. I want to revert this cluster back to a fully functional state. I 
cannot bring back up the remaining Quincy monitor though ("require 
release 18 > 17"). Would I have to go through the procedure of 
starting over, and trying to rescue the monmap from the OSDs? (OSDs 
and an active MDS are still up and running Quincy). I'm aware that 
process exists but have never had to delve into it.



Thanks, Chris
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe

[ceph-users] Re: Debian 12 (bookworm) / Reef 18.2.1 problems

2024-01-15 Thread Chris Palmer

Updates on both problems:

Problem 1
--

The bookworm/reef cephadm package needs updating to accommodate the last 
change in /usr/share/doc/adduser/NEWS.Debian.gz:


  System user home defaults to /nonexistent if --home is not specified.
  Packages that call adduser to create system accounts should explicitly
  specify a location for /home (see Lintian check
  maintainer-script-lacks-home-in-adduser).

i.e. when creating the cephadm user as a system user it needs to 
explicitly specify the expected home directory of /home/cephadm.


A workaround is to manually create the user+directory before installing 
ceph.
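
Before installing, it may help to check whether the failing step would hit the same condition. The sketch below is an illustration under stated assumptions, not the package's own logic: it only mirrors what the reported error output suggests, namely that a plain mkdir of /home/cephadm/.ssh fails when /home/cephadm does not exist. The function names are hypothetical.

```python
import os
import pwd
from typing import Optional


def account_home(user: str = "cephadm") -> Optional[str]:
    """Return the home directory recorded for `user`, or None if the account does not exist."""
    try:
        return pwd.getpwnam(user).pw_dir
    except KeyError:
        return None


def plain_mkdir_would_fail(path: str = "/home/cephadm/.ssh") -> bool:
    """A non-recursive mkdir fails with 'No such file or directory' when the parent is missing."""
    return not os.path.isdir(os.path.dirname(path))


if __name__ == "__main__":
    print("cephadm home in passwd:", account_home())
    if plain_mkdir_would_fail():
        print("/home/cephadm is missing: create the user and directory before installing")
    else:
        print("/home/cephadm exists")
```

On an affected bookworm host one would expect the passwd entry to show /nonexistent once the package has created the account, which is why creating the user and directory beforehand avoids the error.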



Problem 2
--

This is a complex set of interactions that prevents many mgr modules 
(including dashboard) from running. It is NOT Debian-specific and will 
eventually bite other distributions as well. At the moment Ceph PR 54710 
looks like the most promising fix (full or partial). Details are spread 
across the following:


https://github.com/pyca/cryptography/issues/9016
https://github.com/ceph/ceph/pull/54710
https://tracker.ceph.com/issues/63529
https://forum.proxmox.com/threads/ceph-warning-post-upgrade-to-v8.129371/page-5
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1055212
https://github.com/pyca/bcrypt/issues/694




[ceph-users] Re: Debian 12 (bookworm) / Reef 18.2.1 problems

2024-01-17 Thread kefu chai
On Tue, Jan 16, 2024 at 12:11 AM Chris Palmer wrote:

> Updates on both problems:
>
> Problem 1
> --
>
> The bookworm/reef cephadm package needs updating to accommodate the last
> change in /usr/share/doc/adduser/NEWS.Debian.gz:
>
>   System user home defaults to /nonexistent if --home is not specified.
>   Packages that call adduser to create system accounts should explicitly
>   specify a location for /home (see Lintian check
>   maintainer-script-lacks-home-in-adduser).
>
> i.e. when creating the cephadm user as a system user it needs to
> explicitly specify the expected home directory of /home/cephadm.
>

Hi Chris, thank you for the bug report and the suggestion. Could you please
file a tracker ticket so we can track and backport the related fixes? I just
created https://github.com/ceph/ceph/pull/55218 in the hope of alleviating the
problem.


>
> A workaround is to manually create the user+directory before installing
> ceph.
>
>
> Problem 2
> --
>
> This is a complex set of interactions that prevent many mgr modules
> (including dashboard) from running. It is NOT debian-specific and will
> eventually bite other distributions as well. At the moment Ceph PR54710
> looks the most promising fix (full or partial). Detail is spread across
> the following:
>
> https://github.com/pyca/cryptography/issues/9016
> https://github.com/ceph/ceph/pull/54710
> https://tracker.ceph.com/issues/63529
>
> https://forum.proxmox.com/threads/ceph-warning-post-upgrade-to-v8.129371/page-5
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1055212
> https://github.com/pyca/bcrypt/issues/694


IIUC, a backport of https://github.com/ceph/ceph/pull/54710 to reef would
address this issue, am I right?



[ceph-users] Re: Debian 12 (bookworm) / Reef 18.2.1 problems

2024-01-17 Thread Chris Palmer



On 17/01/2024 16:11, kefu chai wrote:




> Hi Chris, thank you for the bug report and the suggestion. Could you please
> file a tracker ticket so we can track and backport the related fixes? I just
> created https://github.com/ceph/ceph/pull/55218 in the hope of alleviating the
> problem.


I've created issue https://tracker.ceph.com/issues/64069 for this.




> IIUC, a backport of https://github.com/ceph/ceph/pull/54710 to reef
> would address this issue, am I right?





Unfortunately I think this may be part of a much bigger MGR problem. My 
understanding of the relevant background is:


 * MGR modules use python subinterpreters for isolation between modules.
 * Several modules (including but not limited to dashboard & restful)
   use python3-cryptography for hashing and TLS (and possibly other
   things).
 * python3-cryptography delegates some crypto functions to Rust
   functions. These include bcrypt and TLS-related functions.
 * python3-cryptography uses PyO3 to invoke Rust functions.
 * PyO3 does not support being used by subinterpreters. In the past
   this has been allowed but was actually unsafe. Now PyO3 throws an
   exception when it detects multiple initialisations.

So it appears that the MGR use of these functions has always been 
unsafe, and is now forbidden.
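
The failure mode described above can be illustrated with a toy model. This is not PyO3's implementation. It is a pure-Python sketch that assumes a native extension keeps one process-wide "already initialized" record shared by all subinterpreters, so only the first importer succeeds. The class name, the function name and the load order of the modules are made up for illustration.

```python
class OncePerProcessGuard:
    """Toy stand-in for a native extension's process-wide initialization check."""

    def __init__(self) -> None:
        self._initialized_by = None  # name of the first (sub)interpreter that imported us

    def initialize(self, interpreter_name: str) -> None:
        if self._initialized_by is not None:
            # Same message as in the crash report; raised on any second initialization.
            raise ImportError(
                "PyO3 modules may only be initialized once per interpreter process"
            )
        self._initialized_by = interpreter_name


def load_modules(guard: OncePerProcessGuard, module_names: list) -> dict:
    """Simulate each mgr module importing the extension from its own subinterpreter."""
    results = {}
    for name in module_names:
        try:
            guard.initialize(name)
            results[name] = "ok"
        except ImportError as exc:
            results[name] = f"failed: {exc}"
    return results


if __name__ == "__main__":
    # Hypothetical load order; the real order inside ceph-mgr is not stated in this thread.
    for module, status in load_modules(
        OncePerProcessGuard(), ["restful", "cephadm", "dashboard"]
    ).items():
        print(f"{module}: {status}")
```

In this model whichever module imports the extension first keeps working and every later one fails, which is at least consistent with some modules loading while others report a failed dependency.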


PR54710 identified that the code necessary for the bcrypt hashing used 
during authentication could easily be written in a small amount of 
native python, thus avoiding the whole PyO3 area altogether. However 
there was a note in the discussions that you also had to disable TLS. 
And it only applied to the dashboard. My stacktrace below shows the 
exception during TLS initialisation.


As PyO3 updates are adopted in other Linux distributions, this is likely 
to break a number of MGR modules. As there does not seem to be any 
subinterpreter support in PyO3 coming soon, the only option may be to 
completely eliminate use of python3-cryptography from all MGR modules. 
(It is possible MGR modules may also use other python3 modules that use 
PyO3 to invoke Rust.)


Unfortunately for us, we didn't find this until we had upgraded all MONs 
in a cluster to reef, at which point we can't downgrade them to quincy. 
And we can't upgrade the MGR. As a temporary measure (this cluster had 
MON/MGR/MDS/RGW colocated on 2 hosts) we have added another bookworm 
host running a reef MON to ensure we can maintain quorum. We are not 
sure whether it is safe to upgrade the other components (OSD, MDS, RGW) 
while the MGR remains at quincy.


🙁






[ceph-users] Re: Debian 12 (bookworm) / Reef 18.2.1 problems

2024-01-18 Thread Max Carrara

Hi there,
glad to see that this is getting some more attention. I'm the one who submitted
that one bug regarding PyO3 and Ceph MGR [0] a while ago.

Everything you've mentioned is correct - Ceph is using a rare sub-interpreter
model for the MGR in order to juggle all the different MGR modules.
Theoretically, it should be possible to start a thread with one interpreter for
each module instead, but that would definitely be anything but a trivial rewrite
on Ceph's side.

Side note: If anyone here is reading this, wishing to contribute to PyO3 and
help implement sub-interpreter support, you can join me over on GitHub, where
I created a tracking issue for this problem some time ago. [1] I'm now
finally able to allocate more time

[ceph-users] Re: Debian 12 (bookworm) / Reef 18.2.1 problems

2024-01-29 Thread Chris Palmer

I have logged this as https://tracker.ceph.com/issues/64213

On 16/01/2024 14:18, DERUMIER, Alexandre wrote:
> Hi,
>
>> ImportError: PyO3 modules may only be initialized once per interpreter
>> process
>>
>> and ceph -s reports "Module 'dashboard' has failed dependency: PyO3
>> modules may only be initialized once per interpreter process
>
> We have the same problem on Proxmox 8 (based on Debian 12) with Ceph
> Quincy or Reef.
>
> It seems to be related to the Python version on Debian 12.
>
> (We have no fix for this currently.)




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Debian 12 (bookworm) / Reef 18.2.1 problems

2024-02-02 Thread Matthew Darwin

Chris,

Thanks for all the investigations you are doing here. We're on 
quincy/debian11. Is there any working path at this point to 
reef/debian12? Ideally I want to go in two steps: upgrade Ceph first 
or upgrade Debian first, then upgrade the other one. Most of 
our infra is already upgraded to Debian 12, except Ceph.




[ceph-users] Re: Debian 12 (bookworm) / Reef 18.2.1 problems

2024-02-02 Thread Chris Palmer

Hi Matthew

AFAIK the upgrade from quincy/deb11 to reef/deb12 is not possible:

 * You can work around the packaging problem, and a fix is pending.
 * You have to upgrade both the OS and Ceph in one step.
 * The MGR will not run under deb12 because PyO3 lacks support for
   subinterpreters.

If you do attempt an upgrade, you will end up stuck with a partially 
upgraded cluster. The MONs will be on deb12/reef and cannot be 
downgraded, and the MGR will be stuck on deb11/quincy. We have a test 
cluster in that state with no way forward or back.
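
A partially upgraded cluster like this can be spotted from the JSON printed by "ceph versions", which groups running daemons by version string. The following is a minimal sketch, assuming that output has been loaded into a dict. The sample values are made up to resemble the state described above (build hashes replaced by a placeholder) and were not captured from a real cluster; the function names are illustrative.

```python
import re

# Matches the major release number in strings like "ceph version 18.2.1 (...) reef (stable)".
MAJOR_RE = re.compile(r"ceph version (\d+)\.")


def majors_by_daemon(versions: dict) -> dict:
    """Map each daemon type (mon, mgr, osd, ...) to the sorted major releases it runs."""
    result = {}
    for daemon, counts in versions.items():
        if daemon == "overall":
            continue  # summary section, not a daemon type
        majors = set()
        for version_string in counts:
            match = MAJOR_RE.search(version_string)
            if match:
                majors.add(int(match.group(1)))
        result[daemon] = sorted(majors)
    return result


def is_mixed(versions: dict) -> bool:
    """True when daemons of more than one major release are running."""
    seen = {m for majors in majors_by_daemon(versions).values() for m in majors}
    return len(seen) > 1


# Made-up sample: MONs on Reef (18), everything else still on Quincy (17).
SAMPLE = {
    "mon": {"ceph version 18.2.1 (HASH) reef (stable)": 3},
    "mgr": {"ceph version 17.2.7 (HASH) quincy (stable)": 1},
    "osd": {"ceph version 17.2.7 (HASH) quincy (stable)": 6},
    "overall": {
        "ceph version 17.2.7 (HASH) quincy (stable)": 7,
        "ceph version 18.2.1 (HASH) reef (stable)": 3,
    },
}

if __name__ == "__main__":
    print(majors_by_daemon(SAMPLE))
    print("mixed releases:", is_mixed(SAMPLE))
```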


I fear the MGR problem will spread as time goes on and PyO3 updates 
occur. And it's not good that it can silently cause corruption in 
existing, apparently working installations.


No-one has picked up issue 64213 that I raised yet.

I'm tempted to raise another issue for QA: the Debian 12 package cannot 
have been tested, as it just won't work either as an upgrade or as a new 
install.


Regards, Chris




[ceph-users] Re: Debian 12 (bookworm) / Reef 18.2.1 problems

2024-02-02 Thread Casey Bodley
On Fri, Feb 2, 2024 at 11:21 AM Chris Palmer  wrote:
> I'm tempted to raise another issue for qa : the debian 12 package cannot
> have been tested as it just won't work either as an upgrade or a new
> install.

You're right that the Debian packages don't get tested:

https://docs.ceph.com/en/reef/start/os-recommendations/#platforms



[ceph-users] Re: Debian 12 (bookworm) / Reef 18.2.1 problems

2024-02-02 Thread Brian Chow
Would migrating to a cephadm-orchestrated docker/podman cluster be an
acceptable workaround?

We are running that config with Reef containers on Debian 12 hosts, with a
couple of Debian 12 clients successfully mounting CephFS, using the
Reef client packages directly on Debian.

On Fri, Feb 2, 2024, 8:21 AM Chris Palmer wrote:

> Hi Matthew
>
> AFAIK the upgrade from quincy/deb11 to reef/deb12 is not possible:
>
>   * The packaging problem you can work around, and a fix is pending
>   * You have to upgrade both the OS and Ceph in one step
>   * The MGR will not run under deb12 due to the PyO3 lack of support for
> subinterpreters.
>
> If you do attempt an upgrade, you will end up stuck with a partially
> upgraded cluster. The MONs will be on deb12/reef and cannot be
> downgraded, and the MGR will be stuck on deb11/quincy. We have a test
> cluster in that state with no way forward or back.
>
> I fear the MGR problem will spread as time goes on and PyO3 updates
> occur. And it's not good that it can silently corrupt in the existing
> apparently-working installations.
>
> No-one has picked up issue 64213 that I raised yet.
>
> I'm tempted to raise another issue for qa : the debian 12 package cannot
> have been tested as it just won't work either as an upgrade or a new
> install.
>
> Regards, Chris


[ceph-users] Re: Debian 12 (bookworm) / Reef 18.2.1 problems

2024-02-02 Thread Chris Palmer
We have fundamental problems with the concept of cephadm and its 
direction of travel. But that's a different story.


The nub of this problem is a design incompatibility between the MGR and
the PyO3 package that python-cryptography relies on. The current usage is
actually unsafe, and the new package simply stops you performing the
unsafe operations. So this affects all distributions, containers, and
versions of Ceph. Eventually the updated PyO3 will find its way into
other distributions and containers, bringing things to a head.
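
As a rough illustration of what "stops you performing the unsafe
operations" means, here is a toy Python model of a process-wide
"initialize once" guard. It is only a sketch: the class and method names
are hypothetical, this is not PyO3's actual code, and it assumes (as the
sub-interpreter remark earlier in the thread suggests) that more than one
MGR module ends up importing the same native library within a single
ceph-mgr process.

```python
import threading


class OncePerProcessGuard:
    """Toy stand-in for a native extension's init hook (not PyO3 code)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._initialized = False

    def initialize(self, caller: str) -> str:
        # The flag is shared by the whole process, so a second importer is
        # refused instead of being allowed to share unsafe native state.
        with self._lock:
            if self._initialized:
                raise ImportError(
                    "PyO3 modules may only be initialized once per "
                    "interpreter process"
                )
            self._initialized = True
        return f"initialized for {caller}"


guard = OncePerProcessGuard()
print(guard.initialize("first mgr module"))
try:
    guard.initialize("second mgr module")
except ImportError as exc:
    print(f"second import refused: {exc}")
```

Without such a guard the second import would appear to succeed, which
matches the "apparently-working" installations described earlier.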




[ceph-users] Re: Debian 12 (bookworm) / Reef 18.2.1 problems

2024-02-05 Thread David Orman
Hi,

Just looking back through PyO3 issues, it would appear this functionality was 
never supported:

https://github.com/PyO3/pyo3/issues/3451
https://github.com/PyO3/pyo3/issues/576

It appears that attempts to use this functionality (which does not work or
exist) simply weren't prevented before, and now are. I see a few PRs in
associated projects (such as bcrypt) that attempt to roll the change back,
for example:

https://github.com/pyca/bcrypt/pull/714

This restores the previous behavior (I'm not sure whether equivalent PRs
exist for other libraries), but such rollbacks are basically stop-gaps until
proper support exists in PyO3, which may or may not happen in the near
future. It sounds like the rollbacks are being considered in case the
upstream issue isn't resolved within some undefined timeline.

I just wanted to add this information to further the discussion; I know it
does not resolve your immediate problem(s). It sounds like we need to discuss
the reliance on PyO3 if this is necessary functionality that the library
doesn't actually implement, that was only permitted in error, and that has an
undefined or ill-defined target date for resolution (it sounds like a large
upstream project). I don't pretend to know the complexities of an alternative
implementation, but it seems worth at least a cursory investigation, since
according to the PyO3 issues above the behavior right now (prior to the
blocking change) may be somewhat undefined even when no errors are thrown.
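
In the meantime, one way to tell whether a given MGR crash belongs to this
failure is to scan the crash record for the error text. The sketch below
assumes the JSON layout of the `ceph crash info` output quoted earlier in
this thread (`backtrace`, `mgr_python_exception`); the function name and the
trimmed sample record are hypothetical.

```python
# Error text taken from the crash report quoted earlier in this thread.
PYO3_MARKER = "PyO3 modules may only be initialized once per interpreter process"


def is_pyo3_reinit_crash(crash: dict) -> bool:
    """Return True if a `ceph crash info` record carries the PyO3 ImportError."""
    if crash.get("mgr_python_exception") != "ImportError":
        return False
    # Collapse whitespace so a line-wrapped message still matches.
    return any(
        PYO3_MARKER in " ".join(frame.split())
        for frame in crash.get("backtrace", [])
    )


# Trimmed-down sample modelled on the record shown earlier in the thread.
sample = {
    "backtrace": [
        '  File "/usr/share/ceph/mgr/cephadm/ssl_cert_utils.py", line 8\n'
        "    from cryptography import x509",
        "ImportError: PyO3 modules may only be initialized once per "
        "interpreter process",
    ],
    "mgr_module": "cephadm",
    "mgr_python_exception": "ImportError",
}
print(is_pyo3_reinit_crash(sample))
```

A real record could be loaded with `json.loads` from the saved output of
`ceph crash info <crash_id>`. Note that this only detects the hard failure,
not the undefined behavior that may occur before the blocking change.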

David 



[ceph-users] Re: Debian 12 (bookworm) / Reef 18.2.1 problems

2024-02-21 Thread Matthew Vernon

[mgr modules failing because pyO3 can't be imported more than once]

On 29/01/2024 12:27, Chris Palmer wrote:
> I have logged this as https://tracker.ceph.com/issues/64213


I've noted there that it's related to 
https://tracker.ceph.com/issues/63529 (an earlier report relating to the 
dashboard); there is a MR to fix just the dashboard issue which got 
merged into main. I've opened a MR to backport that change to Reef:

https://github.com/ceph/ceph/pull/55689

I don't know what the devs' plans are for dealing with the broader pyO3 
issue, but I'll ask on the dev list...


Regards,

Matthew