Hello,

On 2024-07-25 16:39, Harry G Coin wrote:
Upgraded to 18.2.4 yesterday.  Healthy cluster reported a few minutes after the upgrade completed.  Next morning, this:

# ceph health detail
HEALTH_ERR Module 'diskprediction_local' has failed: No module named 'sklearn'
[ERR] MGR_MODULE_ERROR: Module 'diskprediction_local' has failed: No module named 'sklearn'
    Module 'diskprediction_local' has failed: No module named 'sklearn'


Searching found this was a problem several years ago, then resolved, now returned.

We encountered the same problem after an upgrade on our cluster, and I dug into it a bit. It appears that [0] was the fix for the missing sklearn package back in 2021. That fix seems to have been tied specifically to CentOS 8.

Now that the container images are built on CentOS 9, the relevant Dockerfile no longer applies the fix, because the fix is guarded by a check for CentOS 8. I wonder a bit why it was done this way.
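For anyone who wants to confirm the same cause on their own cluster, here is a minimal sketch that checks whether sklearn is importable. It assumes you can run Python with the same interpreter and environment the mgr uses (for example from a shell inside the mgr container). The helper name module_available is my own and not part of Ceph.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module called `name` can be found
    on the current interpreter's import path, without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for broken or oddly initialised modules;
        # treat that as "not usable" for the purpose of this check.
        return False


if __name__ == "__main__":
    # 'sklearn' is the module named in the MGR_MODULE_ERROR message.
    name = "sklearn"
    status = "found" if module_available(name) else "MISSING"
    print(f"{name}: {status}")
```

If this reports the module as missing inside the image, that would be consistent with the install step being skipped on CentOS 9 rather than with a problem in the mgr module itself.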

The problem on CentOS 9 seems to be known to the ceph-container maintainers. See for example [1].

[0] https://github.com/ceph/ceph-container/pull/1821/files
[1] https://github.com/ceph/ceph-container/blob/main/ceph-releases/ALL/centos/9/daemon-base/README.tmp

Best regards,
Rouven
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
