I confirmed SELinux is disabled on all existing and new hosts. Likewise, Python 3.9 is installed on all of them (3.9.16 on RL8, 3.9.18 on RL9).

I am running 16.2.12 on all containers, so it may be worth updating to 16.2.14 to ensure I'm on the latest Pacific release.

Gary


On 2024-02-05 08:17, Curt wrote:

        

I don't use Rocky, so this is a stab in the dark and probably not the issue, but could SELinux be blocking the process? It's a really long shot, but is python3 in the standard location? If you run python3 --version as your ceph user, does it return a version?

Probably not much help, but figured I'd throw it out there.
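
The two checks suggested above can be scripted so they run the same way on every host. This is a minimal sketch, assuming it is executed as the ceph SSH user on the host being checked; `run_check` is a hypothetical helper name, and `getenforce` is simply reported as missing on hosts without the SELinux utilities.

```python
import subprocess


def run_check(cmd):
    """Run cmd and return (returncode, combined stdout/stderr).

    Returns (None, reason) if the command could not be started or timed out.
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=10,
        )
    except FileNotFoundError:
        return None, "command not found: " + cmd[0]
    except OSError as exc:
        return None, "could not start {}: {}".format(cmd[0], exc)
    except subprocess.TimeoutExpired:
        return None, "timed out: " + cmd[0]
    return proc.returncode, proc.stdout.strip()


if __name__ == "__main__":
    # Both commands are looked up on the PATH of the user running the script.
    for cmd in (["python3", "--version"], ["getenforce"]):
        rc, out = run_check(cmd)
        print("{}: rc={} output={!r}".format(" ".join(cmd), rc, out))
```

A return code of `None` for `python3` would mean the interpreter is not on that user's PATH, which is the situation the question is probing for.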

On Mon, 5 Feb 2024, 16:54 Gary Molenkamp, <molen...@uwo.ca> wrote:

    I have verified the server's expected hostname (with `hostname`) matches
    the hostname I am trying to use.
    Just to be sure, I also ran:
         cephadm check-host --expect-hostname <hostname>
    and it returns:
         Hostname "<hostname>" matches what is expected.

    On the current admin server where I am trying to add the host, the host
    is reachable, and the shortname even resolves to the proper IP with the
    DNS search order. Likewise, on the server where the mgr is running, I am
    able to confirm reachability and DNS resolution for the new server as well.

    I thought this may be a DNS/name resolution issue as well, but I don't
    see any errors in my setup with respect to host naming.

    Thanks
    Gary
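
The name-resolution check described in this message can be repeated from Python on both the admin host and the host running the mgr. This is a minimal sketch; `resolved_addresses` and `resolves_to` are hypothetical helper names, and the lookup goes through the system resolver of whichever host runs it.

```python
import socket


def resolved_addresses(name):
    """Return the sorted list of IP addresses that name resolves to."""
    infos = socket.getaddrinfo(name, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


def resolves_to(name, expected_ip):
    """True if expected_ip is among the addresses name resolves to."""
    try:
        return expected_ip in resolved_addresses(name)
    except socket.gaierror:
        # The resolver could not resolve the name at all.
        return False
```

For example, `resolves_to("<hostname>", "172.31.102.41")` should be true on every host that needs to reach the new server, with `<hostname>` replaced by the short name passed to `ceph orch host add`.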


    On 2024-02-03 06:46, Eugen Block wrote:
    > Hi,
    >
    > I found this blog post [1] which reports the same error message. It
    > seems a bit misleading because it appears to be about DNS. Can you check
    >
    > cephadm check-host --expect-hostname <HOSTNAME>
    >
    > Or is that what you already tried? It's not entirely clear how you
    > checked the hostname.
    >
    > Regards,
    > Eugen
    >
    > [1] https://blog.mousetech.com/ceph-distributed-file-system-for-the-enterprise/ceph-bogus-error-cannot-allocate-memory/
    >
    > Zitat von Gary Molenkamp <molen...@uwo.ca>:
    >
    >> Happy Friday all.  I was hoping someone could point me in the right
    >> direction or clarify any limitations that could be impacting an issue
    >> I am having.
    >>
    >> I'm struggling to add a new set of hosts to my ceph cluster using
    >> cephadm and orchestration.  When trying to add a host:
    >>     "ceph orch host add <hostname> 172.31.102.41 --labels _admin"
    >> returns:
    >>     "Error EINVAL: Can't communicate with remote host
    >> `172.31.102.41`, possibly because python3 is not installed there:
    >> [Errno 12] Cannot allocate memory"
    >>
    >> I've verified that the ceph ssh key works to the remote host, the host's
    >> name matches that returned from `hostname`, python3 is installed, and
    >> "/usr/sbin/cephadm prepare-host" on the new hosts returns "host is
    >> ok".  In addition, the cluster ssh key works between hosts and the
    >> existing hosts are able to ssh in using the ceph key.
    >>
    >> The existing ceph cluster is Pacific release using docker based
    >> containerization on RockyLinux8 base OS.  The new hosts are
    >> RockyLinux9 based, with the cephadm being installed from Quincy release:
    >>         ./cephadm add-repo --release quincy
    >>         ./cephadm install
    >> I did try installing cephadm from the Pacific release by changing the
    >> repo to el8, but that did not work either.
    >>
    >> Is there a limitation in mixing RL8 and RL9 container hosts under
    >> Pacific?  Does this same limitation exist under Quincy? Is there a
    >> python version dependency?
    >> The reason for RL9 on the new hosts is to stage upgrading the OS's
    >> for the cluster.  I did this under Octopus for moving from Centos7 to
    >> RL8.
    >>
    >> Thanks and I appreciate any feedback/pointers.
    >> Gary
    >>
    >>
    >> I've added the log trace here in case that helps (from `ceph log last
    >> cephadm`)
    >>
    >>
    >>
    >> 2024-02-02T14:22:32.610048+0000 mgr.storage01.oonvfl (mgr.441023307) 4957871 : cephadm [ERR] Can't communicate with remote host `172.31.102.41`, possibly because python3 is not installed there: [Errno 12] Cannot allocate memory
    >> Traceback (most recent call last):
    >>   File "/usr/share/ceph/mgr/cephadm/serve.py", line 1524, in _remote_connection
    >>     conn, connr = self.mgr._get_connection(addr)
    >>   File "/usr/share/ceph/mgr/cephadm/module.py", line 1370, in _get_connection
    >>     sudo=True if self.ssh_user != 'root' else False)
    >>   File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line 35, in __init__
    >>     self.gateway = self._make_gateway(hostname)
    >>   File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line 46, in _make_gateway
    >>     self._make_connection_string(hostname)
    >>   File "/lib/python3.6/site-packages/execnet/multi.py", line 133, in makegateway
    >>     io = gateway_io.create_io(spec, execmodel=self.execmodel)
    >>   File "/lib/python3.6/site-packages/execnet/gateway_io.py", line 121, in create_io
    >>     io = Popen2IOMaster(args, execmodel)
    >>   File "/lib/python3.6/site-packages/execnet/gateway_io.py", line 21, in __init__
    >>     self.popen = p = execmodel.PopenPiped(args)
    >>   File "/lib/python3.6/site-packages/execnet/gateway_base.py", line 184, in PopenPiped
    >>     return self.subprocess.Popen(args, stdout=PIPE, stdin=PIPE)
    >>   File "/lib64/python3.6/subprocess.py", line 729, in __init__
    >>     restore_signals, start_new_session)
    >>   File "/lib64/python3.6/subprocess.py", line 1295, in _execute_child
    >>     restore_signals, start_new_session, preexec_fn)
    >> OSError: [Errno 12] Cannot allocate memory
    >>
    >> During handling of the above exception, another exception occurred:
    >>
    >> Traceback (most recent call last):
    >>   File "/usr/share/ceph/mgr/cephadm/serve.py", line 1528, in _remote_connection
    >>     raise execnet.gateway_bootstrap.HostNotFound(msg)
    >> execnet.gateway_bootstrap.HostNotFound: Can't communicate with remote host `172.31.102.41`, possibly because python3 is not installed there: [Errno 12] Cannot allocate memory
    >>
    >> The above exception was the direct cause of the following exception:
    >>
    >> Traceback (most recent call last):
    >>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 125, in wrapper
    >>     return OrchResult(f(*args, **kwargs))
    >>   File "/usr/share/ceph/mgr/cephadm/module.py", line 2709, in apply
    >>     results.append(self._apply(spec))
    >>   File "/usr/share/ceph/mgr/cephadm/module.py", line 2574, in _apply
    >>     return self._add_host(cast(HostSpec, spec))
    >>   File "/usr/share/ceph/mgr/cephadm/module.py", line 1517, in _add_host
    >>     ip_addr = self._check_valid_addr(spec.hostname, spec.addr)
    >>   File "/usr/share/ceph/mgr/cephadm/module.py", line 1498, in _check_valid_addr
    >>     error_ok=True, no_fsid=True)
    >>   File "/usr/share/ceph/mgr/cephadm/serve.py", line 1326, in _run_cephadm
    >>     with self._remote_connection(host, addr) as tpl:
    >>   File "/lib64/python3.6/contextlib.py", line 81, in __enter__
    >>     return next(self.gen)
    >>   File "/usr/share/ceph/mgr/cephadm/serve.py", line 1558, in _remote_connection
    >>     raise OrchestratorError(msg) from e
    >> orchestrator._interface.OrchestratorError: Can't communicate with remote host `172.31.102.41`, possibly because python3 is not installed there: [Errno 12] Cannot allocate memory
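
Read from the bottom up, the first traceback shows the `OSError` being raised inside `subprocess.Popen` in the mgr's own Python 3.6 interpreter, that is, while the mgr is starting the local child process that execnet uses for the connection, before anything has run on the remote host. The sketch below only illustrates that distinction; `spawn_error_kind` and `try_spawn` are hypothetical helpers and not part of cephadm, remoto, or execnet.

```python
import errno
import subprocess


def spawn_error_kind(exc):
    """Classify an OSError raised while starting a local child process."""
    if exc.errno == errno.ENOMEM:
        return "local process could not be created (ENOMEM)"
    if exc.errno == errno.ENOENT:
        return "local executable not found (ENOENT)"
    return "other local spawn failure (errno {})".format(exc.errno)


def try_spawn(args):
    """Start args with piped stdin/stdout, as in the trace; return (ok, detail)."""
    try:
        proc = subprocess.Popen(
            args, stdout=subprocess.PIPE, stdin=subprocess.PIPE
        )
    except OSError as exc:
        # Raised in the calling process; the child never ran.
        return False, spawn_error_kind(exc)
    proc.communicate()  # close stdin, drain stdout, wait for exit
    return True, "spawned pid {}".format(proc.pid)
```

With this reading, `[Errno 12]` points at resource limits on the side that spawns the child (the mgr container), so the "python3 is not installed there" part of the message may be unrelated to the actual failure.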
    >>
    >>
    >>
    >>
    >> --
    >> Gary Molenkamp            Science Technology Services
    >> Systems Engineer        University of Western Ontario
    >> molen...@uwo.ca http://sts.sci.uwo.ca
    >> (519) 661-2111 x86882        (519) 661-3566
    >> _______________________________________________
    >> ceph-users mailing list -- ceph-users@ceph.io
    >> To unsubscribe send an email to ceph-users-le...@ceph.io
    >
    >



