Hi all,
Two days ago we upgraded our cluster from octopus to pacific. Everything went
well and we see lots of improvements. Thanks for releasing the last stable
version with all its fixes. I do have some questions though and this hiccup is
one for starters:
After the upgrade to pacific we started getting the error message
"admin_socket: exception getting command descriptions: [Errno 2] No such file
or directory" when using the ceph daemon command. Here is the output of a full
session:
[root@ceph-adm:ceph-26 ~]# ceph daemon mon.ceph-26 version | jq .release
"pacific"
[root@ceph-adm:ceph-26 ~]# ceph --id admin daemon mon.ceph-26 version | jq .release
admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[root@ceph-adm:ceph-26 ~]# ceph --id admin daemon /var/run/ceph/ceph-mon.ceph-26.asok version | jq .release
"pacific"
[root@ceph-adm:ceph-26 ~]# ceph daemon /var/run/ceph/ceph-mon.ceph-26.asok version | jq .release
"pacific"
We observe that the ceph daemon command cannot be used in its simple form
(with a daemon name) whenever an --id argument is present. Unfortunately, this
creates an unnecessary restriction: we can't use non-admin users any more.
Here is why this fails:
[root@ceph-adm:ceph-26 ~]# strace ceph daemon mon.ceph-26 version |& grep asok
stat("/var/run/ceph/ceph-mon.ceph-26.asok", {st_mode=S_IFSOCK|0755, st_size=0, ...}) = 0
connect(3, {sa_family=AF_UNIX, sun_path="/var/run/ceph/ceph-mon.ceph-26.asok"}, 37) = 0
getpeername(3, {sa_family=AF_UNIX, sun_path="/var/run/ceph/ceph-mon.ceph-26.asok"}, [110 => 38]) = 0
connect(3, {sa_family=AF_UNIX, sun_path="/var/run/ceph/ceph-mon.ceph-26.asok"}, 37) = 0
getpeername(3, {sa_family=AF_UNIX, sun_path="/var/run/ceph/ceph-mon.ceph-26.asok"}, [110 => 38]) = 0
[root@ceph-adm:ceph-26 ~]# strace ceph --id admin daemon mon.ceph-26 version |& grep asok
stat("/var/run/ceph/ceph-mon.admin.asok", 0x7fffa65e9f00) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_UNIX, sun_path="/var/run/ceph/ceph-mon.admin.asok"}, 35) = -1 ENOENT (No such file or directory)
As you can see, the daemon name "ceph-26" was replaced with the user name
"admin" passed as the argument to --id. As a result, the command looks for a
non-existent socket file. Passing the full path "fixes" this. This is clearly a bug
and I wonder if there is a way out, for example, by setting an explicit daemon
path template in the config.
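In the meantime, the full-path form could be wrapped in a small shell helper so
that scripts can keep using daemon names. This is only a minimal sketch of that
idea, not a fix: asok_path and the ASOK_DIR variable are names I made up, and
it assumes the sockets live in /var/run/ceph and are named ceph-<daemon>.asok,
as in the strace output above.

```shell
# Hypothetical helper, not part of ceph: map a daemon name such as
# "mon.ceph-26" to its admin socket path.
# Assumptions: sockets live in /var/run/ceph and are named
# ceph-<daemon>.asok, as seen in the strace output. ASOK_DIR is a
# made-up override variable that only this function reads.
asok_path() {
    printf '%s/ceph-%s.asok\n' "${ASOK_DIR:-/var/run/ceph}" "$1"
}

# Intended use (same command as in the session above, path computed):
#   ceph --id admin daemon "$(asok_path mon.ceph-26)" version | jq .release
```

Since the path is built from the daemon name alone, the value given to --id
cannot leak into it. A cluster with a different socket directory or cluster
name would need the pattern adjusted.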
I will open a tracker if a user on quincy or newer confirms that this is
present in newer versions as well. I wonder if this is a fall-out of
https://docs.ceph.com/en/latest/releases/pacific/#id39 Point 3: "$pid expansion
in config paths like admin_socket will now properly expand to the daemon pid
for commands like ceph-mds or ceph-osd. Previously only ceph-fuse/rbd-nbd
expanded $pid with the actual daemon pid."
Thanks for any pointers on how to work around this issue.
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14