I have found the problem: it was all caused by a missing mon_host directive in
ceph.conf. I expected userspace to catch this, but it looks like it didn't
care.
We use DNS SRV records in this cluster.
With the mon_host directive reinstated, it was able to connect:
Jul 26 09:51:40 xx kernel: libceph: mon0 10.xx:6789 session established
Jul 26 09:51:40 xx kernel: libceph: client188721 fsid 548a0823-815a-4ac5-a2e5-42cc7e8206ab
Jul 26 09:51:40 xx kernel: rbd: image blk1: image uses unsupported features: 0x38
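As a side note on the last kernel line, 0x38 is a bitmask of image features that the kernel client does not support. Below is a minimal sketch for decoding it. The bit-to-name table is an assumption based on the usual RBD feature bit assignments and is worth checking against your Ceph release; `decode_features` is a hypothetical helper, not part of any Ceph tool.

```python
# Assumed RBD feature bit assignments; verify against your Ceph release.
RBD_FEATURES = {
    0x01: "layering",
    0x02: "striping",
    0x04: "exclusive-lock",
    0x08: "object-map",
    0x10: "fast-diff",
    0x20: "deep-flatten",
    0x40: "journaling",
    0x80: "data-pool",
}


def decode_features(mask):
    """Return the names of all feature bits set in mask, lowest bit first."""
    return [name for bit, name in sorted(RBD_FEATURES.items()) if mask & bit]


# 0x38 == 0x08 | 0x10 | 0x20
print(decode_features(0x38))
```

Under that assumed table the mask covers object-map, fast-diff and deep-flatten, so turning those off on the image (for example with `rbd feature disable`) would likely be the next step before the map can succeed.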
I'm wondering what happens if this mon1 host goes down: will the kernel
module go through the remaining addresses in the mon_host directive?
As for the strace, here you go:
# strace -f -e write -s 500 rbd device map test1/blk1 --user testing-rw
strace: Process 12962 attached
strace: Process 12963 attached
strace: Process 12964 attached
[pid 12964] write(7, " name=testing-rw,key=client.testing-rw test1 blk1 -", 51) = -1 ESRCH (No such process)
[pid 12964] write(6, "\375\377\377\377", 4) = 4
[pid 12961] write(2, "rbd: sysfs write failed", 23rbd: sysfs write failed
[pid 12964] +++ exited with 0 +++
[pid 12961] <... write resumed>)= 23
[pid 12961] write(2, "\n", 1
) = 1
strace: Process 12970 attached
strace: Process 12971 attached
strace: Process 12972 attached
[pid 12961] write(6, "c", 1)= 1
strace: Process 12973 attached
strace: Process 12974 attached
strace: Process 12975 attached
strace: Process 12976 attached
[pid 12961] write(9, "c", 1)= 1
[pid 12961] write(12, "c", 1) = 1
[pid 12961] write(6, "c", 1)= 1
[pid 12971] write(6, "c", 1)= 1
[pid 12971] write(12, "c", 1) = 1
[pid 12961] write(9, "c", 1)= 1
[pid 12976] +++ exited with 0 +++
[pid 12975] +++ exited with 0 +++
[pid 12961] write(6, "c", 1)= 1
[pid 12961] write(6, "c", 1)= 1
[pid 12961] write(9, "c", 1)= 1
[pid 12961] write(12, "c", 1) = 1
[pid 12974] +++ exited with 0 +++
[pid 12973] +++ exited with 0 +++
[pid 12961] write(6, "c", 1)= 1
[pid 12961] write(9, "c", 1)= 1
[pid 12961] write(12, "c", 1) = 1
strace: Process 12977 attached
[pid 12961] write(6, "c", 1)= 1
strace: Process 12978 attached
strace: Process 12979 attached
strace: Process 12980 attached
strace: Process 12981 attached
[pid 12961] write(9, "c", 1)= 1
[pid 12961] write(12, "c", 1) = 1
[pid 12961] write(6, "c", 1)= 1
[pid 12971] write(12, "c", 1) = 1
[pid 12971] write(6, "c", 1)= 1
[pid 12961] write(9, "c", 1)= 1
strace: Process 12982 attached
[pid 12961] write(9, "c", 1)= 1
strace: Process 12983 attached
strace: Process 12984 attached
[pid 12978] write(12, "c", 1) = 1
strace: Process 12985 attached
strace: Process 12986 attached
[pid 12961] write(6, "c", 1)= 1
[pid 12984] write(6, "c", 1)= 1
[pid 12984] write(9, "c", 1)= 1
[pid 12984] write(9, "c", 1)= 1
[pid 12984] write(9, "c", 1)= 1
[pid 12984] write(9, "c", 1)= 1
strace: Process 12987 attached
strace: Process 12988 attached
[pid 12984] write(9, "c", 1)= 1
[pid 12984] write(9, "c", 1)= 1
[pid 12984] write(9, "c", 1)= 1
[pid 12984] write(12, "c", 1) = 1
[pid 12984] write(9, "c", 1)= 1
strace: Process 12989 attached
strace: Process 12990 attached
[pid 12961] write(1, "In some cases useful info is found in syslog - try \"dmesg | tail\".\n", 67In some cases useful info is found in syslog - try "dmesg | tail".
) = 67
[pid 12990] +++ exited with 0 +++
[pid 12989] +++ exited with 0 +++
[pid 12984] +++ exited with 0 +++
[pid 12983] +++ exited with 0 +++
[pid 12961] write(6, "c", 1)= 1
[pid 12961] write(9, "c", 1)= 1
[pid 12961] write(12, "c", 1) = 1
[pid 12961] write(6, "c", 1)= 1
[pid 12982] +++ exited with 0 +++
[pid 12961] write(12, "c", 1) = 1
[pid 12961] write(9, "c", 1)= 1
[pid 12981] +++ exited with 0 +++
[pid 12980] +++ exited with 0 +++
[pid 12961] write(6, "c", 1)= 1
[pid 12961] write(6, "c", 1)= 1
[pid 12961] write(9, "c", 1)= 1
[pid 12961] write(12, "c", 1) = 1
[pid 12979] +++ exited with 0 +++
[pid 12978] +++ exited with 0 +++
[pid 12961] write(6, "c", 1)= 1
[pid 12961] write(9, "c", 1)= 1
[pid 12961] write(12, "c", 1) = 1
[pid 12977] +++ exited with 0 +++
[pid 12961] write(2, "rbd: map failed: ", 17rbd: map failed: ) = 17
[pid 12961] write(2, "(3) No such process", 19(3) No such process) = 19
[pid 12961] write(2, "\n", 1
) = 1
[pid 12985] +++ exited with 0 +++
[pid 12986] +++ exited with 0 +++
[pid 12987] +++ exited with 0 +++
[pid 12988] +++ exited with 0 +++
[pid 12961] write(6, "c", 1)= 1
[pid 12970] +++ exited with 0 +++
[pid 12961] write(9, "c", 1)
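
For reading the trace above: the four bytes in write(6, "\375\377\377\377", 4) look like the failed sysfs write's error code being passed back from the helper thread. This is a small sketch of decoding them, assuming they are a little-endian signed 32-bit integer holding a negated errno; that interpretation is my assumption and is not confirmed by the rbd source here.

```python
import errno
import os
import struct

# Bytes from the traced call; strace prints them in octal
# (0o375 == 0xfd, 0o377 == 0xff).
payload = bytes([0o375, 0o377, 0o377, 0o377])

# Assumption: a little-endian signed 32-bit integer holding -errno.
(value,) = struct.unpack("<i", payload)

print(value)                    # the raw signed value
print(errno.errorcode[-value])  # symbolic errno name
print(os.strerror(-value))      # human-readable message
```

On Linux this decodes to -3, that is ESRCH, which matches both the failed write to fd 7 and the final "rbd: map failed: (3) No such process" message.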