Bug#949743: ceph-osd crashes when osd_memory_target is set in config

2020-01-24 Thread Bernd Zeimetz
Hi Martin,

> When osd_memory_target is present in config ceph-osd refuses to start:
> 
> # ceph config set osd osd_memory_target 2147483648
> 
> # /usr/bin/ceph-osd -d --cluster ceph --id 0 --setuser ceph --setgroup ceph
> (cut)

I gave this a try in bullseye, with the same result.

My first guess is that a patch which fixes the build on 32-bit
systems introduced this bug, although I fail to understand why, as it
should not make a difference on 64-bit systems.

I'm building a version without these patches and will see what happens.


Bernd


-- 
 Bernd Zeimetz                            Debian GNU/Linux Developer
 http://bzed.de                                http://www.debian.org
 GPG Fingerprint: ECA1 E3F2 8E11 2432 D485  DD95 EB36 171A 6FF9 435F



Bug#949743: ceph-osd crashes when osd_memory_target is set in config

2020-01-24 Thread Martin Mlynář
Package: ceph-osd
Version: 14.2.6-4~bpo10+1
Severity: important


When osd_memory_target is present in config ceph-osd refuses to start:

# ceph config set osd osd_memory_target 2147483648

# /usr/bin/ceph-osd -d --cluster ceph --id 0 --setuser ceph --setgroup ceph
(cut)

 ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable)
 1: (()+0x13520) [0x7f83c83cc520]
 2: (gsignal()+0x141) [0x7f83c7e9b081]
 3: (abort()+0x121) [0x7f83c7e86535]
 4: (()+0x9a693) [0x7f83c821a693]
 5: (()+0xa6036) [0x7f83c8226036]
 6: (()+0xa60a1) [0x7f83c82260a1]
 7: (()+0xa62f5) [0x7f83c82262f5]
 8: (()+0x49a92c) [0x557d33e9a92c]
 9: (Option::size_t const md_config_t::get_val<Option::size_t>(ConfigValues const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const+0x51) [0x557d33fb1ea1]
 10: (BlueStore::_set_cache_sizes()+0x174) [0x557d344cea44]
 11: (BlueStore::_open_bdev(bool)+0x1c5) [0x557d344d1845]
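 12: (BlueStore::get_devices(std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >*)+0x103) [0x557d34558c43]
 13: (BlueStore::get_numa_node(int*, std::set<int, std::less<int>, std::allocator<int> >*, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >*)+0x83) [0x557d344de053]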
 14: (main()+0x2784) [0x557d33f7a834]
 15: (__libc_start_main()+0xeb) [0x7f83c7e87bbb]
 16: (_start()+0x2a) [0x557d33fac03a]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
(cut)

After removing this option everything works fine. This was tested on a
cleanly installed Debian system with a newly initialized cluster.
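
For anyone reproducing this: the value above is simply 2 GiB expressed in
bytes. The following is a minimal sketch of the set/remove cycle, not a
verified procedure. The helper name gib_to_bytes is only illustrative, and
the ceph commands are left as comments because they need a live cluster.

```shell
#!/bin/sh
# Convert a whole number of GiB to bytes, the unit osd_memory_target takes.
# gib_to_bytes is a hypothetical helper, not part of ceph.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

target=$(gib_to_bytes 2)
echo "osd_memory_target=$target"

# On a host with the ceph CLI and a running cluster (not executed here):
#   ceph config set osd osd_memory_target "$target"   # setting from this report
#   ceph config rm osd osd_memory_target               # remove the override again
```

With the override removed, the OSD started normally in my test.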

OSD & MON components are in sync with versions:
# ceph versions
{
    "mon": {
        "ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable)": 1
    },
    "mgr": {
        "ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable)": 1
    },
    "osd": {},
    "mds": {},
    "overall": {
        "ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable)": 2
    }
}

I've also tested this on Ubuntu with the official Ceph packages for bionic
and did not encounter the problem there.

There is also a discussion on the mailing list:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2020-January/038012.html
and an upstream Ceph bug: https://tracker.ceph.com/issues/43766


-- System Information:
Debian Release: 10.2
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'unstable'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 4.19.0-6-amd64 (SMP w/2 CPU cores)
Locale: LANG=cs_CZ.UTF-8, LC_CTYPE=cs_CZ.UTF-8 (charmap=UTF-8), 
LANGUAGE=cs_CZ.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages ceph-osd depends on:
ii  ceph-base                       14.2.6-5
ii  ceph-common                     14.2.6-5
ii  libaio1                         0.3.112-3
ii  libblkid1                       2.33.1-0.1
ii  libboost-iostreams1.67.0        1.67.0-13
ii  libboost-program-options1.67.0  1.67.0-13
ii  libboost-system1.67.0           1.67.0-13
ii  libboost-thread1.67.0           1.67.0-13
ii  libc6                           2.29-9
ii  libfuse2                        2.9.9-1
ii  libgcc1                         1:8.3.0-6
ii  libgoogle-perftools4            2.7-1
ii  libibverbs1                     24.0-2~bpo10+1
ii  libleveldb1d                    1.20-2.1
ii  liblz4-1                        1.8.3-1
ii  libnspr4                        2:4.20-1
ii  libnss3                         2:3.42.1-1+deb10u2
ii  librados2                       14.2.6-5
ii  librdmacm1                      24.0-2~bpo10+1
ii  libsnappy1v5                    1.1.7-1
ii  libssl1.1                       1.1.1d-0+deb10u2
ii  libstdc++6                      9.2.1-24
ii  libudev1                        241-7~deb10u2
ii  lvm2                            2.03.02-3
ii  python3                         3.7.3-1
ii  smartmontools                   7.1-1~bpo10+1
ii  sudo                            1.8.27-1+deb10u1
ii  zlib1g                          1:1.2.11.dfsg-1

ceph-osd recommends no packages.

Versions of packages ceph-osd suggests:
pn  nvme-cli  <none>

-- Configuration Files:
/etc/sudoers.d/ceph-osd-smartctl [Errno 13] Permission denied: '/etc/sudoers.d/ceph-osd-smartctl'

-- no debconf information