Re: [ceph-users] FYI - Mimic segv in OSD

2018-07-09 Thread Steffen Winther Sørensen


> On 9 Jul 2018, at 15.49, John Spray wrote:
> 
> On Mon, Jul 9, 2018 at 2:37 PM Steffen Winther Sørensen wrote:
>> 
>> Dunno if this has been seen before, so just for info: 1 of 24 OSDs just did this:
>> 
>> Jul  9 15:13:35 n4 ceph-osd: *** Caught signal (Segmentation fault) **
>> Jul  9 15:13:35 n4 ceph-osd: in thread 7ff209282700 thread_name:msgr-worker-2
>> Jul  9 15:13:35 n4 kernel: msgr-worker-2[4697]: segfault at 0 ip 7ff21002f42b sp 7ff20927b9c0 error 4 in libtcmalloc.so.4.4.5[7ff210008000+46000]
>> Jul  9 15:13:36 n4 systemd: ceph-osd@2.service: main process exited, code=killed, status=11/SEGV
>> Jul  9 15:13:36 n4 systemd: Unit ceph-osd@2.service entered failed state.
>> Jul  9 15:13:36 n4 systemd: ceph-osd@2.service failed.
> 
> Hopefully there's a stack trace above those lines in your OSD log?

Nope, just what looks like relaunch events:
...
2018-07-09 14:45:17.158 7ff1ef75f700  0 log_channel(cluster) log [DBG] : 3.a0 scrub starts
2018-07-09 14:45:17.185 7ff1ef75f700  0 log_channel(cluster) log [DBG] : 3.a0 scrub ok
2018-07-09 15:07:04.398 7ff1eef5e700  0 log_channel(cluster) log [DBG] : 4.b0 scrub starts
2018-07-09 15:07:04.412 7ff1eef5e700  0 log_channel(cluster) log [DBG] : 4.b0 scrub ok
2018-07-09 15:13:56.365 7f31359411c0  0 set uid:gid to 167:167 (ceph:ceph)
2018-07-09 15:13:56.365 7f31359411c0  0 ceph version 13.2.0 (79a10589f1f80dfe21e8f9794365ed98143071c4) mimic (stable), process (unknown), pid 604987
2018-07-09 15:13:56.365 7f31359411c0  0 pidfile_write: ignore empty --pid-file
2018-07-09 15:13:56.442 7f31359411c0  0 load: jerasure load: lrc load: isa
2018-07-09 15:13:56.442 7f31359411c0  1 bdev create path /var/lib/ceph/osd/ceph-2/block type kernel
2018-07-09 15:13:56.442 7f31359411c0  1 bdev(0x559559628000 /var/lib/ceph/osd/ceph-2/block) open path /var/lib/ceph/osd/ceph-2/block
2018-07-09 15:13:56.443 7f31359411c0  1 bdev(0x559559628000 /var/lib/ceph/osd/ceph-2/block) open size 146775474176 (0x222c80, 137 GiB) block_size 4096 (4 KiB) rotational
2018-07-09 15:13:56.444 7f31359411c0  1 bluestore(/var/lib/ceph/osd/ceph-2) _set_cache_sizes cache_size 1073741824 meta 0.5 kv 0.5 data 0
2018-07-09 15:13:56.444 7f31359411c0  1 bdev(0x559559628000 /var/lib/ceph/osd/ceph-2/block) close
2018-07-09 15:13:56.700 7f31359411c0  1 bluestore(/var/lib/ceph/osd/ceph-2) _mount path /var/lib/ceph/osd/ceph-2
2018-07-09 15:13:56.700 7f31359411c0  1 bdev create path /var/lib/ceph/osd/ceph-2/block type kernel
2018-07-09 15:13:56.700 7f31359411c0  1 bdev(0x559559628000 /var/lib/ceph/osd/ceph-2/block) open path /var/lib/ceph/osd/ceph-2/block
2018-07-09 15:13:56.700 7f31359411c0  1 bdev(0x559559628000 /var/lib/ceph/osd/ceph-2/block) open size 146775474176 (0x222c80, 137 GiB) block_size 4096 (4 KiB) rotational
2018-07-09 15:13:56.701 7f31359411c0  1 bluestore(/var/lib/ceph/osd/ceph-2) _set_cache_sizes cache_size 1073741824 meta 0.5 kv 0.5 data 0
2018-07-09 15:13:56.701 7f31359411c0  1 bdev create path /var/lib/ceph/osd/ceph-2/block type kernel
2018-07-09 15:13:56.701 7f31359411c0  1 bdev(0x559559628a80 /var/lib/ceph/osd/ceph-2/block) open path /var/lib/ceph/osd/ceph-2/block
2018-07-09 15:13:56.701 7f31359411c0  1 bdev(0x559559628a80 /var/lib/ceph/osd/ceph-2/block) open size 146775474176 (0x222c80, 137 GiB) block_size 4096 (4 KiB) rotational
2018-07-09 15:13:56.701 7f31359411c0  1 bluefs add_block_device bdev 1 path /var/lib/ceph/osd/ceph-2/block size 137 GiB
2018-07-09 15:13:56.701 7f31359411c0  1 bluefs mount
2018-07-09 15:13:56.757 7f31359411c0  0  set rocksdb option compaction_readahead_size = 2097152
2018-07-09 15:13:56.757 7f31359411c0  0  set rocksdb option compression = kNoCompression
2018-07-09 15:13:56.757 7f31359411c0  0  set rocksdb option max_write_buffer_number = 4
2018-07-09 15:13:56.757 7f31359411c0  0  set rocksdb option min_write_buffer_number_to_merge = 1
2018-07-09 15:13:56.757 7f31359411c0  0  set rocksdb option recycle_log_file_num = 4
2018-07-09 15:13:56.757 7f31359411c0  0  set rocksdb option writable_file_max_buffer_size = 0
2018-07-09 15:13:56.757 7f31359411c0  0  set rocksdb option write_buffer_size = 268435456
2018-07-09 15:13:56.757 7f31359411c0  0  set rocksdb option compaction_readahead_size = 2097152
2018-07-09 15:13:56.757 7f31359411c0  0  set rocksdb option compression = kNoCompression
2018-07-09 15:13:56.757 7f31359411c0  0  set rocksdb option max_write_buffer_number = 4
2018-07-09 15:13:56.757 7f31359411c0  0  set rocksdb option min_write_buffer_number_to_merge = 1
2018-07-09 15:13:56.757 7f31359411c0  0  set rocksdb option recycle_log_file_num = 4
2018-07-09 15:13:56.757 7f31359411c0  0  set rocksdb option writable_file_max_buffer_size = 0
2018-07-09 15:13:56.757 7f31359411c0  0  set rocksdb option write_buffer_size = 268435456
2018-07-09 15:13:56.764 7f31359411c0  1 rocksdb: do_open column families: [default]
2018-07-09 15:13:56.764 7f31359411c0  4 rocksdb: RocksDB version: 5.13.0

2018-07-09 15:13:5

Re: [ceph-users] FYI - Mimic segv in OSD

2018-07-09 Thread John Spray
On Mon, Jul 9, 2018 at 2:37 PM Steffen Winther Sørensen wrote:
>
> Dunno if this has been seen before, so just for info: 1 of 24 OSDs just did this:
>
> Jul  9 15:13:35 n4 ceph-osd: *** Caught signal (Segmentation fault) **
> Jul  9 15:13:35 n4 ceph-osd: in thread 7ff209282700 thread_name:msgr-worker-2
> Jul  9 15:13:35 n4 kernel: msgr-worker-2[4697]: segfault at 0 ip 7ff21002f42b sp 7ff20927b9c0 error 4 in libtcmalloc.so.4.4.5[7ff210008000+46000]
> Jul  9 15:13:36 n4 systemd: ceph-osd@2.service: main process exited, code=killed, status=11/SEGV
> Jul  9 15:13:36 n4 systemd: Unit ceph-osd@2.service entered failed state.
> Jul  9 15:13:36 n4 systemd: ceph-osd@2.service failed.

Hopefully there's a stack trace above those lines in your OSD log?
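
For anyone wanting to check quickly, here is a rough Python sketch that pulls the signal marker and the lines after it out of a log. The function name crash_context and the `after` window are made up for illustration, and it assumes the OSD log carries the same "*** Caught signal" text that shows up in syslog, with any backtrace following that line. The sample lines are the syslog ones quoted above; for a real check, feed it the lines of the OSD log file instead.

```python
def crash_context(lines, marker="*** Caught signal", after=30):
    """Return, for each line containing `marker`, that line plus up to
    `after` lines that follow it (where a backtrace would be expected)."""
    lines = list(lines)
    hits = []
    for i, line in enumerate(lines):
        if marker in line:
            hits.append(lines[i:i + 1 + after])
    return hits


# Sample input: syslog lines copied from the report in this thread.
SAMPLE = [
    "Jul  9 15:13:35 n4 ceph-osd: *** Caught signal (Segmentation fault) **",
    "Jul  9 15:13:35 n4 ceph-osd: in thread 7ff209282700 thread_name:msgr-worker-2",
    "Jul  9 15:13:36 n4 systemd: ceph-osd@2.service failed.",
]

for block in crash_context(SAMPLE, after=1):
    print("\n".join(block))
```

If the function returns an empty list for the OSD log, the daemon most likely died before it could write a backtrace, and a core dump would be the next thing to look for.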

John

> # ceph --version
> ceph version 13.2.0 (79a10589f1f80dfe21e8f9794365ed98143071c4) mimic (stable)
> # cat /etc/centos-release
> CentOS Linux release 7.5.1804 (Core)
> # uname -r
> 3.10.0-862.3.3.el7.x86_64
>
> /Steffen
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] FYI - Mimic segv in OSD

2018-07-09 Thread Steffen Winther Sørensen
Dunno if this has been seen before, so just for info: 1 of 24 OSDs just did this:

Jul  9 15:13:35 n4 ceph-osd: *** Caught signal (Segmentation fault) **
Jul  9 15:13:35 n4 ceph-osd: in thread 7ff209282700 thread_name:msgr-worker-2
Jul  9 15:13:35 n4 kernel: msgr-worker-2[4697]: segfault at 0 ip 7ff21002f42b sp 7ff20927b9c0 error 4 in libtcmalloc.so.4.4.5[7ff210008000+46000]
Jul  9 15:13:36 n4 systemd: ceph-osd@2.service: main process exited, code=killed, status=11/SEGV
Jul  9 15:13:36 n4 systemd: Unit ceph-osd@2.service entered failed state.
Jul  9 15:13:36 n4 systemd: ceph-osd@2.service failed.
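
The kernel line above already says a fair bit on its own. As a minimal sketch (the function name decode_segfault and the dictionary keys are made up for illustration), the following Python decodes it: the faulting address, the offset of the instruction pointer inside the libtcmalloc mapping, and the x86 page-fault error bits (bit 0 = page present, bit 1 = write, bit 2 = user mode).

```python
import re

# Kernel line copied from the report above, rejoined onto one line.
LINE = ("msgr-worker-2[4697]: segfault at 0 ip 7ff21002f42b sp 7ff20927b9c0 "
        "error 4 in libtcmalloc.so.4.4.5[7ff210008000+46000]")

# All numeric fields in this kernel message are hexadecimal.
PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<lib>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Decode an x86 Linux 'segfault at ...' kernel log line."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    addr = int(m["addr"], 16)
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    size = int(m["size"], 16)
    err = int(m["err"], 16)
    return {
        "library": m["lib"],
        "fault_address": addr,
        # Offset of the faulting instruction from the start of the mapping.
        "offset_in_library": ip - base,
        "ip_inside_mapping": base <= ip < base + size,
        # x86 page-fault error code bits.
        "page_present": bool(err & 1),
        "access": "write" if err & 2 else "read",
        "mode": "user" if err & 4 else "kernel",
    }


info = decode_segfault(LINE)
print(info["library"], hex(info["offset_in_library"]),
      info["access"], info["mode"], hex(info["fault_address"]))
```

For this line that gives a user-mode read of address 0 (a NULL dereference) at offset 0x2742b into libtcmalloc.so.4.4.5. With matching debug symbols installed, that offset could be handed to something like addr2line -e against the library file on that host to get a function name, though it won't replace a proper backtrace.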

# ceph --version
ceph version 13.2.0 (79a10589f1f80dfe21e8f9794365ed98143071c4) mimic (stable)
# cat /etc/centos-release
CentOS Linux release 7.5.1804 (Core) 
# uname -r
3.10.0-862.3.3.el7.x86_64

/Steffen