Hi

tcmalloc on ARMv7 is problematic. You need to compile your own build of Ceph
with either jemalloc or just libc malloc.
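
To check whether a crash like the one quoted below goes through the allocator, the backtrace can be scanned for frames in the tcmalloc namespace. This is a minimal sketch, assuming each frame has been unwrapped onto a single line; the function name `tcmalloc_frames` and the regular expression are illustrative and not part of Ceph.

```python
import re
from typing import List, Tuple

# Matches one backtrace frame, e.g.
# "6: (tcmalloc::PageHeap::GrowHeap(unsigned int)+0xb9) [0xb6ccd36a]",
# with or without a leading ">" email quote marker.
FRAME_RE = re.compile(
    r"^>?\s*(\d+):\s+\((.*)\+0x[0-9a-fA-F]+\)\s+\[0x[0-9a-fA-F]+\]\s*$"
)


def tcmalloc_frames(backtrace: str) -> List[Tuple[int, str]]:
    """Return (frame number, symbol) for each frame inside the tcmalloc namespace."""
    frames = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line)
        if match and match.group(2).startswith("tcmalloc::"):
            frames.append((int(match.group(1)), match.group(2)))
    return frames


# A few frames taken from the quoted log, one frame per line.
SAMPLE = """\
>4: (()+0x255e8) [0xb6cd85e8]
>5: (GetStackTrace(void**, int, int)+0x25) [0xb6cd8a3e]
>6: (tcmalloc::PageHeap::GrowHeap(unsigned int)+0xb9) [0xb6ccd36a]
>7: (tcmalloc::PageHeap::New(unsigned int)+0x79) [0xb6ccd5e6]
>12: (malloc()+0x22d) [0xb6cd9a8e]
"""

if __name__ == "__main__":
    for number, symbol in tcmalloc_frames(SAMPLE):
        print(number, symbol)
```

A non-empty result only shows that the faulting thread was inside tcmalloc when the signal arrived; it does not by itself prove that the allocator is the root cause.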

/Torben

On 20 May 2019 17:48:40 CEST, "Jesper Taxbøl" <jes...@taxboel.dk> wrote:
>I am trying to set up a Ceph cluster on 4 odroid-hc2 instances on top of
>Ubuntu 18.04.
>
>My ceph-mgr daemon keeps crashing on me.
>
>Any advice on how to proceed?
>
>Log on mgr node says something about ms_dispatch:
>
>2019-05-20 15:34:43.070424 b6714230  0 set uid:gid to 64045:64045 (ceph:ceph)
>2019-05-20 15:34:43.070455 b6714230  0 ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable), process ceph-mgr, pid 1169
>2019-05-20 15:34:43.070799 b6714230  0 pidfile_write: ignore empty --pid-file
>2019-05-20 15:34:43.101162 b6714230  1 mgr send_beacon standby
>2019-05-20 15:34:43.124462 b06f8c30 -1 *** Caught signal (Segmentation fault) **
>in thread b06f8c30 thread_name:ms_dispatch
>
>ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)
>1: (()+0x30133c) [0x77033c]
>2: (()+0x25750) [0xb688a750]
>3: (_ULarm_step()+0x55) [0xb6816ce6]
>4: (()+0x255e8) [0xb6cd85e8]
>5: (GetStackTrace(void**, int, int)+0x25) [0xb6cd8a3e]
>6: (tcmalloc::PageHeap::GrowHeap(unsigned int)+0xb9) [0xb6ccd36a]
>7: (tcmalloc::PageHeap::New(unsigned int)+0x79) [0xb6ccd5e6]
>8: (tcmalloc::CentralFreeList::Populate()+0x71) [0xb6ccc5ce]
>9: (tcmalloc::CentralFreeList::FetchFromOneSpansSafe(int, void**, void**)+0x1b) [0xb6ccc760]
>10: (tcmalloc::CentralFreeList::RemoveRange(void**, void**, int)+0x6d) [0xb6ccc7de]
>11: (tcmalloc::ThreadCache::FetchFromCentralCache(unsigned int, unsigned int)+0x51) [0xb6ccea56]
>12: (malloc()+0x22d) [0xb6cd9a8e]
>NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>
>--- begin dump of recent events ---
>  -90> 2019-05-20 15:34:43.053293 b6714230  5 asok(0x55b5320)
>register_command perfcounter
>s_dump hook 0x554c088
>  -89> 2019-05-20 15:34:43.053322 b6714230  5 asok(0x55b5320)
>register_command 1 hook 0x55
>4c088
>  -88> 2019-05-20 15:34:43.053330 b6714230  5 asok(0x55b5320)
>register_command perf dump h
>ook 0x554c088
>  -87> 2019-05-20 15:34:43.053341 b6714230  5 asok(0x55b5320)
>register_command perfcounter
>s_schema hook 0x554c088
>  -86> 2019-05-20 15:34:43.053360 b6714230  5 asok(0x55b5320)
>register_command perf histog
>ram dump hook 0x554c088
>  -85> 2019-05-20 15:34:43.053374 b6714230  5 asok(0x55b5320)
>register_command 2 hook 0x55
>4c088
>  -84> 2019-05-20 15:34:43.053381 b6714230  5 asok(0x55b5320)
>register_command perf schema
>hook 0x554c088
>  -83> 2019-05-20 15:34:43.053389 b6714230  5 asok(0x55b5320)
>register_command perf histog
>ram schema hook 0x554c088
>  -82> 2019-05-20 15:34:43.053410 b6714230  5 asok(0x55b5320)
>register_command perf reset
>hook 0x554c088
>  -81> 2019-05-20 15:34:43.053418 b6714230  5 asok(0x55b5320)
>register_command config show
>hook 0x554c088
>  -80> 2019-05-20 15:34:43.053425 b6714230  5 asok(0x55b5320)
>register_command config help
>hook 0x554c088
>  -79> 2019-05-20 15:34:43.053436 b6714230  5 asok(0x55b5320)
>register_command config set
>hook 0x554c088
>  -78> 2019-05-20 15:34:43.053444 b6714230  5 asok(0x55b5320)
>register_command config get
>hook 0x554c088
>  -77> 2019-05-20 15:34:43.053459 b6714230  5 asok(0x55b5320)
>register_command config diff
>hook 0x554c088
>  -76> 2019-05-20 15:34:43.053467 b6714230  5 asok(0x55b5320)
>register_command config diff
>get hook 0x554c088
>  -75> 2019-05-20 15:34:43.053475 b6714230  5 asok(0x55b5320)
>register_command log flush h
>ook 0x554c088
>  -74> 2019-05-20 15:34:43.053482 b6714230  5 asok(0x55b5320)
>register_command log dump ho
>ok 0x554c088
>  -73> 2019-05-20 15:34:43.053490 b6714230  5 asok(0x55b5320)
>register_command log reopen
>hook 0x554c088
>  -72> 2019-05-20 15:34:43.053513 b6714230  5 asok(0x55b5320)
>register_command dump_mempoo
>ls hook 0x56e3504
> -71> 2019-05-20 15:34:43.070424 b6714230  0 set uid:gid to 64045:64045
>(ceph:ceph)
>  -70> 2019-05-20 15:34:43.070455 b6714230  0 ceph version 12.2.11
>(26dc3775efc7bb286a1d6d
>66faee0ba30ea23eee) luminous (stable), process ceph-mgr, pid 1169
>-69> 2019-05-20 15:34:43.070799 b6714230  0 pidfile_write: ignore empty
>--pid-file
>  -68> 2019-05-20 15:34:43.074441 b6714230  5 asok(0x55b5320) init
>/var/run/ceph/ceph-mgr.
>odroid-c.asok
>  -67> 2019-05-20 15:34:43.074473 b6714230  5 asok(0x55b5320)
>bind_and_listen /var/run/cep
>h/ceph-mgr.odroid-c.asok
>  -66> 2019-05-20 15:34:43.074615 b6714230  5 asok(0x55b5320)
>register_command 0 hook 0x55
>4c1d0
>  -65> 2019-05-20 15:34:43.074633 b6714230  5 asok(0x55b5320)
>register_command version hoo
>k 0x554c1d0
>  -64> 2019-05-20 15:34:43.074654 b6714230  5 asok(0x55b5320)
>register_command git_version
>hook 0x554c1d0
>  -63> 2019-05-20 15:34:43.074674 b6714230  5 asok(0x55b5320)
>register_command help hook 0
>x554c1d8
>  -62> 2019-05-20 15:34:43.074694 b6714230  5 asok(0x55b5320)
>register_command get_command
>_descriptions hook 0x554c1e0
>-61> 2019-05-20 15:34:43.074785 b3effc30  5 asok(0x55b5320) entry start
>-60> 2019-05-20 15:34:43.076464 b36fec30  2 Event(0x554e068 nevent=5000
>time_id=1).set_o
>wner idx=0 owner=3010456624
>-59> 2019-05-20 15:34:43.076559 b2efdc30  2 Event(0x554e488 nevent=5000
>time_id=1).set_o
>wner idx=1 owner=3002063920
>-58> 2019-05-20 15:34:43.076643 b26fcc30  2 Event(0x554e1c8 nevent=5000
>time_id=1).set_o
>wner idx=2 owner=2993671216
>  -57> 2019-05-20 15:34:43.077177 b6714230  1  Processor -- start
>  -56> 2019-05-20 15:34:43.077298 b6714230  1 -- - start start
>  -55> 2019-05-20 15:34:43.077315 b6714230 10 monclient:
>build_initial_monmap
>  -54> 2019-05-20 15:34:43.077362 b6714230 10 monclient: init
>-53> 2019-05-20 15:34:43.077380 b6714230  5 adding auth protocol: cephx
>-52> 2019-05-20 15:34:43.077391 b6714230 10 monclient: auth_supported 2
>method cephx
>-51> 2019-05-20 15:34:43.077625 b6714230  2 auth: KeyRing::load: loaded
>key file /var/li
>b/ceph/mgr/ceph-odroid-c/keyring
> -50> 2019-05-20 15:34:43.077761 b6714230 10 monclient: _reopen_session
>rank -1
> -49> 2019-05-20 15:34:43.077847 b6714230 10 monclient(hunting): picked
>mon.noname-a con
>0x5792d00 addr 192.168.130.131:6789/0
>  -48> 2019-05-20 15:34:43.077899 b6714230  1 -- - -->
>192.168.130.131:6789/0 -- auth(prot
>o 0 33 bytes epoch 0) v1 -- 0x5590680 con 0
>  -47> 2019-05-20 15:34:43.077985 b6714230 10 monclient(hunting):
>_renew_subs
>  -46> 2019-05-20 15:34:43.080980 b2efdc30  1 --
>192.168.130.132:0/2049423493 learned_addr
>learned my addr 192.168.130.132:0/2049423493
>  -45> 2019-05-20 15:34:43.082020 b2efdc30  2 --
>192.168.130.132:0/2049423493 >> 192.168.1
>30.131:6789/0 conn(0x5792d00 :-1 s=STATE_CONNECTING_WAIT_ACK_SEQ pgs=0
>cs=0
>l=0)._process_c
>onnection got newly_acked_seq 0 vs out_seq 0
>  -44> 2019-05-20 15:34:43.084528 b2efdc30  5 --
>192.168.130.132:0/2049423493 >> 192.168.1
>30.131:6789/0 conn(0x5792d00 :-1
>s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=45 cs=1
>l=1). rx mon.0 seq 1 0x55aa900 mon_map magic: 0 v1
>  -43> 2019-05-20 15:34:43.084615 b06f8c30  1 --
>192.168.130.132:0/2049423493 <== mon.0 19
>2.168.130.131:6789/0 1 ==== mon_map magic: 0 v1 ==== 196+0+0
>(1694575244 0
>0) 0x55aa900 con
>0x5792d00
>  -42> 2019-05-20 15:34:43.084656 b06f8c30 10 monclient(hunting):
>handle_monmap mon_map ma
>gic: 0 v1
>  -41> 2019-05-20 15:34:43.084685 b06f8c30 10 monclient(hunting):  got
>monmap 1, mon.nonam
>e-a is now rank -1
>  -40> 2019-05-20 15:34:43.084698 b06f8c30 10 monclient(hunting): dump:
>epoch 1
>fsid 75cb9a2d-673b-4a32-897a-05470a08ed58
>last_changed 2019-05-20 15:02:53.998735
>created 2019-05-20 15:02:53.998735
>0: 192.168.130.131:6789/0 mon.odroid-b
>
>  -39> 2019-05-20 15:34:43.084956 b2efdc30  5 --
>192.168.130.132:0/2049423493 >> 192.168.1
>30.131:6789/0 conn(0x5792d00 :-1
>s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=45 cs=1
>l=1). rx mon.0 seq 2 0x55a0540 auth_reply(proto 2 0 (0) Success) v1
>  -38> 2019-05-20 15:34:43.085011 b06f8c30  1 --
>192.168.130.132:0/2049423493 <== mon.0 19
>2.168.130.131:6789/0 2 ==== auth_reply(proto 2 0 (0) Success) v1 ====
>33+0+0 (4086221156 0
>0) 0x55a0540 con 0x5792d00
>  -37> 2019-05-20 15:34:43.085053 b06f8c30 10 monclient(hunting): my
>global_id is 24139
>  -36> 2019-05-20 15:34:43.085175 b06f8c30  1 --
>192.168.130.132:0/2049423493 --> 192.168.
>130.131:6789/0 -- auth(proto 2 32 bytes epoch 0) v1 -- 0x5590d00 con 0
>  -35> 2019-05-20 15:34:43.088488 b2efdc30  5 --
>192.168.130.132:0/2049423493 >> 192.168.1
>30.131:6789/0 conn(0x5792d00 :-1
>s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=45 cs=1
>l=1). rx mon.0 seq 3 0x55a0700 auth_reply(proto 2 0 (0) Success) v1
>  -34> 2019-05-20 15:34:43.088712 b06f8c30  1 --
>192.168.130.132:0/2049423493 <== mon.0 19
>2.168.130.131:6789/0 3 ==== auth_reply(proto 2 0 (0) Success) v1 ====
>222+0+0 (1945430716 0
>0) 0x55a0700 con 0x5792d00
>  -33> 2019-05-20 15:34:43.089295 b06f8c30  1 --
>192.168.130.132:0/2049423493 --> 192.168.
>130.131:6789/0 -- auth(proto 2 181 bytes epoch 0) v1 -- 0x5590680 con 0
>  -32> 2019-05-20 15:34:43.097488 b2efdc30  5 --
>192.168.130.132:0/2049423493 >> 192.168.1
>30.131:6789/0 conn(0x5792d00 :-1
>s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=45 cs=1
>l=1). rx mon.0 seq 4 0x55a08c0 auth_reply(proto 2 0 (0) Success) v1
>  -31> 2019-05-20 15:34:43.097643 b06f8c30  1 --
>192.168.130.132:0/2049423493 <== mon.0 19
>2.168.130.131:6789/0 4 ==== auth_reply(proto 2 0 (0) Success) v1 ====
>783+0+0 (327382700 0
>0) 0x55a08c0 con 0x5792d00
>-30> 2019-05-20 15:34:43.098725 b06f8c30  1 monclient: found
>mon.odroid-b
>-29> 2019-05-20 15:34:43.098850 b06f8c30 10 monclient:
>_send_mon_message
>to mon.odroid-b
>at 192.168.130.131:6789/0
>  -28> 2019-05-20 15:34:43.098898 b06f8c30  1 --
>192.168.130.132:0/2049423493 --> 192.168.
>130.131:6789/0 -- mon_subscribe({mgrmap=0+,monmap=0+}) v2 -- 0x554eb00
>con
>0
>  -27> 2019-05-20 15:34:43.099042 b06f8c30 10 monclient:
>_check_auth_rotating renewing rot
>ating keys (they expired before 2019-05-20 15:34:13.099036)
>-26> 2019-05-20 15:34:43.099183 b06f8c30 10 monclient:
>_send_mon_message
>to mon.odroid-b
>at 192.168.130.131:6789/0
>  -25> 2019-05-20 15:34:43.099271 b06f8c30  1 --
>192.168.130.132:0/2049423493 --> 192.168.
>130.131:6789/0 -- auth(proto 2 2 bytes epoch 0) v1 -- 0x5590d00 con 0
>  -24> 2019-05-20 15:34:43.099404 b6714230  5 monclient: authenticate
>success, global_id 2
>4139
>  -23> 2019-05-20 15:34:43.099543 b6714230 10 log_channel(cluster)
>update_config to_monito
>rs: true to_syslog: false syslog_facility: daemon prio: info
>to_graylog:
>false graylog_host
>: 127.0.0.1 graylog_port: 12201)
>  -22> 2019-05-20 15:34:43.099602 b6714230 10 log_channel(audit)
>update_config to_monitors
>: true to_syslog: false syslog_facility: local0 prio: info to_graylog:
>false graylog_host:
>127.0.0.1 graylog_port: 12201)
>  -21> 2019-05-20 15:34:43.099970 b6714230  5 asok(0x55b5320)
>register_command objecter_re
>quests hook 0x554c238
>  -20> 2019-05-20 15:34:43.100171 b6714230 10 monclient: _renew_subs
>-19> 2019-05-20 15:34:43.100214 b6714230 10 monclient:
>_send_mon_message
>to mon.odroid-b
>at 192.168.130.131:6789/0
>  -18> 2019-05-20 15:34:43.100246 b6714230  1 --
>192.168.130.132:0/2049423493 --> 192.168.
>130.131:6789/0 -- mon_subscribe({osdmap=0}) v2 -- 0x554ec60 con 0
>  -17> 2019-05-20 15:34:43.100737 b6714230  5 asok(0x55b5320)
>register_command mds_request
>s hook 0xbefefe80
>  -16> 2019-05-20 15:34:43.100793 b6714230  5 asok(0x55b5320)
>register_command mds_session
>s hook 0xbefefe80
>  -15> 2019-05-20 15:34:43.100847 b6714230  5 asok(0x55b5320)
>register_command dump_cache
>hook 0xbefefe80
>  -14> 2019-05-20 15:34:43.100811 b2efdc30  5 --
>192.168.130.132:0/2049423493 >> 192.168.1
>30.131:6789/0 conn(0x5792d00 :-1
>s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=45 cs=1
>l=1). rx mon.0 seq 5 0x558dc00 mgrmap(e 99) v1
>  -13> 2019-05-20 15:34:43.100915 b6714230  5 asok(0x55b5320)
>register_command kick_stale_
>sessions hook 0xbefefe80
>  -12> 2019-05-20 15:34:43.100977 b6714230  5 asok(0x55b5320)
>register_command status hook
>0xbefefe80
>  -11> 2019-05-20 15:34:43.100987 b06f8c30  1 --
>192.168.130.132:0/2049423493 <== mon.0 19
>2.168.130.131:6789/0 5 ==== mgrmap(e 99) v1 ==== 232+0+0 (4078310027 0
>0)
>0x558dc00 con 0x5
>792d00
>  -10> 2019-05-20 15:34:43.101004 b2efdc30  5 --
>192.168.130.132:0/2049423493 >> 192.168.1
>30.131:6789/0 conn(0x5792d00 :-1
>s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=45 cs=1
>l=1). rx mon.0 seq 6 0x55aaa80 mon_map magic: 0 v1
>   -9> 2019-05-20 15:34:43.101162 b6714230  1 mgr send_beacon standby
>   -8> 2019-05-20 15:34:43.101575 b2efdc30  5 --
>192.168.130.132:0/2049423493 >> 192.168.1
>30.131:6789/0 conn(0x5792d00 :-1
>s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=45 cs=1
>l=1). rx mon.0 seq 7 0x55a0540 auth_reply(proto 2 0 (0) Success) v1
>   -7> 2019-05-20 15:34:43.101889 b2efdc30  5 --
>192.168.130.132:0/2049423493 >> 192.168.1
>30.131:6789/0 conn(0x5792d00 :-1
>s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=45 cs=1
>l=1). rx mon.0 seq 8 0x5590d00 osd_map(42..42 src has 1..42) v3
>-6> 2019-05-20 15:34:43.102775 b6714230 10 monclient: _send_mon_message
>to mon.odroid-b
>at 192.168.130.131:6789/0
>   -5> 2019-05-20 15:34:43.102838 b6714230  1 --
>192.168.130.132:0/2049423493 --> 192.168.
>130.131:6789/0 -- mgrbeacon
>mgr.odroid-c(75cb9a2d-673b-4a32-897a-05470a08ed58,24139, -, 0)
>v6 -- 0x5562400 con 0
>   -4> 2019-05-20 15:34:43.102991 b6714230  4 mgr init Complete.
>   -3> 2019-05-20 15:34:43.103065 b06f8c30  4 mgr ms_dispatch standby
>mgrmap(e 99) v1
> -2> 2019-05-20 15:34:43.103110 b06f8c30  4 mgr handle_mgr_map received
>map epoch 99
>-1> 2019-05-20 15:34:43.103128 b06f8c30  4 mgr handle_mgr_map active in
>map: 0 active i
>s 24134
>    0> 2019-05-20 15:34:43.124462 b06f8c30 -1 *** Caught signal (Segmentation fault) **
>in thread b06f8c30 thread_name:ms_dispatch
>
>ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)
>1: (()+0x30133c) [0x77033c]
>2: (()+0x25750) [0xb688a750]
>3: (_ULarm_step()+0x55) [0xb6816ce6]
>4: (()+0x255e8) [0xb6cd85e8]
>5: (GetStackTrace(void**, int, int)+0x25) [0xb6cd8a3e]
>6: (tcmalloc::PageHeap::GrowHeap(unsigned int)+0xb9) [0xb6ccd36a]
>7: (tcmalloc::PageHeap::New(unsigned int)+0x79) [0xb6ccd5e6]
>8: (tcmalloc::CentralFreeList::Populate()+0x71) [0xb6ccc5ce]
>9: (tcmalloc::CentralFreeList::FetchFromOneSpansSafe(int, void**, void**)+0x1b) [0xb6ccc760]
>10: (tcmalloc::CentralFreeList::RemoveRange(void**, void**, int)+0x6d) [0xb6ccc7de]
>11: (tcmalloc::ThreadCache::FetchFromCentralCache(unsigned int, unsigned int)+0x51) [0xb6ccea56]
>12: (malloc()+0x22d) [0xb6cd9a8e]
>NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>
>--- logging levels ---
>  0/ 5 none
>  0/ 1 lockdep
>  0/ 1 context
>  1/ 1 crush
>  1/ 5 mds
>  1/ 5 mds_balancer
>  1/ 5 mds_locker
>  1/ 5 mds_log
>  1/ 5 mds_log_expire
>  1/ 5 mds_migrator
>  0/ 1 buffer
>  0/ 1 timer
>  0/ 1 filer
>  0/ 1 striper
>  0/ 1 objecter
>  0/ 5 rados
>  0/ 5 rbd
>  0/ 5 rbd_mirror
>  0/ 5 rbd_replay
>  0/ 5 journaler
>  0/ 5 objectcacher
>  0/ 5 client
>  1/ 5 osd
>  0/ 5 optracker
>  0/ 5 objclass
>  1/ 3 filestore
>  1/ 3 journal
>  0/ 5 ms
>  1/ 5 mon
>  0/10 monc
>  1/ 5 paxos
>  0/ 5 tp
>  1/ 5 auth
>  1/ 5 crypto
>  1/ 1 finisher
>  1/ 1 reserver
>  1/ 5 heartbeatmap
>  1/ 5 perfcounter
>  1/ 5 rgw
>  1/10 civetweb
>  1/ 5 javaclient
>  1/ 5 asok
>  1/ 1 throttle
>  0/ 0 refs
>  1/ 5 xio
>  1/ 5 compressor
>  1/ 5 bluestore
>  1/ 5 bluefs
>  1/ 3 bdev
>  1/ 5 kstore
>  4/ 5 rocksdb
>  4/ 5 leveldb
>  4/ 5 memdb
>  1/ 5 kinetic
>  1/ 5 fuse
>  1/ 5 mgr
>  1/ 5 mgrc
>  1/ 5 dpdk
>  1/ 5 eventtrace
> -2/-2 (syslog threshold)
> -1/-1 (stderr threshold)
> max_recent     10000
> max_new         1000
> log_file /var/log/ceph/ceph-mgr.odroid-c.log
>--- end dump of recent events ---
>
>
>
>Kind regards
>
>Jesper

-- 
This was sent from my mobile phone. Apologies for the brevity.
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
