Looking at this gdb output, most of the tp_osd_tp threads are idle except for three that are all waiting on a PG lock. I suspect those sit there for 60s and eventually trip the heartbeat timeout. The kv_sync_thread looks idle so I don't think that's it. Thread 16 is doing OSD::trim_maps but I don't really know what the locking semantics are there. Maybe Sage or Josh can chime in.
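
For what it's worth, the mutex address shows up in frame #2 of each stuck trace below, so you can ask glibc directly who owns it from the same core. A rough sketch for Thread 64 (relies on glibc's NPTL pthread_mutex_t layout):

(gdb) print ((pthread_mutex_t *) 0x7f5f50baf4e8)->__data.__owner

That should print the LWP of the thread currently holding the PG lock; matching it against the LWP numbers in "info threads" tells you whose backtrace to look at next.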

In any event, here are the tp_osd_tp threads waiting on a pg lock:

Thread 64 (Thread 0x7f5ed184e700 (LWP 3545048)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f5ef68f3d02 in _L_lock_791 () from /lib64/libpthread.so.0
#2  0x00007f5ef68f3c08 in __GI___pthread_mutex_lock (mutex=0x7f5f50baf4e8) at pthread_mutex_lock.c:64
#3  0x00007f5ef9c95e48 in Mutex::Lock (this=this@entry=0x7f5f50baf4d8, no_lockdep=no_lockdep@entry=false)
    at /usr/src/debug/ceph-12.0.3/src/common/Mutex.cc:110
#4  0x00007f5ef97f4d63 in PG::lock (this=0x7f5f50baf000, no_lockdep=no_lockdep@entry=false)
    at /usr/src/debug/ceph-12.0.3/src/osd/PG.cc:363
#5  0x00007f5ef9796751 in OSD::ShardedOpWQ::_process (this=0x7f5f04437198, thread_index=<optimized out>, hb=0x7f5f05b27100)
    at /usr/src/debug/ceph-12.0.3/src/osd/OSD.cc:9706
#6  0x00007f5ef9cbb235 in ShardedThreadPool::shardedthreadpool_worker (this=0x7f5f04436958, thread_index=<optimized out>)
    at /usr/src/debug/ceph-12.0.3/src/common/WorkQueue.cc:354
#7  0x00007f5ef9cbd390 in ShardedThreadPool::WorkThreadSharded::entry (this=<optimized out>)
    at /usr/src/debug/ceph-12.0.3/src/common/WorkQueue.h:685
#8  0x00007f5ef68f1dc5 in start_thread (arg=0x7f5ed184e700) at pthread_create.c:308
#9  0x00007f5ef57e173d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 61 (Thread 0x7f5ed3051700 (LWP 3545041)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f5ef68f3d02 in _L_lock_791 () from /lib64/libpthread.so.0
#2  0x00007f5ef68f3c08 in __GI___pthread_mutex_lock (mutex=0x7f5f07f844e8) at pthread_mutex_lock.c:64
#3  0x00007f5ef9c95e48 in Mutex::Lock (this=this@entry=0x7f5f07f844d8, no_lockdep=no_lockdep@entry=false)
    at /usr/src/debug/ceph-12.0.3/src/common/Mutex.cc:110
#4  0x00007f5ef97f4d63 in PG::lock (this=0x7f5f07f84000, no_lockdep=no_lockdep@entry=false)
    at /usr/src/debug/ceph-12.0.3/src/osd/PG.cc:363
#5  0x00007f5ef9796751 in OSD::ShardedOpWQ::_process (this=0x7f5f04437198, thread_index=<optimized out>, hb=0x7f5f05b26d80)
    at /usr/src/debug/ceph-12.0.3/src/osd/OSD.cc:9706
#6  0x00007f5ef9cbb235 in ShardedThreadPool::shardedthreadpool_worker (this=0x7f5f04436958, thread_index=<optimized out>)
    at /usr/src/debug/ceph-12.0.3/src/common/WorkQueue.cc:354
#7  0x00007f5ef9cbd390 in ShardedThreadPool::WorkThreadSharded::entry (this=<optimized out>)
    at /usr/src/debug/ceph-12.0.3/src/common/WorkQueue.h:685
#8  0x00007f5ef68f1dc5 in start_thread (arg=0x7f5ed3051700) at pthread_create.c:308
#9  0x00007f5ef57e173d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 57 (Thread 0x7f5ed5055700 (LWP 3545035)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f5ef68f3d02 in _L_lock_791 () from /lib64/libpthread.so.0
#2  0x00007f5ef68f3c08 in __GI___pthread_mutex_lock (mutex=0x7f5f4f7674e8) at pthread_mutex_lock.c:64
#3  0x00007f5ef9c95e48 in Mutex::Lock (this=this@entry=0x7f5f4f7674d8, no_lockdep=no_lockdep@entry=false)
    at /usr/src/debug/ceph-12.0.3/src/common/Mutex.cc:110
#4  0x00007f5ef97f4d63 in PG::lock (this=0x7f5f4f767000, no_lockdep=no_lockdep@entry=false)
    at /usr/src/debug/ceph-12.0.3/src/osd/PG.cc:363
#5  0x00007f5ef9796751 in OSD::ShardedOpWQ::_process (this=0x7f5f04437198, thread_index=<optimized out>, hb=0x7f5f05b26900)
    at /usr/src/debug/ceph-12.0.3/src/osd/OSD.cc:9706
#6  0x00007f5ef9cbb235 in ShardedThreadPool::shardedthreadpool_worker (this=0x7f5f04436958, thread_index=<optimized out>)
    at /usr/src/debug/ceph-12.0.3/src/common/WorkQueue.cc:354
#7  0x00007f5ef9cbd390 in ShardedThreadPool::WorkThreadSharded::entry (this=<optimized out>)
    at /usr/src/debug/ceph-12.0.3/src/common/WorkQueue.h:685
#8  0x00007f5ef68f1dc5 in start_thread (arg=0x7f5ed5055700) at pthread_create.c:308
#9  0x00007f5ef57e173d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Mark

On 06/01/2017 01:31 AM, nokia ceph wrote:

Hello Mark,

Please find attached the gdb.txt file containing the 'thread apply all
bt' output.

Thanks
Jayaram

On Wed, May 31, 2017 at 5:43 PM, Mark Nelson <mnel...@redhat.com> wrote:



    On 05/31/2017 05:21 AM, nokia ceph wrote:

        + ceph-devel ..

        $ps -ef | grep 294
        ceph     3539720       1 14 08:04 ?        00:16:35
        /usr/bin/ceph-osd -f
        --cluster ceph --id 294 --setuser ceph --setgroup ceph

        $gcore -o coredump-osd  3539720


        (gdb) bt
        #0  0x00007f5ef68f56d5 in pthread_cond_wait@@GLIBC_2.3.2 () from
        /lib64/libpthread.so.0
        #1  0x00007f5ef9cc45ab in ceph::logging::Log::entry() ()
        #2  0x00007f5ef68f1dc5 in start_thread () from
        /lib64/libpthread.so.0
        #3  0x00007f5ef57e173d in clone () from /lib64/libc.so.6


    Can you do a thread apply all bt instead so we can see what the
    tp_osd_tp threads were doing?  That might be the bigger clue.
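
    Something like this (assuming gdb and the ceph debuginfo packages are
    installed) should capture it non-interactively, either against the live
    process or against the core you already collected:

    $ gdb -p <osd pid> --batch -ex 'thread apply all bt' > gdb.txt
    $ gdb /usr/bin/ceph-osd <corefile> --batch -ex 'thread apply all bt' > gdb.txt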

    Mark



        *2017-05-31 10:11:51.064495 7f5ef383b700  1 heartbeat_map is_healthy
        'tp_osd_tp thread tp_osd_tp' had timed out after 60*
        *2017-05-31 10:11:51.064496 7f5ef383b700  1 heartbeat_map is_healthy
        'tp_osd_tp thread tp_osd_tp' had timed out after 60*
        *2017-05-31 10:11:51.064497 7f5ef383b700  1 heartbeat_map is_healthy
        'tp_osd_tp thread tp_osd_tp' had timed out after 60*
        *2017-05-31 10:11:51.064497 7f5ef383b700  1 heartbeat_map is_healthy
        'tp_osd_tp thread tp_osd_tp' had timed out after 60*
        *2017-05-31 10:11:51.064498 7f5ef383b700  1 heartbeat_map is_healthy
        'tp_osd_tp thread tp_osd_tp' had timed out after 60*
        2017-05-31 10:11:51.957092 7f5eef88a700 -1 osd.294 1004 heartbeat_check:
        no reply from 10.50.62.154:6858 osd.64 since back 2017-05-31 09:58:53.016145
        front 2017-05-31 09:58:53.016145 (cutoff 2017-05-31 10:11:31.957089)
        2017-05-31 10:11:51.957114 7f5eef88a700 -1 osd.294 1004 heartbeat_check:
        no reply from 10.50.62.154:6931 osd.82 since back 2017-05-31 10:04:48.204500
        front 2017-05-31 10:04:48.204500 (cutoff 2017-05-31 10:11:31.957089)
        2017-05-31 10:11:51.957121 7f5eef88a700 -1 osd.294 1004 heartbeat_check:
        no reply from 10.50.62.152:6929 osd.162 since back 2017-05-31 09:57:37.451595
        front 2017-05-31 09:57:37.451595 (cutoff 2017-05-31 10:11:31.957089)



        Thanks
        Jayaram

        On Tue, May 30, 2017 at 7:33 PM, nokia ceph <nokiacephus...@gmail.com> wrote:

            Hello Mark,

            Yes, this issue happens about 60 secs after the test/write
            starts, which corresponds to the config value
            "threadpool_default_timeout = 60".  Do you need a coredump of
            the down OSD to analyse the tp_osd_tp state?  Please specify
            which process dump you would require for analysis.

            Like:

            #gcore <PID of down OSD>

            or using the wallclock profiler; I'm not much aware of how to
            use this tool.
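
            (From what I've read, a crude substitute for the profiler is
            the classic "poor man's profiler" loop, something like:

            # for i in $(seq 10); do gdb -p <PID> --batch -ex 'thread apply all bt'; sleep 1; done > samples.txt

            which grabs several stack samples over time.)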

            Thanks
            Jayaram

            On Tue, May 30, 2017 at 6:57 PM, Mark Nelson <mnel...@redhat.com> wrote:

                On 05/30/2017 05:07 AM, nokia ceph wrote:

                    Hello Mark,

                    I am able to reproduce this problem every time.


                Ok, next question, does it happen 60s after starting the
                200MB/s load, or does it take a while?  Sounds like it's
                pretty random across OSDs?  I'm thinking we want to figure
                out what state the tp_osd_tp threads are in when this is
                happening (maybe via a gdb bt or using the wallclock
                profiler to gather several samples) and then figure out
                via the logs what led to the chain of events that put it
                in that state.

                Mark


                    Env: 5 node, v12.0.3, EC 4+1 bluestore, RHEL 7.3 - 3.10.0-514.el7.x86_64

                    Tested with debug bluestore = 20...

                    From ceph watch
                    ===========

                    2017-05-30 08:57:33.510794 mon.0 [INF] pgmap v15649: 8192 pgs: 8192 active+clean; 774 GB data, 1359 GB used, 1854 TB / 1855 TB avail; 35167 B/s rd, 77279 kB/s wr, 726 op/s
                    2017-05-30 08:57:35.134604 mon.0 [INF] pgmap v15650: 8192 pgs: 8192 active+clean; 774 GB data, 1359 GB used, 1854 TB / 1855 TB avail; 19206 B/s rd, 63835 kB/s wr, 579 op/s
                    2017-05-30 08:57:36.295086 mon.0 [INF] pgmap v15651: 8192 pgs: 8192 active+clean; 774 GB data, 1359 GB used, 1854 TB / 1855 TB avail; 10999 B/s rd, 44093 kB/s wr, 379 op/s
                    2017-05-30 08:57:37.228422 mon.0 [INF] osd.243 10.50.62.154:6895/1842853 failed (4 reporters from different host after 20.062159 >= grace 20.000000)
                    2017-05-30 08:57:37.234308 mon.0 [INF] osd.19 10.50.62.152:6856/1940715 failed (4 reporters from different host after 20.000234 >= grace 20.000000)
                    2017-05-30 08:57:37.368342 mon.0 [INF] pgmap v15652: 8192 pgs: 8192 active+clean; 774 GB data, 1360 GB used, 1854 TB / 1855 TB avail; 12628 B/s rd, 50471 kB/s wr, 444 op/s
                    2017-05-30 08:57:37.506743 mon.0 [INF] osdmap e848: 340 osds: 338 up, 340 in


                    From failed osd.19 log
                    =========

                    ===
                    2017-05-30 08:57:04.155836 7f9d6c723700 10 bluestore(/var/lib/ceph/osd/ceph-19) _omap_setkeys 1.1e85s4_head 4#1:a1780000::::head# = 0
                    2017-05-30 08:57:04.155840 7f9d6c723700 10 bluestore(/var/lib/ceph/osd/ceph-19) _txc_calc_cost 0x7f9da2d1f180 cost 3542664 (2 ios * 1500000 + 542664 bytes)
                    2017-05-30 08:57:04.155841 7f9d6c723700 20 bluestore(/var/lib/ceph/osd/ceph-19) _txc_write_nodes txc 0x7f9da2d1f180 onodes 0x7f9da6651840 shared_blobs
                    2017-05-30 08:57:04.155843 7f9d6c723700 20 bluestore.extentmap(0x7f9da6651930) update 4#1:a179feb3:::%2fc20%2fvx0000033%2fpot%2fchannel1%2fhls%2fvar2660000%2f00000%2fseg0015.ts:head#
                    2017-05-30 08:57:04.155852 7f9d6c723700 20 bluestore.extentmap(0x7f9da6651930) update  inline shard 607 bytes from 2 extents
                    2017-05-30 08:57:04.155858 7f9d6c723700 20 bluestore(/var/lib/ceph/osd/ceph-19)   onode 4#1:a179feb3:::%2fc20%2fvx0000033%2fpot%2fchannel1%2fhls%2fvar2660000%2f00000%2fseg0015.ts:head# is 1348 (735 bytes onode + 2 bytes spanning blobs + 611 bytes inline extents)
                    2017-05-30 08:57:04.155876 7f9d6c723700 20 bluestore(/var/lib/ceph/osd/ceph-19) _txc_finalize_kv txc 0x7f9da2d1f180 allocated 0x[388af10000~90000] released 0x[]
                    2017-05-30 08:57:04.155880 7f9d6c723700 10 freelist allocate 0x388af10000~90000
                    2017-05-30 08:57:04.155882 7f9d6c723700 20 freelist _xor first_key 0x388af00000 last_key 0x388af80000
                    2017-05-30 08:57:07.698556 7f9d6a71f700 20 bluestore(/var/lib/ceph/osd/ceph-19) statfsstore_statfs(0x574506f0000/0x5753b990000, stored 0xa389a531/0xa70e0000, compress 0x0/0x0/0x0)
                    2017-05-30 08:57:08.201089 7f9d6a71f700 20 bluestore(/var/lib/ceph/osd/ceph-19) statfsstore_statfs(0x574506f0000/0x5753b990000, stored 0xa389a531/0xa70e0000, compress 0x0/0x0/0x0)
                    2017-05-30 08:57:09.903290 7f9d6a71f700 20 bluestore(/var/lib/ceph/osd/ceph-19) statfsstore_statfs(0x574506f0000/0x5753b990000, stored 0xa389a531/0xa70e0000, compress 0x0/0x0/0x0)
                    2017-05-30 08:57:11.605590 7f9d6a71f700 20 bluestore(/var/lib/ceph/osd/ceph-19) statfsstore_statfs(0x574506f0000/0x5753b990000, stored 0xa389a531/0xa70e0000, compress 0x0/0x0/0x0)
                    2017-05-30 08:57:15.707908 7f9d6a71f700 20 bluestore(/var/lib/ceph/osd/ceph-19) statfsstore_statfs(0x574506f0000/0x5753b990000, stored 0xa389a531/0xa70e0000, compress 0x0/0x0/0x0)
                    2017-05-30 08:57:17.410365 7f9d6a71f700 20 bluestore(/var/lib/ceph/osd/ceph-19) statfsstore_statfs(0x574506f0000/0x5753b990000, stored 0xa389a531/0xa70e0000, compress 0x0/0x0/0x0)
                    2017-05-30 08:57:18.512773 7f9d6a71f700 20 bluestore(/var/lib/ceph/osd/ceph-19) statfsstore_statfs(0x574506f0000/0x5753b990000, stored 0xa389a531/0xa70e0000, compress 0x0/0x0/0x0)
                    *2017-05-30 08:57:20.005812 7f9d8f712700  1 heartbeat_map is_healthy 'tp_osd_tp thread tp_osd_tp' had timed out after 60*
                    *2017-05-30 08:57:20.006648 7f9d8ff13700  1 heartbeat_map is_healthy 'tp_osd_tp thread tp_osd_tp' had timed out after 60*
                    *2017-05-30 08:57:20.015585 7f9d8f712700  1 heartbeat_map is_healthy 'tp_osd_tp thread tp_osd_tp' had timed out after 60*
                    *2017-05-30 08:57:20.015677 7f9d8f712700  1 heartbeat_map is_healthy 'tp_osd_tp thread tp_osd_tp' had timed out after 60*

                    *..<snip>..*


                    2017-05-30 08:57:37.506457 7f9d875fd700 20 bluestore(/var/lib/ceph/osd/ceph-19) _do_write perform garbage collection, expected benefit = 0 AUs
                    2017-05-30 08:57:37.506459 7f9d875fd700 20 bluestore.extentmap(0x7f9da0ed50b0) dirty_range mark inline shard dirty
                    2017-05-30 08:57:37.506461 7f9d875fd700 10 bluestore(/var/lib/ceph/osd/ceph-19) _write meta #-1:7b3f43c4:::osd_superblock:0# 0x0~1f8 = 0
                    2017-05-30 08:57:37.506463 7f9d875fd700 10 bluestore(/var/lib/ceph/osd/ceph-19) _txc_calc_cost 0x7f9da2d22680 cost 3104677 (2 ios * 1500000 + 104677 bytes)
                    2017-05-30 08:57:37.506479 7f9d875fd700 20 bluestore(/var/lib/ceph/osd/ceph-19) _txc_write_nodes txc 0x7f9da2d22680 onodes 0x7f9da0ed4fc0,0x7f9da5bd8000,0x7f9da72dad80 shared_blobs
                    2017-05-30 08:57:37.506482 7f9d875fd700 20 bluestore.extentmap(0x7f9da0ed50b0) update #-1:7b3f43c4:::osd_superblock:0#
                    2017-05-30 08:57:37.506485 7f9d875fd700 20 bluestore.extentmap(0x7f9da0ed50b0) update  inline shard 81 bytes from 1 extents
                    2017-05-30 08:57:37.506491 7f9d875fd700 20 bluestore(/var/lib/ceph/osd/ceph-19)   onode #-1:7b3f43c4:::osd_superblock:0# is 108 (21 bytes onode + 2 bytes spanning blobs + 85 bytes inline extents)
                    2017-05-30 08:57:37.506495 7f9d875fd700 20 bluestore.extentmap(0x7f9da5bd80f0) update #-1:a51ca935:::osdmap.848:0#
                    2017-05-30 08:57:37.506497 7f9d875fd700 20 bluestore.extentmap(0x7f9da5bd80f0) update  inline shard 235 bytes from 2 extents
                    2017-05-30 08:57:37.506499 7f9d875fd700 20 bluestore(/var/lib/ceph/osd/ceph-19)   onode #-1:a51ca935:::osdmap.848:0# is 264 (23 bytes onode + 2 bytes spanning blobs + 239 bytes inline extents)
                    2017-05-30 08:57:37.506502 7f9d875fd700 20 bluestore.extentmap(0x7f9da72dae70) update #-1:48acce63:::inc_osdmap.848:0#
                    2017-05-30 08:57:37.506504 7f9d875fd700 20 bluestore.extentmap(0x7f9da72dae70) update  inline shard 82 bytes from 1 extents
                    2017-05-30 08:57:37.506506 7f9d875fd700 20 bluestore(/var/lib/ceph/osd/ceph-19)   onode #-1:48acce63:::inc_osdmap.848:0# is 110 (22 bytes onode + 2 bytes spanning blobs + 86 bytes inline extents)
                    2017-05-30 08:57:37.506558 7f9d875fd700 20 bluestore(/var/lib/ceph/osd/ceph-19) _txc_finalize_kv txc 0x7f9da2d22680 allocated 0x[388afa0000~30000] released 0x[]
                    2017-05-30 08:57:37.506562 7f9d875fd700 10 freelist allocate 0x388afa0000~30000
                    2017-05-30 08:57:37.506563 7f9d875fd700 20 freelist _xor first_key 0x388af80000 last_key 0x388af80000
                    2017-05-30 08:57:37.510446 7f9d8ff13700  0 -- 10.50.63.152:6857/1940715 >> 10.50.63.153:6869/1572972 conn(0x7f9da698a800 :6857 s=STATE_OPEN pgs=243 cs=1 l=0).fault with nothing to send, going to standby
                    2017-05-30 08:57:37.514366 7f9d8ff13700  0 -- 10.50.63.152:6857/1940715 >> 10.50.63.153:6863/1572982 conn(0x7f9da620f800 :6857 s=STATE_OPEN pgs=149 cs=1 l=0).fault with nothing to send, going to standby
                    2017-05-30 08:57:37.517351 7f9d8ef11700  1 heartbeat_map is_healthy 'tp_osd_tp thread tp_osd_tp' had timed out after 60
                    2017-05-30 08:57:37.517354 7f9d8ef11700  1 heartbeat_map is_healthy 'tp_osd_tp thread tp_osd_tp' had timed out after 60
                    ====


                    config value --> threadpool_default_timeout = 60

                    // default timeout while calling WaitInterval on an empty queue
                    OPTION(threadpool_default_timeout, OPT_INT, 60)

                    I altered the above value to 300; it then held up for some time.
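
                    For reference, the bump goes in ceph.conf on the OSD
                    nodes (OSDs restarted afterwards), e.g.:

                    [osd]
                    threadpool_default_timeout = 300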

                    2017-05-30 09:44:32.027168 7efc1f2b9700 20 bluestore(/var/lib/ceph/osd/ceph-41) statfsstore_statfs(0x574f2860000/0x5753b990000, stored 0x2e65ed1/0x6730000, compress 0x0/0x0/0x0)
                    2017-05-30 09:44:35.529164 7efc1f2b9700 20 bluestore(/var/lib/ceph/osd/ceph-41) statfsstore_statfs(0x574f2860000/0x5753b990000, stored 0x2e65ed1/0x6730000, compress 0x0/0x0/0x0)
                    2017-05-30 09:44:40.831130 7efc1f2b9700 20 bluestore(/var/lib/ceph/osd/ceph-41) statfsstore_statfs(0x574f2860000/0x5753b990000, stored 0x2e65ed1/0x6730000, compress 0x0/0x0/0x0)
                    2017-05-30 09:44:41.634056 7efc3fafa700 -1 osd.41 878 heartbeat_check: no reply from 10.50.62.152:6853 osd.30 since back 2017-05-30 09:44:21.521517 front 2017-05-30 09:44:21.521517 (cutoff 2017-05-30 09:44:21.634053)
                    *2017-05-30 09:44:42.128280 7efc442ac700  1 heartbeat_map is_healthy 'tp_osd_tp thread tp_osd_tp' had timed out after 300*
                    *2017-05-30 09:44:42.128282 7efc442ac700  1 heartbeat_map is_healthy 'tp_osd_tp thread tp_osd_tp' had timed out after 300*
                    *2017-05-30 09:44:42.128284 7efc442ac700  1 heartbeat_map is_healthy 'tp_osd_tp thread tp_osd_tp' had timed out after 300*

                    Do you require any logs from my end?  Shall I open a
                    tracker for this issue?

                    Thanks
                    Jayaram


                    On Mon, May 29, 2017 at 4:58 AM, Mark Nelson <mnel...@redhat.com> wrote:

                        Hi Jake,

                        200MB/s is pretty low load across 5 servers.  I
                        wouldn't expect the tp_osd_tp threads to be so
                        heavily loaded that they're not responding for 60s.
                        Sounds like a bug.  Can you reproduce it?  It might
                        be worth trying it with debug bluestore = 20.
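
                        e.g. in ceph.conf ([osd] debug bluestore = 20), or
                        on the fly with something like:

                        # ceph tell osd.* injectargs '--debug_bluestore 20'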

                        Mark


                        On 05/27/2017 05:02 AM, Jake Grimmett wrote:

                            Dear All,

                            I wonder if anyone can give advice regarding
                            bluestore OSDs going down on Luminous 12.0.3
                            when the cluster is under moderate (200MB/s)
                            load.

                            OSDs seem to go down randomly across the 5 OSD
                            servers.  When one OSD is down, load decreases,
                            so no further OSDs drop, until I restart the
                            OSD; then another fails.

                            There are no obvious disk errors seen in
                            /var/log/messages.

                            Here though is part of a ceph-osd.46.log...

                            2017-05-27 10:42:28.781821 7f7c503b4700  0 log_channel(cluster) log [WRN] : slow request 120.247733 seconds old, received at 2017-05-27 10:40:28.534048: osd_op(osd.52.1251:8732 1.3ffs0 1.a8ec73ff (undecoded) ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e1342) currently queued_for_pg
                            2017-05-27 10:42:33.748109 7f7c52bf1700  1 heartbeat_map is_healthy 'tp_osd_tp thread tp_osd_tp' had timed out after 60

                            - these two errors repeat, with more of the
                            'tp_osd_tp thread tp_osd_tp' errors.

                            Hopefully this is not due to the highpoint
                            rocket r750 cards in my OSD servers (the OSD
                            servers are all 45drive.com storinators).

                            Other info - each node has 64GB ram, 10 x 8TB
                            Ironwolf drives, 10Gb Intel nic, single
                            E5-2620 v4.

                            Any advice gratefully received!

                            thanks,

                            Jake

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
