Hi,

On a 4-node, 48-OSD Luminous cluster I'm trying out RBD on EC pools with
BlueStore.

Setup went fine, but after a few bench runs several OSDs are failing and many
won't even restart.

ceph osd erasure-code-profile set myprofile \
   k=2 \
   m=1 \
   crush-failure-domain=host
ceph osd pool create mypool 1024 1024 erasure myprofile
ceph osd pool set mypool allow_ec_overwrites true
rbd pool init mypool
ceph -s
ceph health detail
ceph osd pool create metapool 1024 1024 replicated
rbd create --size 1024G --data-pool mypool --image metapool/test1
rbd bench -p metapool test1 --io-type write --io-size 8192 \
   --io-pattern rand --io-total 10G
...


Log from one of the many failing OSDs:

Sep 05 17:02:54 r72-k7-06-01.k8s.ash1.cloudsys.tmcs systemd[1]: Started
Ceph object storage daemon osd.12.
Sep 05 17:02:54 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[4775]:
starting osd.12 at - osd_data /var/lib/ceph/osd/ceph-12
/var/lib/ceph/osd/ceph-12/journal
Sep 05 17:02:56 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[4775]:
2017-09-05 17:02:56.627301 7fe1a2e42d00 -1 osd.12 2219 log_to_monitors
{default=true}
Sep 05 17:02:58 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[4775]:
2017-09-05 17:02:58.686723 7fe1871ac700 -1 bluestore(/var/lib/ceph/osd/ceph-12)
_txc_add_transaction error (2) No such file or directory not handled on
operation 15 (op 0, counting from 0)
Sep 05 17:02:58 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[4775]:
2017-09-05 17:02:58.686742 7fe1871ac700 -1 bluestore(/var/lib/ceph/osd/ceph-12)
unexpected error code
Sep 05 17:02:58 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[4775]:
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.0/rpm/el7/BUILD/ceph-12.2.0/src/os/bluestore/BlueStore.cc: In function 'void
BlueStore::_txc_add_transaction(BlueStore::TransContext*,
ObjectStore::Transaction*)' thread 7fe1871ac700 time 2017-09-05 17:02:58.686821
Sep 05 17:02:58 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[4775]:
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.0/rpm/el7/BUILD/ceph-12.2.0/src/os/bluestore/BlueStore.cc: 9282: FAILED assert(0 == "unexpected error")
Sep 05 17:02:58 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[4775]: ceph
version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc)
Sep 05 17:02:58 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[4775]: 1:
(ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x110) [0x7fe1a38bf510]
Sep 05 17:02:58 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[4775]: 2:
(BlueStore::_txc_add_transaction(BlueStore::TransContext*,
ObjectStore::Transaction*)+0x1487) [0x7fe1a3796057]
Sep 05 17:02:58 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[4775]: 3:
(BlueStore::queue_transactions(ObjectStore::Sequencer*,
std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction>
>&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x3a0)
[0x7fe1a37970a0]
Sep 05 17:02:58 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[4775]: 4:
(PrimaryLogPG::queue_transactions(std::vector<ObjectStore::Transaction,
std::allocator<ObjectStore::Transaction> >&,
boost::intrusive_ptr<OpRequest>)+0x65)
[0x7fe1a3508745]
Sep 05 17:02:58 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[4775]: 5:
(ECBackend::handle_sub_write(pg_shard_t, boost::intrusive_ptr<OpRequest>,
ECSubWrite&, ZTracer::Trace const&, Context*)+0x631) [0x7fe1a3628711]
Sep 05 17:02:58 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[4775]: 6:
(ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x327)
[0x7fe1a36392b7]
Sep 05 17:02:58 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[4775]: 7:
(PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50)
[0x7fe1a353da10]
Sep 05 17:02:58 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[4775]: 8:
(PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&,
ThreadPool::TPHandle&)+0x58e) [0x7fe1a34a9a7e]
Sep 05 17:02:58 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[4775]: 9:
(OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>,
ThreadPool::TPHandle&)+0x3f9) [0x7fe1a333c729]
Sep 05 17:02:58 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[4775]: 10:
(PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest>
const&)+0x57) [0x7fe1a35ac197]
Sep 05 17:02:58 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[4775]: 11:
(OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xfce)
[0x7fe1a3367c8e]
Sep 05 17:02:58 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[4775]: 12:
(ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x839)
[0x7fe1a38c5029]
Sep 05 17:02:58 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[4775]: 13:
(ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x7fe1a38c6fc0]
Sep 05 17:02:58 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[4775]: 14:
(()+0x7dc5) [0x7fe1a0484dc5]
Sep 05 17:02:58 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[4775]: 15:
(clone()+0x6d) [0x7fe19f57876d]
Sep 05 17:02:58 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[4775]: NOTE: a
copy of the executable, or `objdump -rdS <executable>` is needed to
interpret this.
[root@r72-k7-06-01 ~]# journalctl -u  ceph-osd@12 --no-pager -n 100
-- Logs begin at Wed 2017-08-30 10:26:26 UTC, end at Tue 2017-09-05
22:06:19 UTC. --
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 15:
(clone()+0x6d) [0x7f0160c9d76d]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: NOTE: a
copy of the executable, or `objdump -rdS <executable>` is needed to
interpret this.
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: -5496>
2017-09-05 17:08:03.460844 7f0164567d00 -1 osd.12 2362 log_to_monitors
{default=true}
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: -74>
2017-09-05 17:08:05.837648 7f01488d1700 -1 bluestore(/var/lib/ceph/osd/ceph-12)
_txc_add_transaction error (2) No such file or directory not handled on
operation 15 (op 0, counting from 0)
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: -73>
2017-09-05 17:08:05.837670 7f01488d1700 -1 bluestore(/var/lib/ceph/osd/ceph-12)
unexpected error code
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 0>
2017-09-05 17:08:05.843218 7f01488d1700 -1
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.0/rpm/el7/BUILD/ceph-12.2.0/src/os/bluestore/BlueStore.cc: In
function 'void BlueStore::_txc_add_transaction(BlueStore::TransContext*,
ObjectStore::Transaction*)' thread 7f01488d1700 time 2017-09-05
17:08:05.837770
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]:
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.0/rpm/el7/BUILD/ceph-12.2.0/src/os/bluestore/BlueStore.cc: 9282: FAILED assert(0 == "unexpected error")
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: ceph
version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc)
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 1:
(ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x110) [0x7f0164fe4510]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 2:
(BlueStore::_txc_add_transaction(BlueStore::TransContext*,
ObjectStore::Transaction*)+0x1487) [0x7f0164ebb057]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 3:
(BlueStore::queue_transactions(ObjectStore::Sequencer*,
std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction>
>&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x3a0)
[0x7f0164ebc0a0]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 4:
(PrimaryLogPG::queue_transactions(std::vector<ObjectStore::Transaction,
std::allocator<ObjectStore::Transaction> >&,
boost::intrusive_ptr<OpRequest>)+0x65)
[0x7f0164c2d745]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 5:
(ECBackend::handle_sub_write(pg_shard_t, boost::intrusive_ptr<OpRequest>,
ECSubWrite&, ZTracer::Trace const&, Context*)+0x631) [0x7f0164d4d711]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 6:
(ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x327)
[0x7f0164d5e2b7]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 7:
(PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50)
[0x7f0164c62a10]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 8:
(PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&,
ThreadPool::TPHandle&)+0x58e) [0x7f0164bcea7e]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 9:
(OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>,
ThreadPool::TPHandle&)+0x3f9) [0x7f0164a61729]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 10:
(PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest>
const&)+0x57) [0x7f0164cd1197]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 11:
(OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xfce)
[0x7f0164a8cc8e]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 12:
(ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x839)
[0x7f0164fea029]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 13:
(ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x7f0164febfc0]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 14:
(()+0x7dc5) [0x7f0161ba9dc5]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 15:
(clone()+0x6d) [0x7f0160c9d76d]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: NOTE: a
copy of the executable, or `objdump -rdS <executable>` is needed to
interpret this.
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: ***
Caught signal (Aborted) **
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: in
thread 7f01488d1700 thread_name:tp_osd_tp
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: ceph
version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc)
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 1:
(()+0xa23b21) [0x7f0164fa5b21]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 2:
(()+0xf370) [0x7f0161bb1370]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 3:
(gsignal()+0x37) [0x7f0160bdb1d7]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 4:
(abort()+0x148) [0x7f0160bdc8c8]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 5:
(ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x284) [0x7f0164fe4684]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 6:
(BlueStore::_txc_add_transaction(BlueStore::TransContext*,
ObjectStore::Transaction*)+0x1487) [0x7f0164ebb057]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 7:
(BlueStore::queue_transactions(ObjectStore::Sequencer*,
std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction>
>&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x3a0)
[0x7f0164ebc0a0]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 8:
(PrimaryLogPG::queue_transactions(std::vector<ObjectStore::Transaction,
std::allocator<ObjectStore::Transaction> >&,
boost::intrusive_ptr<OpRequest>)+0x65)
[0x7f0164c2d745]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 9:
(ECBackend::handle_sub_write(pg_shard_t, boost::intrusive_ptr<OpRequest>,
ECSubWrite&, ZTracer::Trace const&, Context*)+0x631) [0x7f0164d4d711]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 10:
(ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x327)
[0x7f0164d5e2b7]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 11:
(PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50)
[0x7f0164c62a10]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 12:
(PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&,
ThreadPool::TPHandle&)+0x58e) [0x7f0164bcea7e]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 13:
(OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>,
ThreadPool::TPHandle&)+0x3f9) [0x7f0164a61729]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 14:
(PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest>
const&)+0x57) [0x7f0164cd1197]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 15:
(OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xfce)
[0x7f0164a8cc8e]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 16:
(ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x839)
[0x7f0164fea029]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 17:
(ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x7f0164febfc0]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 18:
(()+0x7dc5) [0x7f0161ba9dc5]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 19:
(clone()+0x6d) [0x7f0160c9d76d]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]:
2017-09-05 17:08:05.883240 7f01488d1700 -1 *** Caught signal (Aborted) **
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: in
thread 7f01488d1700 thread_name:tp_osd_tp
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: ceph
version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc)
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 1:
(()+0xa23b21) [0x7f0164fa5b21]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 2:
(()+0xf370) [0x7f0161bb1370]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 3:
(gsignal()+0x37) [0x7f0160bdb1d7]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 4:
(abort()+0x148) [0x7f0160bdc8c8]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 5:
(ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x284) [0x7f0164fe4684]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 6:
(BlueStore::_txc_add_transaction(BlueStore::TransContext*,
ObjectStore::Transaction*)+0x1487) [0x7f0164ebb057]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 7:
(BlueStore::queue_transactions(ObjectStore::Sequencer*,
std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction>
>&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x3a0)
[0x7f0164ebc0a0]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 8:
(PrimaryLogPG::queue_transactions(std::vector<ObjectStore::Transaction,
std::allocator<ObjectStore::Transaction> >&,
boost::intrusive_ptr<OpRequest>)+0x65)
[0x7f0164c2d745]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 9:
(ECBackend::handle_sub_write(pg_shard_t, boost::intrusive_ptr<OpRequest>,
ECSubWrite&, ZTracer::Trace const&, Context*)+0x631) [0x7f0164d4d711]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 10:
(ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x327)
[0x7f0164d5e2b7]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 11:
(PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50)
[0x7f0164c62a10]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 12:
(PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&,
ThreadPool::TPHandle&)+0x58e) [0x7f0164bcea7e]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 13:
(OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>,
ThreadPool::TPHandle&)+0x3f9) [0x7f0164a61729]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 14:
(PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest>
const&)+0x57) [0x7f0164cd1197]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 15:
(OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xfce)
[0x7f0164a8cc8e]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 16:
(ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x839)
[0x7f0164fea029]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 17:
(ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x7f0164febfc0]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 18:
(()+0x7dc5) [0x7f0161ba9dc5]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 19:
(clone()+0x6d) [0x7f0160c9d76d]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: NOTE: a
copy of the executable, or `objdump -rdS <executable>` is needed to
interpret this.
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 0>
2017-09-05 17:08:05.883240 7f01488d1700 -1 *** Caught signal (Aborted) **
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: in
thread 7f01488d1700 thread_name:tp_osd_tp
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: ceph
version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc)
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 1:
(()+0xa23b21) [0x7f0164fa5b21]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 2:
(()+0xf370) [0x7f0161bb1370]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 3:
(gsignal()+0x37) [0x7f0160bdb1d7]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 4:
(abort()+0x148) [0x7f0160bdc8c8]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 5:
(ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x284) [0x7f0164fe4684]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 6:
(BlueStore::_txc_add_transaction(BlueStore::TransContext*,
ObjectStore::Transaction*)+0x1487) [0x7f0164ebb057]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 7:
(BlueStore::queue_transactions(ObjectStore::Sequencer*,
std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction>
>&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x3a0)
[0x7f0164ebc0a0]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 8:
(PrimaryLogPG::queue_transactions(std::vector<ObjectStore::Transaction,
std::allocator<ObjectStore::Transaction> >&,
boost::intrusive_ptr<OpRequest>)+0x65)
[0x7f0164c2d745]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 9:
(ECBackend::handle_sub_write(pg_shard_t, boost::intrusive_ptr<OpRequest>,
ECSubWrite&, ZTracer::Trace const&, Context*)+0x631) [0x7f0164d4d711]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 10:
(ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x327)
[0x7f0164d5e2b7]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 11:
(PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50)
[0x7f0164c62a10]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 12:
(PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&,
ThreadPool::TPHandle&)+0x58e) [0x7f0164bcea7e]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 13:
(OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>,
ThreadPool::TPHandle&)+0x3f9) [0x7f0164a61729]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 14:
(PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest>
const&)+0x57) [0x7f0164cd1197]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 15:
(OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xfce)
[0x7f0164a8cc8e]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 16:
(ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x839)
[0x7f0164fea029]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 17:
(ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x7f0164febfc0]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 18:
(()+0x7dc5) [0x7f0161ba9dc5]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: 19:
(clone()+0x6d) [0x7f0160c9d76d]
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs ceph-osd[6441]: NOTE: a
copy of the executable, or `objdump -rdS <executable>` is needed to
interpret this.
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs systemd[1]:
ceph-osd@12.service: main process exited, code=killed, status=6/ABRT
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs systemd[1]: Unit
ceph-osd@12.service entered failed state.
Sep 05 17:08:05 r72-k7-06-01.k8s.ash1.cloudsys.tmcs systemd[1]:
ceph-osd@12.service failed.
Sep 05 17:08:26 r72-k7-06-01.k8s.ash1.cloudsys.tmcs systemd[1]:
ceph-osd@12.service holdoff time over, scheduling restart.
Sep 05 17:08:26 r72-k7-06-01.k8s.ash1.cloudsys.tmcs systemd[1]: start
request repeated too quickly for ceph-osd@12.service
Sep 05 17:08:26 r72-k7-06-01.k8s.ash1.cloudsys.tmcs systemd[1]: Failed to
start Ceph object storage daemon osd.12.
Sep 05 17:08:26 r72-k7-06-01.k8s.ash1.cloudsys.tmcs systemd[1]: Unit
ceph-osd@12.service entered failed state.
Sep 05 17:08:26 r72-k7-06-01.k8s.ash1.cloudsys.tmcs systemd[1]:
ceph-osd@12.service failed.
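
Since the journal repeats the same backtrace several times, a small filter
can reduce it to the distinct failures. This is only a sketch:
summarize_osd_errors is a hypothetical helper name (not part of Ceph), the
patterns are taken from the two messages seen in the log above, and it
assumes a grep that supports -o (GNU and BSD grep do).

```shell
#!/bin/sh
# Hypothetical helper (not a Ceph tool): read OSD journal text on stdin and
# print each distinct BlueStore error or failed assert with its count,
# most frequent first.
summarize_osd_errors() {
  # Pattern 1: "error (N) <text> not handled on operation M"
  # Pattern 2: "FAILED assert(<condition>)"
  grep -E -o 'error \([0-9]+\) .* not handled on operation [0-9]+|FAILED assert\(.*\)' \
    | sort | uniq -c | sort -rn
}
```

It could be fed from the same command used above, for example:
journalctl -u ceph-osd@12 --no-pager | summarize_osd_errors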



Drives are OK, the filesystems are mounted, and there is nothing suspicious in
the kernel logs.

What should I look at next to troubleshoot this?


Thanks!