Can you restart one of the affected osds with debug osd = 20, debug
filestore = 20, debug ms = 1 and post the log?
-Sam
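
For reference, a minimal sketch of how those levels might be set, assuming
they go in ceph.conf on the OSD's host and that osd.52 (the daemon from the
log quoted below) is the one being restarted. The per-daemon section name is
an assumption; a plain [osd] section would apply the same levels to every OSD
on that host instead:

```ini
[osd.52]
    ; verbose OSD, FileStore and messenger logging, as requested above
    debug osd = 20
    debug filestore = 20
    debug ms = 1
```

Restart the daemon after the change so the new levels take effect, and remove
the section again once the log has been captured, since level 20 output grows
quickly.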

On Mon, Nov 19, 2012 at 3:39 PM, Stefan Priebe <s.pri...@profihost.ag> wrote:
> On 20.11.2012 00:39, Samuel Just wrote:
>
>> Seems to be a truncated log file...  That usually indicates filesystem
>> corruption.  Anything in dmesg?
>> -Sam
>
> No. Everything looks fine.
>
>
>
>> On Thu, Nov 15, 2012 at 1:07 PM, Stefan Priebe <s.pri...@profihost.ag>
>> wrote:
>>>
>>> Hello list,
>>>
>>> current master incl. upstream/wip-fd-simple-cache results in this crash
>>> when I try to start some of my OSDs (others work fine) today on multiple
>>> nodes:
>>>
>>>      -2> 2012-11-15 22:04:09.226945 7f3af1c7a780  0 osd.52 pg_epoch: 657 pg[3.3b( v 632'823 (632'823,632'823] n=5 ec=17 les/c 18/18 656/656/17) [] r=0 lpr=0 pi=17-655/2 (info mismatch, log(632'823,0'0]) (log bound mismatch, empty) lcod 0'0 mlcod 0'0 inactive] Got exception 'read_log_error: read_log got 0 bytes, expected 126086-0=126086' while reading log. Moving corrupted log file to 'corrupt_log_2012-11-15_22:04_3.3b' for later analysis.
>>>      -1> 2012-11-15 22:04:09.233563 7f3af1c7a780  0 osd.52 pg_epoch: 657 pg[3.557( v 632'753 (0'0,632'753] n=2 ec=17 les/c 18/18 656/656/17) [] r=0 lpr=0 pi=17-655/2 (info mismatch, log(0'0,0'0]) lcod 0'0 mlcod 0'0 inactive] Got exception 'read_log_error: read_log got 0 bytes, expected 115488-0=115488' while reading log. Moving corrupted log file to 'corrupt_log_2012-11-15_22:04_3.557' for later analysis.
>>>       0> 2012-11-15 22:04:09.234536 7f3ae87d0700 -1 os/FileStore.cc: In function 'int FileStore::_collection_add(coll_t, coll_t, const hobject_t&, const SequencerPosition&)' thread 7f3ae87d0700 time 2012-11-15 22:04:09.233672
>>> os/FileStore.cc: 4500: FAILED assert(replaying)
>>>
>>>   ceph version 0.54-607-gf89e101 (f89e1012bafabd6875a4a1e1832d76ffdf45b039)
>>>   1: (FileStore::_collection_add(coll_t, coll_t, hobject_t const&, SequencerPosition const&)+0x77d) [0x72ff0d]
>>>   2: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int)+0x25fb) [0x73481b]
>>>   3: (FileStore::do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long)+0x4c) [0x73952c]
>>>   4: (FileStore::_do_op(FileStore::OpSequencer*)+0x195) [0x705c45]
>>>   5: (ThreadPool::worker(ThreadPool::WorkThread*)+0x82b) [0x830f1b]
>>>   6: (ThreadPool::WorkThread::entry()+0x10) [0x833700]
>>>   7: (()+0x68ca) [0x7f3af16578ca]
>>>   8: (clone()+0x6d) [0x7f3aefac6bfd]
>>>   NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>>
>>> --- logging levels ---
>>>     0/ 5 none
>>>     0/ 0 lockdep
>>>     0/ 0 context
>>>     0/ 0 crush
>>>     1/ 5 mds
>>>     1/ 5 mds_balancer
>>>     1/ 5 mds_locker
>>>     1/ 5 mds_log
>>>     1/ 5 mds_log_expire
>>>     1/ 5 mds_migrator
>>>     0/ 0 buffer
>>>     0/ 0 timer
>>>     0/ 1 filer
>>>     0/ 1 striper
>>>     0/ 1 objecter
>>>     0/ 5 rados
>>>     0/ 5 rbd
>>>     0/ 0 journaler
>>>     0/ 5 objectcacher
>>>     0/ 5 client
>>>     0/ 0 osd
>>>     0/ 0 optracker
>>>     0/ 0 objclass
>>>     0/ 0 filestore
>>>     0/ 0 journal
>>>     0/ 0 ms
>>>     1/ 5 mon
>>>     0/ 0 monc
>>>     0/ 5 paxos
>>>     0/ 0 tp
>>>     0/ 0 auth
>>>     1/ 5 crypto
>>>     0/ 0 finisher
>>>     0/ 0 heartbeatmap
>>>     0/ 0 perfcounter
>>>     1/ 5 rgw
>>>     1/ 5 hadoop
>>>     1/ 5 javaclient
>>>     0/ 0 asok
>>>     0/ 0 throttle
>>>    -2/-2 (syslog threshold)
>>>    -1/-1 (stderr threshold)
>>>    max_recent     10000
>>>    max_new      1000000
>>>    log_file /var/log/ceph/ceph-osd.52.log
>>> --- end dump of recent events ---
>>> 2012-11-15 22:04:09.235734 7f3ae87d0700 -1 *** Caught signal (Aborted) **
>>>   in thread 7f3ae87d0700
>>>
>>>   ceph version 0.54-607-gf89e101 (f89e1012bafabd6875a4a1e1832d76ffdf45b039)
>>>   1: /usr/bin/ceph-osd() [0x799769]
>>>   2: (()+0xeff0) [0x7f3af165fff0]
>>>   3: (gsignal()+0x35) [0x7f3aefa29215]
>>>   4: (abort()+0x180) [0x7f3aefa2c020]
>>>   5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f3af02bddc5]
>>>   6: (()+0xcb166) [0x7f3af02bc166]
>>>   7: (()+0xcb193) [0x7f3af02bc193]
>>>   8: (()+0xcb28e) [0x7f3af02bc28e]
>>>   9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x7c9) [0x7fd069]
>>>   10: (FileStore::_collection_add(coll_t, coll_t, hobject_t const&, SequencerPosition const&)+0x77d) [0x72ff0d]
>>>   11: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int)+0x25fb) [0x73481b]
>>>   12: (FileStore::do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long)+0x4c) [0x73952c]
>>>   13: (FileStore::_do_op(FileStore::OpSequencer*)+0x195) [0x705c45]
>>>   14: (ThreadPool::worker(ThreadPool::WorkThread*)+0x82b) [0x830f1b]
>>>   15: (ThreadPool::WorkThread::entry()+0x10) [0x833700]
>>>   16: (()+0x68ca) [0x7f3af16578ca]
>>>   17: (clone()+0x6d) [0x7f3aefac6bfd]
>>>   NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>>
>>> --- begin dump of recent events ---
>>>       0> 2012-11-15 22:04:09.235734 7f3ae87d0700 -1 *** Caught signal (Aborted) **
>>>   in thread 7f3ae87d0700
>>>
>>>   ceph version 0.54-607-gf89e101 (f89e1012bafabd6875a4a1e1832d76ffdf45b039)
>>>   1: /usr/bin/ceph-osd() [0x799769]
>>>   2: (()+0xeff0) [0x7f3af165fff0]
>>>   3: (gsignal()+0x35) [0x7f3aefa29215]
>>>   4: (abort()+0x180) [0x7f3aefa2c020]
>>>   5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f3af02bddc5]
>>>   6: (()+0xcb166) [0x7f3af02bc166]
>>>   7: (()+0xcb193) [0x7f3af02bc193]
>>>   8: (()+0xcb28e) [0x7f3af02bc28e]
>>>   9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x7c9) [0x7fd069]
>>>   10: (FileStore::_collection_add(coll_t, coll_t, hobject_t const&, SequencerPosition const&)+0x77d) [0x72ff0d]
>>>   11: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int)+0x25fb) [0x73481b]
>>>   12: (FileStore::do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long)+0x4c) [0x73952c]
>>>   13: (FileStore::_do_op(FileStore::OpSequencer*)+0x195) [0x705c45]
>>>   14: (ThreadPool::worker(ThreadPool::WorkThread*)+0x82b) [0x830f1b]
>>>   15: (ThreadPool::WorkThread::entry()+0x10) [0x833700]
>>>   16: (()+0x68ca) [0x7f3af16578ca]
>>>   17: (clone()+0x6d) [0x7f3aefac6bfd]
>>>   NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>>
>>> --- logging levels ---
>>>     0/ 5 none
>>>     0/ 0 lockdep
>>>     0/ 0 context
>>>     0/ 0 crush
>>>     1/ 5 mds
>>>     1/ 5 mds_balancer
>>>     1/ 5 mds_locker
>>>     1/ 5 mds_log
>>>     1/ 5 mds_log_expire
>>>     1/ 5 mds_migrator
>>>     0/ 0 buffer
>>>     0/ 0 timer
>>>     0/ 1 filer
>>>     0/ 1 striper
>>>     0/ 1 objecter
>>>     0/ 5 rados
>>>     0/ 5 rbd
>>>     0/ 0 journaler
>>>     0/ 5 objectcacher
>>>     0/ 5 client
>>>     0/ 0 osd
>>>     0/ 0 optracker
>>>     0/ 0 objclass
>>>     0/ 0 filestore
>>>     0/ 0 journal
>>>     0/ 0 ms
>>>     1/ 5 mon
>>>     0/ 0 monc
>>>     0/ 5 paxos
>>>     0/ 0 tp
>>>     0/ 0 auth
>>>     1/ 5 crypto
>>>     0/ 0 finisher
>>>     0/ 0 heartbeatmap
>>>     0/ 0 perfcounter
>>>     1/ 5 rgw
>>>     1/ 5 hadoop
>>>     1/ 5 javaclient
>>>     0/ 0 asok
>>>     0/ 0 throttle
>>>    -2/-2 (syslog threshold)
>>>    -1/-1 (stderr threshold)
>>>    max_recent     10000
>>>    max_new      1000000
>>>    log_file /var/log/ceph/ceph-osd.52.log
>>> --- end dump of recent events ---
>>>
>>> Stefan
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majord...@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>