Hello Greg,

I added the debug options you mentioned and started the process again:

[root@th1-mon001 ~]# /usr/bin/ceph-mds -i th1-mon001 --pid-file 
/var/run/ceph/mds.th1-mon001.pid -c /etc/ceph/ceph.conf --cluster ceph 
--reset-journal 0
old journal was 9483323613~134233517
new journal start will be 9621733376 (4176246 bytes past old end)
writing journal head
writing EResetJournal entry
done
[root@th1-mon001 ~]# /usr/bin/ceph-mds -i th1-mon001 -c /etc/ceph/ceph.conf 
--cluster ceph --undump-journal 0 journaldumptgho-mon001 
undump journaldumptgho-mon001
start 9483323613 len 134213311
writing header 200.00000000
 writing 9483323613~1048576
 writing 9484372189~1048576
 writing 9485420765~1048576
 writing 9486469341~1048576
 writing 9487517917~1048576
 writing 9488566493~1048576
 writing 9489615069~1048576
 writing 9490663645~1048576
 writing 9491712221~1048576
 writing 9492760797~1048576
 writing 9493809373~1048576
 writing 9494857949~1048576
 writing 9495906525~1048576
 writing 9496955101~1048576
 writing 9498003677~1048576
 writing 9499052253~1048576
 writing 9500100829~1048576
 writing 9501149405~1048576
 writing 9502197981~1048576
 writing 9503246557~1048576
 writing 9504295133~1048576
 writing 9505343709~1048576
 writing 9506392285~1048576
 writing 9507440861~1048576
 writing 9508489437~1048576
 writing 9509538013~1048576
 writing 9510586589~1048576
 writing 9511635165~1048576
 writing 9512683741~1048576
 writing 9513732317~1048576
 writing 9514780893~1048576
 writing 9515829469~1048576
 writing 9516878045~1048576
 writing 9517926621~1048576
 writing 9518975197~1048576
 writing 9520023773~1048576
 writing 9521072349~1048576
 writing 9522120925~1048576
 writing 9523169501~1048576
 writing 9524218077~1048576
 writing 9525266653~1048576
 writing 9526315229~1048576
 writing 9527363805~1048576
 writing 9528412381~1048576
 writing 9529460957~1048576
 writing 9530509533~1048576
 writing 9531558109~1048576
 writing 9532606685~1048576
 writing 9533655261~1048576
 writing 9534703837~1048576
 writing 9535752413~1048576
 writing 9536800989~1048576
 writing 9537849565~1048576
 writing 9538898141~1048576
 writing 9539946717~1048576
 writing 9540995293~1048576
 writing 9542043869~1048576
 writing 9543092445~1048576
 writing 9544141021~1048576
 writing 9545189597~1048576
 writing 9546238173~1048576
 writing 9547286749~1048576
 writing 9548335325~1048576
 writing 9549383901~1048576
 writing 9550432477~1048576
 writing 9551481053~1048576
 writing 9552529629~1048576
 writing 9553578205~1048576
 writing 9554626781~1048576
 writing 9555675357~1048576
 writing 9556723933~1048576
 writing 9557772509~1048576
 writing 9558821085~1048576
 writing 9559869661~1048576
 writing 9560918237~1048576
 writing 9561966813~1048576
 writing 9563015389~1048576
 writing 9564063965~1048576
 writing 9565112541~1048576
 writing 9566161117~1048576
 writing 9567209693~1048576
 writing 9568258269~1048576
 writing 9569306845~1048576
 writing 9570355421~1048576
 writing 9571403997~1048576
 writing 9572452573~1048576
 writing 9573501149~1048576
 writing 9574549725~1048576
 writing 9575598301~1048576
 writing 9576646877~1048576
 writing 9577695453~1048576
 writing 9578744029~1048576
 writing 9579792605~1048576
 writing 9580841181~1048576
 writing 9581889757~1048576
 writing 9582938333~1048576
 writing 9583986909~1048576
 writing 9585035485~1048576
 writing 9586084061~1048576
 writing 9587132637~1048576
 writing 9588181213~1048576
 writing 9589229789~1048576
 writing 9590278365~1048576
 writing 9591326941~1048576
 writing 9592375517~1048576
 writing 9593424093~1048576
 writing 9594472669~1048576
 writing 9595521245~1048576
 writing 9596569821~1048576
 writing 9597618397~1048576
 writing 9598666973~1048576
 writing 9599715549~1048576
 writing 9600764125~1048576
 writing 9601812701~1048576
 writing 9602861277~1048576
 writing 9603909853~1048576
 writing 9604958429~1048576
 writing 9606007005~1048576
 writing 9607055581~1048576
 writing 9608104157~1048576
 writing 9609152733~1048576
 writing 9610201309~1048576
 writing 9611249885~1048576
 writing 9612298461~1048576
 writing 9613347037~1048576
 writing 9614395613~1048576
 writing 9615444189~1048576
 writing 9616492765~1044159
done.
[root@th1-mon001 ~]# service ceph start mds
=== mds.th1-mon001 === 
Starting Ceph mds.th1-mon001 on th1-mon001...
starting mds.th1-mon001 at :/0


The new logs:
http://pastebin.com/wqqjuEpy


Kind regards,

Jasper

________________________________________
From: gregory.far...@inktank.com [gregory.far...@inktank.com] on behalf of Gregory 
Farnum [gfar...@redhat.com]
Sent: Tuesday, October 28, 2014 19:26
To: Jasper Siero
CC: John Spray; ceph-users
Subject: Re: [ceph-users] mds isn't working anymore after osd's running full

You'll need to gather a log with the offsets visible; you can do this
with "debug ms = 1; debug mds = 20; debug journaler = 20".
-Greg
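
A minimal sketch of how those settings might look in ceph.conf on the MDS host
(assuming they go under the [mds] section and the daemon is restarted afterwards
so they take effect):

    [mds]
        debug ms = 1
        debug mds = 20
        debug journaler = 20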

On Fri, Oct 24, 2014 at 7:03 AM, Jasper Siero
<jasper.si...@target-holding.nl> wrote:
> Hello Greg and John,
>
> I used the patch on the ceph cluster and tried it again:
>  /usr/bin/ceph-mds -i th1-mon001 -c /etc/ceph/ceph.conf --cluster ceph 
> --undump-journal 0 journaldumptgho-mon001
> undump journaldumptgho-mon001
> start 9483323613 len 134213311
> writing header 200.00000000
> writing 9483323613~1048576
> writing 9484372189~1048576
> ....
> ....
> writing 9614395613~1048576
> writing 9615444189~1048576
> writing 9616492765~1044159
> done.
>
> It went well without errors and after that I restarted the mds.
> The status went from up:replay to up:reconnect to up:rejoin (lagged or crashed).
>
> In the log there is an error about trim_to > trimming_pos, and as Greg 
> mentioned, maybe the dump file needs to be truncated to the proper length 
> and then reset and undumped again.
>
> How can I truncate the dumped file to the correct length?
>
> The mds log during the undumping and starting the mds:
> http://pastebin.com/y14pSvM0
>
> Kind Regards,
>
> Jasper
> ________________________________________
> From: john.sp...@inktank.com [john.sp...@inktank.com] on behalf of John Spray 
> [john.sp...@redhat.com]
> Sent: Thursday, October 16, 2014 12:23
> To: Jasper Siero
> CC: Gregory Farnum; ceph-users
> Subject: Re: [ceph-users] mds isn't working anymore after osd's running full
>
> Following up: firefly fix for undump is: 
> https://github.com/ceph/ceph/pull/2734
>
> Jasper: if you still need to try undumping on this existing firefly
> cluster, then you can download ceph-mds packages from this
> wip-firefly-undump branch from
> http://gitbuilder.ceph.com/ceph-deb-precise-x86_64-basic/ref/
>
> Cheers,
> John
>
> On Wed, Oct 15, 2014 at 8:15 PM, John Spray <john.sp...@redhat.com> wrote:
>> Sadly undump has been broken for quite some time (it was fixed in
>> giant as part of creating cephfs-journal-tool).  If there's a one line
>> fix for this then it's probably worth putting in firefly since it's a
>> long term supported branch -- I'll do that now.
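>>
>> For what it's worth, a rough sketch of the Giant-era equivalent with
>> cephfs-journal-tool (subcommand names are from memory, so treat them as
>> assumptions rather than exact syntax):
>>
>>   cephfs-journal-tool journal export backup.bin   # dump the journal to a file
>>   cephfs-journal-tool journal import backup.bin   # write a dumped journal back
>>   cephfs-journal-tool journal reset               # discard and reinitialize the journal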
>>
>> John
>>
>> On Wed, Oct 15, 2014 at 8:23 AM, Jasper Siero
>> <jasper.si...@target-holding.nl> wrote:
>>> Hello Greg,
>>>
>>> The dump and reset of the journal were successful:
>>>
>>> [root@th1-mon001 ~]# /usr/bin/ceph-mds -i th1-mon001 --pid-file 
>>> /var/run/ceph/mds.th1-mon001.pid -c /etc/ceph/ceph.conf --cluster ceph 
>>> --dump-journal 0 journaldumptgho-mon001
>>> journal is 9483323613~134215459
>>> read 134213311 bytes at offset 9483323613
>>> wrote 134213311 bytes at offset 9483323613 to journaldumptgho-mon001
>>> NOTE: this is a _sparse_ file; you can
>>>         $ tar cSzf journaldumptgho-mon001.tgz journaldumptgho-mon001
>>>       to efficiently compress it while preserving sparseness.
>>>
>>> [root@th1-mon001 ~]# /usr/bin/ceph-mds -i th1-mon001 --pid-file 
>>> /var/run/ceph/mds.th1-mon001.pid -c /etc/ceph/ceph.conf --cluster ceph 
>>> --reset-journal 0
>>> old journal was 9483323613~134215459
>>> new journal start will be 9621733376 (4194304 bytes past old end)
>>> writing journal head
>>> writing EResetJournal entry
>>> done
>>>
>>>
>>> Undumping the journal was not successful; looking into the output, the error 
>>> "client_lock.is_locked()" shows up several times. The mds is not running 
>>> when I start the undumping, so maybe I have forgotten something?
>>>
>>> [root@th1-mon001 ~]# /usr/bin/ceph-mds -i th1-mon001 --pid-file 
>>> /var/run/ceph/mds.th1-mon001.pid -c /etc/ceph/ceph.conf --cluster ceph 
>>> --undump-journal 0 journaldumptgho-mon001
>>> undump journaldumptgho-mon001
>>> start 9483323613 len 134213311
>>> writing header 200.00000000
>>> osdc/Objecter.cc: In function 'ceph_tid_t 
>>> Objecter::op_submit(Objecter::Op*)' thread 7fec3e5ad7a0 time 2014-10-15 
>>> 09:09:32.020287
>>> osdc/Objecter.cc: 1225: FAILED assert(client_lock.is_locked())
>>>  ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
>>>  1: /usr/bin/ceph-mds() [0x80f15e]
>>>  2: (Dumper::undump(char const*)+0x65d) [0x56c7ad]
>>>  3: (main()+0x1632) [0x569c62]
>>>  4: (__libc_start_main()+0xfd) [0x7fec3ca68d5d]
>>>  5: /usr/bin/ceph-mds() [0x567d99]
>>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed 
>>> to interpret this.
>>> 2014-10-15 09:09:32.021313 7fec3e5ad7a0 -1 osdc/Objecter.cc: In function 
>>> 'ceph_tid_t Objecter::op_submit(Objecter::Op*)' thread 7fec3e5ad7a0 time 
>>> 2014-10-15 09:09:32.020287
>>> osdc/Objecter.cc: 1225: FAILED assert(client_lock.is_locked())
>>>
>>>  ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
>>>  1: /usr/bin/ceph-mds() [0x80f15e]
>>>  2: (Dumper::undump(char const*)+0x65d) [0x56c7ad]
>>>  3: (main()+0x1632) [0x569c62]
>>>  4: (__libc_start_main()+0xfd) [0x7fec3ca68d5d]
>>>  5: /usr/bin/ceph-mds() [0x567d99]
>>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed 
>>> to interpret this.
>>>
>>>      0> 2014-10-15 09:09:32.021313 7fec3e5ad7a0 -1 osdc/Objecter.cc: In 
>>> function 'ceph_tid_t Objecter::op_submit(Objecter::Op*)' thread 
>>> 7fec3e5ad7a0 time 2014-10-15 09:09:32.020287
>>> osdc/Objecter.cc: 1225: FAILED assert(client_lock.is_locked())
>>>
>>>  ceph version 0.80.5 (38b73c67d375a2552d8ed67843c
>>> [root@th1-mon001 ~]# /usr/bin/ceph-mds -i th1-mon001 --p8a65c2c0feba6)
>>>  1: /usr/bin/ceph-mds() [0x80f15e]
>>>  2: (Dumper::undump(char const*)+0x65d) [0x56c7ad]
>>>  3: (main()+0x1632) [0x569c62]
>>>  4: (__libc_start_main()+0xfd) [0x7fec3ca68d5d]
>>>  5: /usr/bin/ceph-mds() [0x567d99]
>>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed 
>>> to interpret this.
>>>
>>> terminate called after throwing an instance of 'ceph::FailedAssertion'
>>> *** Caught signal (Aborted) **
>>>  in thread 7fec3e5ad7a0
>>>  ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
>>>  1: /usr/bin/ceph-mds() [0x82ef61]
>>>  2: (()+0xf710) [0x7fec3d9a6710]
>>>  3: (gsignal()+0x35) [0x7fec3ca7c635]
>>>  4: (abort()+0x175) [0x7fec3ca7de15]
>>>  5: (__gnu_cxx::__verbose_terminate_handler()+0x12d) [0x7fec3d336a5d]
>>>  6: (()+0xbcbe6) [0x7fec3d334be6]
>>>  7: (()+0xbcc13) [0x7fec3d334c13]
>>>  8: (()+0xbcd0e) [0x7fec3d334d0e]
>>>  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
>>> const*)+0x7f2) [0x94b812]
>>>  10: /usr/bin/ceph-mds() [0x80f15e]
>>>  11: (Dumper::undump(char const*)+0x65d) [0x56c7ad]
>>>  12: (main()+0x1632) [0x569c62]
>>>  13: (__libc_start_main()+0xfd) [0x7fec3ca68d5d]
>>>  14: /usr/bin/ceph-mds() [0x567d99]
>>> 2014-10-15 09:09:32.024248 7fec3e5ad7a0 -1 *** Caught signal (Aborted) **
>>>  in thread 7fec3e5ad7a0
>>>
>>>  ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
>>>  1: /usr/bin/ceph-mds() [0x82ef61]
>>>  2: (()+0xf710) [0x7fec3d9a6710]
>>>  3: (gsignal()+0x35) [0x7fec3ca7c635]
>>>  4: (abort()+0x175) [0x7fec3ca7de15]
>>>  5: (__gnu_cxx::__verbose_terminate_handler()+0x12d) [0x7fec3d336a5d]
>>>  6: (()+0xbcbe6) [0x7fec3d334be6]
>>>  7: (()+0xbcc13) [0x7fec3d334c13]
>>>  8: (()+0xbcd0e) [0x7fec3d334d0e]
>>>  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
>>> const*)+0x7f2) [0x94b812]
>>>  10: /usr/bin/ceph-mds() [0x80f15e]
>>>  11: (Dumper::undump(char const*)+0x65d) [0x56c7ad]
>>>  12: (main()+0x1632) [0x569c62]
>>>  13: (__libc_start_main()+0xfd) [0x7fec3ca68d5d]
>>>  14: /usr/bin/ceph-mds() [0x567d99]
>>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed 
>>> to interpret this.
>>>
>>>      0> 2014-10-15 09:09:32.024248 7fec3e5ad7a0 -1 *** Caught signal 
>>> (Aborted) **
>>>  in thread 7fec3e5ad7a0
>>>
>>>  ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
>>>  1: /usr/bin/ceph-mds() [0x82ef61]
>>>  2: (()+0xf710) [0x7fec3d9a6710]
>>>  3: (gsignal()+0x35) [0x7fec3ca7c635]
>>>  4: (abort()+0x175) [0x7fec3ca7de15]
>>>  5: (__gnu_cxx::__verbose_terminate_handler()+0x12d) [0x7fec3d336a5d]
>>>  6: (()+0xbcbe6) [0x7fec3d334be6]
>>>  7: (()+0xbcc13) [0x7fec3d334c13]
>>>  8: (()+0xbcd0e) [0x7fec3d334d0e]
>>>  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
>>> const*)+0x7f2) [0x94b812]
>>>  10: /usr/bin/ceph-mds() [0x80f15e]
>>>  11: (Dumper::undump(char const*)+0x65d) [0x56c7ad]
>>>  12: (main()+0x1632) [0x569c62]
>>>  13: (__libc_start_main()+0xfd) [0x7fec3ca68d5d]
>>>  14: /usr/bin/ceph-mds() [0x567d99]
>>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed 
>>> to interpret this.
>>>
>>> Aborted
>>>
>>> Jasper
>>> ________________________________________
>>> From: Gregory Farnum [g...@inktank.com]
>>> Sent: Tuesday, October 14, 2014 23:40
>>> To: Jasper Siero
>>> CC: ceph-users
>>> Subject: Re: [ceph-users] mds isn't working anymore after osd's running 
>>> full
>>>
>>> ceph-mds --undump-journal <rank> <journal-file>
>>> Looks like it accidentally (or on purpose? you can break things with
>>> it) got left out of the help text.
>>>
>>> On Tue, Oct 14, 2014 at 8:19 AM, Jasper Siero
>>> <jasper.si...@target-holding.nl> wrote:
>>>> Hello Greg,
>>>>
>>>> I dumped the journal successful to a file:
>>>>
>>>> journal is 9483323613~134215459
>>>> read 134213311 bytes at offset 9483323613
>>>> wrote 134213311 bytes at offset 9483323613 to journaldumptgho
>>>> NOTE: this is a _sparse_ file; you can
>>>>         $ tar cSzf journaldumptgho.tgz journaldumptgho
>>>>       to efficiently compress it while preserving sparseness.
>>>>
>>>> I see the option for resetting the mds journal but I can't find the option 
>>>> for undumping/importing the journal:
>>>>
>>>>  usage: ceph-mds -i name [flags] [[--journal_check 
>>>> rank]|[--hot-standby][rank]]
>>>>   -m monitorip:port
>>>>         connect to monitor at given address
>>>>   --debug_mds n
>>>>         debug MDS level (e.g. 10)
>>>>   --dump-journal rank filename
>>>>         dump the MDS journal (binary) for rank.
>>>>   --dump-journal-entries rank filename
>>>>         dump the MDS journal (JSON) for rank.
>>>>   --journal-check rank
>>>>         replay the journal for rank, then exit
>>>>   --hot-standby rank
>>>>         start up as a hot standby for rank
>>>>   --reset-journal rank
>>>>         discard the MDS journal for rank, and replace it with a single
>>>>         event that updates/resets inotable and sessionmap on replay.
>>>>
>>>> Do you know how to "undump" the journal back into ceph?
>>>>
>>>> Jasper
>>>>
>>>> ________________________________________
>>>> From: Gregory Farnum [g...@inktank.com]
>>>> Sent: Friday, October 10, 2014 23:45
>>>> To: Jasper Siero
>>>> CC: ceph-users
>>>> Subject: Re: [ceph-users] mds isn't working anymore after osd's running 
>>>> full
>>>>
>>>> Ugh, "debug journaler", not "debug journaled."
>>>>
>>>> That said, the filer output tells me that you're missing an object out
>>>> of the MDS log. (200.000008f5) I think this issue should be resolved
>>>> if you "dump" the journal to a file, "reset" it, and then "undump" it.
>>>> (These are commands you can invoke from ceph-mds.)
>>>> I haven't done this myself in a long time, so there may be some hard
>>>> edges around it. In particular, I'm not sure if the dumped journal
>>>> file will stop when the data stops, or if it will be a little too
>>>> long. If so, we can fix that by truncating the dumped file to the
>>>> proper length and resetting and undumping again.
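>>>> Roughly, that sequence would look something like this (a sketch only; the
>>>> exact size to keep is an assumption here and should come from the "len"
>>>> value that the dump/undump output reports):
>>>>
>>>>   ceph-mds -i <name> -c /etc/ceph/ceph.conf --dump-journal 0 journaldump
>>>>   truncate -s <len-from-dump-output> journaldump
>>>>   ceph-mds -i <name> -c /etc/ceph/ceph.conf --reset-journal 0
>>>>   ceph-mds -i <name> -c /etc/ceph/ceph.conf --undump-journal 0 journaldump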
>>>> (And just to harp on it, this journal manipulation is a lot simpler in
>>>> Giant... ;) )
>>>> -Greg
>>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>>
>>>> On Wed, Oct 8, 2014 at 7:11 AM, Jasper Siero
>>>> <jasper.si...@target-holding.nl> wrote:
>>>>> Hello Greg,
>>>>>
>>>>> No problem, thanks for looking into the log. I attached the log to this 
>>>>> email.
>>>>> I'm looking forward to the new release because it would be nice to have 
>>>>> more options for diagnosing problems.
>>>>>
>>>>> Kind regards,
>>>>>
>>>>> Jasper Siero
>>>>> ________________________________________
>>>>> From: Gregory Farnum [g...@inktank.com]
>>>>> Sent: Tuesday, October 7, 2014 19:45
>>>>> To: Jasper Siero
>>>>> CC: ceph-users
>>>>> Subject: Re: [ceph-users] mds isn't working anymore after osd's running 
>>>>> full
>>>>>
>>>>> Sorry; I guess this fell off my radar.
>>>>>
>>>>> The issue here is not that it's waiting for an osdmap; it got the
>>>>> requested map and went into replay mode almost immediately. In fact
>>>>> the log looks good except that it seems to finish replaying the log
>>>>> and then simply fail to transition into active. Generate a new one,
>>>>> adding in "debug journaled = 20" and "debug filer = 20", and we can
>>>>> probably figure out how to fix it.
>>>>> (This diagnosis is much easier in the upcoming Giant!)
>>>>> -Greg
>>>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>>>
>>>>>
>>>>> On Tue, Oct 7, 2014 at 7:55 AM, Jasper Siero
>>>>> <jasper.si...@target-holding.nl> wrote:
>>>>>> Hello Gregory,
>>>>>>
>>>>>> We still have the same problems with our test ceph cluster and didn't 
>>>>>> receive a reply from you after I sent you the requested log files. Do 
>>>>>> you know if it's possible to get our cephfs filesystem working again or 
>>>>>> is it better to give up the files on cephfs and start over again?
>>>>>>
>>>>>> We restarted the cluster several times but it's still degraded:
>>>>>> [root@th1-mon001 ~]# ceph -w
>>>>>>     cluster c78209f5-55ea-4c70-8968-2231d2b05560
>>>>>>      health HEALTH_WARN mds cluster is degraded
>>>>>>      monmap e3: 3 mons at 
>>>>>> {th1-mon001=10.1.2.21:6789/0,th1-mon002=10.1.2.22:6789/0,th1-mon003=10.1.2.23:6789/0},
>>>>>>  election epoch 432, quorum 0,1,2 th1-mon001,th1-mon002,th1-mon003
>>>>>>      mdsmap e190: 1/1/1 up {0=th1-mon001=up:replay}, 1 up:standby
>>>>>>      osdmap e2248: 12 osds: 12 up, 12 in
>>>>>>       pgmap v197548: 492 pgs, 4 pools, 60297 MB data, 470 kobjects
>>>>>>             124 GB used, 175 GB / 299 GB avail
>>>>>>                  491 active+clean
>>>>>>                    1 active+clean+scrubbing+deep
>>>>>>
>>>>>> One placement group stays in the deep scrubbing phase.
>>>>>>
>>>>>> Kind regards,
>>>>>>
>>>>>> Jasper Siero
>>>>>>
>>>>>>
>>>>>> ________________________________________
>>>>>> From: Jasper Siero
>>>>>> Sent: Thursday, August 21, 2014 16:43
>>>>>> To: Gregory Farnum
>>>>>> Subject: RE: [ceph-users] mds isn't working anymore after osd's 
>>>>>> running full
>>>>>>
>>>>>> I did restart it, and you are right that the epoch number has changed, 
>>>>>> but the situation looks the same.
>>>>>> 2014-08-21 16:33:06.032366 7f9b5f3cd700  1 mds.0.27  need osdmap epoch 
>>>>>> 1994, have 1993
>>>>>> 2014-08-21 16:33:06.032368 7f9b5f3cd700  1 mds.0.27  waiting for osdmap 
>>>>>> 1994 (which blacklists
>>>>>> prior instance)
>>>>>> I started the mds with the debug options and attached the log.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Jasper
>>>>>> ________________________________________
>>>>>> From: Gregory Farnum [g...@inktank.com]
>>>>>> Sent: Wednesday, August 20, 2014 18:38
>>>>>> To: Jasper Siero
>>>>>> CC: ceph-users@lists.ceph.com
>>>>>> Subject: Re: [ceph-users] mds isn't working anymore after osd's 
>>>>>> running full
>>>>>>
>>>>>> After restarting your MDS, it still says it has epoch 1832 and needs
>>>>>> epoch 1833? I think you didn't really restart it.
>>>>>> If the epoch numbers have changed, can you restart it with "debug mds
>>>>>> = 20", "debug objecter = 20", "debug ms = 1" in the ceph.conf and post
>>>>>> the resulting log file somewhere?
>>>>>> -Greg
>>>>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>>>>
>>>>>>
>>>>>> On Wed, Aug 20, 2014 at 12:49 AM, Jasper Siero
>>>>>> <jasper.si...@target-holding.nl> wrote:
>>>>>>> Unfortunately that doesn't help. I restarted both the active and 
>>>>>>> standby mds but that doesn't change the state of the mds. Is there a 
>>>>>>> way to force the mds to look at the 1832 epoch (or earlier) instead of 
>>>>>>> 1833 (need osdmap epoch 1833, have 1832)?
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Jasper
>>>>>>> ________________________________________
>>>>>>> From: Gregory Farnum [g...@inktank.com]
>>>>>>> Sent: Tuesday, August 19, 2014 19:49
>>>>>>> To: Jasper Siero
>>>>>>> CC: ceph-users@lists.ceph.com
>>>>>>> Subject: Re: [ceph-users] mds isn't working anymore after osd's 
>>>>>>> running full
>>>>>>>
>>>>>>> On Mon, Aug 18, 2014 at 6:56 AM, Jasper Siero
>>>>>>> <jasper.si...@target-holding.nl> wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> We have a small ceph cluster running version 0.80.1 with cephfs on five
>>>>>>>> nodes.
>>>>>>>> Last week some osd's were full and shut themselves down. To help the 
>>>>>>>> osd's start again I added some extra osd's and moved some placement 
>>>>>>>> group directories on the full osd's (which have a copy on another osd) 
>>>>>>>> to another place on the node (as mentioned in 
>>>>>>>> http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/).
>>>>>>>> After clearing some space on the full osd's I started them again. After 
>>>>>>>> a lot of deep scrubbing and two pg inconsistencies that needed to be 
>>>>>>>> repaired, everything looked fine except the mds, which is still in the 
>>>>>>>> replay state and stays that way.
>>>>>>>> The log below says that the mds needs osdmap epoch 1833 and has 1832.
>>>>>>>>
>>>>>>>> 2014-08-18 12:29:22.268248 7fa786182700  1 mds.-1.0 handle_mds_map 
>>>>>>>> standby
>>>>>>>> 2014-08-18 12:29:22.273995 7fa786182700  1 mds.0.25 handle_mds_map i 
>>>>>>>> am now
>>>>>>>> mds.0.25
>>>>>>>> 2014-08-18 12:29:22.273998 7fa786182700  1 mds.0.25 handle_mds_map 
>>>>>>>> state
>>>>>>>> change up:standby --> up:replay
>>>>>>>> 2014-08-18 12:29:22.274000 7fa786182700  1 mds.0.25 replay_start
>>>>>>>> 2014-08-18 12:29:22.274014 7fa786182700  1 mds.0.25  recovery set is
>>>>>>>> 2014-08-18 12:29:22.274016 7fa786182700  1 mds.0.25  need osdmap epoch 
>>>>>>>> 1833,
>>>>>>>> have 1832
>>>>>>>> 2014-08-18 12:29:22.274017 7fa786182700  1 mds.0.25  waiting for 
>>>>>>>> osdmap 1833
>>>>>>>> (which blacklists prior instance)
>>>>>>>>
>>>>>>>>  # ceph status
>>>>>>>>     cluster c78209f5-55ea-4c70-8968-2231d2b05560
>>>>>>>>      health HEALTH_WARN mds cluster is degraded
>>>>>>>>      monmap e3: 3 mons at
>>>>>>>> {th1-mon001=10.1.2.21:6789/0,th1-mon002=10.1.2.22:6789/0,th1-mon003=10.1.2.23:6789/0},
>>>>>>>> election epoch 362, quorum 0,1,2 th1-mon001,th1-mon002,th1-mon003
>>>>>>>>      mdsmap e154: 1/1/1 up {0=th1-mon001=up:replay}, 1 up:standby
>>>>>>>>      osdmap e1951: 12 osds: 12 up, 12 in
>>>>>>>>       pgmap v193685: 492 pgs, 4 pools, 60297 MB data, 470 kobjects
>>>>>>>>             124 GB used, 175 GB / 299 GB avail
>>>>>>>>                  492 active+clean
>>>>>>>>
>>>>>>>> # ceph osd tree
>>>>>>>> # id    weight    type name    up/down    reweight
>>>>>>>> -1    0.2399    root default
>>>>>>>> -2    0.05997        host th1-osd001
>>>>>>>> 0    0.01999            osd.0    up    1
>>>>>>>> 1    0.01999            osd.1    up    1
>>>>>>>> 2    0.01999            osd.2    up    1
>>>>>>>> -3    0.05997        host th1-osd002
>>>>>>>> 3    0.01999            osd.3    up    1
>>>>>>>> 4    0.01999            osd.4    up    1
>>>>>>>> 5    0.01999            osd.5    up    1
>>>>>>>> -4    0.05997        host th1-mon003
>>>>>>>> 6    0.01999            osd.6    up    1
>>>>>>>> 7    0.01999            osd.7    up    1
>>>>>>>> 8    0.01999            osd.8    up    1
>>>>>>>> -5    0.05997        host th1-mon002
>>>>>>>> 9    0.01999            osd.9    up    1
>>>>>>>> 10    0.01999            osd.10    up    1
>>>>>>>> 11    0.01999            osd.11    up    1
>>>>>>>>
>>>>>>>> What is the way to get the mds up and running again?
>>>>>>>>
>>>>>>>> I still have all the placement group directories that I moved off the 
>>>>>>>> full osds (which were down) to free up disk space.
>>>>>>>
>>>>>>> Try just restarting the MDS daemon. This sounds a little familiar so I
>>>>>>> think it's a known bug which may be fixed in a later dev or point
>>>>>>> release on the MDS, but it's a soft-state rather than a disk state
>>>>>>> issue.
>>>>>>> -Greg
>>>>>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
