Re: Journal too small

2012-05-18 Thread Karol Jurak
On Thursday 17 of May 2012 20:59:52 Josh Durgin wrote: On 05/17/2012 03:59 AM, Karol Jurak wrote: How serious is such a situation? Do the OSDs know how to handle it correctly? Or could this result in some data loss or corruption? After the recovery finished (ceph -w showed that all PGs are in

Re: Journal too small

2012-05-18 Thread Karol Jurak
at 863363072 : JOURNAL FULL 863363072 = 1048571903 (max_size 1048576000 start 863363072) 2012-05-12 13:31:04.034680 7f491061d700 0 journal JOURNAL TOO SMALL: item 1693745152 journal 1048571904 (usable) I was under the impression that the OSDs stopped participating in recovery after

Re: Journal too small

2012-05-18 Thread Josh Durgin
On 05/18/2012 03:56 AM, Karol Jurak wrote: On Thursday 17 of May 2012 20:59:52 Josh Durgin wrote: On 05/17/2012 03:59 AM, Karol Jurak wrote: How serious is such a situation? Do the OSDs know how to handle it correctly? Or could this result in some data loss or corruption? After the recovery

Journal too small

2012-05-17 Thread Karol Jurak
:04.034680 7f491061d700 0 journal JOURNAL TOO SMALL: item 1693745152 journal 1048571904 (usable) I was under the impression that the OSDs stopped participating in recovery after this event. (ceph -w showed that the number of PGs in state active+clean no longer increased.) They resumed recovery

Re: Journal too small

2012-05-17 Thread Sage Weil
1048576000 start 863363072) 2012-05-12 13:31:04.034680 7f491061d700 0 journal JOURNAL TOO SMALL: item 1693745152 journal 1048571904 (usable) I was under the impression that the OSDs stopped participating in recovery after this event. (ceph -w showed that the number of PGs in state

Re: Journal too small

2012-05-17 Thread Tommi Virtanen
On Thu, May 17, 2012 at 9:01 AM, Sage Weil s...@inktank.com wrote: 2012-05-12 13:31:04.034144 7f491061d700  1 journal check_for_full at 863363072 : JOURNAL FULL 863363072 = 1048571903 (max_size 1048576000 start 863363072) 2012-05-12 13:31:04.034680 7f491061d700  0 journal JOURNAL TOO SMALL

Re: Journal too small

2012-05-17 Thread Sage Weil
:04.034680 7f491061d700  0 journal JOURNAL TOO SMALL: item 1693745152 journal 1048571904 (usable) The osds tolerate the full journal.  There will be a big latency spike, but they'll recover without risking data.  You should definitely increase the journal size if this happens regularly, though
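
For anyone sizing the journal from the numbers in that log line, here is a rough sketch of the arithmetic. It only reads the two values the message already reports. The helper names are made up for illustration, the 4096-byte overhead is an assumption inferred from the difference between max_size (1048576000) and the usable size (1048571904) quoted in this thread, and the result assumes the journal size setting is given in megabytes. Treat it as a lower bound for this one item, not a recommended value.

```python
import math
import re

MIB = 1024 * 1024

# Message text as quoted in this thread.
LOG_LINE = "journal JOURNAL TOO SMALL: item 1693745152 journal 1048571904 (usable)"


def parse_too_small(line):
    """Return (item_bytes, usable_bytes) from a JOURNAL TOO SMALL line, or None."""
    m = re.search(r"JOURNAL TOO SMALL: item (\d+) journal (\d+) \(usable\)", line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


def min_journal_mib(item_bytes, overhead_bytes=4096):
    """Smallest whole number of MiB that holds one item plus the assumed overhead.

    overhead_bytes is an assumption (max_size minus usable in the quoted log),
    not a documented constant.
    """
    return math.ceil((item_bytes + overhead_bytes) / MIB)


item, usable = parse_too_small(LOG_LINE)
print("item exceeds usable journal:", item > usable)
print("usable journal (MiB):", usable // MIB)
print("minimum journal for this item (MiB):", min_journal_mib(item))
```

With the values above, the single item is larger than the whole usable journal, so a journal of roughly 1000 MB cannot hold it no matter how empty it is. In practice the journal should be a good deal larger than this minimum so that several writes fit at once.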

Re: Journal too small

2012-05-17 Thread Josh Durgin
On 05/17/2012 03:59 AM, Karol Jurak wrote: How serious is such a situation? Do the OSDs know how to handle it correctly? Or could this result in some data loss or corruption? After the recovery finished (ceph -w showed that all PGs are in active+clean state) I noticed that a few rbd images were