It has just been pointed out to me that you can also work around this
issue on your existing system by increasing the osd_max_write_size
setting on your OSDs (default 90MB) to something higher, but still
smaller than your osd journal size.  That might get you on a path to
having an accessible filesystem before you consider an upgrade.
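
As a rough check of the numbers, here is a minimal sketch (not Ceph code)
that compares the failed write in the quoted log, "writefull 0~94394932",
against the 90MB default. It assumes the limit is counted in units of 2^20
bytes and that the OSD compares the payload length directly against it; the
real check may include some message overhead. The helper names are
hypothetical.

```python
MIB = 1 << 20  # assumption: osd_max_write_size is expressed in units of 2^20 bytes

DEFAULT_OSD_MAX_WRITE_SIZE_MB = 90  # default mentioned in this thread
FAILED_WRITE_BYTES = 94394932       # length from "writefull 0~94394932" in the log


def write_fits(write_bytes: int, limit_mb: int) -> bool:
    """Return True if a write of write_bytes is within a limit of limit_mb."""
    return write_bytes <= limit_mb * MIB


def min_write_size_mb(write_bytes: int) -> int:
    """Smallest whole-MB limit that admits the write (integer ceiling division)."""
    return -(-write_bytes // MIB)


if __name__ == "__main__":
    over_by = FAILED_WRITE_BYTES - DEFAULT_OSD_MAX_WRITE_SIZE_MB * MIB
    print("fits in default limit:", write_fits(FAILED_WRITE_BYTES, DEFAULT_OSD_MAX_WRITE_SIZE_MB))
    print("bytes over the default limit:", over_by)
    print("smallest limit that fits (MB):", min_write_size_mb(FAILED_WRITE_BYTES))
```

Under those assumptions the anchortable write is only slightly over the
default, so a modest increase should be enough. Leave some headroom for
growth, and keep the value below the OSD journal size as noted above.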

John

On Fri, Jan 16, 2015 at 10:57 AM, John Spray <john.sp...@redhat.com> wrote:
> Hmm, upgrading should help here, as the problematic data structure
> (anchortable) no longer exists in the latest version.  I haven't
> checked, but hopefully we don't try to write it during upgrades.
>
> The bug you're hitting is more or less the same as a similar one we
> have with the sessiontable in the latest ceph, but you won't hit it
> there unless you're very unlucky!
>
> John
>
> On Fri, Jan 16, 2015 at 7:37 AM, Mohd Bazli Ab Karim
> <bazli.abka...@mimos.my> wrote:
>> Dear Ceph-Users, Ceph-Devel,
>>
>> Apologies if you receive this email twice.
>>
>> I am running a Ceph cluster, version 0.72.2, with three MDS daemons, of
>> which two are down and only one is up at the moment.
>> I also have one CephFS client mounted on it.
>>
>> The MDS now always aborts about 4 seconds after it finishes recovery and
>> becomes active.
>> Some parts of the log are below:
>>
>>     -3> 2015-01-15 14:10:28.464706 7fbcc8226700  1 -- 10.4.118.21:6800/5390 <== osd.19 10.4.118.32:6821/243161 73 ==== osd_op_reply(3742 1000240c57e.00000000 [create 0~0,setxattr (99)] v56640'1871414 uv1871414 ondisk = 0) v6 ==== 221+0+0 (261801329 0 0) 0x7770bc80 con 0x69c7dc0
>>     -2> 2015-01-15 14:10:28.464730 7fbcc8226700  1 -- 10.4.118.21:6800/5390 <== osd.18 10.4.118.32:6818/243072 67 ==== osd_op_reply(3645 1000007941c.00000000 [tmapup 0~0] v56640'1769567 uv1769567 ondisk = 0) v6 ==== 179+0+0 (3759887079 0 0) 0x7757ec80 con 0x1c6bb00
>>     -1> 2015-01-15 14:10:28.464754 7fbcc8226700  1 -- 10.4.118.21:6800/5390 <== osd.47 10.4.118.35:6809/8290 79 ==== osd_op_reply(3419 mds_anchortable [writefull 0~94394932] v0'0 uv0 ondisk = -90 (Message too long)) v6 ==== 174+0+0 (3942056372 0 0) 0x69f94a00 con 0x1c6b9a0
>>      0> 2015-01-15 14:10:28.471684 7fbcc8226700 -1 mds/MDSTable.cc: In function 'void MDSTable::save_2(int, version_t)' thread 7fbcc8226700 time 2015-01-15 14:10:28.469999
>> mds/MDSTable.cc: 83: FAILED assert(r >= 0)
>>
>>  ceph version  ()
>>  1: (MDSTable::save_2(int, unsigned long)+0x325) [0x769e25]
>>  2: (Context::complete(int)+0x9) [0x568d29]
>>  3: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0x1097) [0x7c15d7]
>>  4: (MDS::handle_core_message(Message*)+0x5a0) [0x588900]
>>  5: (MDS::_dispatch(Message*)+0x2f) [0x58908f]
>>  6: (MDS::ms_dispatch(Message*)+0x1e3) [0x58ab93]
>>  7: (DispatchQueue::entry()+0x549) [0x975739]
>>  8: (DispatchQueue::DispatchThread::entry()+0xd) [0x8902dd]
>>  9: (()+0x7e9a) [0x7fbcccb0de9a]
>>  10: (clone()+0x6d) [0x7fbccb4ba3fd]
>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>
>> Is there any workaround or patch for this issue? Let me know if you need to
>> see the log at a particular debug-mds level as well.
>> Any help would be very much appreciated.
>>
>> Thanks.
>> Bazli
>>