osd crash after reboot

2012-12-14 Thread Stefan Priebe
Hello list, after rebooting my node I see this on all OSDs of that node: 2012-12-14 09:03:20.393224 7f8e652f8780 -1 osd/OSD.cc: In function 'OSDMapRef OSDService::get_map(epoch_t)' thread 7f8e652f8780 time 2012-12-14 09:03:20.392528 osd/OSD.cc: 4385: FAILED

Re: osd crash after reboot

2012-12-14 Thread Stefan Priebe
Same log, more verbose: 11 ec=10 les/c 3307/3307 3306/3306/3306) [] r=0 lpr=0 lcod 0'0 mlcod 0'0 inactive] read_log done -11 2012-12-14 09:17:50.648572 7fb6e0d6b780 10 osd.3 pg_epoch: 3996 pg[3.44b( v 3988'3969 (1379'2968,3988'3969] local-les=3307 n=11 ec=10 les/c 3307/3307 3306/3306/3306)

Re: osd crash after reboot

2012-12-14 Thread Dennis Jacobfeuerborn
On 12/14/2012 10:14 AM, Stefan Priebe wrote: One more IMPORTANT note. This might happen due to the fact that a disk was missing (disk failure) after the reboot. fstab and mountpoint are working with UUIDs so they match, but the journal block device: osd journal = /dev/sde1 didn't match

Re: osd crash after reboot

2012-12-14 Thread Mark Nelson
On 12/14/2012 08:52 AM, Dennis Jacobfeuerborn wrote: On 12/14/2012 10:14 AM, Stefan Priebe wrote: One more IMPORTANT note. This might happen due to the fact that a disk was missing (disk failure) after the reboot. fstab and mountpoint are working with UUIDs so they match, but the journal block

Re: osd crash after reboot

2012-12-14 Thread Stefan Priebe - Profihost AG
Hello Dennis, On 14.12.2012 15:52, Dennis Jacobfeuerborn wrote: didn't match anymore - as the numbers got renumbered due to the failed disk. Is there a way to use some kind of UUIDs here too for journal? You should be able to use /dev/disk/by-uuid/* instead. That should give you a stable view
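
The suggestion above (point the journal at a stable alias instead of /dev/sde1) can be explored with a small helper. This is a minimal sketch, not something posted in the thread: the function name stable_names is made up for illustration, and it assumes a udev-managed /dev/disk tree and a readlink that supports -f. It lists every symlink under /dev/disk/by-*/ that resolves to a given device node. Note that a raw journal partition without a filesystem will typically have no entry under by-uuid, so on such a partition the aliases found are more likely to be under by-id, by-partuuid or by-partlabel.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print every stable alias
# that resolves to the given device node.
# Usage: stable_names DEVICE [LINK_ROOT]   (LINK_ROOT defaults to /dev/disk)
stable_names() {
    # Resolve the device itself first, in case it is given as a symlink.
    dev=$(readlink -f "$1") || return 1
    root=${2:-/dev/disk}
    for link in "$root"/by-*/*; do
        # Skip anything that is not a symlink (including an unmatched glob).
        [ -L "$link" ] || continue
        if [ "$(readlink -f "$link")" = "$dev" ]; then
            printf '%s\n' "$link"
        fi
    done
}
```

Running `stable_names /dev/sde1` on the affected node would show which aliases exist for the current journal partition; one of those paths could then replace /dev/sde1 in the `osd journal` setting.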

Re: osd crash after reboot

2012-12-14 Thread Mark Nelson
Hi Stefan, Here's what I often do when I have a journal and data partition sharing a disk: sudo parted -s -a optimal /dev/$DEV mklabel gpt sudo parted -s -a optimal /dev/$DEV mkpart osd-device-$i-journal 0% 10G sudo parted -s -a optimal /dev/$DEV mkpart osd-device-$i-data 10G 100% Mark On
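
The three parted commands above can be wrapped in a dry-run helper. This is a sketch under stated assumptions: the function name plan_osd_disk is hypothetical, the helper only prints the commands rather than running them, and the final comment line assumes a udev version that exposes GPT partition names under /dev/disk/by-partlabel/. On a GPT label, the first argument to mkpart is stored as the partition name, which is what makes a name-based journal path possible.

```shell
#!/bin/sh
# Hypothetical dry-run helper (not from the thread): print the parted
# commands for one disk holding a 10G journal plus a data partition.
# Usage: plan_osd_disk DEV INDEX     e.g. plan_osd_disk sdb 3
plan_osd_disk() {
    dev=$1
    i=$2
    echo "parted -s -a optimal /dev/$dev mklabel gpt"
    echo "parted -s -a optimal /dev/$dev mkpart osd-device-$i-journal 0% 10G"
    echo "parted -s -a optimal /dev/$dev mkpart osd-device-$i-data 10G 100%"
    # Assumption: udev creates this alias from the GPT partition name.
    echo "# journal alias: /dev/disk/by-partlabel/osd-device-$i-journal"
}
```

Printing the plan first keeps the destructive mklabel step out of the script itself; the output can be reviewed and then run by hand with sudo.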

Re: osd crash after reboot

2012-12-14 Thread Stefan Priebe - Profihost AG
Hi Mark, On 14.12.2012 16:20, Mark Nelson wrote: sudo parted -s -a optimal /dev/$DEV mklabel gpt sudo parted -s -a optimal /dev/$DEV mkpart osd-device-$i-journal 0% 10G sudo parted -s -a optimal /dev/$DEV mkpart osd-device-$i-data 10G 100% My disks are GPT too and I'm also using parted. But

Re: osd crash after reboot

2012-12-14 Thread Stefan Priebe - Profihost AG
Hello Mark, On 14.12.2012 16:20, Mark Nelson wrote: sudo parted -s -a optimal /dev/$DEV mklabel gpt sudo parted -s -a optimal /dev/$DEV mkpart osd-device-$i-journal 0% 10G sudo parted -s -a optimal /dev/$DEV mkpart osd-device-$i-data 10G 100% Isn't that the part type you're using? mkpart

Re: osd crash after reboot

2012-12-14 Thread Sage Weil
On Fri, 14 Dec 2012, Stefan Priebe wrote: One more IMPORTANT note. This might happen due to the fact that a disk was missing (disk failure) after the reboot. fstab and mountpoint are working with UUIDs so they match, but the journal block device: osd journal = /dev/sde1 didn't match

Re: osd crash after reboot

2012-12-14 Thread Stefan Priebe
Hi Sage, this was just an idea, and I need to fix MY UUID problem. But the crash itself is still a Ceph problem. Have you looked into my log? On 14.12.2012 20:42, Sage Weil wrote: On Fri, 14 Dec 2012, Stefan Priebe wrote: One more IMPORTANT note. This might happen due to the fact that a