So let me get this straight: you have 3 hosts with 6 drives each in RAID 0, so you have 3 OSDs in the CRUSH map, right? You said the replication level is 2, so there are 2 copies of the original data, which would make the pool size 3, right? You said 2 out of 3 OSDs are down, so you are left with only one copy of the data. As far as I know, Ceph blocks access to the remaining data once fewer than min_size copies are available, to prevent changes to it (if 2 out of 3 OSDs were up, you should still have had access to your data). You can try setting the pool's min_size to 1 and see if you can access it, although you should still bring the lost OSDs back up.
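To make the min_size suggestion concrete, here is a rough Python sketch of the rule as I understand it: a placement group keeps serving I/O only while the number of replicas that are up is at least min_size. The function names and the pool name "mypool" are made up for illustration; the second helper only builds the argument list for the `ceph osd pool set <pool> min_size <n>` command and does not run it.

```python
from typing import List


def pg_accepts_io(replicas_up: int, min_size: int) -> bool:
    """Simplified model: I/O is served only while enough replicas are up."""
    return replicas_up >= min_size


def min_size_command(pool: str, min_size: int) -> List[str]:
    """Build (but do not run) the CLI call that changes a pool's min_size.

    The pool name passed in is whatever your pool is called; "mypool"
    below is a hypothetical placeholder.
    """
    return ["ceph", "osd", "pool", "set", pool, "min_size", str(min_size)]


if __name__ == "__main__":
    # One surviving copy with min_size 2: I/O is blocked in this model.
    print(pg_accepts_io(replicas_up=1, min_size=2))
    # Same single copy with min_size lowered to 1: I/O is allowed again.
    print(pg_accepts_io(replicas_up=1, min_size=1))
    print(" ".join(min_size_command("mypool", 1)))
```

Keep in mind that running with min_size 1 means writes land on a single copy, so it is best treated as a temporary measure until the lost OSDs are back.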
Also, I don't think running RAID 0 (striping) is a good idea. When a drive in the array goes down, it takes the whole array down with it, as opposed to making every single drive its own OSD and grouping them by host in the CRUSH map. Or set up 3 RAID 0 arrays on every host. I might be mistaken though! Anyway, someone with more experience than me should have the right answer for you. Hope I understood correctly!

2016-01-13 14:26 GMT+02:00 Magnus Hagdorn <magnus.hagd...@ed.ac.uk>:

> Hi there,
> we recently had a problem with two OSDs failing because of I/O errors of
> the underlying disks. We run a small ceph cluster with 3 nodes and 18 OSDs
> in total. All 3 nodes are dell poweredge r515 servers with PERC H700
> (MegaRAID SAS 2108) RAID controllers. All disks are configured as single
> disk RAID 0 arrays. A disk on two separate nodes started showing I/O errors
> reported by SMART, with one of the disks reporting pre failure SMART error.
> The node with the failing disk also reported XFS I/O errors. In both cases
> the OSD daemons kept running although ceph reported that they were slow to
> respond. When we started to look into this we first tried restarted the
> OSDs. They then failed straight away. We ended up with data loss. We are
> running ceph 0.80.5 on Scientific Linux 6.6 with a replication level of 2.
> We had hoped that loosing disks due to hardware failure would be
> recoverable.
>
> Is this a known issue with the RAID controllers, version of ceph?
>
> Regards
> magnus
>
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>