On 01/01/2013 01:13 AM, Matthew Roy wrote:
On Mon, Dec 31, 2012 at 9:14 AM, Miles Fidelman
<mfidel...@meetinghouse.net> wrote:


Which raises another question: how are you combining drives within each OSD
(RAID, LVM, something else)?


I'm not combining them, just running an OSD per data disk. On this
cluster it's 2 disks for each of the 3 nodes.  I ended up that way
only because I added the second disk to each node after getting
started. There was an Inktank blog post not too long ago about the
performance of RAIDed disks under OSDs that might provide quantitative
justification for which route to take.
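(For anyone wanting to picture that layout: in an old-style ceph.conf it
would look roughly like the sketch below. The hostnames and data paths are
just placeholders, not my actual config.)

    [osd.0]
            host = node-a                         ; placeholder hostname
            osd data = /var/lib/ceph/osd/ceph-0   ; first data disk mounted here
    [osd.1]
            host = node-a
            osd data = /var/lib/ceph/osd/ceph-1   ; second data disk mounted here
    [osd.2]
            host = node-b
            osd data = /var/lib/ceph/osd/ceph-2
    ; ...and so on, two OSD sections per node across the three nodes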

Like Wido suggests, I also use a shared SSD for the journals on each node.
The journal's not really about speeding recovery from failed
OSDs/disks; it's about being able to ACK writes faster and still
retain integrity when Bad Things happen. If you're RAIDing with a
battery-backed cache I think you can run without a journal, but I
don't know the details on that.
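(To make the shared-SSD journal concrete: each OSD section just gets an
"osd journal" line pointing at its own partition on the SSD. The device
names below are only examples:

    [osd.0]
            osd journal = /dev/sda5   ; one partition on the shared SSD (example device)
    [osd.1]
            osd journal = /dev/sda6   ; a second partition on the same SSD

With a raw partition like this the whole partition is used as the journal;
as far as I know, "osd journal size" only matters if the journal is a plain
file.)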


Whether you can drop the journal depends on the RAID controller. Some (like Areca, I think) cache O_DIRECT writes in their write cache, but others still flush them straight to the disks even though they have a BBU-backed cache.

I'd avoid using any RAID system with Ceph. Let the replication handle everything. The less hardware and complexity you add, the less there is to fail.
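(For reference, the replication factor that does that work can be set for
new pools in ceph.conf; the value here is just an example:

    [global]
            osd pool default size = 3   ; keep 3 copies of each object

Existing pools can be changed at runtime with "ceph osd pool set <pool> size <n>".)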

Wido

Matthew
