Re: Designing a cluster guide

2012-06-29 Thread Sage Weil
On Fri, 29 Jun 2012, Brian Edmonds wrote: > On Fri, Jun 29, 2012 at 2:11 PM, Gregory Farnum wrote: > > Well, actually this depends on the filesystem you're using. With > > btrfs, the OSD will roll back to a consistent state, but you don't > > know how out-of-date that state is. > > Ok, so assumin

Re: Designing a cluster guide

2012-06-29 Thread Gregory Farnum
On Fri, Jun 29, 2012 at 2:18 PM, Brian Edmonds wrote: > On Fri, Jun 29, 2012 at 2:11 PM, Gregory Farnum wrote: >> Well, actually this depends on the filesystem you're using. With >> btrfs, the OSD will roll back to a consistent state, but you don't >> know how out-of-date that state is. > > Ok, s

Re: Designing a cluster guide

2012-06-29 Thread Brian Edmonds
On Fri, Jun 29, 2012 at 2:11 PM, Gregory Farnum wrote: > Well, actually this depends on the filesystem you're using. With > btrfs, the OSD will roll back to a consistent state, but you don't > know how out-of-date that state is. Ok, so assuming btrfs, then a single machine failure with a ramdisk

Re: Designing a cluster guide

2012-06-29 Thread Gregory Farnum
On Fri, Jun 29, 2012 at 1:59 PM, Brian Edmonds wrote: > On Fri, Jun 29, 2012 at 11:50 AM, Gregory Farnum wrote: >> If you lose a journal, you lose the OSD. > > Really?  Everything?  Not just recent commits?  I would have hoped it > would just come back up in an old state.  Replication should have

Re: Designing a cluster guide

2012-06-29 Thread Brian Edmonds
On Fri, Jun 29, 2012 at 11:50 AM, Gregory Farnum wrote: > If you lose a journal, you lose the OSD. Really? Everything? Not just recent commits? I would have hoped it would just come back up in an old state. Replication should have already been taking care of regaining redundancy for the stuff
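For context on what is at stake here: the journal's location and size are ordinary ceph.conf settings, which is what makes it tempting (and risky) to point them at fast or volatile media. A minimal sketch, assuming hypothetical device names and sizes not taken from this thread:

# Hypothetical: append per-OSD journal settings to ceph.conf (examples only).
cat >> /etc/ceph/ceph.conf <<'EOF'
[osd]
osd journal size = 1000    # MB; sized to absorb a few seconds of writes
[osd.0]
osd journal = /dev/sdb1    # dedicated SSD partition
[osd.1]
osd journal = /dev/ram0    # ramdisk: the volatile case debated in this thread
EOF

As the replies above spell out, anything volatile in that slot means a power loss can cost you the whole OSD, not just the most recent commits.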

Re: Designing a cluster guide

2012-06-29 Thread Gregory Farnum
On Fri, Jun 29, 2012 at 11:42 AM, Brian Edmonds wrote: > What are the likely and worst case scenarios if the OSD journal were > to simply be on a garden variety ramdisk, no battery backing?  In the > case of a single node losing power, and thus losing some data, surely > Ceph can recognize this, a

Re: Designing a cluster guide

2012-06-29 Thread Brian Edmonds
On Fri, Jun 29, 2012 at 11:07 AM, Gregory Farnum wrote: >>> the "Designing a cluster guide" >>> http://wiki.ceph.com/wiki/Designing_a_cluster is pretty good but it >>> still leaves some questions unanswered. Oh, thank you. I've been poking through the Ceph docs, but somehow had not managed to tu

Re: Designing a cluster guide

2012-06-29 Thread Gregory Farnum
On Thu, May 17, 2012 at 2:27 PM, Gregory Farnum wrote: > Sorry this got left for so long... > > On Thu, May 10, 2012 at 6:23 AM, Stefan Priebe - Profihost AG > wrote: >> Hi, >> >> the "Designing a cluster guide" >> http://wiki.ceph.com/wiki/Designing_a_cluster is pretty good but it >> still leave

Re: Designing a cluster guide

2012-05-29 Thread Tommi Virtanen
On Tue, May 29, 2012 at 12:25 AM, Quenten Grasso wrote: > So if we have 10 nodes vs. 3 nodes with the same amount of disks we should see > better write and read performance as you would have less "overlap". First of all, a typical way to run Ceph is with say 8-12 disks per node, and an OSD per di
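As a rough illustration of that one-OSD-per-disk layout, a bash sketch -- device names, mount points, filesystem choice and hostname are all assumptions for illustration, not details from the thread:

#!/bin/bash
# Hypothetical: turn 8 data disks on one node into 8 independent OSDs.
i=0
for dev in /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi; do
  mkfs.xfs -f "$dev"                     # or btrfs, per the discussion elsewhere in this thread
  mkdir -p "/var/lib/ceph/osd/ceph-$i"
  mount "$dev" "/var/lib/ceph/osd/ceph-$i"
  # one [osd.N] section per disk, instead of one big RAID per node
  printf '[osd.%d]\nhost = node1\nosd data = /var/lib/ceph/osd/ceph-%d\n' \
    "$i" "$i" >> /etc/ceph/ceph.conf
  i=$((i+1))
done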

RE: Designing a cluster guide

2012-05-29 Thread Quenten Grasso
I have some performance numbers from an rbd cluster: near 320MB/s on a VM from a 3-node cluster. But with 10GE, and with 26 2.5" SAS drives used on every machine, that's not everything it can be. Every osd drive is a raid0 with one drive via battery

Re: Designing a cluster guide

2012-05-24 Thread Jerker Nyberg
On Wed, 23 May 2012, Gregory Farnum wrote: On Wed, May 23, 2012 at 12:47 PM, Jerker Nyberg wrote:  * Scratch file system for HPC. (kernel client)  * Scratch file system for research groups. (SMB, NFS, SSH)  * Backend for simple disk backup. (SSH/rsync, AFP, BackupPC)  * Metropolitan cluster.

Re: Designing a cluster guide

2012-05-23 Thread Gregory Farnum
On Wed, May 23, 2012 at 12:47 PM, Jerker Nyberg wrote: > On Tue, 22 May 2012, Gregory Farnum wrote: > >> Direct users of the RADOS object store (i.e., librados) can do all kinds >> of things with the integrity guarantee options. But I don't believe there's >> currently a way to make the filesystem

Re: Designing a cluster guide

2012-05-23 Thread Jerker Nyberg
On Tue, 22 May 2012, Gregory Farnum wrote: Direct users of the RADOS object store (i.e., librados) can do all kinds of things with the integrity guarantee options. But I don't believe there's currently a way to make the filesystem do so -- among other things, you're running through the page ca

Re: Designing a cluster guide

2012-05-22 Thread Gregory Farnum
On Tuesday, May 22, 2012 at 2:04 AM, Jerker Nyberg wrote: > On Mon, 21 May 2012, Gregory Farnum wrote: > > > This one the write is considered "safe" once it is on-disk on all > > OSDs currently responsible for hosting the object. > > > > Is it possible to configure the client to consider th

Re: Designing a cluster guide

2012-05-22 Thread Jerker Nyberg
On Mon, 21 May 2012, Gregory Farnum wrote: This one -- the write is considered "safe" once it is on-disk on all OSDs currently responsible for hosting the object. Is it possible to configure the client to consider the write successful when the data is hitting RAM on all the OSDs but not yet com

Re: Designing a cluster guide

2012-05-21 Thread Sławomir Skowron
http://en.wikipedia.org/wiki/Host_protected_area On Tue, May 22, 2012 at 8:30 AM, Stefan Priebe - Profihost AG wrote: > Am 21.05.2012 23:22, schrieb Sławomir Skowron: >> Maybe good for journal will be two cheap MLC Intel drives on Sandforce >> (320/520), 120GB or 240GB, and HPA changed to 20-30GB
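For reference, the HPA change mentioned here is typically done with hdparm; a sketch, where the device name and sector count are assumptions, and shrinking the visible area makes everything beyond it inaccessible:

# Show current vs. native max sectors (reveals any existing HPA).
hdparm -N /dev/sdb
# Hypothetical: expose only ~20GB of a 120GB SSD so the controller can use
# the rest for wear levelling. 'p' makes the new limit persist across reboots;
# 20GB / 512 bytes per sector = 39062500 sectors.
hdparm -N p39062500 /dev/sdb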

Re: Designing a cluster guide

2012-05-21 Thread Stefan Priebe - Profihost AG
Am 21.05.2012 20:13, schrieb Gregory Farnum: > On Sat, May 19, 2012 at 1:37 AM, Stefan Priebe wrote: >> So would you recommand a fast (more ghz) Core i3 instead of a single xeon >> for this system? (price per ghz is better). > > If that's all the MDS is doing there, probably? (It would also depen

Re: Designing a cluster guide

2012-05-21 Thread Sławomir Skowron
> On Mon, May 21, 2012 at 4:52 PM, Quenten Grasso wrote

RE: Designing a cluster guide

2012-05-21 Thread Quenten Grasso
Hi Greg, I'm only talking about journal disks, not storage. :) Regards, Quenten

RE: Designing a cluster guide

2012-05-21 Thread Quenten Grasso
On Mon, May 21, 2012 at 4:52 PM, Quenten Grasso wrote: > Hi All, > > I've been thinking about this issue myself for the past few days, and an idea I've > come up with is running 16 x 2.5" 15K 72/146GB Disks, > in

Re: Designing a cluster guide

2012-05-21 Thread Gregory Farnum
ould use a cachecade as well) > > Cons > Not as fast as SSD's > More rackspace required per server. > > Regards, > Quenten

RE: Designing a cluster guide

2012-05-21 Thread Quenten Grasso
Maybe good for journal will be two cheap MLC Intel drives on Sandforce (320/520), 120GB or 240GB, and HPA changed to 20-30GB only for

Re: Designing a cluster guide

2012-05-21 Thread Sławomir Skowron
an, they rock for journal. >> http://www.stec-inc.com/product/zeusram.php >> >> another interesting product is ddrdrive >> http://www.ddrdrive.com/

Re: Designing a cluster guide

2012-05-21 Thread Tomasz Paszkowski
On Linux boxes you may use output from iostat -x /dev/sda and connect it to any monitoring system like zabbix or cacti :-) On Mon, May 21, 2012 at 10:14 PM, Stefan Priebe wrote: > Am 21.05.2012 22:13, schrieb Tomasz Paszkowski: > >> Just to clarify. You'd like to measure I/O on those system
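A sketch of the extraction such a monitoring hook needs -- the field positions match older sysstat output and may differ on newer versions:

# Two extended samples 1s apart; the second reflects the interval, not since boot.
# In older sysstat, r/s and w/s (read/write IOPS) are columns 4 and 5.
iostat -dx 1 2 /dev/sda | awk '
  $1 == "Device:" { sample++ }
  sample == 2 && $1 == "sda" { print "read_iops=" $4, "write_iops=" $5 }'

The printed pair can then be fed to zabbix or cacti through their usual agent or script interfaces.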

Re: Designing a cluster guide

2012-05-21 Thread Stefan Priebe
Am 21.05.2012 22:13, schrieb Tomasz Paszkowski: Just to clarify. You'd like to measure I/O on those systems which are currently running on physical machines? IOPs, not just I/O. Stefan

Re: Designing a cluster guide

2012-05-21 Thread Tomasz Paszkowski
Just to clarify. You'd like to measure I/O on those systems which are currently running on physical machines? On Mon, May 21, 2012 at 10:11 PM, Stefan Priebe wrote: > Am 21.05.2012 17:12, schrieb Tomasz Paszkowski: > >> If you're using Qemu/KVM you can use 'info blockstats' command for >> measru

Re: Designing a cluster guide

2012-05-21 Thread Stefan Priebe
Am 21.05.2012 17:12, schrieb Tomasz Paszkowski: If you're using Qemu/KVM you can use the 'info blockstats' command for measuring I/O on a particular VM. I want to migrate physical servers to KVM. Any idea for that? Stefan

Re: Designing a cluster guide

2012-05-21 Thread Damien Churchill
On 21 May 2012 16:36, Tomasz Paszkowski wrote: > Project is indeed very interesting, but requires patching the kernel source. > For me using an lkm is safer ;) > I believe bcache is actually in the process of being mainlined and moved to a device mapper target, although I could be wrong about one or more

Re: Designing a cluster guide

2012-05-21 Thread Gregory Farnum
On Sat, May 19, 2012 at 1:37 AM, Stefan Priebe wrote: > Hi Greg, > > Am 17.05.2012 23:27, schrieb Gregory Farnum: > >>> It mentions for example "Fast CPU" for the mds system. What does fast >>> mean? Just the speed of one core? Or is ceph designed to use multi core? >>> Is multi core or more speed

Re: Designing a cluster guide

2012-05-21 Thread Tomasz Paszkowski
Project is indeed very interesting, but requires patching the kernel source. For me using an lkm is safer ;) On Mon, May 21, 2012 at 5:30 PM, Kiran Patil wrote: > Hello, > > Has someone looked into bcache (http://bcache.evilpiepirate.org/) ? > > It seems it is superior to flashcache. > > Lwn.net art

Re: Designing a cluster guide

2012-05-21 Thread Tomasz Paszkowski
If you're using Qemu/KVM you can use the 'info blockstats' command for measuring I/O on a particular VM. On Mon, May 21, 2012 at 5:05 PM, Stefan Priebe - Profihost AG wrote: > Am 21.05.2012 16:59, schrieb Christian Brunner: >> Apart from that you should calculate the sum of the IOPS your guests >> gen
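For a libvirt-managed guest the same monitor command is reachable without attaching to a console; the domain and disk names below are examples:

# Raw QEMU monitor output via libvirt's HMP passthrough.
virsh qemu-monitor-command --hmp myguest 'info blockstats'
# Or libvirt's own per-disk counters (read/write requests and bytes).
virsh domblkstat myguest vda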

Re: Designing a cluster guide

2012-05-21 Thread Tomasz Paszkowski
ssesting product is ddrdrive > http://www.ddrdrive.com/

Re: Designing a cluster guide

2012-05-21 Thread Christian Brunner
2012/5/21 Stefan Priebe - Profihost AG : > Am 20.05.2012 10:31, schrieb Christian Brunner: >>> That's exactly what I thought too but then you need a separate ceph / rbd >>> cluster for each type. >>> >>> Which will result in a minimum of: >>> 3x mon servers per type >>> 4x osd servers per type >>>
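If the goal is to avoid running one cluster per storage type, CRUSH placement rules are the usual way to steer individual pools onto a particular class of OSDs within a single cluster. A sketch of the edit cycle, with file names as placeholders:

# Export, decompile, edit, recompile, and re-inject the CRUSH map.
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# ...edit crushmap.txt: e.g. add a separate root for SSD-backed OSDs and a
# rule that places a given pool only under that root...
crushtool -c crushmap.txt -o crushmap.new
ceph osd setcrushmap -i crushmap.new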

Re: Designing a cluster guide

2012-05-21 Thread Christian Brunner
2012/5/20 Tim O'Donovan : >> - High performance Block Storage (RBD) >> >>   Many large SATA SSDs for the storage (probably in a RAID5 config) >>   stec zeusram ssd drive for the journal > > How do you think standard SATA disks would perform in comparison to > this, and is a separate journaling devic

Re: Designing a cluster guide

2012-05-21 Thread Stefan Priebe - Profihost AG
Am 20.05.2012 10:31, schrieb Christian Brunner: >> That's exactly what I thought too but then you need a separate ceph / rbd >> cluster for each type. >> >> Which will result in a minimum of: >> 3x mon servers per type >> 4x osd servers per type >> --- >> >> so you'll need a minimum of 12x osd syst

Re: Designing a cluster guide

2012-05-20 Thread Stefan Priebe
No, sorry, just wanted to clarify as you quoted the ssd part. Stefan Am 20.05.2012 um 11:46 schrieb Tim O'Donovan : >> He's talking about ssd's not normal sata disks. > > I realise that. I'm looking for similar advice and have been following > this thread. It didn't seem off topic to ask here. >

Re: Designing a cluster guide

2012-05-20 Thread Tim O'Donovan
> He's talking about ssd's not normal sata disks. I realise that. I'm looking for similar advice and have been following this thread. It didn't seem off topic to ask here. Regards, Tim O'Donovan

Re: Designing a cluster guide

2012-05-20 Thread Stefan Priebe
Am 20.05.2012 um 10:56 schrieb Tim O'Donovan : >> - High performance Block Storage (RBD) >> >> Many large SATA SSDs for the storage (probably in a RAID5 config) >> stec zeusram ssd drive for the journal > > How do you think standard SATA disks would perform in comparison to > this, and is a sep

Re: Designing a cluster guide

2012-05-20 Thread Tim O'Donovan
> - High performance Block Storage (RBD) > > Many large SATA SSDs for the storage (probably in a RAID5 config) > stec zeusram ssd drive for the journal How do you think standard SATA disks would perform in comparison to this, and is a separate journaling device really necessary? Perhaps three

Re: Designing a cluster guide

2012-05-20 Thread Christian Brunner
2012/5/20 Stefan Priebe : > Am 20.05.2012 10:19, schrieb Christian Brunner: > >> - Cheap Object Storage (S3): >> >>   Many 3,5'' SATA Drives for the storage (probably in a RAID config) >>   A small and cheap SSD for the journal >> >> - Basic Block Storage (RBD): >> >>   Many 2,5'' SATA Drives for t

Re: Designing a cluster guide

2012-05-20 Thread Stefan Priebe
Am 20.05.2012 10:19, schrieb Christian Brunner: - Cheap Object Storage (S3): Many 3,5'' SATA Drives for the storage (probably in a RAID config) A small and cheap SSD for the journal - Basic Block Storage (RBD): Many 2,5'' SATA Drives for the storage (RAID10 and/or multiple OSDs) Sm

Re: Designing a cluster guide

2012-05-20 Thread Christian Brunner
2012/5/20 Stefan Priebe : > Am 19.05.2012 18:15, schrieb Alexandre DERUMIER: > >> Hi, >> >> For your journal, if you have money, you can use >> >> stec zeusram ssd drive. (around 2000€ /8GB / 10 iops read/write with >> 4k block). >> I'm using them with zfs san, they rock for journal. >> http:

Re: Designing a cluster guide

2012-05-20 Thread Alexandre DERUMIER
,) Am 19.05.2012 18:15, schrieb Alexandre DERUMIER: > Hi

Re: Designing a cluster guide

2012-05-20 Thread Stefan Priebe
Am 19.05.2012 18:15, schrieb Alexandre DERUMIER: Hi, For your journal, if you have money, you can use stec zeusram ssd drive. (around 2000€ /8GB / 10 iops read/write with 4k block). I'm using them with zfs san, they rock for journal. http://www.stec-inc.com/product/zeusram.php another i

Re: Designing a cluster guide

2012-05-19 Thread Alexandre DERUMIER
http://www.ddrdrive.com/ Hi Greg, Am 17.05.2012 23:27, schrieb Gregory Farnum: >> It mentions for

Re: Designing a cluster guide

2012-05-19 Thread Stefan Priebe
Hi Greg, Am 17.05.2012 23:27, schrieb Gregory Farnum: It mentions for example "Fast CPU" for the mds system. What does fast mean? Just the speed of one core? Or is ceph designed to use multi core? Is multi core or more speed important? Right now, it's primarily the speed of a single core. The M

Re: Designing a cluster guide

2012-05-17 Thread Gregory Farnum
Sorry this got left for so long... On Thu, May 10, 2012 at 6:23 AM, Stefan Priebe - Profihost AG wrote: > Hi, > > the "Designing a cluster guide" > http://wiki.ceph.com/wiki/Designing_a_cluster is pretty good but it > still leaves some questions unanswered. > > It mentions for example "Fast CPU"