Re: [ceph-users] maximizing VM performance (on CEPH)
Hello,

On Sat, 18 Jan 2014 18:51:29 -0500 Gautam Saxena wrote:

> I'm trying to maximize ephemeral Windows 7 32-bit performance with
> CEPH's RBD as back-end storage engine. (I'm not worried about data loss,
> as these VMs are all ephemeral, but I am worried about performance and
> responsiveness of the VMs.) My questions are:

The first thing you probably want to determine is the amount of writes
happening on those RBDs: IOPS, and also the type of writes, as in writes
forced out by flushes from the OS or application versus writes that are
allowed to accumulate in disk caches.

> 1) Are there any recommendations or best practices on the CEPH RBD cache
> settings? I don't fully understand how the above parameters come into
> play. Can someone provide some clarification, perhaps through a
> quick-and-dirty scenario example?
>
> The defaults seem low. So, I'm thinking of setting the RBD cache size to
> something like 1 GB, the cache max dirty to 1 GB, the cache target dirty
> to 500 MB, and the cache max dirty age to say 30 seconds.

Your servers must be filled to the brim with RAM if you are considering
giving each RBD mapping a 1 GB cache. Again, see above: this will depend
on the type of writes, but I think you'll quickly see diminishing returns
here. The RBD cache works best for consolidating lots of small, cacheable
writes which would otherwise result in many IOPS on the storage backend
(the OSDs). Personally I'm thinking of doubling the defaults once I get to
that stage. You might also find that giving the OS that RAM instead could
be beneficial.

> 2) What do people think of my using a separate pool with a replication
> factor of 1 for the "copy-on-write" portion of the clones of these
> *ephemeral* VMs? Would this further improve performance for these
> *ephemeral* Windows VMs?

This will speed things up, since only one OSD needs to confirm the write
(to an SSD-backed journal, since you're that performance conscious).
However, no matter how little you mind data loss, I would be worried about
the fact that ALL of these RBDs will become unavailable if just one disk
(OSD) fails.

> 3) In addition to #2, what if I made this additional pool (of replication
> factor 1) reside on the host node's RAM (ramdisk)? Pros/cons to this
> idea? (I'm hoping this would minimize the impact of boot storms and also
> improve overall responsiveness.)

How many VMs and associated RBDs are we talking about here? If you create
a pool on a ramdisk of just one storage node, you're likely to saturate
the network interface of that node. And of course that pool would have to
be created/populated by some scripts of your own doing before Ceph is
started...

Your best bet is probably a design where the number of OSDs per node is
matched to that node's capabilities (CPU/RAM and, most of all, network
bandwidth) and then to deploy as many of these nodes as sensible.
For example, with a single 10GigE public network interface per OSD node,
10 disks (OSDs) would easily be able to saturate that link. However,
that's just bandwidth; if you're IOPS bound (and mostly that seems to be
the case) then more disks per node can make sense. Of course more storage
nodes will help even more; it all becomes a question of how much money
and rack space you're willing to spend. ^o^

Regards,

Christian
--
Christian Balzer        Network/Systems Engineer
ch...@gol.com           Global OnLine Japan/Fusion Communications
http://www.gol.com/
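For reference, if you do experiment with the cache, those settings belong in
the [client] section of ceph.conf on the hypervisor/librbd side. A minimal
sketch with roughly doubled defaults -- the numbers are purely illustrative,
not a recommendation:

    [client]
        rbd cache = true
        # defaults are 32 MB / 24 MB / 16 MB / 1 s; doubled here as an example
        rbd cache size = 67108864
        rbd cache max dirty = 50331648
        rbd cache target dirty = 33554432
        rbd cache max dirty age = 2

Note that this cache is allocated per RBD mapping (per VM disk), so the
totals add up quickly across many guests.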
[ceph-users] Problem with Mounting to Pool 0 with CephFS
Hi all,

I have three pools, and I want to mount pool 0 with CephFS. When I try to
set the layout by changing the pool to 0 (cephfs /mnt/oruafs/pool0/
set_layout -p 0), it will not be set to pool 0, while I am able to set it
to pool 1 or 2. I'd appreciate it if anyone could give me their opinion on
this problem.

Thanks in advance,
Sherry
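In case it helps with debugging, the pool IDs and the layout the kernel
client actually sees can be checked with something like the following (the
mount path below is just the one from this setup):

    # list pools with their numeric IDs
    ceph osd lspools
    # show the layout currently applied to the directory
    cephfs /mnt/oruafs/pool0 show_layout

If the target pool is not registered as a data pool for the MDS, set_layout
will typically refuse it; a pool can be added to the MDS data pools with,
for example:

    ceph mds add_data_pool 0

(That last command is only a sketch -- pool 0 is normally the default 'data'
pool and should already be usable, so this may not be the issue here.)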
Re: [ceph-users] Gentoo & ceph 0.67 & pg stuck After fresh Installation
On Sun, 19 Jan 2014, Sherry Shahbazi wrote:
> Hi Philipp,
>
> Installing "ntp" on each server might solve the clock skew problem.

At the very least a one-time 'ntpdate time.apple.com' should make that
issue go away for the time being.

s

> Best Regards
> Sherry
>
>
> On Sunday, January 19, 2014 6:34 AM, Philipp Strobl wrote:
> Hi Aaron,
>
> sorry for taking so long...
>
> After I added the OSDs and buckets to the crushmap I get:
>
> ceph osd tree
> # id    weight  type name       up/down reweight
> -3      1       host dp2
> 1       1               osd.1   up      1
> -2      1       host dp1
> 0       1               osd.0   up      1
> -1      0       root default
>
> Both OSDs are up and in:
>
> ceph osd stat
> e25: 2 osds: 2 up, 2 in
>
> ceph health detail says:
>
> HEALTH_WARN 292 pgs stuck inactive; 292 pgs stuck unclean; clock skew detected on mon.vmsys-dp2
> pg 3.f is stuck inactive since forever, current state creating, last acting []
> pg 0.c is stuck inactive since forever, current state creating, last acting []
> pg 1.d is stuck inactive since forever, current state creating, last acting []
> pg 2.e is stuck inactive since forever, current state creating, last acting []
> pg 3.8 is stuck inactive since forever, current state creating, last acting []
> pg 0.b is stuck inactive since forever, current state creating, last acting []
> pg 1.a is stuck inactive since forever, current state creating, last acting []
> ...
> pg 2.c is stuck unclean since forever, current state creating, last acting []
> pg 1.f is stuck unclean since forever, current state creating, last acting []
> pg 0.e is stuck unclean since forever, current state creating, last acting []
> pg 3.d is stuck unclean since forever, current state creating, last acting []
> pg 2.f is stuck unclean since forever, current state creating, last acting []
> pg 1.c is stuck unclean since forever, current state creating, last acting []
> pg 0.d is stuck unclean since forever, current state creating, last acting []
> pg 3.e is stuck unclean since forever, current state creating, last acting []
> mon.vmsys-dp2 addr 10.0.0.22:6789/0 clock skew 16.4914s > max 0.05s (latency 0.00666228s)
>
> All pgs have the same status.
>
> Is the clock skew an important factor?
>
> I compiled ceph like this - eix ceph:
> ...
> Installed versions: 0.67{tbz2}(00:54:50 01/08/14)(fuse -debug -gtk -libatomic -radosgw -static-libs -tcmalloc)
>
> The cluster name is vmsys, the servers are dp1 and dp2.
> Config:
>
> [global]
> auth cluster required = none
> auth service required = none
> auth client required = none
> auth supported = none
> fsid = 265d12ac-e99d-47b9-9651-05cb2b4387a6
>
> [mon.vmsys-dp1]
> host = dp1
> mon addr = INTERNAL-IP1:6789
> mon data = /var/lib/ceph/mon/ceph-vmsys-dp1
>
> [mon.vmsys-dp2]
> host = dp2
> mon addr = INTERNAL-IP2:6789
> mon data = /var/lib/ceph/mon/ceph-vmsys-dp2
>
> [osd]
> [osd.0]
> host = dp1
> devs = /dev/sdb1
> osd_mkfs_type = xfs
> osd data = /var/lib/ceph/osd/ceph-0
>
> [osd.1]
> host = dp2
> devs = /dev/sdb1
> osd_mkfs_type = xfs
> osd data = /var/lib/ceph/osd/ceph-1
>
> [mds.vmsys-dp1]
> host = dp1
>
> [mds.vmsys-dp2]
> host = dp2
>
> Hope this is helpful - I really don't know at the moment what is wrong.
>
> Perhaps I should try the manual-deploy howto from Inktank, or do you have an idea?
>
> Best,
> Philipp
>
> http://www.pilarkto.net
>
> On 10.01.2014 20:50, Aaron Ten Clay wrote:
> Hi Philipp,
>
> It sounds like perhaps you don't have any OSDs that are both "up" and
> "in" the cluster. Can you provide the output of "ceph health detail"
> and "ceph osd tree" for us?
>
> As for the "howto" you mentioned, I added some notes to the top but
> never really updated the body of the document... I'm not entirely sure
> it's straightforward or up to date any longer :) I'd be happy to make
> changes as needed but I haven't manually deployed a cluster in several
> months, and Inktank now has a manual deployment guide for Ceph at
> http://ceph.com/docs/master/install/manual-deployment/
>
> -Aaron
>
> On Fri, Jan 10, 2014 at 6:57 AM, Philipp Strobl wrote:
> Hi,
>
> After managing to deploy ceph manually on Gentoo (the ceph-disk tools
> are under /usr/usr/sbin...), the daemons come up properly,
> but "ceph health" shows a warning for all pgs stuck unclean.
> This is strange behavior for a clean new installation, I guess.
>
> So the question is: am I doing something wrong, or can I reset the
> PGs to get the cluster running?
>
> Also, the rbd client and mount.ceph hang with no answer.
>
> I used this howto:
> https://github.com/aarontc/ansible-playbooks/blob/master/roles/ceph.notes-on-deployment.rst
>
> Resp. our German translation/expansion:
> http://wiki.open-laboratory.de/Intern:IT:HowTo:Ceph
>
> With auth support ... = none
>
> Best regards
> and thank you in advance,
>
> Philipp Strobl
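For what it's worth, one thing that stands out in that 'ceph osd tree'
output is that the 'default' root has weight 0 and neither host bucket
appears under it, which would leave CRUSH unable to place any PGs (hence
"creating ... last acting []"). If that is indeed the case, a rough sketch
of a fix, assuming the stock crush rules that start from root=default,
would be:

    # link the host buckets under the default root so the rules can find them
    ceph osd crush move dp1 root=default
    ceph osd crush move dp2 root=default
    # and clear the ~16s clock skew on dp2, e.g. with a one-shot sync
    ntpdate time.apple.com

This is only a guess from the pasted output, not a confirmed diagnosis.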
Re: [ceph-users] Low write speed
Hi, Ирек Нургаязович.

The number of PGs is the default: "pg_num 64 pgp_num 64".

17.01.2014, 14:39, "Ирек Фасихов":
> Hi, Виталий.
> Is the number of PGs sufficient?
>
> 2014/1/17 Никитенко Виталий
>> Good day! Please help me solve the problem. The setup is as follows:
>> an ESXi server with 1Gb NICs. It has a local store, store2Tb, and two
>> iSCSI stores connected to the second server.
>> The second server is a Supermicro: two 1TB HDDs (LSI 9261-8i with
>> battery), 8 CPU cores, 32 GB RAM and two 1Gb NICs. On /dev/sda I
>> installed Ubuntu 12 and ceph-emperor. The /dev/sdb disk is placed under
>> osd.0.
>> What I do next:
>> # rbd create esxi
>> # rbd map esxi
>>
>> I get /dev/rbd1, which is shared using iscsitarget:
>>
>> # cat ietd.conf
>> Target iqn.2014-01.ru.ceph:rados.iscsi.001
>>     Lun 0 Path=/dev/rbd1,Type=blockio,ScsiId=f817ab
>> Target iqn.2014-01.ru.ceph:rados.iscsi.002
>>     Lun 1 Path=/opt/storlun0.bin,Type=fileio,ScsiId=lun1,ScsiSN=lun1
>>
>> For testing I also created an iSCSI store on /dev/sda (Lun 1).
>> When migrating a virtual machine from store2Tb to Lun0 (ceph), the
>> migration rate is 400-450 Mbit/s.
>> When migrating a VM from store2Tb to Lun1 (ubuntu file), the migration
>> rate is 800-900 Mbit/s.
>> From this I conclude that the rate is limited neither by the disk
>> (controller) nor by the network.
>> I tried formatting the OSD with ext4, xfs and btrfs, but the speed is
>> the same. For me, speed is very important, especially since we plan to
>> move to 10Gb network links.
>> Thanks.
>> Vitaliy
>
> --
> Best regards, Фасихов Ирек Нургаязович
> Mobile: +79229045757
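For reference, if the PG count does turn out to be too low for the number of
OSDs, it can be raised on a live pool (a sketch, assuming the image lives in
the default 'rbd' pool; pgp_num has to be raised to match pg_num afterwards):

    # rough rule of thumb: (100 * number of OSDs) / replica count, rounded up to a power of two
    ceph osd pool set rbd pg_num 128
    ceph osd pool set rbd pgp_num 128

With only a couple of OSDs behind a single 1Gb link, though, 64 PGs are
unlikely to be the limiting factor.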
Re: [ceph-users] Gentoo & ceph 0.67 & pg stuck After fresh Installation
Hi Philipp,

Installing "ntp" on each server might solve the clock skew problem.

Best Regards
Sherry


On Sunday, January 19, 2014 6:34 AM, Philipp Strobl wrote:

Hi Aaron,

sorry for taking so long...

After I added the OSDs and buckets to the crushmap I get:

ceph osd tree
# id    weight  type name       up/down reweight
-3      1       host dp2
1       1               osd.1   up      1
-2      1       host dp1
0       1               osd.0   up      1
-1      0       root default

Both OSDs are up and in:

ceph osd stat
e25: 2 osds: 2 up, 2 in

ceph health detail says:

HEALTH_WARN 292 pgs stuck inactive; 292 pgs stuck unclean; clock skew detected on mon.vmsys-dp2
pg 3.f is stuck inactive since forever, current state creating, last acting []
pg 0.c is stuck inactive since forever, current state creating, last acting []
pg 1.d is stuck inactive since forever, current state creating, last acting []
pg 2.e is stuck inactive since forever, current state creating, last acting []
pg 3.8 is stuck inactive since forever, current state creating, last acting []
pg 0.b is stuck inactive since forever, current state creating, last acting []
pg 1.a is stuck inactive since forever, current state creating, last acting []
...
pg 2.c is stuck unclean since forever, current state creating, last acting []
pg 1.f is stuck unclean since forever, current state creating, last acting []
pg 0.e is stuck unclean since forever, current state creating, last acting []
pg 3.d is stuck unclean since forever, current state creating, last acting []
pg 2.f is stuck unclean since forever, current state creating, last acting []
pg 1.c is stuck unclean since forever, current state creating, last acting []
pg 0.d is stuck unclean since forever, current state creating, last acting []
pg 3.e is stuck unclean since forever, current state creating, last acting []
mon.vmsys-dp2 addr 10.0.0.22:6789/0 clock skew 16.4914s > max 0.05s (latency 0.00666228s)

All pgs have the same status.

Is the clock skew an important factor?

I compiled ceph like this - eix ceph:
...
Installed versions: 0.67{tbz2}(00:54:50 01/08/14)(fuse -debug -gtk -libatomic -radosgw -static-libs -tcmalloc)

The cluster name is vmsys, the servers are dp1 and dp2.
Config:

[global]
auth cluster required = none
auth service required = none
auth client required = none
auth supported = none
fsid = 265d12ac-e99d-47b9-9651-05cb2b4387a6

[mon.vmsys-dp1]
host = dp1
mon addr = INTERNAL-IP1:6789
mon data = /var/lib/ceph/mon/ceph-vmsys-dp1

[mon.vmsys-dp2]
host = dp2
mon addr = INTERNAL-IP2:6789
mon data = /var/lib/ceph/mon/ceph-vmsys-dp2

[osd]
[osd.0]
host = dp1
devs = /dev/sdb1
osd_mkfs_type = xfs
osd data = /var/lib/ceph/osd/ceph-0

[osd.1]
host = dp2
devs = /dev/sdb1
osd_mkfs_type = xfs
osd data = /var/lib/ceph/osd/ceph-1

[mds.vmsys-dp1]
host = dp1

[mds.vmsys-dp2]
host = dp2

Hope this is helpful - I really don't know at the moment what is wrong.

Perhaps I should try the manual-deploy howto from Inktank, or do you have an idea?

Best,
Philipp

http://www.pilarkto.net

On 10.01.2014 20:50, Aaron Ten Clay wrote:
> Hi Philipp,
>
> It sounds like perhaps you don't have any OSDs that are both "up" and "in"
> the cluster. Can you provide the output of "ceph health detail" and
> "ceph osd tree" for us?
>
> As for the "howto" you mentioned, I added some notes to the top but never
> really updated the body of the document... I'm not entirely sure it's
> straightforward or up to date any longer :) I'd be happy to make changes
> as needed but I haven't manually deployed a cluster in several months,
> and Inktank now has a manual deployment guide for Ceph at
> http://ceph.com/docs/master/install/manual-deployment/
>
> -Aaron
>
> On Fri, Jan 10, 2014 at 6:57 AM, Philipp Strobl wrote:
>> Hi,
>>
>> After managing to deploy ceph manually on Gentoo (the ceph-disk tools are
>> under /usr/usr/sbin...), the daemons come up properly, but "ceph health"
>> shows a warning for all pgs stuck unclean.
>> This is strange behavior for a clean new installation, I guess.
>>
>> So the question is: am I doing something wrong, or can I reset the PGs to
>> get the cluster running?
>>
>> Also, the rbd client and mount.ceph hang with no answer.
>>
>> I used this howto:
>> https://github.com/aarontc/ansible-playbooks/blob/master/roles/ceph.notes-on-deployment.rst
>>
>> Resp. our German translation/expansion:
>> http://wiki.open-laboratory.de/Intern:IT:HowTo:Ceph
>>
>> With auth support ... = none
>>
>> Best regards
>> and thank you in advance,
>>
>> Philipp Strobl
>
> --
> Aaron Ten Clay
> http://www.aarontc.com/
Re: [ceph-users] writing data to ceph rbd
On Sun, Jan 19, 2014 at 2:32 PM, kiriti krishna wrote:
> Hi,
>     I have successfully installed ceph, and created and configured ceph
> rados, ceph rbd and cephfs. Now I want to write data to ceph rbd. I have
> created a ceph rbd copy-on-write clone but I'm not finding any info on
> how to write to ceph rbd. Can you please help me with any reference on
> how to write data to rbd? Also, where will all of these rbd snapshots
> and their clones be stored?
> Thanks and Regards,
> Kiriti

rbd is short for block device: you have to "map" an rbd image to a block
device in order to use it (read from or write to it).

    $ rbd create <image_name> --size <size>
    $ rbd map <image_name>

will map the <image_name> image to a /dev/rbd block device. You can then
use this block device like any other block device on your system: create a
filesystem on it, use it as a raw block device, etc.

    $ rbd showmapped

will give you the /dev/rbd id. The other option is to use fuse. See the
documentation links below for details.

http://ceph.com/docs/master/rbd/rbd-ko/ - rbd map
http://ceph.com/docs/master/man/8/rbd-fuse/ - fuse
http://ceph.com/docs/master/rbd/rbd/ - everything rbd

Thanks,

                Ilya
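To make that concrete, an end-to-end sketch (the image name, size and mount
point are only examples, and this assumes the default 'rbd' pool and a
kernel with the rbd module available):

    $ rbd create test-image --size 4096        # 4 GB image in the default 'rbd' pool
    $ sudo rbd map test-image                  # exposes it as a /dev/rbd<N> device
    $ rbd showmapped                           # shows which /dev/rbd<N> it got
    $ sudo mkfs.ext4 /dev/rbd0                 # assuming it came up as /dev/rbd0
    $ sudo mkdir -p /mnt/test-image
    $ sudo mount /dev/rbd0 /mnt/test-image
    $ echo "hello rbd" | sudo tee /mnt/test-image/hello.txt   # normal file I/O now lands in RADOS
    $ sudo umount /mnt/test-image
    $ sudo rbd unmap /dev/rbd0

Snapshots and clones of an image are stored as RADOS objects in whatever
pool they were created in, alongside the image data; they do not occupy
space on the client machine.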
Re: [ceph-users] Ceph cluster is unreachable because of authentication failure
Thanks Sage. I just captured part of the log (it was growing fast); the
process did not hang, but I saw the same pattern repeatedly. Should I
increase the log level and send it over email (it is constantly
reproducible)?

Thanks,
Guang

On Jan 18, 2014, at 12:05 AM, Sage Weil wrote:

> On Fri, 17 Jan 2014, Guang wrote:
>> Thanks Sage.
>>
>> I further narrowed the problem down to #any command using the paxos
>> service would hang#; the details follow.
>>
>> 1. I am able to run ceph status / osd dump, etc.; however, the results
>> are out of date (though I stopped all OSDs, that is not reflected in the
>> ceph status report).
>>
>> -bash-4.1$ sudo ceph -s
>>   cluster b9cb3ea9-e1de-48b4-9e86-6921e2c537d2
>>   health HEALTH_WARN 2797 pgs degraded; 107 pgs down; 7503 pgs peering; 917 pgs recovering; 6079 pgs recovery_wait; 2957 pgs stale; 7771 pgs stuck inactive; 2957 pgs stuck stale; 16567 pgs stuck unclean; recovery 54346804/779462977 degraded (6.972%); 9/259724199 unfound (0.000%); 2 near full osd(s); 57/751 in osds are down; noout,nobackfill,norecover,noscrub,nodeep-scrub flag(s) set
>>   monmap e1: 3 mons at {osd151=10.194.0.68:6789/0,osd152=10.193.207.130:6789/0,osd153=10.193.207.131:6789/0}, election epoch 123278, quorum 0,1,2 osd151,osd152,osd153
>>   osdmap e134893: 781 osds: 694 up, 751 in
>>   pgmap v2388518: 22203 pgs: 26 inactive, 14 active, 79 stale+active+recovering, 5020 active+clean, 242 stale, 4352 active+recovery_wait, 616 stale+active+clean, 177 active+recovering+degraded, 6714 peering, 925 stale+active+recovery_wait, 86 down+peering, 1547 active+degraded, 32 stale+active+recovering+degraded, 648 stale+peering, 21 stale+down+peering, 239 stale+active+degraded, 651 active+recovery_wait+degraded, 30 remapped+peering, 151 stale+active+recovery_wait+degraded, 4 stale+remapped+peering, 629 active+recovering; 79656 GB data, 363 TB used, 697 TB / 1061 TB avail; 54346804/779462977 degraded (6.972%); 9/259724199 unfound (0.000%)
>>   mdsmap e1: 0/0/1 up
>>
>> 2. If I run a command which uses paxos, the command will hang forever;
>> this includes ceph osd set noup (and also the commands the OSD sends to
>> the monitor when being started (create-or-move)).
>>
>> I attached the corresponding monitor log (it looks like a bug).
>
> I see the osd set command coming through, but it arrives while paxos is
> converging and the log seems to end before the mon would normally process
> the delayed messages. Is there a reason why the log fragment you attached
> ends there, or did the process hang or something?
>
> Thanks-
> sage
>
>> I
>>
>> On Jan 17, 2014, at 1:35 AM, Sage Weil wrote:
>>
>>> Hi Guang,
>>>
>>> On Thu, 16 Jan 2014, Guang wrote:
>>>> I still have had no luck figuring out what is causing the
>>>> authentication failure, so in order to get the cluster back, I tried:
>>>> 1. stop all daemons (mon & osd)
>>>> 2. change the configuration to disable cephx
>>>> 3. start mon daemons (3 in total)
>>>> 4. start osd daemons one by one
>>>>
>>>> After finishing step 3, the cluster is reachable ('ceph -s' gives
>>>> results):
>>>>
>>>> -bash-4.1$ sudo ceph -s
>>>>   cluster b9cb3ea9-e1de-48b4-9e86-6921e2c537d2
>>>>   health HEALTH_WARN 2797 pgs degraded; 107 pgs down; 7503 pgs peering; 917 pgs recovering; 6079 pgs recovery_wait; 2957 pgs stale; 7771 pgs stuck inactive; 2957 pgs stuck stale; 16567 pgs stuck unclean; recovery 54346804/779462977 degraded (6.972%); 9/259724199 unfound (0.000%); 2 near full osd(s); 57/751 in osds are down; noout,nobackfill,norecover,noscrub,nodeep-scrub flag(s) set
>>>>   monmap e1: 3 mons at {osd151=10.194.0.68:6789/0,osd152=10.193.207.130:6789/0,osd153=10.193.207.131:6789/0}, election epoch 106022, quorum 0,1,2 osd151,osd152,osd153
>>>>   osdmap e134893: 781 osds: 694 up, 751 in
>>>>   pgmap v2388518: 22203 pgs: 26 inactive, 14 active, 79 stale+active+recovering, 5020 active+clean, 242 stale, 4352 active+recovery_wait, 616 stale+active+clean, 177 active+recovering+degraded, 6714 peering, 925 stale+active+recovery_wait, 86 down+peering, 1547 active+degraded, 32 stale+active+recovering+degraded, 648 stale+peering, 21 stale+down+peering, 239 stale+active+degraded, 651 active+recovery_wait+degraded, 30 remapped+peering, 151 stale+active+recovery_wait+degraded, 4 stale+remapped+peering, 629 active+recovering; 79656 GB data, 363 TB used, 697 TB / 1061 TB avail; 54346804/779462977 degraded (6.972%); 9/259724199 unfound (0.000%)
>>>>   mdsmap e1: 0/0/1 up
>>>>
>>>> (At this point, all OSDs should be down.) When I tried to start an OSD
>>>> daemon, the starting script hung, and the hanging process is:
>>>>
>>>> root 80497 80496 0 08:18 pts/0 00:00:00 python /usr/bin/ceph --name=osd.22 --keyring=/var/lib/ceph/osd/ceph-22/keyring osd crush create-or-move -- 22 0.40 root=default host=osd173
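For what it's worth, while paxos-dependent commands hang, the monitor log
verbosity can usually still be raised at runtime through the admin socket
rather than over the wire. A sketch, using one of the mons from the monmap
above (the socket path shown is the default and may differ on your install):

    sudo ceph --admin-daemon /var/run/ceph/ceph-mon.osd151.asok config set debug_mon 20
    sudo ceph --admin-daemon /var/run/ceph/ceph-mon.osd151.asok config set debug_paxos 20
    sudo ceph --admin-daemon /var/run/ceph/ceph-mon.osd151.asok config set debug_ms 1

The resulting /var/log/ceph/ceph-mon.osd151.log should then capture a full
paxos round rather than a fragment.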
[ceph-users] writing data to ceph rbd
Hi,

I have successfully installed ceph, and created and configured ceph rados,
ceph rbd and cephfs. Now I want to write data to ceph rbd. I have created a
ceph rbd copy-on-write clone, but I'm not finding any info on how to write
to ceph rbd. Can you please help me with any reference on how to write data
to rbd? Also, where will all of these rbd snapshots and their clones be
stored?

Thanks and Regards,
Kiriti