Hello German
Could you remind me what type of bus/adapter your HDs are connected to? (A couple of commands that would show this are sketched at the bottom of this message.)

Julien

> On 28 Dec 2013, at 00:10, German Anders <gand...@despegar.com> wrote:
>
> Hi Mark,
> I've already made those changes but the performance is almost the same. I ran another test with a dd statement and the results were the same (I used all of the 73GB disks for the OSDs and also put the journal inside the OSD device). I also noticed that the network is at Gb:
>
> ceph@ceph-node04:~$ sudo rbd -m 10.1.1.151 -p ceph-cloud --size 102400 create rbdCloud -k /etc/ceph/ceph.client.admin.keyring
> ceph@ceph-node04:~$ sudo rbd map -m 10.1.1.151 rbdCloud --pool ceph-cloud --id admin -k /etc/ceph/ceph.client.admin.keyring
> ceph@ceph-node04:~$ sudo mkdir /mnt/rbdCloud
> ceph@ceph-node04:~$ sudo mkfs.xfs -l size=64m,lazy-count=1 -f /dev/rbd/ceph-cloud/rbdCloud
> log stripe unit (4194304 bytes) is too large (maximum is 256KiB)
> log stripe unit adjusted to 32KiB
> meta-data=/dev/rbd/ceph-cloud/rbdCloud isize=256    agcount=17, agsize=1637376 blks
>          =                             sectsz=512   attr=2, projid32bit=0
> data     =                             bsize=4096   blocks=26214400, imaxpct=25
>          =                             sunit=1024   swidth=1024 blks
> naming   =version 2                    bsize=4096   ascii-ci=0
> log      =internal log                 bsize=4096   blocks=16384, version=2
>          =                             sectsz=512   sunit=8 blks, lazy-count=1
> realtime =none                         extsz=4096   blocks=0, rtextents=0
> ceph@ceph-node04:~$ sudo mount /dev/rbd/ceph-cloud/rbdCloud /mnt/rbdCloud
> ceph@ceph-node04:~$ cd /mnt/rbdCloud
> ceph@ceph-node04:/mnt/rbdCloud$ for i in 1 2 3 4; do sudo dd if=/dev/zero of=a bs=1M count=1000 conv=fdatasync; done
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 10.2545 s, 102 MB/s
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 10.0554 s, 104 MB/s
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 10.2352 s, 102 MB/s
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 10.1197 s, 104 MB/s
> ceph@ceph-node04:/mnt/rbdCloud$
>
> OSD tree:
>
> ceph@ceph-node05:~/ceph-cluster-prd$ sudo ceph osd tree
> # id    weight  type name       up/down reweight
> -1      3.43    root default
> -2      0.6299          host ceph-node01
> 12      0.06999                 osd.12  up      1
> 13      0.06999                 osd.13  up      1
> 14      0.06999                 osd.14  up      1
> 15      0.06999                 osd.15  up      1
> 16      0.06999                 osd.16  up      1
> 17      0.06999                 osd.17  up      1
> 18      0.06999                 osd.18  up      1
> 19      0.06999                 osd.19  up      1
> 20      0.06999                 osd.20  up      1
> -3      0.6999          host ceph-node02
> 22      0.06999                 osd.22  up      1
> 23      0.06999                 osd.23  up      1
> 24      0.06999                 osd.24  up      1
> 25      0.06999                 osd.25  up      1
> 26      0.06999                 osd.26  up      1
> 27      0.06999                 osd.27  up      1
> 28      0.06999                 osd.28  up      1
> 29      0.06999                 osd.29  up      1
> 30      0.06999                 osd.30  up      1
> 31      0.06999                 osd.31  up      1
> -4      0.6999          host ceph-node03
> 32      0.06999                 osd.32  up      1
> 33      0.06999                 osd.33  up      1
> 34      0.06999                 osd.34  up      1
> 35      0.06999                 osd.35  up      1
> 36      0.06999                 osd.36  up      1
> 37      0.06999                 osd.37  up      1
> 38      0.06999                 osd.38  up      1
> 39      0.06999                 osd.39  up      1
> 40      0.06999                 osd.40  up      1
> 41      0.06999                 osd.41  up      1
> -5      0.6999          host ceph-node04
> 0       0.06999                 osd.0   up      1
> 1       0.06999                 osd.1   up      1
> 2       0.06999                 osd.2   up      1
> 3       0.06999                 osd.3   up      1
> 4       0.06999                 osd.4   up      1
> 5       0.06999                 osd.5   up      1
> 6       0.06999                 osd.6   up      1
> 7       0.06999                 osd.7   up      1
> 8       0.06999                 osd.8   up      1
> 9       0.06999                 osd.9   up      1
> -6      0.6999          host ceph-node05
> 10      0.06999                 osd.10  up      1
> 11      0.06999                 osd.11  up      1
> 42      0.06999                 osd.42  up      1
> 43      0.06999                 osd.43  up      1
> 44      0.06999                 osd.44  up      1
> 45      0.06999                 osd.45  up      1
> 46      0.06999                 osd.46  up      1
> 47      0.06999                 osd.47  up      1
> 48      0.06999                 osd.48  up      1
> 49      0.06999                 osd.49  up      1
>
> Any ideas?
>
> Thanks in advance,
>
> German Anders
>
>> --- Original message ---
>> Subject: Re: [ceph-users] Cluster Performance very Poor
>> From: Mark Nelson <mark.nel...@inktank.com>
>> To: <ceph-users@lists.ceph.com>
>> Date: Friday, 27/12/2013 15:39
>>
>>> On 12/27/2013 12:19 PM, German Anders wrote:
>>> Hi Cephers,
>>>
>>> I've run a rados bench to measure the throughput of the cluster, and found that the performance is really poor:
>>>
>>> The setup is the following:
>>>
>>> OS: Ubuntu 12.10 Server 64 bits
>>>
>>> ceph-node01 (mon)     10.77.0.101   ProLiant BL460c G7   32GB   8 x 2 GHz
>>>                       10.1.1.151    D2200sb Storage Blade (Firmware: 2.30)
>>> ceph-node02 (mon)     10.77.0.102   ProLiant BL460c G7   64GB   8 x 2 GHz
>>>                       10.1.1.152    D2200sb Storage Blade (Firmware: 2.30)
>>> ceph-node03 (mon)     10.77.0.103   ProLiant BL460c G6   32GB   8 x 2 GHz
>>>                       10.1.1.153    D2200sb Storage Blade (Firmware: 2.30)
>>> ceph-node04           10.77.0.104   ProLiant BL460c G7   32GB   8 x 2 GHz
>>>                       10.1.1.154    D2200sb Storage Blade (Firmware: 2.30)
>>> ceph-node05 (deploy)  10.77.0.105   ProLiant BL460c G6   32GB   8 x 2 GHz
>>>                       10.1.1.155    D2200sb Storage Blade (Firmware: 2.30)
>>
>> If your servers have controllers with writeback cache, please make sure it is enabled as that will likely help.
>>
>>> ceph-node01:
>>>
>>> /dev/sda   73G  (OSD)
>>> /dev/sdb   73G  (OSD)
>>> /dev/sdc   73G  (OSD)
>>> /dev/sdd   73G  (OSD)
>>> /dev/sde   73G  (OSD)
>>> /dev/sdf   73G  (OSD)
>>> /dev/sdg   73G  (OSD)
>>> /dev/sdh   73G  (OSD)
>>> /dev/sdi   73G  (OSD)
>>> /dev/sdj   73G  (Journal)
>>> /dev/sdk  500G  (OSD)
>>> /dev/sdl  500G  (OSD)
>>> /dev/sdn  146G  (Journal)
>>>
>>> ceph-node02:
>>>
>>> /dev/sda   73G  (OSD)
>>> /dev/sdb   73G  (OSD)
>>> /dev/sdc   73G  (OSD)
>>> /dev/sdd   73G  (OSD)
>>> /dev/sde   73G  (OSD)
>>> /dev/sdf   73G  (OSD)
>>> /dev/sdg   73G  (OSD)
>>> /dev/sdh   73G  (OSD)
>>> /dev/sdi   73G  (OSD)
>>> /dev/sdj   73G  (Journal)
>>> /dev/sdk  500G  (OSD)
>>> /dev/sdl  500G  (OSD)
>>> /dev/sdn  146G  (Journal)
>>>
>>> ceph-node03:
>>>
>>> /dev/sda   73G  (OSD)
>>> /dev/sdb   73G  (OSD)
>>> /dev/sdc   73G  (OSD)
>>> /dev/sdd   73G  (OSD)
>>> /dev/sde   73G  (OSD)
>>> /dev/sdf   73G  (OSD)
>>> /dev/sdg   73G  (OSD)
>>> /dev/sdh   73G  (OSD)
>>> /dev/sdi   73G  (OSD)
>>> /dev/sdj   73G  (Journal)
>>> /dev/sdk  500G  (OSD)
>>> /dev/sdl  500G  (OSD)
>>> /dev/sdn   73G  (Journal)
>>>
>>> ceph-node04:
>>>
>>> /dev/sda   73G  (OSD)
>>> /dev/sdb   73G  (OSD)
>>> /dev/sdc   73G  (OSD)
>>> /dev/sdd   73G  (OSD)
>>> /dev/sde   73G  (OSD)
>>> /dev/sdf   73G  (OSD)
>>> /dev/sdg   73G  (OSD)
>>> /dev/sdh   73G  (OSD)
>>> /dev/sdi   73G  (OSD)
>>> /dev/sdj   73G  (Journal)
>>> /dev/sdk  500G  (OSD)
>>> /dev/sdl  500G  (OSD)
>>> /dev/sdn  146G  (Journal)
>>>
>>> ceph-node05:
>>>
>>> /dev/sda   73G  (OSD)
>>> /dev/sdb   73G  (OSD)
>>> /dev/sdc   73G  (OSD)
>>> /dev/sdd   73G  (OSD)
>>> /dev/sde   73G  (OSD)
>>> /dev/sdf   73G  (OSD)
>>> /dev/sdg   73G  (OSD)
>>> /dev/sdh   73G  (OSD)
>>> /dev/sdi   73G  (OSD)
>>> /dev/sdj   73G  (Journal)
>>> /dev/sdk  500G  (OSD)
>>> /dev/sdl  500G  (OSD)
>>> /dev/sdn   73G  (Journal)
>>
>> Am I correct in assuming that you've put all of your journals for every disk in each node on two spinning disks? This is going to be quite slow, because Ceph does a full write of the data to the journal for every real write. The general solution is to either use SSDs for journals (preferably multiple fast SSDs with high write endurance and only 3-6 OSD journals each), or put the journals on a partition on the data disk.
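On the second option Mark mentions (journal on a partition of the data disk): with ceph-deploy a minimal sketch would be something along these lines. The host and device names are only placeholders, so adapt them to the layout above; with no separate journal device given, ceph-deploy/ceph-disk normally carve a small journal partition on the same disk and use the remainder for data.

    # wipe the disk and let ceph-disk partition it (data + co-located journal)
    ceph-deploy disk zap ceph-node01:sdb
    ceph-deploy osd create ceph-node01:sdb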
>>
>>> And the OSD tree is:
>>>
>>> root@ceph-node03:/home/ceph# ceph osd tree
>>> # id    weight  type name       up/down reweight
>>> -1      7.27    root default
>>> -2      1.15            host ceph-node01
>>> 12      0.06999                 osd.12  up      1
>>> 13      0.06999                 osd.13  up      1
>>> 14      0.06999                 osd.14  up      1
>>> 15      0.06999                 osd.15  up      1
>>> 16      0.06999                 osd.16  up      1
>>> 17      0.06999                 osd.17  up      1
>>> 18      0.06999                 osd.18  up      1
>>> 19      0.06999                 osd.19  up      1
>>> 20      0.06999                 osd.20  up      1
>>> 21      0.45                    osd.21  up      1
>>> 22      0.06999                 osd.22  up      1
>>> -3      1.53            host ceph-node02
>>> 23      0.06999                 osd.23  up      1
>>> 24      0.06999                 osd.24  up      1
>>> 25      0.06999                 osd.25  up      1
>>> 26      0.06999                 osd.26  up      1
>>> 27      0.06999                 osd.27  up      1
>>> 28      0.06999                 osd.28  up      1
>>> 29      0.06999                 osd.29  up      1
>>> 30      0.06999                 osd.30  up      1
>>> 31      0.06999                 osd.31  up      1
>>> 32      0.45                    osd.32  up      1
>>> 33      0.45                    osd.33  up      1
>>> -4      1.53            host ceph-node03
>>> 34      0.06999                 osd.34  up      1
>>> 35      0.06999                 osd.35  up      1
>>> 36      0.06999                 osd.36  up      1
>>> 37      0.06999                 osd.37  up      1
>>> 38      0.06999                 osd.38  up      1
>>> 39      0.06999                 osd.39  up      1
>>> 40      0.06999                 osd.40  up      1
>>> 41      0.06999                 osd.41  up      1
>>> 42      0.06999                 osd.42  up      1
>>> 43      0.45                    osd.43  up      1
>>> 44      0.45                    osd.44  up      1
>>> -5      1.53            host ceph-node04
>>> 0       0.06999                 osd.0   up      1
>>> 1       0.06999                 osd.1   up      1
>>> 2       0.06999                 osd.2   up      1
>>> 3       0.06999                 osd.3   up      1
>>> 4       0.06999                 osd.4   up      1
>>> 5       0.06999                 osd.5   up      1
>>> 6       0.06999                 osd.6   up      1
>>> 7       0.06999                 osd.7   up      1
>>> 8       0.06999                 osd.8   up      1
>>> 9       0.45                    osd.9   up      1
>>> 10      0.45                    osd.10  up      1
>>> -6      1.53            host ceph-node05
>>> 11      0.06999                 osd.11  up      1
>>> 45      0.06999                 osd.45  up      1
>>> 46      0.06999                 osd.46  up      1
>>> 47      0.06999                 osd.47  up      1
>>> 48      0.06999                 osd.48  up      1
>>> 49      0.06999                 osd.49  up      1
>>> 50      0.06999                 osd.50  up      1
>>> 51      0.06999                 osd.51  up      1
>>> 52      0.06999                 osd.52  up      1
>>> 53      0.45                    osd.53  up      1
>>> 54      0.45                    osd.54  up      1
>>
>> Based on this, it appears your 500GB drives are weighted much higher than the 73GB drives. This will help even out data distribution, but unfortunately it will cause the system to be slower if all of the OSDs are in the same pool. What this does is cause the 500GB drives to get a higher proportion of the writes than the other drives, but those drives are almost certainly no faster than the other ones. Because there is a limited number of outstanding IOs you can have (due to memory constraints), eventually all outstanding IOs will be waiting on the 500GB disks while the 73GB disks mostly sit around waiting for work.
>>
>> What I'd suggest doing is putting all of your 73GB disks in the same pool and your 500GB disks in another pool. I suspect that if you do that and put your journals on the first partition of each disk, you'll see some improvement in your benchmark results.
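For the pool split Mark suggests, a rough outline would be to add a second CRUSH root containing only the 500GB OSDs, give it its own rule, and point a dedicated pool at that rule. The pool name, PG counts and ruleset id below are just placeholders:

    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt
    # edit crushmap.txt: add e.g. "root default-500g" holding only the
    # 500GB OSDs, plus a replicated rule that takes from that root
    crushtool -c crushmap.txt -o crushmap.new
    ceph osd setcrushmap -i crushmap.new
    ceph osd pool create ceph-cloud-500g 512 512
    ceph osd pool set ceph-cloud-500g crush_ruleset 3   # id of the new rule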
>>
>>> And the result:
>>>
>>> root@ceph-node03:/home/ceph# rados bench -p ceph-cloud 20 write -t 10
>>> Maintaining 10 concurrent writes of 4194304 bytes for up to 20 seconds or 0 objects
>>> Object prefix: benchmark_data_ceph-node03_29727
>>>   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>>>     0       0         0         0         0         0         -         0
>>>     1      10        30        20   79.9465        80  0.159295  0.378849
>>>     2      10        52        42   83.9604        88  0.719616  0.430293
>>>     3      10        74        64   85.2991        88  0.487685  0.412956
>>>     4      10        97        87   86.9676        92  0.351122  0.418814
>>>     5      10       123       113   90.3679       104  0.317011  0.418876
>>>     6      10       147       137   91.3012        96  0.562112  0.418178
>>>     7      10       172       162   92.5398       100  0.691045  0.413416
>>>     8      10       197       187    93.469       100  0.459424  0.415459
>>>     9      10       222       212   94.1915       100  0.798889  0.416093
>>>    10      10       248       238   95.1697       104  0.440002  0.415609
>>>    11      10       267       257   93.4252        76   0.48959   0.41531
>>>    12      10       289       279   92.9707        88  0.524622  0.420145
>>>    13      10       313       303   93.2016        96   1.02104  0.423955
>>>    14      10       336       326   93.1136        92  0.477328  0.420684
>>>    15      10       359       349    93.037        92  0.591118  0.418589
>>>    16      10       383       373   93.2204        96  0.600392  0.421916
>>>    17      10       407       397   93.3812        96  0.240166  0.419829
>>>    18      10       431       421    93.526        96  0.746706  0.420971
>>>    19      10       457       447   94.0757       104  0.237565  0.419025
>>> 2013-12-27 13:13:21.817874 min lat: 0.101352 max lat: 1.81426 avg lat: 0.418242
>>>   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>>>    20      10       480       470   93.9709        92  0.489254  0.418242
>>> Total time run:          20.258064
>>> Total writes made:       481
>>> Write size:              4194304
>>> Bandwidth (MB/sec):      94.975
>>>
>>> Stddev Bandwidth:        21.7799
>>> Max bandwidth (MB/sec):  104
>>> Min bandwidth (MB/sec):  0
>>> Average Latency:         0.420573
>>> Stddev Latency:          0.226378
>>> Max latency:             1.81426
>>> Min latency:             0.101352
>>> root@ceph-node03:/home/ceph#
>>>
>>> Thanks in advance,
>>>
>>> Best regards,
>>>
>>> *German Anders*
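Back to my question about the bus/adapter: a few commands run on one of the OSD nodes should show how the disks are attached. These are just generic Linux tools; the exact output of course depends on the controller in the D2200sb setup:

    lspci | grep -iE 'raid|sas|sata'         # which controller the disks sit behind
    lsscsi                                   # disk-to-host-adapter mapping
    sudo smartctl -i /dev/sda                # model, transport, rotation rate
    sudo hdparm -I /dev/sda | grep -i speed  # negotiated link speed (SATA only)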
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com