[ceph-users] RadosGW problems on Ubuntu
Hello everyone,

we are currently testing Ceph (Hammer) and OpenStack (Kilo) on Ubuntu 14.04 LTS servers. Yesterday I tried to set up the Rados Gateway with Keystone integration for Swift via ceph-deploy. I followed the instructions on http://ceph.com/docs/master/radosgw/keystone/ and http://ceph.com/ceph-deploy/docs/rgw.html and encountered the following problems:

1. bootstrap-rgw.keyring missing. I deployed the cluster under Firefly, so I had to create it manually (according to the documentation this is expected, but it would be nice to have instructions on how to create the bootstrap-rgw.keyring by hand).

2. According to the documentation, the section in ceph.conf should be named [client.radosgw.InstanceName]. With that name my config wasn't used at all when I started the gateway via "service radosgw-all start". After changing the section name to [client.rgw.InstanceName], nearly everything worked fine.

3. The "nss db path" parameter from my ceph.conf is still ignored, and OpenStack can't sync users to radosgw. The only way I got it working was to start radosgw manually, passing all parameters directly.

I'd like to know if (or what) I am doing wrong, or whether I hit a bug in the documentation, the upstart script, or radosgw.

My configuration:

    [client.rgw.gw-v01]
    # Works except the nss db path
    log file = /var/log/radosgw/radosgw.log
    rgw frontends = civetweb port=80
    rgw keystone admin token = secret
    rgw keystone url = http://xxx.xxx.xxx.xxx:5000
    rgw keystone accepted roles = s3, swift, admin, _member_, user, Member
    rgw s3 auth use keystone = true
    nss db path = /var/lib/ceph/nss
    rgw keyring = /var/lib/ceph/radosgw/ceph-rgw.gw-v01/keyring
    rgw host = gw-v01

Working radosgw with:

    /usr/bin/radosgw --id rgw.gw-v01 --log-file /var/log/radosgw/radosgw.log \
        --rgw-frontends "civetweb port=80" --rgw-keystone-admin-token secret \
        --rgw-keystone-url http://xxx.xxx.xxx.xxx:5000 \
        --rgw-keystone-accepted-roles "s3, swift, admin, _member_, user, Member" \
        --rgw-s3-auth-use-keystone true --nss-db-path /var/lib/ceph/nss \
        --rgw-keyring /var/lib/ceph/radosgw/ceph-rgw.gw-v01/keyring --rgw-host gw-v01

Best regards,
Felix
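For reference, on a cluster deployed before Hammer the missing bootstrap key can usually be created by hand along these lines (a sketch; the key name, capability profile and path follow the Hammer/ceph-deploy defaults):

    ceph auth get-or-create client.bootstrap-rgw mon 'allow profile bootstrap-rgw' \
        -o /var/lib/ceph/bootstrap-rgw/ceph.keyring

Once this keyring exists on the admin node, ceph-deploy gatherkeys / ceph-deploy rgw create should be able to pick it up and create the per-instance rgw key itself.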
[ceph-users] How to repair 2 invalid PGs
Hi,

Yesterday I removed 5 OSDs out of 15 from my cluster (machine migration). When I stopped the processes, I had not verified that all the PGs were in the active state. I removed the 5 OSDs from the cluster (ceph osd out osd.9 ; ceph osd crush rm osd.9 ; ceph auth del osd.9 ; ceph osd rm osd.9), and only checked afterwards... and I had two inactive PGs. I have not formatted the filesystems of the OSDs.

The health:

    pg 7.b is stuck inactive for 86083.236722, current state inactive, last acting [1,2]
    pg 7.136 is stuck inactive for 86098.214967, current state inactive, last acting [4,7]

The recovery state:

    "recovery_state": [
        { "name": "Started/Primary/Peering/WaitActingChange",
          "enter_time": "2015-08-13 15:19:49.559965",
          "comment": "waiting for pg acting set to change"},
        { "name": "Started",
          "enter_time": "2015-08-13 15:19:46.492625"}],

How can I solve my problem? Can I re-add the OSDs from their intact filesystems?

My cluster is used for RBD images and a small CephFS share. I can read all files in CephFS, and I wrote a script to check every RBD image to see whether it uses these PGs; I didn't find anything, but I am not sure of my script. How do you know if a PG is used?

Regards

--
Pierre BLONDEAU
Administrateur Systèmes réseaux
Université de Caen
Laboratoire GREYC, Département d'informatique
tel : 02 31 56 75 42
bureau : Campus 2, Science 3, 406
--

    { "state": "inactive",
      "snap_trimq": "[]",
      "epoch": 15291,
      "up": [4, 7],
      "acting": [4, 7],
      "info": { "pgid": "7.136",
        "last_update": "0'0",
        "last_complete": "0'0",
        "log_tail": "0'0",
        "last_user_version": 0,
        "last_backfill": "MAX",
        "purged_snaps": "[]",
        "history": { "epoch_created": 4046,
          "last_epoch_started": 14458,
          "last_epoch_clean": 14458,
          "last_epoch_split": 0,
          "same_up_since": 14475,
          "same_interval_since": 14475,
          "same_primary_since": 1,
          "last_scrub": "0'0",
          "last_scrub_stamp": "2015-08-13 07:07:17.963482",
          "last_deep_scrub": "0'0",
          "last_deep_scrub_stamp": "2015-08-08 06:18:33.726150",
          "last_clean_scrub_stamp": "2015-08-13 07:07:17.963482"},
        "stats": { "version": "0'0",
          "reported_seq": 10510,
          "reported_epoch": 15291,
          "state": "inactive",
          "last_fresh": "2015-08-14 13:52:48.121254",
          "last_change": "2015-08-13 15:19:43.824578",
          "last_active": "2015-08-13 15:19:31.362363",
          "last_clean": "2015-08-13 15:19:31.362363",
          "last_became_active": "0.00",
          "last_unstale": "2015-08-14 13:52:48.121254",
          "mapping_epoch": 14472,
          "log_start": "0'0",
          "ondisk_log_start": "0'0",
          "created": 4046,
          "last_epoch_clean": 14458,
          "parent": "0.0",
          "parent_split_bits": 0,
          "last_scrub": "0'0",
          "last_scrub_stamp": "2015-08-13 07:07:17.963482",
          "last_deep_scrub": "0'0",
          "last_deep_scrub_stamp": "2015-08-08 06:18:33.726150",
          "last_clean_scrub_stamp": "2015-08-13 07:07:17.963482",
          "log_size": 0,
          "ondisk_log_size": 0,
          "stats_invalid": 0,
          "stat_sum": { "num_bytes": 0, "num_objects": 0, "num_object_clones": 0,
            "num_object_copies": 0, "num_objects_missing_on_primary": 0,
            "num_objects_degraded": 0, "num_objects_unfound": 0,
            "num_objects_dirty": 0, "num_whiteouts": 0, "num_read": 0,
            "num_read_kb": 0, "num_write": 0, "num_write_kb": 0,
            "num_scrub_errors": 0, "num_shallow_scrub_errors": 0,
            "num_deep_scrub_errors": 0, "num_objects_recovered": 0,
            "num_bytes_recovered": 0, "num_keys_recovered": 0,
            "num_objects_omap": 0, "num_objects_hit_set_archive": 0},
          "stat_cat_sum": {},
          "up": [4, 7],
          "acting": [4, 7],
          "up_primary": 4,
          "acting_primary": 4},
        "empty": 1,
        "dne": 0,
        "incomplete": 0,
        "last_epoch_started": 14474,
        "hit_set_history": { "current_last_update": "0'0",
          "current_last_stamp": "0.00",
          "current_info": { "begin": "0.00", "end": "0.00", "version": "0'0"},
          "history": []}},
      "peer_info": [],
      "recovery_state": [
        { "name": "Started/Primary/Peering/WaitActingChange",
          "enter_time": "2015-08-13 15:19:43.688351",
          "comment": "waiting for pg acting set to change"},
        { "name": "Started",
          "enter_time": "2015-08-13 15:19:35.569102"}],
      "agent_state": {}}

    { "state": "inactive",
      "snap_trimq": "[]",
      "epoch": 15291,
      "up": [1, 2],
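For reference, one way to check whether an RBD image has objects in a given PG is to map each of the image's objects with "ceph osd map"; a rough sketch (the pool and image names are made up, and the object prefix comes from "rbd info"):

    rbd -p rbd info myimage | grep block_name_prefix     # e.g. rb.0.1234.238e1f29
    rados -p rbd ls | grep '^rb\.0\.1234' | while read obj; do
        ceph osd map rbd "$obj"                          # prints "... -> pg ... (7.xxx) -> ..."
    done | grep -E '\(7\.(b|136)\)'

Any hit means that image stores data in one of the two stuck PGs; no output across all images suggests the PGs hold no RBD data.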
Re: [ceph-users] Cache tier best practices
Thank you guys, this answers my query.

Cheers
Vickey

On Thu, Aug 13, 2015 at 8:02 PM, Bill Sanders <billysand...@gmail.com> wrote:

> I think you're looking for this:
> http://ceph.com/docs/master/man/8/rbd/#cmdoption-rbd--order
> It's used when you create the RBD images. 1MB is order=20, 512KB is order=19.
>
> Thanks,
> Bill Sanders

On Thu, Aug 13, 2015 at 1:31 AM, Vickey Singh <vickey.singh22...@gmail.com> wrote:

> Thanks Nick for your suggestion. Can you also tell me how I can reduce the RBD block size to 512K or 1M? Do I need to put something in the client's ceph.conf, and if so, which parameter do I need to set?
>
> Thanks once again - Vickey

On Wed, Aug 12, 2015 at 4:49 PM, Nick Fisk <n...@fisk.me.uk> wrote:

> > -----Original Message-----
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Dominik Zalewski
> > Sent: 12 August 2015 14:40
> > To: ceph-us...@ceph.com
> > Subject: [ceph-users] Cache tier best practices
> >
> > Hi,
> >
> > I would like to hear from people who use a cache tier in Ceph about best practices and things I should avoid. I remember hearing that it wasn't that stable back then. Has that changed in the Hammer release?
> >
> > Any tips and tricks are much appreciated!
> >
> > Thanks
> >
> > Dominik
>
> It's not so much the stability, but the performance. If your working set will sit mostly in the cache tier and won't tend to change, then you might be alright. Otherwise you will find that performance is very poor.
>
> The only tip I can really give is that I have found dropping the RBD block size down to 512KB-1MB helps quite a bit, as it makes the cache more effective and also minimises the amount of data transferred on each promotion/flush.
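For reference, the object size is fixed when the image is created, so the order has to be passed to "rbd create"; a minimal sketch (pool and image names are made up):

    rbd create rbd/test-image --size 10240 --order 19   # 10 GB image with 512 KB objects (2^19 bytes)

Existing images keep their order; to change it, the data has to be copied into a new image (e.g. "rbd cp" towards a destination created with the desired --order).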
[ceph-users] Re: CEPH cache layer. Very slow
Hi!

Of course it isn't cheap at all, but we use Intel DC S3700 200GB for Ceph journals and DC S3700 400GB in the SSD pool: same hosts, separate root in the crushmap. The SSD pool is not yet in production; the journalling SSDs have been under production load for 10 months. They're in good condition - no faults, no degradation. We deliberately chose 200GB SSDs for journals to reduce costs, and we also have a higher than recommended OSD/SSD ratio: 1 SSD per 10-12 OSDs, while the recommendation is 1:3 to 1:6.

So, as a conclusion: I'd recommend you get a bigger budget and buy durable and fast SSDs for Ceph.

Megov Igor
CIO, Yuterra

From: ceph-users <ceph-users-boun...@lists.ceph.com> on behalf of Voloshanenko Igor <igor.voloshane...@gmail.com>
Sent: 13 August 2015 15:54
To: Jan Schermer
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] CEPH cache layer. Very slow

So, good, but the price of the 845 DC PRO 400GB is about 2x higher than the Intel S3500 240GB ((( Any other models? (((

2015-08-13 15:45 GMT+03:00 Jan Schermer <j...@schermer.cz>:

> I tested and can recommend the Samsung 845 DC PRO (make sure it is DC PRO and not just PRO or DC EVO!). Those were very cheap but are out of stock at the moment (here). Faster than Intels, cheaper, and slightly different technology (3D V-NAND) which IMO makes them superior without needing many tricks to do their job.
>
> Jan

On 13 Aug 2015, at 14:40, Voloshanenko Igor <igor.voloshane...@gmail.com> wrote:

> Tnx, Irek! Will try!
> But another question to all: which SSDs are good enough for Ceph now? I'm looking into the S3500 240GB (I have some S3500 120GB which show great results, around 8x better than the Samsung). Can you advise other vendors/models at the same or lower price level as the S3500 240GB?

2015-08-13 12:11 GMT+03:00 Irek Fasikhov <malm...@gmail.com>:

> Hi, Igor.
> Try to roll on the patch here: http://www.theirek.com/blog/2014/02/16/patch-dlia-raboty-s-enierghoniezavisimym-keshiem-ssd-diskov
> P.S. I no longer track changes in this direction (kernel), because we already use the recommended SSDs.
>
> Best regards,
> Irek Fasikhov
> Mob.: +79229045757

2015-08-13 11:56 GMT+03:00 Voloshanenko Igor <igor.voloshane...@gmail.com>:

> So, after testing the SSD (I wiped 1 SSD and used it for tests):
>
> root@ix-s2:~# sudo fio --filename=/dev/sda --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test
> journal-test: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
> fio-2.1.3
> Starting 1 process
> Jobs: 1 (f=1): [W] [100.0% done] [0KB/1152KB/0KB /s] [0/288/0 iops] [eta 00m:00s]
> journal-test: (groupid=0, jobs=1): err= 0: pid=2849460: Thu Aug 13 10:46:42 2015
>   write: io=68972KB, bw=1149.6KB/s, iops=287, runt= 60001msec
>     clat (msec): min=2, max=15, avg= 3.48, stdev= 1.08
>      lat (msec): min=2, max=15, avg= 3.48, stdev= 1.08
>     clat percentiles (usec):
>      |  1.00th=[ 2704],  5.00th=[ 2800], 10.00th=[ 2864], 20.00th=[ 2928],
>      | 30.00th=[ 3024], 40.00th=[ 3088], 50.00th=[ 3280], 60.00th=[ 3408],
>      | 70.00th=[ 3504], 80.00th=[ 3728], 90.00th=[ 3856], 95.00th=[ 4016],
>      | 99.00th=[ 9024], 99.50th=[ 9280], 99.90th=[ 9792], 99.95th=[10048],
>      | 99.99th=[14912]
>     bw (KB /s): min= 1064, max= 1213, per=100.00%, avg=1150.07, stdev=34.31
>     lat (msec) : 4=94.99%, 10=4.96%, 20=0.05%
>   cpu          : usr=0.13%, sys=0.57%, ctx=17248, majf=0, minf=7
>   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued    : total=r=0/w=17243/d=0, short=r=0/w=0/d=0
>
> Run status group 0 (all jobs):
>   WRITE: io=68972KB, aggrb=1149KB/s, minb=1149KB/s, maxb=1149KB/s, mint=60001msec, maxt=60001msec
>
> Disk stats (read/write):
>   sda: ios=0/17224, merge=0/0, ticks=0/59584, in_queue=59576, util=99.30%
>
> So, it's painful... the SSD does only 287 iops at 4K... 1.1 MB/s.
>
> I tried to change the cache mode:
>
>   echo "temporary write through" > /sys/class/scsi_disk/2:0:0:0/cache_type
>   echo "temporary write through" > /sys/class/scsi_disk/3:0:0:0/cache_type
>
> No luck, still the same poor results. I also found this article: https://lkml.org/lkml/2013/11/20/264 which points to an old, very simple patch that disables CMD_FLUSH: https://gist.github.com/TheCodeArtist/93dddcd6a21dc81414ba
>
> Does anybody have better ideas how to improve this (or how to disable CMD_FLUSH without recompiling the kernel)? I use Ubuntu and kernel 4.0.4 for now (the 4.x branch because the SSD 850 Pro has an issue with NCQ TRIM, and before 4.0.4 this exception was not included in libata.c).

2015-08-12 19:17 GMT+03:00 Pieter Koorts <pieter.koo...@me.com>:

> Hi Igor
> I suspect you have very much the same problem as me.
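For anyone wanting to reproduce the separate-root layout Igor describes, a minimal sketch of the CRUSH side (bucket, host and OSD names are made up):

    ceph osd crush add-bucket ssd root                      # extra root for the SSD pool
    ceph osd crush add-bucket node1-ssd host
    ceph osd crush move node1-ssd root=ssd
    ceph osd crush set osd.60 1.0 root=ssd host=node1-ssd   # place an SSD OSD under the new root
    ceph osd crush rule create-simple ssd-rule ssd host     # rule for pools that should live on SSDs

A pool is then pointed at the new rule with "ceph osd pool set <pool> crush_ruleset <id>", so HDD pools and SSD pools share the same hosts but never the same disks.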
Re: [ceph-users] Re: CEPH cache layer. Very slow
Nice to hear that you have had no SSD failures yet in 10 months. How many OSDs are you running, and what is your primary Ceph workload? (RBD, rgw, etc.?)

-Ben

On Fri, Aug 14, 2015 at 2:23 AM, Межов Игорь Александрович <me...@yuterra.ru> wrote:

> Hi!
>
> Of course it isn't cheap at all, but we use Intel DC S3700 200GB for Ceph journals and DC S3700 400GB in the SSD pool: same hosts, separate root in the crushmap. [...]
>
> [rest of quoted thread trimmed; it repeats the message above in full]
Re: [ceph-users] CEPH cache layer. Very slow
72 OSDs: 60 HDD, 12 SSD. Primary workload: RBD, KVM.

On Friday, 14 August 2015, Ben Hines wrote:

> Nice to hear that you have had no SSD failures yet in 10 months. How many OSDs are you running, and what is your primary Ceph workload? (RBD, rgw, etc.?)
>
> -Ben
>
> [rest of quoted thread trimmed; it repeats the messages above in full]
[ceph-users] OSDs' weird status. Cannot be removed anymore.
Hello,

this is my first posting to the ceph-users mailing list, and because I am also new to this technology, please be patient with me. A description of the problem I am stuck on follows:

3 monitors are up and running; one of them is the leader, the other two are peons. There is no authentication between the nodes yet. Due to my trial-and-error setup I'm afraid I have already built some zombies into my OSD map, and I am not able to get rid of them now.

First: "ceph osd tree" reveals:

    # id    weight  type name       up/down reweight
    -1      3       root default
    -2      1               host arm01
    1       1                       osd.1   DNE
    0       0                       osd.0   DNE
    -3      1               host arm02
    2       1                       osd.2   DNE
    -4      1               host arm03
    3       1                       osd.3   DNE

Second: removal of any of those OSDs no longer seems possible: "ceph osd rm osd.0" produces the error "osd.0 does not exist". Similar error messages occur when I try to remove the remaining OSDs osd.1, osd.2 and osd.3.

Third: the OSD processes on all three servers are NOT running.

Fourth: the OSD configuration section defines:

    [osd]
    filestore xattr use omap = true
    osd mkfs type = xfs
    osd mkfs options xfs = -f
    osd mount options xfs = rw,noatime
    osd data = /var/lib/ceph/osd/$cluster-$id
    osd journal = /var/lib/ceph/osd/journal
    osd journal size = 100

    [osd.1]
    host = arm01

    [osd.2]
    host = arm02

    [osd.3]
    host = arm03

Fifth: arch = armhf, OS = Debian 8, ceph = 0.80.7; due to the absence of convenient setup scripts for the armhf architecture, I set up Ceph manually.

Question 1: how can I clean up my OSD map?
Question 2: what does the value DNE in the reweight column mean? "Does Not Exist" perhaps?

Many thanks in advance.
Regards,
Martin.
Re: [ceph-users] OSDs' weird status. Cannot be removed anymore.
On 14-08-15 14:30, Marcin Przyczyna wrote:

> Second: removal of any of those OSDs no longer seems possible: "ceph osd rm osd.0" produces the error "osd.0 does not exist". Similar error messages occur when I try to remove the remaining OSDs osd.1, osd.2 and osd.3.

You need to remove them from the crushmap as well:

    $ ceph osd crush remove osd.X

Wido

> [rest of quoted message trimmed; it repeats the message above in full]
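For reference, the usual cleanup order for a stale OSD entry touches all three places an id can linger (a sketch; repeat per id):

    ceph osd crush remove osd.0   # drop the item from the CRUSH map
    ceph auth del osd.0           # delete its authentication key, if one exists
    ceph osd rm osd.0             # remove the id from the OSD map itself

If "ceph osd rm" complains that the OSD does not exist, the id is already gone from the OSD map and only the CRUSH entry (shown as DNE, "does not exist", in "ceph osd tree") is left, so the crush remove alone clears the tree.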