Re: [ceph-users] OpenStack Keystone with RadosGW
I've figured out the main reason. When the swift client authenticates through a keystone user like 'admin', keystone returns an X-Auth-Token header. The swift client then sends that X-Auth-Token to radosgw, but radosgw answers 'AccessDenied'. Some people say radosgw doesn't support Keystone identity version 3 yet.

2016-11-22 15:41 GMT+09:00 한승진 :
> Hi All,
>
> I am trying to implement radosgw with OpenStack as an object storage service.
>
> I think there are two cases for using radosgw as an object store.
>
> First, Keystone <-> Ceph connected directly, as in this guide:
> http://docs.ceph.com/docs/master/radosgw/keystone/
>
> Second, use Ceph as a back-end of Swift, as in this guide:
> https://github.com/openstack/swift-ceph-backend#installation
>
> In the first case, it always returns a 405 error, so I cannot go any further.
>
> In the second case, I don't know how to build the ring in a Ceph back-end environment.
>
> Is anybody using radosgw with OpenStack? Please give me some guidance.
>
> Thanks.
>
> John.
>
> =
> Here are my ceph.conf settings:
>
> [client.radosgw.cephmon01]
> rgw keystone api version = 3
> rgw keystone url = http://controller:35357
> rgw keystone admin user = swift
> rgw keystone admin password = *
> rgw keystone admin project = service
> rgw keystone admin domain = default
> rgw keystone accepted roles = admin,user
>
> rgw s3 auth use keystone = true
> rgw keystone verify ssl = false

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
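To narrow down whether the failure is in Keystone or in radosgw, the token round-trip can be reproduced by hand. This is only a sketch: the endpoint URLs, user, project, and password below are placeholders and must be adjusted to the actual deployment.

```shell
# Hypothetical endpoints -- adjust to your deployment.
KEYSTONE=http://controller:5000
RGW=http://radosgw:7480

# Request a Keystone v3 token for the 'admin' user; the token comes
# back in the X-Subject-Token response header.
TOKEN=$(curl -si "$KEYSTONE/v3/auth/tokens" -H 'Content-Type: application/json' -d '{
  "auth": {
    "identity": { "methods": ["password"],
      "password": { "user": { "name": "admin",
        "domain": { "name": "Default" }, "password": "secret" } } },
    "scope": { "project": { "name": "admin",
      "domain": { "name": "Default" } } }
  } }' | awk 'tolower($1) == "x-subject-token:" {print $2}' | tr -d '\r')

# Present the token to radosgw's Swift API. An AccessDenied / 401 here,
# with a token that works against other OpenStack services, points at
# the radosgw Keystone integration rather than at Keystone itself.
curl -i "$RGW/swift/v1" -H "X-Auth-Token: $TOKEN"
```

If the manual request fails the same way as the swift client, raising `debug rgw` on the gateway should show how radosgw tried to validate the token.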
[ceph-users] deep-scrubbing has large impact on performance
Hi list,

I've been searching the mail archive and the web for some help. I tried the things I found, but I can't see the effects. We use Ceph for our OpenStack environment.

When our cluster (2 pools, each 4092 PGs, on 20 OSDs across 4 nodes, 3 MONs) starts deep-scrubbing, it's impossible to work with the VMs. Currently, the deep-scrubs happen to start on Monday, which is unfortunate. I already plan to start the next deep-scrub on Saturday, so it has no impact on our work days. But if I imagine we had a large multi-datacenter, such performance breaks are not acceptable. So I'm wondering: how do you guys manage that?

What I've tried so far:

ceph tell osd.* injectargs '--osd_scrub_sleep 0.1'
ceph tell osd.* injectargs '--osd_disk_thread_ioprio_priority 7'
ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle'
ceph tell osd.* injectargs '--osd_scrub_begin_hour 0'
ceph tell osd.* injectargs '--osd_scrub_end_hour 7'

And I also added these options to ceph.conf. To be able to work again, I had to set the nodeep-scrub flag and unset it when I left the office. Today, I see the cluster deep-scrubbing again, but only one PG at a time; it seems the default for osd_max_scrubs is working and I don't see a major impact yet.

But is there something else I can do to reduce the performance impact? I just found [1] and will have a look into it.

[1] http://prob6.com/en/ceph-pg-deep-scrub-cron/

Thanks!
Eugen

--
Eugen Block                            voice   : +49-40-559 51 75
NDE Netzdesign und -entwicklung AG     fax     : +49-40-559 51 77
Postfach 61 03 15
D-22423 Hamburg                        e-mail  : ebl...@nde.ag

Vorsitzende des Aufsichtsrates: Angelika Mozdzen
  Sitz und Registergericht: Hamburg, HRB 90934
          Vorstand: Jens-U. Mozdzen
           USt-IdNr. DE 814 013 983
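The cron-based idea behind [1] can be sketched roughly as follows. This is only an illustration, not the linked script: it disables Ceph's own deep-scrub scheduling and instead triggers the least recently deep-scrubbed PGs from a cron job in a quiet window. The `awk` column for the deep-scrub timestamp varies between releases and would need checking against the local `ceph pg dump` output.

```shell
# Sketch: replace automatic deep-scrub scheduling with a cron-driven one.
# Stop Ceph from scheduling deep-scrubs itself:
ceph osd set nodeep-scrub

# From cron (e.g. Saturday night): deep-scrub the 20 PGs with the
# oldest deep-scrub stamp. The timestamp column position is an
# assumption and may differ per release -- verify with 'ceph pg dump pgs'.
ceph pg dump pgs 2>/dev/null |
  awk '/^[0-9]+\./ {print $1, $NF}' |  # PG id and (assumed) deep-scrub stamp
  sort -k2 | head -20 |
  while read pg _; do
    ceph pg deep-scrub "$pg"
  done
```

Spread over enough nightly runs, every PG still gets deep-scrubbed within the usual interval, just never during working hours.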
Re: [ceph-users] deep-scrubbing has large impact on performance
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Eugen Block
> Sent: 22 November 2016 09:55
> To: ceph-users@lists.ceph.com
> Subject: [ceph-users] deep-scrubbing has large impact on performance
>
> [...]
>
> But is there something else I can do to reduce the performance impact?

If you are using Jewel, the scrubbing is now done in the client IO thread, so those disk thread options won't do anything. Instead there is a new priority setting, which seems to work for me, along with a few other settings:

osd_scrub_priority = 1
osd_scrub_sleep = .1
osd_scrub_chunk_min = 1
osd_scrub_chunk_max = 5
osd_scrub_load_threshold = 5

Also, enabling the weighted priority queue can assist the new priority options:

osd_op_queue = wpq

> I just found [1] and will have a look into it.
>
> [1] http://prob6.com/en/ceph-pg-deep-scrub-cron/
>
> Thanks!
> Eugen
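Collected into ceph.conf form, the settings above would look something like this. The `[osd]` section placement is an assumption; the values are the ones from the message.

```ini
# ceph.conf -- scrub tuning for Jewel or later (sketch).
[osd]
osd_op_queue = wpq               ; weighted priority queue
osd_scrub_priority = 1           ; lowest scrub priority vs. client IO
osd_scrub_sleep = 0.1            ; pause between scrub chunks
osd_scrub_chunk_min = 1
osd_scrub_chunk_max = 5
osd_scrub_load_threshold = 5
```

Note that `osd_op_queue` takes effect only after an OSD restart, whereas the scrub options can also be applied at runtime with `ceph tell osd.* injectargs`.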
Re: [ceph-users] deep-scrubbing has large impact on performance
Thanks for the very quick answer!

> If you are using Jewel

We are still using Hammer (0.94.7). We wanted to upgrade to Jewel in a couple of weeks; would you recommend doing it now?

Zitat von Nick Fisk :

> If you are using Jewel, the scrubbing is now done in the client IO thread, so those disk thread options won't do anything. Instead there is a new priority setting, which seems to work for me, along with a few other settings.
> [...]
[ceph-users] export-diff behavior if an initial snapshot is NOT specified
Hi there,

According to the official man page ( http://docs.ceph.com/docs/jewel/man/8/rbd/ ):

  export-diff [--from-snap snap-name] [--whole-object] (image-spec | snap-spec) dest-path
    Exports an incremental diff for an image to dest path (use - for stdout). If an initial snapshot is specified, only changes since that snapshot are included; otherwise, any regions of the image that contain data are included.

So if an initial snapshot is NOT specified, then:

  rbd export-diff image@snap1

will diff all data up to snap1. Does this command then equal:

  rbd export image@snap1

Is my understanding correct?

Thanks,
Zhongyan
Re: [ceph-users] deep-scrubbing has large impact on performance
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Eugen Block
> Sent: 22 November 2016 10:11
> To: Nick Fisk
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] deep-scrubbing has large impact on performance
>
> Thanks for the very quick answer!
>
> > If you are using Jewel
>
> We are still using Hammer (0.94.7), we wanted to upgrade to Jewel in a couple of weeks, would you recommend to do it now?

It's been fairly solid for me, but you might want to wait for the scrubbing hang bug to be fixed before upgrading. I think this might be fixed in the upcoming 10.2.4 release.

> [...]
Re: [ceph-users] deep-scrubbing has large impact on performance
Thank you!

Zitat von Nick Fisk :

> It's been fairly solid for me, but you might want to wait for the scrubbing hang bug to be fixed before upgrading. I think this might be fixed in the upcoming 10.2.4 release.
> [...]
Re: [ceph-users] export-diff behavior if an initial snapshot is NOT specified
On Tue, Nov 22, 2016 at 5:31 AM, Zhongyan Gu wrote:
> So if initial snapshot is NOT specified, then:
> rbd export-diff image@snap1 will diff all data to snap1. this cmd equals to:
> rbd export image@snap1. Is my understand right or not??

While they will both export all data associated with image@snap1, the "export" command will generate a raw, non-sparse dump of the full image, whereas "export-diff" will export only the sections of the image that contain data. The file generated by "export" can be used with the "import" command to create a new image, whereas the file generated by "export-diff" can only be used with "import-diff" against an existing image.

--
Jason
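The distinction can be made concrete with a short command sequence. The pool and image names here are made up for illustration, and each restore path only works with the matching file format:

```shell
# Both read all data of image@snap1, but produce different file formats:
rbd export rbd/image@snap1 full.raw        # raw, non-sparse dump of the whole image
rbd export-diff rbd/image@snap1 all.diff   # only extents with data (no --from-snap)

# 'export' output recreates an image from scratch:
rbd import full.raw rbd/restored

# 'export-diff' output can only be replayed onto an existing image:
rbd create rbd/target --size 10G           # must exist (and be large enough) first
rbd import-diff all.diff rbd/target
```

So without `--from-snap` the two commands cover the same data, but `export-diff` stays sparse and is tied to the import-diff workflow.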
Re: [ceph-users] Intel P3700 SSD for journals
Thx Alan and Anthony for sharing your experience with these P3700 drives.

Anthony, just to follow up on your email: my OS is CentOS 7.2. Can you please elaborate on NVMe on CentOS 7.2? I'm in no way an expert on NVMe, but I can see here that the connectors are different for NVMe:
https://www.pcper.com/files/imagecache/article_max_width/news/2015-06-08/Demartek_SFF-8639.png
Does this mean I cannot connect them to a PERC 730 raid controller?

Is there anything particular required when installing CentOS on these drives, or will they be automatically detected and work out of the box?

Thx, Will

On Mon, Nov 21, 2016 at 12:16 PM, Anthony D'Atri wrote:
> The SATA S3700 series has been the de-facto choice for journals for some time, and journals don't need all that much space.
>
> We're using 400GB P3700s. I'll say a couple of things:
>
> o Update to the latest firmware available when you get your drives, qual it and stick with it for a while so you have a uniform experience
> o Run a recent kernel with a recent nvme.ko, e.g. the RHEL 7.1 3.10.0-229.4.2 kernel's bundled nvme.ko has a rare timing issue that causes us resets at times. YMMV.
>
> Which OS do you run?
>
> Read through this document or a newer version thereof:
> https://www-ssl.intel.com/content/dam/www/public/us/en/documents/product-specifications/ssd-dc-p3700-spec.pdf
>
> or for SATA drives:
> http://www.intel.com/content/www/us/en/solid-state-drives/ssd-dc-s3710-spec.html
>
> It's possible that your vendor is uninformed or lying, trying to upsell you. At times larger units can perform better due to internal parallelism, i.e. a 1.6TB unit may electrically be 4x 400GB parts in parallel. For 7200RPM LFF drives, as Nick noted, 12x journals per P3700 is probably as high as you want to go, otherwise you can bottleneck.
>
> What *is* true is the distinction among series. Check the graph halfway down this page:
> http://www.anandtech.com/show/8104/intel-ssd-dc-p3700-review-the-pcie-ssd-transition-begins-with-nvme
>
> Prima facie the P3500s can seem like a relative bargain, but attend to the durability; that is where the P3600 and P3700 differ dramatically. For some, the P3600 may be durable enough, given certain workloads and expected years of service. I tend to be paranoid and lobbied for us to err on the side of caution with the P3700. YMMV.
>
> -- Anthony
[ceph-users] ceph-disk dmcrypt : encryption key placement problem
Hello,

We have a JEWEL cluster upgraded from FIREFLY. The cluster is encrypted with dmcrypt. Yesterday, I added some new OSDs, for the first time since the upgrade. I searched for the new keys to back them up, and I saw that the creation of new OSDs with the dmcrypt option has changed.

To be able to retrieve the key if the server's filesystem crashes ( http://tracker.ceph.com/issues/14669 ) or if the OSD moves, a ceph user is created and its keyring file is used as the LUKS encryption key. Good idea.

The problem is: there is a small partition named "ceph lockbox" at the beginning of the disk, and the keyring can be found among the files of this partition. Why is the encryption key stored on the same disk, and in the clear? Someone who obtained the disk would be able to read it; there's no point encrypting it in that case. It is urgent to move the keyring file elsewhere (to /etc/ceph/dmcrypt-keys?).

Regards,
Pierre

--
Pierre BLONDEAU
Administrateur Système & réseau
Université de Caen Normandie
Laboratoire GREYC, Département d'informatique
Tel : 02 31 56 75 42. Bureau : Campus 2, Science 3, 406
Re: [ceph-users] Intel P3700 SSD for journals
You wrote P3700, so that's what I discussed ;)

If you want to connect to your HBA you'll want a SATA device like the S3710 series:
http://ark.intel.com/products/family/83425/Data-Center-SSDs#@Server

The P3700 is a PCIe device: it goes into an empty slot and is not speed-limited by the SATA interface, at perhaps higher cost. With 7.2 I would think you'd be fine, driver-wise. Either should be detected and work out of the box.

-- Anthony

> thx Alan and Anthony for sharing on these P3700 drives.
> [...]
Re: [ceph-users] Contribution to CEPH
Hey Jagan,

I'm happy to hear you are interested in contributing to Ceph. I would suggest taking a look at the tracker (http://tracker.ceph.com/) for bugs and projects you might be interested in tackling. All code and associated repositories are available on GitHub (https://github.com/ceph/).

If you would like to hear about some of the latest work, I would recommend joining the Ceph Developer Monthly call on the first Wed of each month (http://wiki.ceph.com/Planning/).

Hope that gets you headed in the right direction. Thanks.

On Sun, Nov 20, 2016 at 9:15 AM, Jagan Kaartik wrote:
> I am Jagan Kaartik, a freshman in computer science and engineering from Amrita School of Engineering, Kerala, India.
>
> I have basic knowledge of Python and C++.
>
> My interest in databases and network storage inspired me to join the Ceph organization.
>
> I want to learn, contribute, and be a part of this organization. Please guide me.
>
> With regards,
> Jagan Kaartik
> Amrita University

--
Best Regards,

Patrick McGarry
Director Ceph Community || Red Hat
http://ceph.com || http://community.redhat.com
@scuttlemonkey || @ceph
Re: [ceph-users] cephfs (rbd) read performance low - where is the bottleneck?
Thank you very much for this info.

On 11/21/16 12:33 PM, Eric Eastman wrote:
> Have you looked at your file layout? On a test cluster running 10.2.3 I created a 5GB file and then looked at the layout:
>
> # ls -l test.dat
> -rw-r--r-- 1 root root 524288 Nov 20 23:09 test.dat
> # getfattr -n ceph.file.layout test.dat
> # file: test.dat
> ceph.file.layout="stripe_unit=4194304 stripe_count=1 object_size=4194304 pool=cephfs_data"

The file layout looks the same in my case.

> From what I understand, with this layout you are reading 4MB of data from 1 OSD at a time, so I think you are seeing the overall speed of a single SATA drive. I do not think increasing your MON/MDS links to 10Gb will help, nor, for a single file read, will going to SSD for the metadata.

Really? Does Ceph really wait until each stripe_unit read has finished before starting the next one?

> To test this, you may want to try creating 10 x 50GB files, and then read them in parallel and see if your overall throughput increases.

Scaling through parallelism works as expected, no problem there.

> If so, take a look at the layout parameters and see if you can change the file layout to get more parallelization.
>
> https://github.com/ceph/ceph/blob/master/doc/dev/file-striping.rst
> https://github.com/ceph/ceph/blob/master/doc/cephfs/file-layouts.rst

Interesting. But how would I change this to improve single-threaded read speed? And how would I apply the change to already existing files?

Regards,
Mike

> Regards,
> Eric
>
> On Sun, Nov 20, 2016 at 3:24 AM, Mike Miller wrote:
>> Hi,
>>
>> reading a big file of 50 GB (tried more too)
>>
>>   dd if=bigfile of=/dev/zero bs=4M
>>
>> in a cluster with 112 SATA disks in 10 osd nodes (6272 pgs, replication 3) gives me only about *122 MB/s* read speed in a single thread. Scrubbing was turned off during the measurement.
>>
>> I have been searching for possible bottlenecks. The network is not the problem: the machine running dd is connected to the cluster public network with a 20 GBASE-T bond. osd dual network: cluster public 10 GBASE-T, private 10 GBASE-T.
>>
>> The osd SATA disks are utilized only up to about 10% or 20%, not more than that. CPUs on the osds idle too. CPUs on the mons idle; mds usage is about 1.0 (1 core used on this 6-core machine). mon and mds are connected with only 1 GbE (I would expect some latency from that, but no bandwidth issues; in fact network bandwidth is about 20 Mbit max).
>>
>> If I read a 50 GB file, then clear the cache on the reading machine (but not the osd caches), I get much better read performance of about *620 MB/s*. That seems logical to me, as much (most) of the data is still in the osd cache buffers. But the read performance is still not great, considering that the reading machine is connected to the cluster with a 20 Gbit/s bond. How can I improve this?
>>
>> I am not really sure, but from my understanding 2 possible bottlenecks come to mind:
>>
>> 1) The 1 GbE connection to mon / mds
>> Is this the reason why reads are slow and the osd disks are not hammered by read requests and thus fully utilized?
>>
>> 2) Move metadata to SSD
>> Currently, cephfs_metadata is in the same pool as the data, on the spinning SATA disks. Is this the bottleneck? Is moving the metadata to SSD a solution?
>>
>> Or is it both? Your experience and insight are highly appreciated.
>>
>> Thanks,
>> Mike
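For reference, changing a layout in practice looks roughly like the following. File layouts can only be set on new, empty files (or inherited from a directory), so an existing file has to be rewritten to pick up a new layout; the directory name and values here are illustrative only.

```shell
# Sketch: give new files in a directory a wider stripe, then rewrite
# an existing file so it inherits that layout.
mkdir striped
setfattr -n ceph.dir.layout.stripe_unit  -v 1048576 striped   # 1 MB stripe unit
setfattr -n ceph.dir.layout.stripe_count -v 8       striped   # spread each stripe across 8 objects

cp bigfile striped/bigfile            # the rewritten copy gets the new layout
getfattr -n ceph.file.layout striped/bigfile   # verify
```

Whether this helps a single-threaded `dd` depends on how much readahead the client issues across the stripes; raising client readahead alongside the stripe count is usually part of the same experiment.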
Re: [ceph-users] Contribution to CEPH
Also, feel free to ask development-related questions in the #ceph-devel channel on OFTC.

On Wed, Nov 23, 2016 at 2:30 AM, Patrick McGarry wrote:
> Hey Jagan,
>
> I'm happy to hear you are interested in contributing to Ceph. I would suggest taking a look at the tracker (http://tracker.ceph.com/) for bugs and projects you might be interested in tackling.
> [...]

--
Cheers,
Brad
Re: [ceph-users] deep-scrubbing has large impact on performance
If you use wpq, I recommend also setting "osd_op_queue_cut_off = high" as well, otherwise replication OPs are not weighted and really reduces the benefit of wpq. Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Tue, Nov 22, 2016 at 5:34 AM, Eugen Block wrote: > Thank you! > > > Zitat von Nick Fisk : > >>> -Original Message- >>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of >>> Eugen Block >>> Sent: 22 November 2016 10:11 >>> To: Nick Fisk >>> Cc: ceph-users@lists.ceph.com >>> Subject: Re: [ceph-users] deep-scrubbing has large impact on performance >>> >>> Thanks for the very quick answer! >>> >>> > If you are using Jewel >>> >>> We are still using Hammer (0.94.7), we wanted to upgrade to Jewel in a >>> couple of weeks, would you recommend to do it now? >> >> >> It's been fairly solid for me, but you might want to wait for the >> scrubbing hang bug to be fixed before upgrading. I think this >> might be fixed in the upcoming 10.2.4 release. >> >>> >>> >>> Zitat von Nick Fisk : >>> >>> >> -Original Message- >>> >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf >>> >> Of Eugen Block >>> >> Sent: 22 November 2016 09:55 >>> >> To: ceph-users@lists.ceph.com >>> >> Subject: [ceph-users] deep-scrubbing has large impact on performance >>> >> >>> >> Hi list, >>> >> >>> >> I've been searching the mail archive and the web for some help. I >>> >> tried the things I found, but I can't see the effects. We use >>> > Ceph for >>> >> our Openstack environment. >>> >> >>> >> When our cluster (2 pools, each 4092 PGs, in 20 OSDs on 4 nodes, 3 >>> >> MONs) starts deep-scrubbing, it's impossible to work with the VMs. >>> >> Currently, the deep-scrubs happen to start on Monday, which is >>> >> unfortunate. I already plan to start the next deep-scrub on >>> > Saturday, >>> >> so it has no impact on our work days. 
But if I imagine we had a large >>> >> multi-datacenter, such performance breaks are not >>> > reasonable. So >>> >> I'm wondering how do you guys manage that? >>> >> >>> >> What I've tried so far: >>> >> >>> >> ceph tell osd.* injectargs '--osd_scrub_sleep 0.1' >>> >> ceph tell osd.* injectargs '--osd_disk_thread_ioprio_priority 7' >>> >> ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle' >>> >> ceph tell osd.* injectargs '--osd_scrub_begin_hour 0' >>> >> ceph tell osd.* injectargs '--osd_scrub_end_hour 7' >>> >> >>> >> And I also added these options to the ceph.conf. >>> >> To be able to work again, I had to set the nodeep-scrub option and >>> >> unset it when I left the office. Today, I see the cluster deep- >>> >> scrubbing again, but only one PG at a time, it seems that now the >>> >> default for osd_max_scrubs is working now and I don't see major >>> >> impacts yet. >>> >> >>> >> But is there something else I can do to reduce the performance impact? >>> > >>> > If you are using Jewel, the scrubing is now done in the client IO >>> > thread, so those disk thread options won't do anything. Instead there >>> > is a new priority setting, which seems to work for me, along with a >>> > few other settings. >>> > >>> > osd_scrub_priority = 1 >>> > osd_scrub_sleep = .1 >>> > osd_scrub_chunk_min = 1 >>> > osd_scrub_chunk_max = 5 >>> > osd_scrub_load_threshold = 5 >>> > >>> > Also enabling the weighted priority queue can assist the new priority >>> > options >>> > >>> > osd_op_queue = wpq >>> > >>> > >>> >> I just found [1] and will have a look into it. >>> >> >>> >> [1] http://prob6.com/en/ceph-pg-deep-scrub-cron/ >>> >> >>> >> Thanks! 
>>> >> Eugen >>> >> >>> >> -- >>> >> Eugen Block voice : +49-40-559 51 75 >>> >> NDE Netzdesign und -entwicklung AG fax : +49-40-559 51 77 >>> >> Postfach 61 03 15 >>> >> D-22423 Hamburg e-mail : ebl...@nde.ag >>> >> >>> >> Vorsitzende des Aufsichtsrates: Angelika Mozdzen >>> >>Sitz und Registergericht: Hamburg, HRB 90934 >>> >>Vorstand: Jens-U. Mozdzen >>> >> USt-IdNr. DE 814 013 983 >>> >> >>> >> ___ >>> >> ceph-users mailing list >>> >> ceph-users@lists.ceph.com >>> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >>> >>> >>> >>> -- >>> Eugen Block voice : +49-40-559 51 75 >>> NDE Netzdesign und -entwicklung AG fax : +49-40-559 51 77 >>> Postfach 61 03 15 >>> D-22423 Hamburg e-mail : ebl...@nde.ag >>> >>> Vorsitzende des Aufsichtsrates: Angelika Mozdzen >>>Sitz und Registergericht: Hamburg, HRB 90934 >>>Vorstand: Jens-U. Mozdzen >>> USt-IdNr. DE 814 013 983 >>> >>> ___ >>> ceph-users mailing list >>> ceph-users@lists.ceph.com >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > > > > -- > Eugen Block
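Summarizing the settings discussed in this thread, a ceph.conf sketch for Jewel — the option names are the ones Nick and Robert mention above, and the values are the ones Nick reported working for him, so treat them as a starting point to tune rather than a recommendation (note that changing osd_op_queue requires an OSD restart, it cannot be injected at runtime):

```ini
[osd]
# Weighted priority queue, so scrub OPs can be deprioritized vs. client I/O
osd_op_queue = wpq
# Also weight replication OPs; without this, wpq loses much of its benefit
osd_op_queue_cut_off = high

# Scrub tuning (Jewel: scrubbing runs in the client I/O thread,
# so the old osd_disk_thread_ioprio_* options no longer apply)
osd_scrub_priority = 1
osd_scrub_sleep = 0.1
osd_scrub_chunk_min = 1
osd_scrub_chunk_max = 5
osd_scrub_load_threshold = 5

# Optionally confine scrubs to off-hours
osd_scrub_begin_hour = 0
osd_scrub_end_hour = 7
```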
[ceph-users] osd set noin ignored for old OSD ids
Hi, As part of a migration between hardware I have been building new OSDs and cleaning up old ones (osd rm osd.x, osd crush rm osd.x, auth del osd.x). To try and prevent rebalancing from kicking in until all the new OSDs are created on a host, I use "ceph osd set noin". However, what I have seen is that if the newly created OSD uses a new unique ID, the flag is honoured and the OSD remains out until I bring it in. But if the OSD re-uses a previous OSD id, it goes straight to "in" and starts backfilling, and I have to manually out the OSD to stop it (or set nobackfill,norebalance). Am I doing something wrong in this process, or is there something about "noin" that is ignored for previously existing OSDs that have been removed from both the OSD map and crush map? Cheers, Adrian Confidentiality: This email and any attachments are confidential and may be subject to copyright, legal or some other professional privilege. They are intended solely for the attention and use of the named addressee(s). They may only be copied, distributed or disclosed with the consent of the copyright owner. If you have received this email by mistake or by breach of the confidentiality clause, please notify the sender immediately by return email and delete or destroy all copies of the email. Any confidentiality, privilege or copyright is not waived or lost because this email has been sent to you by mistake. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
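For what it's worth, a defensive sketch of the rebuild procedure — since "noin" alone doesn't appear to cover re-used ids, the data-movement flags are set as a belt-and-braces guard (the OSD id below is made up):

```shell
# Guard against re-used OSD ids going straight to "in":
# block data movement until every new OSD on the host is created.
ceph osd set noin
ceph osd set nobackfill
ceph osd set norebalance

# ... recreate the OSDs here (ceph-disk prepare/activate, etc.) ...

# If a re-used id still comes up "in", push it back out manually:
ceph osd out 12            # 12 is a hypothetical re-used id

# Once all OSDs are ready, let backfill proceed in one go:
ceph osd unset norebalance
ceph osd unset nobackfill
ceph osd unset noin
```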
Re: [ceph-users] export-diff behavior if an initial snapshot is NOT specified
Thanks Jason, very clear explanation. However, I found some strange behavior when running export-diff on a cloned image; I am not sure whether it is a bug in calc_snap_set_diff(). The test is: image A is cloned from a parent image, then snap1 is created for image A. The content of export-diff A@snap1 changes when I update image A. Only after image A no longer overlaps with its parent does the content of export-diff A@snap1 become stable, which is then almost zero. I don't think this is the designed behavior; export-diff A@snap1 should always produce stable output, no matter whether image A is cloned or not. Please correct me if anything is wrong. Thanks, Zhongyan On Tue, Nov 22, 2016 at 10:31 PM, Jason Dillaman wrote: > On Tue, Nov 22, 2016 at 5:31 AM, Zhongyan Gu > wrote: > > So if initial snapshot is NOT specified, then: > > rbd export-diff image@snap1 will diff all data to snap1. this cmd > equals to > > : > > rbd export image@snap1. Is my understand right or not?? > > > While they will both export all data associated w/ image@snap1, the > "export" command will generate a raw, non-sparse dump of the full > image whereas "export-diff" will export only sections of the image > that contain data. The file generated from "export" can be used with > the "import" command to create a new image, whereas the file generated > from "export-diff" can only be used with "import-diff" against an > existing image. > > -- > Jason > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
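A sketch of the steps that should reproduce the behavior described above (image names are made up; the parent must be a format-2 image for cloning):

```shell
# clone an image and snapshot the clone
rbd snap create parent@base
rbd snap protect parent@base
rbd clone parent@base childA
rbd snap create childA@snap1

# export the snapshot's diff, write to the clone's head, export again
rbd export-diff childA@snap1 diff1.out
# ... write some data into childA here ...
rbd export-diff childA@snap1 diff2.out

# the two files differ, although snap1 itself was never modified
cmp diff1.out diff2.out
```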
[ceph-users] Ceph strange issue after adding a cache OSD.
Hello, The story goes like this. I added another 3 drives to the caching layer. The OSDs were added to the crush map one by one, after each successful rebalance. When I added the last OSD and went away for about an hour, I noticed it still had not finished rebalancing. Further investigation showed me that one of the older cache SSD OSDs was restarting like crazy before fully booting. So I shut it down and waited for a rebalance without that OSD. Less than an hour later I had another 2 OSDs restarting like crazy. I tried running scrubs on the PGs the logs asked me to, but that did not help. I'm currently stuck with "8 scrub errors" and a completely dead cluster. log_channel(cluster) log [WRN] : pg 15.8d has invalid (post-split) stats; must scrub before tier agent can activate I need help to stop the OSDs from crashing. Crash log: 0> 2016-11-23 06:41:43.365602 7f935b4eb700 -1 osd/ReplicatedPG.cc: In function 'void ReplicatedPG::hit_set_trim(ReplicatedPG::RepGather*, unsigned int)' thread 7f935b4eb700 time 2016-11-23 06:41:43.363067 osd/ReplicatedPG.cc: 10521: FAILED assert(obc) ceph version 0.94.9 (fe6d859066244b97b24f09d46552afc2071e6f90) 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0xbde2c5] 2: (ReplicatedPG::hit_set_trim(ReplicatedPG::RepGather*, unsigned int)+0x75f) [0x87e89f] 3: (ReplicatedPG::hit_set_persist()+0xedb) [0x87f8bb] 4: (ReplicatedPG::do_op(std::tr1::shared_ptr&)+0xe3a) [0x8a11aa] 5: (ReplicatedPG::do_request(std::tr1::shared_ptr&, ThreadPool::TPHandle&)+0x68a) [0x83c37a] 6: (OSD::dequeue_op(boost::intrusive_ptr, std::tr1::shared_ptr, ThreadPool::TPHandle&)+0x405) [0x69af05] 7: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x333) [0x69b473] 8: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x86f) [0xbcd9cf] 9: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xbcfb00] 10: (()+0x7dc5) [0x7f93b9df4dc5] 11: (clone()+0x6d) [0x7f93b88d5ced] NOTE: a copy of the executable, or `objdump -rdS ` is needed
to interpret this. I have tried looking with full debug enabled, but those logs didn't help me much. I have tried to evict the cache layer, but some objects are stuck and can't be removed. Any suggestions would be greatly appreciated. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
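Since the assert fires in hit_set_trim()/hit_set_persist(), one thing that may be worth examining (hedged — I'm not certain it applies to this exact assert) is the hit-set configuration of the cache pool, alongside the scrub the cluster log is asking for. A sketch, with a made-up pool name:

```shell
# Inspect the hit-set configuration of the cache pool
ceph osd pool get hot-cache hit_set_count
ceph osd pool get hot-cache hit_set_period
ceph osd pool get hot-cache hit_set_type

# Run the scrub the log demands for the flagged PG, then attempt repair
ceph pg deep-scrub 15.8d
ceph pg repair 15.8d
```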
Re: [ceph-users] cephfs (rbd) read performance low - where is the bottleneck?
Hi, I did some testing with multithreaded access and dd; performance scales as it should. Any ideas to improve single-threaded read performance further would be highly appreciated. Some of our use cases require reading large files with a single thread. I have also tried changing the readahead on the kernel client cephfs mount, via rsize and rasize: mount.ceph ... -o name=cephfs,secretfile=secret.key,rsize=67108864 Doing this on kernel 4.5.2 gives the error message "ceph: Unknown mount option rsize", or unknown rasize. Can someone explain to me how I can experiment with readahead on cephfs? Mike On 11/21/16 12:33 PM, Eric Eastman wrote: Have you looked at your file layout? On a test cluster running 10.2.3 I created a 5GB file and then looked at the layout: # ls -l test.dat -rw-r--r-- 1 root root 524288 Nov 20 23:09 test.dat # getfattr -n ceph.file.layout test.dat # file: test.dat ceph.file.layout="stripe_unit=4194304 stripe_count=1 object_size=4194304 pool=cephfs_data" From what I understand, with this layout you are reading 4MB of data from 1 OSD at a time, so I think you are seeing the overall speed of a single SATA drive. I do not think increasing your MON/MDS links to 10Gb will help, nor, for a single-file read, will going to SSD for the metadata. To test this, you may want to try creating 10 x 50GB files, and then read them in parallel and see if your overall throughput increases. If so, take a look at the layout parameters and see if you can change the file layout to get more parallelization. https://github.com/ceph/ceph/blob/master/doc/dev/file-striping.rst https://github.com/ceph/ceph/blob/master/doc/cephfs/file-layouts.rst Regards, Eric On Sun, Nov 20, 2016 at 3:24 AM, Mike Miller wrote: Hi, reading a big file of 50 GB (tried more too) with dd if=bigfile of=/dev/zero bs=4M in a cluster with 112 SATA disks in 10 osd nodes (6272 pgs, replication 3) gives me only about *122 MB/s* read speed in a single thread. Scrubbing was turned off during the measurement. 
I have been searching for possible bottlenecks. The network is not the problem, the machine running dd is connected to the cluster public network with a 20 GBASE-T bond. osd dual network: cluster public 10 GBASE-T, private 10 GBASE-T. The osd SATA disks are utilized only up until about 10% or 20%, not more than that. CPUs on osd idle too. CPUs on mon idle, mds usage about 1.0 (1 core is used on this 6-core machine). mon and mds connected with only 1 GbE (I would expect some latency from that, but no bandwidth issues; in fact network bandwidth is about 20 Mbit max). If I read a file with 50 GB, then clear the cache on the reading machine (but not the osd caches), I get much better reading performance of about *620 MB/s*. That seems logical to me as much (most) of the data is still in the osd cache buffers. But still the read performance is not super considered that the reading machine is connected to the cluster with a 20 Gbit/s bond. How can I improve? I am not really sure, but from my understanding 2 possible bottlenecks come to mind: 1) 1 GbE connection to mon / mds Is this the reason why reads are slow and osd disks are not hammered by read requests and therewith fully utilized? 2) Move metadata to SSD Currently, cephfs_metadata is on the same pool as the data on the spinning SATA disks. Is this the bottleneck? Is the move of metadata to SSD a solution? Or is it both? Your experience and insight are highly appreciated. Thanks, Mike ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
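To make the layout explanation above concrete: with the default stripe_count=1 and object_size=4194304, each byte offset of a file lives in exactly one 4 MiB RADOS object, so a single-threaded sequential read only ever keeps one OSD disk busy at a time. A small sketch of that mapping (plain shell arithmetic, no cluster needed):

```shell
# Default cephfs layout: stripe_count=1, object_size=4 MiB
object_size=$((4 * 1024 * 1024))

# Which object serves byte offset 130 MiB of a file?
offset=$((130 * 1024 * 1024))
echo "object index: $(( offset / object_size ))"   # → object index: 32

# A 50 GiB file therefore spans this many sequentially read objects:
filesize=$((50 * 1024 * 1024 * 1024))
echo "objects: $(( filesize / object_size ))"      # → objects: 12800
```

Reading those 12800 objects one after another is exactly the "speed of a single SATA drive" effect Eric describes; the parallel-read test with 10 files should show whether more concurrency recovers the bandwidth.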
[ceph-users] KVM / Ceph performance problems
Hi, I have a little performance problem with KVM and Ceph. I'm using Proxmox 4.3-10/7230e60f, with KVM version pve-qemu-kvm_2.7.0-8. Ceph is on version jewel 10.2.3, on both the cluster and the client (ceph-common). The systems are connected to the network via a 4x bond with a total of 4 Gb/s. Within a guest, - when I do a write I get about 10 MB/s. - Also, when I try to do a write within the guest but directly to ceph, I get the same speed. - But when I mount a ceph object on the Proxmox host I get about 110 MB/s. The guest is connected to interface vmbr160 → bond0.160 → bond0. This bridge vmbr160 has an IP address in the same subnet as the ceph cluster, with an MTU of 9000. The KVM block device is a virtio device. What can I do to solve this problem? Kind regards, Michiel Piscaer ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] KVM / Ceph performance problems
I am afraid the most probable cause is context switching time related to your guest (or guests). On Wed, Nov 23, 2016 at 9:53 AM, M. Piscaer wrote: > Hi, > > I have an little performance problem with KVM and Ceph. > > I'm using Proxmox 4.3-10/7230e60f, with KVM version > pve-qemu-kvm_2.7.0-8. Ceph is on version jewel 10.2.3 on both the > cluster as the client (ceph-common). > > The systems are connected to the network via an 4x bonding with an total > of 4 Gb/s. > > Within an guest, > - when I do an write to I get about 10 MB/s. > - Also when I try to do an write within the guest but then directly to > ceph I get the same speed. > - But when I mount an ceph object on the Proxmox host I get about 110MB/s > > The guest is connected to interface vmbr160 → bond0.160 → bond0. > > This bridge vmbr160 has an IP address with the same subnet as the ceph > cluster with an mtu 9000. > > The KVM block device is an virtio device. > > What can I do to solve this problem? > > Kind regards, > > Michiel Piscaer > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Andrey Y Shevel ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] KVM / Ceph performance problems
Hi Michiel, How are you configuring VM disks on Proxmox? What type (virtio, scsi, ide) and what cache setting? El 23/11/16 a las 07:53, M. Piscaer escribió: Hi, I have an little performance problem with KVM and Ceph. I'm using Proxmox 4.3-10/7230e60f, with KVM version pve-qemu-kvm_2.7.0-8. Ceph is on version jewel 10.2.3 on both the cluster as the client (ceph-common). The systems are connected to the network via an 4x bonding with an total of 4 Gb/s. Within an guest, - when I do an write to I get about 10 MB/s. - Also when I try to do an write within the guest but then directly to ceph I get the same speed. - But when I mount an ceph object on the Proxmox host I get about 110MB/s The guest is connected to interface vmbr160 → bond0.160 → bond0. This bridge vmbr160 has an IP address with the same subnet as the ceph cluster with an mtu 9000. The KVM block device is an virtio device. What can I do to solve this problem? Kind regards, Michiel Piscaer ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Zuzendari Teknikoa / Director Técnico Binovo IT Human Project, S.L. Telf. 943493611 943324914 Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa) www.binovo.es ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
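Following up on the cache-setting question: if the disk turns out to be configured with cache=none, switching the virtio disk to writeback caching is a common first experiment for this write-speed pattern. A hedged sketch — the VM id and storage/volume names below are made up, adjust to your setup:

```shell
# Show the current disk line of the guest (hypothetical VM id 100)
qm config 100 | grep virtio0

# Switch the virtio disk to writeback caching
# (storage and volume names are placeholders)
qm set 100 --virtio0 ceph-storage:vm-100-disk-1,cache=writeback
```

With RBD-backed disks, writeback lets librbd coalesce the guest's small writes instead of forcing each one through synchronously, which is often the difference between ~10 MB/s in the guest and near-wire speed on the host.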