Re: [ceph-users] Ceph performance laggy (requests blocked > 32) on OpenStack
Hi Kévin,

I am currently having a similar issue. In my environment I have around 16 Linux VMs (VMware), more or less equally loaded, accessing a 1 PB Ceph Hammer cluster (40 nodes, 800 OSDs) through RBD. Very often we see IO freezes on the VMs' XFS filesystems, and we also continuously have slow requests on the OSDs (up to 10-20 minutes sometimes). In my case the slow requests / blocked ops occur because the primary OSD is waiting for subops, i.e. waiting for replication to complete on the secondary OSDs. Not all of the VMs are blocked at the same time. I still have no explanation, root cause, nor workaround. Will keep you informed if I find something...

Original message from Kevin Olbrich <k...@sv01.de>, 11/25/16 19:19 (GMT+05:30), to ceph-users@lists.ceph.com, subject "[ceph-users] Ceph performance laggy (requests blocked > 32) on OpenStack", quoted in full below.
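For anyone hitting the same symptom, a few standard Ceph commands can help confirm the "primary waiting for subops" theory. This is only a sketch: osd.12 below is a placeholder ID, and the admin-socket command assumes you run it on the node hosting that OSD (defaults for Hammer/Jewel).

```shell
# List current slow/blocked requests and the OSDs they involve.
ceph health detail

# Per-OSD commit/apply latency; a secondary OSD with high latency
# here will stall replication on every primary that talks to it.
ceph osd perf

# On the node hosting a suspect OSD (osd.12 is a placeholder),
# dump recently completed slow ops and look for long
# "waiting for subops from N" intervals in the event timeline.
ceph daemon osd.12 dump_historic_ops
```

If the historic ops consistently show the same secondary OSDs in the "waiting for subops" step, that points at those disks or their network path rather than at the primary.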
___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Ceph performance laggy (requests blocked > 32) on OpenStack
If I use slow HDDs, I can get the same outcome. Placing journals on fast SAS or NVMe SSDs will make a difference; SATA SSDs are much slower. Instead of guessing why Ceph is lagging, have you looked at ceph -w and at iostat and vmstat reports during your tests? iostat will show the HDD and SSD stats (I use "iostat -tzxm 5" to show only active disks). If they are dedicated LUNs, look at %util and the service times. In vmstat, check the 'b' column, which shows how many processes are blocked waiting on IO.

> On Nov 25, 2016, at 8:48 AM, Kevin Olbrich wrote:
> [...]

Rick Stehno
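To make the monitoring suggestion concrete, here is a minimal sketch of the two loops (the 5-second interval is just an example; column meanings are from the sysstat/procps man pages):

```shell
# Extended per-device stats every 5 s; -t adds timestamps, -z skips
# idle devices, -x shows extended columns, -m reports MB/s.
# Watch %util (device saturation) and await (avg request latency, ms)
# for both the journal SSDs and the data HDDs.
iostat -tzxm 5

# System-wide view every 5 s: the 'b' column counts processes in
# uninterruptible sleep, i.e. blocked waiting on IO.
vmstat 5
```

A data HDD pinned near 100 %util with rising await, while the journal SSD stays mostly idle, is the classic signature of spindles being the bottleneck rather than the journals.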
[ceph-users] Ceph performance laggy (requests blocked > 32) on OpenStack
Hi,

we are running 80 VMs using KVM in OpenStack via RBD in Ceph Jewel, on a total of 53 disks (RAID parity already excluded). Our nodes use Intel P3700 DC SSDs for journaling.

Most VMs are Linux based and load is low to medium. There are also about 10 VMs running Windows 2012R2, two of them running remote services (terminal).

My question is: are 80 VMs hosted on 53 disks (mostly 7.2k SATA) too much? We sometimes experience lags where nearly all servers suffer from "blocked IO > 32 seconds".

What are your experiences?

Mit freundlichen Grüßen / best regards,
Kevin Olbrich.
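As a rough sanity check on the "80 VMs on 53 disks" question, here is a back-of-envelope sketch. The per-disk IOPS figure and the replication factor are assumptions, not numbers from the thread; adjust them to the actual pool configuration.

```python
# Back-of-envelope: can 53 spinning disks feed 80 VMs?
# Assumptions (not from the thread): ~75 random IOPS per 7.2k SATA
# drive, 3x replication (the Jewel default pool size), and SSD
# journals absorbing the journal write so each client write costs
# roughly one backend IO per replica.

DISKS = 53
IOPS_PER_DISK = 75        # typical 7.2k SATA random-IO figure
REPLICATION = 3           # assumed pool size
VMS = 80

raw_iops = DISKS * IOPS_PER_DISK
write_iops = raw_iops // REPLICATION   # every client write hits 3 OSDs
per_vm_write_iops = write_iops / VMS

print(raw_iops)                        # 3975 raw random IOPS
print(write_iops)                      # 1325 client write IOPS
print(round(per_vm_write_iops, 1))     # 16.6 write IOPS per VM
```

Under these assumptions each VM gets fewer than 20 sustained random write IOPS before the spindles saturate, which is consistent with cluster-wide stalls whenever a few VMs burst at once.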