[ceph-users] Remove volume
Hi All,

I have ceph installed on Ubuntu nodes. At present I have the following volumes in the volumes pool:

root@compute:/home/oss# rbd -p volumes ls
volume-53f0dc29-956f-48f1-8db1-b0f9c1b0e9f1
volume-55abf0d4-01a7-41d0-9e7e-407ad0db213c
volume-a73d1bd0-2937-41c4-bbca-2545454eefac
volume-bd45af55-489f-4d09-bc14-33229c1e3096
volume-cb11564f-7550-4e23-8197-4f8af09e506c
volume-f3f67d69-8ac3-41a9-8001-4a2b512af933

What is the command to delete the above volumes?

Thanks
Kumar
Re: [ceph-users] Remove volume
Hi

rbd -p <poolname> rm <imagename>

e.g. rbd -p volumes rm volume-55abf0d4-01a7-41d0-9e7e-407ad0db213c

http://ceph.com/docs

JC
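To delete all of the listed volumes in one go, a simple loop over the pool listing works (a sketch; it assumes none of the images have snapshots or active watchers, in which case the rm will fail as discussed further down the thread):

    # remove every image in the "volumes" pool
    for img in $(rbd -p volumes ls); do
        rbd -p volumes rm "$img"
    done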
Re: [ceph-users] Remove volume
Hi Jean,

I have deleted all images from openstack. Glance list has no images.

root@compute:/home/oss# rbd -p images snap ls 0c605116-0634-4aed-9b3f-12d9483cd38a
SNAPID NAME     SIZE
     2 snap 12839 kB
root@compute:/home/oss# rbd -p images snap purge 0c605116-0634-4aed-9b3f-12d9483cd38a
Removing all snapshots: 0% complete...failed.
rbd: removing snaps failed: (16) Device or resource busy
2014-03-13 04:38:53.425980 7fb824b6f780 -1 librbd: removing snapshot from header failed: (16) Device or resource busy

How to delete the images? Does restarting the ceph services help?

Thanks
Kumar

From: Jean-Charles Lopez [mailto:jc.lo...@inktank.com]
Sent: Thursday, March 13, 2014 12:44 PM
To: Gnan Kumar, Yalla
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Remove volume

Hi

Probably because you have snapshots mapped somewhere on a node, or because you have cloned images from protected snapshots to deploy VMs. Either case will prevent the deletion of the snapshots.

JC

On Thursday, March 13, 2014, yalla.gnan.ku...@accenture.com wrote:

> Hi Jean,
>
> Thanks a lot. I have the following images in the 'images' pool:
>
> root@compute:/home/oss# rbd -p images ls
> 0c605116-0634-4aed-9b3f-12d9483cd38a
> 9f1a5bdc-3450-4934-99b3-d1b834ad9592
> b16aac0e-621f-4f36-a027-39c86d28011f
>
> When I try deleting them I get this error:
>
> root@compute:/home/oss# rbd -p images rm 0c605116-0634-4aed-9b3f-12d9483cd38a
> 2014-03-13 04:28:19.455049 7f751275e780 -1 librbd: image has snapshots - not removing
> Removing image: 0% complete...failed.
> rbd: image has snapshots - these must be deleted with 'rbd snap purge' before the image can be removed.
>
> When I try deleting the snaps, I get the following error:
>
> root@compute:/home/oss# rbd -p images snap purge 0c605116-0634-4aed-9b3f-12d9483cd38a
> Removing all snapshots: 0% complete...failed.
> rbd: removing snaps failed: (16) Device or resource busy
> 2014-03-13 04:31:39.512199 7f1ac4134780 -1 librbd: removing snapshot from header failed: (16) Device or resource busy
>
> How to delete the images?
>
> Thanks
> Kumar
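For what it's worth, the usual cause of the "(16) Device or resource busy" failure above is a protected snapshot: Glance protects the snapshot it creates so that COW clones (the VM disks) can hang off it, and 'rbd snap purge' refuses to touch protected snaps. A sketch of the recovery, assuming no clones still depend on the snapshot (the snapshot name 'snap' is taken from the listing above):

    rbd -p images children 0c605116-0634-4aed-9b3f-12d9483cd38a@snap   # must list nothing
    rbd -p images snap unprotect 0c605116-0634-4aed-9b3f-12d9483cd38a@snap
    rbd -p images snap purge 0c605116-0634-4aed-9b3f-12d9483cd38a
    rbd -p images rm 0c605116-0634-4aed-9b3f-12d9483cd38a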
Re: [ceph-users] OSD down after PG increase
On 03/13/2014 02:08 AM, Gandalf Corvotempesta wrote:

> I've increased the PG number on a running cluster. After this operation, all OSDs from one node were marked as down. Now, after a while, I'm seeing that the OSDs are slowly coming up again (sequentially) after rebalancing. Is this an expected behaviour?

Hello Gandalf,

Yes, if you have an essentially high amount of committed data in the cluster and/or a large number of PGs (tens of thousands). If you have room to experiment with this transition from scratch, you may want to play with the numbers in the OSD queues, since they cause deadlock-like behaviour on operations like increasing the PG count or deleting a large pool. If the cluster has no I/O at all at the moment, such behaviour is definitely not expected.
[ceph-users] help .--Why the PGS is STUCK UNCLEAN?
Dear Sir or Madam,

I have come across a problem. Please help me handle it if you are available. Thanks in advance.

The question is that I cannot understand why the status of the PGs is always STUCK UNCLEAN. As I see it, the status should be ACTIVE+CLEAN. The following is some information about the PGs. I am not sure whether the information is enough; if not, please let me know.

Yours sincerely,
Michael

[root@storage1 ~]# ceph -s
    cluster 3429fd17-4a92-4d3b-a7fa-04adedb0da82
     health HEALTH_WARN 69 pgs degraded; 192 pgs stuck unclean; recovery 366/2000 objects degraded (18.300%)
     monmap e1: 1 mons at {storage1=193.168.1.100:6789/0}, election epoch 1, quorum 0 storage1
     osdmap e125: 8 osds: 8 up, 8 in
      pgmap v315: 192 pgs, 3 pools, 1 bytes data, 1000 objects
            42555 MB used, 13883 GB / 14670 GB avail
            366/2000 objects degraded (18.300%)
                 123 active+remapped
                  69 active+degraded

[root@storage1 ~]# ceph osd tree
# id    weight  type name       up/down reweight
-1      8       root default
-2      8               host storage1
0       1                       osd.0   up      1
1       1                       osd.1   up      1
2       1                       osd.2   up      1
3       1                       osd.3   up      1
4       1                       osd.4   up      1
5       1                       osd.5   up      1
6       1                       osd.6   up      1
7       1                       osd.7   up      1
[ceph-users] can't get radosgw with apache work, 403 and 405 error
Hello,

I'm trying to set up radosgw with apache fastcgi on a CentOS 6.5 64-bit system. Unfortunately I can't get it fully working; some operations fail:

s3cmd ls                        succeeds
s3cmd ls s3://bucket-name       returns 403 AccessDenied
s3cmd mb s3://bucket-name       returns 405 MethodNotAllowed

Here is my apache config: http://pastebin.com/BWwQxVkD
Here is the log of s3cmd: http://pastebin.com/DRtQjtvP

ceph version is 0.72.2
ceph-radosgw version is 0.72.2

Please help me, thanks.

---
zhongku
613038...@qq.com
Re: [ceph-users] help .--Why the PGS is STUCK UNCLEAN?
> The question is that I cannot understand why the status of the PGs is always STUCK UNCLEAN. As I see it, the status should be ACTIVE+CLEAN.

It looks like you have one physical node. If you have a pool with a replication count of 2 (the default), I think it will try to spread the data across 2 failure domains by default, and my guess is the default crush map treats a node as a single failure domain. So, edit the crushmap to allow this, or add a second node.

Cheers,
Robert van Leeuwen
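A sketch of the crushmap round trip Robert describes (file names are arbitrary); changing the rule's failure domain from "host" to "osd" lets replicas land on different OSDs of the same node:

    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt
    # in the rules section, change "step chooseleaf firstn 0 type host"
    # to                           "step chooseleaf firstn 0 type osd"
    crushtool -c crushmap.txt -o crushmap-new.bin
    ceph osd setcrushmap -i crushmap-new.bin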
Re: [ceph-users] help .--Why the PGS is STUCK UNCLEAN?
Hi,

Anyway, it's not recommended to run on a single node, but if you are and you must (maybe for testing purposes) you can include:

osd crush chooseleaf type = 0

in the global section of ceph.conf and restart all ceph services, to get all pgs into the active+clean state.

Thanks and Regards
Ashish Chandra
Cloud Engineer, Reliance Jio
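Spelled out, that is the following snippet; note that, as far as I know, this option shapes the crushmap generated at cluster creation, so on an already-deployed cluster the crushmap edit shown above is the surer route:

    [global]
    osd crush chooseleaf type = 0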
Re: [ceph-users] OSD down after PG increase
2014-03-13 9:02 GMT+01:00 Andrey Korolyov and...@xdel.ru:

> Yes, if you have an essentially high amount of committed data in the cluster and/or a large number of PGs (tens of thousands).

I've increased from 64 to 8192 PGs.

> If you have room to experiment with this transition from scratch, you may want to play with the numbers in the OSD queues, since they cause deadlock-like behaviour on operations like increasing the PG count or deleting a large pool. If the cluster has no I/O at all at the moment, such behaviour is definitely not expected.

My cluster was totally idle. It's a test with the ceph-ansible repository and nobody was using it.
Re: [ceph-users] OSD down after PG increase
We have observed a very similar behavior. In a 140 OSD cluster (newly created and idle) ~8000 PGs are available. After adding two new pools (each with 2 PGs), 100 out of 140 OSDs go down + out. The cluster never recovers. This problem can be reproduced every time with v0.67 and v0.72. With v0.61 this problem does not show up.

-Dieter

On Thu, Mar 13, 2014 at 10:46:05AM +0100, Gandalf Corvotempesta wrote:
> I've increased from 64 to 8192 PGs. [...] My cluster was totally idle. It's a test with the ceph-ansible repository and nobody was using it.
Re: [ceph-users] OSD down after PG increase
On 13 Mar 2014 at 10:46:13, Gandalf Corvotempesta (gandalf.corvotempe...@gmail.com) wrote:

>> Yes, if you have an essentially high amount of committed data in the cluster and/or a large number of PGs (tens of thousands).
>
> I've increased from 64 to 8192 PGs.

Do you mean you used PG splitting? You should split PGs by a factor of 2x at a time. So to get from 64 to 8192, do 64 -> 128, then 128 -> 256, ..., then 4096 -> 8192. Splitting is costly, so it should be done with caution, even if you don't have much data in the pools.

Cheers,
Dan
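The stepwise split Dan describes would look something like the loop below (a sketch; note that pgp_num has to be raised alongside pg_num for the data to actually rebalance, and each doubling should be allowed to settle):

    for n in 128 256 512 1024 2048 4096 8192; do
        ceph osd pool set data pg_num $n
        ceph osd pool set data pgp_num $n
        # wait for the cluster to settle before the next doubling
        while ! ceph health | grep -q HEALTH_OK; do sleep 60; done
    done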
Re: [ceph-users] OSD down after PG increase
2014-03-13 11:19 GMT+01:00 Dan Van Der Ster daniel.vanders...@cern.ch:

> Do you mean you used PG splitting? You should split PGs by a factor of 2x at a time. So to get from 64 to 8192, do 64 -> 128, then 128 -> 256, ..., then 4096 -> 8192.

I've brutally increased it in one jump, no intermediate steps: 64 -> 8192 :-)
Re: [ceph-users] OSD down after PG increase
On 13 Mar 2014 at 11:26:55, Gandalf Corvotempesta (gandalf.corvotempe...@gmail.com) wrote:

> I'm also unsure if 8192 PGs are correct for my cluster. At maximum I'll have 168 OSDs (14 servers, 12 disks each, 1 osd per disk), with replica set to 3, so: (168*100)/3 = 5600. Rounded to the next power of 2: 8192.

That's a bit high for my taste: you'll average 146 PGs per OSD assuming a uniform distribution. Though it is probably OK. Do you have any other pools? Remember that you need to include _all_ pools in the PG calculation, not just a single pool.

-- Dan
Re: [ceph-users] OSD down after PG increase
2014-03-13 11:32 GMT+01:00 Dan Van Der Ster daniel.vanders...@cern.ch:

> Do you have any other pools? Remember that you need to include _all_ pools in the PG calculation, not just a single pool.

Actually I only have the standard pools (that should be 3). In production I'll also have RGW. So, what is the exact equation including X pools? Should I multiply the number of replicas by the number of pools, i.e. (168*100)/(3*4) in the case of 4 pools with replica 3?
Re: [ceph-users] OSD down after PG increase
On 13 Mar 2014 at 11:41:30, Gandalf Corvotempesta (gandalf.corvotempe...@gmail.com) wrote:

> Should I multiply the number of replicas by the number of pools, i.e. (168*100)/(3*4) in the case of 4 pools with replica 3?

Yes, that would be correct. But only increase the # PGs for pools that you will actually put data into. For example, with RGW you probably only need to add PGs to .rgw.buckets.

Cheers,
Dan
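Worked through with the numbers from this thread (assuming 4 equally loaded pools, which is the simplification behind the formula):

    (168 OSDs * 100) / (3 replicas * 4 pools) = 1400, rounded up to 2048 PGs per pool
    total: 4 pools * 2048 PGs * 3 replicas / 168 OSDs ~= 146 PGs per OSD

which lands on the same ~146 PGs/OSD average Dan quoted for the single-pool case.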
Re: [ceph-users] First Ceph Athens Meetup!
Hi Loic, thanks a lot.

Constantinos

On 03/12/2014 06:52 PM, Loic Dachary wrote:

> Hi Constantinos,
>
> I've added it to https://wiki.ceph.com/Community/Meetups . Feel free to update it if I made a mistake ;-)
>
> Cheers

On 12/03/2014 17:40, Constantinos Venetsanopoulos wrote:

Hello everybody,

we are happy to invite you to the first Ceph Athens meetup:

http://www.meetup.com/Ceph-Athens

on March 18th, 19:30, taking place on the 4th floor of the GRNET [1] HQ offices. We'll be happy to have Steve Starbuck of Inktank with us, who will introduce Ceph. Also, Vangelis Koukis from the Synnefo team will present how Ceph is being used to back GRNET's large-scale, production, public cloud service called "~okeanos" [2].

So, if you want to learn more about Ceph, discuss or ask questions, feel free to join us!

See you all there,
Constantinos

P.S.: Please, let us know if you're coming by joining the meetup on the above link.

[1] http://www.grnet.gr/en
[2] http://okeanos.grnet.gr
Re: [ceph-users] Remove volume
Hi All,

I have deleted the pool and created a fresh images pool. I am still unable to delete the existing image:

root@compute:/home/oss# rbd -p images rm c6dbff15-9ea7-4870-92f2-15de1bdd1150
2014-03-13 09:08:08.076116 7f06b1bc0780 -1 librbd: image has snapshots - not removing
Removing image: 0% complete...failed.
rbd: image has snapshots - these must be deleted with 'rbd snap purge' before the image can be removed.

root@compute:/home/oss# rbd -p images snap ls c6dbff15-9ea7-4870-92f2-15de1bdd1150
SNAPID NAME     SIZE
     2 snap 12839 kB

root@compute:/home/oss# rbd -p images snap purge c6dbff15-9ea7-4870-92f2-15de1bdd1150
Removing all snapshots: 0% complete...failed.
rbd: removing snaps failed: (16) Device or resource busy
2014-03-13 09:08:32.723478 7f7574e88780 -1 librbd: removing snapshot from header failed: (16) Device or resource busy

Any ideas?

Thanks
Kumar

From: Gnan Kumar, Yalla
Sent: Thursday, March 13, 2014 3:34 PM
To: ceph-users@lists.ceph.com
Cc: Jean-Charles Lopez
Subject: RE: [ceph-users] Remove volume

Hi All,

Any ideas? As a temporary necessity, I have deleted the entire 'images' pool. How do I delete individual images?

Thanks
Kumar
Re: [ceph-users] clock skew
On 03/12/2014 05:04 PM, John Nielsen wrote:

> On Mar 12, 2014, at 10:44 AM, Gandalf Corvotempesta gandalf.corvotempe...@gmail.com wrote:
>
>> 2014-01-30 18:41 GMT+01:00 Eric Eastman eri...@aol.com:
>>> I have this problem on some of my Ceph clusters, and I think it is due to the older hardware I am using not having the best clocks. To fix the problem, I set up one server in my lab as my local NTP time server, and then on each of my Ceph monitors I put a single server line in /etc/ntp.conf that reads:
>>>
>>> server XX.XX.XX.XX iburst burst minpoll 4 maxpoll 5
>>
>> I'm using a local NTP server, all mons are synced with the local NTP, but ceph still detects a clock skew.
>
> Machine clocks aren't perfect, even with NTP. Ceph by default is very sensitive. I usually add this to my ceph.conf to prevent the warnings:
>
> [mon]
> mon clock drift allowed = .500
>
> That is, allow the clocks to drift up to 1/2 second before saying anything.

Having this as a tunable option is indeed meant to let one find the best value. The current default of .05 was increased from an earlier .01 just because our lab's NTP server wasn't able to keep the clocks that synchronized.

However, these warnings are meant to act as an early warning system for the monitor. There are some critical messages that need to be passed, and some timeouts that need to be reset in time; failure to do so results in weirdness. And unlike the OSDs, the monitors do rely on real time, hence the need for synchronized server clocks; failure to keep those clocks synchronized for some time may eventually have repercussions: monitors receiving timestamps somewhat in the past and thus ignoring them, or timeouts being triggered too soon or too late because a message wasn't duly received.

Anyway, most timeouts hold for 5 seconds. Allowing clock drifts up to 1 second may work, but we don't have hard data to support that claim. Over a second of drift may be problematic if the monitors are under some workload and message handling is delayed -- in which case other timeouts may have to be adjusted too, not only to account for the clock skew but also for the amount of work the monitor has to deal with.

-Joao

--
Joao Eduardo Luis
Software Engineer | http://inktank.com | http://ceph.com
Re: [ceph-users] clock skew
2014-03-13 12:59 GMT+01:00 Joao Eduardo Luis joao.l...@inktank.com:

> Anyway, most timeouts hold for 5 seconds. Allowing clock drifts up to 1 second may work, but we don't have hard data to support that claim.

I think that 1 second is too much. I would like to try with .100 or .200, not with whole seconds.
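For experimenting with thresholds like this without restarting the monitors, injecting the option at runtime should work (a sketch; the value is in seconds, and the change does not persist across restarts):

    ceph tell mon.<id> injectargs '--mon-clock-drift-allowed 0.2'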
Re: [ceph-users] clock skew
Can we retest the clock skew condition, or get the value the skew actually is? ceph status gives:

health HEALTH_WARN clock skew detected on mon.ceph003

In a polysh session (i.e. a parallel-ssh sort of thing):

ready (3)> date +%s.%N
ceph002 : 1394713567.184218678
ceph003 : 1394713567.182722045
ceph001 : 1394713567.185351320

(they are ptp-synced)

stijn

On 03/13/2014 01:19 PM, Gandalf Corvotempesta wrote:
> I think that 1 second is too much. I would like to try with .100 or .200, not with whole seconds.
Re: [ceph-users] Some Questions about using ceph with VMware
Thank you Greg, thank you Kenneth. I have enabled ntp now; the clock skew is getting smaller and is gone after a while, so everything is fine now. I already suspected the virtual disks could slow things down, now I know it for sure.

My last question (for now): Is there any performance difference between rbd's and a mounted CephFS?

Thank you in advance,
Florian

On 13.03.2014 at 03:36, kenneth kenn...@apolloglobal.net wrote:

On 3/12/2014 6:24 PM, Florian Krauß wrote:

> Hello everyone, this is the first time I have ever written to a mailing list, please be patient with me (especially with my poor English)...
>
> I'm working towards my Bachelor's Degree in Computer Science, doing a project which involves Ceph. I am able to set up a Ceph cluster, but there are a few things I can't figure out. As I am setting up the cluster with virtual machines, I'm facing a little problem: clock skew. Every time I reboot a node, a clock skew is detected. If I restart the monitor on which the clock skew was detected, the problem is gone. But this is not what I want to show in my presentation. I already enabled the VMware tools, but the problem persists. Does it make more sense to enable NTP?

You'll most likely need to set up NTP. I don't know of any better way to sync clocks, so be sure to install NTP on the ceph nodes. You should set up a local NTP server on your LAN so that latency is very small.

> Are there any performance issues to expect if I use a drive with ceph-disk-prepare (or activate) /dev/sdb directly? Are there any (big) performance issues to expect using these virtual drives instead of "real" drives?

Yes, virtual machine hard drives can be slower than hardware drives, so performance may be slower.

> Kind regards
> Florian
Re: [ceph-users] OSD down after PG increase
On Thu, Mar 13, 2014 at 11:16:45AM +0100, Gandalf Corvotempesta wrote:
> 2014-03-13 10:53 GMT+01:00 Kasper Dieter dieter.kas...@ts.fujitsu.com:
>> After adding two new pools (each with 2 PGs), 100 out of 140 OSDs go down + out. The cluster never recovers.
>
> In my case, the cluster recovered after a couple of hours. How much time did you wait?

Approx. 1h, but the 100 OSDs really died :-(

Mar  7 15:59:53 rx37-8 kernel: Pid 9520(ceph-osd) over core_pipe_limit
Mar  7 15:59:53 rx37-8 kernel: Skipping core dump (core_pipe_limit set to 4)

-Dieter
Re: [ceph-users] windows client
Hi JiaMin,

There is a C++ API for the Ceph storage cluster: https://github.com/ceph/ceph/blob/27968a74d29998703207705194ec4e0c93a6b42d/src/include/rados/librados.hpp ; maybe you can use that for your development. Here is a Hello World example: https://github.com/ceph/ceph/blob/27968a74d29998703207705194ec4e0c93a6b42d/examples/librados/hello_world.cc

Regards,
Kai

At 2014-03-11 21:57:23, ljm李嘉敏 jm...@ctrip.com wrote:

> Hi all,
>
> Is it possible that ceph supports a Windows client? Right now I can only use the RESTful API (Swift-compatible) through the ceph object gateway, but the languages that can be used are Java, Python and Ruby, not C# or C++. Is there any good wrapper for C# or C++? Thanks.
>
> Thanks & Regards
> Li JiaMin
> System Cloud Platform
> 3#4F108
[ceph-users] PG Calculations
There was a very recent thread discussing PG calculations, and it made me doubt my cluster setup. So, Inktank, please provide some clarification.

I followed the documentation, and interpreted it to mean that the PG and PGP calculation is done per pool. The recent discussion introduced a slightly different formula adding in the total number of pools:

# OSDs * 100 / 3    vs.    # OSDs * 100 / (3 * # pools)

My current cluster has 24 OSDs, a replica size of 3, and the standard three pools: RBD, DATA, and METADATA. My current total PG count is 3072, which by the second formula is way too many.

So, do I have too many? Does it need to be addressed, or can it wait until I add more OSDs, which will bring the ratio closer to ideal? I'm currently using only RBD and CephFS, no RadosGW.

Thank you!
Brad
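For comparison, working both formulas through with the numbers above (24 OSDs, replica 3, 3 pools):

    per-pool reading:  24 * 100 / 3       = 800,  next power of two 1024 PGs per pool, 3072 total
    all-pools reading: 24 * 100 / (3 * 3) ~= 267, next power of two  512 PGs per pool, 1536 total

so the 3072 reported is exactly what the per-pool reading produces.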
Re: [ceph-users] windows client
> Is it possible that ceph supports a Windows client? Right now I can only use the RESTful API (Swift-compatible) through the ceph object gateway, but the languages that can be used are Java, Python and Ruby, not C# or C++. Is there any good wrapper for C# or C++? Thanks.

I have a kind-of-working port of librbd to Windows, if you are interested.

James
Re: [ceph-users] Replication lag in block storage
On Thu, Mar 13, 2014 at 3:56 PM, Greg Poirier greg.poir...@opower.com wrote:
> We've been seeing this issue on all of our dumpling clusters, and I'm wondering what might be the cause of it. In dump_historic_ops, the time between op_applied and sub_op_commit_rec, or the time between commit_sent and sub_op_applied, is extremely high. Some of the osd_sub_ops are as long as 100 ms. A sample dump_historic_ops is included at the bottom.

It's important to understand what each of those timestamps is reporting.

op_applied: the point at which an OSD has applied an operation to its readable backing filesystem in-memory (which for xfs or ext4 will be after it's committed to the journal)
sub_op_commit_rec: the point at which an OSD has gotten commits from the replica OSDs
commit_sent: the point at which a replica OSD has sent a commit back to its primary
sub_op_applied: the point at which a replica OSD has applied a particular operation to its backing filesystem in-memory (again, after the journal if using xfs)

Reads are never served from replicas, so a long time between commit_sent and sub_op_applied should not in itself be an issue. A lag time between op_applied and sub_op_commit_rec means that the OSD is waiting on its replicas. A long time there indicates either that the replica is processing slowly, or that there's some issue in the communications stack (all the way from the raw ethernet up to the message handling in the OSD itself).

So the first thing to look for is sub ops which have a lag between the received_at and commit_sent timestamps. If none of those ever turn up, but unusually long waits for sub_op_commit_rec are still present, then it'll take more effort to correlate particular subops on replicas with the op on the primary they correspond to, and see where the time lag is coming from.

-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
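For reference, these timestamps come out of the OSD admin socket; assuming the default socket path, something like

    ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok dump_historic_ops

run on both the primary and its replicas is what lets you do the correlation Greg describes.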
Re: [ceph-users] help .--Why the PGS is STUCK UNCLEAN?
Hi,

Problem solved by editing the crushmap to change the default rule from "step chooseleaf firstn 0 type host" to "step chooseleaf firstn 0 type osd". Thanks Ashish and Robert, your replies really helped me a lot. Thanks again.

# rules
rule data {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
-       step chooseleaf firstn 0 type host
+       step chooseleaf firstn 0 type osd
        step emit
}

[root@storage1 ~]# ceph -s
    cluster 3429fd17-4a92-4d3b-a7fa-04adedb0da82
     health HEALTH_OK
     monmap e1: 1 mons at {storage1=193.168.1.100:6789/0}, election epoch 1, quorum 0 storage1
     osdmap e166: 8 osds: 8 up, 8 in
      pgmap v428: 192 pgs, 3 pools, 1 bytes data, 1000 objects
            42564 MB used, 13883 GB / 14670 GB avail
                 192 active+clean
Re: [ceph-users] Replication lag in block storage
Many of the sub ops look like this, with significant lag between received_at and commit_sent:

{ "description": "osd_op(client.6869831.0:1192491 rbd_data.67b14a2ae8944a.9105 [write 507904~3686400] 6.556a4db0 e660)",
  "received_at": "2014-03-13 20:42:05.811936",
  "age": "46.088198",
  "duration": "0.038328",
  <snip>
  { "time": "2014-03-13 20:42:05.850215", "event": "commit_sent"},
  { "time": "2014-03-13 20:42:05.850264", "event": "done"}]]},

In this case almost 39 ms between received_at and commit_sent. A particularly egregious example with 80+ ms of lag between received_at and commit_sent:

{ "description": "osd_op(client.6869831.0:1190526 rbd_data.67b14a2ae8944a.8fac [write 3325952~868352] 6.5255f5fd e660)",
  "received_at": "2014-03-13 20:41:40.227813",
  "age": "320.017087",
  "duration": "0.086852",
  <snip>
  { "time": "2014-03-13 20:41:40.314633", "event": "commit_sent"},
  { "time": "2014-03-13 20:41:40.314665", "event": "done"}]]},
Re: [ceph-users] windows client
Hi Kai,

Thank you for your suggestion, it is a good choice.

Thanks & Regards
Li JiaMin

From: Kai Zhang [mailto:log1...@yeah.net]
Sent: 14 March 2014 1:15
To: ljm李嘉敏
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] windows client

> There is a C++ API for the Ceph storage cluster: https://github.com/ceph/ceph/blob/27968a74d29998703207705194ec4e0c93a6b42d/src/include/rados/librados.hpp ; maybe you can use that for your development.
Re: [ceph-users] clock skew
On 03/13/2014 12:30 PM, Stijn De Weirdt wrote:
> Can we retest the clock skew condition, or get the value the skew actually is?

'ceph health detail --format=json-pretty' (for instance; 'json' or 'xml' is also allowed) will give you information on a per-monitor basis of both skew and latency as perceived by the monitors.

-Joao

--
Joao Eduardo Luis
Software Engineer | http://inktank.com | http://ceph.com
Re: [ceph-users] windows client
Hi James,

Thank you for your reply. Did you upload it to github? If so, can you give me a link?

Thanks & Regards
Li JiaMin

-----Original Message-----
From: James Harper [mailto:james.har...@bendigoit.com.au]
Sent: 14 March 2014 5:08
To: ljm李嘉敏; ceph-users@lists.ceph.com
Subject: RE: windows client

> I have a kind-of-working port of librbd to Windows, if you are interested.
>
> James
Re: [ceph-users] another assertion failure in monitor
On 03/11/2014 05:59 PM, Pawel Veselov wrote:

> On Tue, Mar 11, 2014 at 9:15 AM, Joao Eduardo Luis joao.l...@inktank.com wrote:
>
>> On 03/10/2014 10:30 PM, Pawel Veselov wrote:
>>> Now, I'm getting this. Any idea what can be done to straighten this up?
>>
>> This is weird. Can you please share the steps taken until this was triggered, as well as the rest of the log?
>
> At this point, no, sorry. This whole thing started with migrating from 0.56.7 to 0.72.2. First, we started seeing failed assertions of (version == pg_map.version) in PGMonitor.cc:273, but on one monitor (d) only. I attempted to resync the failing monitor (d) with --force-sync from (c). (d) started to work, but (c) started to fail with the (version == pg_map.version) assertion. So I tried re-syncing (c) from (d) with --force-sync. That's when (c) started to fail with this particular (ret == 0) assertion. I don't really think that resyncing actually worked at any point.

Considering you were upgrading from bobtail, any issues you found after the upgrade may have had something to do with an improper store conversion -- usually due to somehow (explicitly or inadvertently) killing the monitor during conversion. Or they may not have, but we will never know without logs from back then.

Based on this, my guess is that you managed to bork the mon stores of both 'c' and 'd'. See, when you force a sync you're basically telling the monitor to delete its store's contents and sync from somebody else. If 'c' had a broken store after the conversion, that would have been propagated to 'd'. Once you forced the sync of 'c', the problem would have been propagated from 'd' back to 'c'.

> I didn't find a way to fix this quickly enough, so I restored the mon directories from backup, and started again. The (version == pg_map.version) assertion came back, but my backup was taken before I tried the force-sync -- though not before the migration started (it was stupid of me not to have backed up before the migration). (That's the point where I tried all kinds of crazy stuff for a while.) After some poking around, what I ended up doing was plain removing the 'store.db' directory from the monitor fs and starting the monitors. That just re-initiated the migration, and this time it was done in the absence of client requests, and one monitor at a time.

And in a case like this, I would think this was a smart choice, allowing the monitors to reconvert the store from the old plain, file-based format to the new store.db format. Given it worked, my guess is that the source of all your issues was an improperly converted monitor store -- but, once again, without the logs we can't ever be sure.
:(

-Joao

> 0> 2014-03-10 22:26:23.757166 7fc0397e5700 -1 mon/AuthMonitor.cc: In function 'virtual void AuthMonitor::create_initial()' thread 7fc0397e5700 time 2014-03-10 22:26:23.755442
> mon/AuthMonitor.cc: 101: FAILED assert(ret == 0)
>
> ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60)
> 1: (AuthMonitor::create_initial()+0x4d8) [0x637bb8]
> 2: (PaxosService::_active()+0x51b) [0x594fcb]
> 3: (Context::complete(int)+0x9) [0x565499]
> 4: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0x95) [0x5698b5]
> 5: (Paxos::handle_accept(MMonPaxos*)+0x885) [0x589595]
> 6: (Paxos::dispatch(PaxosServiceMessage*)+0x28b) [0x58d66b]
> 7: (Monitor::dispatch(MonSession*, Message*, bool)+0x4f0) [0x563620]
> 8: (Monitor::_ms_dispatch(Message*)+0x1fb) [0x5639fb]
> 9: (Monitor::ms_dispatch(Message*)+0x32) [0x57f212]

--
Joao Eduardo Luis
Software Engineer | http://inktank.com | http://ceph.com
Re: [ceph-users] Replication lag in block storage
Right. So which is the interval that's taking all the time? Probably it's waiting for the journal commit, but maybe there's something else blocking progress. If it is the journal commit, check out how busy the disk is (is it just saturated?) and what its normal performance characteristics are (is it breaking?).

-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

On Thu, Mar 13, 2014 at 5:48 PM, Greg Poirier greg.poir...@opower.com wrote:
> Many of the sub ops look like this, with significant lag between received_at and commit_sent [...]
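A quick way to check the "is it just saturated?" question (a sketch; it assumes the journal sits on /dev/sdb and that the sysstat package is installed) is to watch the %util and await columns while the cluster is under load:

    iostat -x 1 /dev/sdb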
Re: [ceph-users] Wrong PG nums
Yes, I have the same question.

-----Original Message-----
From: ceph-users-boun...@lists.ceph.com [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Gandalf Corvotempesta
Sent: 13 March 2014 0:50
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Wrong PG nums

Hi to all, I have this in my conf:

# grep 'pg num' /etc/ceph/ceph.conf
osd pool default pg num = 5600

But:

# ceph osd pool get data pg_num
pg_num: 64

Is this normal? Why were just 64 PGs created?