Re: [ceph-users] Monitors repeatedly calling for new elections
On 09 Dec 2014 18:19, Sanders, Bill wrote:
> Thanks for the response. I did forget to mention that NTP is set up and does appear to be running (just double checked).

You probably know this, but just in case: if 'ntpq -p' shows a '*' in front of one of the servers, NTP has managed to sync up. If not, NTP has had no effect on your clock.

Jon

Jon Kåre Hellan, UNINETT AS, Trondheim Norway

> Is this good enough resolution?
>
> $ for node in $nodes; do ssh tvsa${node} sudo date --rfc-3339=ns; done
> 2014-12-09 09:15:39.404292557-08:00
> 2014-12-09 09:15:39.521762397-08:00
> 2014-12-09 09:15:39.641491188-08:00
> 2014-12-09 09:15:39.761937524-08:00
> 2014-12-09 09:15:39.911416676-08:00
> 2014-12-09 09:15:40.029777457-08:00
>
> Bill
>
> From: Rodrigo Severo [rodr...@fabricadeideias.com]
> Sent: Tuesday, December 09, 2014 4:02 AM
> To: Sanders, Bill
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Monitors repeatedly calling for new elections
>
> On Mon, Dec 8, 2014 at 5:23 PM, Sanders, Bill bill.sand...@teradata.com wrote:
>> Under activity, we get monitors repeatedly going into election cycles, OSDs being wrongly marked down, as well as slow requests.
>>
>> osd.11 39.7.48.6:6833/21938 failed (3 reports from 1 peers after 52.914693 >= grace 20.00)
>>
>> During this, ceph -w shows the cluster essentially idle. None of the network, disks, or CPUs ever appear to max out. It also doesn't appear to be the same OSDs, MONs, or node causing the problem. Top reports all 128 GB RAM (negligible swap) in use on the storage nodes. Only Ceph is running on the storage nodes.
>
> I'm really new to Ceph but my first bet is that your computers aren't clock synchronized. Do all of them have a working ntpd?
>
> Regards,
> Rodrigo Severo

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
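The 'ntpq -p' rule of thumb from this thread can be scripted. The sketch below is only an illustration: the function names are made up here, and it assumes passwordless ssh to each host and that ntpq is installed there. Note also that the timestamps in the date loop above are roughly 120 ms apart from one host to the next, which is probably just the time each serial ssh call takes, so that loop most likely cannot show offsets smaller than that.

```shell
#!/bin/sh
# Sketch only: function names are illustrative, not part of Ceph or NTP.

# Reads `ntpq -p` output on stdin. Exits 0 if some peer line starts with '*',
# i.e. ntpd has selected that server as its synchronization source.
ntp_synced() {
    grep -q '^\*'
}

# Runs the check over ssh on every host given as an argument.
# Assumes passwordless ssh and ntpq on the remote side.
check_ntp() {
    for host in "$@"; do
        if ssh "$host" ntpq -pn 2>/dev/null | ntp_synced; then
            echo "$host: synced"
        else
            echo "$host: NOT synced"
        fi
    done
}
```

It would be called as `check_ntp host1 host2 host3`, with the real node names in place of the placeholders. A host that cannot be reached is reported as not synced, since no peer line arrives on stdin.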
[ceph-users] OSD in uninterruptible sleep
We are testing a Giant cluster, on virtual machines for now. We have seen the same problem two nights in a row: one of the OSDs gets stuck in uninterruptible sleep. The only way to get rid of it is apparently to reboot; kill -9, -11 and -15 have all been tried.

The monitor apparently believes it is gone, because every 30 minutes we see in the log:

lock_fsid failed to lock /var/lib/ceph/osd/ceph-1/fsid, is another ceph-osd still running? (11) Resource temporarily unavailable

We interpret this as an attempt to start a new instance.

Pastebin of the OSD log from the night before last: http://pastebin.com/Y42GvGjr
Pastebin of syslog from last evening: http://pastebin.com/7riNWRsy

The pid of the stuck OSD is 4222. syslog has call traces of pids 4405, 4406, 4435, 4436, which have been blocked for 120 s.

What can we do to get to the bottom of this?

Context: This is a test cluster to evaluate Ceph. There are 3 monitor VMs, 3 OSD VMs each running 2 OSDs, 1 MDS VM and 1 radosgw VM. The VMs are running Debian Wheezy under Hyper-V. OSD storage is xfs on virtual disks.

The test load was a Linux kernel compilation with the tree in cephfs. Silly, I know, but we needed a test load. We do not intend to use cephfs in production. Obviously, we would use physical OSD nodes if we were to decide to deploy Ceph in production.

Jon

Jon Kåre Hellan, UNINETT AS, Trondheim, Norway
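One way to see which processes are in uninterruptible sleep, and roughly where in the kernel they are waiting, is to filter ps output on the state column. This is a minimal sketch, assuming a Linux host with procps; the function names are invented for illustration, and nothing here is specific to Ceph.

```shell
#!/bin/sh
# Sketch only: function names are illustrative.

# Reads `ps -eo pid,stat,wchan:32,comm` output on stdin and prints
# "pid wchan command" for every process whose state starts with D
# (uninterruptible sleep). The header line is dropped because "STAT"
# does not start with D.
d_state_procs() {
    awk '$2 ~ /^D/ { print $1, $3, $4 }'
}

# Lists all D-state processes on the local machine.
list_blocked() {
    ps -eo pid,stat,wchan:32,comm | d_state_procs
}
```

Running `list_blocked` on the affected OSD VM should show pid 4222 if it is still stuck. On kernels that expose it, reading /proc/4222/stack as root may additionally give the kernel call chain the process is blocked in, which can be compared with the call traces in syslog.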
[ceph-users] Stuck OSD
Hi

I'm testing a Giant cluster. There are 6 OSDs on 3 virtual machines. One OSD is marked down and out. The process still exists; it is in uninterruptible sleep. It has stopped logging. I've uploaded what I think are the relevant fragments of the log to pastebin: http://pastebin.com/Y42GvGjr

Can anybody help me understand what is going on?

If the process had died instead, would a new one have been started automatically?

Regards

Jon

Jon Kåre Hellan, UNINETT AS, Trondheim, Norway
[ceph-users] where to download 0.87 debs?
Will there be debs?

On 30/10/14 10:37, Irek Fasikhov wrote:
> Hi.
>
> Use http://ceph.com/rpm-giant/
>
> 2014-10-30 12:34 GMT+03:00 Kenneth Waegeman kenneth.waege...@ugent.be:
>> Hi,
>>
>> Will http://ceph.com/rpm/ also be updated to have the giant packages?
>>
>> Thanks
>> Kenneth
>>
>> - Message from Patrick McGarry patr...@inktank.com -
>> Date: Wed, 29 Oct 2014 22:13:50 -0400
>> From: Patrick McGarry patr...@inktank.com
>> Subject: Re: [ceph-users] where to download 0.87 RPMS?
>> To: 廖建锋 de...@f-club.cn
>> Cc: ceph-users ceph-users@lists.ceph.com
>>
>> I have updated the http://ceph.com/get page to reflect a more generic approach to linking. It's also worth noting that the new http://download.ceph.com/ infrastructure is available now.
>>
>> To get to the rpms specifically you can either crawl the download.ceph.com tree or use the symlink at http://ceph.com/rpm-giant/
>>
>> Hope that (and the updated linkage on ceph.com/get) helps. Thanks!
>>
>> Best Regards,
>>
>> Patrick McGarry
>> Director Ceph Community || Red Hat
>> http://ceph.com || http://community.redhat.com
>> @scuttlemonkey || @ceph
>>
>> On Wed, Oct 29, 2014 at 9:15 PM, 廖建锋 de...@f-club.cn wrote:
>>
>> - End message from Patrick McGarry patr...@inktank.com -
>>
>> --
>> Kind regards,
>> Kenneth Waegeman
>
> --
> Best regards,
> Фасихов Ирек Нургаязович
> Mobile: +79229045757
[ceph-users] Adding another radosgw node
Hi

We've got a three-node Ceph cluster, and radosgw on a fourth machine. We would like to add another radosgw machine for high availability. Here are a few questions I have:

- We aren't expecting to deploy to multiple regions and zones anytime soon. So presumably, we do not have to worry about federated deployment. Would it be hard to move to a federated deployment later?
- What is a radosgw instance? I was guessing that it was a machine running radosgw. If not, is it a separate gateway with a separate set of users and pools, possibly running on the same machine?
- Can I simply deploy another radosgw machine with the same configuration as the first one? If the second interpretation is true, I guess I could.
- Am I right that all gateway users go in the same keyring, which is copied to all the gateway nodes and all the monitor nodes?
- The gateway nodes obviously need a [client.radosgw.{instance-name}] stanza in /etc/ceph/ceph.conf. Do the monitor nodes also need a copy of the stanza?
- Do the gateway nodes need all of the monitors' [global] stanza in their /etc/ceph/ceph.conf? Presumably, they at least need mon_host to know who to talk to. What else?

Regards

Jon

Jon Kåre Hellan, UNINETT AS, Trondheim, Norway