Re: [ceph-users] Wheezy machine died with problems on osdmap
Hi Sage, What kernel version is this? It looks like an old kernel bug. Generally speaking you should be using 3.4 at the very least if you are using the kernel client. sage This is the standard Wheezy kernel, i.e. 3.2.0-4-amd64. While I can recompile the kernel, I don't think having a custom kernel in production would be manageable. Is there a way I can open a bug in Debian asking for a backport of the patch? Thanks. Regards, Giuseppe ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
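For reference, rather than maintaining a custom kernel, a newer kernel can usually be pulled from wheezy-backports; a rough sketch, assuming the backports archive carries a kernel recent enough for the kernel client:
# enable wheezy-backports and install its kernel metapackage
echo "deb http://http.debian.net/debian wheezy-backports main" >> /etc/apt/sources.list
apt-get update
apt-get -t wheezy-backports install linux-image-amd64
# reboot into the new kernel and confirm:
uname -r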
Re: [ceph-users] Ceph pgs stuck unclean
# ceph -v ceph version 0.61.2 (fea782543a844bb277ae94d3391788b76c5bee60) # rpm -qa | grep ceph ceph-0.61.2-0.el6.x86_64 ceph-radosgw-0.61.2-0.el6.x86_64 ceph-deploy-0.1-31.g7c5f29c.noarch ceph-release-1-0.el6.noarch libcephfs1-0.61.2-0.el6.x86_64 thanks Chris -Original Message- From: Samuel Just [mailto:sam.j...@inktank.com] Sent: 13 August 2013 20:11 To: Howarth, Chris [CCC-OT_IT] Cc: ceph-us...@ceph.com Subject: Re: [ceph-users] Ceph pgs stuck unclean Version? -Sam On Tue, Aug 13, 2013 at 7:52 AM, Howarth, Chris chris.howa...@citi.com wrote: Hi Sam, Thanks for your reply here. Unfortunately I didn't capture all this data at the time of the issue. What I do have I've pasted below. FYI the only way I found to fix this issue was to temporarily reduce the number of replicas in the pool to 1. The stuck pgs then disappeared and so I then increased the replicas back to 2 at this point. Obviously this is not a great workaround so I am keen to get to the bottom of the problem here. Thanks again for your help. Chris # ceph health detail HEALTH_WARN 7 pgs stuck unclean pg 3.5a is stuck unclean for 335339.172516, current state active, last acting [5,4] pg 3.54 is stuck unclean for 335339.157608, current state active, last acting [15,7] pg 3.55 is stuck unclean for 335339.167154, current state active, last acting [16,9] pg 3.1c is stuck unclean for 335339.174150, current state active, last acting [8,16] pg 3.a is stuck unclean for 335339.177001, current state active, last acting [0,8] pg 3.4 is stuck unclean for 335339.165377, current state active, last acting [17,4] pg 3.5 is stuck unclean for 335339.149507, current state active, last acting [2,6] # ceph pg 3.5a query { state: active, epoch: 699, up: [ 5, 4], acting: [ 5, 4], info: { pgid: 3.5a, last_update: 413'688, last_complete: 413'688, log_tail: 0'0, last_backfill: MAX, purged_snaps: [], history: { epoch_created: 67, last_epoch_started: 644, last_epoch_clean: 644, last_epoch_split: 0, same_up_since: 643, same_interval_since: 643, same_primary_since: 561, last_scrub: 0'0, last_scrub_stamp: 2013-08-01 15:23:29.253783, last_deep_scrub: 0'0, last_deep_scrub_stamp: 2013-08-01 15:23:29.253783, last_clean_scrub_stamp: 2013-08-01 15:23:29.253783}, stats: { version: 413'688, reported: 561'1484, state: active, last_fresh: 2013-08-02 12:25:41.793582, last_change: 2013-08-02 09:54:08.163758, last_active: 2013-08-02 12:25:41.793582, last_clean: 2013-08-02 09:49:34.246621, last_became_active: 0.00, last_unstale: 2013-08-02 12:25:41.793582, mapping_epoch: 641, log_start: 0'0, ondisk_log_start: 0'0, created: 67, last_epoch_clean: 67, parent: 0.0, parent_split_bits: 0, last_scrub: 0'0, last_scrub_stamp: 2013-08-01 15:23:29.253783, last_deep_scrub: 0'0, last_deep_scrub_stamp: 2013-08-01 15:23:29.253783, last_clean_scrub_stamp: 2013-08-01 15:23:29.253783, log_size: 0, ondisk_log_size: 0, stats_invalid: 0, stat_sum: { num_bytes: 134217728, num_objects: 32, num_object_clones: 0, num_object_copies: 0, num_objects_missing_on_primary: 0, num_objects_degraded: 0, num_objects_unfound: 0, num_read: 0, num_read_kb: 0, num_write: 688, num_write_kb: 327680, num_scrub_errors: 0, num_shallow_scrub_errors: 0, num_deep_scrub_errors: 0, num_objects_recovered: 45, num_bytes_recovered: 188743680, num_keys_recovered: 0}, stat_cat_sum: {}, up: [ 5, 4], acting: [ 5, 4]}, empty: 0, dne: 0, incomplete: 0, last_epoch_started: 644}, recovery_state: [ { name: Started\/Primary\/Active, enter_time: 2013-08-02 09:49:56.504882, might_have_unfound: [], recovery_progress: { backfill_target: -1, 
waiting_on_backfill: 0, backfill_pos: 0\/\/0\/\/-1, backfill_info: { begin: 0\/\/0\/\/-1, end: 0\/\/0\/\/-1, objects: []}, peer_backfill_info: { begin: 0\/\/0\/\/-1, end: 0\/\/0\/\/-1, objects: []}, backfills_in_flight: [], pull_from_peer: [], pushing: []}, scrub: { scrubber.epoch_start: 0,
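For readers hitting the same symptom, the workaround described above (dropping the pool to a single replica and back) corresponds to commands roughly like the following. This is a sketch only: the pool name is a placeholder, and while size is 1 there is only one copy of the data, so use it with care.
# the stuck pgs are 3.xx, so look up the name of pool 3 first
ceph osd dump | grep '^pool 3 '
# temporarily drop to one replica and wait for the pgs to go clean ...
ceph osd pool set <poolname> size 1
# ... then restore two replicas and re-check
ceph osd pool set <poolname> size 2
ceph health detail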
Re: [ceph-users] v0.67 Dumpling released
Hi, is it ok to upgrade from 0.66 to 0.67 by just running 'apt-get upgrade' and rebooting the nodes one by one ? Thanks. Regards, Markus Am 14.08.2013 07:32, schrieb Sage Weil: Another three months have gone by, and the next stable release of Ceph is ready: Dumpling! Thank you to everyone who has contributed to this release! This release focuses on a few major themes since v0.61 (Cuttlefish): * rgw: multi-site, multi-datacenter support for S3/Swift object storage * new RESTful API endpoint for administering the cluster, based on a new and improved management API and updated CLI * mon: stability and performance * osd: stability performance * cephfs: open-by-ino support (for improved NFS reexport) * improved support for Red Hat platforms * use of the Intel CRC32c instruction when available As with previous stable releases, you can upgrade from previous versions of Ceph without taking the entire cluster online, as long as a few simple guidelines are followed. * For Dumpling, we have tested upgrades from both Bobtail and Cuttlefish. If you are running Argonaut, please upgrade to Bobtail and then to Dumpling. * Please upgrade daemons/hosts in the following order: 1. Upgrade ceph-common on all nodes that will use the command line ceph utility. 2. Upgrade all monitors (upgrade ceph package, restart ceph-mon daemons). This can happen one daemon or host at a time. Note that because cuttlefish and dumpling monitors cant talk to each other, all monitors should be upgraded in relatively short succession to minimize the risk that an untimely failure will reduce availability. 3. Upgrade all osds (upgrade ceph package, restart ceph-osd daemons). This can happen one daemon or host at a time. 4. Upgrade radosgw (upgrade radosgw package, restart radosgw daemons). There are several small compatibility changes between Cuttlefish and Dumpling, particularly with the CLI interface. Please see the complete release notes for a summary of the changes since v0.66 and v0.61 Cuttlefish, and other possible issues that should be considered before upgrading: http://ceph.com/docs/master/release-notes/#v0-67-dumpling Dumpling is the second Ceph release on our new three-month stable release cycle. We are very pleased to have pulled everything together on schedule. The next stable release, which will be code-named Emperor, is slated for three months from now (beginning of November). You can download v0.67 Dumpling from the usual locations: * Git at git://github.com/ceph/ceph.git * Tarball at http://ceph.com/download/ceph-0.67.tar.gz * For Debian/Ubuntu packages, see http://ceph.com/docs/master/install/debian * For RPMs, see http://ceph.com/docs/master/install/rpm ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- MfG, Markus Goldberg Markus Goldberg | Universität Hildesheim | Rechenzentrum Tel +49 5121 883212 | Marienburger Platz 22, D-31141 Hildesheim, Germany Fax +49 5121 883205 | email goldb...@uni-hildesheim.de ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
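Regarding Markus's question above: a full reboot should not normally be needed. On an Ubuntu/upstart node a rolling upgrade is roughly the following sketch, following the ordering in the release notes (monitors on all nodes first, then OSDs, one host at a time; package names are the stock Debian/Ubuntu ones):
apt-get update
apt-get install --only-upgrade ceph ceph-common radosgw
# restart the daemons instead of rebooting, monitors first:
restart ceph-mon-all
# once every monitor in the cluster is upgraded, restart the osds on each host:
restart ceph-osd-all
ceph -v && ceph health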
Re: [ceph-users] v0.67 Dumpling released
Hi, is it ok to upgrade from 0.66 to 0.67 by just running 'apt-get upgrade' and rebooting the nodes one by one ? Is a full reboot required? James ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] v0.67 Dumpling released
On Wed, Aug 14, 2013 at 11:35 AM, Markus Goldberg goldb...@uni-hildesheim.de wrote: is it ok to upgrade from 0.66 to 0.67 by just running 'apt-get upgrade' and rebooting the nodes one by one ? Did you see http://ceph.com/docs/master/release-notes/#upgrading-from-v0-66 ?? ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] v0.67 Dumpling released
On 2013-08-14 07:32, Sage Weil wrote: Another three months have gone by, and the next stable release of Ceph is ready: Dumpling! Thank you to everyone who has contributed to this release! This release focuses on a few major themes since v0.61 (Cuttlefish): * rgw: multi-site, multi-datacenter support for S3/Swift object storage * new RESTful API endpoint for administering the cluster, based on a new and improved management API and updated CLI * mon: stability and performance * osd: stability performance * cephfs: open-by-ino support (for improved NFS reexport) * improved support for Red Hat platforms * use of the Intel CRC32c instruction when available As with previous stable releases, you can upgrade from previous versions of Ceph without taking the entire cluster online, as long as a few simple guidelines are followed. * For Dumpling, we have tested upgrades from both Bobtail and Cuttlefish. If you are running Argonaut, please upgrade to Bobtail and then to Dumpling. * Please upgrade daemons/hosts in the following order: 1. Upgrade ceph-common on all nodes that will use the command line ceph utility. 2. Upgrade all monitors (upgrade ceph package, restart ceph-mon daemons). This can happen one daemon or host at a time. Note that because cuttlefish and dumpling monitors cant talk to each other, all monitors should be upgraded in relatively short succession to minimize the risk that an untimely failure will reduce availability. 3. Upgrade all osds (upgrade ceph package, restart ceph-osd daemons). This can happen one daemon or host at a time. 4. Upgrade radosgw (upgrade radosgw package, restart radosgw daemons). There are several small compatibility changes between Cuttlefish and Dumpling, particularly with the CLI interface. Please see the complete release notes for a summary of the changes since v0.66 and v0.61 Cuttlefish, and other possible issues that should be considered before upgrading: http://ceph.com/docs/master/release-notes/#v0-67-dumpling Dumpling is the second Ceph release on our new three-month stable release cycle. We are very pleased to have pulled everything together on schedule. The next stable release, which will be code-named Emperor, is slated for three months from now (beginning of November). You can download v0.67 Dumpling from the usual locations: * Git at git://github.com/ceph/ceph.git * Tarball at http://ceph.com/download/ceph-0.67.tar.gz * For Debian/Ubuntu packages, see http://ceph.com/docs/master/install/debian * For RPMs, see http://ceph.com/docs/master/install/rpm ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Hi Sage, I just upgraded and everything went quite smoothly with osds, mons and mds, good work guys! :) The only problem I have ran into is with radosgw. 
It is unable to start after the upgrade with the following message: 2013-08-14 11:57:25.841310 7ffd0d2ae780 0 ceph version 0.67 (e3b7bc5bce8ab330ec1661381072368af3c218a0), process radosgw, pid 5612 2013-08-14 11:57:25.841328 7ffd0d2ae780 -1 WARNING: libcurl doesn't support curl_multi_wait() 2013-08-14 11:57:25.841335 7ffd0d2ae780 -1 WARNING: cross zone / region transfer performance may be affected 2013-08-14 11:57:25.855427 7ffcef7fe700 2 RGWDataChangesLog::ChangesRenewThread: start 2013-08-14 11:57:25.856138 7ffd0d2ae780 -1 Couldn't init storage provider (RADOS) ceph auth list returns: client.radosgw.gateway key: xx caps: [mon] allow r caps: [osd] allow rwx my config: [client.radosgw.gateway] keyring = /etc/ceph/keyring.radosgw.gateway rgw socket path = /tmp/radosgw.sock log file = /var/log/ceph/radosgw.log rgw enable ops log = false rgw print continue = true rgw keystone url = http://xx:5000 rgw keystone admin token = password rgw keystone accepted roles = admin,Member rgw keystone token cache size = 500 rgw keystone revocation interval = 600 #nss db path = /var/lib/ceph/nss Also, is the libcurl warning a problem? It seems the libcurl package is a bit old on Ubuntu 12.04LTS: curl --version curl 7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3 Protocols: dict file ftp ftps gopher http https imap imaps ldap pop3 pop3s rtmp rtsp smtp smtps telnet tftp Features: GSS-Negotiate IDN IPv6 Largefile NTLM NTLM_WB SSL libz TLS-SRP Cheers, Peter ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] ceph-deploy and journal on separate disk
It looks like at some point the filesystem is not passed to the options. Would you mind running the `ceph-disk-prepare` command again but with the --verbose flag? I think that from the output above (correct it if I am mistaken) that would be something like: ceph-disk-prepare --verbose -- /dev/sdaa /dev/sda1 Hi. If I'm running: ceph-deploy disk zap ceph001:sdaa ceph001:sda1 and ceph-disk -v prepare /dev/sdaa /dev/sda1, get the same errors: == root@ceph001:~# ceph-disk -v prepare /dev/sdaa /dev/sda1 DEBUG:ceph-disk:Journal /dev/sda1 is a partition WARNING:ceph-disk:OSD will not be hot-swappable if journal is not the same device as the osd data DEBUG:ceph-disk:Creating osd partition on /dev/sdaa Information: Moved requested sector from 34 to 2048 in order to align on 2048-sector boundaries. The operation has completed successfully. DEBUG:ceph-disk:Creating xfs fs on /dev/sdaa1 meta-data=/dev/sdaa1 isize=2048 agcount=32, agsize=22892700 blks = sectsz=512 attr=2, projid32bit=0 data = bsize=4096 blocks=732566385, imaxpct=5 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 ascii-ci=0 log =internal log bsize=4096 blocks=357698, version=2 = sectsz=512 sunit=0 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0 DEBUG:ceph-disk:Mounting /dev/sdaa1 on /var/lib/ceph/tmp/mnt.EGTIq2 with options noatime mount: /dev/sdaa1: more filesystems detected. This should not happen, use -t type to explicitly specify the filesystem type or use wipefs(8) to clean up the device. mount: you must specify the filesystem type ceph-disk: Mounting filesystem failed: Command '['mount', '-o', 'noatime', '--', '/dev/sdaa1', '/var/lib/ceph/tmp/mnt.EGTIq2']' returned non-zero exit status 32 If executed this command separately for both disks - looks like ok: For sdaa: root@ceph001:~# ceph-disk -v prepare /dev/sdaa INFO:ceph-disk:Will colocate journal with data on /dev/sdaa DEBUG:ceph-disk:Creating journal partition num 2 size 1024 on /dev/sdaa Information: Moved requested sector from 34 to 2048 in order to align on 2048-sector boundaries. The operation has completed successfully. DEBUG:ceph-disk:Journal is GPT partition /dev/disk/by-partuuid/d1389210-6e02-4460-9cb2-0e31e4b0924f DEBUG:ceph-disk:Creating osd partition on /dev/sdaa Information: Moved requested sector from 2097153 to 2099200 in order to align on 2048-sector boundaries. The operation has completed successfully. DEBUG:ceph-disk:Creating xfs fs on /dev/sdaa1 meta-data=/dev/sdaa1 isize=2048 agcount=32, agsize=22884508 blks = sectsz=512 attr=2, projid32bit=0 data = bsize=4096 blocks=732304241, imaxpct=5 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 ascii-ci=0 log =internal log bsize=4096 blocks=357570, version=2 = sectsz=512 sunit=0 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0 DEBUG:ceph-disk:Mounting /dev/sdaa1 on /var/lib/ceph/tmp/mnt.K3q9v5 with options noatime DEBUG:ceph-disk:Preparing osd data dir /var/lib/ceph/tmp/mnt.K3q9v5 DEBUG:ceph-disk:Creating symlink /var/lib/ceph/tmp/mnt.K3q9v5/journal - /dev/disk/by-partuuid/d1389210-6e02-4460-9cb2-0e31e4b0924f DEBUG:ceph-disk:Unmounting /var/lib/ceph/tmp/mnt.K3q9v5 The operation has completed successfully. 
DEBUG:ceph-disk:Calling partprobe on prepared device /dev/sdaa For sda1: root@ceph001:~# ceph-disk -v prepare /dev/sda1 DEBUG:ceph-disk:OSD data device /dev/sda1 is a partition DEBUG:ceph-disk:Creating xfs fs on /dev/sda1 meta-data=/dev/sda1 isize=2048 agcount=4, agsize=655360 blks = sectsz=512 attr=2, projid32bit=0 data = bsize=4096 blocks=2621440, imaxpct=25 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 ascii-ci=0 log =internal log bsize=4096 blocks=2560, version=2 = sectsz=512 sunit=0 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0 DEBUG:ceph-disk:Mounting /dev/sda1 on /var/lib/ceph/tmp/mnt.G30zPD with options noatime DEBUG:ceph-disk:Preparing osd data dir /var/lib/ceph/tmp/mnt.G30zPD DEBUG:ceph-disk:Unmounting /var/lib/ceph/tmp/mnt.G30zPD DEBUG:ceph-disk:Calling partprobe on prepared device /dev/sda1 ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
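The mount error above points at wipefs(8): the freshly created data partition apparently still carries an old filesystem signature next to the new XFS one. One possible way to clear it and retry, as a sketch only (double-check the device names against the log before wiping anything):
# remove all filesystem signatures from the partition mount complained about
wipefs -a /dev/sdaa1
# re-zap and retry the prepare
ceph-deploy disk zap ceph001:sdaa
ceph-disk -v prepare /dev/sdaa /dev/sda1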
Re: [ceph-users] ceph-deploy and journal on separate disk
From: Alfredo Deza [mailto:alfredo.d...@inktank.com] Sent: Wednesday, August 14, 2013 5:41 PM To: Pavel Timoschenkov Cc: Samuel Just; ceph-us...@ceph.com Subject: Re: [ceph-users] ceph-deploy and journal on separate disk On Wed, Aug 14, 2013 at 7:41 AM, Pavel Timoschenkov pa...@bayonetteas.onmicrosoft.commailto:pa...@bayonetteas.onmicrosoft.com wrote: It looks like at some point the filesystem is not passed to the options. Would you mind running the `ceph-disk-prepare` command again but with the --verbose flag? I think that from the output above (correct it if I am mistaken) that would be something like: ceph-disk-prepare --verbose -- /dev/sdaa /dev/sda1 Hi. If I'm running: ceph-deploy disk zap ceph001:sdaa ceph001:sda1 and ceph-disk -v prepare /dev/sdaa /dev/sda1, get the same errors: == root@ceph001:~# ceph-disk -v prepare /dev/sdaa /dev/sda1 DEBUG:ceph-disk:Journal /dev/sda1 is a partition WARNING:ceph-disk:OSD will not be hot-swappable if journal is not the same device as the osd data DEBUG:ceph-disk:Creating osd partition on /dev/sdaa Information: Moved requested sector from 34 to 2048 in order to align on 2048-sector boundaries. The operation has completed successfully. DEBUG:ceph-disk:Creating xfs fs on /dev/sdaa1 meta-data=/dev/sdaa1 isize=2048 agcount=32, agsize=22892700 blks = sectsz=512 attr=2, projid32bit=0 data = bsize=4096 blocks=732566385, imaxpct=5 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 ascii-ci=0 log =internal log bsize=4096 blocks=357698, version=2 = sectsz=512 sunit=0 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0 DEBUG:ceph-disk:Mounting /dev/sdaa1 on /var/lib/ceph/tmp/mnt.EGTIq2 with options noatime mount: /dev/sdaa1: more filesystems detected. This should not happen, use -t type to explicitly specify the filesystem type or use wipefs(8) to clean up the device. mount: you must specify the filesystem type ceph-disk: Mounting filesystem failed: Command '['mount', '-o', 'noatime', '--', '/dev/sdaa1', '/var/lib/ceph/tmp/mnt.EGTIq2']' returned non-zero exit status 32 If executed this command separately for both disks - looks like ok: For sdaa: root@ceph001:~# ceph-disk -v prepare /dev/sdaa INFO:ceph-disk:Will colocate journal with data on /dev/sdaa DEBUG:ceph-disk:Creating journal partition num 2 size 1024 on /dev/sdaa Information: Moved requested sector from 34 to 2048 in order to align on 2048-sector boundaries. The operation has completed successfully. DEBUG:ceph-disk:Journal is GPT partition /dev/disk/by-partuuid/d1389210-6e02-4460-9cb2-0e31e4b0924f DEBUG:ceph-disk:Creating osd partition on /dev/sdaa Information: Moved requested sector from 2097153 to 2099200 in order to align on 2048-sector boundaries. The operation has completed successfully. DEBUG:ceph-disk:Creating xfs fs on /dev/sdaa1 meta-data=/dev/sdaa1 isize=2048 agcount=32, agsize=22884508 blks = sectsz=512 attr=2, projid32bit=0 data = bsize=4096 blocks=732304241, imaxpct=5 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 ascii-ci=0 log =internal log bsize=4096 blocks=357570, version=2 = sectsz=512 sunit=0 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0 DEBUG:ceph-disk:Mounting /dev/sdaa1 on /var/lib/ceph/tmp/mnt.K3q9v5 with options noatime DEBUG:ceph-disk:Preparing osd data dir /var/lib/ceph/tmp/mnt.K3q9v5 DEBUG:ceph-disk:Creating symlink /var/lib/ceph/tmp/mnt.K3q9v5/journal - /dev/disk/by-partuuid/d1389210-6e02-4460-9cb2-0e31e4b0924f DEBUG:ceph-disk:Unmounting /var/lib/ceph/tmp/mnt.K3q9v5 The operation has completed successfully. 
DEBUG:ceph-disk:Calling partprobe on prepared device /dev/sdaa For sda1: root@ceph001:~# ceph-disk -v prepare /dev/sda1 DEBUG:ceph-disk:OSD data device /dev/sda1 is a partition DEBUG:ceph-disk:Creating xfs fs on /dev/sda1 meta-data=/dev/sda1 isize=2048 agcount=4, agsize=655360 blks = sectsz=512 attr=2, projid32bit=0 data = bsize=4096 blocks=2621440, imaxpct=25 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 ascii-ci=0 log =internal log bsize=4096 blocks=2560, version=2 = sectsz=512 sunit=0 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0 DEBUG:ceph-disk:Mounting /dev/sda1 on /var/lib/ceph/tmp/mnt.G30zPD with options noatime DEBUG:ceph-disk:Preparing osd data dir /var/lib/ceph/tmp/mnt.G30zPD DEBUG:ceph-disk:Unmounting /var/lib/ceph/tmp/mnt.G30zPD
[ceph-users] New Installation
Hello List, I am attempting to build a ceph cluster on RHEL6 machines. Everything seems to work until I get to the step of creating new monitors with ceph-deploy. It seems to work, but when I get to the gatherkeys step, then it displays messages about not being able to get the various bootstrap keys. I have disabled selinux and the firewall allows full access between the hosts involved. The problem seems to pop up on internet searches but I have not been able to find any solutions. Any ideas on what the issue may be? Thanks ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] v0.67 Dumpling released
On Wednesday, August 14, 2013, wrote: Hi Sage, I just upgraded and everything went quite smoothly with osds, mons and mds, good work guys! :) The only problem I have run into is with radosgw. It is unable to start after the upgrade with the following message: 2013-08-14 11:57:25.841310 7ffd0d2ae780 0 ceph version 0.67 (e3b7bc5bce8ab330ec1661381072368af3c218a0), process radosgw, pid 5612 2013-08-14 11:57:25.841328 7ffd0d2ae780 -1 WARNING: libcurl doesn't support curl_multi_wait() 2013-08-14 11:57:25.841335 7ffd0d2ae780 -1 WARNING: cross zone / region transfer performance may be affected 2013-08-14 11:57:25.855427 7ffcef7fe700 2 RGWDataChangesLog::ChangesRenewThread: start 2013-08-14 11:57:25.856138 7ffd0d2ae780 -1 Couldn't init storage provider (RADOS) ceph auth list returns: client.radosgw.gateway key: xx caps: [mon] allow r caps: [osd] allow rwx That's your problem; the gateway needs to have rw on the monitor to create some new pools. That's always been a soft requirement (it would complain badly if it needed a pool not already created) but it got harder in Dumpling. I think that should have been in the extended release notes...? Anyway, if you update the permissions on the monitor it should be all good. -Greg -- Software Engineer #42 @ http://inktank.com | http://ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
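Concretely, updating the gateway's monitor caps is a one-liner; a sketch (note that 'ceph auth caps' replaces the whole cap set, so the osd caps have to be restated as well):
ceph auth caps client.radosgw.gateway mon 'allow rw' osd 'allow rwx'
ceph auth list                  # verify the new caps
service radosgw restart         # or: /etc/init.d/radosgw restart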
Re: [ceph-users] New Installation
On Wed, Aug 14, 2013 at 10:49 AM, Jim Summers jbsumm...@gmail.com wrote: Hello List, I am attempting to build a ceph cluster on RHEL6 machines. Everything seems to work until I get to the step of creating new monitors with ceph-deploy. It seems to work, but when I get to the gatherkeys step, then it displays messages about not being able to get the various bootstrap keys. What version of ceph-deploy are you using? There was a release for OS packages yesterday and for the Python package Index on Friday (v1.2) I have disabled selinux and the firewall allows full access between the hosts involved. The problem seems to pop up on internet searches but I have not been able to find any solutions. Any ideas on what the issue may be? Thanks ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] New Installation
It turned out that I had initially listed four machines as part of my cluster. Thinking that two will have mon's and all four osd's. So I noticed in the mon-log file that it was not able to communicate with two of the machines. So I simply made them mons also and then the keys were generated. I guess it was my misunderstanding of what defines a cluster. Thanks, On Wed, Aug 14, 2013 at 11:14 AM, Alfredo Deza alfredo.d...@inktank.comwrote: On Wed, Aug 14, 2013 at 10:49 AM, Jim Summers jbsumm...@gmail.com wrote: Hello List, I am attempting to build a ceph cluster on RHEL6 machines. Everything seems to work until I get to the step of creating new monitors with ceph-deploy. It seems to work, but when I get to the gatherkeys step, then it displays messages about not being able to get the various bootstrap keys. What version of ceph-deploy are you using? There was a release for OS packages yesterday and for the Python package Index on Friday (v1.2) I have disabled selinux and the firewall allows full access between the hosts involved. The problem seems to pop up on internet searches but I have not been able to find any solutions. Any ideas on what the issue may be? Thanks ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
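For anyone hitting the same gatherkeys failure: the bootstrap keys are only created once the monitors you defined actually form a quorum, so that is worth checking first. A sketch (the mon id in the socket path is usually the short hostname):
# on each monitor host, check that the mon is up and in quorum
# (the local admin socket needs no client keyring):
sudo ceph --admin-daemon /var/run/ceph/ceph-mon.$(hostname -s).asok mon_status
# once quorum forms, the bootstrap keyrings appear and can be gathered:
ls /var/lib/ceph/bootstrap-osd/ /var/lib/ceph/bootstrap-mds/
ceph-deploy gatherkeys <mon-host>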
Re: [ceph-users] ceph-deploy and journal on separate disk
On Wed, Aug 14, 2013 at 10:47 AM, Pavel Timoschenkov pa...@bayonetteas.onmicrosoft.com wrote: From: Alfredo Deza [mailto:alfredo.d...@inktank.com] Sent: Wednesday, August 14, 2013 5:41 PM To: Pavel Timoschenkov Cc: Samuel Just; ceph-us...@ceph.com Subject: Re: [ceph-users] ceph-deploy and journal on separate disk On Wed, Aug 14, 2013 at 7:41 AM, Pavel Timoschenkov pa...@bayonetteas.onmicrosoft.com wrote: It looks like at some point the filesystem is not passed to the options. Would you mind running the `ceph-disk-prepare` command again but with the --verbose flag? I think that from the output above (correct it if I am mistaken) that would be something like: ceph-disk-prepare --verbose -- /dev/sdaa /dev/sda1 Hi. If I'm running: ceph-deploy disk zap ceph001:sdaa ceph001:sda1 and ceph-disk -v prepare /dev/sdaa /dev/sda1, get the same errors: == root@ceph001:~# ceph-disk -v prepare /dev/sdaa /dev/sda1 DEBUG:ceph-disk:Journal /dev/sda1 is a partition WARNING:ceph-disk:OSD will not be hot-swappable if journal is not the same device as the osd data DEBUG:ceph-disk:Creating osd partition on /dev/sdaa Information: Moved requested sector from 34 to 2048 in order to align on 2048-sector boundaries. The operation has completed successfully. DEBUG:ceph-disk:Creating xfs fs on /dev/sdaa1 meta-data=/dev/sdaa1 isize=2048 agcount=32, agsize=22892700 blks = sectsz=512 attr=2, projid32bit=0 data = bsize=4096 blocks=732566385, imaxpct=5 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 ascii-ci=0 log =internal log bsize=4096 blocks=357698, version=2 = sectsz=512 sunit=0 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0 DEBUG:ceph-disk:Mounting /dev/sdaa1 on /var/lib/ceph/tmp/mnt.EGTIq2 with options noatime mount: /dev/sdaa1: more filesystems detected. This should not happen, use -t type to explicitly specify the filesystem type or use wipefs(8) to clean up the device. mount: you must specify the filesystem type ceph-disk: Mounting filesystem failed: Command '['mount', '-o', 'noatime', '--', '/dev/sdaa1', '/var/lib/ceph/tmp/mnt.EGTIq2']' returned non-zero exit status 32 If executed this command separately for both disks - looks like ok: For sdaa: root@ceph001:~# ceph-disk -v prepare /dev/sdaa INFO:ceph-disk:Will colocate journal with data on /dev/sdaa DEBUG:ceph-disk:Creating journal partition num 2 size 1024 on /dev/sdaa Information: Moved requested sector from 34 to 2048 in order to align on 2048-sector boundaries. The operation has completed successfully. DEBUG:ceph-disk:Journal is GPT partition /dev/disk/by-partuuid/d1389210-6e02-4460-9cb2-0e31e4b0924f DEBUG:ceph-disk:Creating osd partition on /dev/sdaa Information: Moved requested sector from 2097153 to 2099200 in order to align on 2048-sector boundaries. The operation has completed successfully.
DEBUG:ceph-disk:Creating xfs fs on /dev/sdaa1 meta-data=/dev/sdaa1 isize=2048 agcount=32, agsize=22884508 blks = sectsz=512 attr=2, projid32bit=0 data = bsize=4096 blocks=732304241, imaxpct=5 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 ascii-ci=0 log =internal log bsize=4096 blocks=357570, version=2 = sectsz=512 sunit=0 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0 DEBUG:ceph-disk:Mounting /dev/sdaa1 on /var/lib/ceph/tmp/mnt.K3q9v5 with options noatime DEBUG:ceph-disk:Preparing osd data dir /var/lib/ceph/tmp/mnt.K3q9v5 DEBUG:ceph-disk:Creating symlink /var/lib/ceph/tmp/mnt.K3q9v5/journal - /dev/disk/by-partuuid/d1389210-6e02-4460-9cb2-0e31e4b0924f DEBUG:ceph-disk:Unmounting /var/lib/ceph/tmp/mnt.K3q9v5 The operation has completed successfully. DEBUG:ceph-disk:Calling partprobe on prepared device /dev/sdaa For sda1: root@ceph001:~# ceph-disk -v prepare /dev/sda1 DEBUG:ceph-disk:OSD data device /dev/sda1 is a partition DEBUG:ceph-disk:Creating xfs fs on /dev/sda1 meta-data=/dev/sda1 isize=2048 agcount=4, agsize=655360 blks = sectsz=512 attr=2, projid32bit=0 data =
Re: [ceph-users] v0.67 Dumpling released
There are version specific repos, but you shouldn't need them if you want the latest. In fact, http://ceph.com/rpm/ is simply a link to http://ceph.com/rpm-dumpling Ian R. Colle Director of Engineering Inktank Cell: +1.303.601.7713 tel:%2B1.303.601.7713 Email: i...@inktank.com Delivering the Future of Storage http://www.linkedin.com/in/ircolle http://www.twitter.com/ircolle On 8/14/13 8:28 AM, Kyle Hutson kylehut...@k-state.edu wrote: Ah, didn't realize the repos were version-specific. Thanks Dan! On Wed, Aug 14, 2013 at 9:20 AM, Dan van der Ster daniel.vanders...@cern.ch wrote: http://ceph.com/rpm-dumpling/el6/x86_64/ -- Dan van der Ster CERN IT-DSS On Wednesday, August 14, 2013 at 4:17 PM, Kyle Hutson wrote: Any suggestions for upgrading CentOS/RHEL? The yum repos don't appear to have been updated yet. I thought maybe with the improved support for Red Hat platforms that would be the easy way of going about it. On Wed, Aug 14, 2013 at 5:08 AM, pe...@2force.nl (mailto:pe...@2force.nl) wrote: On 2013-08-14 07:32, Sage Weil wrote: Another three months have gone by, and the next stable release of Ceph is ready: Dumpling! Thank you to everyone who has contributed to this release! This release focuses on a few major themes since v0.61 (Cuttlefish): * rgw: multi-site, multi-datacenter support for S3/Swift object storage * new RESTful API endpoint for administering the cluster, based on a new and improved management API and updated CLI * mon: stability and performance * osd: stability performance * cephfs: open-by-ino support (for improved NFS reexport) * improved support for Red Hat platforms * use of the Intel CRC32c instruction when available As with previous stable releases, you can upgrade from previous versions of Ceph without taking the entire cluster online, as long as a few simple guidelines are followed. * For Dumpling, we have tested upgrades from both Bobtail and Cuttlefish. If you are running Argonaut, please upgrade to Bobtail and then to Dumpling. * Please upgrade daemons/hosts in the following order: 1. Upgrade ceph-common on all nodes that will use the command line ceph utility. 2. Upgrade all monitors (upgrade ceph package, restart ceph-mon daemons). This can happen one daemon or host at a time. Note that because cuttlefish and dumpling monitors cant talk to each other, all monitors should be upgraded in relatively short succession to minimize the risk that an untimely failure will reduce availability. 3. Upgrade all osds (upgrade ceph package, restart ceph-osd daemons). This can happen one daemon or host at a time. 4. Upgrade radosgw (upgrade radosgw package, restart radosgw daemons). There are several small compatibility changes between Cuttlefish and Dumpling, particularly with the CLI interface. Please see the complete release notes for a summary of the changes since v0.66 and v0.61 Cuttlefish, and other possible issues that should be considered before upgrading: http://ceph.com/docs/master/release-notes/#v0-67-dumpling Dumpling is the second Ceph release on our new three-month stable release cycle. We are very pleased to have pulled everything together on schedule. The next stable release, which will be code-named Emperor, is slated for three months from now (beginning of November). 
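For reference, a /etc/yum.repos.d/ceph.repo pointing at the Dumpling packages for el6 might look roughly like this (a sketch; see http://ceph.com/docs/master/install/rpm for the authoritative repo definition and gpg key):
[ceph]
name=Ceph packages for el6
baseurl=http://ceph.com/rpm-dumpling/el6/x86_64/
enabled=1
gpgcheck=1
gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc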
You can download v0.67 Dumpling from the usual locations: * Git at git://github.com/ceph/ceph.git http://github.com/ceph/ceph.git (http://github.com/ceph/ceph.git) * Tarball at http://ceph.com/download/ceph-0.67.tar.gz * For Debian/Ubuntu packages, see http://ceph.com/docs/master/install/debian * For RPMs, see http://ceph.com/docs/master/install/rpm ___ ceph-users mailing list ceph-users@lists.ceph.com (mailto:ceph-users@lists.ceph.com) http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Hi Sage, I just upgraded and everything went quite smoothly with osds, mons and mds, good work guys! :) The only problem I have ran into is with radosgw. It is unable to start after the upgrade with the following message: 2013-08-14 11:57:25.841310 7ffd0d2ae780 0 ceph version 0.67 (e3b7bc5bce8ab330ec1661381072368af3c218a0), process radosgw, pid 5612 2013-08-14 11:57:25.841328 7ffd0d2ae780 -1 WARNING: libcurl doesn't support curl_multi_wait() 2013-08-14 11:57:25.841335 7ffd0d2ae780 -1 WARNING: cross zone / region transfer performance may be affected 2013-08-14 11:57:25.855427 7ffcef7fe700 2 RGWDataChangesLog::ChangesRenewThread: start 2013-08-14 11:57:25.856138 7ffd0d2ae780 -1 Couldn't init storage provider (RADOS) ceph auth list returns: client.radosgw.gateway key: xx caps: [mon] allow r caps: [osd] allow rwx my config:
Re: [ceph-users] one pg stuck with 2 unfound pieces
Try restarting the two osd processes with debug osd = 20, debug ms = 1, debug filestore = 20. Restarting the osds may clear the problem, but if it recurs, the logs should help explain what's going on. -Sam On Wed, Aug 14, 2013 at 12:17 AM, Jens-Christian Fischer jens-christian.fisc...@switch.ch wrote: On 13.08.2013, at 21:09, Samuel Just sam.j...@inktank.com wrote: You can run 'ceph pg 0.cfa mark_unfound_lost revert'. (Revert Lost section of http://ceph.com/docs/master/rados/operations/placement-groups/). -Sam As I wrote further down the info, ceph wouldn't let me do that: root@ineri ~$ ceph pg 0.cfa mark_unfound_lost revert pg has 2 objects but we haven't probed all sources, not marking lost I'm looking for a way that forces the (re) probing of the sources… cheers jc On Tue, Aug 13, 2013 at 6:50 AM, Jens-Christian Fischer jens-christian.fisc...@switch.ch wrote: We have a cluster with 10 servers, 64 OSDs and 5 Mons on them. The OSDs are 3TB disk, formatted with btrfs and the servers are either on Ubuntu 12.10 or 13.04. Recently one of the servers (13.04) stood still (due to problems with btrfs - something we have seen a few times). I decided to not try to recover the disks, but reformat them with XFS. I removed the OSDs, reformatted, and re-created them (they got the same OSD numbers) I redid this twice (because I wrongly partioned the disks in the first place) and I ended up with 2 unfound pieces in one pg: root@s2:~# ceph health details HEALTH_WARN 1 pgs degraded; 1 pgs recovering; 1 pgs stuck unclean; recovery 4448/28915270 degraded (0.015%); 2/9854766 unfound (0.000%) pg 0.cfa is stuck unclean for 1004252.309704, current state active+recovering+degraded+remapped, last acting [23,50] pg 0.cfa is active+recovering+degraded+remapped, acting [23,50], 2 unfound recovery 4448/28915270 degraded (0.015%); 2/9854766 unfound (0.000%) root@s2:~# ceph pg 0.cfa query { state: active+recovering+degraded+remapped, epoch: 28197, up: [ 23, 50, 18], acting: [ 23, 50], info: { pgid: 0.cfa, last_update: 28082'7774, last_complete: 23686'7083, log_tail: 14360'4061, last_backfill: MAX, purged_snaps: [], history: { epoch_created: 1, last_epoch_started: 28197, last_epoch_clean: 24810, last_epoch_split: 0, same_up_since: 28195, same_interval_since: 28196, same_primary_since: 26036, last_scrub: 20585'6801, last_scrub_stamp: 2013-07-28 15:40:53.298786, last_deep_scrub: 20585'6801, last_deep_scrub_stamp: 2013-07-28 15:40:53.298786, last_clean_scrub_stamp: 2013-07-28 15:40:53.298786}, stats: { version: 28082'7774, reported: 28197'41950, state: active+recovering+degraded+remapped, last_fresh: 2013-08-13 14:34:33.057271, last_change: 2013-08-13 14:34:33.057271, last_active: 2013-08-13 14:34:33.057271, last_clean: 2013-08-01 23:50:18.414082, last_became_active: 2013-05-29 13:10:51.366237, last_unstale: 2013-08-13 14:34:33.057271, mapping_epoch: 28195, log_start: 14360'4061, ondisk_log_start: 14360'4061, created: 1, last_epoch_clean: 24810, parent: 0.0, parent_split_bits: 0, last_scrub: 20585'6801, last_scrub_stamp: 2013-07-28 15:40:53.298786, last_deep_scrub: 20585'6801, last_deep_scrub_stamp: 2013-07-28 15:40:53.298786, last_clean_scrub_stamp: 2013-07-28 15:40:53.298786, log_size: 0, ondisk_log_size: 0, stats_invalid: 0, stat_sum: { num_bytes: 145307402, num_objects: 2234, num_object_clones: 0, num_object_copies: 0, num_objects_missing_on_primary: 0, num_objects_degraded: 0, num_objects_unfound: 0, num_read: 744, num_read_kb: 410184, num_write: 7774, num_write_kb: 1155438, num_scrub_errors: 0, 
num_shallow_scrub_errors: 0, num_deep_scrub_errors: 0, num_objects_recovered: 3998, num_bytes_recovered: 278803622, num_keys_recovered: 0}, stat_cat_sum: {}, up: [ 23, 50, 18], acting: [ 23, 50]}, empty: 0, dne: 0, incomplete: 0, last_epoch_started: 28197}, recovery_state: [ { name: Started\/Primary\/Active, enter_time: 2013-08-13 14:34:33.026698, might_have_unfound: [ { osd: 9, status: querying}, { osd: 18, status: querying}, { osd: 50, status: already probed}],
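To apply the debug settings Sam asks for, either add them to ceph.conf on the relevant hosts and restart the two daemons, or (as an alternative sketch) inject them into the running OSDs; osd.23 and osd.50 here are the acting set from the query above:
# ceph.conf, [osd] section:
#   debug osd = 20
#   debug ms = 1
#   debug filestore = 20
# or, without editing ceph.conf:
ceph tell osd.23 injectargs '--debug-osd 20 --debug-ms 1 --debug-filestore 20'
ceph tell osd.50 injectargs '--debug-osd 20 --debug-ms 1 --debug-filestore 20'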
Re: [ceph-users] v0.67 Dumpling released
Thanks for that bit, too, Ian. For what it's worth, I updated /etc/yum.repos.d/ceph.repo , installed the latest version (from cuttlefish), restarted (monitors first, then everything else) and everything looks great. On Wed, Aug 14, 2013 at 1:28 PM, Ian Colle ian.co...@inktank.com wrote: There are version specific repos, but you shouldn't need them if you want the latest. In fact, http://ceph.com/rpm/ is simply a link to http://ceph.com/rpm-dumpling Ian R. Colle Director of Engineering Inktank Cell: +1.303.601.7713 Email: i...@inktank.com Delivering the Future of Storage http://www.linkedin.com/in/ircolle [image: Follow teststamp on Twitter] http://www.twitter.com/ircolle On 8/14/13 8:28 AM, Kyle Hutson kylehut...@k-state.edu wrote: Ah, didn't realize the repos were version-specific. Thanks Dan! On Wed, Aug 14, 2013 at 9:20 AM, Dan van der Ster daniel.vanders...@cern.ch wrote: http://ceph.com/rpm-dumpling/el6/x86_64/ -- Dan van der Ster CERN IT-DSS On Wednesday, August 14, 2013 at 4:17 PM, Kyle Hutson wrote: Any suggestions for upgrading CentOS/RHEL? The yum repos don't appear to have been updated yet. I thought maybe with the improved support for Red Hat platforms that would be the easy way of going about it. On Wed, Aug 14, 2013 at 5:08 AM, pe...@2force.nl (mailto: pe...@2force.nl) wrote: On 2013-08-14 07:32, Sage Weil wrote: Another three months have gone by, and the next stable release of Ceph is ready: Dumpling! Thank you to everyone who has contributed to this release! This release focuses on a few major themes since v0.61 (Cuttlefish): * rgw: multi-site, multi-datacenter support for S3/Swift object storage * new RESTful API endpoint for administering the cluster, based on a new and improved management API and updated CLI * mon: stability and performance * osd: stability performance * cephfs: open-by-ino support (for improved NFS reexport) * improved support for Red Hat platforms * use of the Intel CRC32c instruction when available As with previous stable releases, you can upgrade from previous versions of Ceph without taking the entire cluster online, as long as a few simple guidelines are followed. * For Dumpling, we have tested upgrades from both Bobtail and Cuttlefish. If you are running Argonaut, please upgrade to Bobtail and then to Dumpling. * Please upgrade daemons/hosts in the following order: 1. Upgrade ceph-common on all nodes that will use the command line ceph utility. 2. Upgrade all monitors (upgrade ceph package, restart ceph-mon daemons). This can happen one daemon or host at a time. Note that because cuttlefish and dumpling monitors cant talk to each other, all monitors should be upgraded in relatively short succession to minimize the risk that an untimely failure will reduce availability. 3. Upgrade all osds (upgrade ceph package, restart ceph-osd daemons). This can happen one daemon or host at a time. 4. Upgrade radosgw (upgrade radosgw package, restart radosgw daemons). There are several small compatibility changes between Cuttlefish and Dumpling, particularly with the CLI interface. Please see the complete release notes for a summary of the changes since v0.66 and v0.61 Cuttlefish, and other possible issues that should be considered before upgrading: http://ceph.com/docs/master/release-notes/#v0-67-dumpling Dumpling is the second Ceph release on our new three-month stable release cycle. We are very pleased to have pulled everything together on schedule. 
The next stable release, which will be code-named Emperor, is slated for three months from now (beginning of November). You can download v0.67 Dumpling from the usual locations: * Git at git://github.com/ceph/ceph.git ( http://github.com/ceph/ceph.git) * Tarball at http://ceph.com/download/ceph-0.67.tar.gz * For Debian/Ubuntu packages, see http://ceph.com/docs/master/install/debian * For RPMs, see http://ceph.com/docs/master/install/rpm ___ ceph-users mailing list ceph-users@lists.ceph.com (mailto:ceph-users@lists.ceph.com) http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Hi Sage, I just upgraded and everything went quite smoothly with osds, mons and mds, good work guys! :) The only problem I have ran into is with radosgw. It is unable to start after the upgrade with the following message: 2013-08-14 11:57:25.841310 7ffd0d2ae780 0 ceph version 0.67 (e3b7bc5bce8ab330ec1661381072368af3c218a0), process radosgw, pid 5612 2013-08-14 11:57:25.841328 7ffd0d2ae780 -1 WARNING: libcurl doesn't support curl_multi_wait() 2013-08-14 11:57:25.841335 7ffd0d2ae780 -1 WARNING: cross zone / region transfer performance may be affected 2013-08-14
Re: [ceph-users] ceph-deploy and journal on separate disk
It looks like at some point the filesystem is not passed to the options. Would you mind running the `ceph-disk-prepare` command again but with the --verbose flag? I think that from the output above (correct it if I am mistaken) that would be something like: ceph-disk-prepare --verbose -- /dev/sdaa /dev/sda1 Hi. If I'm running: ceph-deploy disk zap ceph001:sdaa ceph001:sda1 and ceph-disk -v prepare /dev/sdaa /dev/sda1, get the same errors: == root@ceph001:~# ceph-disk -v prepare /dev/sdaa /dev/sda1 DEBUG:ceph-disk:Journal /dev/sda1 is a partition WARNING:ceph-disk:OSD will not be hot-swappable if journal is not the same device as the osd data DEBUG:ceph-disk:Creating osd partition on /dev/sdaa Information: Moved requested sector from 34 to 2048 in order to align on 2048-sector boundaries. The operation has completed successfully. DEBUG:ceph-disk:Creating xfs fs on /dev/sdaa1 meta-data=/dev/sdaa1 isize=2048 agcount=32, agsize=22892700 blks = sectsz=512 attr=2, projid32bit=0 data = bsize=4096 blocks=732566385, imaxpct=5 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 ascii-ci=0 log =internal log bsize=4096 blocks=357698, version=2 = sectsz=512 sunit=0 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0 DEBUG:ceph-disk:Mounting /dev/sdaa1 on /var/lib/ceph/tmp/mnt.EGTIq2 with options noatime mount: /dev/sdaa1: more filesystems detected. This should not happen, use -t type to explicitly specify the filesystem type or use wipefs(8) to clean up the device. mount: you must specify the filesystem type ceph-disk: Mounting filesystem failed: Command '['mount', '-o', 'noatime', '--', '/dev/sdaa1', '/var/lib/ceph/tmp/mnt.EGTIq2']' returned non-zero exit status 32 If executed this command separately for both disks - looks like ok: For sdaa: root@ceph001:~# ceph-disk -v prepare /dev/sdaa INFO:ceph-disk:Will colocate journal with data on /dev/sdaa DEBUG:ceph-disk:Creating journal partition num 2 size 1024 on /dev/sdaa Information: Moved requested sector from 34 to 2048 in order to align on 2048-sector boundaries. The operation has completed successfully. DEBUG:ceph-disk:Journal is GPT partition /dev/disk/by-partuuid/d1389210-6e02-4460-9cb2-0e31e4b0924f DEBUG:ceph-disk:Creating osd partition on /dev/sdaa Information: Moved requested sector from 2097153 to 2099200 in order to align on 2048-sector boundaries. The operation has completed successfully. DEBUG:ceph-disk:Creating xfs fs on /dev/sdaa1 meta-data=/dev/sdaa1 isize=2048 agcount=32, agsize=22884508 blks = sectsz=512 attr=2, projid32bit=0 data = bsize=4096 blocks=732304241, imaxpct=5 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 ascii-ci=0 log =internal log bsize=4096 blocks=357570, version=2 = sectsz=512 sunit=0 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0 DEBUG:ceph-disk:Mounting /dev/sdaa1 on /var/lib/ceph/tmp/mnt.K3q9v5 with options noatime DEBUG:ceph-disk:Preparing osd data dir /var/lib/ceph/tmp/mnt.K3q9v5 DEBUG:ceph-disk:Creating symlink /var/lib/ceph/tmp/mnt.K3q9v5/journal - /dev/disk/by-partuuid/d1389210-6e02-4460-9cb2-0e31e4b0924f DEBUG:ceph-disk:Unmounting /var/lib/ceph/tmp/mnt.K3q9v5 The operation has completed successfully. 
DEBUG:ceph-disk:Calling partprobe on prepared device /dev/sdaa For sda1: root@ceph001:~# ceph-disk -v prepare /dev/sda1 DEBUG:ceph-disk:OSD data device /dev/sda1 is a partition DEBUG:ceph-disk:Creating xfs fs on /dev/sda1 meta-data=/dev/sda1 isize=2048 agcount=4, agsize=655360 blks = sectsz=512 attr=2, projid32bit=0 data = bsize=4096 blocks=2621440, imaxpct=25 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 ascii-ci=0 log =internal log bsize=4096 blocks=2560, version=2 = sectsz=512 sunit=0 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0 DEBUG:ceph-disk:Mounting /dev/sda1 on /var/lib/ceph/tmp/mnt.G30zPD with options noatime DEBUG:ceph-disk:Preparing osd data dir /var/lib/ceph/tmp/mnt.G30zPD DEBUG:ceph-disk:Unmounting /var/lib/ceph/tmp/mnt.G30zPD DEBUG:ceph-disk:Calling partprobe on prepared device /dev/sda1 From: Alfredo Deza [mailto:alfredo.d...@inktank.com] Sent: Tuesday, August 13, 2013 11:14 PM To: Pavel Timoschenkov Cc: Samuel Just; ceph-us...@ceph.com Subject: Re: [ceph-users] ceph-deploy and journal on separate disk On Tue, Aug 13, 2013 at 3:21 AM, Pavel Timoschenkov
[ceph-users] Adding Disks / Storage
Hello All, Just starting out with Ceph and wanted to make sure that Ceph will do a couple of things. 1. It has the ability to keep a CephFS mount available to users even if one of the OSD servers has to be rebooted or otherwise goes down. 2. It is possible to keep adding disks to build one large storage pool / CephFS. By this I mean I want users to see the mount point /data/ceph, initially built from two OSD servers that each have 24TB of local storage to serve. Would that give them about 40TB initially? 3. Then over time add a third OSD server that also has 24TB, which just becomes part of the CephFS mounted at /data/ceph and grows it to about 60TB. Is that do-able? 4. The OSD servers also have fibre channel access to some LUNs on a DDN SAN. Can I also add those into the same storage pool and mount point /data/ceph? TIA ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
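On points 2 and 3: growing the cluster later is a matter of adding OSDs and letting CRUSH rebalance onto them; a rough sketch with ceph-deploy (host and device names are placeholders):
# prepare and activate a new disk on a new or existing OSD host
ceph-deploy osd create newhost:sdb
# watch the cluster rebalance onto the new capacity
ceph -w    # or: ceph -s
Note that usable capacity depends on the replication size of your pools: with 2 replicas, 48TB of raw disk works out to roughly 24TB of usable space.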
Re: [ceph-users] v0.67 Dumpling released
Would it be possible to generate rpms for the latest OpenSuSE-12.3? Regards, Mikhail From: ceph-users-boun...@lists.ceph.com [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Ian Colle Sent: Wednesday, August 14, 2013 2:29 PM To: Kyle Hutson Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] v0.67 Dumpling released There are version specific repos, but you shouldn't need them if you want the latest. In fact, http://ceph.com/rpm/ is simply a link to http://ceph.com/rpm-dumpling ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] Ceph-Devel
Hello All, I just re-installed the ceph-release package on my RHEL system in an effort to get Dumpling installed. After doing that I cannot yum install ceph-deploy. I then yum installed ceph, but there is still no ceph-deploy. Ideas? Thanks ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
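As noted elsewhere in this digest, ceph-deploy 1.2 is also published on the Python Package Index, so one interim workaround is to install it with pip (a sketch):
yum install python-pip      # or use easy_install from python-setuptools
pip install ceph-deploy
which ceph-deploy           # confirm it is on the PATH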
Re: [ceph-users] rbd ls -l hangs
On Thu, Aug 1, 2013 at 9:57 AM, Jeff Moskow j...@rtr.com wrote: Greg, Thanks for the hints. I looked through the logs and found OSD's with RETRY's. I marked those out (marked in orange) and let ceph rebalance. Then I ran the bench command. I now have many more errors than before :-(. health HEALTH_WARN 1 pgs incomplete; 1 pgs stuck inactive; 151 pgs stuck unclean Note that the incomplete pg is still the same (2.1f6). Any ideas on what to try next? 2013-08-01 12:39:38.349011 osd.4 172.16.170.2:6801/1778 1154 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 18.085318 sec at 57979 KB/sec 2013-08-01 12:39:38.499002 osd.5 172.16.170.2:6802/19375 454 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 18.232358 sec at 57511 KB/sec 2013-08-01 12:39:44.077347 osd.3 172.16.170.2:6800/1647 1211 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 23.813801 sec at 44032 KB/sec 2013-08-01 12:39:49.118812 osd.16 172.16.170.4:6802/1837 746 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 28.453320 sec at 36852 KB/sec 2013-08-01 12:39:48.468020 osd.15 172.16.170.4:6801/1699 821 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 27.802566 sec at 37715 KB/sec 2013-08-01 12:39:54.369364 osd.0 172.16.170.1:6800/3783 948 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 34.076451 sec at 30771 KB/sec 2013-08-01 12:39:48.618080 osd.14 172.16.170.4:6800/1572 16161 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 27.952574 sec at 37512 KB/sec 2013-08-01 12:39:54.382830 osd.2 172.16.170.1:6803/22033 222 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 34.090170 sec at 30758 KB/sec 2013-08-01 12:40:03.458096 osd.6 172.16.170.3:6801/1738 1582 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 43.143180 sec at 24304 KB/sec 2013-08-01 12:40:03.724504 osd.10 172.16.170.3:6800/1473 1238 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 43.409558 sec at 24155 KB/sec 2013-08-01 12:40:02.426650 osd.8 172.16.170.3:6803/2013 8272 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 42.111713 sec at 24899 KB/sec 2013-08-01 12:40:02.997093 osd.7 172.16.170.3:6802/1864 1094 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 42.682079 sec at 24567 KB/sec 2013-08-01 12:40:02.867046 osd.9 172.16.170.3:6804/2149 2258 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 42.551771 sec at 24642 KB/sec 2013-08-01 12:39:54.360014 osd.1 172.16.170.1:6801/4243 3060 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 34.070725 sec at 30776 KB/sec 2013-08-01 12:42:56.984632 osd.11 172.16.170.5:6800/28025 43996 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 216.687559 sec at 4839 KB/sec 2013-08-01 12:43:21.271481 osd.13 172.16.170.5:6802/1872 1056 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 240.974360 sec at 4351 KB/sec 2013-08-01 12:43:39.320462 osd.12 172.16.170.5:6801/1700 1348 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 259.023646 sec at 4048 KB/sec Sorry for the slow reply; I've been out on vacation. :) Looking through this list, I'm noticing that many of your OSDs are reporting 4MB/s write speeds and they don't correspond to the ones you marked out (though if your cluster was somehow under load that could have something to do with the very different speed reports). You still want to look at the pg statistics for the stuck PG; I'm not seeing that anywhere? -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
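For the pg statistics Greg is asking about, something along these lines should capture them (2.1f6 is the incomplete pg mentioned above):
ceph health detail
ceph pg dump_stuck unclean
ceph pg dump_stuck inactive
ceph pg 2.1f6 query
ceph pg map 2.1f6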
[ceph-users] Glance image upload errors after upgrading to Dumpling
Hello Everyone, I have a Ceph test cluster doing storage for an OpenStack Grizzly platform (also testing). Upgrading to 0.67 went fine on the Ceph side with the cluster showing healthy but suddenly I can't upload images into Glance anymore. The upload fails and glance-api throws an error: 2013-08-14 15:19:55.898 ERROR glance.api.v1.images [4dcd9de0-af65-4902-a36d-afc5497605e7 3867c65db6cc48398a0f57ce53144e69 5dbca756421c4a3eb0a1cc2f1ee3c67c] Failed to upload image 2013-08-14 15:19:55.898 24740 TRACE glance.api.v1.images Traceback (most recent call last): 2013-08-14 15:19:55.898 24740 TRACE glance.api.v1.images File /usr/lib/python2.6/site-packages/glance/api/v1/images.py, line 444, in _upload 2013-08-14 15:19:55.898 24740 TRACE glance.api.v1.images image_meta['size']) 2013-08-14 15:19:55.898 24740 TRACE glance.api.v1.images File /usr/lib/python2.6/site-packages/glance/store/rbd.py, line 241, in add 2013-08-14 15:19:55.898 24740 TRACE glance.api.v1.images with rados.Rados(conffile=self.conf_file, rados_id=self.user) as conn: 2013-08-14 15:19:55.898 24740 TRACE glance.api.v1.images File /usr/lib/python2.6/site-packages/rados.py, line 195, in __init__ 2013-08-14 15:19:55.898 24740 TRACE glance.api.v1.images raise Error(Rados(): can't supply both rados_id and name) 2013-08-14 15:19:55.898 24740 TRACE glance.api.v1.images Error: Rados(): can't supply both rados_id and name 2013-08-14 15:19:55.898 24740 TRACE glance.api.v1.images I'm not sure if there's a patch I need to track down for Glance or if I missed a change in the necessary Glance/Ceph setup. Is anyone else seeing this behavior? Thanks! -Mike ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Glance image upload errors after upgrading to Dumpling
On 08/14/2013 02:22 PM, Michael Morgan wrote: Hello Everyone, I have a Ceph test cluster doing storage for an OpenStack Grizzly platform (also testing). Upgrading to 0.67 went fine on the Ceph side with the cluster showing healthy but suddenly I can't upload images into Glance anymore. The upload fails and glance-api throws an error: 2013-08-14 15:19:55.898 ERROR glance.api.v1.images [4dcd9de0-af65-4902-a36d-afc5497605e7 3867c65db6cc48398a0f57ce53144e69 5dbca756421c4a3eb0a1cc2f1ee3c67c] Failed to upload image 2013-08-14 15:19:55.898 24740 TRACE glance.api.v1.images Traceback (most recent call last): 2013-08-14 15:19:55.898 24740 TRACE glance.api.v1.images File /usr/lib/python2.6/site-packages/glance/api/v1/images.py, line 444, in _upload 2013-08-14 15:19:55.898 24740 TRACE glance.api.v1.images image_meta['size']) 2013-08-14 15:19:55.898 24740 TRACE glance.api.v1.images File /usr/lib/python2.6/site-packages/glance/store/rbd.py, line 241, in add 2013-08-14 15:19:55.898 24740 TRACE glance.api.v1.images with rados.Rados(conffile=self.conf_file, rados_id=self.user) as conn: 2013-08-14 15:19:55.898 24740 TRACE glance.api.v1.images File /usr/lib/python2.6/site-packages/rados.py, line 195, in __init__ 2013-08-14 15:19:55.898 24740 TRACE glance.api.v1.images raise Error(Rados(): can't supply both rados_id and name) 2013-08-14 15:19:55.898 24740 TRACE glance.api.v1.images Error: Rados(): can't supply both rados_id and name 2013-08-14 15:19:55.898 24740 TRACE glance.api.v1.images This would be a backwards-compatibility regression in the librados python bindings - a fix is in the dumpling branch, and a point release is in the works. You could add name=None to that rados.Rados() call in glance to work around it in the meantime. Josh I'm not sure if there's a patch I need to track down for Glance or if I missed a change in the necessary Glance/Ceph setup. Is anyone else seeing this behavior? Thanks! -Mike ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
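If it helps, the binding regression can be reproduced outside Glance from a shell; a sketch (the rados_id value here is just an example):
python -c "import rados; rados.Rados(conffile='/etc/ceph/ceph.conf', rados_id='glance')"
# on the affected 0.67 bindings this raises:
#   Error: Rados(): can't supply both rados_id and name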
Re: [ceph-users] v0.67 Dumpling released
Sage et al, This is an exciting release but I must say I'm a bit confused about some of the new rgw details. Questions: 1) I'd like to understand how regions work. I assume that's how you get multi-site, multi-datacenter support working, but must the regions still be part of the same ceph cluster? 2) I have two independent zones (intranet and internet). Should they be put in the same region by setting 'rgw region root pool = blabla'? I wasn't sure how placement_targets work. 3) When I upgraded my rgw from .61 to .67 I lost access to my data. I used 'rgw_zone_root_pool' and noticed the zone object changed from zone_info to zone_info.default. I did a 'rados cp zone_info zone_info.default --pool blabla'. That fixed it, but I'm not sure if that's the correct fix. 4) In zone_info.default I see the following at the end: ...system_key: { access_key: , secret_key: }, placement_pools: []} What are these for exactly and should they be set? Or are they just placeholders for the E (Emperor) release? Thanks and keep up the great work! ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
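For anyone else poking at the new multi-site pieces, the region/zone configuration rgw is actually using can be dumped with radosgw-admin instead of raw rados copies; a sketch (these subcommands should be present in 0.67, but check radosgw-admin --help on your build):
radosgw-admin region get
radosgw-admin zone get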
[ceph-users] cuttlefish cluster operation problem
Dear Mr.: Hi! After I deployed a Cuttlefish cluster on several nodes, no other daemons were found on the monitor node with sudo initctl list | grep ceph. As the output below shows, I can only find the monitor daemon process: ceph-osd-all stop/waiting ceph-mds-all-starter stop/waiting ceph-mds-all stop/waiting ceph-osd-all-starter stop/waiting ceph-all start/running ceph-mon-all start/running ceph-mon-all-starter stop/waiting ceph-mon (ceph/ceph-mon21) start/running, process 9236 ceph-create-keys stop/waiting ceph-osd stop/waiting ceph-mds stop/waiting Of course I cannot control the other daemons in the cluster. Would you please help me find the problem? Thanks very much! ashely ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
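A couple of things worth checking (hedged, since the full deployment isn't shown): initctl only lists jobs on the node it is run on, so the ceph-osd jobs will only show as running on the OSD hosts, not on the monitor; on those hosts the upstart jobs can also be started by hand:
# on an OSD host:
sudo start ceph-osd-all            # all osds prepared on that host
sudo start ceph-osd id=0           # or a single osd
sudo initctl list | grep ceph      # confirm they are running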
Re: [ceph-users] v0.67 Dumpling released
Hi lists, in this release I see that the ceph command is not compatible with Python 3. The changes were not all trivial, so I gave up porting it; but for those using Gentoo, I have made my ceph git repository available here, with an ebuild that forces the Python version to 2.6 or 2.7: git clone https://git.isi.nc/cloud/cloud-overlay.git I upgraded from Cuttlefish without any problem, so good job, ceph contributors :) Have a nice day. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com