Re: [ceph-users] radosgw keystone integration in mitaka
The ability to use Keystone v3 and auth tokens in lieu of the admin token was added in Jewel. The release notes state it, but unfortunately the Jewel docs don't reflect it, so you'll need to visit http://docs.ceph.com/docs/master/radosgw/keystone/ to find the configuration information. When I tested this out, I had something like:

[client.rgw.radosgw-1]
host = radosgw-1
keyring = /var/lib/ceph/radosgw/ceph-rgw.radosgw-1/keyring
rgw data = /var/lib/ceph/radosgw/ceph-rgw.radosgw-1
rgw socket path = /tmp/radosgw-radosgw-1.sock
log file = /var/log/ceph/ceph-rgw-radosgw-1.log
rgw frontends = civetweb port=10.13.32.15:8080 num_threads=50
rgw keystone url = http://keystone-admin-endpoint:35357
rgw keystone api version = 3
rgw keystone admin user = radosgw
rgw keystone admin password =
rgw keystone admin tenant = service
rgw keystone admin domain = default
rgw keystone accepted roles = Member, _member_, admin
rgw keystone token cache size = 1
rgw keystone revocation interval = 900
rgw s3 auth use keystone = true

Logan

On Friday, October 14, 2016, Jonathan Proulx wrote:
> Hi All,
>
> Recently upgraded from Kilo->Mitaka on my OpenStack deploy and now
> radosgw nodes (jewel) are unable to validate keystone tokens.
>
> Initially I thought it was because radosgw relies on admin_token
> (which is a bad idea, but ...) and that's now deprecated. I
> verified the token was still in keystone.conf and fixed it when I found
> it had been commented out of keystone-paste.ini, but even after fixing
> that and restarting my keystone I get:
>
> -- grep req-a5030a83-f265-4b25-b6e5-1918c978f824 /var/log/keystone/keystone.log
> 2016-10-14 15:12:47.631 35977 WARNING keystone.middleware.auth
> [req-a5030a83-f265-4b25-b6e5-1918c978f824 - - - - -] Deprecated:
> build_auth_context middleware checking for the admin token is deprecated as
> of the Mitaka release and will be removed in the O release.
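Before restarting radosgw against a stanza like the one above, it can help to sanity-check that the Keystone v3 options are all present. This is a rough sketch, not an official validator: the key names come from the config above, but which keys are strictly required is my assumption.

```python
import configparser

# Keystone-related options from the stanza above; treating this exact
# set as "required" is an assumption for illustration.
REQUIRED_V3_KEYS = [
    "rgw keystone url",
    "rgw keystone admin user",
    "rgw keystone admin password",
    "rgw keystone admin tenant",
    "rgw keystone admin domain",    # matters once api version = 3
    "rgw keystone accepted roles",
]

def missing_keystone_keys(conf_text, section="client.rgw.radosgw-1"):
    """Return the v3-related keys absent from the given ceph.conf section."""
    cp = configparser.ConfigParser()
    cp.read_string(conf_text)
    opts = cp[section]
    missing = [k for k in REQUIRED_V3_KEYS if k not in opts]
    if opts.get("rgw keystone api version") != "3":
        missing.append("rgw keystone api version = 3")
    return missing

# A deliberately incomplete sample section:
sample = """
[client.rgw.radosgw-1]
rgw keystone url = http://keystone-admin-endpoint:35357
rgw keystone api version = 3
rgw keystone admin user = radosgw
"""
print(missing_keystone_keys(sample))
```

Running this against the sample flags the omitted admin password, tenant, domain, and accepted-roles keys.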
> If your
> deployment requires use of the admin token, update keystone-paste.ini so
> that admin_token_auth is before build_auth_context in the paste pipelines,
> otherwise remove the admin_token_auth middleware from the paste pipelines.
> 2016-10-14 15:12:47.671 35977 INFO keystone.common.wsgi
> [req-a5030a83-f265-4b25-b6e5-1918c978f824 - - - - -] GET
> https://nimbus-1.csail.mit.edu:35358/v2.0/tokens/
> 2016-10-14 15:12:47.672 35977 WARNING oslo_log.versionutils
> [req-a5030a83-f265-4b25-b6e5-1918c978f824 - - - - -] Deprecated:
> validate_token of the v2 API is deprecated as of Mitaka in favor of a
> similar function in the v3 API and may be removed in Q.
> 2016-10-14 15:12:47.684 35977 WARNING keystone.common.wsgi
> [req-a5030a83-f265-4b25-b6e5-1918c978f824 - - - - -] You are not
> authorized to perform the requested action: identity:validate_token
>
> I've dug through keystone/policy.json and identity:validate_token is
> authorized to "role:admin or is_admin:1", which I *think* should cover
> the token use case... but not 100% sure.
>
> Can radosgw use a proper keystone user so I can avoid the admin_token
> mess (http://docs.ceph.com/docs/jewel/radosgw/keystone/ seems to
> indicate no)?
>
> Or anyone see where in my keystone chain I might have dropped a link?
>
> Thanks,
> -Jon
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
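The deprecation warning quoted above gives a concrete rule: admin_token_auth must appear before build_auth_context in each paste pipeline, or be removed entirely. A small script can check an existing keystone-paste.ini against that rule (a sketch; the sample pipeline below is a hypothetical, cut-down version of a real one):

```python
import configparser

def check_pipeline(paste_ini_text, section):
    """Apply the Mitaka deprecation advice: admin_token_auth should
    either precede build_auth_context or be removed from the pipeline.
    Returns 'ok-removed', 'ok-before', or 'deprecated-order'."""
    cp = configparser.ConfigParser()
    cp.read_string(paste_ini_text)
    parts = cp[section]["pipeline"].split()
    if "admin_token_auth" not in parts:
        return "ok-removed"
    if parts.index("admin_token_auth") < parts.index("build_auth_context"):
        return "ok-before"
    return "deprecated-order"

# Hypothetical minimal pipeline resembling a Kilo-era keystone-paste.ini:
sample = """
[pipeline:public_api]
pipeline = sizelimit build_auth_context admin_token_auth json_body public_service
"""
print(check_pipeline(sample, "pipeline:public_api"))  # deprecated-order
```

Here the sample fails the check because admin_token_auth comes after build_auth_context, which is exactly the ordering the warning complains about.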
Re: [ceph-users] all three mons segfault at same time
I am in the process of upgrading a cluster with mixed 0.94.2/0.94.3 to 0.94.5 this morning and am seeing identical crashes. During the rolling upgrade across the mons this morning, after the 3rd of 3 mons was restarted to 0.94.5, all 3 crashed simultaneously, identical to what you are describing above. Now I am seeing rolling crashes across the 3 mons continually. I am still in the process of upgrading about 200 OSDs to 0.94.5, so most of them are still running 0.94.2 and 0.94.3. There are 3 MDSs running 0.94.5 during these crashes.

==> /var/log/clusterboot/lsn-mc1008/syslog <==
Nov 10 10:07:30 lsn-mc1008 kernel: [6392349.844640] init: ceph-mon (ceph/lsn-mc1008) main process (2254664) killed by SEGV signal
Nov 10 10:07:30 lsn-mc1008 kernel: [6392349.844648] init: ceph-mon (ceph/lsn-mc1008) main process ended, respawning
==> /var/log/clusterboot/lsn-mc1006/syslog <==
Nov 10 10:07:46 lsn-mc1006 kernel: [6392890.294124] init: ceph-mon (ceph/lsn-mc1006) main process (2183307) killed by SEGV signal
Nov 10 10:07:46 lsn-mc1006 kernel: [6392890.294132] init: ceph-mon (ceph/lsn-mc1006) main process ended, respawning
==> /var/log/clusterboot/lsn-mc1007/syslog <==
Nov 10 10:07:46 lsn-mc1007 kernel: [6392599.894914] init: ceph-mon (ceph/lsn-mc1007) main process (1998234) killed by SEGV signal
Nov 10 10:07:46 lsn-mc1007 kernel: [6392599.894923] init: ceph-mon (ceph/lsn-mc1007) main process ended, respawning
==> /var/log/clusterboot/lsn-mc1008/syslog <==
Nov 10 10:07:46 lsn-mc1008 kernel: [6392365.959984] init: ceph-mon (ceph/lsn-mc1008) main process (2263082) killed by SEGV signal
Nov 10 10:07:46 lsn-mc1008 kernel: [6392365.959992] init: ceph-mon (ceph/lsn-mc1008) main process ended, respawning
==> /var/log/clusterboot/lsn-mc1006/syslog <==
Nov 10 10:07:52 lsn-mc1006 kernel: [6392896.674332] init: ceph-mon (ceph/lsn-mc1006) main process (2191273) killed by SEGV signal
Nov 10 10:07:52 lsn-mc1006 kernel: [6392896.674340] init: ceph-mon (ceph/lsn-mc1006) main process ended, respawning
==> /var/log/clusterboot/lsn-mc1008/syslog <==
Nov 10 10:07:52 lsn-mc1008 kernel: [6392372.324282] init: ceph-mon (ceph/lsn-mc1008) main process (2270979) killed by SEGV signal
Nov 10 10:07:52 lsn-mc1008 kernel: [6392372.324295] init: ceph-mon (ceph/lsn-mc1008) main process ended, respawning
==> /var/log/clusterboot/lsn-mc1007/syslog <==
Nov 10 10:07:52 lsn-mc1007 kernel: [6392606.272911] init: ceph-mon (ceph/lsn-mc1007) main process (2006118) killed by SEGV signal
Nov 10 10:07:52 lsn-mc1007 kernel: [6392606.272995] init: ceph-mon (ceph/lsn-mc1007) main process ended, respawning
==> /var/log/clusterboot/lsn-mc1006/syslog <==
Nov 10 10:07:55 lsn-mc1006 kernel: [6392899.046307] init: ceph-mon (ceph/lsn-mc1006) main process (2192187) killed by SEGV signal
Nov 10 10:07:55 lsn-mc1006 kernel: [6392899.046315] init: ceph-mon (ceph/lsn-mc1006) main process ended, respawning
==> /var/log/clusterboot/lsn-mc1007/syslog <==
Nov 10 10:08:17 lsn-mc1007 kernel: [6392631.192476] init: ceph-mon (ceph/lsn-mc1007) main process (2006489) killed by SEGV signal
Nov 10 10:08:17 lsn-mc1007 kernel: [6392631.192484] init: ceph-mon (ceph/lsn-mc1007) main process ended, respawning
==> /var/log/clusterboot/lsn-mc1006/syslog <==
Nov 10 10:08:17 lsn-mc1006 kernel: [6392921.600089] init: ceph-mon (ceph/lsn-mc1006) main process (2192298) killed by SEGV signal
Nov 10 10:08:17 lsn-mc1006 kernel: [6392921.600108] init: ceph-mon (ceph/lsn-mc1006) main process ended, respawning
==> /var/log/clusterboot/lsn-mc1008/syslog <==
Nov 10 10:08:17 lsn-mc1008 kernel: [6392397.277994] init: ceph-mon (ceph/lsn-mc1008) main process (2271246) killed by SEGV signal
Nov 10 10:08:17 lsn-mc1008 kernel: [6392397.278002] init: ceph-mon (ceph/lsn-mc1008) main process ended, respawning
==> /var/log/clusterboot/lsn-mc1006/syslog <==
Nov 10 10:08:23 lsn-mc1006 kernel: [6392927.999229] init: ceph-mon (ceph/lsn-mc1006) main process (2200399) killed by SEGV signal
Nov 10 10:08:23 lsn-mc1006 kernel: [6392927.999242] init: ceph-mon (ceph/lsn-mc1006) main process ended, respawning
==> /var/log/clusterboot/lsn-mc1008/syslog <==
Nov 10 10:08:23 lsn-mc1008 kernel: [6392403.641241] init: ceph-mon (ceph/lsn-mc1008) main process (2279050) killed by SEGV signal
Nov 10 10:08:23 lsn-mc1008 kernel: [6392403.641254] init: ceph-mon (ceph/lsn-mc1008) main process ended, respawning
==> /var/log/clusterboot/lsn-mc1007/syslog <==
Nov 10 10:08:24 lsn-mc1007 kernel: [6392637.614495] init: ceph-mon (ceph/lsn-mc1007) main process (2013418) killed by SEGV signal
Nov 10 10:08:24 lsn-mc1007 kernel: [6392637.614504] init: ceph-mon (ceph/lsn-mc1007) main process ended, respawning

On Mon, Nov 2, 2015 at 8:35 AM, Arnulf Heimsbakk wrote:
> When I did an unset noout on the cluster all three mons got a
> segmentation fault, then continued as if nothing had happened. Regular
> segmentation faults started on the mons after upgrading to 0.94.5. Ubuntu
> Trusty LTS. Anyone had similar?
>
> -Arnulf
Re: [ceph-users] all three mons segfault at same time
I am on trusty also, but my /var/lib/ceph/mon lives on an XFS filesystem. My mons seem to have stabilized now after upgrading the last of the OSDs to 0.94.5. No crashes in the last 20 minutes, whereas they were crashing every 1-2 minutes in a rolling fashion the entire time I was upgrading OSDs.

On Tue, Nov 10, 2015 at 12:03 PM, Arnulf Heimsbakk <aheimsb...@met.no> wrote:
> Hi Logan!
>
> It seems that I've solved the segfaults on my monitors. Maybe not in the
> best way, but they seem to be gone. Originally my monitor servers ran
> Ubuntu Trusty on ext4, but they've now been converted to CentOS 7 with
> XFS as the root file system. They've run stable for 24h now.
>
> I'm still running Ubuntu on my OSDs and no issues so far running mixed
> OS. Everything is running 0.94.5.
>
> Not an ideal solution, but I'm preparing to convert the OSDs to CentOS too
> if things stay stable over time.
>
> -Arnulf
>
> On 11/10/2015 05:13 PM, Logan V. wrote:
>> I am in the process of upgrading a cluster with mixed 0.94.2/0.94.3 to
>> 0.94.5 this morning and am seeing identical crashes. In the process of
>> doing a rolling upgrade across the mons this morning, after the 3rd of
>> 3 mons was restarted to 0.94.5, all 3 crashed simultaneously identical
>> to what you are describing above. Now I am seeing rolling crashes
>> across the 3 mons continually. I am still in the process of upgrading
>> about 200 OSDs to 0.94.5 so most of them are still running 0.94.2 and
>> 0.94.3. There are 3 mds's running 0.94.5 during these crashes.
>> [quoted syslog output snipped]
Re: [ceph-users] Chown in Parallel
Thanks for sharing this. I modified it slightly to stop and start the OSDs on the fly rather than having all OSDs needlessly stopped during the chown, i.e.:

chown ceph:ceph /var/lib/ceph /var/lib/ceph/* && find /var/lib/ceph/osd -maxdepth 1 -mindepth 1 -print | xargs -P12 -n1 -I '{}' bash -c 'echo "starting run on osd.$(cat {}/whoami)"; service ceph-osd stop id=$(cat {}/whoami); time chown -R ceph:ceph {} && service ceph-osd restart id=$(cat {}/whoami); echo "done with osd.$(cat {}/whoami)";'

Same note of course: all of the non-OSD dirs in /var/lib/ceph need to be handled separately. Also keep an eye out for downed OSDs after the run completes; if the chown does not return success, the command above will not restart the OSD.

On Tue, Nov 10, 2015 at 5:58 AM, Nick Fisk wrote:
> I'm currently upgrading to Infernalis and the chown stage is taking a long
> time on my OSD nodes. I've come up with this little one-liner to run the
> chowns in parallel:
>
> find /var/lib/ceph/osd -maxdepth 1 -mindepth 1 -print | xargs -P12 -n1 chown -R ceph:ceph
>
> NOTE: You still need to make sure the other directories in the
> /var/lib/ceph folder are updated separately, but this should speed up the
> process for machines with a larger number of disks.
>
> Nick
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
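The pattern in both one-liners is the same: enumerate the per-OSD data dirs and run an action over them with a bounded worker pool. A minimal sketch of that pattern in Python (my own illustration, not part of the thread; in the real one-liners the action would be stop/chown -R/restart, which needs root):

```python
import concurrent.futures
import pathlib

def for_each_osd(osd_root, action, workers=12):
    """Run action(osd_dir) over every OSD data dir in parallel,
    mirroring the xargs -P12 approach above. Returns {dir: result}."""
    dirs = [d for d in pathlib.Path(osd_root).iterdir() if d.is_dir()]
    results = {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as ex:
        futures = {ex.submit(action, d): d for d in dirs}
        for fut in concurrent.futures.as_completed(futures):
            # .result() re-raises if the action failed, so a failed
            # chown-equivalent surfaces instead of being silently skipped
            results[futures[fut]] = fut.result()
    return results

# Demo on throwaway directories with a harmless action:
import os, tempfile
root = tempfile.mkdtemp()
for i in range(3):
    os.mkdir(os.path.join(root, f"ceph-{i}"))
out = for_each_osd(root, lambda d: d.name)
print(sorted(out.values()))  # ['ceph-0', 'ceph-1', 'ceph-2']
```

One nicety over raw xargs: because each future's result is collected, a failing action raises and is attributable to a specific OSD dir, which addresses the "keep an eye out for downed OSDs" caveat above.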
[ceph-users] radosgw keystone integration
After setting up a radosgw federated configuration last week and integrating with OpenStack Keystone auth, I have a question regarding the configuration.

In the Keystone setup instructions for Kilo, the admin token auth method is disabled:
http://docs.openstack.org/kilo/install-guide/install/apt/content/keystone-verify.html

  "For security reasons, disable the temporary authentication token mechanism:
  Edit the /etc/keystone/keystone-paste.ini file and remove admin_token_auth
  from the [pipeline:public_api], [pipeline:admin_api], and [pipeline:api_v3]
  sections."

So after using this setup guide for Kilo, the environment is not compatible with radosgw, because apparently radosgw requires admin token auth. This is not documented at http://ceph.com/docs/master/radosgw/keystone/ and resulted in a really frustrating day of troubleshooting why Keystone was rejecting radosgw's attempts to load the token revocation list.

So first, I think this requirement should be listed in the radosgw/keystone integration setup instructions. Long term, I am curious whether Ceph intends to continue relying on this temporary authentication mechanism that OpenStack recommends disabling after Keystone's setup is bootstrapped.
For reference, these are the kinds of errors seen when the admin token auth is disabled as recommended:

ceph rgw node:

T 10.13.32.6:42533 - controller:5000 [AP]
  GET /v2.0/tokens/revoked HTTP/1.1..Host: controller:5000..Accept: */*..Transfer-Encoding: chunked..X-Auth-Token: removed..Expect: 100-continue
##
T controller:5000 - 10.13.32.6:42533 [AP]
  HTTP/1.1 100 Continue
##
T 10.13.32.6:42533 - controller:5000 [AP]
  0
#
T controller:5000 - 10.13.32.6:42533 [AP]
  HTTP/1.1 403 Forbidden..Date: Sat, 15 Aug 2015 00:46:58 GMT..Server: Apache/2.4.7 (Ubuntu)..Vary: X-Auth-Token..X-Distribution: Ubuntu..x-openstack-request-id: req-869523c8-12bb-46d4-9d5b-89e0efd1dc38..Content-Length: 141..Content-Type: application/json
  {"error": {"message": "You are not authorized to perform the requested action: identity:revocation_list", "code": 403, "title": "Forbidden"}}

root@radosgw-template:~# radosgw --id radosgw.us-dfw-1 -d
2015-08-15 00:51:17.992497 7ff2281e0840  0 ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3), process radosgw, pid 15381
2015-08-15 00:51:18.515909 7ff2281e0840  0 framework: fastcgi
2015-08-15 00:51:18.515927 7ff2281e0840  0 framework: civetweb
2015-08-15 00:51:18.515946 7ff2281e0840  0 framework conf key: port, val: 7480
2015-08-15 00:51:18.515958 7ff2281e0840  0 starting handler: civetweb
2015-08-15 00:51:18.529113 7ff2281e0840  0 starting handler: fastcgi
2015-08-15 00:51:18.541553 7ff1a67fc700  0 revoked tokens response is missing signed section
2015-08-15 00:51:18.541573 7ff1a67fc700  0 ERROR: keystone revocation processing returned error r=-22
2015-08-15 00:51:21.222619 7ff1a6ffd700  0 ERROR: can't read user header: ret=-2
2015-08-15 00:51:21.222648 7ff1a6ffd700  0 ERROR: sync_user() failed, user=us-dfw ret=-2

keystone error log:

2015-08-14 19:46:58.582 8782 WARNING keystone.common.wsgi [-] You are not authorized to perform the requested action: identity:revocation_list

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
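To make the failure chain in those logs easier to read, here is a rough sketch of the two distinct failure modes they show: Keystone returning 403 for identity:revocation_list, versus a response body that parses but lacks the signed blob radosgw expects (the exact field name radosgw looks for is my assumption, inferred from the "missing signed section" message):

```python
import json

def classify_revocation_response(status, body):
    """Illustrative-only triage of a GET /v2.0/tokens/revoked reply,
    matching the error strings in the logs above."""
    if status == 403:
        return "keystone denied identity:revocation_list"
    try:
        payload = json.loads(body)
    except ValueError:
        return "unparseable response"
    if "signed" not in payload:   # assumed field name
        return "revoked tokens response is missing signed section"
    return "ok"

print(classify_revocation_response(403, ""))
print(classify_revocation_response(200, '{"error": "nope"}'))
```

In the capture above, radosgw hit the first case (403 from Keystone) and then logged the "missing signed section" error because the error body contained no signed revocation list.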