Re: [ceph-users] radosgw keystone integration in mitaka

2016-10-15 Thread Logan V.
The ability to use Keystone v3 and auth tokens in lieu of the admin token was
added in Jewel. The release notes mention it, but unfortunately the Jewel docs
don't reflect it, so you'll need to visit
http://docs.ceph.com/docs/master/radosgw/keystone/ to find the configuration
information.

When I tested this out, I had something like:

[client.rgw.radosgw-1]
rgw keystone admin user = radosgw
rgw keystone admin password = 
rgw keystone token cache size = 1
keyring = /var/lib/ceph/radosgw/ceph-rgw.radosgw-1/keyring
rgw keystone url = http://keystone-admin-endpoint:35357
rgw data = /var/lib/ceph/radosgw/ceph-rgw.radosgw-1
rgw keystone admin tenant = service
rgw keystone admin domain = default
rgw keystone api version = 3
host = radosgw-1
rgw s3 auth use keystone = true
rgw socket path = /tmp/radosgw-radosgw-1.sock
log file = /var/log/ceph/ceph-rgw-radosgw-1.log
rgw keystone accepted roles = Member, _member_, admin
rgw frontends = civetweb port=10.13.32.15:8080 num_threads=50
rgw keystone revocation interval = 900
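
The keystone admin user above is just a normal service user; something like
this should create it (the user name, password, and project are placeholders
matching the config above, so adjust for your deployment):

openstack user create --domain default --password SECRET radosgw
openstack role add --project service --user radosgw admin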

Logan

On Friday, October 14, 2016, Jonathan Proulx  wrote:

> Hi All,
>
> Recently upgraded from Kilo->Mitaka on my OpenStack deploy and now
> radosgw nodes (jewel) are unable to validate keystone tokens.
>
>
> Initially I thought it was because radosgw relies on admin_token
> (which is a bad idea, but ...) and that's now deprecated. I verified
> the token was still in keystone.conf, and fixed it when I found it had
> been commented out of keystone-paste.ini, but even after fixing that
> and restarting keystone I get:
>
>
> -- grep req-a5030a83-f265-4b25-b6e5-1918c978f824
> /var/log/keystone/keystone.log
> 2016-10-14 15:12:47.631 35977 WARNING keystone.middleware.auth
> [req-a5030a83-f265-4b25-b6e5-1918c978f824 - - - - -] Deprecated:
> build_auth_context middleware checking for the admin token is deprecated as
> of the Mitaka release and will be removed in the O release. If your
> deployment requires use of the admin token, update keystone-paste.ini so
> that admin_token_auth is before build_auth_context in the paste pipelines,
> otherwise remove the admin_token_auth middleware from the paste pipelines.
> 2016-10-14 15:12:47.671 35977 INFO keystone.common.wsgi
> [req-a5030a83-f265-4b25-b6e5-1918c978f824 - - - - -] GET
> https://nimbus-1.csail.mit.edu:35358/v2.0/tokens/
> 2016-10-14 15:12:47.672 35977 WARNING oslo_log.versionutils
> [req-a5030a83-f265-4b25-b6e5-1918c978f824 - - - - -] Deprecated:
> validate_token of the v2 API is deprecated as of Mitaka in favor of a
> similar function in the v3 API and may be removed in Q.
> 2016-10-14 15:12:47.684 35977 WARNING keystone.common.wsgi
> [req-a5030a83-f265-4b25-b6e5-1918c978f824 - - - - -] You are not
> authorized to perform the requested action: identity:validate_token
>
> I've dug through keystone/policy.json and identity:validate_token is
> authorized to "role:admin or is_admin:1", which I *think* should cover
> the token use case... but not 100% sure.
>
> Can radosgw use a proper keystone user so I can avoid the admin_token
> mess (http://docs.ceph.com/docs/jewel/radosgw/keystone/ seems to
> indicate no)?
>
> Or does anyone see where in my keystone chain I might have dropped a link?
>
> Thanks,
> -Jon


Re: [ceph-users] all three mons segfault at same time

2015-11-10 Thread Logan V.
I am in the process of upgrading a cluster with mixed 0.94.2/0.94.3 to
0.94.5 this morning and am seeing identical crashes. During the rolling
upgrade across the mons, after the 3rd of 3 mons was restarted on 0.94.5,
all 3 crashed simultaneously, identical to what you are describing above.
Now I am seeing rolling crashes across the 3 mons continually. I am still
in the process of upgrading about 200 OSDs to 0.94.5, so most of them are
still running 0.94.2 and 0.94.3. There are 3 MDSs running 0.94.5 during
these crashes.
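
For anyone who wants the actual backtrace rather than just the upstart SEGV
lines below, something like this should pull it out of the mon log (the path
assumes the default cluster name and log location, so adjust the hostname):

grep -B2 -A40 'Caught signal (Segmentation fault)' /var/log/ceph/ceph-mon.lsn-mc1006.log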

==> /var/log/clusterboot/lsn-mc1008/syslog <==
Nov 10 10:07:30 lsn-mc1008 kernel: [6392349.844640] init: ceph-mon
(ceph/lsn-mc1008) main process (2254664) killed by SEGV signal
Nov 10 10:07:30 lsn-mc1008 kernel: [6392349.844648] init: ceph-mon
(ceph/lsn-mc1008) main process ended, respawning
==> /var/log/clusterboot/lsn-mc1006/syslog <==
Nov 10 10:07:46 lsn-mc1006 kernel: [6392890.294124] init: ceph-mon
(ceph/lsn-mc1006) main process (2183307) killed by SEGV signal
Nov 10 10:07:46 lsn-mc1006 kernel: [6392890.294132] init: ceph-mon
(ceph/lsn-mc1006) main process ended, respawning
==> /var/log/clusterboot/lsn-mc1007/syslog <==
Nov 10 10:07:46 lsn-mc1007 kernel: [6392599.894914] init: ceph-mon
(ceph/lsn-mc1007) main process (1998234) killed by SEGV signal
Nov 10 10:07:46 lsn-mc1007 kernel: [6392599.894923] init: ceph-mon
(ceph/lsn-mc1007) main process ended, respawning
==> /var/log/clusterboot/lsn-mc1008/syslog <==
Nov 10 10:07:46 lsn-mc1008 kernel: [6392365.959984] init: ceph-mon
(ceph/lsn-mc1008) main process (2263082) killed by SEGV signal
Nov 10 10:07:46 lsn-mc1008 kernel: [6392365.959992] init: ceph-mon
(ceph/lsn-mc1008) main process ended, respawning
==> /var/log/clusterboot/lsn-mc1006/syslog <==
Nov 10 10:07:52 lsn-mc1006 kernel: [6392896.674332] init: ceph-mon
(ceph/lsn-mc1006) main process (2191273) killed by SEGV signal
Nov 10 10:07:52 lsn-mc1006 kernel: [6392896.674340] init: ceph-mon
(ceph/lsn-mc1006) main process ended, respawning
==> /var/log/clusterboot/lsn-mc1008/syslog <==
Nov 10 10:07:52 lsn-mc1008 kernel: [6392372.324282] init: ceph-mon
(ceph/lsn-mc1008) main process (2270979) killed by SEGV signal
Nov 10 10:07:52 lsn-mc1008 kernel: [6392372.324295] init: ceph-mon
(ceph/lsn-mc1008) main process ended, respawning
==> /var/log/clusterboot/lsn-mc1007/syslog <==
Nov 10 10:07:52 lsn-mc1007 kernel: [6392606.272911] init: ceph-mon
(ceph/lsn-mc1007) main process (2006118) killed by SEGV signal
Nov 10 10:07:52 lsn-mc1007 kernel: [6392606.272995] init: ceph-mon
(ceph/lsn-mc1007) main process ended, respawning
==> /var/log/clusterboot/lsn-mc1006/syslog <==
Nov 10 10:07:55 lsn-mc1006 kernel: [6392899.046307] init: ceph-mon
(ceph/lsn-mc1006) main process (2192187) killed by SEGV signal
Nov 10 10:07:55 lsn-mc1006 kernel: [6392899.046315] init: ceph-mon
(ceph/lsn-mc1006) main process ended, respawning
==> /var/log/clusterboot/lsn-mc1007/syslog <==
Nov 10 10:08:17 lsn-mc1007 kernel: [6392631.192476] init: ceph-mon
(ceph/lsn-mc1007) main process (2006489) killed by SEGV signal
Nov 10 10:08:17 lsn-mc1007 kernel: [6392631.192484] init: ceph-mon
(ceph/lsn-mc1007) main process ended, respawning
==> /var/log/clusterboot/lsn-mc1006/syslog <==
Nov 10 10:08:17 lsn-mc1006 kernel: [6392921.600089] init: ceph-mon
(ceph/lsn-mc1006) main process (2192298) killed by SEGV signal
Nov 10 10:08:17 lsn-mc1006 kernel: [6392921.600108] init: ceph-mon
(ceph/lsn-mc1006) main process ended, respawning
==> /var/log/clusterboot/lsn-mc1008/syslog <==
Nov 10 10:08:17 lsn-mc1008 kernel: [6392397.277994] init: ceph-mon
(ceph/lsn-mc1008) main process (2271246) killed by SEGV signal
Nov 10 10:08:17 lsn-mc1008 kernel: [6392397.278002] init: ceph-mon
(ceph/lsn-mc1008) main process ended, respawning
==> /var/log/clusterboot/lsn-mc1006/syslog <==
Nov 10 10:08:23 lsn-mc1006 kernel: [6392927.999229] init: ceph-mon
(ceph/lsn-mc1006) main process (2200399) killed by SEGV signal
Nov 10 10:08:23 lsn-mc1006 kernel: [6392927.999242] init: ceph-mon
(ceph/lsn-mc1006) main process ended, respawning
==> /var/log/clusterboot/lsn-mc1008/syslog <==
Nov 10 10:08:23 lsn-mc1008 kernel: [6392403.641241] init: ceph-mon
(ceph/lsn-mc1008) main process (2279050) killed by SEGV signal
Nov 10 10:08:23 lsn-mc1008 kernel: [6392403.641254] init: ceph-mon
(ceph/lsn-mc1008) main process ended, respawning
==> /var/log/clusterboot/lsn-mc1007/syslog <==
Nov 10 10:08:24 lsn-mc1007 kernel: [6392637.614495] init: ceph-mon
(ceph/lsn-mc1007) main process (2013418) killed by SEGV signal
Nov 10 10:08:24 lsn-mc1007 kernel: [6392637.614504] init: ceph-mon
(ceph/lsn-mc1007) main process ended, respawning


On Mon, Nov 2, 2015 at 8:35 AM, Arnulf Heimsbakk
 wrote:
> When I did an "unset noout" on the cluster, all three mons got a
> segmentation fault, then continued as if nothing had happened. Regular
> segmentation faults started on the mons after upgrading to 0.94.5. Ubuntu
> Trusty LTS. Has anyone seen similar?
>
> -Arnulf
>
> 

Re: [ceph-users] all three mons segfault at same time

2015-11-10 Thread Logan V.
I am on Trusty also, but my /var/lib/ceph/mon lives on an XFS filesystem.

My mons seem to have stabilized now after upgrading the last of the
OSDs to 0.94.5. No crashes in the last 20 minutes, whereas they were
crashing every 1-2 minutes in a rolling fashion the entire time I was
upgrading OSDs.

On Tue, Nov 10, 2015 at 12:03 PM, Arnulf Heimsbakk <aheimsb...@met.no> wrote:
> Hi Logan!
>
> It seems that I've solved the segfaults on my monitors. Maybe not in the
> best way, but they seem to be gone. Originally my monitor servers ran
> Ubuntu Trusty on ext4, but they've now been converted to CentOS 7 with
> XFS as the root file system. They've run stable for 24 hours now.
>
> I'm still running Ubuntu on my OSDs and have had no issues so far running
> a mixed OS setup. Everything is running 0.94.5.
>
> Not an ideal solution, but I'm preparing to convert the OSDs to CentOS too
> if things stay stable over time.
>
> -Arnulf
>
> On 11/10/2015 05:13 PM, Logan V. wrote:
>> I am in the process of upgrading a cluster with mixed 0.94.2/0.94.3 to
>> 0.94.5 this morning and am seeing identical crashes. In the process of
>> doing a rolling upgrade across the mons this morning, after the 3rd of
>> 3 mons was restarted to 0.94.5, all 3 crashed simultaneously identical
>> to what you are describing above. Now I am seeing rolling crashes
>> across the 3 mons continually. I am still in the process of upgrading
>> about 200 OSDs to 0.94.5 so most of them are still running 0.94.2 and
>> 0.94.3. There are 3 mds's running 0.94.5 during these crashes.
>>
>> [...]

Re: [ceph-users] Chown in Parallel

2015-11-10 Thread Logan V.
Thanks for sharing this. I modified it slightly to stop and start the OSDs
on the fly rather than having all OSDs needlessly stopped during the chown.

i.e.:

chown ceph:ceph /var/lib/ceph /var/lib/ceph/* && \
find /var/lib/ceph/osd -maxdepth 1 -mindepth 1 -print | \
  xargs -P12 -n1 -I '{}' bash -c '
    echo "starting run on osd.$(cat {}/whoami)";
    service ceph-osd stop id=$(cat {}/whoami);
    time chown -R ceph:ceph {} && service ceph-osd restart id=$(cat {}/whoami);
    echo "done with osd.$(cat {}/whoami)";'


Same note of course: all of the non-OSD dirs in /var/lib/ceph need to be
handled separately. Also keep an eye out for downed OSDs after the run
completes; if the chown does not return success, the one-liner above will
not start the OSD.
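
To check afterwards, something like this will list any OSDs that did not
come back up (assuming the ceph CLI and admin keyring are available on the
box):

ceph osd tree | grep -w down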

On Tue, Nov 10, 2015 at 5:58 AM, Nick Fisk  wrote:

> I'm currently upgrading to Infernalis and the chown stage is taking a long
> time on my OSD nodes. I've come up with this little one-liner to run the
> chowns in parallel:
>
>
>
> find /var/lib/ceph/osd -maxdepth 1 -mindepth 1 -print | xargs -P12 -n1 \
>   chown -R ceph:ceph
>
>
>
> NOTE: You still need to make sure the other directories in the
> /var/lib/ceph folder are updated separately, but this should speed up the
> process for machines with a larger number of disks.
>
>
>
> Nick
>
>


[ceph-users] radosgw keystone integration

2015-08-17 Thread Logan V.
After setting up a radosgw federated configuration last week and
integrating it with OpenStack Keystone auth, I have a question regarding
the configuration.

In the Keystone setup instructions for Kilo, the admin token auth
method is disabled:
http://docs.openstack.org/kilo/install-guide/install/apt/content/keystone-verify.html
For security reasons, disable the temporary authentication token mechanism:

Edit the /etc/keystone/keystone-paste.ini file and remove
admin_token_auth from the [pipeline:public_api], [pipeline:admin_api],
and [pipeline:api_v3] sections.
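
For what it's worth, the edit the guide describes boils down to something
like this (back up the file first; note it strips admin_token_auth from
every pipeline in the file):

cp /etc/keystone/keystone-paste.ini /etc/keystone/keystone-paste.ini.bak
sed -i 's/ admin_token_auth//g' /etc/keystone/keystone-paste.ini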

So after using this setup guide for Kilo, the environment is not
compatible with radosgw, because radosgw apparently requires admin
token auth. This is not documented at
http://ceph.com/docs/master/radosgw/keystone/ and resulted in a really
frustrating day of troubleshooting why keystone was rejecting
radosgw's attempts to load the token revocation list.
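
You can reproduce it outside of radosgw by hitting the same endpoint by
hand; a rough sketch, with the endpoint and token obviously depending on
your deployment:

curl -s -H "X-Auth-Token: $OS_TOKEN" http://controller:5000/v2.0/tokens/revoked

With admin_token_auth removed from the pipelines, this should return the
same 403 seen in the trace below unless the caller is authorized for
identity:revocation_list.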

So first, I think this requirement should be listed in the
radosgw/keystone integration setup instructions.

Long term, I am curious whether Ceph intends to keep relying on this
temporary authentication mechanism, which OpenStack recommends
disabling once Keystone has been bootstrapped.

For reference, these are the kinds of errors seen when the admin token
auth is disabled as recommended:
ceph rgw node:
T 10.13.32.6:42533 - controller:5000 [AP]
  GET /v2.0/tokens/revoked HTTP/1.1..Host: controller:5000..Accept:
*/*..Transfer-Encoding: chunked..X-Auth-Token: removed..Expect:
100-continue
##
T controller:5000 - 10.13.32.6:42533 [AP]
  HTTP/1.1 100 Continue
##
T 10.13.32.6:42533 - controller:5000 [AP]
  0
#
T controller:5000 - 10.13.32.6:42533 [AP]
  HTTP/1.1 403 Forbidden..Date: Sat, 15 Aug 2015 00:46:58 GMT..Server:
Apache/2.4.7 (Ubuntu)..Vary: X-Auth-Token..X-Distribution:
Ubuntu..x-openstack-request-id: req-869523c8-12bb-46d4-9d5b
  -89e0efd1dc38..Content-Length: 141..Content-Type:
application/json{"error": {"message": "You are not authorized to
perform the requested action: identity:revocation_list", "code": 403,
"title": "Forbidden"}}

root@radosgw-template:~# radosgw --id radosgw.us-dfw-1 -d
2015-08-15 00:51:17.992497 7ff2281e0840  0 ceph version 0.94.2
(5fb85614ca8f354284c713a2f9c610860720bbf3), process radosgw, pid 15381
2015-08-15 00:51:18.515909 7ff2281e0840  0 framework: fastcgi
2015-08-15 00:51:18.515927 7ff2281e0840  0 framework: civetweb
2015-08-15 00:51:18.515946 7ff2281e0840  0 framework conf key: port, val: 7480
2015-08-15 00:51:18.515958 7ff2281e0840  0 starting handler: civetweb
2015-08-15 00:51:18.529113 7ff2281e0840  0 starting handler: fastcgi
2015-08-15 00:51:18.541553 7ff1a67fc700  0 revoked tokens response is
missing signed section
2015-08-15 00:51:18.541573 7ff1a67fc700  0 ERROR: keystone revocation
processing returned error r=-22
2015-08-15 00:51:21.222619 7ff1a6ffd700  0 ERROR: can't read user header: ret=-2
2015-08-15 00:51:21.222648 7ff1a6ffd700  0 ERROR: sync_user() failed,
user=us-dfw ret=-2


keystone error log:
2015-08-14 19:46:58.582172 2015-08-14 19:46:58.582 8782 WARNING
keystone.common.wsgi [-] You are not authorized to perform the
requested action: identity:revocation_list