[ceph-users] radosgw keystone accepted roles not matching

2015-10-15 Thread Mike Lowe
I’m having some trouble with radosgw and keystone integration, I always get the 
following error:

user does not hold a matching role; required roles: Member,user,_member_,admin

Despite my token clearly having one of the roles: 

"user": {
"id": "401375297eb540bbb1c32432439827b0",
"name": "jomlowe",
"roles": [
{
"id": "8adcf7413cd3469abe4ae13cf259be6e",
"name": "user"
}
],
"roles_links": [],
"username": "jomlowe"
}

Does anybody have any hints?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw keystone accepted roles not matching

2015-10-15 Thread Mike Lowe
I think so, unless I misunderstand how it works.

(openstack) role list --user jomlowe --project jomlowe
+----------------------------------+----------+---------+---------+
| ID                               | Name     | Project | User    |
+----------------------------------+----------+---------+---------+
| 9fe2ff9ee4384b1894a90878d3e92bab | _member_ | jomlowe | jomlowe |
| 8adcf7413cd3469abe4ae13cf259be6e | user     | jomlowe | jomlowe |
+----------------------------------+----------+---------+---------+


> On Oct 15, 2015, at 1:50 PM, Yehuda Sadeh-Weinraub  wrote:
> 
> On Thu, Oct 15, 2015 at 8:34 AM, Mike Lowe  wrote:
>> I’m having some trouble with radosgw and keystone integration, I always get 
>> the following error:
>> 
>> user does not hold a matching role; required roles: 
>> Member,user,_member_,admin
>> 
>> Despite my token clearly having one of the roles:
>> 
>>"user": {
>>"id": "401375297eb540bbb1c32432439827b0",
>>"name": "jomlowe",
>>"roles": [
>>{
>>"id": "8adcf7413cd3469abe4ae13cf259be6e",
>>"name": "user"
>>}
>>],
>>"roles_links": [],
>>"username": "jomlowe"
>>}
>> 
>> Does anybody have any hints?
> 
> 
> Does the user have these roles assigned in keystone?
> 
> Yehuda

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw keystone accepted roles not matching

2015-10-17 Thread Mike Lowe
I think I figured it out: on my install the admin token is broken for v2 auth, 
and I needed to use user:password with the admin role instead.  That is the more 
correct way to do things, but it is conspicuously missing from 
http://docs.ceph.com/docs/master/radosgw/keystone/ and 
http://docs.ceph.com/docs/master/radosgw/config-ref/, and I had to read the 
source code to find it.  I would have expected some sort of error to be thrown 
before the role check failed.  I'll see if I can't file a documentation bug.
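For the record, the settings that replace the admin token look roughly like this 
(a sketch only; the section name, URL and credentials below are placeholders, not 
my actual values, and the accepted roles are the ones from the error above):

    [client.radosgw.gateway]
        rgw keystone url = http://keystone.example.com:35357
        rgw keystone admin user = radosgw
        rgw keystone admin password = secret
        rgw keystone admin tenant = service
        rgw keystone accepted roles = Member,user,_member_,admin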


> On Oct 15, 2015, at 2:06 PM, Mike Lowe  wrote:
> 
> I think so, unless I misunderstand how it works.
> 
> (openstack) role list --user jomlowe --project jomlowe
> +----------------------------------+----------+---------+---------+
> | ID                               | Name     | Project | User    |
> +----------------------------------+----------+---------+---------+
> | 9fe2ff9ee4384b1894a90878d3e92bab | _member_ | jomlowe | jomlowe |
> | 8adcf7413cd3469abe4ae13cf259be6e | user     | jomlowe | jomlowe |
> +----------------------------------+----------+---------+---------+
> 
> 
>> On Oct 15, 2015, at 1:50 PM, Yehuda Sadeh-Weinraub  wrote:
>> 
>> On Thu, Oct 15, 2015 at 8:34 AM, Mike Lowe  wrote:
>>> I’m having some trouble with radosgw and keystone integration, I always get 
>>> the following error:
>>> 
>>> user does not hold a matching role; required roles: 
>>> Member,user,_member_,admin
>>> 
>>> Despite my token clearly having one of the roles:
>>> 
>>>   "user": {
>>>   "id": "401375297eb540bbb1c32432439827b0",
>>>   "name": "jomlowe",
>>>   "roles": [
>>>   {
>>>   "id": "8adcf7413cd3469abe4ae13cf259be6e",
>>>   "name": "user"
>>>   }
>>>   ],
>>>   "roles_links": [],
>>>   "username": "jomlowe"
>>>   }
>>> 
>>> Does anybody have any hints?
>> 
>> 
>> Does the user have these roles assigned in keystone?
>> 
>> Yehuda
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] centos 7.3 libvirt (2.0.0-10.el7_3.2) and openstack volume attachment w/ cephx broken

2016-12-19 Thread Mike Lowe
It looks like the libvirt (2.0.0-10.el7_3.2) that ships with CentOS 7.3 is 
broken out of the box when it comes to hot-plugging new virtio-scsi devices 
backed by rbd and cephx auth.  If you use OpenStack, cephx auth, and CentOS, 
I'd caution against upgrading to CentOS 7.3 right now.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] centos 7.3 libvirt (2.0.0-10.el7_3.2) and openstack volume attachment w/ cephx broken

2016-12-19 Thread Mike Lowe
Not that I've found; it's a little hard to search for.  I believe it's related 
to this libvirt mailing list thread:
https://www.redhat.com/archives/libvir-list/2016-October/msg00396.html
You'll find this in the libvirt qemu log for the instance: 'No secret with id 
'scsi0-0-0-1-secret0'', and this in the nova-compute log: 'libvirtError: internal 
error: unable to execute QEMU command '__com.redhat_drive_add': Device 
'drive-scsi0-0-0-1' could not be initialized'.  I was able to yum downgrade 
twice to get to something from the 1.2 series.
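For reference, the downgrade was roughly the following (a sketch from memory; 
adjust the package list to whatever rpm -qa | grep libvirt reports on your box):

    yum downgrade libvirt libvirt-daemon libvirt-daemon-kvm libvirt-client
    yum downgrade libvirt libvirt-daemon libvirt-daemon-kvm libvirt-client
    rpm -q libvirt    # should now report something in the 1.2.x series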


> On Dec 19, 2016, at 6:40 PM, Jason Dillaman  wrote:
> 
> Do you happen to know if there is an existing bugzilla ticket against
> this issue?
> 
> On Mon, Dec 19, 2016 at 3:46 PM, Mike Lowe  wrote:
>> It looks like the libvirt (2.0.0-10.el7_3.2) that ships with centos 7.3 is 
>> broken out of the box when it comes to hot plugging new virtio-scsi devices 
>> backed by rbd and cephx auth.  If you use openstack, cephx auth, and centos, 
>> I’d caution against the upgrade to centos 7.3 right now.
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> 
> -- 
> Jason

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] upgrade procedure to Luminous

2017-07-14 Thread Mike Lowe
Having run ceph clusters in production for the past six years and upgrading 
from every stable release starting with argonaut to the next, I can honestly 
say being careful about order of operations has not been a problem.

> On Jul 14, 2017, at 10:27 AM, Lars Marowsky-Bree  wrote:
> 
> On 2017-07-14T14:12:08, Sage Weil  wrote:
> 
>>> Any thoughts on how to mitigate this, or on whether I got this all wrong and
>>> am missing a crucial detail that blows this wall of text away, please let me
>>> know.
>> I don't know; the requirement that mons be upgraded before OSDs doesn't 
>> seem that unreasonable to me.  That might be slightly more painful in a 
>> hyperconverged scenario (osds and mons on the same host), but it should 
>> just require some admin TLC (restart mon daemons instead of 
>> rebooting).
> 
> I think it's quite unreasonable, to be quite honest. Collocated MONs
> with OSDs is very typical for smaller cluster environments.
> 
>> Is there something in some distros that *requires* a reboot in order to 
>> upgrade packages?
> 
> Not necessarily.
> 
> *But* once we've upgraded the packages, a failure or reboot might
> trigger this.
> 
> And customers don't always upgrade all nodes at once in a short period
> (the benefit of a supposed rolling upgrade cycle), increasing the risk.
> 
> I wish we'd already be fully containerized so indeed the MONs were truly
> independent of everything else going on on the cluster, but ...
> 
>> Also, this only seems like it will affect users that are getting their 
>> ceph packages from the distro itself and not from a ceph.com channel or a 
>> special subscription/product channel (this is how the RHEL stuff works, I 
>> think).
> 
> Even there, upgrading only the MON daemons and not the OSDs is tricky?
> 
> 
> 
> 
> -- 
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 
> 21284 (AG Nürnberg)
> "Experience is the name everyone gives to their mistakes." -- Oscar Wilde
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] upgrade procedure to Luminous

2017-07-14 Thread Mike Lowe
It was required for Bobtail to Cuttlefish and Cuttlefish to Dumpling.

Exactly how many mons do you have such that you are concerned about failure?  
If you have, say, 3 mons and you have updated all the bits, it shouldn't take 
you more than 2 minutes to restart the mons one by one.  You can take your time 
updating and restarting the OSDs.  I generally consider it bad practice to save 
your system updates for a major ceph upgrade: how exactly can you tell the 
difference between a ceph bug and a kernel regression if you change them all at 
once?  You have a resilient system; why wouldn't you take advantage of that 
property to change one thing at a time?  So what we are really talking about 
here is a hardware failure in the short period it takes to restart the mon 
services, because you shouldn't be rebooting.  If a ceph mon doesn't come back 
from a restart, you have a bug, which in all likelihood will show up on the 
first mon, and at that point you have the option to roll back or run with 
degraded mons until Sage et al. put out a fix.  My only significant downtime was 
due to a bug in a new release having to do with PG splitting, and eight hours 
later I had my fix.
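To make the mon-first ordering concrete, the restart loop is roughly this (a 
sketch, assuming systemd-managed daemons; run it on each mon host in turn):

    systemctl restart ceph-mon@$(hostname -s)
    ceph quorum_status | grep quorum_names   # confirm the mon rejoined before moving on

Only once all mons are back on the new version do you start restarting OSDs, at 
whatever pace you like.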

> On Jul 14, 2017, at 10:39 AM, Lars Marowsky-Bree  wrote:
> 
> On 2017-07-14T10:34:35, Mike Lowe  wrote:
> 
>> Having run ceph clusters in production for the past six years and upgrading 
>> from every stable release starting with argonaut to the next, I can honestly 
>> say being careful about order of operations has not been a problem.
> 
> This requirement did not exist as a mandatory one for previous releases.
> 
> The problem is not the sunshine-all-is-good path. It's about what to do
> in case of failures during the upgrade process.
> 
> 
> 
> -- 
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 
> 21284 (AG Nürnberg)
> "Experience is the name everyone gives to their mistakes." -- Oscar Wilde
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] radosgw error

2016-05-11 Thread Mike Lowe
Can anybody help shed some light on this error I’m getting from radosgw?

2016-05-11 10:09:03.471649 7f1b957fa700  1 -- 172.16.129.49:0/3896104243 --> 
172.16.128.128:6814/121075 -- osd_op(client.111957498.0:726 27.4742be4b 
97c56252-6103-4ef4-b37a-42739393f0f1.113770300.1_interfaces [create 0~0 
[excl],setxattr user.rgw.idtag (49),writefull 0~961,setxattr user.rgw.manifest 
(589),setxattr user.rgw.acl (239),setxattr user.rgw.content_type (1),setxattr 
user.rgw.etag (33),setxattr user.rgw.x-amz-meta-mtime (18),call 
rgw.obj_store_pg_ver,setxattr user.rgw.source_zone (4)] snapc 0=[] 
ondisk+write+known_if_redirected e5202) v7 -- ?+0 0x7f1d0c020280 con 
0x7f1d3403cd80
2016-05-11 10:09:03.472137 7f1b801ca700  1 -- 172.16.129.49:0/3896104243 <== 
osd.98 172.16.128.128:6814/121075 13  osd_op_reply(726 
97c56252-6103-4ef4-b37a-42739393f0f1.113770300.1_interfaces [create 0~0 
[excl],setxattr (49),writefull 0~961,setxattr (589),setxattr (239),setxattr 
(1),setxattr (33),setxattr (18),call,setxattr (4)] v0'0 uv0 ondisk = -95 ((95) 
Operation not supported)) v6  604+0+0 (3413131457 0 0) 0x7f195c002ef0 con 
0x7f1d3403cd80
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Concurrent access to Ceph filesystems

2013-03-01 Thread Mike Lowe

On Mar 1, 2013, at 6:08 PM, "McNamara, Bradley"  
wrote:

> I'm new, too, and I guess I just need a little clarification on Greg's 
> statement.  The RBD filesystem is mounted to multiple VM servers, say, in a 
> Proxmox cluster, and as long as any one VM image file on that filesystem is 
> only being accessed from one node of the cluster, everything will work, and 
> that's the way shared storage is intended to work within Ceph/RBD.  Correct?
> 

Technically it's a RADOS Block Device and not a filesystem.  I might suggest 
libvirt and sanlock to ease your mind about VMs fighting over the same disk.  
Here is the way to think of it: say you have a hard drive and you want to access 
a sector; the information you need to do that is the bus, the address on that 
bus, and the sector number.  The analog for RBD would be pool, image name, and 
object number.  The rbd kernel driver and the qemu driver combine that 
identifying information and translate between traditional ATA or SCSI and the 
collection of objects that make up an RBD image.
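If you want to see that mapping for yourself, something like the following works 
(a sketch; the pool and image names are examples):

    rbd info rbd/myimage            # reports size, object size ("order") and block_name_prefix
    rados -p rbd ls | grep rb.0.    # format-1 data objects are named <block_name_prefix>.<object number>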

> I can understand things blowing up if the same VM image file is being 
> accessed from multiple nodes in the cluster, and that's where a clustered 
> filesystem comes into play.
> 

Ideally all nodes talk to all nodes; if they don't, your cluster isn't balanced 
and functioning properly.

> I guess in my mind/world I was envisioning a group of VM servers using one 
> large RBD volume, that is mounted to each VM server in the group, to store 
> the VM images for all the VM's in the group of VM servers.  This way the VM's 
> could migrate to any VM server in the group using the RBD volume.
> 
> No?
> 

No, one RBD device per VM.  I think what you are looking for is perhaps a ceph 
storage pool, not an RBD "volume".

> Brad
> 
> -Original Message-
> From: ceph-users-boun...@lists.ceph.com 
> [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Gregory Farnum
> Sent: Friday, March 01, 2013 2:13 PM
> To: Karsten Becker
> Cc: ceph-us...@ceph.com
> Subject: Re: [ceph-users] Concurrent access to Ceph filesystems
> 
> On Fri, Mar 1, 2013 at 1:53 PM, Karsten Becker  
> wrote:
>> Hi,
>> 
>> I'm new to Ceph. I currently find no answer in the official docs for 
>> the following question.
>> 
>> Can Ceph filesystems be used concurrently by clients, both when 
>> accessing via RBD and CephFS? Concurrently means in terms of multiple 
>> clients accessing an writing on the same Ceph volume (like it is 
>> possible with OCFS2) and extremely, in the same file at the same time.
>> Or is Ceph a "plain" distributed filesystem?
> 
> CephFS supports this very nicely, though it is of course not yet production 
> ready for most users. RBD provides block device semantics - you can mount it 
> from multiple hosts, but if you aren't using cluster-aware software on top of 
> it you won't like the results (eg, you could run OCFS2 on top of RBD, but 
> running ext4 on top of it will work precisely as well as doing so would with 
> a regular hard drive that you somehow managed to plug into two systems at 
> once).
> -Greg
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph 0.57

2013-03-05 Thread Mike Lowe
Try http://ceph.com/debian-testing/dists/




On Mar 5, 2013, at 11:44 AM, Scott Kinder  wrote:

> When is ceph 0.57 going to be available from the ceph.com PPA? I checked, and 
> all releases under http://ceph.com/debian/dists/ seem to still be 0.56.3. Or 
> am I missing something?
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Stalled requests; help?

2013-04-19 Thread Mike Lowe
When I had similar trouble, it was btrfs file deletion, and I just had to wait 
until it recovered.  I promptly switched to xfs.  Also, if you are using a 
kernel before 3.8.0 with btrfs you will lose data.

On Apr 19, 2013, at 7:20 PM, Steven Presser  wrote:

> Huh.  Mu whole cluster seems stuck now.  Just messages like:
> 
> 2013-04-19 19:18:09.581867 osd.8 [WRN] slow request 30.431828 seconds old, 
> received at 2013-04-19 19:17:39.149972: osd_op(client.6381.1:7280 
> rb.0.1211.238e1f29.0007 [read 0~4096] 2.93aa4882) currently reached pg
> 
> Any ideas?  I start the io test and it just never comes back from the first 
> stage.  (Previously it'd been stuck in stage 5 or so.  My (basic) io test, 
> for reference is:
> 
> ./nmon_x86_64_centos6 -F node001-ramdisk-fiber-gig.nmon -s 1 -c 1; for 
> SIZE in 512 1M 2M 4M 8M; do time dd if=/mnt/ram/noise of=/dev/rbd1 bs=$SIZE 
> && sleep 10 && time dd if=/dev/rbd1 of=/dev/null bs=$SIZE && sleep 10; done; 
> killall nmon_x86_64_centos6
> 
> Steve
> 
> 
> On 04/19/2013 06:44 PM, Gregory Farnum wrote:
>> On Fri, Apr 19, 2013 at 3:37 PM, Steven Presser  wrote:
>>> I was running an IO benchmark, so its entirely possible.  However, I was
>>> running it against an RBD device, so I would expect it to block at some
>>> maximum number of outstanding requests (and if the network traffic is to be
>>> believed, it does).  Where would I look to see if it was the case that I was
>>> overloading my OSDs, and which tunable should I look at adjusting if that's
>>> the case?
>> The easiest broad check is just to look at iowait. ;) If this does
>> turn out to be the problem we can get into the details of addressing
>> it, but honestly this is something we're still discussing (we can
>> throttle easily based on bandwidth but are still determining how best
>> to throttle random IO).
>> Also of course, these are warnings that operations are taking a while,
>> but if your "user" is still happy then the system will keep running.
>> -Greg
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>> 
>>> And hm, I'll look at the networking.  Right now the client is on the same
>>> host as one of my OSDs.  I'm not seeing huge amounts of dropped packets, but
>>> this is really pretty early in the setup.  (Heck, i just got it switched to
>>> gigabit ethernet...)
>>> 
>>> Thanks,
>>> Steve
>>> 
>>> 
>>> On 04/19/2013 06:31 PM, Gregory Farnum wrote:
 On Fri, Apr 19, 2013 at 3:12 PM, Steven Presser
 wrote:
> Hey all,
>  I've got a ceph cluster set up (0.56.4) on a custom centos image
> (base
> centos 6, plus kernel 3.6.9) running as a Xen dom0.  I'm seeing a lot of
> messages like the ones at the bottom of this message.  I'm entirely
> willing
> to believe the hardware on these is going bad (it's donated hardware) but
> have run stress tests on some of these and can't figure out what could be
> failing.  I'm likely to blame the Myricom fiber cards (old, I had to hack
> the driver a bit to get them to run here...), but this looks like it
> doesn't
> involve that.
> 
> Any help or advice is appreciated.
> 
> Thanks in advance,
> Steve
> 
> 
> 2013-04-19 17:47:50.360892 osd.0 [WRN] slow request 33.009444 seconds
> old,
> received at 2013-04-19 17:47:17.351358: osd_op(client.6318.1:285339
> rb.0.1211.238e1f29.00cd [write 3674112~520192] 2.2e1e015e RETRY)
> currently waiting for ondisk
 So what these (generically) mean is that a request is taking a long
 time to get all the way through the OSD pipeline. This one in
 particular is an op that's being replayed, and "waiting for ondisk",
 which means it's waiting for a write to get committed to disk
 everywhere. What workload are you running and have you looked at your
 general system utilization? This could happen if you're just
 overloading your OSDs with too many writes, for instance.
 
 However, given the particular path you've hit, you might also want to
 check your network cards — the connection got reset between the OSD
 and the client, but the OSD remained healthy enough (in its own
 observation and that of its peers) to stay alive.
 -Greg
 Software Engineer #42 @ http://inktank.com | http://ceph.com
 
 
 
> This is (eventually) accompanied by a panic much like:
> general protection fault:  [#1] SMP
> Modules linked in: cbc ip6table_filter ip6_tables ebtable_nat ebtables
> ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
> xt_state
> nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter
> ip_tables
> bridge stp llc xen_pciback xen_netback xen_blkback xen_gntalloc
> xen_gntdev
> xen_evtchn xenfs xen_privcmd btrfs zlib_deflate openafs(PO) mx_driver(PO)
> mx_mcp(PO) autofs4 nfsv4 auth_rpcgss nfs fscache lockd sunrpc ipv6 tg3
> ppdev
> freq_table mperf pcspkr serio_r

Re: [ceph-users] Ceph 0.56.4 - pgmap state: active+clean+scrubbing+deep

2013-04-22 Thread Mike Lowe
If it says 'active+clean' then it is OK, no matter what other status it may 
additionally have.  Deep scrubbing is just a normal background process that 
makes sure your data is consistent and shouldn't keep you from accessing it.  
Repair should only be done as a last resort: it will discard replicas which may 
actually be the data you want if your primary replica is corrupt.
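For reference, the relevant commands are along these lines (the pg id below is a 
placeholder):

    ceph health detail          # lists which PGs are scrubbing or inconsistent
    ceph pg deep-scrub <pgid>   # re-run a deep scrub on a single PG
    ceph pg repair <pgid>       # last resort, per the caveat above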

On Apr 22, 2013, at 10:07 AM, MinhTien MinhTien  
wrote:

> Dear all,
> 
> - I use CentOS 6.3 up kernel 3.8.6-1.el6.elrepo.x86_64:  ceph storage 
> (version 0.56.4), i set pool data (contains all data):
> 
> ceph osd pool set data size 1
> 
> - pool metadata:
> 
> ceph osd pool set data size 2
> 
> I have  osd, each osd = 14TB (format ext4)
> 
> I have 1 permanent error exists in the system.
> 
> 2013-04-22 20:24:20.942457 mon.0 [INF] pgmap v313221: 640 pgs: 638 
> active+clean, 2 active+clean+scrubbing+deep; 17915 GB data, 17947 GB used, 
> 86469 GB / 107 TB avail
> 2013-04-22 20:24:12.256632 osd.1 [INF] 1.2e scrub ok
> 2013-04-22 20:24:23.348560 mon.0 [INF] pgmap v313222: 640 pgs: 638 
> active+clean, 2 active+clean+scrubbing+deep; 17915 GB data, 17947 GB used, 
> 86469 GB / 107 TB avail
> 2013-04-22 20:24:21.551528 osd.1 [INF] 1.3f scrub ok
> 2013-04-22 20:24:52.009562 mon.0 [INF] pgmap v313223: 640 pgs: 638 
> active+clean, 2 active+clean+scrubbing+deep; 17915 GB data, 17947 GB used, 
> 86469 GB / 107 TB avail
> 
> This makes me not access some data.
> 
> I tried to restart, use command "ceph pg repair " but error still exists
> 
> I need some advice..
> 
> Thanks
> 
> 
> 
> -- 
> TienBM
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Upcoming Stable Release and Wheezy

2013-04-23 Thread Mike Lowe
2. On kernels older than 3.8, BTRFS will lose data with sparse files, so DO NOT 
USE IT.  I've had trouble with btrfs file deletion hanging my OSDs for up to 
15 minutes on kernel 3.7 with the btrfs sparse file patch applied.

On Apr 23, 2013, at 8:20 PM, Steve Hindle  wrote:

> 
> Hi All,
> 
>   The next stable release of both ceph and Debian are fast approaching.  I'm 
> just looking to get started with ceph and I was hoping to to install ceph 
> when I do the wheezy upgrades.  As such, I have a couple of questions:
> 
> 1.) Will the upcoming stable release will have packages for Debian Wheezy?
> 2.) Is BTRFS still the recommended storage layer? 
> 3.) Any known issues / special tuning required with Wheezy and ceph?
> 
> On a related note, I'll be using ceph as a storage pool for vm images - if 
> anyone has any tips, tricks they'd like to share for this usage I be very 
> grateful. (it appears I'll have to migrate from xen to kvm to get 'native' vm 
> support for ceph? )
> 
> Thanks and have a great week!
> Steve
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD shared between clients

2013-05-01 Thread Mike Lowe
That is the expected behavior.  RBD is emulating a real device; you wouldn't 
expect good things to happen if you were to plug the same drive into two 
different machines at once (perhaps with some soldering).  There is no built-in 
mechanism for two machines to access the same block device concurrently; you 
would need to add a bunch of extra machinery in the same way that OCFS or GFS 
does.  CephFS, on the other hand, is a parallel filesystem designed for multiple 
concurrent client access and does have such mechanisms (with limits, read the 
docs).

On May 1, 2013, at 11:19 AM, Yudong Guang  wrote:

> Hi, 
> 
> I've been trying to use block device recently. I have a running cluster with 
> 2 machines and 3 OSDs. 
> 
> On a client machine, let's say A, I created a rbd image using `rbd create` , 
> then formatted, mounted and wrote something in it, everything was working 
> fine.
> 
> However, problem occurred when I tried to use this image on the other client, 
> let's say B, on which I mapped the same image that created on A. I found that 
> any changes I made on any of them cannot be shown on the other client, but if 
> I unmap the device and then map again, the changes will be shown.
> 
> I tested the same thing with ceph fs, but there was no such problem. Every 
> change made on one client can be shown on the other client instantly.
> 
> I wonder whether this kind of behavior of RADOS block device is normal or 
> not. Is there any way that we can read and write on the same image on 
> multiple clients?
> 
> Any idea is appreciated.
> 
> Thanks 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Dell R515 performance and specification question

2013-05-07 Thread Mike Lowe
FWIW, here is what I have for my ceph cluster:

4 x HP DL 180 G6
12Gb RAM
P411 with 512MB Battery Backed Cache
10GigE
4 HP MSA 60's with 12 x 1TB 7.2k SAS and SATA drives (bought at different times 
so there is a mix)
2 HP D2600 with 12 x 3TB 7.2k SAS Drives

I'm currently running 79 qemu/kvm vm's for Indiana University and xsede.org.

On May 7, 2013, at 7:50 AM, "Barry O'Rourke"  wrote:

> Hi,
> 
> I'm looking to purchase a production cluster of 3 Dell Poweredge R515's which 
> I intend to run in 3 x replication. I've opted for the following 
> configuration;
> 
> 2 x 6 core processors
> 32Gb RAM
> H700 controller (1Gb cache)
> 2 x SAS OS disks (in RAID1)
> 2 x 1Gb ethernet (bonded for cluster network)
> 2 x 1Gb ethernet (bonded for client network)
> 
> and either 4 x 2Tb nearline SAS OSDs or 8 x 1Tb nearline SAS OSDs.
> 
> At the moment I'm undecided on the OSDs, although I'm swaying towards the 
> second option at the moment as it would give me more flexibility and the 
> option of using some of the disks as journals.
> 
> I'm intending to use this cluster to host the images for ~100 virtual 
> machines, which will run on different hardware most likely be managed by 
> OpenNebula.
> 
> I'd be interested to hear from anyone running a similar configuration with a 
> similar use case, especially people who have spent some time benchmarking a 
> similar configuration and still have a copy of the results.
> 
> I'd also welcome any comments or critique on the above specification. 
> Purchases have to be made via Dell and 10Gb ethernet is out of the question 
> at the moment.
> 
> Cheers,
> 
> Barry
> 
> 
> -- 
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] HEALTH WARN: clock skew detected

2013-05-07 Thread Mike Lowe
You've learned one of the three computer science facts you need to know about 
distributed systems, and I'm glad I could pass something on:

1. Consistent, Available, Distributed - pick any two
2. To completely guard against k failures, when you can't tell which copy failed 
just by looking, you need 2k+1 redundant copies
3. Fault-tolerant systems must all agree on what time it is
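In practice the fix is just what you did: keep ntpd running and verify that each 
mon host is actually syncing.  Roughly:

    ntpq -p               # check the host has sane NTP peers
    ceph health detail    # shows which mons are skewed and by how much

If a small residual skew is unavoidable, the warning threshold (0.05s by default) 
can be raised in ceph.conf; the value here is only an example:

    [mon]
        mon clock drift allowed = 0.1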

On May 7, 2013, at 6:29 AM, Varun Chandramouli  wrote:

> Hi All,
> 
> Thanks for the replies. I started the ntp daemon and the warnings as well as 
> the crashes seem to have gone. This is the first time I set up a cluster (of 
> physical machines), and was unaware of the need to synchronize the clocks. 
> Probably should have googled it more :). Pardon my ignorance.
> 
> Thanks Again,
> Varun
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD vs RADOS benchmark performance

2013-05-11 Thread Mike Lowe
Hmm, try searching the libvirt git for Josh as an author; you should see the 
commit from Josh Durgin about whitelisting rbd migration.




On May 11, 2013, at 10:53 AM, w sun  wrote:

> The reference Mike provided is not valid to me.  Anyone else has the same 
> problem? --weiguo
> 
> From: j.michael.l...@gmail.com
> Date: Sat, 11 May 2013 08:45:41 -0400
> To: pi...@pioto.org
> CC: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] RBD vs RADOS benchmark performance
> 
> I believe that this is fixed in the most recent versions of libvirt, sheepdog 
> and rbd were marked erroneously as unsafe.
> 
> http://libvirt.org/git/?p=libvirt.git;a=commit;h=78290b1641e95304c862062ee0aca95395c5926c
> 
> Sent from my iPad
> 
> On May 11, 2013, at 8:36 AM, Mike Kelly  wrote:
> 
> (Sorry for sending this twice... Forgot to reply to the list)
> 
> Is rbd caching safe to enable when you may need to do a live migration of the 
> guest later on? It was my understanding that it wasn't, and that libvirt 
> prevented you from doing the migration of it knew about the caching setting.
> 
> If it isn't, is there anything else that could help performance? Like, some 
> tuning of block size parameters for the rbd image or the qemu
> 
> On May 10, 2013 8:57 PM, "Mark Nelson"  wrote:
> On 05/10/2013 07:21 PM, Yun Mao wrote:
> Hi Mark,
> 
> Given the same hardware, optimal configuration (I have no idea what that
> means exactly but feel free to specify), which is supposed to perform
> better, kernel rbd or qemu/kvm? Thanks,
> 
> Yun
> 
> Hi Yun,
> 
> I'm in the process of actually running some tests right now.
> 
> In previous testing, it looked like kernel rbd and qemu/kvm performed about 
> the same with cache off.  With cache on (in cuttlefish), small sequential 
> write performance improved pretty dramatically vs without cache.  Large write 
> performance seemed to take more concurrency to reach peak performance, but 
> ultimately aggregate throughput was about the same.
> 
> Hopefully I should have some new results published in the near future.
> 
> Mark
> 
> 
> 
> On Fri, May 10, 2013 at 6:56 PM, Mark Nelson  > wrote:
> 
> On 05/10/2013 12:16 PM, Greg wrote:
> 
> Hello folks,
> 
> I'm in the process of testing CEPH and RBD, I have set up a small
> cluster of  hosts running each a MON and an OSD with both
> journal and
> data on the same SSD (ok this is stupid but this is simple to
> verify the
> disks are not the bottleneck for 1 client). All nodes are
> connected on a
> 1Gb network (no dedicated network for OSDs, shame on me :).
> 
> Summary : the RBD performance is poor compared to benchmark
> 
> A 5 seconds seq read benchmark shows something like this :
> 
>   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>     0       0         0         0         0         0         -         0
>     1      16        39        23   91.9586        92  0.966117  0.431249
>     2      16        64        48   95.9602       100  0.513435   0.53849
>     3      16        90        74   98.6317       104   0.25631   0.55494
>     4      11        95        84   83.9735        40   1.80038   0.58712
>  Total time run:        4.165747
> Total reads made:       95
> Read size:              4194304
> Bandwidth (MB/sec):     91.220
> 
> Average Latency:        0.678901
> Max latency:            1.80038
> Min latency:            0.104719
> 
> 
> 91MB read performance, quite good !
> 
> Now the RBD performance :
> 
> root@client:~# dd if=/dev/rbd1 of=/dev/null bs=4M count=100
> 100+0 records in
> 100+0 records out
> 419430400 bytes (419 MB) copied, 13.0568 s, 32.1 MB/s
> 
> 
> There is a 3x performance factor (same for write: ~60M
> benchmark, ~20M
> dd on block device)
> 
> The network is ok, the CPU is also ok on all OSDs.
> CEPH is Bobtail 0.56.4, linux is 3.8.1 arm (vanilla release + some
> patches for the SoC being used)
> 
> Can you show me the starting point for digging into this ?
> 
> 
> Hi Greg, First things first, are you doing kernel rbd or qemu/kvm?
>   If you are doing qemu/kvm, make sure you are using virtio disks.
>   This can have a pretty big performance impact.  Next, are you
> using RBD cache? With 0.56.4 there are some performance issues with
> large sequential writes if cache is on, but it does provide benefit
> for small sequential writes.  In general RBD cache behaviour has
> improved with Cuttlefish.
> 
> Beyond that, are the pools being targeted by RBD and rados bench
> setup the same w

[ceph-users] ceph repair details

2013-05-25 Thread Mike Lowe
Does anybody know exactly what ceph repair does?  Could you list out briefly 
the steps it takes?  I unfortunately need to use it for an inconsistent pg.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] qemu-1.4.2 rbd-fixed ubuntu packages

2013-06-05 Thread Mike Lowe
I wonder if it has something to do with them renaming /usr/bin/kvm; in qemu 1.4 
as packaged with Ubuntu 13.04 it has been replaced with the following:

#! /bin/sh

echo "W: kvm binary is deprecated, please use qemu-system-x86_64 instead" >&2
exec qemu-system-x86_64 -machine accel=kvm:tcg "$@"


On Jun 3, 2013, at 2:10 AM, Wolfgang Hennerbichler 
 wrote:

> On Wed, May 29, 2013 at 04:16:14PM +0200, w sun wrote:
>> Hi Wolfgang,
>> 
>> Can you elaborate the issue for 1.5 with libvirt? Wonder if that will impact 
>> the usage with Grizzly. Did a quick compile for 1.5 with RBD support 
>> enabled, so far it seems to be ok for openstack with a few simple tests. But 
>> definitely want to be cautious if there is known integration issue with 1.5.
>> 
>> Thanks. --weiguo
> 
> I basically couldn't make the vm boot with libvirt. Libvirt complained about 
> a missing monitor command (not ceph monitor, but kvm monitor file or 
> something). I didn't want to start upgrading libvirt too, so I stepped back 
> to 1.4.2. 
> 
> Wolfgang
> 
> 
> -- 
> http://www.wogri.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] More data corruption issues with RBD (Ceph 0.61.2)

2013-06-18 Thread Mike Lowe
I think the bug Sage is talking about was fixed in 3.8.0

On Jun 18, 2013, at 11:38 AM, Guido Winkelmann  
wrote:

> Am Dienstag, 18. Juni 2013, 07:58:50 schrieb Sage Weil:
>> On Tue, 18 Jun 2013, Guido Winkelmann wrote:
>>> Am Donnerstag, 13. Juni 2013, 01:58:08 schrieb Josh Durgin:
 Which filesystem are the OSDs using?
>>> 
>>> BTRFS
>> 
>> Which kernel version?  There was a recent bug (fixed in 3.9 or 3.8) that
>> resolves a data corruption issue on file extension.
> 
> It was 3.8.7 when I saw the problem first, but I have since upgraded to 3.9.4.
> 
> Maybe that is why I cannot reproduce this anymore...
> 
>   Guido
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] MON quorum a single point of failure?

2013-06-20 Thread Mike Lowe
Quorum means you need at least 51% participating, be it people following 
parliamentary procedure or mons in ceph.  With one dead and two up you have 
66% participating, which is enough to have a quorum.  An even number doesn't get 
you any additional safety, but it does give you one more thing that can fail 
compared to the next lower odd number of mons.  You can survive n failures if 
you have 2n+1 redundant systems.
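To make the arithmetic concrete (a small worked example, numbers mine):

    3 mons: quorum = floor(3/2) + 1 = 2, so 1 mon can fail
    4 mons: quorum = floor(4/2) + 1 = 3, so still only 1 mon can fail
    5 mons: quorum = floor(5/2) + 1 = 3, so 2 mons can fail

So with 3 mons and 1 dead you still have 2 of 3 in quorum and the cluster keeps 
running; a 4th mon would add another failure source without letting you survive 
any more failures.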

On Jun 20, 2013, at 11:54 AM, Bo  wrote:

> Howdy!
> 
> Loving working with ceph; learning a lot. :)
> 
> I am curious about the quorum process because I seem to get conflicting 
> information from "experts". Those that I report to need a clear answer from 
> me which I am currently unable to give.
> 
> Ceph needs an odd number of monitors in any given cluster (3, 5, 7) to avoid 
> split-brain syndrome. So what happens whenever I have 3 monitors, 1 dies, and 
> I have 2 left?
> 
> The information regarding this situation that I have gathered over the past 
> few months all falls within these three categories:
> A) commonly "stated"--nothing is said. period.
> B) rarely stated--this is a bad situation (possibly split-brain).
> C) rarely stated--each monitor has a "rank", so the highest ranking monitor 
> is the boss, thus quorum.
> 
> Does anyone know with absolute certainty what ceph's quorum logic will do 
> with an even number of (specifically 2) monitors left?
> 
> You may say, "well, take down one of your monitors", to which I respectfully 
> state that my testing is not an authoritative answer on what ceph is designed 
> to do and what it does in production. My testing cannot cover the vast 
> majority of cases covered by the hundreds/thousands who have had a monitor 
> die.
> 
> Thank you for your time and brain juice,
> -bo
> 
> 
> -- 
> 
> "But God demonstrates His own love toward us, in that while we were yet 
> sinners, Christ died for us. Much more then, having now been justified by His 
> blood, we shall be saved from the wrath of God through Him." Romans 5:8-9
> All have sinned, broken God's law, and deserve eternal torment. Jesus Christ, 
> the Son of God, died for the sins of those that will believe, purchasing our 
> salvation, and defeated death so that we all may spend eternity in heaven. Do 
> you desire freedom from hell and be with God in His love for eternity?
> "If you confess with your mouth Jesus as Lord, and believe in your heart that 
> God raised Him from the dead, you will be saved." Romans 10:9
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Use of RDB Kernel module

2013-07-02 Thread Mike Lowe
The rbd kernel module is only for mapping rbd block devices on bare metal 
(technically you could do it in a VM, but there is no good reason to do so).  
QEMU/KVM has its own rbd implementation that tends to lead the kernel 
implementation and should be used with VMs.

The rbd module is always used on a remote client and is never used inside a 
ceph cluster (comprised of mon, mds, osd).
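To illustrate the two access paths (a sketch; the pool, image and monitor 
address are taken from your example, the secret UUID is a placeholder, and qemu 
must be built with rbd support):

    # kernel rbd, on bare metal with a new enough kernel:
    rbd map fedbk --pool fed --name client.fed
    mkfs.xfs /dev/rbd0 && mount /dev/rbd0 /mnt

    # qemu/libvirt going through librbd, no kernel module involved:
    <disk type='network' device='disk'>
      <driver name='qemu' type='raw'/>
      <auth username='fed'>
        <secret type='ceph' uuid='00000000-0000-0000-0000-000000000000'/>
      </auth>
      <source protocol='rbd' name='fed/fedbk'>
        <host name='10.40.99.165' port='6789'/>
      </source>
      <target dev='vda' bus='virtio'/>
    </disk>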

On Jul 2, 2013, at 3:19 PM, "Howarth, Chris "  wrote:

> Hi – I am quite new to Ceph and would be grateful if you could explain a 
> couple of items as these are not immediately clear to me from the Ceph 
> documentation:
> 
> 1) My Ceph cluster is running on RHEL 6.4 (kernel 2.6.32). As such I cannot 
> use the RDB kernel module and so map block device images to the kernel 
> module. I tried and this failed:
> 
> 1.rbd map fedbk --pool fed --name client.fed -m 10.40.99.165 -k 
> /etc/ceph/lient.fed.keyring
> FATAL: Module rbd not found.
> rbd: modprobe rbd failed! (256)
> Does this then mean that I cannot access these block devices from another 
> server using libvirt/KVM ? i.e. it is not possible to use Ceph with 
> libvirt/KVM etc unless I have a kernel level of 2.6.34 or later which 
> supports the kernel module ?
> 
> 2) Is the kernel RBD module ever used on a remote client ? For example the 
> kernel CephFS module enable a client to remotely access and mount a CephFS 
> filesystem. However it is not clear to me if there is a similar role for the 
> RBD kernel module ? Can this be used to remotely access a remote block device 
> - or is it only ever used on the Ceph cluster itself ?
> 
> thanks for your help.
> 
> Chris
> 
>  
> __
> Chris Howarth
> OS Platforms Engineering
> Citi Architecture & Technology Engineering
> (e) chris.howa...@citi.com
> (t) +44 (0) 20 7508 3848
> (f) +44 (0) 20 7508 0964
> (mail-drop) CGC-06-3A
>  
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Unclean PGs in active+degrared or active+remapped

2013-07-19 Thread Mike Lowe
I'm by no means an expert, but from what I understand you do need to stick to 
numbering from zero if you want things to work out in the long term.  Is there 
a chance that the cluster didn't finish bringing things back up to full 
replication before the OSDs were removed?

If I were moving from 0,1 to 2,3 I'd bring both 2 and 3 up, set the weight of 
0 and 1 to zero, let all of the PGs get active+clean again, and then remove 0 
and 1.  For your swap I might bring up 2 under rack az2, set 1 to weight 0, stop 
1 once everything is active+clean, then remake what is now 3 as 1 and bring it 
back in as 1 with full weight, and finally drop 2 to weight zero and remove it 
after active+clean.  I'd follow up with a similar shuffle for the now-inactive 
former osd 1, the current osd 0, and the future osd 0 which was osd 2.  Clear as 
mud?
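The drain-and-remove steps for a single OSD look roughly like this (a sketch; 
osd.1 is just the example id):

    ceph osd crush reweight osd.1 0   # push its data elsewhere first
    ceph -s                           # wait until every PG is active+clean again
    ceph osd out 1
    # stop the ceph-osd daemon on its host, then:
    ceph osd crush remove osd.1
    ceph auth del osd.1
    ceph osd rm 1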


On Jul 19, 2013, at 7:03 PM, Pawel Veselov  wrote:

> On Fri, Jul 19, 2013 at 3:54 PM, Mike Lowe  wrote:
> I'm not sure how to get you out of the situation you are in but what you have 
> in your crush map is osd 2 and osd 3 but ceph starts counting from 0 so I'm 
> guessing it's probably gotten confused.  Some history on your cluster might 
> give somebody an idea for a fix. 
> 
> We had osd.0 and osd.1 first, then we added osd.2. We then removed osd.1, 
> added osd.3 and removed osd.0.
> Do you think that adding back a new osd.0 and osd.1, and then removing osd.2 
> and osd.3 will solve that confusion? I'm a bit concerned that proper osd 
> numbering is required to maintain a healthy cluster...
>  
>  
> On Jul 19, 2013, at 6:44 PM, Pawel Veselov  wrote:
> 
>> Hi.
>> 
>> I'm trying to understand the reason behind some of my unclean pages, after 
>> moving some OSDs around. Any help would be greatly appreciated.I'm sure we 
>> are missing something, but can't quite figure out what.
>> 
>> [root@ip-10-16-43-12 ec2-user]# ceph health detail
>> HEALTH_WARN 29 pgs degraded; 68 pgs stuck unclean; recovery 4071/217370 
>> degraded (1.873%)
>> pg 0.50 is stuck unclean since forever, current state active+degraded, last 
>> acting [2]
>> ...
>> pg 2.4b is stuck unclean for 836.989336, current state active+remapped, last 
>> acting [3,2]
>> ...
>> pg 0.6 is active+degraded, acting [3]
>> 
>> These are distinct examples of problems. There are total of 676 page groups.
>> Query shows pretty much the same on them: .
>> 
>> crush map: http://pastebin.com/4Hkkgau6
>> There are some pg_temps (I don't quite understand what those are), that are 
>> mapped to non-existing OSDs. osdmap: http://pastebin.com/irbRNYJz
>> queries for all stuck page groups:http://pastebin.com/kzYa6s2G
>> 
>> 
>> 
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> 
> 
> -- 
> With best of best regards
> Pawel S. Veselov

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Storage, File Systems and Data Scrubbing

2013-08-21 Thread Mike Lowe
I think you are missing the distinction between metadata journaling and data 
journaling.  In most cases a journaling filesystem is one that journals its own 
metadata, but your data is on its own.  Consider the case where you have a 
replication level of two, the osd filesystems have journaling disabled, and you 
append a block to a file (which is an object in terms of ceph) but only one osd 
commits the change in file size to disk.  Later you scrub and discover a 
discrepancy in object sizes; with a replication level of 2 there is no way to 
authoritatively say which one is correct just based on what's in ceph.  This is 
similar to a btrfs bug that caused me to lose data with ceph.  Journaling your 
metadata is the absolute minimum level of assurance you need to make a 
transactional system like ceph work.

On Aug 21, 2013, at 4:23 PM, Johannes Klarenbeek  
wrote:

> Dear ceph-users,
>  
> I read a lot of documentation today about ceph architecture and linux file 
> system benchmarks in particular and I could not help notice something that I 
> like to clear up for myself. Take into account that it has been a while that 
> I actually touched linux, but I did some programming on php2b12 and apache 
> back in the days so I’m not a complete newbie. The real question is below if 
> you do not like reading the rest ;)
>  
> What I have come to understand about file systems for OSD’s is that in theory 
> btrfs is the file system of choice. However, due to its young age it’s not 
> considered stable yet. Therefore EXT4 but preferably XFS is used in most 
> cases. It seems that most people choose this system because of its journaling 
> feature and XFS for its additional attribute storage which has a 64kb limit 
> which should be sufficient for most operations.
>  
> But when you look at file system benchmarks btrfs is really, really slow. 
> Then comes XFS, then EXT4, but EXT2 really dwarfs all other throughput 
> results. On journaling systems (like XFS, EXT4 and btrfs) disabling 
> journaling actually helps throughput as well. Sometimes more then 2 times for 
> write actions.
>  
> The preferred configuration for OSD’s is one OSD per disk. Each object is 
> striped among all Object Storage Daemons in a cluster. So if I would take one 
> disk for the cluster and check its data, chances are slim that I will find a 
> complete object there (a non-striped, full object I mean).
>  
> When a client issues an object write (I assume a full object/file write in 
> this case) it is the client’s responsibility to stripe it among the object 
> storage daemons. When a stripe is successfully stored by the daemon an ACK 
> signal is send to (?) the client and all participating OSD’s. When all 
> participating OSD’s for the object have completed the client assumes all is 
> well and returns control to the application
>  
> If I’m not mistaken, then journaling is meant for the rare occasions that a 
> hardware failure will occur and the data is corrupted. Ceph does this too in 
> another way of course. But ceph should be able to notice when a block/stripe 
> is correct or not. In the rare occasion that a node is failing while doing a 
> write; an ACK signal is not send to the caller and therefor the client can 
> resend the block/stripe to another OSD. Therefor I fail to see the purpose of 
> this extra journaling feature.
>  
> Also ceph schedules a data scrubbing process every day (or however it is 
> configured) that should be able to tackle bad sectors or other errors on the 
> file system and accordingly repair them on the same daemon or flag the whole 
> block as bad. Since everything is replicated the block is still in the 
> storage cluster so no harm is done.
>  
> In a normal/single file system I truly see the value of journaling and the 
> potential for btrfs (although it’s still very slow). However in a system like 
> ceph, journaling seems to me more like a paranoid super fail save.
>  
> Did anyone experiment with file systems that disabled journaling and how did 
> it perform?
>  
> Regards,
> Johannes
>  
>  
>  
>  
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Storage, File Systems and Data Scrubbing

2013-08-21 Thread Mike Lowe
Let me make a simpler case: to get ACID (https://en.wikipedia.org/wiki/ACID) 
properties, which are all properties you want in a filesystem or a database, you 
need a journal.  You need a journaled filesystem to make the object store's file 
operations safe, and you need a journal in ceph to make sure the object 
operations are safe.  Flipped bits are a separate problem that may be aided by 
journaling, but the primary objective of a journal is to make guarantees about 
concurrent and interrupted operations.  There isn't a person on this list who 
hasn't had an osd die; without a journal, starting that osd up again and getting 
it usable would be impractical.

On Aug 21, 2013, at 8:00 PM, Johannes Klarenbeek  
wrote:

>  
>  
>  
> I think you are missing the distinction between metadata journaling and data 
> journaling.  In most cases a journaling filesystem is one that journal's it's 
> own metadata but your data is on its own.  Consider the case where you have a 
> replication level of two, the osd filesystems have journaling disabled and 
> you append a block to a file (which is an object in terms of ceph) but only 
> one commits the change in file size to disk.  Later you scrub and discover a 
> discrepancy in object sizes, with a replication level of 2 there is no way to 
> authoritatively say which one is correct just based on what's in ceph.  This 
> is a similar scenario to a btrfs bug that caused me to lose data with ceph.  
> Journaling your metadata is the absolute minimum level of assurance you need 
> to make a transactional system like ceph work.
>  
> Hey Mike J
>  
> I get your point. However, isn’t it then possible to authoritatively say 
> which one is the correct one in case of 3 OSD’s?
> Or is the replication level a configuration setting that tells the cluster 
> that the object needs to be replicated 3 times?
> In both cases, data scrubbing chooses the majority of the same-same 
> replicated objects in order to know which one is authorative.
>  
> But I also believe (!) that each object has a checksum and each PG too so 
> that it should be easy to find the corrupted object on any of the OSD’s.
> How else would scrubbing find corrupted sectors? Especially when I think 
> about 2TB SATA disks being hit by cosmic-rays that flip a bit somewhere.
> It happens more often with big cheap TB disks, but that doesn’t mean the 
> corrupted sector is a bad sector (in not useable anymore). Journaling is not 
> going to help anyone with this.
> Therefor I believe (again) that the data scrubber must have a mechanism to 
> detect these types of corruptions even in a 2 OSD setup by means of checksums 
> (or better, with a hashed checksum id).
>  
> Also, aren’t there 2 types of transactions; one for writing and one for 
> replicating?
>  
> On Aug 21, 2013, at 4:23 PM, Johannes Klarenbeek 
>  wrote:
>  
> 
> Dear ceph-users,
>  
> I read a lot of documentation today about ceph architecture and linux file 
> system benchmarks in particular and I could not help notice something that I 
> like to clear up for myself. Take into account that it has been a while that 
> I actually touched linux, but I did some programming on php2b12 and apache 
> back in the days so I’m not a complete newbie. The real question is below if 
> you do not like reading the rest ;)
>  
> What I have come to understand about file systems for OSD’s is that in theory 
> btrfs is the file system of choice. However, due to its young age it’s not 
> considered stable yet. Therefore EXT4 but preferably XFS is used in most 
> cases. It seems that most people choose this system because of its journaling 
> feature and XFS for its additional attribute storage which has a 64kb limit 
> which should be sufficient for most operations.
>  
> But when you look at file system benchmarks btrfs is really, really slow. 
> Then comes XFS, then EXT4, but EXT2 really dwarfs all other throughput 
> results. On journaling systems (like XFS, EXT4 and btrfs) disabling 
> journaling actually helps throughput as well. Sometimes more then 2 times for 
> write actions.
>  
> The preferred configuration for OSD’s is one OSD per disk. Each object is 
> striped among all Object Storage Daemons in a cluster. So if I would take one 
> disk for the cluster and check its data, chances are slim that I will find a 
> complete object there (a non-striped, full object I mean).
>  
> When a client issues an object write (I assume a full object/file write in 
> this case) it is the client’s responsibility to stripe it among the object 
> storage daemons. When a stripe is successfully stored by the daemon an ACK 
> signal is send to (?) the client and all participating OSD’s. When all 
> participating OSD’s for the object have completed the client assumes all is 
> well and returns control to the application
>  
> If I’m not mistaken, then journaling is meant for the rare occasions that a 
> hardware failure will occur and the data is corrupted. Ceph does this too in 
> anothe

Re: [ceph-users] RBD hole punching

2013-08-22 Thread Mike Lowe
There is TRIM/discard support and I use it with some success.  There are some 
details here: http://ceph.com/docs/master/rbd/qemu-rbd/  The one caveat I have 
is that I've sometimes been able to crash an osd by doing fstrim inside a guest.
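The qemu side looks roughly like this (a sketch; the image name and bus layout 
are examples, and discard only passes through on an IDE or virtio-scsi bus, not 
virtio-blk):

    -device virtio-scsi-pci,id=scsi0 \
    -drive file=rbd:rbd/myimage,format=raw,if=none,id=drive0,cache=writeback,discard=unmap \
    -device scsi-hd,drive=drive0,bus=scsi0.0

Then inside the guest a periodic 'fstrim -v /' (or mounting with the discard 
option) releases the freed blocks back to the pool.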

On Aug 22, 2013, at 10:24 AM, Guido Winkelmann  
wrote:

> Hi,
> 
> RBD has had support for sparse allocation for some time now. However, when 
> using an RBD volume as a virtual disk for a virtual machine, the RBD volume 
> will inevitably grow until it reaches its actual nominal size, even if the 
> filesystem in the guest machine never reaches full utilization.
> 
> Is there some way to reverse this? Like going through the whole image, 
> looking 
> for large consecutive areas of zeroes and just deleting the objects for that 
> area? How about support for TRIM/discard commands used by some modern 
> filesystems?
> 
>   Guido
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph performance with 8K blocks.

2013-09-18 Thread Mike Lowe
Well, in a word, yes.  Do you really expect a network-replicated storage system 
in user space to be comparable to direct-attached SSD storage?  For what it's 
worth, I've got a pile of regular spinning rust; this is what my cluster will do 
inside a VM with rbd writeback caching on.  As you can see, latency is 
everything.

dd if=/dev/zero of=1g bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 6.26289 s, 171 MB/s
dd if=/dev/zero of=1g bs=1M count=1024 oflag=dsync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 37.4144 s, 28.7 MB/s

As you can see, latency is a killer.

On Sep 18, 2013, at 3:23 PM, Jason Villalta  wrote:

> Any other thoughts on this thread guys.  I am just crazy to want near native 
> SSD performance on a small SSD cluster?
> 
> 
> On Wed, Sep 18, 2013 at 8:21 AM, Jason Villalta  wrote:
> That dd give me this.
> 
> dd if=ddbenchfile of=- bs=8K | dd if=- of=/dev/null bs=8K
> 819200 bytes (8.2 GB) copied, 31.1807 s, 263 MB/s 
> 
> Which makes sense because the SSD is running as SATA 2 which should give 
> 3Gbps or ~300MBps
> 
> I am still trying to better understand the speed difference between the small 
> block speeds seen with dd vs the same small object size with rados.  It is 
> not a difference of a few MB per sec.  It seems to nearly be a factor of 10.  
> I just want to know if this is a hard limit in Ceph or a factor of the 
> underlying disk speed.  Meaning if I use spindles to read data would the 
> speed be the same or would the read speed be a factor of 10 less than the 
> speed of the underlying disk?
> 
> 
> On Wed, Sep 18, 2013 at 4:27 AM, Alex Bligh  wrote:
> 
> On 17 Sep 2013, at 21:47, Jason Villalta wrote:
> 
> > dd if=ddbenchfile of=/dev/null bs=8K
> > 819200 bytes (8.2 GB) copied, 19.7318 s, 415 MB/s
> 
> As a general point, this benchmark may not do what you think it does, 
> depending on the version of dd, as writes to /dev/null can be heavily 
> optimised.
> 
> Try:
>   dd if=ddbenchfile of=- bs=8K | dd if=- of=/dev/null bs=8K
> 
> --
> Alex Bligh
> 
> 
> 
> 
> 
> 
> 
> -- 
> -- 
> Jason Villalta
> Co-founder
> 
> 800.799.4407x1230 | www.RubixTechnology.com
> 
> 
> 
> -- 
> -- 
> Jason Villalta
> Co-founder
> 
> 800.799.4407x1230 | www.RubixTechnology.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] About Ceph SSD and HDD strategy

2013-10-07 Thread Mike Lowe
Based on my experience, I think you are grossly underestimating the expense and 
frequency of the flushes issued from your VMs.  This will be especially bad if 
you aren't using the async flush from qemu >= 1.4.2, as the VM is suspended 
while qemu waits for the flush to finish.  I think your best course of action 
until the caching pool work is completed (I think I remember correctly that 
this is currently in development) is either to use the SSDs as large caches 
with bcache or to use them as journal devices.  I'm sure there are other, more 
informed opinions out there on the best use of SSDs in a ceph cluster, and 
hopefully they will chime in.
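
To illustrate the journal route: this is roughly what it looks like in
ceph.conf, with the device paths and journal size as placeholders (ceph-deploy,
if you use it, will wire this up for you):

[osd]
osd journal size = 10240            ; MB, illustrative

[osd.0]
osd journal = /dev/ssd0p1           ; hypothetical SSD partition for this OSD

# or, with ceph-deploy, data disk and journal device in one step:
ceph-deploy osd create cephhost1:/dev/sdb:/dev/ssd0p1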

On Oct 6, 2013, at 9:23 PM, Martin Catudal  wrote:

> Hi Guys,
> I have read all the Ceph documentation more than twice. I'm now very 
> comfortable with all aspects of Ceph except for the strategy for using 
> my SSDs and HDDs.
> 
> Here is my thinking:
> 
> I see two approaches to using my fast SSDs (900 GB) for primary storage 
> and my huge but slower HDDs (4 TB) for replicas.
> 
> FIRST APPROACH
> 1. I can use PGs, with write caching enabled, as my primary storage that 
> goes on my SSDs, and let the replicas go on my 7200 RPM drives.
>  With the write cache enabled, I will gain performance for my VM 
> user machines in a VDI environment, since the Ceph client will not have 
> to wait for the replica write confirmation on the slower HDDs.
> 
> SECOND APPROACH
> 2. Use pool hierarchies: have one pool on the SSDs as primary, 
> and let the replicas go to a second pool named "platter" for HDD 
> replication.
> As explained in the Ceph documentation:
> rule ssd-primary {
>   ruleset 4
>   type replicated
>   min_size 5
>   max_size 10
>   step take ssd
>   step chooseleaf firstn 1 type host
>   step emit
>   step take platter
>   step chooseleaf firstn -1 type host
>   step emit
>   }
> 
> At this point, I have not been able to figure out which approach has the most 
> advantages.
> 
> Your point of view would definitely help me.
> 
> Sincerely,
> Martin
> 
> -- 
> Martin Catudal
> Responsable TIC
> Ressources Metanor Inc
> Ligne directe: (819) 218-2708
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Expanding ceph cluster by adding more OSDs

2013-10-09 Thread Mike Lowe
You can add PGs; the process is called splitting.  I don't think PG merging, 
the reduction in the number of PGs, is ready yet.
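
Splitting is just raising pg_num and then pgp_num on the pool; the pool name
and count below are only an example, so pick a number sized for your OSD count:

ceph osd pool set rbd pg_num 2048
ceph osd pool set rbd pgp_num 2048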

On Oct 8, 2013, at 11:58 PM, Guang  wrote:

> Hi ceph-users,
> Ceph recommends that the number of PGs for a pool be (100 * OSDs) / Replicas. Per my 
> understanding, the number of PGs for a pool stays fixed even as we scale the cluster 
> out or in by adding or removing OSDs. Does that mean that if we double the 
> number of OSDs, the PG count for a pool is no longer optimal and there is no 
> way to correct it?
> 
> 
> Thanks,
> Guang
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] kvm live migrate wil ceph

2013-10-16 Thread Mike Lowe
I wouldn't go so far as to say putting a VM in a file on a networked filesystem 
is wrong.  It is just not the best choice if you have a Ceph cluster at hand, 
in my opinion.  Networked filesystems have a bunch of extra machinery to 
implement POSIX semantics, and they live in kernel space.  You just need simple 
block device semantics, and you don't need to entangle the hypervisor's kernel 
space.  What it boils down to is the first principle of engineering: select the 
least complicated solution that satisfies the requirements of the problem.  You 
don't gain anything by trading the simplicity of RBD for the complexity of a 
networked filesystem.

For format 2, I think the only caveat is that it requires newer clients, and 
the kernel client takes some time to catch up to the user-space clients.  You 
may not be able to mount filesystems on RBD devices with the kernel client, 
depending on your kernel version; this may or may not matter to you.  You can 
always use a VM to mount a filesystem on an RBD device as a workaround.
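
For what it's worth, the golden-image workflow with format 2 looks roughly like
this. The pool, image, and snapshot names are made up, and depending on your
rbd version the flag may be --image-format or --format:

rbd create --image-format 2 --size 20480 rbd/golden-image
rbd snap create rbd/golden-image@base
rbd snap protect rbd/golden-image@base
rbd clone rbd/golden-image@base rbd/vm0-disk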

On Oct 16, 2013, at 9:11 AM, Jon  wrote:

> Hello Michael,
> 
> Thanks for the reply.  It seems like ceph isn't actually "mounting" the rbd 
> to the vm host which is where I think I was getting hung up (I had previously 
> been attempting to mount rbds directly to multiple hosts and as you can 
> imagine having issues).
> 
> Could you possibly expound on why using a clustered filesystem approach is 
> wrong (or conversely why using RBDs is the correct approach)?
> 
> As for format2 rbd images, it looks like they provide exactly the 
> Copy-On-Write functionality that I am looking for.  Any caveats or things I 
> should look out for when going from format 1 to format 2 images? (I think I 
> read something about not being able to use both at the same time...)
> 
> Thanks Again,
> Jon A
> 
> 
> On Mon, Oct 14, 2013 at 4:42 PM, Michael Lowe  
> wrote:
> I live migrate all the time using the rbd driver in qemu, no problems.  Qemu 
> will issue a flush as part of the migration so everything is consistent.  
> It's the right way to use ceph to back vm's. I would strongly recommend 
> against a network file system approach.  You may want to look into format 2 
> rbd images, the cloning and writable snapshots may be what you are looking 
> for.
> 
> Sent from my iPad
> 
> On Oct 14, 2013, at 5:37 AM, Jon  wrote:
> 
>> Hello,
>> 
>> I would like to live migrate a VM between two "hypervisors".  Is it possible 
>> to do this with an rbd disk, or should the vm disks be created as qcow images 
>> on a CephFS/NFS share (is it possible to do clvm over rbds? Or GlusterFS 
>> over rbds?) and point kvm at the network directory.  As I understand it, rbds 
>> aren't "cluster aware" so you can't mount an rbd on multiple hosts at once, 
>> but maybe libvirt has a way to handle the transfer...?  I like the idea of 
>> "master" or "golden" images where guests write any changes to a new image, I 
>> don't think rbds are able to handle copy-on-write in the same way kvm does 
>> so maybe a clustered filesystem approach is the ideal way to go.
>> 
>> Thanks for your input. I think I'm just missing some piece. .. I just don't 
>> grok...
>> 
>> Best Regards,
>> Jon A
>> 
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] saucy salamander support?

2013-10-22 Thread Mike Lowe
And a +1 from me as well.  It would appear that Ubuntu has picked up the 0.67.4 
source and included a build of it in their official repo, so you may be able to 
get by with that until the next point release.

http://packages.ubuntu.com/search?keywords=ceph

On Oct 22, 2013, at 11:46 AM, Mike Dawson  wrote:

> For the time being, you can install the Raring debs on Saucy without issue.
> 
> echo deb http://ceph.com/debian-dumpling/ raring main | sudo tee 
> /etc/apt/sources.list.d/ceph.list
> 
> I'd also like to register a +1 request for official builds targeted at Saucy.
> 
> Cheers,
> Mike
> 
> 
> On 10/22/2013 11:42 AM, LaSalle, Jurvis wrote:
>> Hi,
>> 
>>  I accidentally installed Saucy Salamander.  Does the project have a
>> timeframe for supporting this Ubuntu release?
>> 
>> Thanks,
>> JL
>> 
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD numbering

2013-10-30 Thread Mike Lowe
You really should; I believe the OSD number is used in computing the CRUSH 
mapping.  Bad things will happen if you don't use sequential numbers.
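
As I understand it, you don't really get to pick the id anyway: ceph hands out
the lowest free number when an OSD is created, so a custom numbering scheme
would be fighting the tooling.

# ceph allocates the next free id for you
ceph osd create
# prints e.g. "12" if osd.0 through osd.11 already exist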

On Oct 30, 2013, at 11:37 AM, Glen Aidukas  wrote:

> I wanted to know, does the OSD numbering have to be sequential, and what is 
> the highest usable number (2^16 or 2^32)?
> 
> The reason is, I would like to use a numbering convention that reflects the 
> cluster number (assuming I will have more than one down the road; test, dev, 
> prod), the host and disk used by a given OSD.
> 
> So, for example: osd.CHHHDD  where:
>   C   Cluster number 1-9
>   HHH Host number  IE: ceph001, ceph002, ...
>   DD  Disk number on a given host Ex: 00 = /dev/sda or something like 
> this.
> 
> If the highest number usable is 65534 or near that (2^16) then maybe I could 
> use format CHHDD or CHHHD where I could have clusters 1-5.
> 
> The up side to this is I quickly know where osd.200503 is.  It's on cluster 2 
> host ceph005 and the third disk.  Also, if I add a new disk on a middle host, 
> it doesn't scatter the numbering to where I don't easily know where an OSD is. 
>  I know I can always look this up but having it as part of the OSD number 
> makes life easier. :)
> 
> Also, it might seem silly to have the first digit as a cluster number but I 
> think we probably can't pad the number with zeros, so using an initial digit 
> of 1-9 cleans this up so I might as well use it to identify the cluster.  
> 
> This numbering system is not important for the monitors or metadata but could 
> help with the OSDs.
> 
> -Glen
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Red Hat clients

2013-10-30 Thread Mike Lowe
If you were to run your Red Hat based client in a VM, you could run an 
unmodified kernel.  If you are using RHEL 6.4, then you get the extra goodies 
in the virtio-scsi qemu driver.
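
To sketch what that looks like on the hypervisor side (libvirt XML; the pool,
image, monitor host, and secret UUID below are all made up), the guest just
sees an ordinary SCSI disk and needs no rbd support in its own kernel:

<controller type='scsi' model='virtio-scsi'/>
<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='writeback'/>
  <auth username='libvirt'>
    <secret type='ceph' uuid='00000000-0000-0000-0000-000000000000'/>
  </auth>
  <source protocol='rbd' name='rbd/rhel-client-disk'>
    <host name='mon1.example.com' port='6789'/>
  </source>
  <target dev='sda' bus='scsi'/>
</disk>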

On Oct 30, 2013, at 2:47 PM,  
 wrote:

> Now that my ceph cluster seems to be happy and stable, I have been looking at 
> different ways of using it.   Object, block and file.
>  
> Object is relatively easy and I will use different ones to test with Ceph.
>  
> When I look at block, I’m getting the impression from a lot of Googling that 
> deploying clients on Red Hat to connect to a Ceph cluster can be complex.   
> As I understand it, the rbd module is not currently in the Red Hat kernel 
> (and I am not allowed to make changes to our standard kernel as is suggested 
> in places as a possible solution).  Does this mean I can’t connect a Red Hat 
> machine to Ceph as a block client?
>  
> ___
> 
> This message is for information purposes only, it is not a recommendation, 
> advice, offer or solicitation to buy or sell a product or service nor an 
> official confirmation of any transaction. It is directed at persons who are 
> professionals and is not intended for retail customer use. Intended for 
> recipient only. This message is subject to the terms at: 
> www.barclays.com/emaildisclaimer.
> 
> For important disclosures, please see: 
> www.barclays.com/salesandtradingdisclaimer regarding market commentary from 
> Barclays Sales and/or Trading, who are active market participants; and in 
> respect of Barclays Research, including disclosures relating to specific 
> issuers, please see http://publicresearch.barclays.com.
> 
> ___
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rgw, nss: dropping the legacy PKI token support in RadosGW (removed in OpenStack Ocata)

2019-04-19 Thread Mike Lowe
I've run production Ceph/OpenStack since 2015.  The reality is that running 
OpenStack Newton (the last release with PKI tokens) against a post-Nautilus 
Ceph release just isn't going to work. You are going to have bigger problems 
than trying to make object storage work with keystone-issued tokens. Worst 
case, you will have to do the right thing and switch to fernet tokens, which 
are supported all the way back to Kilo, released 4 years ago.
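
For anyone making that switch, radosgw doesn't care which token format keystone
issues as long as it can validate them against the Keystone API. A minimal
sketch of the ceph.conf side, assuming the v3 API, with the endpoint and
credentials made up:

[client.rgw.gateway]
rgw keystone url = http://keystone.example.com:5000
rgw keystone api version = 3
rgw keystone admin user = rgw
rgw keystone admin password = secret       ; illustrative
rgw keystone admin domain = Default
rgw keystone admin project = service
rgw keystone accepted roles = member,admin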

> On Apr 19, 2019, at 2:39 PM, Anthony D'Atri  wrote:
> 
> 
> I've been away from OpenStack for a couple of years now, so this may have 
> changed.  But back around the Icehouse release, at least, upgrading between 
> OpenStack releases was a major undertaking, so backing an older OpenStack 
> with newer Ceph seems like it might be more common than one might think.
> 
> Which is not to argue for or against dropping PKI in Ceph, but if it's going 
> to be done, please call that out early in the release notes to avoid rude 
> awakenings.
> 
> 
>> [Adding ceph-users for better usability]
>> 
>> On Fri, 19 Apr 2019, Radoslaw Zarzynski wrote:
>>> Hello,
>>> 
>>> RadosGW can use OpenStack Keystone as one of its authentication
>>> backends. Keystone in turn had been offering many token variants
>>> over time, with PKI/PKIz being one of them. Unfortunately,
>>> this specific type had many flaws (like explosion in size of HTTP
>>> header) and has been dropped from Keystone in August 2016 [1].
>>> By "dropping" I don't mean just "deprecating". PKI tokens have
>>> been physically eradicated from Keystone's code base not leaving
>>> documentation behind. This happened in OpenStack Ocata.
>>> 
>>> Intuitively I don't expect that brand new Ceph is deployed with
>>> an ancient OpenStack release. Similarly, upgrading Ceph while
>>> keeping very old OpenStack seems quite improbable.
>> 
>> This sounds reasonable to me.  If someone is running an old OpenStack, 
>> they should be able to defer their Ceph upgrade until OpenStack is 
>> upgraded... or at least transition off the old keystone variant?
>> 
>> sage
>> 
>>> If so, we may consider dropping PKI token support in further
>>> releases. What makes me perceive this idea as attractive is:
>>> 1) significant clean-up in RGW. We could remove a lot of
>>> complexity including the entire revocation machinery with
>>> its dedicated thread.
>>> 2) Killing the NSS dependency. After moving the AWS-like
>>> crypto services of RGW to OpenSSL, the CMS utilized by PKI
>>> token support is the library's sole user.
>>> I'm not saying it's a blocker for NSS removal. Likely we could
>>> reimplement the stuff on top of OpenSSL as well.
>>> All I'm worrying about is that this could be a futile effort, bringing
>>> more problems/confusion than benefits. For instance, instead
>>> of just dropping the "nss_db_path" config option, we would
>>> need to replace it with counterpart for OpenSSL or take care
>>> of differences in certificate formats between the libraries.
>>> 
>>> I can see benefits of the removal. However, the actual cost
>>> is mysterious to me. Is the feature useful?
>>> 
>>> Regards,
>>> Radek
>>> 
>>> [1]: 
>>> https://github.com/openstack/keystone/commit/8a66ef635400083fa426c0daf477038967785caf
>>> 
>>> 
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com