Re: [ceph-users] RBD image has no active watchers while OpenStack KVM VM is running

2017-11-29 Thread Logan Kuhn
We've seen this.  Our environment isn't identical, though; we use oVirt and 
connect to Ceph (11.2.1) via Cinder (9.2.1). But it's so very rare that we've 
never had any luck in pinpointing it, and we have far fewer VMs, <300.

Regards,
Logan

- On Nov 29, 2017, at 7:48 AM, Wido den Hollander w...@42on.com wrote:

| Hi,
| 
| In an OpenStack environment I encountered a VM which went into R/O mode after an
| RBD snapshot was created.
| 
| Digging into this I found tens (out of thousands) of RBD images which DO have a
| running VM, but do NOT have a watcher on the RBD image.
| 
| For example:
| 
| $ rbd status volumes/volume-79773f2e-1f40-4eca-b9f0-953fa8d83086
| 
| 'Watchers: none'
| 
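| (As a cross-check, the watcher list can also be queried directly at the RADOS
| level on the image's header object. This is just a sketch and assumes a
| format-2 image whose header object is named rbd_header.<id>, with <id> taken
| from the block_name_prefix reported by 'rbd info':
| 
| $ rbd info volumes/volume-79773f2e-1f40-4eca-b9f0-953fa8d83086   # note block_name_prefix
| $ rados -p volumes listwatchers rbd_header.<id>                  # <id> from that prefix
| 
| Since 'rbd status' reads the watchers of that same header object, the two
| should normally agree.)
| 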
| The VM has, however, been running since September 5th, 2017, with Jewel 10.2.7
| on the client.
| 
| In the meantime the cluster has already been upgraded to 10.2.10.
| 
| Looking further I also found a Compute node with 10.2.10 installed which also
| has RBD images without watchers.
| 
| Restarting or live migrating the VM to a different host resolves this issue.
| 
| The internet is full of posts where RBD images still have Watchers when people
| don't expect them, but in this case I'm expecting a watcher which isn't there.
| 
| The main problem right now is that creating a snapshot potentially puts a VM in
| Read-Only state because of the lack of notification.
| 
| Has anybody seen this as well?
| 
| Thanks,
| 
| Wido


Re: [ceph-users] Prioritise recovery on specific PGs/OSDs?

2017-06-20 Thread Logan Kuhn
Is there a way to prioritize specific pools during recovery? I know there are 
issues open for it, but I wasn't aware it was implemented yet... 

Regards, 
Logan 

- On Jun 20, 2017, at 8:20 AM, Sam Wouters  wrote: 

| Hi,

| Are they all in the same pool? If not, you could prioritize recovery of the
| pool you care about. Otherwise, maybe you can play with the osd max backfills
| number; I have no idea if it accepts a value of 0 to actually disable backfill
| for specific OSDs.
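
| (For what it's worth, the backfill throttle can be adjusted per OSD at runtime,
| so one rough approach, purely a sketch with placeholder OSD ids and not
| something I've verified will fully disable backfill, is to lower it on the OSDs
| whose PGs you don't care about yet and raise it on the ones you do:
| 
| $ ceph tell osd.12 injectargs '--osd-max-backfills 1'   # OSDs whose PGs can wait
| $ ceph tell osd.7 injectargs '--osd-max-backfills 4'    # OSDs you want drained first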

| r,
| Sam

| On 20-06-17 14:44, Richard Hesketh wrote:

|| Is there a way, either by individual PG or by OSD, that I can prioritise
|| backfill/recovery on a set of PGs which are currently particularly important
|| to me?

|| For context, I am replacing disks in a 5-node Jewel cluster on a node-by-node
|| basis - mark out the OSDs on a node, wait for them to clear, replace the OSDs,
|| bring them up and in, mark out the OSDs on the next set, etc. I've done my
|| first node, but the significant CRUSH map changes mean most of my data is
|| moving. I currently only care about the PGs on my next set of OSDs to replace -
|| the other remapped PGs I don't care about settling, because they're only going
|| to end up moving around again after I do the next set of disks. I do want the
|| PGs specifically on the OSDs I am about to replace to backfill, because I don't
|| want to compromise data integrity by downing them while they host active PGs.
|| If I could specifically prioritise the backfill on those PGs/OSDs, I could get
|| on with replacing disks without worrying about causing degraded PGs.
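
|| (For reference, the per-node cycle above corresponds roughly to the usual
|| out/replace sequence, sketched here with a placeholder OSD id and ignoring
|| whatever your deployment tool does for the re-create step:
|| 
|| $ ceph osd out 42                # mark out, wait for its PGs to drain
|| $ ceph osd crush remove osd.42   # once backfill is done
|| $ ceph auth del osd.42
|| $ ceph osd rm 42
|| # ...swap the physical disk, recreate the OSD (ceph-disk / ceph-deploy / etc.),
|| # let it come back up and in, then wait for backfill again.)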

|| I'm in a situation right now where there are merely a couple of dozen PGs on
|| the disks I want to replace, which are all remapped and waiting to backfill -
|| but there are 2200 other PGs also waiting to backfill because they've moved
|| around too, and it's extremely frustrating to be sat waiting to see when the
|| ones I care about will finally be handled so I can get on with replacing those
|| disks.

|| Rich




Re: [ceph-users] Brainstorming ideas for Python-CRUSH

2017-03-21 Thread Logan Kuhn
I like the idea. 

Being able to play around with different configuration options and use this 
tool as a sanity check - showing what will change, as well as whether or not 
the changes could push the cluster into HEALTH_WARN or HEALTH_ERR - would be 
very useful. 

For example, if I were to change the replication level of a pool, it could show 
how much space would be left, as well as an estimate of how long it would take 
to rebalance. 
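
Something along the lines of the following (just a sketch; "volumes" is a 
placeholder pool name), where the tool would predict the resulting 'ceph df' 
numbers and the data movement before the change is actually applied: 

$ ceph osd pool set volumes size 3   # "volumes" is a placeholder pool name
$ ceph df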

Benchmarking capabilities would also be nice: replication changes, CRUSH 
changes, OSD add/drop, node add/drop, IOPS, and read/write performance. 

Regards, 
Logan 

- On Mar 21, 2017, at 6:58 AM, Xavier Villaneau  wrote: 

| Hello all,

| A few weeks ago Loïc Dachary presented his work on python-crush to the
| ceph-devel list, but I don't think it's been done here yet. In a few words,
| python-crush is a new Python 2 and 3 library / API for the CRUSH algorithm. It
| also provides a CLI executable with a few built-in tools related to CRUSH maps.
| If you want to try it, follow the instructions from its documentation page:
| http://crush.readthedocs.io/en/latest/

| Currently the crush CLI has two features:
| - analyze: Get an estimation of how (un)evenly the objects will be placed into
| your cluster
| - compare: Get a summary of how much data would be moved around if the map was
| changed
| Both of these tools are very basic and have a few known caveats, but nothing
| that cannot be fixed; the project is still young and open to suggestions and
| contributions.
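
| (To give a concrete starting point, and assuming the package is published on
| PyPI under the name "crush", which the documentation page above should confirm,
| installing and exploring the CLI looks roughly like:
| 
| $ pip install crush
| $ crush analyze --help
| $ crush compare --help
| 
| The exact options for each sub-command are described in the --help output and
| on readthedocs.)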

| This is where we'd like to hear feedback from the user community, given
| everyone's experience in operating (or just messing around with) Ceph clusters.
| What kind of CRUSH / data placement tools would be interesting to have? Are
| there some very common architectural / technical questions related to CRUSH
| that such tools would help answer? Any specific cases where such a thing could
| have spared you some pain?

| Here are a few ideas off the top of my head, to help start the discussion:
| - Static analysis of the failure domains, with detection of potential SPOFs
| - Help with capacity planning: estimations of how much data could practically
| be stored in a cluster
| - Built-in basic scenarios for "compare", such as adding a node or removing an
| OSD.

| Please share your ideas; they will eventually help make this a better tool!
| Regards,
| --
| Xavier Villaneau
| Software Engineer, working with Ceph during the day and sometimes at night too.
| Storage R&D at Concurrent Computer Corporation, Atlanta, USA



Re: [ceph-users] Cephfs with large numbers of files per directory

2017-02-21 Thread Logan Kuhn
We had a very similar configuration at one point. 

I was fairly new when we started to move away from it, but what happened to us 
is that anytime a directory needed a stat, backup, ls, rsync, etc., it would 
take minutes to return, and while it was waiting, CPU load would spike due to 
iowait. The difference between what you've described and what we did is that we 
used a gateway machine; the actual cluster never had any issues with it. This 
was also on Infernalis, so things have probably changed in Jewel and Kraken. 

Regards, 
Logan 

- On Feb 21, 2017, at 7:37 AM, Rhian Resnick  wrote: 

| Good morning,

| We are currently investigating using Ceph for a KVM farm, block storage and
| possibly file systems (cephfs with ceph-fuse, and ceph hadoop). Our cluster
| will be composed of 4 nodes, ~240 OSDs, and 4 monitors providing mon and mds
| as required.

| What experience has the community had with large numbers of files in a single
| directory (500,000 - 5 million)? We know that directory fragmentation will be
| required but are concerned about the stability of the implementation.
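
| (For reference, on Jewel/Kraken-era releases directory fragmentation still sits
| behind an opt-in flag. Enabling it looks something like the command below,
| assuming the filesystem is named "cephfs":
| 
| $ ceph fs set cephfs allow_dirfrags true   # "cephfs" is a placeholder fs name
| 
| The exact flag name and any experimental-feature guards may vary by release.)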

| Your opinions and suggestions are welcome.

| Thank you

| Rhian Resnick

| Assistant Director Middleware and HPC

| Office of Information Technology

| Florida Atlantic University

| 777 Glades Road, CM22, Rm 173B

| Boca Raton, FL 33431

| Phone 561.297.2647

| Fax 561.297.0222
