Re: [ceph-users] Hammer OSD crash during deep scrub

2016-02-18 Thread Max A. Krasilnikov
Hello! On Wed, Feb 17, 2016 at 11:14:09AM +0200, pseudo wrote: > Hello! > Now I'm going to check the OSD filesystem. But I have neither strange logs in > syslog nor SMART reports about this drive. The filesystem check did not find any problems. Removing the OSD and scrubbing the problematic PG on other pair
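For anyone hitting something similar, the bare minimum to re-run the scrub once the disk checks out clean is roughly the following (2.19 is only a placeholder pgid):

  ceph pg deep-scrub 2.19    # re-trigger the deep scrub on the suspect PG
  ceph -w                    # watch for scrub errors or another OSD crash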

Re: [ceph-users] Recover unfound objects from crashed OSD's underlying filesystem

2016-02-18 Thread Kostis Fardelas
Can it be any OSD, or one of those that the PG reports to have probed? Do you know if there is a way to force probing for a PG besides restarting an OSD? I guess it doesn't need to be an empty OSD either. I also suppose that trying to manually copy the objects is not going to work: a. either by just u
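For reference, the usual inspection commands around unfound objects look roughly like this (3.1a is a placeholder pgid, and mark_unfound_lost is very much a last resort):

  ceph health detail                       # lists the PGs with unfound objects
  ceph pg 3.1a query                       # recovery_state shows might_have_unfound / probing OSDs
  ceph pg 3.1a list_missing                # which objects are missing and where they were last seen
  ceph pg 3.1a mark_unfound_lost revert    # last resort: roll back or forget the unfound objects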

[ceph-users] OSD Performance Counters

2016-02-18 Thread Nick Fisk
Hi All, Could someone please sanity check this for me. I'm trying to get my head around which counters reflect what and how they correlate to end-user performance. In the attached graph I am plotting averages of the counters across all OSDs on one host. Blue = osd.w_op_latency Red = Max of abo
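For what it's worth, the raw counters behind such graphs can be pulled per OSD from the admin socket; the latency counters are (avgcount, sum) pairs, so a graph normally plots delta(sum)/delta(avgcount) between samples. A quick sketch, with osd.0 as a placeholder:

  ceph daemon osd.0 perf dump      # dump all counters for this OSD as JSON
  ceph daemon osd.0 perf schema    # describes each counter's type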

[ceph-users] OSD Journal size config

2016-02-18 Thread M Ranga Swami Reddy
Hello All, I have increased my cluster's OSD journal size from 2GB to 10GB, but could NOT see much write/read performance improvement. (The cluster has 4 servers and 96 OSDs.) Am I missing anything here? Or do I need to update some more config variables related to journalling, like (using the default
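For reference, the journal size itself is set in ceph.conf in megabytes; on its own it rarely changes steady-state throughput unless the filestore flush behaviour is tuned as well, so everything below other than the journal size is only an example assumption:

  [osd]
  osd journal size = 10240            # in MB, i.e. 10 GB
  # a larger journal only helps if the flush interval lets it absorb bursts:
  filestore max sync interval = 10    # example value; default is 5 seconds
  filestore min sync interval = 0.01  # default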

[ceph-users] Large directory block size on XFS may be harmful

2016-02-18 Thread Jens Rosenboom
Various people have noticed performance problems and sporadic kernel log messages like kernel: XFS: possible memory allocation deadlock in kmem_alloc (mode:0x8250) with their Ceph clusters. We have seen this in one of our clusters ourselves, but have not been able to reproduce it in a lab environment
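For context, the directory block size of an existing OSD filesystem can be checked with xfs_info, and a replacement filesystem with the stock 4k directory blocks looks roughly like this (/dev/sdX1 is a placeholder device; -i size=2048 is just the usual Ceph mkfs default):

  xfs_info /var/lib/ceph/osd/ceph-0 | grep naming    # bsize=65536 here means 64k dir blocks
  mkfs.xfs -f -i size=2048 /dev/sdX1                 # no "-n size=64k": keep the 4k default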

Re: [ceph-users] OSD Journal size config

2016-02-18 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > M Ranga Swami Reddy > Sent: 18 February 2016 12:09 > To: ceph-users > Subject: [ceph-users] OSD Journal size config > > Hello All, > I have increased my cluster's OSD journal size from 2GB

Re: [ceph-users] Large directory block size on XFS may be harmful

2016-02-18 Thread Dan van der Ster
Hi, Thanks for linking to a current update on this problem [1] [2]. I really hope that new Ceph installations aren't still following that old advice... it's been known to be a problem for around a year and a half [3]. That said, the "-n size=64k" wisdom was really prevalent a few years ago, and I

Re: [ceph-users] Large directory block size on XFS may be harmful

2016-02-18 Thread Jens Rosenboom
2016-02-18 15:10 GMT+01:00 Dan van der Ster : > Hi, > > Thanks for linking to a current update on this problem [1] [2]. I > really hope that new Ceph installations aren't still following that > old advice... it's been known to be a problem for around a year and a > half [3]. > That said, the "-n si

[ceph-users] osd not removed from crush map after ceph osd crush remove

2016-02-18 Thread Dimitar Boichev
Hello, I am running a tiny cluster of 2 nodes.

  ceph -v
  ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3)

One OSD died and I added a new OSD (not replacing the old one). After that I wanted to remove the failed OSD completely from the cluster. Here is what I did: ceph osd reweight osd.
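For reference, the full removal sequence on a 0.80.x cluster is usually along these lines (osd.12 is a placeholder id, and the stop line assumes the stock sysvinit scripts); reweight alone does not take the entry out of the CRUSH map:

  ceph osd out 12
  /etc/init.d/ceph stop osd.12     # or however you normally stop the daemon
  ceph osd crush remove osd.12
  ceph auth del osd.12
  ceph osd rm 12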

Re: [ceph-users] Large directory block size on XFS may be harmful

2016-02-18 Thread Mark Nelson
FWIW, we still had a couple of example cbt configuration files in the github repo that used -n 64k until Jens pointed it out earlier this week. That's now been fixed, so that should at least help. Unfortunately there are still some old threads on the mailing list that list it as a potential perfo

Re: [ceph-users] Large directory block size on XFS may be harmful

2016-02-18 Thread Dan van der Ster
On Thu, Feb 18, 2016 at 3:46 PM, Jens Rosenboom wrote: > 2016-02-18 15:10 GMT+01:00 Dan van der Ster : >> Hi, >> >> Thanks for linking to a current update on this problem [1] [2]. I >> really hope that new Ceph installations aren't still following that >> old advice... it's been known to be a prob

Re: [ceph-users] Idea for speedup RadosGW for buckets with many objects.

2016-02-18 Thread Yehuda Sadeh-Weinraub
On Wed, Feb 17, 2016 at 12:51 PM, Krzysztof Księżyk wrote: > Hi, > > I'm experiencing a problem with poor performance of RadosGW while operating on > a bucket with many objects. That's a known issue with LevelDB and can be > partially resolved using sharding, but I have one more idea. As I see in ceph > o
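For anyone looking for the sharding knob mentioned above: on hammer and later, the bucket index of newly created buckets can be split across several RADOS objects with a ceph.conf setting roughly like this (the section name and shard count are only examples):

  [client.radosgw.gateway]
  rgw override bucket index max shards = 8    # applies only to buckets created afterwards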

[ceph-users] GSoC Mentor Submissions Due

2016-02-18 Thread Patrick McGarry
Hey cephers, Just a reminder, today at 5p EST is the cutoff to get your project ideas (mentor name, project title, 2-3 sentences description) to me in order to be considered as a Google Summer of Code mentor. If you have any questions please let me know asap. If you would like a reference, last ye

Re: [ceph-users] OSD Journal size config

2016-02-18 Thread Nick Fisk
> -Original Message- > From: M Ranga Swami Reddy [mailto:swamire...@gmail.com] > Sent: 18 February 2016 13:44 > To: Nick Fisk > Subject: Re: [ceph-users] OSD Journal size config > > > Hello All, > > I have increased my cluster's OSD journal size from 2GB to 10GB. > > But could NOT see muc

[ceph-users] R: cancel or remove default pool rbd

2016-02-18 Thread Andrea Annoè
Hi Michael, I have resolved this by deleting and recreating the rbd pool. Thanks in advance. Andrea. -Original Message- From: Michael Hackett [mailto:mhack...@redhat.com] Sent: Thursday, 11 February 2016 23:26 To: Andrea Annoè Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] cancel or remove
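For anyone wanting to do the same, deleting and recreating the default rbd pool is roughly the following (the PG count of 64 is only a placeholder; size it for your cluster):

  ceph osd pool delete rbd rbd --yes-i-really-really-mean-it
  ceph osd pool create rbd 64 64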

Re: [ceph-users] SSD-Cache Tier + RBD-Cache = Filesystem corruption?

2016-02-18 Thread Jason Dillaman
That's a pretty strange and seemingly non-random corruption of your first block. Is that object in the cache pool right now? If so, is the backing pool object just as corrupt as the cache pool's object? I see that your cache pool is currently configured in forward mode. Did you switch to t
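For context, switching a cache tier out of writeback and draining it usually looks like this (hot-pool is a placeholder name; this is only the mechanics being discussed, not a fix for the corruption itself):

  ceph osd tier cache-mode hot-pool forward    # stop promoting; I/O is forwarded to the base pool
  rados -p hot-pool cache-flush-evict-all      # flush dirty objects down and evict clean ones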

Re: [ceph-users] How to recover from OSDs full in small cluster

2016-02-18 Thread Lukáš Kubín
Hi, we've managed to release some space from our cluster. Now I would like to restart those 2 full OSDs. As they're completely full I probably need to delete some data from them. I would like to ask: Is it OK to delete all pg directories (eg. all subdirectories in /var/lib/ceph/osd/ceph-5/current/
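Before deleting anything, it is worth seeing which PG directories are actually holding the space and confirming the PG is healthy elsewhere, e.g. (the osd-5 path and 5.1f pgid are placeholders):

  du -sh /var/lib/ceph/osd/ceph-5/current/*_head | sort -h | tail    # biggest PG dirs on this OSD
  ceph pg 5.1f query | grep '"state"'                                # confirm active+clean cluster-wide first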

Re: [ceph-users] How to recover from OSDs full in small cluster

2016-02-18 Thread Stillwell, Bryan
When I've run into this situation I look for PGs that are on the full drives, but are in an active+clean state in the cluster. That way I can safely remove the PGs from the full drives and not have to risk data loss. It usually doesn't take much before you can restart the OSDs and let ceph take c
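A rough way to get that list of candidate PGs, assuming a release that has pg ls-by-osd (osd id 5 is a placeholder):

  ceph pg ls-by-osd 5 | grep 'active+clean'    # PGs mapped to the full OSD that are clean cluster-wide
  # older releases: use "ceph pg dump pgs_brief" and filter on the acting set instead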

[ceph-users] incorrect numbers in ceph osd pool stats

2016-02-18 Thread Ben Hines
Ceph 9.2.0. Anyone seen this? Crazy numbers in osd stats command:

  ceph osd stats
  pool .rgw.buckets id 12
  2/39 objects degraded (5.128%)
  -105/39 objects misplaced (-269.231%)
  recovery io 20183 kB/s, 36 objects/s
  client io 79346 kB/s rd, 703 kB/s wr, 476 op/s

  ceph osd stats -f json
  {"po

[ceph-users] Replication between regions?

2016-02-18 Thread Alexandr Porunov
Is it possible to replicate objects across regions? How can we create such clusters? Could you suggest some helpful articles or books (Ceph cookbooks)? I want to know whether it is possible to create multi-master data centers with data replication among them. Sincerely

Re: [ceph-users] How to properly deal with NEAR FULL OSD

2016-02-18 Thread Vlad Blando
I changed my volume pool's PGs from 300 to 512 to even out the distribution; right now it is backfilling and remapping, and I can see that it's working. ---
  osd.2 is near full at 85%
  osd.4 is near full at 85%
  osd.5 is near full at 85%
  osd.6 is near full at 85%
  osd.7 is near full at 86%
  osd.8 is near full
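For reference, the PG split only rebalances fully once pgp_num is raised to match; roughly (assuming the pool is the usual OpenStack volumes pool):

  ceph osd pool set volumes pg_num 512
  ceph osd pool set volumes pgp_num 512    # placement only changes once pgp_num follows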

Re: [ceph-users] Replication between regions?

2016-02-18 Thread LOPEZ Jean-Charles
Hi, this is where it is discussed: http://docs.ceph.com/docs/hammer/radosgw/federated-config/ JC > On Feb 18, 2016, at 15:14, Alexandr Porunov > wrote: > > Is it possible to replicate objects across the regions. How can we create > such clusters? > > Could you suggest me helpful articles/

Re: [ceph-users] Replication between regions?

2016-02-18 Thread tobe
Thanks @jelopez for the link. I don't think this is what we want because it's just for RGW. It would be much better to have native, low-level geo-replication for RBD, RGW and CephFS alike. We would like to know about the progress or ideas around this :) Thanks and regards On Fri, Feb 19, 2016

Re: [ceph-users] How to properly deal with NEAR FULL OSD

2016-02-18 Thread Vlad Blando
I tried setting this: ceph tell mon.* injectargs "--mon_osd_nearfull_ratio .92" but it doesn't seem to work; or is the mon busy and the command queued? ---
  osd.2 is near full at 85%
  osd.4 is near full at 85%
  osd.5 is near full at 85%
  osd.6 is near full at 85%
  osd.7 is near full at 86%
  osd.8 is n
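If memory serves, on hammer-era releases the near-full threshold that drives these warnings is carried in the PGMap, so injecting mon_osd_nearfull_ratio alone may not move it; something like the following is worth trying (a hedged suggestion, not a confirmed fix, and "a" is a placeholder mon id):

  ceph pg set_nearfull_ratio 0.92                         # update the ratio the health warnings check
  ceph health detail | grep 'near full'                   # re-check after a moment
  ceph daemon mon.a config get mon_osd_nearfull_ratio     # verify the injectargs value landed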

[ceph-users] Erasure code Plugins

2016-02-18 Thread Daleep Singh Bais
Hi All, I am experimenting with erasure-code profiles and would like to understand more about them. I created an LRC profile based on http://docs.ceph.com/docs/master/rados/operations/erasure-code-lrc/ The LRC profile I created is:

  ceph osd erasure-code-profile get lrctest1
  k=2
  l=2
  m=2
  plugin=
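For comparison, a profile like that is created along these lines (the failure-domain and locality values are only example assumptions, and 12 PGs is just the docs' example):

  ceph osd erasure-code-profile set lrctest1 \
      plugin=lrc k=2 m=2 l=2 \
      ruleset-failure-domain=host ruleset-locality=rack
  ceph osd erasure-code-profile get lrctest1
  ceph osd pool create lrcpool 12 12 erasure lrctest1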