Re: [ceph-users] ONE pg deep-scrub blocks cluster

2016-08-30 Thread Mehmet
Good news Jean-Charles :) Now I have deleted the object [...] -rw-r--r-- 1 ceph ceph 100G Jul 31 01:04 vm-101-disk-2__head_383C3223__0 [...] root@:~# rados -p rbd rm vm-101-disk-2 and ran a deep-scrub on 0.223 again. root@gengalos:~# ceph pg 0.223 query No blocked requests anymore :) To
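A minimal way to re-check the cluster after such a cleanup (a sketch, assuming the standard Ceph CLI of that era; the PG id is the one from this thread):

    ceph pg deep-scrub 0.223           # trigger the deep-scrub by hand
    ceph health detail                 # should stay HEALTH_OK, with no "blocked requests" lines
    ceph pg 0.223 query | grep scrub   # confirm the last_deep_scrub timestamps advanced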

Re: [ceph-users] ONE pg deep-scrub blocks cluster

2016-08-29 Thread Jean-Charles Lopez
Hi Mehmet, OK so it does come from a rados put. As you were able to check, the VM device object size is 4 MB. So we'll see after you have removed the object with rados -p rbd rm. I'll wait for an update. JC While moving. Excuse unintended typos. > On Aug 29, 2016, at 14:34, Mehmet wrote:
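How the RBD object size referred to here can be verified (a sketch; the pool and image name are only examples from this thread):

    rbd -p rbd info vm-101-disk-1
    # the "order 22 (4096 kB objects)" line corresponds to the default 4 MB object size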

Re: [ceph-users] ONE pg deep-scrub blocks cluster

2016-08-29 Thread Mehmet
Hey JC, after setting up the Ceph cluster I tried to migrate an image from one of our production VMs into Ceph via # rados -p rbd put ... but I always got "file too large". I guess this file # -rw-r--r-- 1 ceph ceph 100G Jul 31 01:04 vm-101-disk-2__head_383C3223__0 is the result of th
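For context: rados put stores the whole input file as a single RADOS object, which matches the 100G object seen here; a VM image is normally imported as an RBD image instead, so it gets striped into 4 MB objects. A hedged sketch (the path and image name are only examples):

    rbd import --image-format 2 /path/to/vm-101-disk-2.raw rbd/vm-101-disk-2
    rbd info rbd/vm-101-disk-2    # should report order 22 (4096 kB objects)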

Re: [ceph-users] ONE pg deep-scrub blocks cluster

2016-08-29 Thread Mehmet
Hello JC, in short, for the record: What you can try is changing the following settings on all the OSDs that host this particular PG and see if it makes things better [osd] [...] osd_scrub_chunk_max = 5 # maximum number of chunks the scrub will
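If restarting the OSDs is not desirable, the same setting can usually be injected at runtime and verified over the admin socket (a sketch; the value is just the one discussed here, and osd.4 stands for one of the acting OSDs):

    ceph tell osd.* injectargs '--osd_scrub_chunk_max 5'
    ceph daemon osd.4 config show | grep osd_scrub_chunk   # run on the node hosting osd.4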

Re: [ceph-users] ONE pg deep-scrub blocks cluster

2016-08-29 Thread JC Lopez
Hi Mehmet, see inline. Keep me posted JC > On Aug 29, 2016, at 01:23, Mehmet wrote: > > Hey JC, > > thank you very much! - My answers inline :) > > On 2016-08-26 19:26, LOPEZ Jean-Charles wrote: >> Hi Mehmet, >> what is interesting in the PG stats is that the PG contains around >> 700+ obj

Re: [ceph-users] ONE pg deep-scrub blocks cluster

2016-08-29 Thread Mehmet
Hey JC, thank you very much! - My answers inline :) On 2016-08-26 19:26, LOPEZ Jean-Charles wrote: Hi Mehmet, what is interesting in the PG stats is that the PG contains around 700+ objects and you said that you are using RBD only in your cluster, IIRC. With the default RBD order (4MB obje
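With 4 MB RBD objects, an oversized object in that PG also stands out on disk; one way to spot it directly on an acting OSD (a sketch, assuming filestore and the default data path):

    find /var/lib/ceph/osd/ceph-*/current/0.223_head/ -type f -size +100M -exec ls -lh {} \;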

Re: [ceph-users] ONE pg deep-scrub blocks cluster

2016-08-26 Thread Mehmet
Hello JC, as promised here is my - ceph.conf (I have done a "diff" on all involved servers - all using the same ceph.conf) = ceph_conf.txt - ceph pg 0.223 query = ceph_pg_0223_query_20161236.txt - ceph -s = ceph_s.txt - ceph df = ceph_df.txt - ceph osd df = ceph_osd_df.txt - ceph osd dump | gre

Re: [ceph-users] ONE pg deep-scrub blocks cluster

2016-08-25 Thread ceph
Hey JC, Thank you very much for your mail! I will provide the information tomorrow when I am at work again. Hope that we will find a solution :) - Mehmet On 24 August 2016 16:58:58 CEST, LOPEZ Jean-Charles wrote: >Hi Mehmet, > >I’m just seeing your message and read the thread going with

Re: [ceph-users] ONE pg deep-scrub blocks cluster

2016-08-24 Thread Mehmet
Hello Guys, the issue still exists :( If we run a "ceph pg deep-scrub 0.223" nearly all VMs stop for a while (blocked requests). - We already replaced the OSDs (SAS disks - journal on NVMe) - Removed OSDs so that the acting set for PG 0.223 has changed - Checked the filesystem on the acting OSDs
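To double-check which OSDs actually hold the PG after all the replacements (a sketch; the example output only mirrors the OSD ids mentioned earlier in this thread, and the epoch is made up):

    ceph pg map 0.223
    # e.g. "osdmap e1234 pg 0.223 (0.223) -> up [4,16,28] acting [4,16,28]"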

Re: [ceph-users] ONE pg deep-scrub blocks cluster

2016-08-02 Thread c
On 2016-08-02 13:30, c wrote: Hello Guys, this time without the original acting-set osd.4, 16 and 28. The issue still exists... [...] For the record, this ONLY happens with this PG and no others that share the same OSDs, right? Yes, right. [...] When doing the deep-scrub, monitor (atop,

Re: [ceph-users] ONE pg deep-scrub blocks cluster

2016-08-02 Thread c
Hello Guys, this time without the original acting-set osd.4, 16 and 28. The issue still exists... [...] For the record, this ONLY happens with this PG and no others that share the same OSDs, right? Yes, right. [...] When doing the deep-scrub, monitor (atop, etc) all 3 nodes and see if a p

Re: [ceph-users] ONE pg deep-scrub blocks cluster

2016-08-01 Thread c
Hello Guys, your help is really appreciated! [...] For the record, this ONLY happens with this PG and no others that share the same OSDs, right? Yes, right. [...] When doing the deep-scrub, monitor (atop, etc) all 3 nodes and see if a particular OSD (HDD) stands out, as I would expect it to.

Re: [ceph-users] ONE pg deep-scrub blocks cluster

2016-07-30 Thread c
On 2016-07-30 14:04, Marius Vaitiekunas wrote: Hi, We had a similar issue. If you use radosgw and have large buckets, this PG could hold a bucket index.  Hello Marius, thanks for your hint. But it seems that I forgot to mention that we are using Ceph only as RBD for our virtual machines

Re: [ceph-users] ONE pg deep-scrub blocks cluster

2016-07-29 Thread c
Hi Christian, Hello Bill, thank you very much for your post. For the record, this ONLY happens with this PG and no others that share the same OSDs, right? Yes, right. If so then we're looking at something (HDD or FS wise) that's specific to the data of this PG. When doing the deep-scrub,
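A simple way to run that check if atop is not installed (a sketch; iostat from the sysstat package, run on each of the three nodes while the scrub is triggered from an admin node):

    iostat -xmt 5               # watch %util and await per device
    ceph pg deep-scrub 0.223    # in a second terminal, from a monitor/admin node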

Re: [ceph-users] ONE pg deep-scrub blocks cluster

2016-07-28 Thread Bill Sharer
Removing osd.4 and still getting the scrub problems removes its drive from consideration as the culprit. Try the same thing again for osd.16 and then osd.28. smartctl may not show anything out of sorts until the marginally bad sector or sectors finally go bad and get remapped. The only hi
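What smartctl can still show before a sector is actually remapped (a sketch; the device path is only an example):

    smartctl -a /dev/sdc | egrep 'Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable'
    # pending or offline-uncorrectable counts above 0 point at a marginal surface
    # even while Reallocated_Sector_Ct is still 0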

Re: [ceph-users] ONE pg deep-scrub blocks cluster

2016-07-28 Thread Christian Balzer
Hello, On Thu, 28 Jul 2016 14:46:58 +0200 c wrote: > Hello Ceph alikes :) > > I have a strange issue with one PG (0.223) combined with "deep-scrub". > > Whenever Ceph - or I manually - runs a " ceph pg deep-scrub 0.223 ", > this leads to many "slow/blocked requests" so that nearly all of my V

Re: [ceph-users] ONE pg deep-scrub blocks cluster

2016-07-28 Thread c
On 2016-07-28 15:26, Bill Sharer wrote: I suspect the data for one or more shards on this OSD's underlying filesystem has a marginally bad sector or sectors. A read from the deep scrub may be causing the drive to perform repeated seeks and reads of the sector until it gets a good read from the

Re: [ceph-users] ONE pg deep-scrub blocks cluster

2016-07-28 Thread Bill Sharer
I suspect the data for one or more shards on this osd's underlying filesystem has a marginally bad sector or sectors. A read from the deep scrub may be causing the drive to perform repeated seeks and reads of the sector until it gets a good read from the filesystem. You might want to look at

[ceph-users] ONE pg deep-scrub blocks cluster

2016-07-28 Thread c
Hello Ceph alikes :) I have a strange issue with one PG (0.223) combined with "deep-scrub". Whenever Ceph - or I manually - runs a " ceph pg deep-scrub 0.223 ", this leads to many "slow/blocked requests" so that nearly all of my VMs stop working for a while. This happens only to this one PG
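When the slow requests appear, the stuck operations can be inspected on the primary OSD of the PG over the admin socket (a sketch; run it on the node hosting that OSD, and the OSD id is only an example):

    ceph daemon osd.4 dump_ops_in_flight    # or dump_historic_ops for recently completed slow ops
    # look for ops with a large "age" and check which event they are waiting on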