[ceph-users] Re: RGW blocking on large objects

2019-10-14 Thread Paul Emmerich
Could the 4 GB health check GET saturate the connection from rgw to Ceph? Simple to test: just rate-limit the health check GET. Did you increase "objecter inflight ops" and "objecter inflight op bytes"? You absolutely should adjust these settings for large RGW setups; the defaults of 1024 and 100 MB are way
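A minimal sketch of raising those two throttles on a Nautilus cluster via the centralized config store; the gateway entity name (client.rgw.gw1) and the values shown are illustrative, not recommendations from this thread:

    # Defaults: objecter_inflight_ops = 1024, objecter_inflight_op_bytes = 104857600 (100 MB)
    ceph config set client.rgw.gw1 objecter_inflight_ops 8192
    ceph config set client.rgw.gw1 objecter_inflight_op_bytes 1073741824   # 1 GB
    # Or equivalently in ceph.conf under the gateway's [client.rgw.gw1] section:
    #   objecter inflight ops = 8192
    #   objecter inflight op bytes = 1073741824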

[ceph-users] Re: Recurring issue: PG is inconsistent, but lists no inconsistent objects

2019-10-14 Thread Reed Dier
I had an issue slightly similar to yours. However, my issue was specific to (and limited to) the device_health_metrics pool that is auto-created with 1 PG when you turn that mgr feature on. https://www.mail-archive.com/ceph-users@lists.ceph.com/msg56315.html

[ceph-users] Re: Recurring issue: PG is inconsistent, but lists no inconsistent objects

2019-10-14 Thread Florian Haas
On 14/10/2019 17:21, Dan van der Ster wrote: >> I'd appreciate a link to more information if you have one, but a PG >> autoscaling problem wouldn't really match with the issue already >> appearing in pre-Nautilus releases. :) > > https://github.com/ceph/ceph/pull/30479 Thanks! But no, this

[ceph-users] RGW blocking on large objects

2019-10-14 Thread Robert LeBlanc
We set up a new Nautilus cluster and only have RGW on it. While we had a job doing 200k IOPS of really small objects, I noticed that HAProxy was kicking out RGW backends because they were taking more than 2 seconds to return. We GET a large ~4GB file each minute and use that as a health check to
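For illustration only, a rough sketch of the HAProxy health-check arrangement being described and the timeout that trips it; the addresses, object path, and 2-second check timeout are assumptions, not the actual config from this thread:

    backend rgw
        # If "timeout check" is shorter than the time it takes RGW to serve the
        # health-check GET (e.g. while streaming a multi-GB object), the backend
        # gets marked down and kicked out of rotation.
        option httpchk GET /health/check-object HTTP/1.0
        http-check expect status 200
        timeout check 2s
        server rgw1 10.0.0.11:7480 check inter 60s
        server rgw2 10.0.0.12:7480 check inter 60s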

[ceph-users] Re: Constant write load on 4 node ceph cluster

2019-10-14 Thread Paul Emmerich
It's pretty common to see way more writes than reads if you've got lots of idle VMs. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Mon, Oct 14, 2019 at 6:34 PM Ingo

[ceph-users] Re: Constant write load on 4 node ceph cluster

2019-10-14 Thread Ingo Schmidt
Great, this helped a lot. Although "ceph iostat" didn't give iostats for single images, just a general overview of IO, I remembered the new Nautilus RBD performance monitoring. https://ceph.com/rbd/new-in-nautilus-rbd-performance-monitoring/ With a "simple" >rbd perf image iotop I was able to
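For reference, the per-image counters referred to above come from the Nautilus rbd_support mgr module; a minimal invocation might look like this (the pool name is a placeholder):

    # Interactive, top-like view of per-image IOPS and throughput:
    rbd perf image iotop --pool rbd
    # One-shot/batch view of the same counters:
    rbd perf image iostat --pool rbd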

[ceph-users] Past_interval start interval mismatch (last_clean_epoch reported)

2019-10-14 Thread Huseyin Cotuk
Hi all, I also hit bug #24866 in my test environment. According to the logs, the last_clean_epoch in the specified OSD/PG is 17703, but the interval starts with 17895, so the OSD fails to start. There are some other OSDs in the same state. 2019-10-14 18:22:51.908 7f0a275f1700 -1 osd.21
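If anyone wants to confirm the same mismatch on their own OSDs, the PG's recorded epochs can be inspected offline with ceph-objectstore-tool while the OSD is stopped; the data path and PG id below are placeholders, and this is only a read-only probe, not a fix:

    systemctl stop ceph-osd@21
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-21 \
        --pgid 1.2a --op info
    # Compare last_epoch_clean and same_interval_since in the "history" section
    # with the epochs reported in the osd log.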

[ceph-users] Re: Recurring issue: PG is inconsistent, but lists no inconsistent objects

2019-10-14 Thread Dan van der Ster
On Mon, Oct 14, 2019 at 3:14 PM Florian Haas wrote: > > On 14/10/2019 13:29, Dan van der Ster wrote: > >> Hi Dan, > >> > >> what's in the log is (as far as I can see) consistent with the pg query > >> output: > >> > >> 2019-10-14 08:33:57.345 7f1808fb3700 0 log_channel(cluster) log [DBG] : > >>

[ceph-users] object goes missing in bucket

2019-10-14 Thread Benjamin . Zieglmeier
Hey all, Experiencing an odd issue over the last week or so with a single bucket in a Ceph Luminous (12.2.11) cluster. We occasionally get a complaint from the owner of one bucket (bucket1) that a single object they have written has gone missing. If we list the bucket, the object is indeed
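Not a fix, but two read-only radosgw-admin probes that can help narrow down whether the object vanished from the bucket index, from RGW's metadata, or both; the object name is a placeholder:

    # Does RGW still have metadata for the object?
    radosgw-admin object stat --bucket=bucket1 --object=missing-key
    # Cross-check the bucket index against the stored objects (review the output before considering --fix):
    radosgw-admin bucket check --bucket=bucket1 --check-objects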

[ceph-users] Re: Constant write load on 4 node ceph cluster

2019-10-14 Thread Ashley Merrick
Is the storage being used for the whole VM disk? If so, have you checked that none of your software is writing constant logs, or doing something else that continuously writes to disk? If you're running a new version you can use https://docs.ceph.com/docs/mimic/mgr/iostat/ to locate the exact RBD
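The iostat linked there is an mgr module; enabling and running it looks roughly like this (exact output varies a little between releases):

    ceph mgr module enable iostat
    # Prints cluster-wide read/write throughput and IOPS, refreshed periodically:
    ceph iostat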

[ceph-users] Re: Recurring issue: PG is inconsistent, but lists no inconsistent objects

2019-10-14 Thread Florian Haas
On 14/10/2019 13:29, Dan van der Ster wrote: >> Hi Dan, >> >> what's in the log is (as far as I can see) consistent with the pg query >> output: >> >> 2019-10-14 08:33:57.345 7f1808fb3700 0 log_channel(cluster) log [DBG] : >> 10.10d scrub starts >> 2019-10-14 08:33:57.345 7f1808fb3700 -1

[ceph-users] Constant write load on 4 node ceph cluster

2019-10-14 Thread Ingo Schmidt
Hi all, We have a 4 node ceph cluster that runs generally fine. It is the storage backend for our virtualization cluster with Proxmox, which runs about 40 virtual machines (80% various Linux servers). Now that we have implemented monitoring, I see that there is quite a constant write load of

[ceph-users] Re: CephFS and 32-bit Inode Numbers

2019-10-14 Thread Dan van der Ster
OK, I found that the kernel has an "ino32" mount option which hashes 64-bit inos into the 32-bit space. Has anyone tried this? What happens if two files collide? -- Dan On Mon, Oct 14, 2019 at 1:18 PM Dan van der Ster wrote: > > Hi all, > > One of our users has some 32-bit commercial software that
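For context, the option is passed at mount time by the kernel client; a hypothetical mount line (monitor and credential names are made up) would be:

    # Hash 64-bit CephFS inode numbers down to 32 bits for legacy 32-bit software:
    mount -t ceph mon1,mon2,mon3:/ /mnt/cephfs \
        -o name=cephfs-user,secretfile=/etc/ceph/cephfs.secret,ino32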

[ceph-users] Re: Recurring issue: PG is inconsistent, but lists no inconsistent objects

2019-10-14 Thread Dan van der Ster
On Mon, Oct 14, 2019 at 1:27 PM Florian Haas wrote: > > On 14/10/2019 13:20, Dan van der Ster wrote: > > Hey Florian, > > > > What does the ceph.log ERR or ceph-osd log show for this inconsistency? > > > > -- Dan > > Hi Dan, > > what's in the log is (as far as I can see) consistent with the pg

[ceph-users] Re: Recurring issue: PG is inconsistent, but lists no inconsistent objects

2019-10-14 Thread Florian Haas
On 14/10/2019 13:20, Dan van der Ster wrote: > Hey Florian, > > What does the ceph.log ERR or ceph-osd log show for this inconsistency? > > -- Dan Hi Dan, what's in the log is (as far as I can see) consistent with the pg query output: 2019-10-14 08:33:57.345 7f1808fb3700 0

[ceph-users] Re: Recurring issue: PG is inconsistent, but lists no inconsistent objects

2019-10-14 Thread Dan van der Ster
Hey Florian, What does the ceph.log ERR or ceph-osd log show for this inconsistency? -- Dan On Mon, Oct 14, 2019 at 1:04 PM Florian Haas wrote: > > Hello, > > I am running into an "interesting" issue with a PG that is being flagged > as inconsistent during scrub (causing the cluster to go to

[ceph-users] Recurring issue: PG is inconsistent, but lists no inconsistent objects

2019-10-14 Thread Florian Haas
Hello, I am running into an "interesting" issue with a PG that is being flagged as inconsistent during scrub (causing the cluster to go to HEALTH_ERR), but doesn't actually appear to contain any inconsistent objects. $ ceph health detail HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg
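For readers following along, these are the commands typically used to dig into such a PG; the PG id 10.10d is taken from the log excerpts quoted elsewhere in the thread:

    ceph health detail                        # names the PG carrying the scrub error
    rados list-inconsistent-obj 10.10d --format=json-pretty   # here: returns no objects
    ceph pg deep-scrub 10.10d                 # re-run the scrub that raised the error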

[ceph-users] RDMA

2019-10-14 Thread gabryel . mason-williams
Hello, I was wondering what people's experience has been with using Ceph over RDMA? - How did you set it up? - What documentation did you use to set it up? - Any known issues when using it? - Do you still use it? Kind regards Gabryel Mason-Williams
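Not an answer, but for concreteness: the knobs usually involved when people experiment with this are the async+rdma messenger settings in ceph.conf; treat the snippet below as an assumed typical setup, not a recommendation:

    [global]
    # Switch the async messenger to the RDMA transport:
    ms_type = async+rdma
    # RDMA-capable NIC to bind to (device name is an example):
    ms_async_rdma_device_name = mlx5_0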