Could the 4 GB GET limit saturate the connection from rgw to Ceph?
Simple to test: just rate-limit the health check GET
Did you increase "objecter inflight ops" and "objecter inflight op bytes"?
You absolutely should adjust these settings for large RGW setups;
the defaults of 1024 ops and 100 MB are way too low.
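If those defaults are the bottleneck, raising them looks roughly like this (a sketch only; the values below are illustrative, not tuned recommendations):

```ini
# ceph.conf fragment on the RGW client side (illustrative values)
[client]
objecter_inflight_ops = 8192             ; default 1024
objecter_inflight_op_bytes = 1073741824  ; default 104857600 (100 MB)
```

The same options can also be set at runtime via `ceph config set` on recent releases.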
I had something slightly similar to your issue.
However, my issue was specific/limited to the device_health_metrics pool that
is auto-created with 1 PG when you turn that mgr feature on.
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg56315.html
On 14/10/2019 17:21, Dan van der Ster wrote:
>> I'd appreciate a link to more information if you have one, but a PG
>> autoscaling problem wouldn't really match with the issue already
>> appearing in pre-Nautilus releases. :)
>
> https://github.com/ceph/ceph/pull/30479
Thanks! But no, this
We set up a new Nautilus cluster and only have RGW on it. While we had
a job doing 200k IOPS of really small objects, I noticed that HAProxy
was kicking out RGW backends because they were taking more than 2
seconds to return. We GET a large ~4 GB file each minute and use that
as a health check to
It's pretty common to see way more writes than reads if you have lots of idle VMs
Paul
--
Paul Emmerich
Looking for help with your Ceph cluster? Contact us at https://croit.io
croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
On Mon, Oct 14, 2019 at 6:34 PM Ingo
Great, this helped a lot. Although "ceph iostat" didn't give iostats for single
images, just a general overview of IO, I remembered the new Nautilus RBD
performance monitoring.
https://ceph.com/rbd/new-in-nautilus-rbd-performance-monitoring/
With a "simple"
>rbd perf image iotop
I was able to
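For reference, the Nautilus per-image perf commands look roughly like this (run against a live cluster; the pool name is an example):

```shell
# top-like live view of per-image IOPS and throughput
rbd perf image iotop --pool rbd
# non-interactive per-image stats
rbd perf image iostat --pool rbd
```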
Hi all,
I also hit the bug #24866 in my test environment. According to the logs, the
last_clean_epoch in the specified OSD/PG is 17703, but the interval starts with
17895. So the OSD fails to start. There are some other OSDs in the same status.
2019-10-14 18:22:51.908 7f0a275f1700 -1 osd.21
On Mon, Oct 14, 2019 at 3:14 PM Florian Haas wrote:
>
> On 14/10/2019 13:29, Dan van der Ster wrote:
> >> Hi Dan,
> >>
> >> what's in the log is (as far as I can see) consistent with the pg query
> >> output:
> >>
> >> 2019-10-14 08:33:57.345 7f1808fb3700 0 log_channel(cluster) log [DBG] :
> >>
Hey all,
Experiencing an odd issue over the last week or so with a single bucket in a
Ceph Luminous (12.2.11) cluster. We occasionally get a complaint from the owner
of one bucket (bucket1) that a single object they have written has gone
missing. If we list the bucket, the object is indeed
Is the storage being used for the whole VM disk?
If so, have you checked that none of your software is writing constant logs? Or
something that could continuously write to disk.
If you're running a new version you can use
https://docs.ceph.com/docs/mimic/mgr/iostat/ to locate the exact RBD
On 14/10/2019 13:29, Dan van der Ster wrote:
>> Hi Dan,
>>
>> what's in the log is (as far as I can see) consistent with the pg query
>> output:
>>
>> 2019-10-14 08:33:57.345 7f1808fb3700 0 log_channel(cluster) log [DBG] :
>> 10.10d scrub starts
>> 2019-10-14 08:33:57.345 7f1808fb3700 -1
Hi all
We have a 4-node Ceph cluster that runs generally fine. It is the storage
backend for our virtualization cluster with Proxmox, which runs about 40 virtual
machines (80% various Linux servers). Now that we have implemented monitoring,
I see that there is a fairly constant write load of
OK I found that the kernel has an "ino32" mount option which hashes 64-bit
inos into 32-bit space.
Has anyone tried this?
What happens if two files collide?
-- Dan
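On the collision question: mapping 64-bit inode numbers into 32-bit space is a birthday problem, so a collision becomes likely once you have on the order of 2^16 = 65,536 files, not 2^32. A quick illustration (a sketch only; the hash below is an arbitrary stand-in, not the kernel's actual ino32 mapping):

```python
import hashlib

def ino32(ino64):
    """Stand-in 64->32-bit hash (illustrative; NOT the kernel's ino32 scheme)."""
    digest = hashlib.blake2b(ino64.to_bytes(8, "little"), digest_size=4).digest()
    return int.from_bytes(digest, "little")

def first_collision(inos):
    """Return how many inodes were hashed before the first 32-bit collision,
    or None if the whole iterable hashed without one."""
    seen = set()
    for n, ino in enumerate(inos, 1):
        h = ino32(ino)
        if h in seen:
            return n
        seen.add(h)
    return None

# By the birthday bound, hashing ~2**17 or more distinct 64-bit inos into
# 32 bits makes a collision very likely, long before 2**32 files exist.
```

What the filesystem does when two files collide is a separate question for the kernel client; this only shows that with enough files a collision is essentially guaranteed.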
On Mon, Oct 14, 2019 at 1:18 PM Dan van der Ster wrote:
>
> Hi all,
>
> One of our users has some 32-bit commercial software that
On Mon, Oct 14, 2019 at 1:27 PM Florian Haas wrote:
>
> On 14/10/2019 13:20, Dan van der Ster wrote:
> > Hey Florian,
> >
> > What does the ceph.log ERR or ceph-osd log show for this inconsistency?
> >
> > -- Dan
>
> Hi Dan,
>
> what's in the log is (as far as I can see) consistent with the pg
On 14/10/2019 13:20, Dan van der Ster wrote:
> Hey Florian,
>
> What does the ceph.log ERR or ceph-osd log show for this inconsistency?
>
> -- Dan
Hi Dan,
what's in the log is (as far as I can see) consistent with the pg query
output:
2019-10-14 08:33:57.345 7f1808fb3700 0
Hey Florian,
What does the ceph.log ERR or ceph-osd log show for this inconsistency?
-- Dan
On Mon, Oct 14, 2019 at 1:04 PM Florian Haas wrote:
>
> Hello,
>
> I am running into an "interesting" issue with a PG that is being flagged
> as inconsistent during scrub (causing the cluster to go to
Hello,
I am running into an "interesting" issue with a PG that is being flagged
as inconsistent during scrub (causing the cluster to go to HEALTH_ERR),
but doesn't actually appear to contain any inconsistent objects.
$ ceph health detail
HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg
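To see which objects the scrub actually flagged, the usual commands look like this (run against a live cluster; the pg id is an example):

```shell
# list the objects the scrub marked inconsistent in the pg
rados list-inconsistent-obj 10.10d --format=json-pretty
# once the bad replica is understood, repair the pg
ceph pg repair 10.10d
```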
Hello,
I was wondering what your experience has been with using Ceph over RDMA?
- How did you set it up?
- What documentation did you use to set it up?
- Are there known issues when using it?
- Do you still use it?
Kind regards
Gabryel Mason-Williams