[ceph-users] Delay time in Multi-site sync

2019-08-06 Thread Hoan Nguyen Van
Hi all. I want to delay the sync process from the primary zone to the secondary zone, so that if someone deletes my data I have enough time to react. How can I do it? Is there a config option, or should I install more proxies? Any solutions? Thanks. Regards

Re: [ceph-users] New CRUSH device class questions

2019-08-06 Thread Konstantin Shalygin
Is it possible to add a new device class like 'metadata'? Yes, but you don't need this. Just use your existing class with another CRUSH ruleset. If I set the device class manually, will it be overwritten when the OSD boots up? Nope. Classes are assigned automatically when the OSD is created, not
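Konstantin's suggestion of reusing the existing class with a dedicated rule might look like this (a sketch; the rule, class, and pool names are placeholders):

```shell
# Replicated rule selecting only OSDs of the existing "nvme" class
# (root "default", failure domain "host"; available since Luminous)
ceph osd crush rule create-replicated metadata-rule default host nvme

# Point the metadata pool at the new rule; no new device class needed
ceph osd pool set cephfs_metadata crush_rule metadata-rule
```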

Re: [ceph-users] New CRUSH device class questions

2019-08-06 Thread Robert LeBlanc
On Tue, Aug 6, 2019 at 11:11 AM Paul Emmerich wrote: > On Tue, Aug 6, 2019 at 7:45 PM Robert LeBlanc > wrote: > > We have a 12.2.8 luminous cluster with all NVMe and we want to take some > of the NVMe OSDs and allocate them strictly to metadata pools (we have a > problem with filling up this

Re: [ceph-users] 14.2.2 - OSD Crash

2019-08-06 Thread Brad Hubbard
-63> 2019-08-07 00:51:52.861 7fe987e49700 1 heartbeat_map clear_timeout 'OSD::osd_op_tp thread 0x7fe987e49700' had suicide timed out after 150 You hit a suicide timeout, that's fatal. On line 80 the process kills the thread based on the assumption it's hung. src/common/HeartbeatMap.cc: 66
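The thresholds involved are configurable (a sketch; the values shown are the defaults, not a tuning recommendation, and raising them only hides whatever is stalling the thread):

```shell
# Soft threshold: logged as "had timed out" by heartbeat_map is_healthy
ceph config set osd osd_op_thread_timeout 15

# Hard threshold: "had suicide timed out" aborts the OSD, as seen here
ceph config set osd osd_op_thread_suicide_timeout 150
```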

Re: [ceph-users] RadosGW (Ceph Object Gateway) Pools

2019-08-06 Thread EDH - Manuel Rios Fernandez
Hi, I think -> default.rgw.buckets.index; for us it reaches 2k-6k IOPS for an index size of 23GB. Regards Manuel -Original Message- From: ceph-users On behalf of dhils...@performair.com Sent: Wednesday, August 7, 2019 1:41 To: ceph-users@lists.ceph.com Subject: [ceph-users]

[ceph-users] RadosGW (Ceph Object Gateway) Pools

2019-08-06 Thread DHilsbos
All; Based on the PG Calculator on the Ceph website, I have this list of pools to pre-create for my Object Gateway: .rgw.root default.rgw.control default.rgw.data.root default.rgw.gc default.rgw.log default.rgw.intent-log default.rgw.meta default.rgw.usage default.rgw.users.keys
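The PG calculator's per-pool sizing reduces to a simple rule: multiply target PGs per OSD by the OSD count and the pool's expected data share, divide by the replica count, and round the result up to the next power of two. A minimal sketch of that rounding step (the input is the pre-computed raw estimate, since plain shell has no floats):

```shell
# Round a raw PG estimate up to the next power of two, as the PG
# calculator does. Example raw estimate:
#   12 OSDs * 100 target PGs/OSD * 0.90 data share / 3 replicas = 360
pg_count() {
  local raw=$1 pg=1
  while [ "$pg" -lt "$raw" ]; do pg=$((pg * 2)); done
  echo "$pg"
}

pg_count 360   # prints 512
```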

[ceph-users] Error Mounting CephFS

2019-08-06 Thread DHilsbos
All; I have a server running CentOS 7.6 (1810), that I want to set up with CephFS (full disclosure, I'm going to be running samba on the CephFS). I can mount the CephFS fine when I use the option secret=, but when I switch to secretfile=, I get an error "No such process." I installed
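One common cause of that error, assuming a kernel-client mount: the secretfile= option is parsed by the mount.ceph helper from the ceph-common package, not by the kernel itself, so the helper must be installed and the file must contain only the bare base64 key. A sketch (client name and paths are examples):

```shell
# mount.ceph (package ceph-common) handles the secretfile= option
yum install -y ceph-common

# The secret file must hold only the key value,
# with no "[client.admin]" or "key =" lines around it
chmod 600 /etc/ceph/admin.secret

mount -t ceph mon1:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret
```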

Re: [ceph-users] How to maximize the OSD effective queue depth in Ceph?

2019-08-06 Thread Anthony D'Atri
> However, I'm starting to think that the problem isn't with the number > of threads that have work to do... the problem may just be that the > OSD & PG code has enough thread locking happening that there is no > possible way to have more than a few things happening on a single OSD > (or perhaps a

[ceph-users] 14.2.2 - OSD Crash

2019-08-06 Thread EDH - Manuel Rios Fernandez
Hi, We got a pair of OSDs located in a node that crash randomly since 14.2.2. OS Version: CentOS 7.6. There are a ton of lines before the crash; this one is unexpected: -- 3045> 2019-08-07 00:39:32.013 7fe9a4996700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fe987e49700' had timed

Re: [ceph-users] How to maximize the OSD effective queue depth in Ceph?

2019-08-06 Thread Mark Lehrer
Thanks, that looks quite useful. I did a few tests and got basically a null result. In fact, when I put the RBDs on different pools on the same SSDs or pools on different SSDs, performance was a few percent worse than leaving them on the same pool. I definitely wasn't expecting this! It looks

Re: [ceph-users] [Ceph-users] Re: MDS failing under load with large cache sizes

2019-08-06 Thread Janek Bevendorff
> Your parallel rsync job is only getting 150 creates per second? What > was the previous throughput? I am actually not quite sure what the exact throughput was or is or what I can expect. It varies so much. I am copying from a 23GB file list that is split into 3000 chunks which are then

Re: [ceph-users] New CRUSH device class questions

2019-08-06 Thread Paul Emmerich
On Tue, Aug 6, 2019 at 7:45 PM Robert LeBlanc wrote: > We have a 12.2.8 luminous cluster with all NVMe and we want to take some of > the NVMe OSDs and allocate them strictly to metadata pools (we have a problem > with filling up this cluster and causing lingering metadata problems, and > this

[ceph-users] New CRUSH device class questions

2019-08-06 Thread Robert LeBlanc
We have a 12.2.8 luminous cluster with all NVMe and we want to take some of the NVMe OSDs and allocate them strictly to metadata pools (we have a problem with filling up this cluster and causing lingering metadata problems, and this will guarantee space for metadata operations). In the past, we
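For completeness, manually pinning a custom class and building a rule over it might look like this (a sketch; device-class names are free-form strings, and osd.0 and the pool/rule names are placeholders):

```shell
# Drop the auto-assigned class first, then set the custom one
ceph osd crush rm-device-class osd.0
ceph osd crush set-device-class metadata osd.0

# Replicated rule restricted to OSDs of that class
ceph osd crush rule create-replicated meta-only default host metadata
ceph osd pool set cephfs_metadata crush_rule meta-only
```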

Re: [ceph-users] How to maximize the OSD effective queue depth in Ceph?

2019-08-06 Thread Mark Nelson
You may be interested in using my wallclock profiler to look at lock contention: https://github.com/markhpc/gdbpmp It will greatly slow down the OSD but will show you where time is being spent and so far the results appear to at least be relatively informative.  I used it recently when

Re: [ceph-users] [Ceph-users] Re: MDS failing under load with large cache sizes

2019-08-06 Thread Patrick Donnelly
On Tue, Aug 6, 2019 at 7:57 AM Janek Bevendorff wrote: > > > > 4k req/s is too fast for a create workload on one MDS. That must > > include other operations like getattr. > > That is rsync going through millions of files checking which ones need > updating. Right now there are not actually any

Re: [ceph-users] How to maximize the OSD effective queue depth in Ceph?

2019-08-06 Thread Mark Lehrer
I have a few more cycles this week to dedicate to the problem of making OSDs do more than maybe 5 simultaneous operations (as measured by the iostat effective queue depth of the drive). However, I'm starting to think that the problem isn't with the number of threads that have work to do... the

Re: [ceph-users] tcmu-runner: "Acquired exclusive lock" every 21s

2019-08-06 Thread Mike Christie
On 08/06/2019 11:28 AM, Mike Christie wrote: > On 08/06/2019 07:51 AM, Matthias Leopold wrote: >> >> >> Am 05.08.19 um 18:31 schrieb Mike Christie: >>> On 08/05/2019 05:58 AM, Matthias Leopold wrote: Hi, I'm still testing my 2 node (dedicated) iSCSI gateway with ceph 12.2.12

Re: [ceph-users] tcmu-runner: "Acquired exclusive lock" every 21s

2019-08-06 Thread Mike Christie
On 08/06/2019 07:51 AM, Matthias Leopold wrote: > > > Am 05.08.19 um 18:31 schrieb Mike Christie: >> On 08/05/2019 05:58 AM, Matthias Leopold wrote: >>> Hi, >>> >>> I'm still testing my 2 node (dedicated) iSCSI gateway with ceph 12.2.12 >>> before I dare to put it into production. I installed

Re: [ceph-users] radosgw (beast): how to enable verbose log? request, user-agent, etc.

2019-08-06 Thread EDH - Manuel Rios Fernandez
Hi Felix, You can increase the debug option with debug rgw on your rgw nodes. We got it to 10. But at least in our case we switched back to civetweb because beast doesn't provide a clear log without a lot of verbosity. Regards Manuel From: ceph-users On behalf of Félix Barbeira

[ceph-users] radosgw (beast): how to enable verbose log? request, user-agent, etc.

2019-08-06 Thread Félix Barbeira
Hi, I'm testing radosgw with the beast backend and I did not find a way to view more information in the logfile. This is an example: 2019-08-06 16:59:14.488 7fc808234700 1 == starting new request req=0x5608245646f0 = 2019-08-06 16:59:14.496 7fc808234700 1 == req done req=0x5608245646f0 op
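Raising the rgw log level as suggested in the replies might look like this (a sketch; the daemon name is an example, and 10 is very verbose):

```shell
# At runtime via the admin socket on the gateway node:
ceph daemon client.rgw.gateway1 config set debug_rgw 10/10

# Or persistently in ceph.conf on the gateway node:
# [client.rgw.gateway1]
#     debug rgw = 10
```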

Re: [ceph-users] [Ceph-users] Re: MDS failing under load with large cache sizes

2019-08-06 Thread Janek Bevendorff
4k req/s is too fast for a create workload on one MDS. That must include other operations like getattr. That is rsync going through millions of files checking which ones need updating. Right now there are not actually any create operations, since I restarted the copy job. I wouldn't

Re: [ceph-users] [Ceph-users] Re: MDS failing under load with large cache sizes

2019-08-06 Thread Patrick Donnelly
On Tue, Aug 6, 2019 at 12:48 AM Janek Bevendorff wrote: > > However, now my client processes are basically in constant I/O wait > > state and the CephFS is slow for everybody. After I restarted the copy > > job, I got around 4k reqs/s and then it went down to 100 reqs/s with > > everybody waiting

Re: [ceph-users] tcmu-runner: "Acquired exclusive lock" every 21s

2019-08-06 Thread Matthias Leopold
Am 05.08.19 um 18:31 schrieb Mike Christie: On 08/05/2019 05:58 AM, Matthias Leopold wrote: Hi, I'm still testing my 2 node (dedicated) iSCSI gateway with ceph 12.2.12 before I dare to put it into production. I installed latest tcmu-runner release (1.5.1) and (like before) I'm seeing that

[ceph-users] OSDs keep crashing after cluster reboot

2019-08-06 Thread Ansgar Jazdzewski
Hi folks, we had to move one of our clusters, so we had to reboot all servers. Now we see an error on all OSDs with the EC pool. Are we missing some options? Will an upgrade to 13.2.6 help? Thanks, Ansgar 2019-08-06 12:10:16.265 7fb337b83200 -1 /build/ceph-13.2.4/src/osd/ECUtil.h: In function

Re: [ceph-users] bluestore write iops calculation

2019-08-06 Thread nokia ceph
On Mon, Aug 5, 2019 at 6:35 PM wrote: > > Hi Team, > > @vita...@yourcmc.ru , thank you for the information and could you please > > clarify the below queries as well, > > > > 1. The average object size we use will be 256KB to 512KB; will there be a > > deferred write queue? > > With the default
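For context on the deferred-write question: BlueStore only defers (journals through the RocksDB WAL) writes at or below bluestore_prefer_deferred_size for the device class, so with default thresholds 256-512KB object writes would normally go straight to the data device. A sketch of inspecting the thresholds (osd.0 is a placeholder):

```shell
# Per-device-class thresholds; writes larger than this bypass the
# deferred-write queue and are written directly to the data device
ceph daemon osd.0 config get bluestore_prefer_deferred_size_hdd
ceph daemon osd.0 config get bluestore_prefer_deferred_size_ssd
```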

Re: [ceph-users] [Ceph-users] Re: MDS failing under load with large cache sizes

2019-08-06 Thread Janek Bevendorff
However, now my client processes are basically in constant I/O wait state and the CephFS is slow for everybody. After I restarted the copy job, I got around 4k reqs/s and then it went down to 100 reqs/s with everybody waiting their turn. So yes, it does seem to help, but it increases

Re: [ceph-users] [Ceph-users] Re: MDS failing under load with large cache sizes

2019-08-06 Thread Janek Bevendorff
Thanks that helps. Looks like the problem is that the MDS is not automatically trimming its cache fast enough. Please try bumping mds_cache_trim_threshold: bin/ceph config set mds mds_cache_trim_threshold 512K That did help. Somewhat. I removed the aggressive recall settings I set before
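For reference, the trim and recall knobs discussed in this thread can be set like this (a sketch; the recall option names are the Nautilus-era ones and the values are illustrative, not recommendations):

```shell
ceph config set mds mds_cache_trim_threshold 512K

# Client-cap recall settings referred to as "aggressive" above:
ceph config set mds mds_recall_max_caps 10000
ceph config set mds mds_recall_max_decay_rate 1.0
```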