Re: [ceph-users] HEALTH_ERR, size and min_size

2019-12-29 Thread Bernhard Krieger

Is the "ceph osd tree" output before or after replacing the osds.
I ask, because osd10 is not shown in the output?

regards
Bernhard


On 26/12/2019 21:18, Ml Ml wrote:

Hello List,
I have size = 3 and min_size = 2 with 3 nodes.

My OSDs:

ceph osd tree
ID CLASS WEIGHT   TYPE NAME   STATUS REWEIGHT PRI-AFF
-1   60.17775 root default
-2   20.21155 host ceph01
  0   hdd  1.71089 osd.0   up  1.0 1.0
  8   hdd  1.71660 osd.8   up  1.0 1.0
  9   hdd  2.67029 osd.9   up  1.0 1.0
11   hdd  1.71649 osd.11  up  1.0 1.0
12   hdd  2.67020 osd.12  up  1.0 1.0
14   hdd  2.67020 osd.14  up  1.0 1.0
18   hdd  1.71649 osd.18  up  1.0 1.0
22   hdd  2.67020 osd.22  up  1.0 1.0
23   hdd  2.67020 osd.23  up  1.0 1.0
-3   19.08154 host ceph02
  2   hdd  2.67029 osd.2   up  1.0 1.0
  3   hdd  2.7 osd.3   up  1.0 1.0
  7   hdd  2.67029 osd.7   up  1.0 1.0
13   hdd  2.67020 osd.13  up  1.0 1.0
16   hdd  1.5 osd.16  up  1.0 1.0
19   hdd  2.38409 osd.19  up  1.0 1.0
24   hdd  2.67020 osd.24  up  1.0 1.0
25   hdd  1.71649 osd.25  up  1.0 1.0
-4   20.88466 host ceph03
  1   hdd  1.71660 osd.1   up  1.0 1.0
  4   hdd  2.67020 osd.4   up  1.0 1.0
  5   hdd  1.71660 osd.5   up  1.0 1.0
  6   hdd  1.71660 osd.6   up  1.0 1.0
15   hdd  2.67020 osd.15  up  1.0 1.0
17   hdd  1.62109 osd.17  up  1.0 1.0
20   hdd  1.71649 osd.20  up  1.0 1.0
21   hdd  2.67020 osd.21  up  1.0 1.0
27   hdd  1.71649 osd.27  up  1.0 1.0
32   hdd  2.67020 osd.32  up  1.0 1.0

I replaced two OSDs on node ceph01 and ran into "HEALTH_ERR".
My problem: it is waiting for the backfilling process.
Why did I run into HEALTH_ERR? I thought all data would still be available on
at least one more node, or even two:

HEALTH_ERR 343351/10358292 objects misplaced (3.315%); Reduced data
availability: 19 pgs inactive; Degraded data redundancy:
639455/10358292 objects degraded (6.173%), 208 pgs degraded, 204 pgs
undersized; application not enabled on 1 pool(s); 29 slow requests are
blocked > 32 sec. Implicated osds ; 29 stuck requests are blocked >
4096 sec. Implicated osds 2,19,24
OBJECT_MISPLACED 343351/10358292 objects misplaced (3.315%)
PG_AVAILABILITY Reduced data availability: 19 pgs inactive
 pg 0.4 is stuck inactive for 4227.236803, current state
undersized+degraded+remapped+backfilling+peered, last acting [19]
 pg 0.12 is stuck inactive for 4227.267137, current state
undersized+degraded+remapped+backfilling+peered, last acting [13]
 pg 0.1b is stuck inactive for 4198.153642, current state
undersized+degraded+remapped+backfill_wait+peered, last acting [24]
 pg 0.1f is stuck inactive for 4226.574006, current state
undersized+degraded+remapped+backfilling+peered, last acting [19]
 pg 0.61 is stuck inactive for 4227.316336, current state
undersized+degraded+remapped+backfilling+peered, last acting [2]
 pg 0.85 is stuck inactive for 4227.287134, current state
undersized+degraded+remapped+backfill_wait+peered, last acting [13]
 pg 0.88 is stuck inactive for 4197.261935, current state
undersized+degraded+remapped+backfill_wait+peered, last acting [24]
 pg 0.bd is stuck inactive for 4226.607646, current state
undersized+degraded+remapped+backfilling+peered, last acting [2]
 pg 0.fc is stuck inactive for 4226.642664, current state
undersized+degraded+remapped+backfill_wait+peered, last acting [13]
 pg 0.140 is stuck inactive for 4198.277165, current state
undersized+degraded+remapped+backfilling+peered, last acting [2]
 pg 0.16c is stuck inactive for 4198.268985, current state
undersized+degraded+remapped+backfilling+peered, last acting [7]
 pg 0.21f is stuck inactive for 4198.228206, current state
undersized+degraded+remapped+backfilling+peered, last acting [2]
 pg 0.222 is stuck inactive for 4198.241280, current state
undersized+degraded+remapped+backfilling+peered, last acting [2]
 pg 0.27f is stuck inactive for 4198.201034, current state
undersized+degraded+remapped+backfill_wait+peered, last acting [19]
 pg 0.297 is stuck inactive for 4197.247869, current state
undersized+degraded+remapped+backfilling+peered, last acting [24]
 pg 0.298 is stuck inactive for 4226.572652, current state
undersized+degraded+remapped+backfilling+peered, last acting [19]
 pg 0.2cd is stuck inactive for 4226.643455, current state
undersized+degraded+remapped+backfilling+peered, last acting [16]
 pg 0.314 is stuck inactive for 4227.339749, current state

[ceph-users] ceph log level

2019-12-29 Thread Zhenshi Zhou
Hi all,

The OSD servers generate a huge amount of log output. I configured 'debug_osd' to 1/5 or
1/20, but it does not seem to have any effect. Is there another option that overrides
this configuration?

Ceph version: Mimic (13.2.5)
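For reference, a rough sketch of the commands that normally control this (assuming the Mimic config database; note that debug_ms, debug_bluestore and debug_rocksdb can also be noisy even with debug_osd lowered):

# on the OSD host: show the value the daemon is actually running with
ceph daemon osd.0 config show | grep debug_osd

# persist the setting in the Mimic config database
ceph config set osd debug_osd 1/5

# or inject it into the running daemons without a restart
ceph tell osd.* injectargs '--debug_osd 1/5'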

Thanks


Re: [ceph-users] cephfs kernel client io performance decreases extremely

2019-12-29 Thread renjianxinlover
Hi Stefan,
Could you please provide further guidance?
Best regards


renjianxinlover
renjianxinlo...@163.com
On 12/28/2019 21:44, renjianxinlover wrote:
Sorry, what I said before was fuzzy.
Currently, my MDS is running on the same node as some of the OSDs, where an SSD drive
serves as a cache device.


renjianxinlover
renjianxinlo...@163.com
On 12/28/2019 15:49, Stefan Kooman wrote:
Quoting renjianxinlover (renjianxinlo...@163.com):
Hi Nathan, thanks for your quick reply!
The command 'ceph status' outputs warnings, including about ten clients failing to
respond to cache pressure;
in addition, on the MDS node, 'iostat -x 1' shows the drive I/O usage of the MDS over five
seconds as follows,

You should run iostat -x 1 on the OSD nodes ... the MDS is not doing
any I/O in and of itself as far as Ceph is concerned.
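For example, a minimal sketch of what to look at (standard commands; adjust OSD ids and hosts to your cluster):

# on each OSD node: device utilisation and latency
iostat -x 1 5

# from any admin node: commit/apply latency per OSD as Ceph sees it
ceph osd perf

# which pools (and therefore which OSDs) back your CephFS data/metadata
ceph osd pool ls detail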

Gr. Stefan


--
| BIT BV  https://www.bit.nl/  Kamer van Koophandel 09090351
| GPG: 0xD14839C6   +31 318 648 688 / i...@bit.nl


Re: [ceph-users] Benchmark difference between rados bench and rbd bench

2019-12-29 Thread Vitaliy Filippov

rados bench -p scbench 60 seq --io-size 8192 --io-threads 256
Read size:4194304


rados bench doesn't have an --io-size option.

Testing sequential reads with an 8K I/O size is a strange idea anyway, though.
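The "Read size: 4194304" line in the quoted output is the hint: rados bench read back whole 4 MB objects, while rbd bench really issued 8 KiB reads through librbd, so the two runs are not measuring the same thing. A rough sketch of a closer comparison, assuming the pool and image names from the original mail (-b and -t are the options rados bench actually takes):

# write with an 8 KiB block size first, then read the same objects back sequentially
rados bench -p scbench 60 write -b 8192 -t 256 --no-cleanup
rados bench -p scbench 60 seq -t 256

# or run rbd bench with a 4 MiB I/O size, closer to the default rados bench run
rbd bench --io-type read --io-size 4194304 --io-threads 16 \
  --io-total 10G --io-pattern seq rbdbench/bench1_image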

--
With best regards,
  Vitaliy Filippov


[ceph-users] Benchmark difference between rados bench and rbd bench

2019-12-29 Thread Ml Ml
Hello,

Is there a difference between those two benchmarks?
I mean, with rados bench I get the expected results, and rbd bench is far
away from them.
I was not able to see any bottlenecks at all. It looked like with
rados bench the disks did a lot more work than with rbd bench. Any
ideas on this?


rados bench -p scbench 60 seq --io-size 8192 --io-threads 256
...
Total time run:   46.565262
Total reads made: 8021
Read size:4194304
Object size:  4194304
Bandwidth (MB/sec):   689.011
Average IOPS: 172
Stddev IOPS:  14
Max IOPS: 190
Min IOPS: 114
Average Latency(s):   0.0921206
Max latency(s):   1.54236
Min latency(s):   0.0113188



rbd -p rbdbench bench bench1_image  --io-type read --io-size 8192
--io-threads 256 --io-total 10G --io-pattern seq
...
elapsed:   571  ops:  1310720  ops/sec:  2295.01  bytes/sec:
18800700.54 (18MB/sec)


I tried to use the same parameters in order to make the two benchmarks comparable.
Why do I get such different IOPS and MB/s results?

Any ideas on this?

Thanks,
Mario


Re: [ceph-users] HEALTH_ERR, size and min_size

2019-12-29 Thread Stefan Kooman
Quoting Ml Ml (mliebher...@googlemail.com):
> Hello Stefan,
> 
> The status was "HEALTH_OK" before I ran those commands.

\o/

> root@ceph01:~# ceph osd crush rule dump
> [
> {
> "rule_id": 0,
> "rule_name": "replicated_ruleset",
> "ruleset": 0,
> "type": 1,
> "min_size": 1,
> "max_size": 10,
> "steps": [
> {
> "op": "take",
> "item": -1,
> "item_name": "default"
> },
> {
> "op": "chooseleaf_firstn",
> "num": 0,
> "type": "host"


^^ This is the important part ... host as failure domain (not osd), but
that's fine in your case.

Make sure you only remove OSDs within one failure domain (one host) at a time and
you're safe.
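The inactive PGs in your status all show a single OSD in "last acting", i.e. they dropped below min_size = 2, which is why the cluster went to HEALTH_ERR until backfill restores a second copy. For the next replacement, a rough sketch of a drain-first sequence (assuming a release that has "safe-to-destroy" and "purge", i.e. Luminous or newer; adjust the OSD id):

ceph osd out osd.10                        # stop new writes to it, start draining
ceph -s                                    # wait until all PGs are active+clean again
ceph osd safe-to-destroy osd.10            # confirms no PG would lose redundancy
systemctl stop ceph-osd@10
ceph osd purge 10 --yes-i-really-mean-it   # removes CRUSH entry, auth key and OSD id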

Gr. Stefan

-- 
| BIT BV  https://www.bit.nl/  Kamer van Koophandel 09090351
| GPG: 0xD14839C6   +31 318 648 688 / i...@bit.nl


Re: [ceph-users] HEALTH_ERR, size and min_size

2019-12-29 Thread Ml Ml
Hello Stefan,

The status was "HEALTH_OK" before I ran those commands.
I removed the OSDs with:

ceph osd out osd.10
ceph auth del osd.10
systemctl stop ceph-osd@10
ceph osd rm 10
umount /var/lib/ceph/osd/ceph-10
ceph osd crush remove osd.10
dd if=/dev/zero of=/dev/sdc

ceph osd out osd.9
ceph auth del osd.9
systemctl stop ceph-osd@9
ceph osd rm 9
umount /var/lib/ceph/osd/ceph-9
ceph osd crush remove osd.9



root@ceph01:~# ceph osd crush rule dump
[
{
"rule_id": 0,
"rule_name": "replicated_ruleset",
"ruleset": 0,
"type": 1,
"min_size": 1,
"max_size": 10,
"steps": [
{
"op": "take",
"item": -1,
"item_name": "default"
},
{
"op": "chooseleaf_firstn",
"num": 0,
"type": "host"
},
{
"op": "emit"
}
]
}
]

Thanks,
Mario

On Sun, Dec 29, 2019 at 2:16 PM Stefan Kooman  wrote:
>
> Quoting Ml Ml (mliebher...@googlemail.com):
> > Hello List,
> > I have size = 3 and min_size = 2 with 3 nodes.
>
> That's good.
>
> >
> >
> > I replaced two OSDs on node ceph01 and ran into "HEALTH_ERR".
> > My problem: it is waiting for the backfilling process.
> > Why did I run into HEALTH_ERR? I thought all data would still be available on
> > at least one more node, or even two:
>
> How did you replace them? Did you first set them "out" and wait for
> the data to be replicated elsewhere before you removed them?
>
> It *might* be because your CRUSH rule set is replicating over "osd" and
> not "host". What does "ceph osd crush rule dump" show?
>
> Gr. Stefan
>
> --
> | BIT BV  https://www.bit.nl/  Kamer van Koophandel 09090351
> | GPG: 0xD14839C6   +31 318 648 688 / i...@bit.nl


Re: [ceph-users] HEALTH_ERR, size and min_size

2019-12-29 Thread Stefan Kooman
Quoting Ml Ml (mliebher...@googlemail.com):
> Hello List,
> I have size = 3 and min_size = 2 with 3 nodes.

That's good.

> 
> 
> I replaced two OSDs on node ceph01 and ran into "HEALTH_ERR".
> My problem: it is waiting for the backfilling process.
> Why did I run into HEALTH_ERR? I thought all data would still be available on
> at least one more node, or even two:

How did you replace them? Did you first set them "out" and wait for
the data to be replicated elsewhere before you removed them?

It *might* be because your CRUSH rule set is replicating over "osd" and
not "host". What does "ceph osd crush rule dump" show?
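For reference, a few read-only commands that show the relevant settings side by side (nothing here changes the cluster):

ceph osd pool ls detail        # size, min_size and crush_rule per pool
ceph osd crush rule dump       # the failure domain each rule chooses over
ceph osd tree                  # the host buckets those rules choose from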

Gr. Stefan

-- 
| BIT BV  https://www.bit.nl/  Kamer van Koophandel 09090351
| GPG: 0xD14839C6   +31 318 648 688 / i...@bit.nl