Re: [ceph-users] ceph log level
Hi Marc,

Thanks for the script, I'll try it to configure the log levels. I have set
'debug_ms' to 0 so that the cluster no longer generates so many logs; the log
output over the last few days showed mostly network connection information.

Thanks

On Mon, 30 Dec 2019 at 17:17, Marc Roos wrote:

> However, I cannot get rid of these messages.
>
> Dec 30 10:13:10 c02 ceph-mgr: 2019-12-30 10:13:10.343 7f7d3a2f8700 0
> log_channel(cluster) log [DBG] : pgmap v710220:
>
>
> -----Original Message-----
> To: ceph-users; deaderzzs
> Subject: Re: [ceph-users] ceph log level
>
> I am decreasing logging with this script.
>
>
> #!/bin/bash
>
> declare -A logarrosd
> declare -A logarrmon
> declare -A logarrmgr
>
> # default values luminous 12.2.7
> logarrosd[debug_asok]="1/5"
> logarrosd[debug_auth]="1/5"
> logarrosd[debug_buffer]="0/1"
> logarrosd[debug_client]="0/5"
> logarrosd[debug_context]="0/1"
> logarrosd[debug_crush]="1/1"
> logarrosd[debug_filer]="0/1"
> logarrosd[debug_filestore]="1/3"
> logarrosd[debug_finisher]="1/1"
> logarrosd[debug_heartbeatmap]="1/5"
> logarrosd[debug_journal]="1/3"
> logarrosd[debug_journaler]="0/5"
> logarrosd[debug_lockdep]="0/1"
> logarrosd[debug_mds]="1/5"
> logarrosd[debug_mon]="1/5"
> logarrosd[debug_monc]="0/10"
> logarrosd[debug_ms]="0/5"
> logarrosd[debug_objclass]="0/5"
> logarrosd[debug_objectcacher]="0/5"
> logarrosd[debug_objecter]="0/1"
> logarrosd[debug_optracker]="0/5"
> logarrosd[debug_osd]="1/5"
> logarrosd[debug_paxos]="1/5"
> logarrosd[debug_perfcounter]="1/5"
> logarrosd[debug_rados]="0/5"
> logarrosd[debug_rbd]="0/5"
> logarrosd[debug_rgw]="1/5"
> logarrosd[debug_rgw_sync]="1/5"
> logarrosd[debug_throttle]="1/1"
> logarrosd[debug_timer]="0/1"
> logarrosd[debug_tp]="0/5"
>
> logarrosd[debug_mds_balancer]="1/5"
> logarrosd[debug_mds_locker]="1/5"
> logarrosd[debug_mds_log]="1/5"
> logarrosd[debug_mds_log_expire]="1/5"
> logarrosd[debug_mds_migrator]="1/5"
> logarrosd[debug_striper]="0/1"
> logarrosd[debug_rbd_mirror]="0/5"
> logarrosd[debug_rbd_replay]="0/5"
> logarrosd[debug_crypto]="1/5"
> logarrosd[debug_reserver]="1/1"
> logarrosd[debug_civetweb]="1/10"
> logarrosd[debug_javaclient]="1/5"
> logarrosd[debug_xio]="1/5"
> logarrosd[debug_compressor]="1/5"
> logarrosd[debug_bluestore]="1/5"
> logarrosd[debug_bluefs]="1/5"
> logarrosd[debug_bdev]="1/3"
> logarrosd[debug_kstore]="1/5"
> logarrosd[debug_rocksdb]="4/5"
> logarrosd[debug_leveldb]="4/5"
> logarrosd[debug_memdb]="4/5"
> logarrosd[debug_kinetic]="1/5"
> logarrosd[debug_fuse]="1/5"
> logarrosd[debug_mgr]="1/5"
> logarrosd[debug_mgrc]="1/5"
> logarrosd[debug_dpdk]="1/5"
> logarrosd[debug_eventtrace]="1/5"
> logarrmon[debug_asok]="1/5"
> logarrmon[debug_auth]="1/5"
> logarrmon[debug_bdev]="1/3"
> logarrmon[debug_bluefs]="1/5"
> logarrmon[debug_bluestore]="1/5"
> logarrmon[debug_buffer]="0/1"
> logarrmon[debug_civetweb]="1/10"
> logarrmon[debug_client]="0/5"
> logarrmon[debug_compressor]="1/5"
> logarrmon[debug_context]="0/1"
> logarrmon[debug_crush]="1/1"
> logarrmon[debug_crypto]="1/5"
> logarrmon[debug_dpdk]="1/5"
> logarrmon[debug_eventtrace]="1/5"
> logarrmon[debug_filer]="0/1"
> logarrmon[debug_filestore]="1/3"
> logarrmon[debug_finisher]="1/1"
> logarrmon[debug_fuse]="1/5"
> logarrmon[debug_heartbeatmap]="1/5"
> logarrmon[debug_javaclient]="1/5"
> logarrmon[debug_journal]="1/3"
> logarrmon[debug_journaler]="0/5"
> logarrmon[debug_kinetic]="1/5"
> logarrmon[debug_kstore]="1/5"
> logarrmon[debug_leveldb]="4/5"
> logarrmon[debug_lockdep]="0/1"
> logarrmon[debug_mds]="1/5"
> logarrmon[debug_mds_balancer]="1/5"
> logarrmon[debug_mds_locker]="1/5"
> logarrmon[debug_mds_log]="1/5"
> logarrmon[debug_mds_log_expire]="1/5"
> logarrmon[debug_mds_migrator]="1/5"
> logarrmon[debug_memdb]="4/5"
> logarrmon[debug_mgr]="1/5"
> logarrmon[debug_mgrc]="1/5"
> logarrmon[debug_mon]="1/5"
> logarrmon[debug_monc]="0/10"
> logarrmon[debug_ms]="0/0"
> logarrmon[debug_none]="0/5"
> logarrmon[debug_objclass]="0/5"
> logarrmon[debug_objectcacher]="0/5"
> logarrmon[debug_objecter]="0/1"
> logarrmon[debug_optracker]="0/5"
> logarrmon[debug_osd]="1/5"
> logarrmon[debug_paxos]="1/5"
> logarrmon[debug_perfcounter]="1/5"
> logarrmon[debug_rados]="0/5"
> logarrmon[debug_rbd]="0/5"
> logarrmon[debug_rbd_mirror]="0/5"
> logarrmon[debug_rbd_replay]="0/5"
> logarrmon[debug_refs]="0/0"
> logarrmon[debug_reserver]="1/1"
> logarrmon[debug_rgw]="1/5"
> logarrmon[debug_rgw_sync]="1/5"
> logarrmon[debug_rocksdb]="4/5"
> logarrmon[debug_striper]="0/1"
> logarrmon[debug_throttle]="1/1"
> logarrmon[debug_timer]="0/1"
> logarrmon[debug_tp]="0/5"
> logarrmon[debug_xio]="1/5"
>
>
> for osdk in "${!logarrosd[@]}"
> do
>     ceph tell osd.* injectargs --$osdk=0/0
> done
>
> for monk in "${!logarrmon[@]}"
> do
>     ceph tell mon.* injectargs --$monk=0/0
> done
>
>
>
> -----Original Message-----
> From: Zhenshi Zhou [mailto:deader...@gmail.com]
> Sent: 30 December 2019 05:41
> To: ceph-users
> Subject: [ceph-users] ceph log level
>
> Hi all,
>
> OSD servers generate a huge number of logs. I
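The first number in each debug pair is the level written to the log file, the
second the level kept in the in-memory log that gets dumped on a crash. A
minimal sketch of checking and lowering a single subsystem such as debug_ms;
the daemon name osd.0 and the ceph.conf snippet are illustrative, not taken
from the thread above:

# Query the current value via the admin socket (run on the host where osd.0 lives)
ceph daemon osd.0 config get debug_ms

# Lower it at runtime on all OSDs; injectargs changes do not survive a daemon restart
ceph tell osd.* injectargs --debug_ms=0/0

# To make the change persistent, set it in ceph.conf on each node, e.g.:
# [osd]
#     debug ms = 0/0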
Re: [ceph-users] Architecture - Recommendations
Regards,
Radha Krishnan S
TCS Enterprise Cloud Practice
Tata Consultancy Services
Cell: +1 848 466 4870
Mailto: radhakrishnan...@tcs.com
Website: http://www.tcs.com

-----"Stefan Kooman" wrote: -----
To: "Radhakrishnan2 S"
From: "Stefan Kooman"
Date: 12/31/2019 07:25AM
Cc: ceph-users@lists.ceph.com, "ceph-users"
Subject: Re: [ceph-users] Architecture - Recommendations

Hi,

> Radha: I'm sure we are using BGP EVPN over VXLAN, but all deployments
> are through the infrastructure management network. We are a CSP, and
> overlay means tenant network; if the ceph nodes are in the overlay, then
> multiple tenants will need to be able to communicate with the ceph nodes.
> If the LB is outside the ceph network, let's say XaaS, won't routing
> across networks create a bottleneck? I'm a novice in networking, so if
> you can help with a reference architecture, it would be of help.

https://vincent.bernat.ch/en/blog/2017-vxlan-bgp-evpn

And

https://vincent.bernat.ch/en/blog/2018-l3-routing-hypervisor

Where hypervisor would be your Ceph nodes. I.e. you can connect your Ceph
nodes on L2 or make them part of the L3 setup (the more modern way of doing
it). You can use "ECMP" to add more network capacity when you need it.

Setting up a BGP EVPN VXLAN network is not trivial ... I advise getting
networking expertise on your team.

Radha: Thanks for the reference. We are planning to have a dedicated set of
nodes for our ceph cluster and not make it hyperconverged. Do you see that as
a recommended option? Since we might also have bare-metal servers for
workloads, we want to keep the storage as a separate, dedicated cluster.

Gr. Stefan

--
| BIT BV  https://www.bit.nl/  Kamer van Koophandel 09090351
| GPG: 0xD14839C6  +31 318 648 688 / i...@bit.nl
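A short sketch of how an L3 (routing-to-the-host) setup could be sanity-checked
from a Ceph node; it assumes FRR runs on the node and the prefix shown is a
placeholder, neither of which comes from this thread:

# Several nexthops on the route towards the cluster network indicate ECMP is in use
ip route show 10.0.0.0/24

# If FRR provides BGP on the node (assumption), check peer/session state
vtysh -c 'show bgp summary'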
Re: [ceph-users] Architecture - Recommendations
Hi Stefan - thanks once again for taking the time to explain.

Regards,
Radha Krishnan S
TCS Enterprise Cloud Practice
Tata Consultancy Services
Cell: +1 848 466 4870
Mailto: radhakrishnan...@tcs.com
Website: http://www.tcs.com

-----"Stefan Kooman" wrote: -----
To: "Radhakrishnan2 S"
From: "Stefan Kooman"
Date: 12/31/2019 07:41AM
Cc: ceph-users@lists.ceph.com, "ceph-users"
Subject: Re: [ceph-users] Architecture - Recommendations

Quoting Radhakrishnan2 S (radhakrishnan...@tcs.com):

> In addition, about putting all kinds of disks in one box: putting all
> drives in one box was done for two reasons,
>
> 1. Avoid CPU choking

This depends only on what kind of hardware you select and how you configure
it. You can (if need be) restrict the number of CPUs the ceph daemons get,
with cgroups for example ... (or use containers).

Radha: Containers are definitely in the pipeline, with just ephemerals and no
persistent storage, but we want to take baby steps and not do everything at
once. At the moment we have dual sockets on the OSD nodes, a single socket on
the monitor nodes and again dual sockets on the gateway nodes, with 18 cores
per socket. I'm planning to increase that to 20 cores in our production
deployment, as our NVMe drives are going to have 4 OSDs per drive.

> 2. Example: If my cluster has 20 nodes in total, then all 20 nodes will
> have NVMe, SSD and NL-SAS; this way I'll get more capacity and performance
> compared to homogeneous nodes. If I have to break the 20 nodes into 5
> NVMe-based, 5 SSD-based and the remaining 10 as spindle-based with NVMe
> acting as bcache, then I'm restricting the count of drives and thereby
> lowering IO density / performance. Please advise in detail based on your
> production deployments.

The drawback of all types of disk in one box is that all pools in your
cluster are affected when one node goes down. If your storage needs change in
the future then it does not make sense to buy similar boxes. I.e. it's
cheaper to buy dedicated boxes for, say, spinners only, if you end up needing
that (lower CPU requirements, cheaper boxes). You need to decide if you want
max performance or max capacity.

Radha: Performance is important from the block storage offering perspective.
We have Rep3 pools for NVMe and SSD each, and EC 6+4 for the spinning-media
pool with NVMe as bcache. The spinning-media pool will host both S3 and a
Tier 3 block storage target. With that said, performance is extremely
important for the NVMe and SSD pools, while it's equally needed for the
Tier 3 / S3 based pool as well. I'm not against going to a homogeneous set of
nodes, just worried about the reduction in IO density. Our current model is
4 NVMe, 10 SSD and 12 HDD per node.

More smaller nodes means the overall impact when one node fails is much
smaller. Just check what your budget allows you to buy with "all-in-one"
boxes versus "dedicated" boxes.

Are you planning on dedicated monitor nodes (I would definitely do that)?

Radha: Yes, we have 3 physical nodes dedicated to monitors, and we plan to
increase the monitor count to 5 once the cluster grows beyond 50 nodes.

Gr. Stefan

--
| BIT BV  https://www.bit.nl/  Kamer van Koophandel 09090351
| GPG: 0xD14839C6  +31 318 648 688 / i...@bit.nl
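For the pool layout described above (Rep3 pools per flash device class plus an
EC 6+4 pool on spinners), a hedged sketch of how it could be expressed with
CRUSH device classes; pool names, PG counts and the host failure domain are
assumptions, not details of this deployment:

# Replicated rule and pool on the NVMe device class (names/PG counts are examples)
ceph osd crush rule create-replicated rep_nvme default host nvme
ceph osd pool create block-nvme 1024 1024 replicated rep_nvme
ceph osd pool set block-nvme size 3

# EC 6+4 profile restricted to HDD OSDs, and a pool on top of it
ceph osd erasure-code-profile set ec64_hdd k=6 m=4 crush-device-class=hdd crush-failure-domain=host
ceph osd pool create archive-ec 1024 1024 erasure ec64_hdd

Note that an EC 6+4 profile with a host failure domain needs at least 10 hosts
to place every PG.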
Re: [ceph-users] Architecture - Recommendations
Quoting Radhakrishnan2 S (radhakrishnan...@tcs.com):

> In addition, about putting all kinds of disks in one box: putting all
> drives in one box was done for two reasons,
>
> 1. Avoid CPU choking

This depends only on what kind of hardware you select and how you configure
it. You can (if need be) restrict the number of CPUs the ceph daemons get,
with cgroups for example ... (or use containers).

> 2. Example: If my cluster has 20 nodes in total, then all 20 nodes will
> have NVMe, SSD and NL-SAS; this way I'll get more capacity and performance
> compared to homogeneous nodes. If I have to break the 20 nodes into 5
> NVMe-based, 5 SSD-based and the remaining 10 as spindle-based with NVMe
> acting as bcache, then I'm restricting the count of drives and thereby
> lowering IO density / performance. Please advise in detail based on your
> production deployments.

The drawback of all types of disk in one box is that all pools in your
cluster are affected when one node goes down. If your storage needs change in
the future then it does not make sense to buy similar boxes. I.e. it's
cheaper to buy dedicated boxes for, say, spinners only, if you end up needing
that (lower CPU requirements, cheaper boxes). You need to decide if you want
max performance or max capacity.

More smaller nodes means the overall impact when one node fails is much
smaller. Just check what your budget allows you to buy with "all-in-one"
boxes versus "dedicated" boxes.

Are you planning on dedicated monitor nodes (I would definitely do that)?

Gr. Stefan

--
| BIT BV  https://www.bit.nl/  Kamer van Koophandel 09090351
| GPG: 0xD14839C6  +31 318 648 688 / i...@bit.nl
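A minimal sketch of the cgroup-based CPU capping mentioned above, using the
systemd unit name a packaged OSD normally gets; the 400% quota (roughly four
cores) is an arbitrary example, not a recommendation from this thread:

# Cap one OSD daemon via its systemd cgroup
systemctl set-property ceph-osd@0.service CPUQuota=400%

# Verify the resulting setting
systemctl show ceph-osd@0.service -p CPUQuota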
Re: [ceph-users] Architecture - Recommendations
Hi,

> Radha: I'm sure we are using BGP EVPN over VXLAN, but all deployments
> are through the infrastructure management network. We are a CSP, and
> overlay means tenant network; if the ceph nodes are in the overlay, then
> multiple tenants will need to be able to communicate with the ceph nodes.
> If the LB is outside the ceph network, let's say XaaS, won't routing
> across networks create a bottleneck? I'm a novice in networking, so if
> you can help with a reference architecture, it would be of help.

https://vincent.bernat.ch/en/blog/2017-vxlan-bgp-evpn

And

https://vincent.bernat.ch/en/blog/2018-l3-routing-hypervisor

Where hypervisor would be your Ceph nodes. I.e. you can connect your Ceph
nodes on L2 or make them part of the L3 setup (the more modern way of doing
it). You can use "ECMP" to add more network capacity when you need it.

Setting up a BGP EVPN VXLAN network is not trivial ... I advise getting
networking expertise on your team.

Gr. Stefan

--
| BIT BV  https://www.bit.nl/  Kamer van Koophandel 09090351
| GPG: 0xD14839C6  +31 318 648 688 / i...@bit.nl
Re: [ceph-users] cephfs kernel client io performance decreases extremely
Quoting renjianxinlover (renjianxinlo...@163.com):

> Hi Stefan,
> could you please provide further guidance?

https://docs.ceph.com/docs/master/cephfs/troubleshooting/#slow-requests-mds

Do a "dump ops in flight" to see what's going on on the MDS.

https://docs.ceph.com/docs/master/cephfs/troubleshooting/#kernel-mount-debugging

^^ Check out what the kernel is doing.

What kernel version are you using? Newer is better ...

Does the MDS report slow requests and / or slow metadata requests?

Gr. Stefan

--
| BIT BV  https://www.bit.nl/  Kamer van Koophandel 09090351
| GPG: 0xD14839C6  +31 318 648 688 / i...@bit.nl
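A short sketch of the checks suggested above; the MDS name 'a' is a
placeholder, and the debugfs paths require debugfs to be mounted on the
client:

# On the active MDS host: list operations currently in flight
ceph daemon mds.a dump_ops_in_flight

# Cluster-wide view of slow request warnings
ceph health detail

# On the client: kernel version, plus in-flight MDS and OSD requests of the kernel mount
uname -r
cat /sys/kernel/debug/ceph/*/mdsc
cat /sys/kernel/debug/ceph/*/osdc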