[ceph-users] BLUEFS_SPILLOVER BlueFS spillover detected

2020-11-13 Thread Zhenshi Zhou
Hi, I have a cluster on 14.2.8. I created OSDs with a dedicated PCIe device for wal/db when deploying the cluster. I set 72G for db and 3G for wal on each OSD. And now my cluster has been in a WARN state for a long time. # ceph health detail HEALTH_WARN BlueFS spillover detected on 1 OSD(s) BLUEFS_SPILL

[ceph-users] Re: monitor sst files continue growing

2020-11-13 Thread Zhenshi Zhou
er wrote on Fri, Oct 30, 2020 at 4:39 PM: > > > On 29/10/2020 19:29, Zhenshi Zhou wrote: > > Hi Alex, > > > > We found that there were a huge number of keys in the "logm" and "osdmap" > > table > > while using ceph-monstore-tool. I think that could b

[ceph-users] Re: BLUEFS_SPILLOVER BlueFS spillover detected

2020-11-15 Thread Zhenshi Zhou
Has anyone met this issue yet? Zhenshi Zhou wrote on Sat, Nov 14, 2020 at 12:36 PM: > Hi, > > I have a cluster of 14.2.8. > I created OSDs with dedicated PCIE for wal/db when deployed the cluster. > I set 72G for db and 3G for wal on each OSD. > > And now my cluster is in a WARN stat

[ceph-users] Re: BLUEFS_SPILLOVER BlueFS spillover detected

2020-11-15 Thread Zhenshi Zhou
Well, the warning message disappeared after I executed "ceph tell osd.63 compact". Zhenshi Zhou wrote on Mon, Nov 16, 2020 at 10:04 AM: > Has anyone met this issue yet? > > Zhenshi Zhou wrote on Sat, Nov 14, 2020 at 12:36 PM: > >> Hi, >> >> I have a cluster of 14.2.8. >> I cr
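
For reference, a minimal sketch of the check-and-compact sequence used here, assuming osd.63 is the OSD flagged by "ceph health detail" (compaction clears the warning, but spillover returns if the DB volume is genuinely too small):

    ceph health detail          # lists the OSD(s) with BlueFS spillover
    ceph tell osd.63 compact    # compact that OSD's RocksDB so data moves back to the fast device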

[ceph-users] issue on adding SSD to SATA cluster for db/wal

2020-12-15 Thread Zhenshi Zhou
Hi all, I have a 14.2.15 cluster with all SATA OSDs. Now we plan to add SSDs to the cluster for db/wal usage. I checked the docs and found that the 'ceph-bluestore-tool' command can deal with this. I added db/wal to the osd in my test environment, but in the end it still gets the warning message. "os
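
A hedged sketch of the procedure discussed in this thread, assuming osd.2 and an LV on the new SSD prepared for the DB (all paths and names are placeholders); the LVM tags shown in the follow-up message may also need updating so ceph-volume activates the OSD with the new db device:

    systemctl stop ceph-osd@2
    ceph-bluestore-tool bluefs-bdev-new-db --path /var/lib/ceph/osd/ceph-2 \
        --dev-target /dev/ceph-db-vg/osd-db-2
    # optionally migrate the RocksDB data already sitting on the slow device
    ceph-bluestore-tool bluefs-bdev-migrate --path /var/lib/ceph/osd/ceph-2 \
        --devs-source /var/lib/ceph/osd/ceph-2/block \
        --dev-target /var/lib/ceph/osd/ceph-2/block.db
    systemctl start ceph-osd@2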

[ceph-users] Re: issue on adding SSD to SATA cluster for db/wal

2020-12-15 Thread Zhenshi Zhou
_name=ceph,ceph.crush_device_class=None,ceph.db_device=/dev/ceph-5d92c94b-9dde-4d38-ba2a-0f3e766162d1/osd-db-add92714-e279-468e-b0a2-dca494fbd5bd,ceph.db_uuid=add92714-e279-468e-b0a2-dca494fbd5bd,ceph.encrypted=0,ceph.osd_fsid=e5ec5f6b-333b-42d5-b760-175c1c528f8a,ceph.osd_id=2,ceph.osdspec_affinit

[ceph-users] MDS stuck in replay/resolve states

2021-03-15 Thread Zhenshi Zhou
Hi, I encountered an issue lately. I have a cephfs cluster on 14.2.11 with 5 active MDS and 5 standby-replay MDS. The metadata pool is on SSD and the data pool is on SATA. 2 of the MDS daemons restart frequently, and the replay MDS gets stuck in the replay and resolve states and never becomes active. What's wrong with my MDS? *the re

[ceph-users] What exactly does the number of monitors depend on

2022-01-27 Thread Zhenshi Zhou
Hi all, What has confused me recently is how to decide the number of monitors in a cluster. In my opinion, it depends on the size of the cluster, while my colleague says it should be at least 5 in a cluster. He sent me the Red Hat document

[ceph-users] Re: [Warning Possible spam] Re: What exactly does the number of monitors depend on

2022-01-27 Thread Zhenshi Zhou
Thanks guys for the explanation :) Anthony D'Atri wrote on Fri, Jan 28, 2022 at 02:52: > Agreed. > > I’ve been in a situation where I wasn’t able to stock spare chassis > because of ostensible 2-hour turnaround from vendor support. Which was > actually 2 hours *from when they agreed to replace/repair*, and i

[ceph-users] What is ceph doing after sync

2020-05-13 Thread Zhenshi Zhou
Hi, I deployed a multi-site in order to sync data from one cluster to another. The data is fully synced (I suppose) and the cluster has no traffic at present. Everything seems fine. However, the sync status is not what I expected. Is there any step after the data transfer? Can I change the master zone to

[ceph-users] Re: all VMs in compute node openstack connecting to this ceph cluster error connect after run command ceph osd set-require-min-compat-client luminous

2020-05-14 Thread Zhenshi Zhou
What do the commands "ceph osd dump | grep min_compat_client" and "ceph features" output? Eugen Block wrote on Thu, May 14, 2020 at 3:17 PM: > Can you share what you have tried so far? It's unclear at which point > it's failing so I'd suggest to stop the instances, restart > nova-compute.service and then start inst

[ceph-users] Re: Migrating clusters (and versions)

2020-05-14 Thread Zhenshi Zhou
rbd-mirror can work on a single image in the pool. And I did a test on image copy from 13.2 to 14.2; however, new data in the source image wasn't copied to the destination image. I'm not sure if this is normal. Kees Meijs wrote on Thu, May 14, 2020 at 3:24 PM: > I need to mirror single RBDs while rbd-mirror:

[ceph-users] Re: all VMs in compute node openstack connecting to this ceph cluster error connect after run command ceph osd set-require-min-compat-client luminous

2020-05-14 Thread Zhenshi Zhou
The doc says "This subcommand will fail if any connected daemon or client is not compatible with the features offered by the given ". The command could be done if the client is disconnected, I guess. wrote on Thu, May 14, 2020 at 4:50 PM: > HI > ___ > ceph-users mail

[ceph-users] Re: all VMs in compute node openstack connecting to this ceph cluster error connect after run command ceph osd set-require-min-compat-client luminous

2020-05-14 Thread Zhenshi Zhou
You should find this client and deal with it. Zhenshi Zhou wrote on Thu, May 14, 2020 at 4:56 PM: > The doc says "This subcommand will fail if any connected daemon or client > is not compatible with the features offered by the given ". The > command > could be done if the client is disconn

[ceph-users] Re: Using rbd-mirror in existing pools

2020-05-14 Thread Zhenshi Zhou
In my experience, rbd-mirror only copies the images with the journaling feature from clusterA to clusterB. It doesn't influence the other images in the pool on clusterB. You'd better test it. Kees Meijs | Nefos wrote on Thu, May 14, 2020 at 10:22 PM: > Hi list, > > Thanks again for pointing me to

[ceph-users] Re: nfs migrate to rgw

2020-05-18 Thread Zhenshi Zhou
. But how about the small files of around 50KB? Does rgw handle small files well? Wido den Hollander wrote on Tue, May 12, 2020 at 2:41 PM: > > > On 5/12/20 4:22 AM, Zhenshi Zhou wrote: > > Hi all, > > > > We have several nfs servers providing file storage. There is a nginx in > >

[ceph-users] Re: nfs migrate to rgw

2020-05-18 Thread Zhenshi Zhou
Awesome, thanks a lot! I'll try it. Paul Emmerich wrote on Mon, May 18, 2020 at 8:53 PM: > > On Mon, May 18, 2020 at 1:52 PM Zhenshi Zhou wrote: > >> >> 50KB, and much video files around 30MB. The amount of the files is more >> than >> 1 million. Maybe I can find a way

[ceph-users] remove secondary zone from multisite

2020-05-21 Thread Zhenshi Zhou
Hi all, I'm going to take my secondary zone offline. How do I remove the secondary zone from a multisite? ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io

[ceph-users] Re: remove secondary zone from multisite

2020-05-24 Thread Zhenshi Zhou
Has anyone dealt with this? Can I just remove the secondary zone from the cluster? I'm not sure if this action has any effect on the master zone. Thanks. Zhenshi Zhou wrote on Fri, May 22, 2020 at 11:22 AM: > Hi all, > > I'm gonna make my secondary zone offline. > How to remove the secondar
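
A hedged outline of removing a secondary zone (zonegroup and zone names are placeholders); the zonegroup and period commands run against the master zone, and the zone itself is only deleted on the secondary cluster afterwards:

    radosgw-admin sync status                                                 # confirm nothing is still pending
    radosgw-admin zonegroup remove --rgw-zonegroup=<zonegroup> --rgw-zone=<secondary-zone>
    radosgw-admin period update --commit
    radosgw-admin zone delete --rgw-zone=<secondary-zone>                     # on the secondary cluster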

[ceph-users] Re: RGW Multi-site Issue

2020-05-25 Thread Zhenshi Zhou
Hi Sailaja, Maybe you can try restarting rgw on the master zone before committing the period on the secondary zone. Sailaja Yedugundla wrote on Mon, May 25, 2020 at 11:24 PM: > I am also facing the same problem. Did you find any solution? > ___ > ceph-users mailing list -- ceph-use

[ceph-users] Re: RGW Multisite metadata sync

2020-05-25 Thread Zhenshi Zhou
Did you restart the rgw service on the master zone? Sailaja Yedugundla wrote on Tue, May 26, 2020 at 1:09 AM: > Hi, > I am trying to setup multisite cluster with 2 sites. I created master > zonegroup and zone by following the instructions given in the > documentation. On the secondary zone cluster I could pull the mas

[ceph-users] Re: RGW Multisite metadata sync

2020-05-25 Thread Zhenshi Zhou
I did encounter the same issue. I found that I had missed the restart step, and after restarting the rgw I could commit the period. What's more, I renamed the default zone as well as the zonegroup. Sailaja Yedugundla wrote on Tue, May 26, 2020 at 11:06 AM: > Yes. I restarted the rgw service on master zone before committing

[ceph-users] looking for telegram group in English or Chinese

2020-05-25 Thread Zhenshi Zhou
Hi all, Is there any telegram group for communicating with ceph users? ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io

[ceph-users] Re: looking for telegram group in English or Chinese

2020-05-26 Thread Zhenshi Zhou
roit.io > YouTube: https://goo.gl/PGE1Bx > > > On Wed, May 27, 2020 at 07:07, Konstantin Shalygin < > k0...@k0ste.ru> wrote: > >> On 5/26/20 1:13 PM, Zhenshi Zhou wrote: >> > Is there any telegram group for communicating with ceph users? >> >&g

[ceph-users] Re: Best way to change bucket hierarchy

2020-06-03 Thread Zhenshi Zhou
I think 'chassis' is OK. If you change host to chassis, you should have a chassis declaration in the crushmap, as osds and hosts do. For example, commands like "ceph osd crush add-bucket chassis-1 chassis" and "ceph osd crush move host-1 chassis=chassis-1" should be executed (see the sketch below).
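
A minimal sketch of the commands above, plus the step that attaches the new chassis bucket to the CRUSH root (bucket and host names are examples):

    ceph osd crush add-bucket chassis-1 chassis      # create the chassis bucket
    ceph osd crush move chassis-1 root=default       # hang it under the root
    ceph osd crush move host-1 chassis=chassis-1     # move the host beneath it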

[ceph-users] Re: Help! ceph-mon is blocked after shutting down and ip address changed

2020-06-03 Thread Zhenshi Zhou
Did you change mon_host in ceph.conf when you set the IP back to 192.168.0.104? I changed monitor IPs in a live cluster, but I had 3 mons and I modified only 1 IP at a time and then submitted the monmap. wrote on Fri, May 29, 2020 at 11:55 PM: > ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba) nautil
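
For a single or blocked monitor, the usual recovery is to rewrite the monmap offline; a hedged sketch, assuming the mon id is mon-a and the wanted address is 192.168.0.104 (stop the mon first, and fix mon_host in ceph.conf as well):

    ceph-mon -i mon-a --extract-monmap /tmp/monmap
    monmaptool --print /tmp/monmap
    monmaptool --rm mon-a /tmp/monmap
    monmaptool --add mon-a 192.168.0.104:6789 /tmp/monmap
    ceph-mon -i mon-a --inject-monmap /tmp/monmap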

[ceph-users] Re: RBD logs

2020-06-03 Thread Zhenshi Zhou
Maybe you could try moving the options to the global or osd section. 陈旭 wrote on Fri, May 29, 2020 at 11:29 PM: > Hi guys, I deploy an efk cluster and use ceph as block storage in > kubernetes, but RBD write iops sometimes becomes zero and last for a few > minutes. I want to check logs about RBD so I add some confi

[ceph-users] rbd-mirror sync image continuously or only sync once

2020-06-03 Thread Zhenshi Zhou
Hi all, I'm going to deploy rbd-mirror in order to sync an image from clusterA to clusterB. The image will be in use while syncing. I'm not sure whether rbd-mirror will sync the image continuously or not. If not, I will tell clients not to write data to it. Thanks. Regards
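
A hedged sketch of one-way, journal-based mirroring as discussed in this thread (pool "rbd" and image "img1" are placeholders; the rbd-mirror daemon runs on the backup cluster):

    rbd mirror pool enable rbd image                       # image mode, on both clusters
    rbd mirror pool peer add rbd client.mirror@clusterA    # on clusterB, point at the primary cluster
    rbd feature enable rbd/img1 journaling                 # journaling is required for continuous replay
    rbd mirror image enable rbd/img1
    rbd mirror image status rbd/img1                       # check state and replay progress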

[ceph-users] Re: rbd-mirror sync image continuously or only sync once

2020-06-04 Thread Zhenshi Zhou
ry image to the remote image (if the rbd journal feature is > enabled). > > > Quoting Zhenshi Zhou: > > > Hi all, > > > > I'm gonna deploy a rbd-mirror in order to sync image from clusterA to > > clusterB. > > The image will be used while syncing

[ceph-users] Re: rbd-mirror sync image continuously or only sync once

2020-06-04 Thread Zhenshi Zhou
e I fully understand what you're asking, > maybe you could rephrase your question? > > > Quoting Zhenshi Zhou: > > > Hi Eugen, > > > > Thanks for the reply. If rbd-mirror constantly synchronize changes, > > what frequency to replay once? I don

[ceph-users] Re: rbd-mirror sync image continuously or only sync once

2020-06-04 Thread Zhenshi Zhou
Thank you for the clarification. That's very clear. Jason Dillaman wrote on Fri, Jun 5, 2020 at 12:46 AM: > On Thu, Jun 4, 2020 at 3:43 AM Zhenshi Zhou wrote: > > > > My condition is that the primary image being used while rbd-mirror sync. > > I want to get the period between t

[ceph-users] Re: rbd-mirror sync image continuously or only sync once

2020-06-08 Thread Zhenshi Zhou
ere is an interval between the synchronization. I use version 14.2.9 and I deployed a one-direction mirror. Zhenshi Zhou wrote on Fri, Jun 5, 2020 at 10:22 AM: > Thank you for the clarification. That's very clear. > > Jason Dillaman wrote on Fri, Jun 5, 2020 at 12:46 AM: > >> On Thu, Jun 4, 2020 at 3:43 A

[ceph-users] Re: rbd-mirror sync image continuously or only sync once

2020-06-08 Thread Zhenshi Zhou
l lose at most 30s of data 2. I stopped syncing at 11:02, while the data from rbd_blk on clusterB is not newer than 10:50. Did I have the wrong steps in the switching process? Zhenshi Zhou wrote on Tue, Jun 9, 2020 at 8:57 AM: > Well, I'm afraid that the image didn't replay continuously, whic

[ceph-users] Re: rbd-mirror sync image continuously or only sync once

2020-06-09 Thread Zhenshi Zhou
es less data than that of clusterA. I tried both image mode and pool mode. Zhenshi Zhou wrote on Tue, Jun 9, 2020 at 11:41 AM: > I have just done a test on rbd-mirror. Follow the steps: > 1. deploy two new clusters, clusterA and clusterB > 2. configure one-way replication from clusterA to clusterB with rbd

[ceph-users] Re: rbd-mirror sync image continuously or only sync once

2020-06-09 Thread Zhenshi Zhou
I did promote the non-primary image; otherwise I couldn't disable the image mirror. Jason Dillaman wrote on Tue, Jun 9, 2020 at 7:19 PM: > On Mon, Jun 8, 2020 at 11:42 PM Zhenshi Zhou wrote: > > > > I have just done a test on rbd-mirror. Follow the steps: > > 1. deploy two new clusters, c

[ceph-users] Re: rbd-mirror sync image continuously or only sync once

2020-06-09 Thread Zhenshi Zhou
n Dillaman wrote on Tue, Jun 9, 2020 at 7:48 PM: > On Tue, Jun 9, 2020 at 7:26 AM Zhenshi Zhou wrote: > > > > I did promote the non-primary image, or I couldn't disable the image > mirror. > > OK, that means that 100% of the data was properly transferred since it > needs to

[ceph-users] Re: radosgw-admin sync status output

2020-06-10 Thread Zhenshi Zhou
It seems normal because my multisite's status looks like that too. I'm curious about the output as well. wrote on Thu, Jun 11, 2020 at 1:09 AM: > All; > > We've been running our Ceph clusters (Nautilus / 14.2.8) for a while now > (roughly 9 months), and I've become curious about the output of the > "radosgw-ad

[ceph-users] Re: rbd-mirror sync image continuously or only sync once

2020-06-10 Thread Zhenshi Zhou
Well, I'm sure that there was something which triggered the synchronization, but that's definitely not the CLI command 'echo 3 > /proc/sys/vm/drop_caches'. So how can I trigger the delta transfer manually? Zhenshi Zhou wrote on Tue, Jun 9, 2020 at 8:02 PM: > It claimed error when

[ceph-users] Re: Is there a way to force sync metadata in a multisite cluster

2020-06-11 Thread Zhenshi Zhou
If there were some buckets on the secondary zone before you deployed the multisite, they won't be synced to the master zone. In this case I think it's normal that the bucket number is not the same. 黄明友 wrote on Fri, Jun 12, 2020 at 10:17 AM: > > > Hi,all: > > the slave zone show metadata is caugh

[ceph-users] Re: ceph grafana dashboards: rbd overview empty

2020-06-14 Thread Zhenshi Zhou
Yep, you also need to tell the mgr which pools to export RBD statistics for. Follow this: https://ceph.io/rbd/new-in-nautilus-rbd-performance-monitoring/ Marc Roos wrote on Fri, Jun 12, 2020 at 10:33 PM: > > The grafana dashboard 'rbd overview' is empty. Queries have measurements > 'ceph_rbd_write_ops' th
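
Per the linked article, the mgr has to be told which pools to gather per-image RBD statistics for; a minimal example (pool names are placeholders):

    ceph config set mgr mgr/prometheus/rbd_stats_pools "rbd,libvirt-pool"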

[ceph-users] Re: poor cephFS performance on Nautilus 14.2.9 deployed by ceph_ansible

2020-06-15 Thread Zhenshi Zhou
I have encountered an issue where clients hang when opening a file. Besides, any other client that visited the same file or directory hung as well. The only way to resolve it was rebooting the client server. This happened on the kernel client only, Luminous version. After that I chose the fuse client excep

[ceph-users] Re: mount cephfs with autofs

2020-06-15 Thread Zhenshi Zhou
The systemd autofs will mount cephfs successfully, with both kernel and fuse clients. Marc Roos wrote on Mon, Jun 15, 2020 at 6:44 PM: > > > Thanks for these I was missing the x-systemd. entries. I assume these > are necessary so booting does not 'hang' on trying to mount these? I > thought the _netdev was for

[ceph-users] Re: Should the fsid in /etc/ceph/ceph.conf match the ceph_fsid in /var/lib/ceph/osd/ceph-*/ceph_fsid?

2020-06-15 Thread Zhenshi Zhou
Yep, I think the ceph_fsid tells OSDs how to recognize the cluster. It should be the same as the fsid in ceph.conf. wrote on Tue, Jun 16, 2020 at 6:28 AM: > I am having a problem on my cluster where OSDs on one host are down after > reboot. When I run ceph-disk activate-all I get an error message stating > "No

[ceph-users] Re: enabling pg_autoscaler on a large production storage?

2020-06-16 Thread Zhenshi Zhou
I did this on my cluster and a huge number of PGs were rebalanced. I think setting this option to 'on' is a good idea if it's a brand new cluster. Dan van der Ster wrote on Tue, Jun 16, 2020 at 7:07 PM: > Could you share the output of > > ceph osd pool ls detail > > ? > > This way we can see how the po
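
A cautious way to enable the autoscaler on an existing cluster, as suggested in this thread: preview what it would do, or set warn-only per pool first (pool name is a placeholder):

    ceph osd pool autoscale-status                        # preview target PG counts
    ceph osd pool set mypool pg_autoscale_mode warn       # only warn, do not rebalance
    ceph osd pool set mypool pg_autoscale_mode on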

[ceph-users] fault tolerant about erasure code pool

2020-06-26 Thread Zhenshi Zhou
Hi all, I'm going to deploy a cluster with an erasure-coded pool for cold storage. There are 3 servers for me to set up the cluster, 12 OSDs on each server. Does that mean the data is safe while 1/3 of the cluster's OSDs are down, or only while 2 of the OSDs are down, if I set the EC profile with k=4 and m=2

[ceph-users] Re: Bluestore performance tuning for hdd with nvme db+wal

2020-06-26 Thread Zhenshi Zhou
From my point of view, it's better to have no more than 6 OSD wal/dbs on 1 NVMe. I think that's the root cause of the slow requests, maybe. Mark Kirkwood wrote on Fri, Jun 26, 2020 at 7:47 AM: > Progress update: > > - tweaked debug_rocksdb to 1/5. *possibly* helped, fewer slow requests > > - will increase osd_m

[ceph-users] Re: fault tolerant about erasure code pool

2020-06-26 Thread Zhenshi Zhou
Hi Janne, I use the default profile (k=2, m=1) and set failure-domain=host; is that best practice? Janne Johansson wrote on Fri, Jun 26, 2020 at 4:59 PM: > On Fri, Jun 26, 2020 at 10:32, Zhenshi Zhou wrote: > >> Hi all, >> >> I'm going to deploy a cluster with erasure code pool for

[ceph-users] Re: fault tolerant about erasure code pool

2020-06-26 Thread Zhenshi Zhou
Hi Lindsay, I have only 3 hosts; is there any method to set up an EC pool in a better way? Lindsay Mathieson wrote on Fri, Jun 26, 2020 at 6:03 PM: > On 26/06/2020 6:31 pm, Zhenshi Zhou wrote: > > I'm going to deploy a cluster with erasure code pool for cold storage. > > There are
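
A hedged sketch of the two options discussed for a three-host cluster: k=2,m=1 with failure-domain=host survives one whole host going down, whereas k=4,m=2 only fits if the failure domain is lowered to osd, in which case a host failure can take out more than m chunks of a PG (profile and pool names are examples):

    ceph osd erasure-code-profile set ec-2-1 k=2 m=1 crush-failure-domain=host
    ceph osd erasure-code-profile set ec-4-2 k=4 m=2 crush-failure-domain=osd
    ceph osd pool create ecpool 64 64 erasure ec-2-1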

[ceph-users] Re: fault tolerant about erasure code pool

2020-06-26 Thread Zhenshi Zhou
I will give it a try, thanks :) Lindsay Mathieson wrote on Fri, Jun 26, 2020 at 7:07 PM: > On 26/06/2020 8:08 pm, Zhenshi Zhou wrote: > > Hi Lindsay, > > > > I have only 3 hosts, and is there any method to set a EC pool cluster > > in a better way > > There's failure domai

[ceph-users] Re: fault tolerant about erasure code pool

2020-06-27 Thread Zhenshi Zhou
o:lindsay.mathie...@gmail.com] > Sent: Friday, June 26, 2020 4:08 AM > To: Zhenshi Zhou > Cc: ceph-users > Subject: [ceph-users] Re: fault tolerant about erasure code pool > > On 26/06/2020 8:08 pm, Zhenshi Zhou wrote: > > Hi Lindsay, > > > > I have only 3 hosts,

[ceph-users] Re: fault tolerant about erasure code pool

2020-06-27 Thread Zhenshi Zhou
I'm going to try the way Janne said, but what confuses me is the expansion of the cluster. Should I change the profile if I add hosts which don't have the same OSDs as these 3 hosts? Zhenshi Zhou wrote on Sun, Jun 28, 2020 at 12:53 PM: > I have only 3 hosts at present, and I tend to use EC po

[ceph-users] Re: mgr log shows a lot of ms_handle_reset messages

2020-06-28 Thread Zhenshi Zhou
I cannot find the same messages in my mgr.log. I think you can increase the log level and see if there is an error or not. XuYun wrote on Sun, Jun 28, 2020 at 4:39 PM: > Hi, > > We are running Ceph nautilus (14.2.10 now), every mgr reports a > ‘ms_handle_reset’ message every second. Would it be a normal behavi

[ceph-users] Re: mgr log shows a lot of ms_handle_reset messages

2020-06-28 Thread Zhenshi Zhou
: > 111.111.121.3:6807/7 > 2020-06-28 18:52:35.690 7fcdc96e6700 4 mgr ms_dispatch standby > mgrconfigure(period=5, threshold=5) v3 > > > On Jun 28, 2020, at 6:02 PM, Zhenshi Zhou wrote: > > I cannot find the same messages in my mgr.log. I think you can increase > the log level and see i

[ceph-users] Re: problem with ceph osd blacklist

2020-07-08 Thread Zhenshi Zhou
From my point of view, preventing clients from being added to blacklists may be better if you are in a poor network environment. AFAIK the server will send signals to clients frequently. And it will add someone to the blacklist if it doesn't receive a reply. <380562...@qq.com> wrote on Wed, Jul 8, 2020 at 3:01 PM

[ceph-users] about replica size

2020-07-09 Thread Zhenshi Zhou
Hi, As we all know, the default replica setting of 'size' is 3, which means there are 3 copies of an object. What are the disadvantages if I set it to 2, apart from getting fewer copies? Thanks ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe se

[ceph-users] Re: about replica size

2020-07-09 Thread Zhenshi Zhou
uld be slower because > they're would only be one other copy to get it from. You could look into > erasure coding if you are trying to save storage cost but that takes higher > CPU process. > > On Thu, Jul 9, 2020 at 19:12 Zhenshi Zhou wrote: > >> Hi, >> >> As we

[ceph-users] how to configure cephfs-shell correctly

2020-07-10 Thread Zhenshi Zhou
Hi all, I want to use cephfs-shell dealing with operations like directory creation, instead of mounting the root directory and create manually. But I get errors when I execute the command 'cephfs-shell'. Traceback (most recent call last): File "./cephfs-shell", line 9, in import cephfs as

[ceph-users] Re: about replica size

2020-07-13 Thread Zhenshi Zhou
le. > > If you are going to reboot one node and at the same time the other disk > fails, then you very like loose data. > > Just never ever use size 2. Not even temporary :) > > > Zhenshi Zhou wrote on Fri, Jul 10, 2020, 04:11: > >> Hi, >> >> As we all know

[ceph-users] Re: Ceph `realm pull` permission denied error

2020-07-13 Thread Zhenshi Zhou
Hi Alex, I didn't deploy this in containers/VMs, nor with ansible or other tools. However, I deployed multisite once and I remember that I restarted the rgw on the master site before I synced the realm on the secondary site. I'm not sure if this can help. Alex Hussein-Kershaw wrote on Mon, Jul 13, 2020 at 5:48 PM

[ceph-users] Re: how to configure cephfs-shell correctly

2020-07-13 Thread Zhenshi Zhou
;onecmd() got an unexpected keyword argument 'add_to_history'' How can I use 'cephfs-shell'? Zhenshi Zhou wrote on Fri, Jul 10, 2020 at 3:09 PM: > Hi all, > > I want to use cephfs-shell dealing with operations like directory > creation, > instead of mounting the root director

[ceph-users] Re: how to configure cephfs-shell correctly

2020-07-14 Thread Zhenshi Zhou
14 (Tue) at 4:04 PM wrote: > On Tue, 14 Jul 2020 at 12:18, Zhenshi Zhou wrote: > > > > This error disappeared after I installed 'python3-cephfs'. However the > > cephfs-shell command stuck. > > It stays at 'CephFS:~/>>>' and whatever subcommand I execu

[ceph-users] Re: how to configure cephfs-shell correctly

2020-07-14 Thread Zhenshi Zhou
What's more, I tried on CentOS 7 as well as 8, and I got the same error. Zhenshi Zhou wrote on Tue, Jul 14, 2020 at 4:23 PM: > My system is CentOS7 and python is 3.6.8. > Ceph version is 14.2.10, installed from ceph official repo. > Versions of python packages list: > attrs==19.3.0 >

[ceph-users] Re: how to configure cephfs-shell correctly

2020-07-14 Thread Zhenshi Zhou
Hi Rishabh, I installed cmd2 with pip, following the document <https://docs.ceph.com/docs/master/cephfs/cephfs-shell/>. As you recommended, I pinned the installed version of cmd2 to 0.7.9, and cephfs-shell is good to go. Thanks a lot :) Zhenshi Zhou wrote on Tue, Jul 14, 2020 at 4:25 PM: > What's mo
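
The fix that worked here, as a short recipe for CentOS 7 with Nautilus (the version pin is what the thread arrived at, not an official requirement):

    yum install python3-cephfs         # cephfs python bindings
    pip3 install "cmd2==0.7.9"         # newer cmd2 releases break cephfs-shell on 14.2.x
    cephfs-shell                       # then e.g. "mkdir /backups" inside the shell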

[ceph-users] Re: how to configure cephfs-shell correctly

2020-07-14 Thread Zhenshi Zhou
When I use cephfs-shell, I find that some commands don't work, and I am forced to use the interactive mode. I'm not sure if I'm using it the correct way or not. Is there a clearer document? [image: image.png] Rishabh Dave wrote on Tue, Jul 14, 2020 at 4:36 PM: > On Tue, 14 Jul 2020 at 14:03, Zhen

[ceph-users] Re: how to configure cephfs-shell correctly

2020-07-14 Thread Zhenshi Zhou
That's my mistake, I should have read the Nautilus documentation. It's OK about the failed commands; 'mkdir' is enough for me at present. Thanks Rishabh, I will try it on an Octopus cluster. Rishabh Dave wrote on Tue, Jul 14, 2020 at 11:15 PM: > On Tue, 14 Jul 2020 at 14:24, Zhenshi Zhou wr

[ceph-users] Re: osd bench with or without a separate WAL device deployed

2020-07-15 Thread Zhenshi Zhou
I deployed the cluster either with separate db/wal or with db/wal/data together. I never tried having only a separate db. AFAIK the wal does have an effect on writes, but I'm not sure if it could account for two times the bench value. Hardware and network environment are also important factors. rainning 于202

[ceph-users] Re: Monitor IPs

2020-07-15 Thread Zhenshi Zhou
Hi Will, I once changed monitor IPs on a Nautilus cluster. What I did was change the monitor information via the monmap, one monitor at a time. Both old and new IPs could communicate with each other, of course. If it's a new cluster, I suggest deploying a new cluster instead of changing the monitor IPs. Amit Ghadge 于2

[ceph-users] Re: osd bench with or without a separate WAL device deployed

2020-07-15 Thread Zhenshi Zhou
rence, but I am not sure if it can make two times difference. > > ---Original--- > *From:* "Zhenshi Zhou" > *Date:* Wed, Jul 15, 2020 18:39 PM > *To:* "rainning"; > *Cc:* "ceph-users"; > *Subject:* [ceph-users] Re: osd bench with or without a separ

[ceph-users] Re: osd bench with or without a separate WAL device deployed

2020-07-15 Thread Zhenshi Zhou
root@stor-mgt01:~# ceph tell osd.30 bench 196608000 65536 > { > "bytes_written": 196608000, > "blocksize": 65536, > "elapsed_sec": 1.081617, > "bytes_per_sec": 181772360.338257, > "iops": 2773.626104 > } > root@stor-mgt01:~#

[ceph-users] Re: Radosgw stuck in syncing status

2020-07-22 Thread Zhenshi Zhou
It took 2 weeks to finish the syncing when I saw hundreds of shards listed in 'behind shards'. I'm not sure if that was normal; I found that there was data transferring between the two zones though. Nghia Viet Tran wrote on Wed, Jul 22, 2020 at 7:16 PM: > Hi everyone, > > Our Ceph cluster is stuck in syncing status

[ceph-users] Re: What affection to vm with Monitor down

2020-07-24 Thread Zhenshi Zhou
It is advisable to run an odd number (at least 3) of monitors, but it is not mandatory. A 2-monitor cluster can also provide service. I think it's OK to shut down a monitor host for a while. If you are afraid of a monitor being down during maintenance, that means only 1 monitor is alive in the cluster, which makes the

[ceph-users] Re: Ceph-deploy on rhel.

2020-07-26 Thread Zhenshi Zhou
The user provided to the dashboard must be created with '--system' in radosgw-admin, otherwise it won't work. sathvik vutukuri <7vik.sath...@gmail.com> wrote on Sun, Jul 26, 2020 at 9:54 AM: > I have enabled it using the same doc, but some how it's not working. > > On Sun, 26 Jul 2020, 06:55 Oliver Freyermuth, <
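
A hedged sketch of wiring an rgw user into the dashboard on Nautilus (uid and display name are placeholders; newer releases read the keys from a file with -i instead of taking them on the command line):

    radosgw-admin user create --uid=dashboard --display-name="Ceph Dashboard" --system
    ceph dashboard set-rgw-api-access-key <access-key>
    ceph dashboard set-rgw-api-secret-key <secret-key>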

[ceph-users] large omap objects

2020-07-26 Thread Zhenshi Zhou
Hi all, I have a cluster providing object storage. The cluster worked well until someone started saving Flink checkpoints in the 'flink' bucket. I checked its behavior and found that Flink saves the current checkpoint data and deletes the former ones frequently. I suppose that makes the bucket
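
If the large omap objects are bucket index shards, the usual mitigation is to check per-shard object counts and reshard the busy bucket; a hedged sketch (the shard count is an example):

    radosgw-admin bucket limit check                            # objects per shard, per bucket
    radosgw-admin bucket reshard --bucket=flink --num-shards=64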

[ceph-users] Re: Ceph-deploy on rhel.

2020-07-26 Thread Zhenshi Zhou
keys': [], u'user_quota': > {u'max_objects': -1, u'enabled': False, u'max_size_kb': 0, u'max_size': -1, > u'check_on_raw': False}, u'placement_tags': [], u'suspended': 0, > u'op_mask': u're

[ceph-users] slow ops on one osd makes all my buckets unavailable

2020-07-28 Thread Zhenshi Zhou
Hi, My harbor registry uses ceph object storage to save the images. But I couldn't pull/push images from harbor a few moments ago. Ceph was in warning health status in the same time. The cluster just had a warning message said that osd.24 has slow ops. I check the ceph-osd.24.log, and showed as b

[ceph-users] Re: Current best practice for migrating from one EC profile to another?

2020-07-28 Thread Zhenshi Zhou
I'm facing the same issue. My cluster will have an expansion and I wanna modify the ec profile too. What I can think of is to create a new profile and a new pool, and then migrate the data from the old pool to the new one. Finally rename the pools as I can use the new pool just like nothing happene

[ceph-users] Re: Not able to access radosgw S3 bucket creation with AWS java SDK. Caused by: java.net.UnknownHostException: issue.

2020-07-29 Thread Zhenshi Zhou
It's maybe a dns issue, I guess. sathvik vutukuri <7vik.sath...@gmail.com> 于2020年7月29日周三 下午3:21写道: > Hi All, > > Any update in this from any one? > > On Tue, Jul 28, 2020 at 4:00 PM sathvik vutukuri <7vik.sath...@gmail.com> > wrote: > > > Hi All, > > > > radosgw-admin is configured in ceph-deploy

[ceph-users] does ceph rgw have any option to limit bandwidth

2020-08-19 Thread Zhenshi Zhou
Hi, Is there any option of rados gateway that limit bandwidth? ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io

[ceph-users] Re: does ceph rgw have any option to limit bandwidth

2020-08-19 Thread Zhenshi Zhou
ything that limits > traffic that matches "outgoing tcp from self with source-port 80,443*" > should work for your rgw too, if you think it eats too much BW, without > limiting its speed towards the OSDs. > > > *) if your rgw is on those two ports of course. > > Den

[ceph-users] Re: Add OSD with primary on HDD, WAL and DB on SSD

2020-08-26 Thread Zhenshi Zhou
The official documentation says that you should allocate 4% of the slow device's space for block.db. But the main problem is that BlueStore uses RocksDB, and RocksDB puts a file on the fast device only if it thinks the whole level will fit there. As for RocksDB, L1 is about 300M, L2 is about 3G, L3 is nea
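
A rough worked example of what this means in practice: keeping L0–L3 on flash needs roughly 30G plus compaction headroom per OSD, so something like a 64G LV per DB is a common choice (device and VG names are placeholders):

    lvcreate -L 64G -n osd-db-0 vg-nvme
    ceph-volume lvm create --bluestore --data /dev/sdb --block.db vg-nvme/osd-db-0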

[ceph-users] Re: Add OSD with primary on HDD, WAL and DB on SSD

2020-08-28 Thread Zhenshi Zhou
ith DB? > Say device size 50G, 100G, 200G, they are no difference to DB > because DB will take 30G anyways. Does it make any difference > to WAL? > > Thanks! > Tony > > -Original Message- > > From: Zhenshi Zhou > > Sent: Wednesday, August 26, 2020 11:16 P

[ceph-users] java client cannot visit rgw behind nginx

2020-09-02 Thread Zhenshi Zhou
Hi, My colleagues want to use ceph rgw to store ES backups and Nexus blobs. But the services cannot connect to the rgw over the S3 protocol when I provide them with the frontend nginx address (virtual IP). Only when they use the backend rgw's address (real IP) do ES and Nexus work well with rgw. Has anyone

[ceph-users] Re: java client cannot visit rgw behind nginx

2020-09-02 Thread Zhenshi Zhou
ng a single upload", "caused_by": { "type": "sdk_client_exception", "reason": "sdk_client_exception: Unable to execute HTTP request: oldelk-snapshot.rgw.abc.cn", "caused_by": { "type": &qu

[ceph-users] Re: java client cannot visit rgw behind nginx

2020-09-02 Thread Zhenshi Zhou
he wrong configuration for reverse proxy > of S3. > > Thanks. > > Zhenshi Zhou wrote: > > this is ES error log: > > { > >"error": { > > "root_cause": [ > >{ > > "type": "repository_

[ceph-users] Re: java client cannot visit rgw behind nginx

2020-09-03 Thread Zhenshi Zhou
eader X-Forwarded-For $proxy_add_x_forwarded_for; > proxy_set_header Proxy ""; > proxy_http_version 1.1; > proxy_max_temp_file_size 0; > proxy_request_buffering off; >} > } > > On 3/09/20 2:19 pm, Zhenshi Zhou wrote: > > Hi Tom >
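
A minimal nginx sketch along the lines of the config quoted above, assuming rgw listens on 127.0.0.1:7480 and rgw_dns_name is set to the proxy's domain so virtual-host-style bucket names (e.g. oldelk-snapshot.rgw.abc.cn) resolve to the proxy:

    server {
        listen 80;
        server_name rgw.abc.cn *.rgw.abc.cn;    # wildcard for bucket.domain style requests
        client_max_body_size 0;

        location / {
            proxy_pass http://127.0.0.1:7480;
            proxy_set_header Host $host;        # rgw matches rgw_dns_name against this header
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_http_version 1.1;
            proxy_request_buffering off;
        }
    }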

[ceph-users] monitor sst files continue growing

2020-10-28 Thread Zhenshi Zhou
Hi all, My cluster is in a bad state. SST files in /var/lib/ceph/mon/xxx/store.db keep growing. It warns that mons are using a lot of disk space. I set "mon compact on start = true" and restarted one of the monitors. But it started compacting and ran for a long time, seemingly without end. [image: image.

[ceph-users] Re: monitor sst files continue growing

2020-10-28 Thread Zhenshi Zhou
My cluster is 12.2.12, with all SATA disks. The space used by store.db: [image: image.png] How can I deal with it? Zhenshi Zhou wrote on Thu, Oct 29, 2020 at 2:37 PM: > Hi all, > > My cluster is in wrong state. SST files in /var/lib/ceph/mon/xxx/store.db > continue growing. It claims mon are using a

[ceph-users] Re: monitor sst files continue growing

2020-10-28 Thread Zhenshi Zhou
MISTAKE: the version is 14.2.12. Zhenshi Zhou wrote on Thu, Oct 29, 2020 at 2:38 PM: > My cluster is 12.2.12, with all sata disks. > the space of store.db: > [image: image.png] > > How can I deal with it? > > Zhenshi Zhou wrote on Thu, Oct 29, 2020 at 2:37 PM: > >> Hi all, >> >> My cl

[ceph-users] Re: monitor sst files continue growing

2020-10-29 Thread Zhenshi Zhou
re are inactive PGs > - why there are incomplete PGs > > This usually happens when OSDs go missing. > > Best regards, > = > Frank Schilder > AIT Risø Campus > Bygning 109, rum S14 > > ____ > From: Zhenshi Zh

[ceph-users] Re: monitor sst files continue growing

2020-10-29 Thread Zhenshi Zhou
After adding OSDs into the cluster, the recovery and backfill have not finished yet. Zhenshi Zhou wrote on Thu, Oct 29, 2020 at 3:29 PM: > MGR is stopped by me cause it took too much memories. > For pg status, I added some OSDs in this cluster, and it > > Frank Schilder wrote on Oct 29, 2020

[ceph-users] Re: monitor sst files continue growing

2020-10-29 Thread Zhenshi Zhou
D restart"? In that case, temporarily stopping and > restarting all new OSDs might help. > > Best regards, > = > Frank Schilder > AIT Risø Campus > Bygning 109, rum S14 > > ________ > From: Zhenshi Zhou > Sent

[ceph-users] Re: monitor sst files continue growing

2020-10-29 Thread Zhenshi Zhou
is going on. > > Best regards, > = > Frank Schilder > AIT Risø Campus > Bygning 109, rum S14 > > > From: Zhenshi Zhou > Sent: 29 October 2020 09:44:14 > To: Frank Schilder > Cc: ceph-users > Subject: Re: [ceph-users] monitor sst files continue

[ceph-users] Re: monitor sst files continue growing

2020-10-29 Thread Zhenshi Zhou
age.png] [image: image.png] Zhenshi Zhou wrote on Thu, Oct 29, 2020 at 8:29 PM: > Hi, > > I was so anxious a few hours ago cause the sst files were growing so fast > and I don't think > the space on mon servers could afford it. > > Let me talk it from the beginning. I have a cluster wi

[ceph-users] Re: monitor sst files continue growing

2020-10-29 Thread Zhenshi Zhou
Hi Alex, We found that there were a huge number of keys in the "logm" and "osdmap" tables while using ceph-monstore-tool. I think that could be the root cause. Well, some pages also say that disabling the 'insights' module can resolve this issue, but I checked our cluster and we didn't enable this module
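
Two things that typically help here, hedged: the mon only trims old logm/osdmap entries once all PGs are back to active+clean, after which a manual compaction reclaims the space (mon id is a placeholder); disabling the insights mgr module is the other mitigation mentioned above:

    ceph tell mon.mon01 compact
    ceph config set mon mon_compact_on_start true
    ceph mgr module disable insights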

[ceph-users] Re: How to disable RGW log or change RGW log level

2019-08-07 Thread Zhenshi Zhou
Hi Yang, I once debugged the OSD logs; I think the log level of rgw can be changed the same way. Hope this doc helps. Thanks. shellyyang1989 wrote on Thu, Aug 8, 2019 at 1:44 PM: > Hi All, > > The RG

[ceph-users] RGW Multi-site Issue

2020-04-01 Thread Zhenshi Zhou
Hi, I am new to rgw and am trying to deploy a multisite cluster in order to sync data from one cluster to another. My source zone is the default zone in the default zonegroup, structured as below: realm: big-realm |

[ceph-users] Re: RGW Multi-site Issue

2020-04-02 Thread Zhenshi Zhou
I created two new clusters and successfully deployed a multisite. However, I get the error "failed to commit period: (2202) Unknown error 2202" when I commit the period on the secondary zone while the master zone has data in it. I'm not sure if the multisite can only be deployed on two new
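
The workaround that came up in the later multisite threads, as a hedged sketch run on the secondary zone (URL, keys and service name are placeholders): pull the realm and period again, restart the gateway, then commit:

    radosgw-admin realm pull --url=http://master-rgw:80 --access-key=<key> --secret=<secret>
    radosgw-admin period pull --url=http://master-rgw:80 --access-key=<key> --secret=<secret>
    systemctl restart ceph-radosgw@rgw.<name>
    radosgw-admin period update --commit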

[ceph-users] octopus cluster deploy with cephadm failed on bootstrap

2020-05-09 Thread Zhenshi Zhou
Hi all, I'm deploying a new Octopus cluster using cephadm, following the docs. However, it failed at the bootstrap step. According to the logs, key generation failed because a directory and file were missing. Did I miss something? Here are the logs: [

[ceph-users] Re: Cluster network and public network

2020-05-09 Thread Zhenshi Zhou
Hi, I deployed a few clusters with two networks as well as with only one network. There is little difference between them in my experience. I did a performance test on a Nautilus cluster with two networks last week. What I found is that the cluster network has low bandwidth usage while the public network bandwi

[ceph-users] Re: Cluster network and public network

2020-05-09 Thread Zhenshi Zhou
Hi Anthony, Thanks for the feedback! The servers are using two bonded interfaces for the two networks, and each interface is bonded from two 25Gb/s cards (active-backup mode). You're right, I should have done the test in a heavy recovery or backfill situation. I will benchmark the cluster once again in
