[ceph-users] move bluestore wal/db

2020-05-26 Thread Frank R
Is there a safe way to move the bluestore wal and db to a new device that doesn't involve rebuilding the entire OSD?
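
For reference, ceph-bluestore-tool (Nautilus and later) can attach a new DB/WAL device to an existing OSD in place. A rough sketch only, with the OSD id, paths and target device as placeholders:

    systemctl stop ceph-osd@12
    # some releases also want the target size passed, e.g. --bluestore-block-db-size=<bytes>
    ceph-bluestore-tool bluefs-bdev-new-db --path /var/lib/ceph/osd/ceph-12 \
        --dev-target /dev/nvme0n1p3
    # optionally move the RocksDB data that already lives on the main device
    ceph-bluestore-tool bluefs-bdev-migrate --path /var/lib/ceph/osd/ceph-12 \
        --devs-source /var/lib/ceph/osd/ceph-12/block \
        --dev-target /var/lib/ceph/osd/ceph-12/block.db
    systemctl start ceph-osd@12

If the OSD was deployed with ceph-volume, the LVM tags pointing at the db device likely need updating as well; worth testing on a single OSD first.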

[ceph-users] bluestore - rocksdb level sizes

2020-05-28 Thread Frank R
If I remember correctly, being able to configure the rocksdb level sizes was targeted for Octopus. I was wondering if this feature ever made it into the code as it would be useful when you want to use a drive smaller than 300G for the WAL/DB.

[ceph-users] rocksdb tuning

2020-05-29 Thread Frank R
Hi all, I am attempting to prevent bluestore rocksdb Level 3/4 spillover with a 150GB logical volume for the db/wal. I am thinking of setting max_bytes_for_level_base to about 1.3G (1342177280). This should let Level 3 fill up the 150GB logical volume. I don't expect to ever actually need L4.
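
For illustration only: the override would be appended to whatever bluestore_rocksdb_options string the OSDs already use (the base string below is just the common default of that era; check the live value first, and the change only takes effect when the OSD restarts):

    # what the OSD currently uses
    ceph daemon osd.0 config get bluestore_rocksdb_options

    # ceph.conf sketch
    [osd]
    bluestore_rocksdb_options = compression=kNoCompression,max_write_buffer_number=4,min_write_buffer_number_to_merge=1,recycle_log_file_num=4,write_buffer_size=268435456,writable_file_max_buffer_size=0,compaction_readahead_size=2097152,max_bytes_for_level_base=1342177280

With the default multiplier of 10 that would put L1/L2/L3 at roughly 1.3G/13G/130G, i.e. about 145G before an L4 is ever needed.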

[ceph-users] ceph mgr prometheus

2020-06-08 Thread Frank R
Hi all, Does the ceph manager prometheus module export bluestore rocksdb compaction times per OSD? I couldn't find anything. thx Frank
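
As far as I can tell the prometheus module does not export them; the per-OSD RocksDB counters are available on the admin socket, though (exact counter names vary a bit between releases):

    ceph daemon osd.0 perf dump rocksdb
    # or, with jq:
    ceph daemon osd.0 perf dump | jq '.rocksdb'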

[ceph-users] Re: radosgw-admin sync status output

2020-06-11 Thread Frank R
That is my experience as well. The full sync will only run after initiating a data or metadata sync init.

On Thu, Jun 11, 2020 at 9:30 AM  wrote:
> What I am curious about is these 2 lines:
> full sync: 0/64 shards
> full sync: 0/128 shards
> Is this considered normal? If so, why ha
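
For completeness, a hedged sketch of kicking off the full sync on the secondary zone (the gateways need a restart for the init to take effect):

    # on a host in the secondary zone
    radosgw-admin metadata sync init
    radosgw-admin data sync init
    systemctl restart ceph-radosgw.target
    radosgw-admin sync status    # the full sync shard counters should start moving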

[ceph-users] Re: MAX AVAIL goes up when I reboot an OSD node

2020-06-14 Thread Frank R
I have seen this when there is one OSD on the node being rebooted that is using more space than the others. Max avail for the pool is based on the fullest OSD as far as I know.

On Sun, Jun 14, 2020 at 4:29 PM KervyN wrote:
> Does someone got any ideas on this?
> The mgr nodes are separate, pg
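
A quick way to check that theory with stock tooling:

    ceph osd df tree    # compare %USE / VAR per OSD on the rebooted node
    ceph df             # MAX AVAIL per pool is derived from the fullest OSD the pool can use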

[ceph-users] Re: bluestore_rocksdb_options

2020-06-20 Thread Frank R
I believe it is the number of WALs that should be reused and should be equal to write_buffer_number but don't quote me.

On Sat, Jun 20, 2020 at 11:35 AM Seena Fallah wrote:
> Hi. I found a default rocksdb option in bluestore that I can't find in
> facebook rocksdb.
> recycle_log_file_num this c

[ceph-users] Re: bluestore_rocksdb_options

2020-06-20 Thread Frank R
"... the file size doesn't change, so the I/O for metadata might be avoided (also depends on file system mount options). Assuming most WAL files will have similar sizes, I/O needed for metadata will be minimal."

On Sat, Jun 20, 2020 at 12:43 PM Frank R wrote:
> I believe it is the number of WALs that

[ceph-users] Re: bluestore_rocksdb_options

2020-06-20 Thread Frank R
>> ... log file, random writes are used from size 0. Before writes hit the
>> end of the file, the file size doesn't change, so the I/O for metadata
>> might be avoided (also depends on file system mount options). Assuming
>> most WAL files will have similar sizes, I/O n

[ceph-users] Re: bluestore_rocksdb_options

2020-06-20 Thread Frank R
On Sat, Jun 20, 2020 at 12:55 PM Seena Fallah wrote:
> Do you have any reason for this value? :))
>
> On Sat, Jun 20, 2020 at 9:24 PM Frank R wrote:
>> With ceph I have always used it to set the number of WALs to recycle,
>> ie to recycle 8 WALs I use:
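
For reference, the value described above would be appended to the OSD's existing bluestore_rocksdb_options string, something like (keep whatever other options the cluster already has):

    [osd]
    bluestore_rocksdb_options = <existing options>,recycle_log_file_num=8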

[ceph-users] Re: High ceph_osd_commit_latency_ms on Toshiba MG07ACA14TE HDDs

2020-06-24 Thread Frank R
fyi, there is an interesting note on disabling the write cache here:
https://yourcmc.ru/wiki/index.php?title=Ceph_performance&mobileaction=toggle_view_desktop#Drive_cache_is_slowing_you_down

On Wed, Jun 24, 2020 at 9:45 AM Benoît Knecht wrote:
> Hi Igor,
>
> Igor Fedotov wrote:
>> for the sak
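
The gist of that page, if memory serves, is turning off the volatile write cache on the drives; hedged examples, with device names as placeholders (the setting may not survive a power cycle without a udev rule or similar):

    hdparm -W 0 /dev/sdX                  # SATA
    sdparm --clear=WCE --save /dev/sdX    # SAS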

[ceph-users] rbd image usage per osd

2019-08-09 Thread Frank R
I have an all RBD pool/cluster. I am interested in tracking how much disk space is being used by each RBD image on every OSD drive. The OSDs are Filestore. Does anyone know of any existing scripts that accomplish this task? If not, what commands can be used to generate this info?
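
I am not aware of a ready-made script; one slow but workable approach with stock tools, with pool/image names and the object prefix as placeholders:

    # find the image's object prefix (rbd_data.xxxx, or rb.0.xxxx for old format-1 images)
    rbd info rbd/myimage | grep block_name_prefix
    # map every object of the image to its acting OSDs
    rados -p rbd ls | grep rbd_data.1234abcd | while read obj; do
        ceph osd map rbd "$obj"
    done

Summing per OSD then needs a bit of awk, and since objects can be sparse the result is an upper bound rather than exact usage.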

[ceph-users] krbd upmap compatibility

2019-08-26 Thread Frank R
It seems that with Linux kernel 4.16.10 krbd clients are seen as Jewel rather than Luminous. Can someone tell me which kernel version will be seen as Luminous as I want to enable the Upmap Balancer.

[ceph-users] Re: krbd upmap compatibility

2019-08-26 Thread Frank R
will 4.13 also work for cephfs?

On Mon, Aug 26, 2019 at 2:31 PM Ilya Dryomov wrote:
> On Mon, Aug 26, 2019 at 8:25 PM  wrote:
>> What will actually happen if an old client comes by, potential data
>> damage - or just broken connections from the client?
>
> The latter (with "libceph: ... feat
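
For anyone finding this later, the usual sequence before turning on the balancer in upmap mode, assuming the kernel clients are new enough to speak upmap (4.13+, if I recall correctly):

    ceph features                                      # what do connected clients report?
    ceph osd set-require-min-compat-client luminous    # refuses if older clients are connected
    # only if the "old" clients are in fact upmap-capable kernels:
    # ceph osd set-require-min-compat-client luminous --yes-i-really-mean-it
    ceph balancer mode upmap
    ceph balancer on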

[ceph-users] cephx user performance impact

2019-10-06 Thread Frank R
Would RBD performance be hurt by having thousands of cephx users defined?

[ceph-users] mix sata/sas same pool

2019-10-16 Thread Frank R
I have inherited a cluster where about 30% of the osds in a pool are 7200 SAS. The other 70% are 7200 SATA. Should I look into creating 2 pools or will this likely not be a huge deal?
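
If the drives ever did need to be separated, custom CRUSH device classes are one option that avoids new pools of hardware; a hedged sketch, with class, rule, pool names and OSD ids as placeholders (changing a pool's rule moves data):

    ceph osd crush rm-device-class osd.12
    ceph osd crush set-device-class sata osd.12
    ceph osd crush rule create-replicated sata-rule default host sata
    ceph osd pool set mypool crush_rule sata-rule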

[ceph-users] rgw multisite failover

2019-10-18 Thread Frank R
I am looking to change an RGW multisite deployment so that the secondary will become master. This is meant to be a permanent change. Per: https://docs.ceph.com/docs/mimic/radosgw/multisite/ I need to:
1. Stop RGW daemons on the current master end.
On a secondary RGW node:
2. radosgw-admin zone
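
The commands behind those steps, roughly as the referenced mimic docs lay them out (zone name is a placeholder):

    # 1. on the current master zone: stop its gateways
    systemctl stop ceph-radosgw.target
    # 2-3. on a gateway host in the secondary zone: promote it and commit the period
    radosgw-admin zone modify --rgw-zone=us-west --master --default
    # add --read-only=0 if the secondary had been marked read-only
    radosgw-admin period update --commit
    # 4. start/restart the gateways in the newly promoted zone
    systemctl restart ceph-radosgw.target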

[ceph-users] rgw index large omap

2019-10-21 Thread Frank R
I have an rgw index pool that is alerting as "large" in 2 of the 3 osds on the PG. The primary has a large omap. The index is definitely in use by the bucket. Any opinions on the best way to solve this?
1. Remove the 2 osds with large index from cluster and rebalance?
2. Delete 2 of the 3 and deep
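
Rather than removing OSDs, the usual fix for a large bucket index omap is resharding the bucket; a hedged sketch, with bucket name and shard count as placeholders (resharding under Luminous multisite has caveats, so check the release notes first):

    radosgw-admin bucket limit check                        # objects per shard today
    radosgw-admin reshard add --bucket=mybucket --num-shards=128
    radosgw-admin reshard process
    # the large-omap warning clears once the affected PG is deep-scrubbed
    ceph pg deep-scrub <pgid>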

[ceph-users] multiple nvme per osd

2019-10-21 Thread Frank R
Hi all, Has anyone successfully created multiple partitions on an NVME device using ceph-disk? If so, which commands were used?
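
ceph-disk is deprecated in recent releases; if ceph-volume is available on that box it can do the carving itself. A sketch only, with VG/LV names, sizes and the per-device OSD count as placeholders:

    # let ceph-volume split the device into several OSDs
    ceph-volume lvm batch --osds-per-device 4 /dev/nvme0n1

    # or carve LVs by hand, whether for data or for block.db
    pvcreate /dev/nvme0n1
    vgcreate vg_nvme /dev/nvme0n1
    lvcreate -L 300G -n osd0 vg_nvme
    ceph-volume lvm create --data vg_nvme/osd0
    # ceph-volume lvm create --data /dev/sdb --block.db vg_nvme/db0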

[ceph-users] minimum osd size?

2019-10-22 Thread Frank R
Hi all, I have 40 nvme drives with about 20G free space each. Would creating a 10GB partition/lvm on each of the nvmes for an rgw index pool be a bad idea? RGW has about 5 million objects. I don't think space will be an issue but I am worried about the 10G size, is it just too small for a

[ceph-users] mix ceph-disk and ceph-volume

2019-10-22 Thread Frank R
Is it ok to create a new OSD using ceph-volume on a server where the other OSDs were created with ceph-disk? thx Frank

[ceph-users] rgw recovering shards

2019-10-24 Thread Frank R
Hi all, After an RGW upgrade from 12.2.7 to 12.2.12 for RGW multisite a few days ago the "sync status" has constantly shown a few "recovering shards", ie:

# radosgw-admin sync status
  realm 8f7fd3fd-f72d-411d-b06b-7b4b579f5f2f (prod)
  zonegroup 60a2cb75-6978-46a3-b830-061c8b

[ceph-users] ceph pg commands hang forever

2019-10-26 Thread Frank R
Hi all, I have an issue where all "ceph pg" commands hang forever, ie:

ceph pg dump
ceph pg ls-by-pool
etc

All pgs report as active+clean. Any pointers on correcting this? thx Frank

[ceph-users] Re: ceph pg commands hang forever

2019-10-26 Thread Frank R
Please disregard this. There were actually some remapped PGs. Thx, Frank

On Sat, Oct 26, 2019 at 12:05 PM Frank R wrote:
> Hi all,
> I have an issue where all "ceph pg" commands hang forever, ie:
> ceph pg dump
> ceph pg ls-by-pool
> etc
> All pgs rep

[ceph-users] radosgw recovering shards

2019-10-28 Thread Frank R
Hi all, Apologies for all the messages to the list over the past few days. After an upgrade from 12.2.7 to 12.2.12 (inherited cluster) for an RGW multisite active/active setup I am almost constantly seeing 1-10 "recovering shards" when running "radosgw-admin sync status", ie:

# rados

[ceph-users] Re: rgw recovering shards

2019-10-29 Thread Frank R
... buckets. The sync error log doesn't report anything recent. Do you know of a way to determine what this error is? thanks Frank

On Mon, Oct 28, 2019 at 4:38 AM Konstantin Shalygin wrote:
> On 10/27/19 6:01 AM, Frank R wrote:
>> I hate to be a pain but I have one more question.

[ceph-users] Re: rgw recovering shards

2019-10-30 Thread Frank R
... seemed to clear out the shards and things are running normally now. radosgw-admin stale instances list didn't find these for some reason. thx Frank

On Wed, Oct 30, 2019 at 2:55 AM Konstantin Shalygin wrote:
> On 10/29/19 10:56 PM, Frank R wrote:
>> oldest incremental change not ap

[ceph-users] ceph pg dump hangs on mons w/o mgr

2019-10-31 Thread Frank R
I have a 12.2.12 cluster with 3 mons, where the mgr is active on 1 of them. I have noticed that the command "ceph pg dump" hangs on all mons except the one where the mgr is running. "ceph pg dump" also runs fine on osd nodes. Is this expected behavior? thx Frank

[ceph-users] ceph clients and cluster map

2019-11-12 Thread Frank R
Hi all, When the cluster map changes, say due to a failed OSD, how are the clients with established sessions notified? thx Frank

[ceph-users] increasing PG count - limiting disruption

2019-11-14 Thread Frank R
Hi all, When increasing the number of placement groups for a pool by a large amount (say 2048 to 4096) is it better to go in small steps or all at once? This is a filestore cluster. Thanks, Frank
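
On a pre-Nautilus filestore cluster the usual advice is small steps, letting backfill settle in between; one step might look like this (pool name and numbers are placeholders):

    ceph osd pool set mypool pg_num 2176
    ceph osd pool set mypool pgp_num 2176
    ceph -s    # wait for backfill to finish / HEALTH_OK, then repeat up to 4096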

[ceph-users] ceph user list respone

2019-11-25 Thread Frank R
Hi, For your response: "You should use not more 1Gb for WAL and 30Gb for RocksDB. Numbers ! 3,30,300 (Gb) for block.db is useless." Do you mean the block.db size should be 3, 30 or 300GB and nothing else? If so, why not? Thanks, Frank
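
The 3/30/300 figure comes from RocksDB level sizing with Ceph's defaults (max_bytes_for_level_base around 256 MB, multiplier 10), roughly:

    L1 ~ 256 MB
    L2 ~ 10 x L1 ~ 2.5 GB
    L3 ~ 10 x L2 ~ 25 GB
    L4 ~ 10 x L3 ~ 250 GB

A block.db only keeps a level off the slow device if the whole level fits, so the useful sizes cluster around ~3, ~30 and ~300 GB (plus some headroom and WAL); a 60 GB partition, for example, behaves much like a 30 GB one. Whether that makes in-between sizes literally "useless" is debatable, but that is the reasoning behind those numbers.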

[ceph-users] bluestore rocksdb behavior

2019-12-04 Thread Frank R
Hi all, How is the following situation handled with bluestore:
1. You have a 200GB OSD (no separate DB/WAL devices)
2. The metadata grows past 30G for some reason and wants to create a 300GB level but can't?
Where is the metadata over 30G stored?

[ceph-users] Re: bluestore rocksdb behavior

2019-12-04 Thread Frank R
> ... written to main device as well.
>
> Thanks,
> Igor
>
> On 12/4/2019 6:13 PM, Frank R wrote:
>> Hi all,
>> How is the following situation handled with bluestore:
>> 1. You have a 200GB OSD (no separate DB/WAL devices)
>> 2. The metadata grows past 30G for
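
When an OSD does have a separate DB device, spillover onto the main device can be watched via the bluefs counters on the admin socket (OSD id is a placeholder; newer releases also raise a BLUEFS_SPILLOVER health warning):

    ceph daemon osd.12 perf dump | jq '.bluefs | {db_total_bytes, db_used_bytes, slow_total_bytes, slow_used_bytes}'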

[ceph-users] ceph public network definition

2019-12-07 Thread Frank R
Is the public_network definition in ceph.conf just used to determine which interface or IP to use for the public network or does it need to encompass the public ip addresses of all cluster nodes? Specifically, can the public_network be defined differently for different OSD nodes as long as
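
As I understand it, the setting is only used to pick which local interface/address a daemon binds to, so per-host or per-daemon overrides are fine as long as routing between the subnets works; a ceph.conf sketch with placeholder subnets:

    [global]
    public_network = 10.10.1.0/24, 10.10.2.0/24   # a comma-separated list is accepted

    # or override in a daemon/host specific section
    [osd.12]
    public_network = 10.10.2.0/24
    public_addr = 10.10.2.15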

[ceph-users] ceph mgr daemon multiple ip addresses

2019-12-09 Thread Frank R
Hi all, Does anyone know what possible issues can arise if the ceph mgr daemon is running on a mon node that has 2 ips in the public net range (1 is a loopback address)? As I understand it, the mgr will bind to all ips. FYI - I am not sure why the loopback is there, I am trying to find out. thx

[ceph-users] bluestore worries

2019-12-13 Thread Frank R
Hi all, I am thinking about converting a Filestore cluster to Bluestore. The OSD nodes have 16x4TB 7200 SATA OSDs with NVME write journals. The NVME drives should be large enough to house ~30G DB/WAL partitions per OSD. I am worried that I will see a significant performance hit when the deferred writes to the

[ceph-users] rgw logs

2019-12-23 Thread Frank R
I have about 1TB of data in the pool default.rgw.logs. What logs are stored in this pool and can they be safely deleted? thx Frank

[ceph-users] rgw - ERROR: failed to fetch mdlog info

2019-12-27 Thread Frank R
I am seeing the following errors on an RGW multisite slave:
1. ERROR: failed to fetch mdlog info
2. failed to fetch local sync status: (5) Input/output error
Data seems to be replicating but not metadata. Does anyone have any ideas on what may be wrong?

# radosgw-admin sync status
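
A few hedged things to look at from the secondary when metadata won't sync (nothing beyond radosgw-admin assumed):

    radosgw-admin metadata sync status
    radosgw-admin sync error list
    # run one sync pass in the foreground with logging to see the actual failure
    radosgw-admin metadata sync run --debug-rgw=20 --debug-ms=1 2>&1 | tee md-sync.log

In my experience "failed to fetch mdlog info" often ends up being an endpoint or system-user credential problem between the zones, but the debug log is what confirms it.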

[ceph-users] rgw multisite debugging

2020-01-02 Thread Frank R
Hi all, This is a self inflicted issue but I am wondering if there is a way to recover. Setup:
- RGW Multisite with 1 realm, 1 zonegroup and 2 zones.
- Metadata and data replication enabled (data bi-directional).
- Only the master side is currently used by clients.
- Ceph 12.2.12
My mistake was t

[ceph-users] rgw multisite rebuild

2020-01-03 Thread Frank R
Hi all, It looks like I have an RGW multisite setup that I need to rebuild to get metadata syncing again (I did some stupid things to break it). Is it possible to remove the slave zone from the zonegroup and then re-add it without destroying the rgw data pool (bucket data)? thx Frank
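
Removing a zone from the zonegroup does not delete its pools, so something along these lines should work (zonegroup/zone names are placeholders; saving the current period json first is cheap insurance):

    radosgw-admin period get > period-backup.json
    # on the master side: drop the secondary zone from the zonegroup
    radosgw-admin zonegroup remove --rgw-zonegroup=us --rgw-zone=us-west
    radosgw-admin period update --commit
    # later: re-add the still-existing zone and re-init its sync
    radosgw-admin zonegroup add --rgw-zonegroup=us --rgw-zone=us-west
    radosgw-admin period update --commit
    radosgw-admin metadata sync init    # on the secondary, then restart its gateways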

[ceph-users] centralized config map error

2020-01-10 Thread Frank R
Hi all, When upgrading from Luminous to Nautilus the global configmap options for cluster_network and public_network were inadvertently set to an incorrect value (10.192.80.0/24):

[root@ceph-osd134 ceph]# ceph config dump | grep network
global advanced cluster_network

[ceph-users] unset centralized config read only global setting

2020-01-13 Thread Frank R
Hi All, Sorry for the repost. How do you unset a global config setting from the centralized config with mimic+ (specifically public_network and cluster_network)? "ceph config rm global public_network" doesn't seem to do the trick. These were set inadvertently during an upgrade with: "ceph conf

[ceph-users] upmap balancer

2020-01-23 Thread Frank R
Hi all, Does using the Upmap balancer require that all OSDs be the same size (per device class)? thx Frank

[ceph-users] ceph positions

2020-02-03 Thread Frank R
Hi all, I really hope this isn't seen as spam. I am looking to find a position where I can focus on Linux storage/Ceph. If anyone is currently looking please let me know. Linkedin profile frankritchie. Thanks, Frank

[ceph-users] cephfs metadata

2020-02-17 Thread Frank R
Hi all, Is there a way to estimate how much storage space is required for CephFS metadata given an expected number of files in the filesystem? thx Frank

[ceph-users] ceph nvme 2x replication

2020-02-19 Thread Frank R
Hi all, I have noticed that RedHat is willing to support 2x replication with NVME drives. Additionally, I have seen a CERN presentation where they use 2x replication with NVME for a hyperconverged/HPC/CephFS solution. I would like to hear some opinions on whether this is really a good idea for pr

[ceph-users] Re: ceph nvme 2x replication

2020-02-19 Thread Frank R
Thanks for clearing that up.

On Wed, Feb 19, 2020 at 9:47 AM Dan van der Ster wrote:
> Hi,
> 2x replication was for a performance test. We use 3x in production.
> -- dan
>
> On Wed, Feb 19, 2020 at 3:18 PM Frank R wrote:
>> Hi all,