[ceph-users] Re: [Suspicious newsletter] Weird performance issue with long heartbeat and slow ops warnings

2020-10-08 Thread Szabo, Istvan (Agoda)
Hi, We have quite a serious issue with slow ops. In our case the DB team used the cluster to read and write in the same pool at the same time, and it made the cluster unusable. When we ran fio, we realised that Ceph doesn't like reads and writes at the same time in the same pool, so we tested t

[ceph-users] Bucket sharding

2020-10-09 Thread Szabo, Istvan (Agoda)
Hello, I have a bucket that is close to 10 million objects (9.1 million). We have: rgw_dynamic_resharding = false rgw_override_bucket_index_max_shards = 100 rgw_max_objs_per_shard = 10 Do I need to increase the numbers soon, or is that not possible, so they need to start using a new bucket?

[ceph-users] Re: Bucket sharding

2020-10-09 Thread Szabo, Istvan (Agoda)
What I've found is the following method: radosgw-admin reshard add --bucket dsfdsfsf --num-shards 200 radosgw-admin reshard process Could this cause any issue in a bucket with 10 million objects if I increase it to, say, 200? From: Szabo, Istvan (Agoda)
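As a rough sanity check for the numbers in this thread, the average index-shard load is simply the object count divided by the shard count. The sketch below is only illustrative arithmetic: the function names are hypothetical, and the 100,000 objects-per-shard target is an assumption (a commonly used guideline), not a value taken from the poster's configuration.

```python
import math


def objects_per_shard(num_objects: int, num_shards: int) -> float:
    """Average number of index entries each shard would hold."""
    return num_objects / num_shards


def shards_needed(num_objects: int, target_per_shard: int) -> int:
    """Smallest shard count that keeps the average at or below the target."""
    return math.ceil(num_objects / target_per_shard)


# Figures from the thread: 9.1 million objects, 100 shards now, 200 proposed.
current = objects_per_shard(9_100_000, 100)
proposed = objects_per_shard(9_100_000, 200)

# ASSUMPTION: 100,000 objects per shard as the target, for illustration only.
needed_at_10m = shards_needed(10_000_000, 100_000)

print(current, proposed, needed_at_10m)
```

With 9.1 million objects, 100 shards average about 91,000 objects each and 200 shards about 45,500, so under the assumed target the existing 100 shards would only be reached at roughly 10 million objects.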

[ceph-users] Mon DB compaction MON_DISK_BIG

2020-10-19 Thread Szabo, Istvan (Agoda)
Hi, I received a warning this morning: HEALTH_WARN mons monserver-2c01,monserver-2c02,monserver-2c03 are using a lot of disk space MON_DISK_BIG mons monserver-2c01,monserver-2c02,monserver-2c03 are using a lot of disk space mon.monserver-2c01 is 15.3GiB >= mon_data_size_warn (15GiB)

[ceph-users] Re: Mon DB compaction MON_DISK_BIG

2020-10-19 Thread Szabo, Istvan (Agoda)
k you From: Anthony D'Atri Sent: Tuesday, October 20, 2020 9:13 AM To: ceph-users@ceph.io Cc: Szabo, Istvan (Agoda) Subject: Re: [ceph-users] Mon DB compaction MON_DISK_BIG Email received from outside the company. If in doubt don't click links nor op

[ceph-users] Re: Mon DB compaction MON_DISK_BIG

2020-10-19 Thread Szabo, Istvan (Agoda)
Okay, thank you very much. From: Anthony D'Atri Sent: Tuesday, October 20, 2020 9:32 AM To: Szabo, Istvan (Agoda) Cc: ceph-users@ceph.io Subject: Re: [ceph-users] Re: Mon DB compaction MON_DISK_BIG Email received from outside the company. If in

[ceph-users] Hadoop to Ceph

2020-11-05 Thread Szabo, Istvan (Agoda)
Hi, Has anybody tried to migrate data from Hadoop to Ceph? If so, what is the right way? Thank you This message is confidential and is for the sole use of the intended recipient(s). It may also be privileged or otherwise protected by copyright or other leg

[ceph-users] Re: Hadoop to Ceph

2020-11-06 Thread Szabo, Istvan (Agoda)
020 at 04:00 Szabo, Istvan (Agoda) <istvan.sz...@agoda.com> wrote: Hi, Has anybody tried to migrate data from Hadoop to Ceph? If so, what is the right way? Thank you

[ceph-users] Re: [Suspicious newsletter] Re: Multisite sync not working - permission denied

2020-11-08 Thread Szabo, Istvan (Agoda)
Hi, Update to 15.2.5. We had the same issue. The release notes don’t mention anything regarding multisite, but once we updated, everything started to work with 15.2.5. Best regards From: Michael Breen Sent: Friday, November 6, 2020 10:40 PM To: ceph-users@ceph.io Subject: [Suspicious ne

[ceph-users] Multisite mechanism deeper understanding

2020-11-09 Thread Szabo, Istvan (Agoda)
Hi, A couple of questions came up that are not really documented anywhere; hopefully someone knows the answers: 1. Is there a way to see the replication queue? I want to create metrics, such as whether there is any delay in the replication, etc. 2. Is the replication FIFO? 3. Actually how a replication w

[ceph-users] Ceph EC PG calculation

2020-11-17 Thread Szabo, Istvan (Agoda)
Hi, I have 36 OSDs and get this error: Error ERANGE: pg_num 4096 size 6 would mean 25011 total pgs, which exceeds max 10500 (mon_max_pg_per_osd 250 * num_in_osds 42) If I want to calculate the maximum number of PGs for my servers, how does it work if I have an EC pool? I have a 4:2 data EC pool, and the oth
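The limit in the error message can be reproduced by hand. The following is a minimal sketch, assuming the check is simply "PG replicas already in the cluster plus pg_num * size for the new pool must not exceed mon_max_pg_per_osd * num_in_osds", and that size for a 4:2 EC pool is k + m = 6. The existing-PG figure is inferred from the message (25011 - 4096 * 6), and the helper name is hypothetical.

```python
def largest_pow2_pg_num(budget: int, existing: int, size: int) -> int:
    """Largest power-of-two pg_num whose replicas still fit in the budget."""
    room = (budget - existing) // size
    if room < 1:
        return 0
    # Round down to the nearest power of two.
    return 1 << (room.bit_length() - 1)


# Values taken from the error message in the post.
mon_max_pg_per_osd = 250
num_in_osds = 42
requested_pg_num = 4096
size = 6  # 4 data + 2 coding chunks
projected_total = 25011

budget = mon_max_pg_per_osd * num_in_osds              # the "max" in the message
new_pool_replicas = requested_pg_num * size            # PG replicas the new pool adds
existing_replicas = projected_total - new_pool_replicas  # inferred, not stated in the post

fits = largest_pow2_pg_num(budget, existing_replicas, size)
print(budget, new_pool_replicas, existing_replicas, fits)
```

Under these assumptions the budget is 10500 PG replicas, the requested pool alone would add 24576, and the largest power-of-two pg_num that still fits is 1024.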

[ceph-users] EC overwrite

2020-11-18 Thread Szabo, Istvan (Agoda)
Hi, Is it a problem if ec_overwrite is enabled in the data pool? https://docs.ceph.com/en/latest/rados/operations/erasure-code/#erasure-coding-with-overwrites Thanks

[ceph-users] Re: Ceph EC PG calculation

2020-11-18 Thread Szabo, Istvan (Agoda)
er in warn and do when it instruct. To be honest I just want to be sure my setup is correct or I miss something or did something wrong. -Original Message- From: Frank Schilder Sent: Wednesday, November 18, 2020 3:11 PM To: Szabo, Istvan (Agoda) ; ceph-users@ceph.io Subject: Re: Ceph

[ceph-users] Weird ceph use case, is there any unknown bucket limitation?

2020-11-18 Thread Szabo, Istvan (Agoda)
Hi, I have a use case where the user would like to have 5 Buckets. Is it normal for Ceph, or is it just too much for me? The reason they want this level of granularity is that they might need to clean buckets for a specific subset without affecting others. The bucket format is this: PR_PAGETPYE-_DEVI

[ceph-users] Re: [Suspicious newsletter] Re: Unable to reshard bucket

2020-11-21 Thread Szabo, Istvan (Agoda)
It seems this sharding needs to be planned carefully from the beginning. I'm thinking of setting the shard number by default to the maximum, which is 64k, and leaving it as is, so we would only reach the limit if we reach the maximum number of objects. It would be interesting to know what is the sid

[ceph-users] Sizing radosgw and monitor

2020-11-23 Thread Szabo, Istvan (Agoda)
Hi, I haven't really found any documentation about how to size radosgw. One Red Hat doc says we need to decide the ratio, like 1:50 or 1:100 osd / rgw. I had an issue earlier where a user was load balanced by source, so always went to the same radosgateway, and one time it just maxed out. So the questio

[ceph-users] HA_proxy setup

2020-11-23 Thread Szabo, Istvan (Agoda)
Hi, I wonder whether anybody has a setup like the one I want to set up. 1st subnet: 10.118.170.0/24 (FE users) 2nd subnet: 10.192.150.0/24 (BE users) The users are coming from these subnets, and I want the FE users to come in on the 1st interface on the loadbalancer, and the BE users to come in on the

[ceph-users] Re: [Suspicious newsletter] Re: Unable to reshard bucket

2020-11-26 Thread Szabo, Istvan (Agoda)
ance your clarification. -Original Message- From: Eric Ivancich Sent: Wednesday, November 25, 2020 5:37 AM To: Szabo, Istvan (Agoda) Cc: ceph-users Subject: Re: [Suspicious newsletter] [ceph-users] Re: Unable to reshard bucket Email received from outside the company. If in doubt

[ceph-users] PG_DAMAGED

2020-12-04 Thread Szabo, Istvan (Agoda)
Hi, Not sure whether it is related to my 15.2.7 update, but today I got this issue many times: 2020-12-04T15:14:23.910799+0700 osd.40 (osd.40) 11 : cluster [DBG] 11.2 deep-scrub starts 2020-12-04T15:14:23.947255+0700 osd.40 (osd.40) 12 : cluster [ERR] 11.2 soid 11:434f049b:::.dir.75333f99-93d0-4238-91a

[ceph-users] Re: [Suspicious newsletter] Re: PG_DAMAGED

2020-12-04 Thread Szabo, Istvan (Agoda)
Hi, this is not necessarily, but most likely, a hint of a (slowly) failing disk. Check all OSDs for this PG for disk errors in dmesg and smartctl. Regards, Eugen Quoting "Szabo, Istvan (Agoda)" : > Hi, > > N

[ceph-users] Weird ceph df

2020-12-15 Thread Szabo, Istvan (Agoda)
Hi, It is a Nautilus 14.2.13 cluster. The quota on the pool is 745GiB, so how can the stored data be 788GiB? (2 replicas pool). Based on the used column it means just 334GiB is used because the pool has 2 replicas only. I don't understand. POOLS: POOLID STORED OBJECTS

[ceph-users] Data migration between clusters

2020-12-17 Thread Szabo, Istvan (Agoda)
What is the easiest and best way to migrate a bucket from an old cluster to a new one? It is Luminous to Octopus; not sure whether that matters from the data perspective.

[ceph-users] Re: Data migration between clusters

2020-12-23 Thread Szabo, Istvan (Agoda)
l > > Cheers, > Kalle > > - Original Message - >> From: "Szabo, Istvan (Agoda)" >> To: "ceph-users" >> Sent: Thursday, 17 December, 2020 12:11:19 >> Subject: [ceph-users] Data migration between clusters > >> What is the easies

[ceph-users] radosgw-admin sync status takes ages to print output

2021-01-14 Thread Szabo, Istvan (Agoda)
Hello, I have a 3 DC octopus Multisite setup with bucket sync policy applied. I have 2 buckets where I’ve set the shard count to 24,000 for one and 9,000 for the other, because they want to use 1 bucket but with a huge number of objects (2,400,000,000 and 900,000,000), and in case of multisite we need to preshard
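The shard counts quoted in this post follow directly from the object counts. A short sketch of that arithmetic, with hypothetical bucket labels since the post does not name the buckets:

```python
# (expected objects, pre-set shard count) as quoted in the post.
# The labels "large" and "small" are placeholders, not real bucket names.
buckets = {
    "large": (2_400_000_000, 24_000),
    "small": (900_000_000, 9_000),
}

# Average objects each index shard would hold once the bucket is full.
per_shard = {name: objects // shards for name, (objects, shards) in buckets.items()}

for name, value in per_shard.items():
    print(f"{name}: {value} objects per shard")
```

Both buckets work out to 100,000 objects per shard, so the two shard counts appear to have been chosen against the same per-shard target.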

[ceph-users] Re: radosgw-admin sync status takes ages to print output

2021-01-14 Thread Szabo, Istvan (Agoda)
: [5,14,23,25,26,34,36,37,38,45,46,47,49,50,51,52,54,55,57,58,60,61,62,67,68,69,71,77,79,80,88,89,90,95,97,100,108,110,111,117,118,120,121,125,126] Sorry for the 2 email. -Original Message- From: Szabo, Istvan (Agoda) Sent: Thursday, January 14, 2021 12:57 PM To: ceph-users@ceph.io Subject: [ceph-users] radosgw-admin sync status takes ages to print output Email received from

[ceph-users] Centos 8 2021 with ceph, how to move forward?

2021-01-14 Thread Szabo, Istvan (Agoda)
Hi, Just curious how you guys are moving forward with this CentOS 8 change. We just finished installing our full multisite cluster and it looks like we need to change the operating system. So if you are using CentOS 8 with Ceph, I'm curious where you are going to move. Thank you

[ceph-users] Re: [Suspicious newsletter] Re: Centos 8 2021 with ceph, how to move forward?

2021-01-14 Thread Szabo, Istvan (Agoda)
eased a 1:1 binary compatible redhat fork due to the changes with Centos 8. Could be worth looking at. https://almalinux.org/ In our case we're using ceph on debian 10. -- David Majchrzak CTO Oderland Webbhotell AB Östra Hamngatan 50B, 411 09 Göteborg, SWEDEN Den 2021-01-14 kl. 09:04, skr

[ceph-users] Re: radosgw-admin sync status takes ages to print output

2021-01-14 Thread Szabo, Istvan (Agoda)
t; we're running, with our significantly lower object count. > > Thank you, > > Dominic L. Hilsbos, MBA > Director – Information Technology > Perform Air International Inc. > dhils...@performair.com > www.PerformAir.com > > > -Original Message- > From: Sz

[ceph-users] .rgw.root was created with a lot of PG

2021-01-15 Thread Szabo, Istvan (Agoda)
Hi, Originally this pool was created with 512 PGs, which leaves a couple of OSDs with 500 PGs 😲 What are the safe steps to copy over this pool? These are the files in this pool: default.realm period_config.f320e60d-8cff-4824-878e-c316423cc519 periods.18d63a25-8a50-4e17-9561-d452621f62fa.latest_epoch de

[ceph-users] Re: [Suspicious newsletter] Re: .rgw.root was created with a lot of PG

2021-01-15 Thread Szabo, Istvan (Agoda)
lot of PG Which ceph version is this? Since Nautilus you can decrease pg numbers (or let pg-autoscaler do that for you). Quoting "Szabo, Istvan (Agoda)"

[ceph-users] RBD on windows

2021-01-20 Thread Szabo, Istvan (Agoda)
Hi, I'm looking at the SUSE documentation regarding their option to have RBD on Windows. I want to try it on a Windows Server 2019 VM, but I got this error: PS C:\Users\$admin$> rbd create image01 --size 4096 --pool windowstest -m 10.118.199.248,10.118.199.249,10.118.199.250 --id windowstest --keyring C:/P

[ceph-users] Snap trimming best practice

2023-01-11 Thread Szabo, Istvan (Agoda)
Hi, I wonder whether you have ever faced issues with snaptrimming when following the Ceph PG allocation recommendation (100 PGs/OSD). We have a Nautilus cluster and we are scared to increase the PGs of the pools because it seems that, even with 4 OSDs per NVMe, the higher the PG number, the slower the snaptrimming.

[ceph-users] Re: Very slow snaptrim operations blocking client I/O

2023-01-27 Thread Szabo, Istvan (Agoda)
How is your pg distribution on your osd devices? Do you have enough assigned pgs? Istvan Szabo Staff Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan.sz...@agoda.com --

[ceph-users] Real memory usage of the osd(s)

2023-01-29 Thread Szabo, Istvan (Agoda)
Hello, If buffered_io is enabled, is there a way to know exactly how much physical memory each OSD uses? What I've found is dump_mempools, whose last entries are the following, but would these bytes be the real physical memory usage? "total": { "items": 60005205,

[ceph-users] PG increase / data movement fine tuning

2023-02-06 Thread Szabo, Istvan (Agoda)
Hi, I've increased the placement groups in my octopus cluster, first in the index pool, and it caused almost 2.5 hours of bad performance for the users. I'm planning to increase the data pool as well, but first I'd like to know whether there is any way to make it smoother. At the moment I have these value

[ceph-users] Adding osds to each nodes

2023-02-08 Thread Szabo, Istvan (Agoda)
Hi, What is the safest way to add disk(s) to each of the nodes in the cluster? Should it be done 1 by 1, or can I add all of them at once and let it rebalance? My concern is that if I add them all at once, due to the host-based EC code it will block all the hosts. On the other hand, if I add them 1 by 1, one node will ha

[ceph-users] Re: Adding osds to each nodes

2023-02-08 Thread Szabo, Istvan (Agoda)
topic, e.g. [1]. Regards, Eugen [1] https://www.mail-archive.com/ceph-users@lists.ceph.com/msg36475.html Quoting "Szabo, Istvan (Agoda)" : Hi, What is the safest way to add disk(s) to each of the node in the cluster? Should it be done 1 by 1 or can add all of them at once and

[ceph-users] Changing os to ubuntu from centos 8

2023-03-21 Thread Szabo, Istvan (Agoda)
Hi, I'd like to change the OS from CentOS 8 to Ubuntu 20.04.5 on my bare-metal deployed octopus 15.2.14 cluster. For the first run I would go with octopus 15.2.17, just to avoid big changes in the cluster. I've found a couple of threads on the mailing list, but those were containerized (like: Re: Upgrade

[ceph-users] Re: Changing os to ubuntu from centos 8

2023-03-21 Thread Szabo, Istvan (Agoda)
ehrens Sent: Tuesday, March 21, 2023 4:29 PM To: Szabo, Istvan (Agoda) ; Ceph Users Cc: dietr...@internet-sicherheit.de; ji...@spets.org Subject: Re: [ceph-users] Changing os to ubuntu from centos 8 Email received from the internet. If in doubt, don't click any link nor open any

[ceph-users] Re: RGW access logs with bucket name

2023-03-30 Thread Szabo, Istvan (Agoda)
The HTTP requests in the beast logs have the full URL beginning with the bucket name, don’t they? Istvan Szabo Staff Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan.sz...@agoda.com

[ceph-users] Bucket notification

2023-04-25 Thread Szabo, Istvan (Agoda)
Hi, I'm trying to set a kafka endpoint for bucket object create operation notifications, but the notification is not created in the kafka endpoint. The settings seem to be fine because I can upload objects to the bucket when these settings are applied: NotificationConfiguration> bulknotif

[ceph-users] Re: Bucket notification

2023-04-27 Thread Szabo, Istvan (Agoda)
RGW debug logs to 20 and see if there are any kafka related errors there? Yuval On Tue, Apr 25, 2023 at 5:48 PM Szabo, Istvan (Agoda) <istvan.sz...@agoda.com> wrote: Hi, I'm trying to set a kafka endpoint for bucket object create operation notifications but the notification i

[ceph-users] Os changed to Ubuntu, device class not shown

2023-05-08 Thread Szabo, Istvan (Agoda)
Hi, We have an octopus cluster that we want to move from CentOS to Ubuntu. After activating all the OSDs, the class is not shown in ceph osd tree. However, ceph-volume list shows the crush device class :/ Should I just add it, or?

[ceph-users] Octopus on Ubuntu 20.04.6 LTS with kernel 5

2023-05-10 Thread Szabo, Istvan (Agoda)
Hi, In the octopus documentation we can see kernel 4 as recommended; however, we changed our test cluster yesterday from CentOS 7 / 8 to Ubuntu 20.04.6 LTS with kernel 5.4.0-148 and it seems to be working. I just want to make sure there aren't any caveats before I move to prod. Thank you

[ceph-users] Re: Octopus on Ubuntu 20.04.6 LTS with kernel 5

2023-05-10 Thread Szabo, Istvan (Agoda)
I can answer my question, even in the official ubuntu repo they are using by default the octopus version so for sure it works with kernel 5. https://packages.ubuntu.com/focal/allpackages -Original Message- From: Szabo, Istvan (Agoda) Sent: Thursday, May 11, 2023 11:20 AM To: Ceph

[ceph-users] Re: Octopus on Ubuntu 20.04.6 LTS with kernel 5

2023-05-15 Thread Szabo, Istvan (Agoda)
: Ilya Dryomov Sent: Thursday, May 11, 2023 3:39 PM To: Szabo, Istvan (Agoda) Cc: Ceph Users Subject: Re: [ceph-users] Re: Octopus on Ubuntu 20.04.6 LTS with kernel 5 Email received from the internet. If in doubt, don't click any link nor open any attachment ! ___

[ceph-users] Re: Deleting millions of objects

2023-05-17 Thread Szabo, Istvan (Agoda)
If it works I’d be amazed. We have this slow and limited delete issue also. What we’ve done is run multiple deletes on the same bucket from multiple servers via s3cmd. Istvan Szabo Staff Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan

[ceph-users] Non cephadm cluster upgrade from octopus to quincy

2023-06-07 Thread Szabo, Istvan (Agoda)
Hi, I can't find any documentation for this upgrade process. Has anybody already done it? Does the normal apt-get update method work? Thank you

[ceph-users] Bottleneck between loadbalancer and rgws

2023-06-14 Thread Szabo, Istvan (Agoda)
Hi, I have a dedicated loadbalancer pair on 2 separate bare-metal servers, and behind the haproxy balancers I have 3 mon/mgr/rgw nodes. Each rgw node has 2 rgw on it, so 6 altogether in the cluster (now I just added one more so currently 9). Today I see pretty high GET latency in the cluster (3

[ceph-users] Re: Bottleneck between loadbalancer and rgws

2023-06-14 Thread Szabo, Istvan (Agoda)
@agoda.com --- -Original Message- From: Kai Stian Olstad Sent: Wednesday, June 14, 2023 9:02 PM To: Szabo, Istvan (Agoda) Cc: Ceph Users Subject: Re: [ceph-users] Bottleneck between loadbalancer and rgws Email received from the internet. I

[ceph-users] Transmit rate metric based per bucket

2023-06-19 Thread Szabo, Istvan (Agoda)
Hello, I'd like to know whether there is a way to query some metrics/logs in octopus (or in a newer version, I'm interested for the future too) about the bandwidth used per bucket for put/get operations. Thank you

[ceph-users] Re: radosgw hang under pressure

2023-06-25 Thread Szabo, Istvan (Agoda)
Hi, Can you check the read and write latency of your OSDs? Maybe it hangs because it’s waiting for PGs, but maybe the PGs are being scrubbed or something else. Also, with many small objects don’t rely on the pg autoscaler; it might not tell you to increase PGs even though it should. Istvan Szabo Staff Infra

[ceph-users] Re: RGW dynamic resharding blocks write ops

2023-07-07 Thread Szabo, Istvan (Agoda)
I turned off :) Istvan Szabo Staff Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan.sz...@agoda.com --- On 2023. Jul 7., at 17:35, Eugen Block wrote: Ema

[ceph-users] Re: RGW dynamic resharding blocks write ops

2023-07-07 Thread Szabo, Istvan (Agoda)
d that they have their index pool on HDDs (with rocksdb on SSD), not sure how big the impact is during resharding though. Quoting "Szabo, Istvan (Agoda)" : I turned off :) Istvan Szabo Staff Infrastructure Engineer --- Agoda Services C

[ceph-users] Multisite sync - zone permission denied

2023-07-13 Thread Szabo, Istvan (Agoda)
Hi, Have you had the issue where zones get permission denied? failed to retrieve sync info: (13) Permission denied It's a newly added zone that uses the same sync user and credentials, but it shows permission denied and I don't see any reason for it. Thank you

[ceph-users] Is it safe to add different OS but same ceph version to the existing cluster?

2023-08-06 Thread Szabo, Istvan (Agoda)
Hi, I have an octopus cluster on the latest octopus version with mgr/mon/rgw/osds on centos 8. Is it safe to add an ubuntu osd host with the same octopus version? Thank you

[ceph-users] 64k buckets for 1 user

2023-08-06 Thread Szabo, Istvan (Agoda)
Hi, We are in a transition where I'd like to ask my user, who stores 2B objects in 1 bucket, to split it in some way. Thinking about the future, to make it future proof and not store a huge number of objects in 1 bucket, we identified that we would need to create 65xxx buckets. Is there anybody aware of any

[ceph-users] Is there any way to fine tune peering/pg relocation/rebalance?

2023-08-29 Thread Szabo, Istvan (Agoda)
Hello, Is there a way to somehow fine tune the rebalance even further than basic tuning steps when adding new osds? Today I've added some osd to the index pool and it generated many slow ops due to OSD op latency increase + read operation latency increase = high put get latency. https://ibb.co

[ceph-users] Re: Is there any way to fine tune peering/pg relocation/rebalance?

2023-08-29 Thread Szabo, Istvan (Agoda)
I'm using upmap with max deviation 1; maybe it is too aggressive? From: Louis Koo Sent: Wednesday, August 30, 2023 4:17 AM To: ceph-users@ceph.io Subject: [ceph-users] Re: Is there any way to fine tune peering/pg relocation/rebalance? Email received from the in

[ceph-users] Re: Is there any way to fine tune peering/pg relocation/rebalance?

2023-08-29 Thread Szabo, Istvan (Agoda)
Seems like tested on nautilus but I still see commits last month so I guess it is good with octopus. From: Matt Vandermeulen Sent: Wednesday, August 30, 2023 12:44 AM To: Szabo, Istvan (Agoda) Cc: Ceph Users Subject: Re: [ceph-users] Is there any way to fine

[ceph-users] Re: Is it safe to add different OS but same ceph version to the existing cluster?

2023-09-04 Thread Szabo, Istvan (Agoda)
e is very good with these kernel parameter values, do you see something that might be related to the high disk utilization? Thank you [https://i.ibb.co/Tk5Srk6/image-2023-09-04-09-55-52-311.png] From: Milind Changire Sent: Monday, August 7, 2023 11:38 PM To: Sz
