[ceph-users] Re: Ceph reef and (slow) backfilling - how to speed it up

2024-05-01 Thread Sridhar Seshasayee
Hi Mark,

On Thu, May 2, 2024 at 3:18 AM Mark Nelson  wrote:

> For our customers we are still disabling mclock and using wpq. Might be
> worth trying.
>
>
Could you please elaborate a bit on the issue(s) preventing the
use of mClock? Is this specific to the slow backfill rate, or are there other
issues as well?

This feedback would help prioritize the improvements in those areas.

Thanks,
-Sridhar
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph reef and (slow) backfilling - how to speed it up

2024-05-01 Thread Sridhar Seshasayee
Hi Götz,

Please see my response below.

On Tue, Apr 30, 2024 at 7:39 PM Pierre Riteau  wrote:

> Hi Götz,
>
> You can change the value of osd_max_backfills (for all OSDs or specific
> ones) using `ceph config`, but you need to
> enable osd_mclock_override_recovery_settings. See
>
> https://docs.ceph.com/en/quincy/rados/configuration/mclock-config-ref/#steps-to-modify-mclock-max-backfills-recovery-limits
> for more information.
>
>
Did the suggestion from Pierre help improve the backfilling rate? With the
mClock scheduler, this is
the correct way of modifying the value of osd_max_backfills and
osd_recovery_max_active.
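
For reference, a minimal sketch of that sequence (the values shown are
placeholders, not recommendations):

```
# Allow overriding the recovery/backfill limits while the mClock scheduler is active:
ceph config set osd osd_mclock_override_recovery_settings true

# Then raise the limits (placeholder values, tune for your cluster):
ceph config set osd osd_max_backfills 4
ceph config set osd osd_recovery_max_active 8

# Confirm what an OSD actually picked up:
ceph config show osd.0 | grep -E 'osd_max_backfills|osd_recovery_max_active'
```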

As to the observation of slower backfills, this is expected with the 'balanced'
and 'high_client_ops' mClock profiles (see the allocations in the mClock
configuration reference linked above). This is because backfill is classified
as a background best-effort service and is assigned a lower priority than
degraded recovery. Degraded recovery (the background recovery service) gets
higher priority because there is a greater risk of data unavailability if
other OSDs in the cluster go down, whereas backfill only involves data
movement, so it is assigned a lower priority.

If the 'high_recovery_ops' profile coupled with increasing the above config
parameters is still
not enough to improve the backfilling rate, then the cluster must be
examined to see if there
are other competing services like degraded recoveries, client ops etc. that
could affect the
backfilling rate. The ceph status output should give an idea about this.
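
For example, the following standard commands (nothing mClock-specific) should
show whether recovery, backfill, and client I/O are competing:

```
ceph -s                                  # degraded/backfilling PGs plus recovery and client I/O rates
ceph osd pool stats                      # per-pool client and recovery rates
ceph config get osd osd_mclock_profile   # confirm which mClock profile is currently active
```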

-Sridhar
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Dashboard issue slowing to a crawl - active ceph mgr process spiking to 600%+

2024-05-01 Thread Zachary Perry
Hello All,

I'm hoping I can get some help with an issue in the dashboard after doing a 
recent bare metal ceph upgrade from
Octopus to Quincy. 

** Please note: when I first wrote this up, it was only an issue with the Images 
tab. Shortly after, I found the same issue on another cluster that was recently 
upgraded from Octopus to Quincy 17.2.7 within the last few months, and there it 
affects all tabs in the Ceph dashboard, which slows to a crawl until I restart 
or fail over the mgr. Both clusters run on top of Ubuntu 20.04.

Everything appears to be working fine besides the Block --> Images tab. It 
doesn't matter what node I fail over to; reboots, reinstalling 
ceph-mgr-dashboard, different browsers, clients, etc. make no difference.

The tab will not load the 4 RBDs I have. They appear in rbd ls, I can query 
them, and the connection on the end appliance is fine. The loading icons spin 
indefinitely without any failure message. If I access the Images tab and then 
move to any other tab in the dashboard, it lets me navigate but does not 
display anything until I either restart the service on the active mgr or fail 
over to another one. So everything works as expected until I access this one 
tab.


When I use any other section of the dashboard, CPU utilization for ceph-mgr 
is normal, but when I access the Images tab it spikes to as high as 600% and 
stays like that until I restart the service or fail over the active mgr.
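
For reference, this is roughly how I work around it today, plus what I could
enable for more logging (the systemd unit name below is a guess for my
package-based install):

```
ceph mgr fail                    # fail over to a standby mgr; the dashboard recovers until Images is touched again
ceph dashboard debug enable      # more verbose dashboard logging (disable it again afterwards)
ceph mgr module ls               # list the modules enabled on the active mgr
# reproduce the Block -> Images click, then follow the active mgr log, e.g.:
# journalctl -u ceph-mgr@<hostname> -f
```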

-- Active MGR before clicking Block, the OSDs spike for a second but revert to 
around 5%

top - 13:43:37 up 8 days, 23:09,  1 user,  load average: 8.08, 5.02, 4.37
Tasks: 695 total,   1 running, 694 sleeping,   0 stopped,   0 zombie
%Cpu(s):  7.0 us,  1.6 sy,  0.0 ni, 89.7 id,  1.2 wa,  0.0 hi,  0.5 si,  0.0 st

-
MiB Mem : 128474.1 total,   6705.6 free,  65684.0 used,  56084.5 buff/cache
MiB Swap:  40927.0 total,  35839.3 free,   5087.7 used.  49253.0 avail Mem
PID USER  PR  NIVIRTRESSHR S  %CPU  %MEM TIME+ COMMAND
  14156 ceph  20   0 3420632   1.9g  13668 S  55.3   1.5 864:49.51 ceph-osd
  13762 ceph  20   0 3421384   1.8g  13432 S  51.3   1.4 960:22.12 ceph-osd
  14163 ceph  20   0 3422352   1.7g  13016 S  50.0   1.3 902:41.19 ceph-osd
  13803 ceph  20   0 3469596   1.8g  13532 S  44.7   1.4 941:55.10 ceph-osd
  13774 ceph  20   0 3427560   1.7g  13656 S  38.7   1.4 932:02.51 ceph-osd
  13801 ceph  20   0 3439796   1.7g  13448 S  37.7   1.3 981:25.55 ceph-osd
  14025 ceph  20   0 3426360   1.8g  13780 S  36.4   1.4 994:00.75 ceph-osd
   9888 nobody20   0  126100   8696  0 S  21.2   0.0   1106:19 
node_exporter
 126798 ceph  20   0 1787824 528000  39464 S   7.9   0.4   0:14.84 ceph-mgr
  13795 ceph  20   0 3420252   1.7g  13264 S   7.6   1.4 990:00.61 ceph-osd
  13781 ceph  20   0 3484476   1.9g  13248 S   6.3   1.5   1040:10 ceph-osd
  13777 ceph  20   0 3408972   1.8g  13464 S   6.0   1.5   1026:21 ceph-osd
  13797 ceph  20   0 3432068   1.6g  13932 S   6.0   1.3 950:39.35 ceph-osd
  13779 ceph  20   0 3471668   1.7g  12728 S   5.6   1.3 984:53.80 ceph-osd
  13768 ceph  20   0 3496064   1.9g  13504 S   5.3   1.5 918:37.48 ceph-osd
  13786 ceph  20   0 3422044   1.6g  13456 S   5.3   1.3 974:29.08 ceph-osd
  13788 ceph  20   0 3454184   1.9g  13048 S   5.3   1.5 980:35.78 ceph-osd
  13776 ceph  20   0 3445680   1.7g  12880 S   5.0   1.3 998:30.58 ceph-osd
  13785 ceph  20   0 3409548   1.7g  13704 S   5.0   1.3 939:37.08 ceph-osd
  14152 ceph  20   0 3465284   1.7g  13840 S   5.0   1.4 959:39.42 ceph-osd
  10339 nobody20   0 6256048 531428  60188 S   4.6   0.4 239:37.56 
prometheus
  13802 ceph  20   0 3430696   1.8g  13872 S   4.6   1.4 924:15.74 ceph-osd
  13791 ceph  20   0 3498876   1.5g  12648 S   4.3   1.2 962:58.37 ceph-osd
  13800 ceph  20   0 3455268   1.7g  12404 S   4.3   1.3   1000:41 ceph-osd
  13790 ceph  20   0 3434364   1.6g  13516 S   3.3   1.3 974:16.46 ceph-osd
  14217 ceph  20   0 3443436   1.8g  13560 S   3.3   1.4 902:54.22 ceph-osd
  13526 ceph  20   0 1012048 499628  11244 S   3.0   0.4 349:35.28 ceph-mon
  13775 ceph  20   0 3367284   1.6g  13940 S   3.0   1.3 878:38.27 ceph-osd
  13784 ceph  20   0 3380960   1.8g  12892 S   3.0   1.4 910:50.47 ceph-osd
  13789 ceph  20   0 3432876   1.6g  12464 S   2.6   1.2 922:45.15 ceph-osd
  13804 ceph  20   0 3428120   1.9g  13192 S   2.6   1.5 865:31.30 ceph-osd
  14153 ceph  20   0 3432752   1.8g  12576 S   2.3   1.4 874:27.92 ceph-osd
  14192 ceph  20   0 3412640   1.9g  13512 S   2.3   1.5 923:01.97 ceph-osd
  13796 ceph  20   0 3433016   1.8g  13164 S   2.0   1.4 982:08.21 ceph-osd
  13798 ceph  20   0 3405708   1.6g  13508 S   2.0   1.3 873:50.34 ceph-osd
  13814 ceph  20   0 4243252   1.5g  13500 S   2.0   1.2   2020:41 ceph-osd
  13985 ceph  20   0 3487848   1.6g  13100 S   2.0   1.3 942:21.96 ceph-osd
  14001 ceph  20   0 4194336   1.9g  13460 S   2.0   1.5   

[ceph-users] Re: Stuck OSD service specification - can't remove

2024-05-01 Thread Wang Jie
Hello David, did you resolve it? I have the same problem for rgw. I upgraded 
from N to P.


Regards,
Jie
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: stretched cluster new pool and second pool with nvme

2024-05-01 Thread ronny.lippold

hi stefan ... you are the hero of the month ;)

I don't know why I didn't find your bug report.

I have the exact same problem and could only resolve the HEALTH warning with 
"ceph osd force_healthy_stretch_mode --yes-i-really-mean-it".

I will comment on the report soon.

Actually, we are thinking about a 4/2 size without stretch mode enabled.

What was your solution?

many thanks ... ronny

Am 2024-04-23 15:03, schrieb Stefan Kooman:

On 23-04-2024 14:40, Eugen Block wrote:

Hi,


What's the right way to add another pool?
Create the pool with 4/2 and use the rule for stretch mode, and that's it?
The existing pools were automatically set to 4/2 after "ceph mon 
enable_stretch_mode".


It should be that simple. However, it does not seem to work. I tried to 
do just that, use two separate pools, hdd and ssd in that case, but it 
would not work, see this tracker: https://tracker.ceph.com/issues/64817


If your experience is different please update the tracker ticket. If it 
indeed does not work, please also update the tracker ticket with a 
"+1".


Thanks,

Gr. Stefan

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph Day NYC 2024 Slides

2024-05-01 Thread Laura Flores
Attached is a copy of the "Launch of the Ceph User Council" slides.

On Sat, Apr 27, 2024 at 8:12 AM Matt Vandermeulen 
wrote:

> Hi folks!
>
> Thanks for a great Ceph Day event in NYC! I wanted to make sure I posted
> my slides before I forget (and encourage others to do the same). Feel
> free to reach out in the Ceph Slack
> https://ceph.io/en/community/connect/
>
> How we Operate Ceph at Scale (DigitalOcean):
>
> -
>
> https://do-matt-ams3.ams3.digitaloceanspaces.com/2024%20Ceph%20Day%20NYC%20How%20we%20Operate%20Ceph%20at%20Scale.pdf
> -
>
> https://do-matt-sfo3.sfo3.digitaloceanspaces.com/2024%20Ceph%20Day%20NYC%20How%20we%20Operate%20Ceph%20at%20Scale.pdf
>
> Discards in Ceph (DigitalOcean):
>
> -
>
> https://do-matt-ams3.ams3.digitaloceanspaces.com/2024%20Ceph%20Day%20NYC%20Discards%20Lightning%20Talk.pdf
> -
>
> https://do-matt-sfo3.sfo3.digitaloceanspaces.com/2024%20Ceph%20Day%20NYC%20Discards%20Lightning%20Talk.pdf
>
> Matt
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>

-- 

Laura Flores

She/Her/Hers

Software Engineer, Ceph Storage 

Chicago, IL

lflo...@ibm.com | lflo...@redhat.com 
M: +17087388804
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Ceph client cluster compatibility

2024-05-01 Thread Nima AbolhassanBeigi
Hi,

We are trying to upgrade our OS version from ubuntu 18.04 to ubuntu 22.04.
Our ceph cluster version is 16.2.13 (pacific).

The problem is that the ubuntu packages for the ceph pacific release will
not be supported on ubuntu 22.04. We were wondering if the ceph client
(version 18.2, reef) on ubuntu 22.04 can work with a lower-version cluster.
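
For reference, these are the commands we can run to compare what the cluster
and its connected clients report (just a generic sketch, run with an admin
keyring):

```
ceph versions   # release/version of each mon, mgr, osd and mds daemon in the cluster
ceph features   # feature/release level reported by currently connected clients
```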

Thanks in advance

Regards
Nima
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] After dockerized ceph cluster to Pacific, the fsid changed in the output of 'ceph -s'

2024-05-01 Thread wjsherry075
Hello,
I ran into a problem after finishing 'cephadm adopt' to move the mon and mgr 
services into Docker containers. The fsid reported by `ceph -s` is not the same 
as the one in /etc/ceph/ceph.conf. The ceph.conf is correct, but `ceph -s` is 
incorrect. I followed https://docs.ceph.com/en/quincy/cephadm/adoption/
```
2024-04-25T19:49:17.460652+ mgr.cloud-lab-test-mon01 (mgr.65113) 109 : 
cephadm [ERR] cephadm exited with an error code: 1, stderr:ERROR: fsid does not 
match ceph.conf
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 1538, in _remote_connection
yield (conn, connr)
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 1426, in _run_cephadm
code, '\n'.join(err)))
orchestrator._interface.OrchestratorError: cephadm exited with an error code: 
1, stderr:ERROR: fsid does not match ceph.conf
#ceph health detail shows the warning:
[WRN] CEPHADM_REFRESH_FAILED: failed to probe daemons or devices
```
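
For reference, this is roughly how I compare the fsid in the three places
(sketch; `cephadm ls` prints JSON that includes the fsid each adopted daemon
was created with):

```
ceph fsid                      # fsid as reported by the cluster
grep fsid /etc/ceph/ceph.conf  # fsid from the local config file
cephadm ls | grep '"fsid"'     # fsid recorded for the adopted containers
```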

Does anyone else have any ideas?

Thanks,
Sherry
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Unable to add new OSDs

2024-05-01 Thread ceph
I'm trying to add a new storage host into a Ceph cluster (quincy 17.2.6). The 
machine has boot drives, one free SSD, and 10 HDDs. The plan is to have each HDD 
be an OSD with a DB on an equal-sized LV of the SSD. This machine is newer but 
otherwise similar to other machines already in the cluster that are set up and 
running the same way. But I've been unable to add OSDs and unable to figure out 
why, or fix it. I have some experience, but I'm not an expert and could be 
missing something obvious. If anyone has any suggestions, I would appreciate it.

I've tried to add OSDs a couple different ways.

Via the dashboard, which has worked fine for previous machines: it appears 
to succeed and gives no errors that I can find in /var/log/ceph or the 
dashboard logs, but the OSDs are never created. In fact, the drives still show 
up as available in Physical Disks and I can run the same creation procedure 
again and again.

I've tried creating it in cephadm shell with the following, which has also 
worked in the past:
ceph orch daemon add osd 
stor04.fqdn:data_devices=/dev/sdb,/dev/sdc,/dev/sdd,/dev/sde,/dev/sdf,/dev/sdg,/dev/sdh,/dev/sdi,/dev/sdj,/dev/sdk,db_devices=/dev/sda,osds_per_device=1
The command just hangs, and again I wasn't able to find any obvious errors, 
although this one did seem to cause some slow op errors from the monitors 
that required restarting a monitor, and it could also cause the dashboard to 
lock up, requiring a manager restart as well.

And I've tried setting 'ceph orch apply osd --all-available-devices 
--unmanaged=false' to let Ceph automatically add the drives. In the past, this 
would cause Ceph to automatically add the drives as OSDs, but without the 
associated DBs on the SSD; the SSD would just become another OSD. This time it 
appears to have no effect, and similar to the above, I wasn't able to find any 
obvious error feedback.
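
For reference, the route I have not fully tried yet is a dedicated OSD service
spec instead of --all-available-devices; a rough sketch with placeholder names
(--dry-run only previews what would be created):

```
cat > osd-stor04.yaml <<'EOF'
service_type: osd
service_id: stor04_hdd_db_on_ssd
placement:
  hosts:
    - stor04
spec:
  data_devices:
    rotational: 1
  db_devices:
    rotational: 0
EOF

ceph orch device ls                           # confirm the drives are still reported as available
ceph orch apply -i osd-stor04.yaml --dry-run  # preview which OSDs and DB devices would be created
```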

-Mike
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to handle incomplete data after rbd import-diff failure?

2024-05-01 Thread Satoru Takeuchi
Hi Maged,

2024年5月2日(木) 5:34 Maged Mokhtar :

>
> On 01/05/2024 16:12, Satoru Takeuchi wrote:
> > I confirmed that incomplete data is left behind on `rbd import-diff` failure.
> > I guess that this data is part of a snapshot. Could someone answer
> > the following questions for me?
> >
> > Q1. Is it safe to use the RBD image (e.g. client I/O and snapshot
> > management) even though incomplete data exists?
> > Q2. Is there any way to clean up the incomplete data?
> >
> > I read the following document and understand that this problem will be
> > resolved after running `rbd import-diff` again.
> >
> > https://ceph.io/en/news/blog/2013/incremental-snapshots-with-rbd/
> >> Since overwriting the same data is idempotent, it’s safe to have an
> import-diff interrupted in the middle.
> > However, it's difficult if I can't access the exported backup data
> > anymore. For instance, I'm afraid of the following scenario.
> >
> > 1. Send the backup data from one DC (DC0) to another DC (DC1)
> periodically.
> > 2. The backup data is created in DC0 and is sent directly to DC1
> > without persist backup data as a file.
> > 3. Major power outage happens in DC0 and it's impossible to
> > re-generate the backup data for  a long time.
> >
> > I simulated this problem as follows:
> >
> > 1. Create an RBD image.
> > 2. Write some data to this image.
> > 3. Create a snapshot S0.
> > 4. Write another data to this image.
> > 5. Create a snapshot S1.
> > 6. Create backup data consisting of the difference between S0 and S1
> > by running rbd export-diff.
> > 7. Delete the last byte of the backup data, which is 'e' and means the
> > end of the backup data, to inject import-diff failure.
> > 8. Delete S1.
> > 9. Run rbd import-diff to apply the broken backup data created in the
> step 7.
> >
> > Then step9 failed and S1 was not created. However, the number of RADOS
> > objects and the storage usage has increased.
> >
> > before
> > ```
> > $ rados -p replicapool df
> > POOL_NAME  USED  OBJECTS  CLONES  COPIES  MISSING_ON_PRIMARY
> > UNFOUND  DEGRADED  RD_OPS  RD  WR_OPS  WR  USED COMPR  UNDER
> > COMPR
> > replicapool  11 MiB   24   9  24   0
> >   0 03609  53 MiB 279  41 MiB 0 B  0 B
> >
> > total_objects24
> > total_used   39 MiB
> > total_avail  32 GiB
> > total_space  32 GiB
> > ```
> >
> > after:
> > ```
> > $ rados -p replicapool df
> > POOL_NAME  USED  OBJECTS  CLONES  COPIES  MISSING_ON_PRIMARY
> > UNFOUND  DEGRADED  RD_OPS  RD  WR_OPS  WR  USED COMPR  UNDER
> > COMPR
> > replicapool  12 MiB   25   9  25   0
> >   0 03531  53 MiB 278  41 MiB 0 B  0 B
> >
> > total_objects25
> > total_used   40 MiB
> > total_avail  32 GiB
> > total_space  32 GiB
> > ```
> >
> > The incomplete data seems to increase if rbd import-diff fails again
> > and again. The following output was obtained after running the
> > above-mentioned step 9 100 times.
> >
> > ```
> > $ rados -p replicapool df
> > POOL_NAME  USED  OBJECTS  CLONES  COPIES  MISSING_ON_PRIMARY
> > UNFOUND  DEGRADED  RD_OPS   RD  WR_OPS   WR  USED COMPR  UNDER
> > COMPR
> > replicapool  12 MiB   25   9  25   0
> >   0 07925  104 MiB1308  164 MiB 0 B  0
> > B
> >
> > total_objects25
> > total_used   58 MiB
> > total_avail  32 GiB
> > total_space  32 GiB
> > ```
> >
> > Thanks,
> > Satoru
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
> The image is not in a consistent state, so it should not be used as is. If
> you no longer have access to the source image or its exported data, you
> should be able to use the rbd snap rollback command to roll the
> destination image back to its last known good snapshot. The destination
> snapshots are created by the import-diff command with names matching the
> source snapshots.
>

Thank you for the reply. I succeeded in rolling back the RBD image to S0, and
`total_objects` went back to its previous value (24).

On the other hand, `total_used` didn't return to its original value. Repeating
the following steps resulted in continuous growth of `total_used`:

1. Import the broken diff (it fails).
2. Roll back to S0.

I guess it's a resource leak.

Could you tell me whether I can clean up this leftover garbage data?

Best,
Satoru
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: reef 18.2.3 QE validation status

2024-05-01 Thread Yuri Weinstein
We've run into a problem during the last verification steps before
publishing this release after upgrading the LRC to it  =>
https://tracker.ceph.com/issues/65733

After this issue is resolved, we will continue testing and publishing
this point release.

Thanks for your patience!

On Thu, Apr 18, 2024 at 11:29 PM Christian Rohmann
 wrote:
>
> On 18.04.24 8:13 PM, Laura Flores wrote:
> > Thanks for bringing this to our attention. The leads have decided that
> > since this PR hasn't been merged to main yet and isn't approved, it
> > will not go in v18.2.3, but it will be prioritized for v18.2.4.
> > I've already added the PR to the v18.2.4 milestone so it's sure to be
> > picked up.
>
> Thanks a bunch. If you miss the train, you miss the train - fair enough.
> Nice to know there is another one going soon and that bug is going to be
> on it !
>
>
> Regards
>
> Christian
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph reef and (slow) backfilling - how to speed it up

2024-05-01 Thread Mark Nelson
For our customers we are still disabling mclock and using wpq. Might be 
worth trying.
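
Roughly what that looks like in practice, as a sketch (the scheduler change
only takes effect after the OSDs are restarted):

```
ceph config set osd osd_op_queue wpq   # switch the OSD op scheduler from mclock_scheduler to wpq
ceph config get osd osd_op_queue       # verify the setting
# then restart the OSDs, e.g. one host or failure domain at a time
```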


Mark


On 4/30/24 09:08, Pierre Riteau wrote:

Hi Götz,

You can change the value of osd_max_backfills (for all OSDs or specific
ones) using `ceph config`, but you need to
enable osd_mclock_override_recovery_settings. See
https://docs.ceph.com/en/quincy/rados/configuration/mclock-config-ref/#steps-to-modify-mclock-max-backfills-recovery-limits
for more information.

Best regards,
Pierre Riteau

On Sat, 27 Apr 2024 at 08:32, Götz Reinicke 
wrote:


Dear ceph community,

I have a ceph cluster which got upgraded from nautilus/pacific/… to reef over
time. Now I added two new nodes to an existing EC pool, as I did with the
previous versions of ceph.

Now I face the fact that the previous „backfilling tuning“ I've used, increasing
injectargs --osd-max-backfills=XX --osd-recovery-max-active=YY, does not work
anymore.

By adjusting those parameters the backfill used to run at up to 2k +-
objects/s.

As I'm not (yet) familiar with the reef options, the only speed-up I have found
so far is „ceph config set osd osd_mclock_profile high_recovery_ops“, which
currently runs the backfill at up to 600 objects/s.

My question: what is the best (simple) way to speed that backfill up?

I've tried to understand the custom profiles, but without success, and have not
applied anything else yet.

Thanks for feedback and suggestions! Best regards, Götz



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


--
Best Regards,
Mark Nelson
Head of Research and Development

Clyso GmbH
p: +49 89 21552391 12 | a: Minnesota, USA
w: https://clyso.com | e: mark.nel...@clyso.com

We are hiring: https://www.clyso.com/jobs/
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to handle incomplete data after rbd import-diff failure?

2024-05-01 Thread Maged Mokhtar


On 01/05/2024 16:12, Satoru Takeuchi wrote:

I confirmed that incomplete data is left behind on `rbd import-diff` failure.
I guess that this data is part of a snapshot. Could someone answer
the following questions for me?

Q1. Is it safe to use the RBD image (e.g. client I/O and snapshot
management) even though incomplete data exists?
Q2. Is there any way to clean up the incomplete data?

I read the following document and understand that this problem will be
resolved after running `rbd import-diff` again.

https://ceph.io/en/news/blog/2013/incremental-snapshots-with-rbd/

Since overwriting the same data is idempotent, it’s safe to have an import-diff 
interrupted in the middle.

However, it's difficult if I can't access the exported backup data
anymore. For instance, I'm afraid of the following scenario.

1. Send the backup data from one DC (DC0) to another DC (DC1) periodically.
2. The backup data is created in DC0 and is sent directly to DC1
without persist backup data as a file.
3. Major power outage happens in DC0 and it's impossible to
re-generate the backup data for  a long time.

I simulated this problem as follows:

1. Create an RBD image.
2. Write some data to this image.
3. Create a snapshot S0.
4. Write another data to this image.
5. Create a snapshot S1.
6. Create backup data consisting of the difference between S0 and S1
by running rbd export-diff.
7. Delete the last byte of the backup data, which is 'e' and means the
end of the backup data, to inject import-diff failure.
8. Delete S1.
9. Run rbd import-diff to apply the broken backup data created in the step 7.

Then step9 failed and S1 was not created. However, the number of RADOS
objects and the storage usage has increased.

before
```
$ rados -p replicapool df
POOL_NAME  USED  OBJECTS  CLONES  COPIES  MISSING_ON_PRIMARY
UNFOUND  DEGRADED  RD_OPS  RD  WR_OPS  WR  USED COMPR  UNDER
COMPR
replicapool  11 MiB   24   9  24   0
  0 03609  53 MiB 279  41 MiB 0 B  0 B

total_objects24
total_used   39 MiB
total_avail  32 GiB
total_space  32 GiB
```

after:
```
$ rados -p replicapool df
POOL_NAME  USED  OBJECTS  CLONES  COPIES  MISSING_ON_PRIMARY
UNFOUND  DEGRADED  RD_OPS  RD  WR_OPS  WR  USED COMPR  UNDER
COMPR
replicapool  12 MiB   25   9  25   0
  0 03531  53 MiB 278  41 MiB 0 B  0 B

total_objects25
total_used   40 MiB
total_avail  32 GiB
total_space  32 GiB
```

The incomplete data seems to increase if rbd import-diff fails again
and again. The following output was obtained after running the
above-mentioned step 9 100 times.

```
$ rados -p replicapool df
POOL_NAME  USED  OBJECTS  CLONES  COPIES  MISSING_ON_PRIMARY
UNFOUND  DEGRADED  RD_OPS   RD  WR_OPS   WR  USED COMPR  UNDER
COMPR
replicapool  12 MiB   25   9  25   0
  0 07925  104 MiB1308  164 MiB 0 B  0
B

total_objects25
total_used   58 MiB
total_avail  32 GiB
total_space  32 GiB
```

Thanks,
Satoru
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


The image is not in a consistent state, so it should not be used as is. If 
you no longer have access to the source image or its exported data, you 
should be able to use the rbd snap rollback command to roll the 
destination image back to its last known good snapshot. The destination 
snapshots are created by the import-diff command with names matching the 
source snapshots.
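
Something along these lines, with placeholder image and snapshot names:

```
rbd snap ls replicapool/img            # check which snapshots exist on the destination image
rbd snap rollback replicapool/img@S0   # roll back to the last known good snapshot
```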


/maged

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RGW multisite slowness issue due to the "304 Not Modified" responses on primary zone

2024-05-01 Thread Alexander E. Patrakov
Hello Saif,

Unfortunately, I have no other ideas that could help you.

On Wed, May 1, 2024 at 4:48 PM Saif Mohammad  wrote:
>
> Hi Alexander,
>
> We have configured the following parameters in our infrastructure to address the
> issue, and despite tuning them, even setting them to higher levels, the issue still
> persists. We have shared the latency between the DC and DR site for your 
> reference. Please advise on alternative solutions to resolve this issue as 
> this is very crucial for us.
>
> - rgw_bucket_sync_spawn_window
> - rgw_data_sync_spawn_window
> - rgw_meta_sync_spawn_window
>
> root@host-01:~# ping 
> PING  (ip) 56(84) bytes of data.
> 64 bytes from : icmp_seq=1 ttl=60 time=41.9 ms
> 64 bytes from : icmp_seq=2 ttl=60 time=41.5 ms
> 64 bytes from : icmp_seq=3 ttl=60 time=41.6 ms
> 64 bytes from  : icmp_seq=4 ttl=60 time=50.8 ms
>
>
> Any guidance would be greatly appreciated.
>
> Regards,
> Mohammad Saif
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io



-- 
Alexander E. Patrakov
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] How to handle incomplete data after rbd import-diff failure?

2024-05-01 Thread Satoru Takeuchi
I confirmed that incomplete data is left behind on `rbd import-diff` failure.
I guess that this data is part of a snapshot. Could someone answer
the following questions for me?

Q1. Is it safe to use the RBD image (e.g. client I/O and snapshot
management) even though incomplete data exists?
Q2. Is there any way to clean up the incomplete data?

I read the following document and understand that this problem will be
resolved after running `rbd import-diff` again.

https://ceph.io/en/news/blog/2013/incremental-snapshots-with-rbd/
> Since overwriting the same data is idempotent, it’s safe to have an 
> import-diff interrupted in the middle.

However, it's difficult if I can't access the exported backup data
anymore. For instance, I'm afraid of the following scenario.

1. Send the backup data from one DC (DC0) to another DC (DC1) periodically.
2. The backup data is created in DC0 and is sent directly to DC1
without persisting the backup data as a file.
3. A major power outage happens in DC0 and it's impossible to
re-generate the backup data for a long time.

I simulated this problem as follows:

1. Create an RBD image.
2. Write some data to this image.
3. Create a snapshot S0.
4. Write another data to this image.
5. Create a snapshot S1.
6. Create backup data consisting of the difference between S0 and S1
by running rbd export-diff.
7. Delete the last byte of the backup data, which is 'e' and marks the
end of the backup data, to inject an import-diff failure.
8. Delete S1.
9. Run rbd import-diff to apply the broken backup data created in step 7
(sketched in shell below).
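
A rough shell sketch of those steps (the pool name matches the output below;
the image name img is a placeholder):

```
rbd export-diff --from-snap S0 replicapool/img@S1 /tmp/S0-to-S1.diff
truncate -s -1 /tmp/S0-to-S1.diff                   # drop the trailing 'e' end-of-diff marker
rbd snap rm replicapool/img@S1
rbd import-diff /tmp/S0-to-S1.diff replicapool/img  # fails, but leaves partial data behind
```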

Then step 9 failed and S1 was not created. However, the number of RADOS
objects and the storage usage has increased.

before
```
$ rados -p replicapool df
POOL_NAME  USED  OBJECTS  CLONES  COPIES  MISSING_ON_PRIMARY
UNFOUND  DEGRADED  RD_OPS  RD  WR_OPS  WR  USED COMPR  UNDER
COMPR
replicapool  11 MiB   24   9  24   0
 0 03609  53 MiB 279  41 MiB 0 B  0 B

total_objects24
total_used   39 MiB
total_avail  32 GiB
total_space  32 GiB
```

after:
```
$ rados -p replicapool df
POOL_NAME  USED  OBJECTS  CLONES  COPIES  MISSING_ON_PRIMARY
UNFOUND  DEGRADED  RD_OPS  RD  WR_OPS  WR  USED COMPR  UNDER
COMPR
replicapool  12 MiB   25   9  25   0
 0 03531  53 MiB 278  41 MiB 0 B  0 B

total_objects25
total_used   40 MiB
total_avail  32 GiB
total_space  32 GiB
```

The incomplete data seems to increase if rbd import-diff fails again
and again. The following output was obtained after running the above-mentioned
step 9 100 times.

```
$ rados -p replicapool df
POOL_NAME  USED  OBJECTS  CLONES  COPIES  MISSING_ON_PRIMARY
UNFOUND  DEGRADED  RD_OPS   RD  WR_OPS   WR  USED COMPR  UNDER
COMPR
replicapool  12 MiB   25   9  25   0
 0 07925  104 MiB1308  164 MiB 0 B  0
B

total_objects25
total_used   58 MiB
total_avail  32 GiB
total_space  32 GiB
```

Thanks,
Satoru
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RGW multisite slowness issue due to the "304 Not Modified" responses on primary zone

2024-05-01 Thread Saif Mohammad
Hi Alexander,

We have configured the following parameters in our infrastructure to address the 
issue, and despite tuning them, even setting them to higher levels, the issue still 
persists. We have shared the latency between the DC and DR site for your 
reference. Please advise on alternative solutions to resolve this issue as this 
is very crucial for us. 

- rgw_bucket_sync_spawn_window
- rgw_data_sync_spawn_window
- rgw_meta_sync_spawn_window
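
For reference, this is roughly how we apply them (the values here are
placeholders):

```
ceph config set client.rgw rgw_data_sync_spawn_window 64
ceph config set client.rgw rgw_bucket_sync_spawn_window 64
ceph config set client.rgw rgw_meta_sync_spawn_window 64
ceph config get client.rgw rgw_data_sync_spawn_window   # verify
```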

root@host-01:~# ping 
PING  (ip) 56(84) bytes of data.
64 bytes from : icmp_seq=1 ttl=60 time=41.9 ms
64 bytes from : icmp_seq=2 ttl=60 time=41.5 ms
64 bytes from : icmp_seq=3 ttl=60 time=41.6 ms
64 bytes from  : icmp_seq=4 ttl=60 time=50.8 ms


Any guidance would be greatly appreciated.

Regards,
Mohammad Saif
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io