[ceph-users] Re: Does dynamic resharding block I/Os by design?

2021-05-17 Thread Satoru Takeuchi
2021年5月18日(火) 9:23 Satoru Takeuchi :
>
> Hi,
>
> I have a Ceph cluster used for RGW and RBD. I found that all I/Os to
> RGW seemed to be blocked while dynamic resharding was running. Could
> you tell me whether this behavior is by design or not?
>
> I attached a graph that shows the I/O appears to be blocked. The
> x-axis is time and the y-axis is the number of RADOS objects. Dynamic
> resharding ran between 16:22:30 and 16:31:30.

I uploaded my image to Google Photos (thank you, Anthony, for pointing it out).

https://photos.app.goo.gl/ZFJg7CQ4z6gfqmSX6

>
> I read the official documentation about dynamic resharding, but there
> is no mention of blocking during dynamic resharding.
>
> https://docs.ceph.com/en/octopus/radosgw/dynamicresharding/
>
> In addition, I read the following Red Hat's blog post.
>
> https://www.redhat.com/ja/blog/ceph-rgw-dynamic-bucket-sharding-performance-investigation-and-guidance
>
> > You do not need to stop reading or writing objects to the bucket while 
> > resharding is happening.
>
> This suggests that dynamic resharding is an online operation. However,
> it's not clear whether this feature blocks I/Os or not.
>
> Thanks,
> Satoru
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Does dynamic resharding block I/Os by design?

2021-05-17 Thread Satoru Takeuchi
Hi,

I have a Ceph cluster used for RGW and RBD. I found that all I/Os to
RGW seemed to be blocked while dynamic resharding was running. Could you
tell me whether this behavior is by design or not?

I attached a graph that shows the I/O appears to be blocked. The x-axis
is time and the y-axis is the number of RADOS objects. Dynamic resharding
ran between 16:22:30 and 16:31:30.

I read the official documentation about dynamic resharding, but there is
no mention of blocking during dynamic resharding.

https://docs.ceph.com/en/octopus/radosgw/dynamicresharding/

In addition, I read the following Red Hat's blog post.

https://www.redhat.com/ja/blog/ceph-rgw-dynamic-bucket-sharding-performance-investigation-and-guidance

> You do not need to stop reading or writing objects to the bucket while 
> resharding is happening.

This suggests that dynamic resharding is an online operation. However,
it's not clear whether this feature blocks I/Os or not.
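
For reference, resharding activity in that window can be checked with
radosgw-admin (a sketch; the bucket name is an example and the flags assume
an Octopus-era build):

# buckets currently queued for or undergoing resharding
radosgw-admin reshard list
# per-bucket resharding progress
radosgw-admin reshard status --bucket=my-bucket
# current shard count and object count for the bucket
radosgw-admin bucket stats --bucket=my-bucket
# dynamic resharding itself is governed by rgw_dynamic_resharding and
# rgw_max_objs_per_shard in the RGW configuration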

Thanks,
Satoru
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RBD as a boot image [was: libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated 131072, skipping]

2021-05-17 Thread Markus Kienast
On Mon, 17 May 2021 at 20:28, Nico Schottelius <nico.schottel...@ungleich.ch> wrote:

>
>
> Markus Kienast  writes:
>
> > Hi Nico,
> >
> > we are already doing exactly that:
> >
> > Loading initrd via iPXE
> > which contains the necessary modules and scripts to boot an RBD boot dev.
> > Works just fine.
>
> Interesting and very good to hear. How do you handle kernel differences
> (loaded kernel vs. modules in the RBD image)?
>

The way it works in LTSP is that you run a script provided by the ltsp
project, which copies the kernel to your tftpboot directory and sets the
iPXE boot options accordingly. Currently there is no automation for mapping
and mounting the RBD for this purpose; you have to map it by hand and mount
it at the proper directory prior to running the script.

I am sure you can integrate this into your workflow by using what is
currently in ltsp and my rbd branch as an example.
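
A minimal sketch of that manual step, assuming the new "ltsp" CLI and using
example pool/image/mount-point names (the exact script invocation may differ):

# map the image read-only and mount it where the LTSP script expects it
rbd map rbd/ltsp-image --read-only
mount -o ro /dev/rbd0 /srv/ltsp/ltsp-image
# let LTSP copy the kernel/initrd into tftpboot and refresh the iPXE menu
ltsp kernel ltsp-image
ltsp ipxe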


>
> > And Ilya just helped to work out the last show stopper, thanks again for
> > that!
> >
> > We are using a modified LTSP system for this.
> >
> > We have proposed some patches to LTSP to get the necessary facilities
> > upstream, but Alkis Georgopoulos first wants to see that there is enough
> > interest before he considers merging our patch or making the necessary
> > changes himself.
>
> I think seeing LTSP booting on RBD is a great move forward, also for
> other projects.
>
> > However the necessary initrd code is already available in this merge
> > request:
> >
> https://github.com/trickkiste/ltsp/blob/feature-boot_method-rbd/debian/ltsp-rbd.initramfs-script
> >
> > I see you are from Switzerland - neighbors!
>
> We might actually meet at a Linuxtag - but we should probably take this
> off-list :-)
>

In Vienna? Not been there for ages but happy to meet.

>
> > Out of interest, what are you planning to use this for? Servers, Thin/Fat
> > Clients?
>
> Our objective in the end is to boot servers and VMs from possibly the
> same RBD pool/images.
>
> The problem there, though, is that we don't know what is inside the RBD
> image, so we don't know which kernel to load, unless we do some kind of
> kexec magic that passes on the RBD parameters.
>
> Or in other words, we have this use case:
>
> - a customer books a VM and needs more performance
> - the customer decides to go with *a* server, but not necessarily a
>   specific server
> - the customer VM should be shut down and the RBD image should boot on a
> server
>
> If the server crashes, the OS should be booted on a different server.
>
> We can obviously work around this by *always* running a VM, but this is
> not exactly what our customers want. At the moment they use local disks
> + nfs shares to achieve a similar solution, but it is far from perfect.
>

Alright, I understand.
Yes, your scenario would work; you would only need to borrow the iPXE
stuff from LTSP plus my rbd initramfs-hook and initramfs-script.
The initramfs parts should actually be molded into a separate Ubuntu
package and made available upstream.
The read-only mapping support that Ilya provided still needs to be added,
and all the other available kernel options should be exposed as well.

My best regards
Markus


>
> Cheers,
>
> Nico
>
> --
> Sustainable and modern Infrastructures by ungleich.ch
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RBD as a boot image [was: libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated 131072, skipping]

2021-05-17 Thread Nico Schottelius



Markus Kienast  writes:

> Hi Nico,
>
> we are already doing exactly that:
>
> Loading initrd via iPXE
> which contains the necessary modules and scripts to boot an RBD boot dev.
> Works just fine.

Interesting and very good to hear. How do you handle kernel differences
(loaded kernel vs. modules in the RBD image)?

> And Ilya just helped to work out the last show stopper, thanks again for
> that!
>
> We are using a modified LTSP system for this.
>
> We have proposed some patches to LTSP to get the necessary facilities
> upstream, but Alkis Georgopoulos first wants to see that there is enough
> interest before he considers merging our patch or making the necessary
> changes himself.

I think seeing LTSP booting on RBD is a great move forward, also for
other projects.

> However the necessary initrd code is already available in this merge
> request:
> https://github.com/trickkiste/ltsp/blob/feature-boot_method-rbd/debian/ltsp-rbd.initramfs-script
>
> I see you are from Switzerland - neighbors!

We might actually meet at a Linuxtag - but we should probably take this
off-list :-)

> Out of interest, what are you planning to use this for? Servers, Thin/Fat
> Clients?

Our objective in the end is to boot servers and VMs from possibly the
same RBD pool/images.

The problem there, though, is that we don't know what is inside the RBD
image, so we don't know which kernel to load, unless we do some kind of
kexec magic that passes on the RBD parameters.
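
A hypothetical sketch of that kexec path, assuming the kernel and initrd can
be read from the mapped image and that the rbd initramfs script consumes the
parameters handed over on the command line (all names and paths below are
examples):

# peek into the image to grab its kernel/initrd
rbd map rbd/customer-image --read-only
mount -o ro /dev/rbd0 /mnt/img
# whatever the rbd initramfs script expects (mon hosts, pool/image, key, ...)
RBD_ARGS="..."
kexec -l /mnt/img/boot/vmlinuz --initrd=/mnt/img/boot/initrd.img \
      --command-line="ip=dhcp $RBD_ARGS"
umount /mnt/img && rbd unmap /dev/rbd0
# jump into the new kernel; its initramfs then maps the RBD as root
kexec -e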

Or in other words, we have this use case:

- a customer books a VM and needs more performance
- the customer decides to go with *a* server, but not necessarily a
  specific server
- the customer VM should be shut down and the RBD image should boot on a server

If the server crashes, the OS should be booted on a different server.

We can obviously work around this by *always* running a VM, but this is
not exactly what our customers want. At the moment they use local disks
+ nfs shares to achieve a similar solution, but it is far from perfect.

Cheers,

Nico

--
Sustainable and modern Infrastructures by ungleich.ch
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: After a huge amount of snaphot delete many snaptrim+snaptrim_wait pgs

2021-05-17 Thread Szabo, Istvan (Agoda)
Yes, but before I update I need to have a healthy cluster; I don't really want
to update while it is unhealthy and carry the issue over.

Istvan Szabo
Senior Infrastructure Engineer
---
Agoda Services Co., Ltd.
e: istvan.sz...@agoda.com
---

From: Konstantin Shalygin 
Sent: Sunday, May 16, 2021 4:59 PM
To: Szabo, Istvan (Agoda) 
Cc: Ceph Users 
Subject: Re: [ceph-users] After a huge amount of snaphot delete many 
snaptrim+snaptrim_wait pgs

Hi,


On 16 May 2021, at 04:22, Szabo, Istvan (Agoda) <istvan.sz...@agoda.com> wrote:

The cluster has 3 servers, running on luminous 12.2.8.

Again, this is an old and unsupported version of Ceph. Please upgrade at least
to 12.2.13.
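
A quick way to confirm what the daemons are actually running before and after
the upgrade (should work on Luminous and newer):

ceph versions             # per-daemon-type version summary
ceph tell osd.* version   # per-OSD detail, if needed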



k


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Pool has been deleted before snaptrim finished

2021-05-17 Thread Szabo, Istvan (Agoda)
Hi,

We decided to delete the pool after waiting 4 days, before the snaptrim
finished. Now we have a bigger issue: many OSDs have started to flap, and 2 of
them cannot even restart afterwards.

I ran bluestore fsck on the OSDs that did not start, and it shows many messages
like this:

2021-05-17 18:37:07.176203 7f416d20bec0 10 stupidalloc 0x0x564e4e804f50 
init_add_free 0x482d0778000~4000
2021-05-17 18:37:07.176204 7f416d20bec0 10 freelist enumerate_next 
0x482d0784000~4000
2021-05-17 18:37:07.176204 7f416d20bec0 10 stupidalloc 0x0x564e4e804f50 
init_add_free 0x482d0784000~4000
2021-05-17 18:37:07.176205 7f416d20bec0 10 freelist enumerate_next 
0x482d078c000~c000
2021-05-17 18:37:07.176206 7f416d20bec0 10 stupidalloc 0x0x564e4e804f50 
init_add_free 0x482d078c000~c000
[root@hk-cephosd-2002 ~]# tail -f /tmp/ceph-osd-44-fsck.log
2021-05-17 18:39:16.466967 7f416d20bec0 20 bluefs _read_random read buffered 
0x2cd6e8f~ed6 of 1:0x372e070+420
2021-05-17 18:39:16.467154 7f416d20bec0 20 bluefs _read_random got 3798
2021-05-17 18:39:16.467179 7f416d20bec0 10 bluefs _read_random h 0x564e4e658500 
0x24d6e35~ee2 from file(ino 216551 size 0x43a382d mtime 2021-05-17 
13:21:19.839668 bdev 1 allocated 440 extents [1:0x35bc7c0+440])
2021-05-17 18:39:16.467186 7f416d20bec0 20 bluefs _read_random read buffered 
0x24d6e35~ee2 of 1:0x35bc7c0+440
2021-05-17 18:39:16.467409 7f416d20bec0 20 bluefs _read_random got 3810

and

uh oh, missing shared_blob

I've set buffered_io back to false, because when restarting the OSDs I always
had to wait for the degraded PGs to be fixed.
Many of the SSDs are being hammered at 100% at the moment, and I don't really
know what to do to stop the process and bring back the 2 SSDs :/

Some paste: https://justpaste.it/9bj3a

Some metrics (each column is one server's metrics, 3 servers total):
How it is smashing the ssds: https://i.ibb.co/x3xm0Rj/ssds.png
IOWAIT Super high due to ssd utilization: https://i.ibb.co/683TR9y/iowait.png
Capacity seems coming back: https://i.ibb.co/mz4Lq2r/space.png

Thank you for the help.


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Pool has been deleted before snaptrim finished

2021-05-17 Thread Igor Fedotov
Highly likely you're facing a "degraded" RocksDB caused by bulk data
removal. This applies to both your original snaptrim issue and the
current flapping OSDs.


The following tickets refer to the snaptrim issue:
https://tracker.ceph.com/issues/50511 https://tracker.ceph.com/issues/47446



As a workaround you might want to compact the DB for every OSD using
ceph-kvstore-tool (offline) or the ceph admin daemon's compact command
(online).


Offline compaction is preferred, as the online one might be a no-op under
some circumstances.
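
A sketch of both variants for a single OSD (the OSD id and path are examples;
the offline form requires the OSD to be stopped first, and assumes a release
where ceph-kvstore-tool has the bluestore-kv backend):

# offline - preferred
systemctl stop ceph-osd@44
ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-44 compact
systemctl start ceph-osd@44

# online, via the admin socket (may be a no-op in some cases)
ceph daemon osd.44 compact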



Thanks,

Igor


On 5/17/2021 4:31 PM, Szabo, Istvan (Agoda) wrote:

Hi,

We decided to delete the pool after waiting 4 days, before the snaptrim
finished. Now we have a bigger issue: many OSDs have started to flap, and 2 of
them cannot even restart afterwards.

I ran bluestore fsck on the OSDs that did not start, and it shows many messages
like this:

2021-05-17 18:37:07.176203 7f416d20bec0 10 stupidalloc 0x0x564e4e804f50 
init_add_free 0x482d0778000~4000
2021-05-17 18:37:07.176204 7f416d20bec0 10 freelist enumerate_next 
0x482d0784000~4000
2021-05-17 18:37:07.176204 7f416d20bec0 10 stupidalloc 0x0x564e4e804f50 
init_add_free 0x482d0784000~4000
2021-05-17 18:37:07.176205 7f416d20bec0 10 freelist enumerate_next 
0x482d078c000~c000
2021-05-17 18:37:07.176206 7f416d20bec0 10 stupidalloc 0x0x564e4e804f50 
init_add_free 0x482d078c000~c000
[root@hk-cephosd-2002 ~]# tail -f /tmp/ceph-osd-44-fsck.log
2021-05-17 18:39:16.466967 7f416d20bec0 20 bluefs _read_random read buffered 
0x2cd6e8f~ed6 of 1:0x372e070+420
2021-05-17 18:39:16.467154 7f416d20bec0 20 bluefs _read_random got 3798
2021-05-17 18:39:16.467179 7f416d20bec0 10 bluefs _read_random h 0x564e4e658500 
0x24d6e35~ee2 from file(ino 216551 size 0x43a382d mtime 2021-05-17 
13:21:19.839668 bdev 1 allocated 440 extents [1:0x35bc7c0+440])
2021-05-17 18:39:16.467186 7f416d20bec0 20 bluefs _read_random read buffered 
0x24d6e35~ee2 of 1:0x35bc7c0+440
2021-05-17 18:39:16.467409 7f416d20bec0 20 bluefs _read_random got 3810

and

uh oh, missing shared_blob

I've set buffered_io back to false, because when restarting the OSDs I always
had to wait for the degraded PGs to be fixed.
Many of the SSDs are being hammered at 100% at the moment, and I don't really
know what to do to stop the process and bring back the 2 SSDs :/

Some paste: https://justpaste.it/9bj3a

Some metrics (each column is one server's metrics, 3 servers total):
How it is smashing the ssds: https://i.ibb.co/x3xm0Rj/ssds.png
IOWAIT Super high due to ssd utilization: https://i.ibb.co/683TR9y/iowait.png
Capacity seems coming back: https://i.ibb.co/mz4Lq2r/space.png

Thank you for the help.


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-17 Thread Boris Behrens
I have no idea why, but it worked.

As the fsck went well, I just redid the bluefs-bdev-new-db, and now the OSD
is back up with a block.db device.
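
A couple of ways to double-check that the OSD is really using the new block.db
(a sketch; the OSD id is an example):

# the lvm inventory should now show a [db] device for osd.68
ceph-volume lvm list
# the osd metadata should reference the separate DB device
ceph osd metadata 68 | grep -i bluefs_db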

Thanks a lot

On Mon, 17 May 2021 at 15:28, Igor Fedotov wrote:

> If you haven't had a successful OSD.68 start with the standalone DB, I think
> it's safe to revert the previous DB addition and just retry it.
>
> At first I suggest running the bluefs-bdev-new-db command only and then doing
> fsck again. If it's OK - proceed with bluefs migrate followed by another
> fsck. And then finalize with adding the lvm tags and OSD activation.
>
>
> Thanks,
>
> Igor
>
> On 5/17/2021 3:47 PM, Boris Behrens wrote:
> > The FSCK looks good:
> >
> > [root@s3db10 export-bluefs2]# ceph-bluestore-tool --path
> > /var/lib/ceph/osd/ceph-68  fsck
> > fsck success
> >
> > On Mon, 17 May 2021 at 14:39, Boris Behrens wrote:
> >
> >> Here is the new output. I kept both for now.
> >>
> >> [root@s3db10 export-bluefs2]# ls *
> >> db:
> >> 018215.sst  018444.sst  018839.sst  019074.sst  019210.sst  019381.sst
> >>   019560.sst  019755.sst  019849.sst  019888.sst  019958.sst  019995.sst
> >>   020007.sst  020042.sst  020067.sst  020098.sst  020115.sst
> >> 018216.sst  018445.sst  018840.sst  019075.sst  019211.sst  019382.sst
> >>   019670.sst  019756.sst  019877.sst  019889.sst  019959.sst  019996.sst
> >>   020008.sst  020043.sst  020068.sst  020104.sst  CURRENT
> >> 018273.sst  018446.sst  018876.sst  019076.sst  019256.sst  019383.sst
> >>   019671.sst  019757.sst  019878.sst  019890.sst  019960.sst  019997.sst
> >>   020030.sst  020055.sst  020069.sst  020105.sst  IDENTITY
> >> 018300.sst  018447.sst  018877.sst  019081.sst  019257.sst  019395.sst
> >>   019672.sst  019762.sst  019879.sst  019918.sst  019961.sst  019998.sst
> >>   020031.sst  020056.sst  020070.sst  020106.sst  LOCK
> >> 018301.sst  018448.sst  018904.sst  019082.sst  019344.sst  019396.sst
> >>   019673.sst  019763.sst  019880.sst  019919.sst  019962.sst  01.sst
> >>   020032.sst  020057.sst  020071.sst  020107.sst  MANIFEST-020084
> >> 018326.sst  018449.sst  018950.sst  019083.sst  019345.sst  019400.sst
> >>   019674.sst  019764.sst  019881.sst  019920.sst  019963.sst  02.sst
> >>   020035.sst  020058.sst  020072.sst  020108.sst  OPTIONS-020084
> >> 018327.sst  018540.sst  018952.sst  019126.sst  019346.sst  019470.sst
> >>   019675.sst  019765.sst  019882.sst  019921.sst  019964.sst  020001.sst
> >>   020036.sst  020059.sst  020073.sst  020109.sst  OPTIONS-020087
> >> 018328.sst  018541.sst  018953.sst  019127.sst  019370.sst  019471.sst
> >>   019676.sst  019766.sst  019883.sst  019922.sst  019965.sst  020002.sst
> >>   020037.sst  020060.sst  020074.sst  020110.sst
> >> 018329.sst  018590.sst  018954.sst  019128.sst  019371.sst  019472.sst
> >>   019677.sst  019845.sst  019884.sst  019923.sst  019989.sst  020003.sst
> >>   020038.sst  020061.sst  020075.sst  020111.sst
> >> 018406.sst  018591.sst  018995.sst  019174.sst  019372.sst  019473.sst
> >>   019678.sst  019846.sst  019885.sst  019950.sst  019992.sst  020004.sst
> >>   020039.sst  020062.sst  020094.sst  020112.sst
> >> 018407.sst  018727.sst  018996.sst  019175.sst  019373.sst  019474.sst
> >>   019753.sst  019847.sst  019886.sst  019955.sst  019993.sst  020005.sst
> >>   020040.sst  020063.sst  020095.sst  020113.sst
> >> 018443.sst  018728.sst  019073.sst  019176.sst  019380.sst  019475.sst
> >>   019754.sst  019848.sst  019887.sst  019956.sst  019994.sst  020006.sst
> >>   020041.sst  020064.sst  020096.sst  020114.sst
> >>
> >> db.slow:
> >>
> >> db.wal:
> >> 020085.log  020088.log
> >> [root@s3db10 export-bluefs2]# du -hs
> >> 12G .
> >> [root@s3db10 export-bluefs2]# cat db/CURRENT
> >> MANIFEST-020084
> >>
> >> On Mon, 17 May 2021 at 14:28, Igor Fedotov <ifedo...@suse.de> wrote:
> >>
> >>> On 5/17/2021 2:53 PM, Boris Behrens wrote:
>  Like this?
> >>> Yeah.
> >>>
> >> so the DB dir structure is more or less OK, but db/CURRENT looks
> >> corrupted. It should contain something like: MANIFEST-020081
> >>
> >> Could you please remove (or even just rename) the block.db symlink and do
> >> the export again? Preferably preserve the results from the first export.
> >>>
> >>> if export reveals proper CURRENT content - you might want to run fsck
> on
> >>> the OSD...
> >>>
>  [root@s3db10 export-bluefs]# ls *
>  db:
>  018215.sst  018444.sst  018839.sst  019074.sst  019174.sst  019372.sst
> 019470.sst  019675.sst  019765.sst  019882.sst  019918.sst
> 019961.sst
> 019997.sst  020022.sst  020042.sst  020061.sst  020073.sst
>  018216.sst  018445.sst  018840.sst  019075.sst  019175.sst  019373.sst
> 019471.sst  019676.sst  019766.sst  019883.sst  019919.sst
> 019962.sst
> 019998.sst  020023.sst  020043.sst  020062.sst  020074.sst
>  018273.sst  018446.sst  018876.sst  019076.sst  019176.sst  019380.sst
> 019472.sst  019677.sst  019845.sst  019884.sst  019920.sst
> 019963.sst
> 

[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-17 Thread Igor Fedotov
If you haven't had a successful OSD.68 start with the standalone DB, I think
it's safe to revert the previous DB addition and just retry it.


At first I suggest running the bluefs-bdev-new-db command only and then doing
fsck again. If it's OK - proceed with bluefs migrate followed by another
fsck. And then finalize with adding the lvm tags and OSD activation.
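
Roughly, based on the commands already used in this thread (device paths, OSD
id and DB size are examples; double-check them against your own setup):

# 1) attach a new standalone DB device
CEPH_ARGS="--bluestore-block-db-size 53687091200 --bluestore_block_db_create=true" \
  ceph-bluestore-tool bluefs-bdev-new-db --path /var/lib/ceph/osd/ceph-68 --dev-target /dev/sdj1
# 2) verify
ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-68 fsck
# 3) migrate the existing DB data from the main device to the new block.db
ceph-bluestore-tool bluefs-bdev-migrate --path /var/lib/ceph/osd/ceph-68 \
  --devs-source /var/lib/ceph/osd/ceph-68/block --dev-target /var/lib/ceph/osd/ceph-68/block.db
# 4) verify again
ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-68 fsck
# 5) add the lvm tags (ceph.db_device / ceph.db_uuid) on the OSD's block LV
#    and activate, as in the earlier lvchange / ceph-volume commands
ceph-volume lvm activate --all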



Thanks,

Igor

On 5/17/2021 3:47 PM, Boris Behrens wrote:

The FSCK looks good:

[root@s3db10 export-bluefs2]# ceph-bluestore-tool --path
/var/lib/ceph/osd/ceph-68  fsck
fsck success

On Mon, 17 May 2021 at 14:39, Boris Behrens wrote:


Here is the new output. I kept both for now.

[root@s3db10 export-bluefs2]# ls *
db:
018215.sst  018444.sst  018839.sst  019074.sst  019210.sst  019381.sst
  019560.sst  019755.sst  019849.sst  019888.sst  019958.sst  019995.sst
  020007.sst  020042.sst  020067.sst  020098.sst  020115.sst
018216.sst  018445.sst  018840.sst  019075.sst  019211.sst  019382.sst
  019670.sst  019756.sst  019877.sst  019889.sst  019959.sst  019996.sst
  020008.sst  020043.sst  020068.sst  020104.sst  CURRENT
018273.sst  018446.sst  018876.sst  019076.sst  019256.sst  019383.sst
  019671.sst  019757.sst  019878.sst  019890.sst  019960.sst  019997.sst
  020030.sst  020055.sst  020069.sst  020105.sst  IDENTITY
018300.sst  018447.sst  018877.sst  019081.sst  019257.sst  019395.sst
  019672.sst  019762.sst  019879.sst  019918.sst  019961.sst  019998.sst
  020031.sst  020056.sst  020070.sst  020106.sst  LOCK
018301.sst  018448.sst  018904.sst  019082.sst  019344.sst  019396.sst
  019673.sst  019763.sst  019880.sst  019919.sst  019962.sst  01.sst
  020032.sst  020057.sst  020071.sst  020107.sst  MANIFEST-020084
018326.sst  018449.sst  018950.sst  019083.sst  019345.sst  019400.sst
  019674.sst  019764.sst  019881.sst  019920.sst  019963.sst  02.sst
  020035.sst  020058.sst  020072.sst  020108.sst  OPTIONS-020084
018327.sst  018540.sst  018952.sst  019126.sst  019346.sst  019470.sst
  019675.sst  019765.sst  019882.sst  019921.sst  019964.sst  020001.sst
  020036.sst  020059.sst  020073.sst  020109.sst  OPTIONS-020087
018328.sst  018541.sst  018953.sst  019127.sst  019370.sst  019471.sst
  019676.sst  019766.sst  019883.sst  019922.sst  019965.sst  020002.sst
  020037.sst  020060.sst  020074.sst  020110.sst
018329.sst  018590.sst  018954.sst  019128.sst  019371.sst  019472.sst
  019677.sst  019845.sst  019884.sst  019923.sst  019989.sst  020003.sst
  020038.sst  020061.sst  020075.sst  020111.sst
018406.sst  018591.sst  018995.sst  019174.sst  019372.sst  019473.sst
  019678.sst  019846.sst  019885.sst  019950.sst  019992.sst  020004.sst
  020039.sst  020062.sst  020094.sst  020112.sst
018407.sst  018727.sst  018996.sst  019175.sst  019373.sst  019474.sst
  019753.sst  019847.sst  019886.sst  019955.sst  019993.sst  020005.sst
  020040.sst  020063.sst  020095.sst  020113.sst
018443.sst  018728.sst  019073.sst  019176.sst  019380.sst  019475.sst
  019754.sst  019848.sst  019887.sst  019956.sst  019994.sst  020006.sst
  020041.sst  020064.sst  020096.sst  020114.sst

db.slow:

db.wal:
020085.log  020088.log
[root@s3db10 export-bluefs2]# du -hs
12G .
[root@s3db10 export-bluefs2]# cat db/CURRENT
MANIFEST-020084

On Mon, 17 May 2021 at 14:28, Igor Fedotov wrote:


On 5/17/2021 2:53 PM, Boris Behrens wrote:

Like this?

Yeah.

so the DB dir structure is more or less OK, but db/CURRENT looks corrupted. It
should contain something like: MANIFEST-020081

Could you please remove (or even just rename) the block.db symlink and do
the export again? Preferably preserve the results from the first export.

if export reveals proper CURRENT content - you might want to run fsck on
the OSD...


[root@s3db10 export-bluefs]# ls *
db:
018215.sst  018444.sst  018839.sst  019074.sst  019174.sst  019372.sst
   019470.sst  019675.sst  019765.sst  019882.sst  019918.sst  019961.sst
   019997.sst  020022.sst  020042.sst  020061.sst  020073.sst
018216.sst  018445.sst  018840.sst  019075.sst  019175.sst  019373.sst
   019471.sst  019676.sst  019766.sst  019883.sst  019919.sst  019962.sst
   019998.sst  020023.sst  020043.sst  020062.sst  020074.sst
018273.sst  018446.sst  018876.sst  019076.sst  019176.sst  019380.sst
   019472.sst  019677.sst  019845.sst  019884.sst  019920.sst  019963.sst
   01.sst  020030.sst  020049.sst  020063.sst  020075.sst
018300.sst  018447.sst  018877.sst  019077.sst  019210.sst  019381.sst
   019473.sst  019678.sst  019846.sst  019885.sst  019921.sst  019964.sst
   02.sst  020031.sst  020051.sst  020064.sst  020077.sst
018301.sst  018448.sst  018904.sst  019081.sst  019211.sst  019382.sst
   019474.sst  019753.sst  019847.sst  019886.sst  019922.sst  019965.sst
   020001.sst  020032.sst  020052.sst  020065.sst  020080.sst
018326.sst  018449.sst  018950.sst  019082.sst  019256.sst  019383.sst
   019475.sst  019754.sst  019848.sst  019887.sst  019923.sst  019986.sst
   020002.sst  020035.sst  020053.sst  020066.sst  CURRENT
018327.sst  018540.sst  

[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-17 Thread Boris Behrens
See my last mail :)

On Mon, 17 May 2021 at 14:52, Igor Fedotov wrote:

> Would you try fsck without standalone DB?
>
> On 5/17/2021 3:39 PM, Boris Behrens wrote:
> > Here is the new output. I kept both for now.
> >
> > [root@s3db10 export-bluefs2]# ls *
> > db:
> > 018215.sst  018444.sst  018839.sst  019074.sst  019210.sst  019381.sst
> >   019560.sst  019755.sst  019849.sst  019888.sst  019958.sst  019995.sst
> >   020007.sst  020042.sst  020067.sst  020098.sst  020115.sst
> > 018216.sst  018445.sst  018840.sst  019075.sst  019211.sst  019382.sst
> >   019670.sst  019756.sst  019877.sst  019889.sst  019959.sst  019996.sst
> >   020008.sst  020043.sst  020068.sst  020104.sst  CURRENT
> > 018273.sst  018446.sst  018876.sst  019076.sst  019256.sst  019383.sst
> >   019671.sst  019757.sst  019878.sst  019890.sst  019960.sst  019997.sst
> >   020030.sst  020055.sst  020069.sst  020105.sst  IDENTITY
> > 018300.sst  018447.sst  018877.sst  019081.sst  019257.sst  019395.sst
> >   019672.sst  019762.sst  019879.sst  019918.sst  019961.sst  019998.sst
> >   020031.sst  020056.sst  020070.sst  020106.sst  LOCK
> > 018301.sst  018448.sst  018904.sst  019082.sst  019344.sst  019396.sst
> >   019673.sst  019763.sst  019880.sst  019919.sst  019962.sst  01.sst
> >   020032.sst  020057.sst  020071.sst  020107.sst  MANIFEST-020084
> > 018326.sst  018449.sst  018950.sst  019083.sst  019345.sst  019400.sst
> >   019674.sst  019764.sst  019881.sst  019920.sst  019963.sst  02.sst
> >   020035.sst  020058.sst  020072.sst  020108.sst  OPTIONS-020084
> > 018327.sst  018540.sst  018952.sst  019126.sst  019346.sst  019470.sst
> >   019675.sst  019765.sst  019882.sst  019921.sst  019964.sst  020001.sst
> >   020036.sst  020059.sst  020073.sst  020109.sst  OPTIONS-020087
> > 018328.sst  018541.sst  018953.sst  019127.sst  019370.sst  019471.sst
> >   019676.sst  019766.sst  019883.sst  019922.sst  019965.sst  020002.sst
> >   020037.sst  020060.sst  020074.sst  020110.sst
> > 018329.sst  018590.sst  018954.sst  019128.sst  019371.sst  019472.sst
> >   019677.sst  019845.sst  019884.sst  019923.sst  019989.sst  020003.sst
> >   020038.sst  020061.sst  020075.sst  020111.sst
> > 018406.sst  018591.sst  018995.sst  019174.sst  019372.sst  019473.sst
> >   019678.sst  019846.sst  019885.sst  019950.sst  019992.sst  020004.sst
> >   020039.sst  020062.sst  020094.sst  020112.sst
> > 018407.sst  018727.sst  018996.sst  019175.sst  019373.sst  019474.sst
> >   019753.sst  019847.sst  019886.sst  019955.sst  019993.sst  020005.sst
> >   020040.sst  020063.sst  020095.sst  020113.sst
> > 018443.sst  018728.sst  019073.sst  019176.sst  019380.sst  019475.sst
> >   019754.sst  019848.sst  019887.sst  019956.sst  019994.sst  020006.sst
> >   020041.sst  020064.sst  020096.sst  020114.sst
> >
> > db.slow:
> >
> > db.wal:
> > 020085.log  020088.log
> > [root@s3db10 export-bluefs2]# du -hs
> > 12G .
> > [root@s3db10 export-bluefs2]# cat db/CURRENT
> > MANIFEST-020084
> >
> > On Mon, 17 May 2021 at 14:28, Igor Fedotov wrote:
> >
> >> On 5/17/2021 2:53 PM, Boris Behrens wrote:
> >>> Like this?
> >> Yeah.
> >>
> >> so the DB dir structure is more or less OK, but db/CURRENT looks corrupted.
> >> It should contain something like: MANIFEST-020081
> >>
> >> Could you please remove (or even just rename) the block.db symlink and do
> >> the export again? Preferably preserve the results from the first export.
> >>
> >> if export reveals proper CURRENT content - you might want to run fsck on
> >> the OSD...
> >>
> >>> [root@s3db10 export-bluefs]# ls *
> >>> db:
> >>> 018215.sst  018444.sst  018839.sst  019074.sst  019174.sst  019372.sst
> >>>019470.sst  019675.sst  019765.sst  019882.sst  019918.sst
> 019961.sst
> >>>019997.sst  020022.sst  020042.sst  020061.sst  020073.sst
> >>> 018216.sst  018445.sst  018840.sst  019075.sst  019175.sst  019373.sst
> >>>019471.sst  019676.sst  019766.sst  019883.sst  019919.sst
> 019962.sst
> >>>019998.sst  020023.sst  020043.sst  020062.sst  020074.sst
> >>> 018273.sst  018446.sst  018876.sst  019076.sst  019176.sst  019380.sst
> >>>019472.sst  019677.sst  019845.sst  019884.sst  019920.sst
> 019963.sst
> >>>01.sst  020030.sst  020049.sst  020063.sst  020075.sst
> >>> 018300.sst  018447.sst  018877.sst  019077.sst  019210.sst  019381.sst
> >>>019473.sst  019678.sst  019846.sst  019885.sst  019921.sst
> 019964.sst
> >>>02.sst  020031.sst  020051.sst  020064.sst  020077.sst
> >>> 018301.sst  018448.sst  018904.sst  019081.sst  019211.sst  019382.sst
> >>>019474.sst  019753.sst  019847.sst  019886.sst  019922.sst
> 019965.sst
> >>>020001.sst  020032.sst  020052.sst  020065.sst  020080.sst
> >>> 018326.sst  018449.sst  018950.sst  019082.sst  019256.sst  019383.sst
> >>>019475.sst  019754.sst  019848.sst  019887.sst  019923.sst
> 019986.sst
> >>>020002.sst  020035.sst  020053.sst  020066.sst  CURRENT
> >>> 018327.sst  018540.sst  018952.sst  

[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-17 Thread Boris Behrens
The FSCK looks good:

[root@s3db10 export-bluefs2]# ceph-bluestore-tool --path
/var/lib/ceph/osd/ceph-68  fsck
fsck success

On Mon, 17 May 2021 at 14:39, Boris Behrens wrote:

> Here is the new output. I kept both for now.
>
> [root@s3db10 export-bluefs2]# ls *
> db:
> 018215.sst  018444.sst  018839.sst  019074.sst  019210.sst  019381.sst
>  019560.sst  019755.sst  019849.sst  019888.sst  019958.sst  019995.sst
>  020007.sst  020042.sst  020067.sst  020098.sst  020115.sst
> 018216.sst  018445.sst  018840.sst  019075.sst  019211.sst  019382.sst
>  019670.sst  019756.sst  019877.sst  019889.sst  019959.sst  019996.sst
>  020008.sst  020043.sst  020068.sst  020104.sst  CURRENT
> 018273.sst  018446.sst  018876.sst  019076.sst  019256.sst  019383.sst
>  019671.sst  019757.sst  019878.sst  019890.sst  019960.sst  019997.sst
>  020030.sst  020055.sst  020069.sst  020105.sst  IDENTITY
> 018300.sst  018447.sst  018877.sst  019081.sst  019257.sst  019395.sst
>  019672.sst  019762.sst  019879.sst  019918.sst  019961.sst  019998.sst
>  020031.sst  020056.sst  020070.sst  020106.sst  LOCK
> 018301.sst  018448.sst  018904.sst  019082.sst  019344.sst  019396.sst
>  019673.sst  019763.sst  019880.sst  019919.sst  019962.sst  01.sst
>  020032.sst  020057.sst  020071.sst  020107.sst  MANIFEST-020084
> 018326.sst  018449.sst  018950.sst  019083.sst  019345.sst  019400.sst
>  019674.sst  019764.sst  019881.sst  019920.sst  019963.sst  02.sst
>  020035.sst  020058.sst  020072.sst  020108.sst  OPTIONS-020084
> 018327.sst  018540.sst  018952.sst  019126.sst  019346.sst  019470.sst
>  019675.sst  019765.sst  019882.sst  019921.sst  019964.sst  020001.sst
>  020036.sst  020059.sst  020073.sst  020109.sst  OPTIONS-020087
> 018328.sst  018541.sst  018953.sst  019127.sst  019370.sst  019471.sst
>  019676.sst  019766.sst  019883.sst  019922.sst  019965.sst  020002.sst
>  020037.sst  020060.sst  020074.sst  020110.sst
> 018329.sst  018590.sst  018954.sst  019128.sst  019371.sst  019472.sst
>  019677.sst  019845.sst  019884.sst  019923.sst  019989.sst  020003.sst
>  020038.sst  020061.sst  020075.sst  020111.sst
> 018406.sst  018591.sst  018995.sst  019174.sst  019372.sst  019473.sst
>  019678.sst  019846.sst  019885.sst  019950.sst  019992.sst  020004.sst
>  020039.sst  020062.sst  020094.sst  020112.sst
> 018407.sst  018727.sst  018996.sst  019175.sst  019373.sst  019474.sst
>  019753.sst  019847.sst  019886.sst  019955.sst  019993.sst  020005.sst
>  020040.sst  020063.sst  020095.sst  020113.sst
> 018443.sst  018728.sst  019073.sst  019176.sst  019380.sst  019475.sst
>  019754.sst  019848.sst  019887.sst  019956.sst  019994.sst  020006.sst
>  020041.sst  020064.sst  020096.sst  020114.sst
>
> db.slow:
>
> db.wal:
> 020085.log  020088.log
> [root@s3db10 export-bluefs2]# du -hs
> 12G .
> [root@s3db10 export-bluefs2]# cat db/CURRENT
> MANIFEST-020084
>
> On Mon, 17 May 2021 at 14:28, Igor Fedotov wrote:
>
>> On 5/17/2021 2:53 PM, Boris Behrens wrote:
>> > Like this?
>>
>> Yeah.
>>
>> so the DB dir structure is more or less OK, but db/CURRENT looks corrupted.
>> It should contain something like: MANIFEST-020081
>>
>> Could you please remove (or even just rename) the block.db symlink and do
>> the export again? Preferably preserve the results from the first export.
>>
>> if export reveals proper CURRENT content - you might want to run fsck on
>> the OSD...
>>
>> >
>> > [root@s3db10 export-bluefs]# ls *
>> > db:
>> > 018215.sst  018444.sst  018839.sst  019074.sst  019174.sst  019372.sst
>> >   019470.sst  019675.sst  019765.sst  019882.sst  019918.sst  019961.sst
>> >   019997.sst  020022.sst  020042.sst  020061.sst  020073.sst
>> > 018216.sst  018445.sst  018840.sst  019075.sst  019175.sst  019373.sst
>> >   019471.sst  019676.sst  019766.sst  019883.sst  019919.sst  019962.sst
>> >   019998.sst  020023.sst  020043.sst  020062.sst  020074.sst
>> > 018273.sst  018446.sst  018876.sst  019076.sst  019176.sst  019380.sst
>> >   019472.sst  019677.sst  019845.sst  019884.sst  019920.sst  019963.sst
>> >   01.sst  020030.sst  020049.sst  020063.sst  020075.sst
>> > 018300.sst  018447.sst  018877.sst  019077.sst  019210.sst  019381.sst
>> >   019473.sst  019678.sst  019846.sst  019885.sst  019921.sst  019964.sst
>> >   02.sst  020031.sst  020051.sst  020064.sst  020077.sst
>> > 018301.sst  018448.sst  018904.sst  019081.sst  019211.sst  019382.sst
>> >   019474.sst  019753.sst  019847.sst  019886.sst  019922.sst  019965.sst
>> >   020001.sst  020032.sst  020052.sst  020065.sst  020080.sst
>> > 018326.sst  018449.sst  018950.sst  019082.sst  019256.sst  019383.sst
>> >   019475.sst  019754.sst  019848.sst  019887.sst  019923.sst  019986.sst
>> >   020002.sst  020035.sst  020053.sst  020066.sst  CURRENT
>> > 018327.sst  018540.sst  018952.sst  019083.sst  019257.sst  019395.sst
>> >   019560.sst  019755.sst  019849.sst  019888.sst  019950.sst  019989.sst
>> >   020003.sst  020036.sst  020055.sst  020067.sst  IDENTITY
>> > 

[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-17 Thread Igor Fedotov

Would you try fsck without the standalone DB?

On 5/17/2021 3:39 PM, Boris Behrens wrote:

Here is the new output. I kept both for now.

[root@s3db10 export-bluefs2]# ls *
db:
018215.sst  018444.sst  018839.sst  019074.sst  019210.sst  019381.sst
  019560.sst  019755.sst  019849.sst  019888.sst  019958.sst  019995.sst
  020007.sst  020042.sst  020067.sst  020098.sst  020115.sst
018216.sst  018445.sst  018840.sst  019075.sst  019211.sst  019382.sst
  019670.sst  019756.sst  019877.sst  019889.sst  019959.sst  019996.sst
  020008.sst  020043.sst  020068.sst  020104.sst  CURRENT
018273.sst  018446.sst  018876.sst  019076.sst  019256.sst  019383.sst
  019671.sst  019757.sst  019878.sst  019890.sst  019960.sst  019997.sst
  020030.sst  020055.sst  020069.sst  020105.sst  IDENTITY
018300.sst  018447.sst  018877.sst  019081.sst  019257.sst  019395.sst
  019672.sst  019762.sst  019879.sst  019918.sst  019961.sst  019998.sst
  020031.sst  020056.sst  020070.sst  020106.sst  LOCK
018301.sst  018448.sst  018904.sst  019082.sst  019344.sst  019396.sst
  019673.sst  019763.sst  019880.sst  019919.sst  019962.sst  01.sst
  020032.sst  020057.sst  020071.sst  020107.sst  MANIFEST-020084
018326.sst  018449.sst  018950.sst  019083.sst  019345.sst  019400.sst
  019674.sst  019764.sst  019881.sst  019920.sst  019963.sst  02.sst
  020035.sst  020058.sst  020072.sst  020108.sst  OPTIONS-020084
018327.sst  018540.sst  018952.sst  019126.sst  019346.sst  019470.sst
  019675.sst  019765.sst  019882.sst  019921.sst  019964.sst  020001.sst
  020036.sst  020059.sst  020073.sst  020109.sst  OPTIONS-020087
018328.sst  018541.sst  018953.sst  019127.sst  019370.sst  019471.sst
  019676.sst  019766.sst  019883.sst  019922.sst  019965.sst  020002.sst
  020037.sst  020060.sst  020074.sst  020110.sst
018329.sst  018590.sst  018954.sst  019128.sst  019371.sst  019472.sst
  019677.sst  019845.sst  019884.sst  019923.sst  019989.sst  020003.sst
  020038.sst  020061.sst  020075.sst  020111.sst
018406.sst  018591.sst  018995.sst  019174.sst  019372.sst  019473.sst
  019678.sst  019846.sst  019885.sst  019950.sst  019992.sst  020004.sst
  020039.sst  020062.sst  020094.sst  020112.sst
018407.sst  018727.sst  018996.sst  019175.sst  019373.sst  019474.sst
  019753.sst  019847.sst  019886.sst  019955.sst  019993.sst  020005.sst
  020040.sst  020063.sst  020095.sst  020113.sst
018443.sst  018728.sst  019073.sst  019176.sst  019380.sst  019475.sst
  019754.sst  019848.sst  019887.sst  019956.sst  019994.sst  020006.sst
  020041.sst  020064.sst  020096.sst  020114.sst

db.slow:

db.wal:
020085.log  020088.log
[root@s3db10 export-bluefs2]# du -hs
12G .
[root@s3db10 export-bluefs2]# cat db/CURRENT
MANIFEST-020084

On Mon, 17 May 2021 at 14:28, Igor Fedotov wrote:


On 5/17/2021 2:53 PM, Boris Behrens wrote:

Like this?

Yeah.

so the DB dir structure is more or less OK, but db/CURRENT looks corrupted. It
should contain something like: MANIFEST-020081

Could you please remove (or even just rename) the block.db symlink and do the
export again? Preferably preserve the results from the first export.

if export reveals proper CURRENT content - you might want to run fsck on
the OSD...


[root@s3db10 export-bluefs]# ls *
db:
018215.sst  018444.sst  018839.sst  019074.sst  019174.sst  019372.sst
   019470.sst  019675.sst  019765.sst  019882.sst  019918.sst  019961.sst
   019997.sst  020022.sst  020042.sst  020061.sst  020073.sst
018216.sst  018445.sst  018840.sst  019075.sst  019175.sst  019373.sst
   019471.sst  019676.sst  019766.sst  019883.sst  019919.sst  019962.sst
   019998.sst  020023.sst  020043.sst  020062.sst  020074.sst
018273.sst  018446.sst  018876.sst  019076.sst  019176.sst  019380.sst
   019472.sst  019677.sst  019845.sst  019884.sst  019920.sst  019963.sst
   01.sst  020030.sst  020049.sst  020063.sst  020075.sst
018300.sst  018447.sst  018877.sst  019077.sst  019210.sst  019381.sst
   019473.sst  019678.sst  019846.sst  019885.sst  019921.sst  019964.sst
   02.sst  020031.sst  020051.sst  020064.sst  020077.sst
018301.sst  018448.sst  018904.sst  019081.sst  019211.sst  019382.sst
   019474.sst  019753.sst  019847.sst  019886.sst  019922.sst  019965.sst
   020001.sst  020032.sst  020052.sst  020065.sst  020080.sst
018326.sst  018449.sst  018950.sst  019082.sst  019256.sst  019383.sst
   019475.sst  019754.sst  019848.sst  019887.sst  019923.sst  019986.sst
   020002.sst  020035.sst  020053.sst  020066.sst  CURRENT
018327.sst  018540.sst  018952.sst  019083.sst  019257.sst  019395.sst
   019560.sst  019755.sst  019849.sst  019888.sst  019950.sst  019989.sst
   020003.sst  020036.sst  020055.sst  020067.sst  IDENTITY
018328.sst  018541.sst  018953.sst  019124.sst  019344.sst  019396.sst
   019670.sst  019756.sst  019877.sst  019889.sst  019955.sst  019992.sst
   020004.sst  020037.sst  020056.sst  020068.sst  LOCK
018329.sst  018590.sst  018954.sst  019125.sst  019345.sst  019400.sst
   019671.sst  019757.sst  019878.sst  

[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-17 Thread Boris Behrens
Here is the new output. I kept both for now.

[root@s3db10 export-bluefs2]# ls *
db:
018215.sst  018444.sst  018839.sst  019074.sst  019210.sst  019381.sst
 019560.sst  019755.sst  019849.sst  019888.sst  019958.sst  019995.sst
 020007.sst  020042.sst  020067.sst  020098.sst  020115.sst
018216.sst  018445.sst  018840.sst  019075.sst  019211.sst  019382.sst
 019670.sst  019756.sst  019877.sst  019889.sst  019959.sst  019996.sst
 020008.sst  020043.sst  020068.sst  020104.sst  CURRENT
018273.sst  018446.sst  018876.sst  019076.sst  019256.sst  019383.sst
 019671.sst  019757.sst  019878.sst  019890.sst  019960.sst  019997.sst
 020030.sst  020055.sst  020069.sst  020105.sst  IDENTITY
018300.sst  018447.sst  018877.sst  019081.sst  019257.sst  019395.sst
 019672.sst  019762.sst  019879.sst  019918.sst  019961.sst  019998.sst
 020031.sst  020056.sst  020070.sst  020106.sst  LOCK
018301.sst  018448.sst  018904.sst  019082.sst  019344.sst  019396.sst
 019673.sst  019763.sst  019880.sst  019919.sst  019962.sst  01.sst
 020032.sst  020057.sst  020071.sst  020107.sst  MANIFEST-020084
018326.sst  018449.sst  018950.sst  019083.sst  019345.sst  019400.sst
 019674.sst  019764.sst  019881.sst  019920.sst  019963.sst  02.sst
 020035.sst  020058.sst  020072.sst  020108.sst  OPTIONS-020084
018327.sst  018540.sst  018952.sst  019126.sst  019346.sst  019470.sst
 019675.sst  019765.sst  019882.sst  019921.sst  019964.sst  020001.sst
 020036.sst  020059.sst  020073.sst  020109.sst  OPTIONS-020087
018328.sst  018541.sst  018953.sst  019127.sst  019370.sst  019471.sst
 019676.sst  019766.sst  019883.sst  019922.sst  019965.sst  020002.sst
 020037.sst  020060.sst  020074.sst  020110.sst
018329.sst  018590.sst  018954.sst  019128.sst  019371.sst  019472.sst
 019677.sst  019845.sst  019884.sst  019923.sst  019989.sst  020003.sst
 020038.sst  020061.sst  020075.sst  020111.sst
018406.sst  018591.sst  018995.sst  019174.sst  019372.sst  019473.sst
 019678.sst  019846.sst  019885.sst  019950.sst  019992.sst  020004.sst
 020039.sst  020062.sst  020094.sst  020112.sst
018407.sst  018727.sst  018996.sst  019175.sst  019373.sst  019474.sst
 019753.sst  019847.sst  019886.sst  019955.sst  019993.sst  020005.sst
 020040.sst  020063.sst  020095.sst  020113.sst
018443.sst  018728.sst  019073.sst  019176.sst  019380.sst  019475.sst
 019754.sst  019848.sst  019887.sst  019956.sst  019994.sst  020006.sst
 020041.sst  020064.sst  020096.sst  020114.sst

db.slow:

db.wal:
020085.log  020088.log
[root@s3db10 export-bluefs2]# du -hs
12G .
[root@s3db10 export-bluefs2]# cat db/CURRENT
MANIFEST-020084

On Mon, 17 May 2021 at 14:28, Igor Fedotov wrote:

> On 5/17/2021 2:53 PM, Boris Behrens wrote:
> > Like this?
>
> Yeah.
>
> so the DB dir structure is more or less OK, but db/CURRENT looks corrupted. It
> should contain something like: MANIFEST-020081
>
> Could you please remove (or even just rename) the block.db symlink and do the
> export again? Preferably preserve the results from the first export.
>
> if export reveals proper CURRENT content - you might want to run fsck on
> the OSD...
>
> >
> > [root@s3db10 export-bluefs]# ls *
> > db:
> > 018215.sst  018444.sst  018839.sst  019074.sst  019174.sst  019372.sst
> >   019470.sst  019675.sst  019765.sst  019882.sst  019918.sst  019961.sst
> >   019997.sst  020022.sst  020042.sst  020061.sst  020073.sst
> > 018216.sst  018445.sst  018840.sst  019075.sst  019175.sst  019373.sst
> >   019471.sst  019676.sst  019766.sst  019883.sst  019919.sst  019962.sst
> >   019998.sst  020023.sst  020043.sst  020062.sst  020074.sst
> > 018273.sst  018446.sst  018876.sst  019076.sst  019176.sst  019380.sst
> >   019472.sst  019677.sst  019845.sst  019884.sst  019920.sst  019963.sst
> >   01.sst  020030.sst  020049.sst  020063.sst  020075.sst
> > 018300.sst  018447.sst  018877.sst  019077.sst  019210.sst  019381.sst
> >   019473.sst  019678.sst  019846.sst  019885.sst  019921.sst  019964.sst
> >   02.sst  020031.sst  020051.sst  020064.sst  020077.sst
> > 018301.sst  018448.sst  018904.sst  019081.sst  019211.sst  019382.sst
> >   019474.sst  019753.sst  019847.sst  019886.sst  019922.sst  019965.sst
> >   020001.sst  020032.sst  020052.sst  020065.sst  020080.sst
> > 018326.sst  018449.sst  018950.sst  019082.sst  019256.sst  019383.sst
> >   019475.sst  019754.sst  019848.sst  019887.sst  019923.sst  019986.sst
> >   020002.sst  020035.sst  020053.sst  020066.sst  CURRENT
> > 018327.sst  018540.sst  018952.sst  019083.sst  019257.sst  019395.sst
> >   019560.sst  019755.sst  019849.sst  019888.sst  019950.sst  019989.sst
> >   020003.sst  020036.sst  020055.sst  020067.sst  IDENTITY
> > 018328.sst  018541.sst  018953.sst  019124.sst  019344.sst  019396.sst
> >   019670.sst  019756.sst  019877.sst  019889.sst  019955.sst  019992.sst
> >   020004.sst  020037.sst  020056.sst  020068.sst  LOCK
> > 018329.sst  018590.sst  018954.sst  019125.sst  019345.sst  019400.sst
> >   019671.sst  019757.sst  019878.sst 

[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-17 Thread Igor Fedotov

On 5/17/2021 2:53 PM, Boris Behrens wrote:

Like this?


Yeah.

so the DB dir structure is more or less OK, but db/CURRENT looks corrupted. It
should contain something like: MANIFEST-020081


Could you please remove (or even just rename) the block.db symlink and do the
export again? Preferably preserve the results from the first export.

If the export reveals proper CURRENT content - you might want to run fsck on
the OSD...
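
In practice that boils down to something like the following (paths mirror the
earlier commands in this thread and are examples only):

# take the half-added block.db symlink out of the picture by renaming it
mv /var/lib/ceph/osd/ceph-68/block.db /var/lib/ceph/osd/ceph-68/block.db.bak
# export BlueFS again to a fresh directory and inspect CURRENT
ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-68 --command bluefs-export --out /root/export-bluefs2
cat /root/export-bluefs2/db/CURRENT   # should print the active MANIFEST name
# if it looks sane, run fsck
ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-68 fsck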



[root@s3db10 export-bluefs]# ls *
db:
018215.sst  018444.sst  018839.sst  019074.sst  019174.sst  019372.sst
  019470.sst  019675.sst  019765.sst  019882.sst  019918.sst  019961.sst
  019997.sst  020022.sst  020042.sst  020061.sst  020073.sst
018216.sst  018445.sst  018840.sst  019075.sst  019175.sst  019373.sst
  019471.sst  019676.sst  019766.sst  019883.sst  019919.sst  019962.sst
  019998.sst  020023.sst  020043.sst  020062.sst  020074.sst
018273.sst  018446.sst  018876.sst  019076.sst  019176.sst  019380.sst
  019472.sst  019677.sst  019845.sst  019884.sst  019920.sst  019963.sst
  01.sst  020030.sst  020049.sst  020063.sst  020075.sst
018300.sst  018447.sst  018877.sst  019077.sst  019210.sst  019381.sst
  019473.sst  019678.sst  019846.sst  019885.sst  019921.sst  019964.sst
  02.sst  020031.sst  020051.sst  020064.sst  020077.sst
018301.sst  018448.sst  018904.sst  019081.sst  019211.sst  019382.sst
  019474.sst  019753.sst  019847.sst  019886.sst  019922.sst  019965.sst
  020001.sst  020032.sst  020052.sst  020065.sst  020080.sst
018326.sst  018449.sst  018950.sst  019082.sst  019256.sst  019383.sst
  019475.sst  019754.sst  019848.sst  019887.sst  019923.sst  019986.sst
  020002.sst  020035.sst  020053.sst  020066.sst  CURRENT
018327.sst  018540.sst  018952.sst  019083.sst  019257.sst  019395.sst
  019560.sst  019755.sst  019849.sst  019888.sst  019950.sst  019989.sst
  020003.sst  020036.sst  020055.sst  020067.sst  IDENTITY
018328.sst  018541.sst  018953.sst  019124.sst  019344.sst  019396.sst
  019670.sst  019756.sst  019877.sst  019889.sst  019955.sst  019992.sst
  020004.sst  020037.sst  020056.sst  020068.sst  LOCK
018329.sst  018590.sst  018954.sst  019125.sst  019345.sst  019400.sst
  019671.sst  019757.sst  019878.sst  019890.sst  019956.sst  019993.sst
  020005.sst  020038.sst  020057.sst  020069.sst  MANIFEST-020081
018406.sst  018591.sst  018995.sst  019126.sst  019346.sst  019467.sst
  019672.sst  019762.sst  019879.sst  019915.sst  019958.sst  019994.sst
  020006.sst  020039.sst  020058.sst  020070.sst  OPTIONS-020081
018407.sst  018727.sst  018996.sst  019127.sst  019370.sst  019468.sst
  019673.sst  019763.sst  019880.sst  019916.sst  019959.sst  019995.sst
  020007.sst  020040.sst  020059.sst  020071.sst  OPTIONS-020084
018443.sst  018728.sst  019073.sst  019128.sst  019371.sst  019469.sst
  019674.sst  019764.sst  019881.sst  019917.sst  019960.sst  019996.sst
  020008.sst  020041.sst  020060.sst  020072.sst

db.slow:

db.wal:
020082.log
[root@s3db10 export-bluefs]# du -hs
12G .
[root@s3db10 export-bluefs]# cat db/CURRENT
�g�U
uN�[�+p[root@s3db10 export-bluefs]#

On Mon, 17 May 2021 at 13:45, Igor Fedotov wrote:


You might want to check the file structure of the new DB using
ceph-bluestore-tool's bluefs-export command:

ceph-bluestore-tool --path <osd path> --command bluefs-export --out <target dir>


<target dir> needs to have enough free space to fit the DB data.

Once exported - does <target dir> contain a valid BlueFS directory
structure - multiple .sst files, CURRENT and IDENTITY files, etc.?

If so then please check and share the content of the <target dir>/db/CURRENT
file.


Thanks,

Igor

On 5/17/2021 1:32 PM, Boris Behrens wrote:

Hi Igor,
I posted it on pastebin: https://pastebin.com/Ze9EuCMD

Cheers
   Boris

On Mon, 17 May 2021 at 12:22, Igor Fedotov wrote:
Hi Boris,

could you please share full OSD startup log and file listing for
'/var/lib/ceph/osd/ceph-68'?


Thanks,

Igor

On 5/17/2021 1:09 PM, Boris Behrens wrote:

Hi,
sorry for replying to this old thread:

I tried to add a block.db to an OSD, but now the OSD cannot start, with the
error:
Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[26038]: -7> 2021-05-17
09:50:38.362 7fc48ec94a80 -1 rocksdb: Corruption: CURRENT file does not end
with newline
Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[26038]: -6> 2021-05-17
09:50:38.362 7fc48ec94a80 -1 bluestore(/var/lib/ceph/osd/ceph-68) _open_db
erroring opening db:
Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[26038]: -1> 2021-05-17
09:50:38.866 7fc48ec94a80 -1
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.21/rpm/el7/BUILD/ceph-14.2.21/src/os/bluestore/BlueStore.cc:
In function 'int BlueStore::_upgrade_super()' thread 7fc48ec94a80 time
2021-05-17 09:50:38.865204
Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[26038]:



[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-17 Thread Boris Behrens
Like this?

[root@s3db10 export-bluefs]# ls *
db:
018215.sst  018444.sst  018839.sst  019074.sst  019174.sst  019372.sst
 019470.sst  019675.sst  019765.sst  019882.sst  019918.sst  019961.sst
 019997.sst  020022.sst  020042.sst  020061.sst  020073.sst
018216.sst  018445.sst  018840.sst  019075.sst  019175.sst  019373.sst
 019471.sst  019676.sst  019766.sst  019883.sst  019919.sst  019962.sst
 019998.sst  020023.sst  020043.sst  020062.sst  020074.sst
018273.sst  018446.sst  018876.sst  019076.sst  019176.sst  019380.sst
 019472.sst  019677.sst  019845.sst  019884.sst  019920.sst  019963.sst
 01.sst  020030.sst  020049.sst  020063.sst  020075.sst
018300.sst  018447.sst  018877.sst  019077.sst  019210.sst  019381.sst
 019473.sst  019678.sst  019846.sst  019885.sst  019921.sst  019964.sst
 02.sst  020031.sst  020051.sst  020064.sst  020077.sst
018301.sst  018448.sst  018904.sst  019081.sst  019211.sst  019382.sst
 019474.sst  019753.sst  019847.sst  019886.sst  019922.sst  019965.sst
 020001.sst  020032.sst  020052.sst  020065.sst  020080.sst
018326.sst  018449.sst  018950.sst  019082.sst  019256.sst  019383.sst
 019475.sst  019754.sst  019848.sst  019887.sst  019923.sst  019986.sst
 020002.sst  020035.sst  020053.sst  020066.sst  CURRENT
018327.sst  018540.sst  018952.sst  019083.sst  019257.sst  019395.sst
 019560.sst  019755.sst  019849.sst  019888.sst  019950.sst  019989.sst
 020003.sst  020036.sst  020055.sst  020067.sst  IDENTITY
018328.sst  018541.sst  018953.sst  019124.sst  019344.sst  019396.sst
 019670.sst  019756.sst  019877.sst  019889.sst  019955.sst  019992.sst
 020004.sst  020037.sst  020056.sst  020068.sst  LOCK
018329.sst  018590.sst  018954.sst  019125.sst  019345.sst  019400.sst
 019671.sst  019757.sst  019878.sst  019890.sst  019956.sst  019993.sst
 020005.sst  020038.sst  020057.sst  020069.sst  MANIFEST-020081
018406.sst  018591.sst  018995.sst  019126.sst  019346.sst  019467.sst
 019672.sst  019762.sst  019879.sst  019915.sst  019958.sst  019994.sst
 020006.sst  020039.sst  020058.sst  020070.sst  OPTIONS-020081
018407.sst  018727.sst  018996.sst  019127.sst  019370.sst  019468.sst
 019673.sst  019763.sst  019880.sst  019916.sst  019959.sst  019995.sst
 020007.sst  020040.sst  020059.sst  020071.sst  OPTIONS-020084
018443.sst  018728.sst  019073.sst  019128.sst  019371.sst  019469.sst
 019674.sst  019764.sst  019881.sst  019917.sst  019960.sst  019996.sst
 020008.sst  020041.sst  020060.sst  020072.sst

db.slow:

db.wal:
020082.log
[root@s3db10 export-bluefs]# du -hs
12G .
[root@s3db10 export-bluefs]# cat db/CURRENT
�g�U
   uN�[�+p[root@s3db10 export-bluefs]#

On Mon, 17 May 2021 at 13:45, Igor Fedotov wrote:

> You might want to check the file structure of the new DB using
> ceph-bluestore-tool's bluefs-export command:
>
> ceph-bluestore-tool --path <osd path> --command bluefs-export --out
> <target dir>
>
> <target dir> needs to have enough free space to fit the DB data.
>
> Once exported - does <target dir> contain a valid BlueFS directory
> structure - multiple .sst files, CURRENT and IDENTITY files, etc.?
>
> If so then please check and share the content of the <target dir>/db/CURRENT
> file.
>
>
> Thanks,
>
> Igor
>
> On 5/17/2021 1:32 PM, Boris Behrens wrote:
> > Hi Igor,
> > I posted it on pastebin: https://pastebin.com/Ze9EuCMD
> >
> > Cheers
> >   Boris
> >
> > On Mon, 17 May 2021 at 12:22, Igor Fedotov wrote:
> >
> >> Hi Boris,
> >>
> >> could you please share full OSD startup log and file listing for
> >> '/var/lib/ceph/osd/ceph-68'?
> >>
> >>
> >> Thanks,
> >>
> >> Igor
> >>
> >> On 5/17/2021 1:09 PM, Boris Behrens wrote:
> >>> Hi,
> >>> sorry for replying to this old thread:
> >>>
> >>> I tried to add a block.db to an OSD but now the OSD can not start with
> >> the
> >>> error:
> >>> Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[26038]: -7>
> 2021-05-17
> >>> 09:50:38.362 7fc48ec94a80 -1 rocksdb: Corruption: CURRENT file does not
> >> end
> >>> with newline
> >>> Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[26038]: -6>
> 2021-05-17
> >>> 09:50:38.362 7fc48ec94a80 -1 bluestore(/var/lib/ceph/osd/ceph-68)
> >> _open_db
> >>> erroring opening db:
> >>> Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[26038]: -1>
> 2021-05-17
> >>> 09:50:38.866 7fc48ec94a80 -1
> >>>
> >>
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.21/rpm/el7/BUILD/ceph-14.2.21/src/os/bluestore/BlueStore.cc:
> >>> In function 'int BlueStore::_upgrade_super()' thread 7fc48ec94a80 time
> >>> 2021-05-17 09:50:38.865204
> >>> Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[26038]:
> >>>
> >>
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.21/rpm/el7/BUILD/ceph-14.2.21/src/os/bluestore/BlueStore.cc:
> >>> 10647: FAILED ceph_assert(ondisk_format > 0)
> >>>
> >>> I tried to run an fsck/repair on the disk:
> >>> [root@s3db10 osd]# 

[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-17 Thread Igor Fedotov
You might want to check the file structure of the new DB using
ceph-bluestore-tool's bluefs-export command:


ceph-bluestore-tool --path <osd path> --command bluefs-export --out <target dir>


<target dir> needs to have enough free space to fit the DB data.

Once exported - does <target dir> contain a valid BlueFS directory
structure - multiple .sst files, CURRENT and IDENTITY files etc.?


If so then please check and share the content of <target dir>/db/CURRENT
file.



Thanks,

Igor

On 5/17/2021 1:32 PM, Boris Behrens wrote:

Hi Igor,
I posted it on pastebin: https://pastebin.com/Ze9EuCMD

Cheers
  Boris

On Mon, May 17, 2021 at 12:22, Igor Fedotov wrote:


Hi Boris,

could you please share full OSD startup log and file listing for
'/var/lib/ceph/osd/ceph-68'?


Thanks,

Igor

On 5/17/2021 1:09 PM, Boris Behrens wrote:

Hi,
sorry for replying to this old thread:

I tried to add a block.db to an OSD but now the OSD can not start with

the

error:
Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[26038]: -7> 2021-05-17
09:50:38.362 7fc48ec94a80 -1 rocksdb: Corruption: CURRENT file does not

end

with newline
Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[26038]: -6> 2021-05-17
09:50:38.362 7fc48ec94a80 -1 bluestore(/var/lib/ceph/osd/ceph-68)

_open_db

erroring opening db:
Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[26038]: -1> 2021-05-17
09:50:38.866 7fc48ec94a80 -1


/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.21/rpm/el7/BUILD/ceph-14.2.21/src/os/bluestore/BlueStore.cc:

In function 'int BlueStore::_upgrade_super()' thread 7fc48ec94a80 time
2021-05-17 09:50:38.865204
Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[26038]:


/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.21/rpm/el7/BUILD/ceph-14.2.21/src/os/bluestore/BlueStore.cc:

10647: FAILED ceph_assert(ondisk_format > 0)

I tried to run an fsck/repair on the disk:
[root@s3db10 osd]# ceph-bluestore-tool --path ceph-68  repair
2021-05-17 10:05:25.695 7f714dea3ec0 -1 rocksdb: Corruption: CURRENT file
does not end with newline
2021-05-17 10:05:25.695 7f714dea3ec0 -1 bluestore(ceph-68) _open_db
erroring opening db:
error from fsck: (5) Input/output error
[root@s3db10 osd]# ceph-bluestore-tool --path ceph-68  fsck
2021-05-17 10:05:35.012 7fb8f22e6ec0 -1 rocksdb: Corruption: CURRENT file
does not end with newline
2021-05-17 10:05:35.012 7fb8f22e6ec0 -1 bluestore(ceph-68) _open_db
erroring opening db:
error from fsck: (5) Input/output error

These are the steps I did to add the disk:
$ CEPH_ARGS="--bluestore-block-db-size 53687091200
--bluestore_block_db_create=true" ceph-bluestore-tool bluefs-bdev-new-db
--path /var/lib/ceph/osd/ceph-68 --dev-target /dev/sdj1
$ chown -h ceph:ceph /var/lib/ceph/osd/ceph-68/block.db
$ lvchange --addtag ceph.db_device=/dev/sdj1


/dev/ceph-3bbfd168-2a54-4593-a037-80d0d7e97afd/osd-block-aaeaea54-eb6a-480c-b2fd-d938e336c0f6

$ lvchange --addtag ceph.db_uuid=463dd37c-fd49-4ccb-849f-c5827d3d9df2


/dev/ceph-3bbfd168-2a54-4593-a037-80d0d7e97afd/osd-block-aaeaea54-eb6a-480c-b2fd-d938e336c0f6

$ ceph-volume lvm activate --all

The UUIDs
later I tried this:
$ ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-68 --devs-source
/var/lib/ceph/osd/ceph-68/block --dev-target
/var/lib/ceph/osd/ceph-68/block.db bluefs-bdev-migrate

Any ideas how I can get the rocksdb fixed?

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-17 Thread Boris Behrens
Hi Igor,
I posted it on pastebin: https://pastebin.com/Ze9EuCMD

Cheers
 Boris

On Mon, May 17, 2021 at 12:22, Igor Fedotov wrote:

> Hi Boris,
>
> could you please share full OSD startup log and file listing for
> '/var/lib/ceph/osd/ceph-68'?
>
>
> Thanks,
>
> Igor
>
> On 5/17/2021 1:09 PM, Boris Behrens wrote:
> > Hi,
> > sorry for replying to this old thread:
> >
> > I tried to add a block.db to an OSD but now the OSD can not start with
> the
> > error:
> > Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[26038]: -7> 2021-05-17
> > 09:50:38.362 7fc48ec94a80 -1 rocksdb: Corruption: CURRENT file does not
> end
> > with newline
> > Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[26038]: -6> 2021-05-17
> > 09:50:38.362 7fc48ec94a80 -1 bluestore(/var/lib/ceph/osd/ceph-68)
> _open_db
> > erroring opening db:
> > Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[26038]: -1> 2021-05-17
> > 09:50:38.866 7fc48ec94a80 -1
> >
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.21/rpm/el7/BUILD/ceph-14.2.21/src/os/bluestore/BlueStore.cc:
> > In function 'int BlueStore::_upgrade_super()' thread 7fc48ec94a80 time
> > 2021-05-17 09:50:38.865204
> > Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[26038]:
> >
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.21/rpm/el7/BUILD/ceph-14.2.21/src/os/bluestore/BlueStore.cc:
> > 10647: FAILED ceph_assert(ondisk_format > 0)
> >
> > I tried to run an fsck/repair on the disk:
> > [root@s3db10 osd]# ceph-bluestore-tool --path ceph-68  repair
> > 2021-05-17 10:05:25.695 7f714dea3ec0 -1 rocksdb: Corruption: CURRENT file
> > does not end with newline
> > 2021-05-17 10:05:25.695 7f714dea3ec0 -1 bluestore(ceph-68) _open_db
> > erroring opening db:
> > error from fsck: (5) Input/output error
> > [root@s3db10 osd]# ceph-bluestore-tool --path ceph-68  fsck
> > 2021-05-17 10:05:35.012 7fb8f22e6ec0 -1 rocksdb: Corruption: CURRENT file
> > does not end with newline
> > 2021-05-17 10:05:35.012 7fb8f22e6ec0 -1 bluestore(ceph-68) _open_db
> > erroring opening db:
> > error from fsck: (5) Input/output error
> >
> > These are the steps I did to add the disk:
> > $ CEPH_ARGS="--bluestore-block-db-size 53687091200
> > --bluestore_block_db_create=true" ceph-bluestore-tool bluefs-bdev-new-db
> > --path /var/lib/ceph/osd/ceph-68 --dev-target /dev/sdj1
> > $ chown -h ceph:ceph /var/lib/ceph/osd/ceph-68/block.db
> > $ lvchange --addtag ceph.db_device=/dev/sdj1
> >
> /dev/ceph-3bbfd168-2a54-4593-a037-80d0d7e97afd/osd-block-aaeaea54-eb6a-480c-b2fd-d938e336c0f6
> > $ lvchange --addtag ceph.db_uuid=463dd37c-fd49-4ccb-849f-c5827d3d9df2
> >
> /dev/ceph-3bbfd168-2a54-4593-a037-80d0d7e97afd/osd-block-aaeaea54-eb6a-480c-b2fd-d938e336c0f6
> > $ ceph-volume lvm activate --all
> >
> > The UUIDs
> > later I tried this:
> > $ ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-68 --devs-source
> > /var/lib/ceph/osd/ceph-68/block --dev-target
> > /var/lib/ceph/osd/ceph-68/block.db bluefs-bdev-migrate
> >
> > Any ideas how I can get the rocksdb fixed?
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
The "UTF-8 problems" self-help group meets this time, as an exception, in
the large hall.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-17 Thread Igor Fedotov

Hi Boris,

could you please share full OSD startup log and file listing for 
'/var/lib/ceph/osd/ceph-68'?



Thanks,

Igor

On 5/17/2021 1:09 PM, Boris Behrens wrote:

Hi,
sorry for replying to this old thread:

I tried to add a block.db to an OSD but now the OSD can not start with the
error:
Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[26038]: -7> 2021-05-17
09:50:38.362 7fc48ec94a80 -1 rocksdb: Corruption: CURRENT file does not end
with newline
Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[26038]: -6> 2021-05-17
09:50:38.362 7fc48ec94a80 -1 bluestore(/var/lib/ceph/osd/ceph-68) _open_db
erroring opening db:
Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[26038]: -1> 2021-05-17
09:50:38.866 7fc48ec94a80 -1
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.21/rpm/el7/BUILD/ceph-14.2.21/src/os/bluestore/BlueStore.cc:
In function 'int BlueStore::_upgrade_super()' thread 7fc48ec94a80 time
2021-05-17 09:50:38.865204
Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[26038]:
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.21/rpm/el7/BUILD/ceph-14.2.21/src/os/bluestore/BlueStore.cc:
10647: FAILED ceph_assert(ondisk_format > 0)

I tried to run an fsck/repair on the disk:
[root@s3db10 osd]# ceph-bluestore-tool --path ceph-68  repair
2021-05-17 10:05:25.695 7f714dea3ec0 -1 rocksdb: Corruption: CURRENT file
does not end with newline
2021-05-17 10:05:25.695 7f714dea3ec0 -1 bluestore(ceph-68) _open_db
erroring opening db:
error from fsck: (5) Input/output error
[root@s3db10 osd]# ceph-bluestore-tool --path ceph-68  fsck
2021-05-17 10:05:35.012 7fb8f22e6ec0 -1 rocksdb: Corruption: CURRENT file
does not end with newline
2021-05-17 10:05:35.012 7fb8f22e6ec0 -1 bluestore(ceph-68) _open_db
erroring opening db:
error from fsck: (5) Input/output error

These are the steps I did to add the disk:
$ CEPH_ARGS="--bluestore-block-db-size 53687091200
--bluestore_block_db_create=true" ceph-bluestore-tool bluefs-bdev-new-db
--path /var/lib/ceph/osd/ceph-68 --dev-target /dev/sdj1
$ chown -h ceph:ceph /var/lib/ceph/osd/ceph-68/block.db
$ lvchange --addtag ceph.db_device=/dev/sdj1
/dev/ceph-3bbfd168-2a54-4593-a037-80d0d7e97afd/osd-block-aaeaea54-eb6a-480c-b2fd-d938e336c0f6
$ lvchange --addtag ceph.db_uuid=463dd37c-fd49-4ccb-849f-c5827d3d9df2
/dev/ceph-3bbfd168-2a54-4593-a037-80d0d7e97afd/osd-block-aaeaea54-eb6a-480c-b2fd-d938e336c0f6
$ ceph-volume lvm activate --all

The UUIDs
later I tried this:
$ ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-68 --devs-source
/var/lib/ceph/osd/ceph-68/block --dev-target
/var/lib/ceph/osd/ceph-68/block.db bluefs-bdev-migrate

Any ideas how I can get the rocksdb fixed?

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [Suspicious newsletter] Re: bluefs_buffered_io turn to true

2021-05-17 Thread Szabo, Istvan (Agoda)
What happens if we are using buffered_io and the machine restarted due to some
power failure? Will everything that was in the cache be lost, or how does Ceph
handle this?

Istvan Szabo
Senior Infrastructure Engineer
---
Agoda Services Co., Ltd.
e: istvan.sz...@agoda.com
---

From: Szabo, Istvan (Agoda)
Sent: Friday, May 14, 2021 3:21 PM
To: 'Konstantin Shalygin' 
Cc: ceph-users@ceph.io
Subject: RE: [Suspicious newsletter] [ceph-users] Re: bluefs_buffered_io turn 
to true

When does this stop? When it dies … :D

Istvan Szabo
Senior Infrastructure Engineer
---
Agoda Services Co., Ltd.
e: istvan.sz...@agoda.com
---

From: Konstantin Shalygin <k0...@k0ste.ru>
Sent: Friday, May 14, 2021 3:00 PM
To: Szabo, Istvan (Agoda) <istvan.sz...@agoda.com>
Cc: ceph-users@ceph.io
Subject: Re: [Suspicious newsletter] [ceph-users] Re: bluefs_buffered_io turn 
to true


On 14 May 2021, at 10:50, Szabo, Istvan (Agoda) <istvan.sz...@agoda.com>
wrote:

Is it also normal that, with buffered_io turned on, it eats all the memory
on the system? Hmmm.

That is what this option actually does - it uses all free memory as cache
for bluefs speedups.



k


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-17 Thread Boris Behrens
Hi,
sorry for replying to this old thread:

I tried to add a block.db to an OSD but now the OSD can not start with the
error:
Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[26038]: -7> 2021-05-17
09:50:38.362 7fc48ec94a80 -1 rocksdb: Corruption: CURRENT file does not end
with newline
Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[26038]: -6> 2021-05-17
09:50:38.362 7fc48ec94a80 -1 bluestore(/var/lib/ceph/osd/ceph-68) _open_db
erroring opening db:
Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[26038]: -1> 2021-05-17
09:50:38.866 7fc48ec94a80 -1
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.21/rpm/el7/BUILD/ceph-14.2.21/src/os/bluestore/BlueStore.cc:
In function 'int BlueStore::_upgrade_super()' thread 7fc48ec94a80 time
2021-05-17 09:50:38.865204
Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[26038]:
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.21/rpm/el7/BUILD/ceph-14.2.21/src/os/bluestore/BlueStore.cc:
10647: FAILED ceph_assert(ondisk_format > 0)

I tried to run an fsck/repair on the disk:
[root@s3db10 osd]# ceph-bluestore-tool --path ceph-68  repair
2021-05-17 10:05:25.695 7f714dea3ec0 -1 rocksdb: Corruption: CURRENT file
does not end with newline
2021-05-17 10:05:25.695 7f714dea3ec0 -1 bluestore(ceph-68) _open_db
erroring opening db:
error from fsck: (5) Input/output error
[root@s3db10 osd]# ceph-bluestore-tool --path ceph-68  fsck
2021-05-17 10:05:35.012 7fb8f22e6ec0 -1 rocksdb: Corruption: CURRENT file
does not end with newline
2021-05-17 10:05:35.012 7fb8f22e6ec0 -1 bluestore(ceph-68) _open_db
erroring opening db:
error from fsck: (5) Input/output error

These are the steps I did to add the disk:
$ CEPH_ARGS="--bluestore-block-db-size 53687091200
--bluestore_block_db_create=true" ceph-bluestore-tool bluefs-bdev-new-db
--path /var/lib/ceph/osd/ceph-68 --dev-target /dev/sdj1
$ chown -h ceph:ceph /var/lib/ceph/osd/ceph-68/block.db
$ lvchange --addtag ceph.db_device=/dev/sdj1
/dev/ceph-3bbfd168-2a54-4593-a037-80d0d7e97afd/osd-block-aaeaea54-eb6a-480c-b2fd-d938e336c0f6
$ lvchange --addtag ceph.db_uuid=463dd37c-fd49-4ccb-849f-c5827d3d9df2
/dev/ceph-3bbfd168-2a54-4593-a037-80d0d7e97afd/osd-block-aaeaea54-eb6a-480c-b2fd-d938e336c0f6
$ ceph-volume lvm activate --all

The UUIDs
later I tried this:
$ ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-68 --devs-source
/var/lib/ceph/osd/ceph-68/block --dev-target
/var/lib/ceph/osd/ceph-68/block.db bluefs-bdev-migrate
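
(For reference, the resulting BlueStore labels and LVM tags can be checked
with something like the following; I have not pasted the output here:)

# dump the labels the OSD will read from block and block.db on startup
ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-68
# and the tags that ceph-volume uses during activation
lvs -o lv_name,lv_tags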

Any ideas how I can get the rocksdb fixed?
-- 
The "UTF-8 problems" self-help group meets this time, as an exception, in
the large hall.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Limit memory of ceph-mgr

2021-05-17 Thread mabi
‐‐‐ Original Message ‐‐‐
On Monday, May 17, 2021 4:31 AM, Anthony D'Atri  wrote:

> You’re running on so small a node that 3.6GB is a problem??

Yes, I have hardware constraints: each node has a maximum of 8 GB of RAM, as
it is a Raspberry Pi 4. I am doing a proof of concept to see if it is possible
to run a small cluster on this type of hardware. So it is no problem for me to
scale horizontally (add more nodes), but scaling vertically (adding more RAM
or CPU) is not possible. It is also a great way of learning and experiencing
Ceph and seeing what can be done in terms of optimization.
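
One thing I may try is a plain systemd resource limit on the mgr unit,
roughly like this (the unit name and the values are just an assumption on
my side, and would differ on a containerized deployment):

# /etc/systemd/system/ceph-mgr@.service.d/memory.conf (drop-in override)
[Service]
MemoryHigh=1536M
MemoryMax=2G

# then reload and restart the manager (the mgr id is usually the short
# hostname; adjust as needed)
systemctl daemon-reload
systemctl restart ceph-mgr@$(hostname -s)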
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: after upgrade to 16.2.3 16.2.4 and after adding few hdd's OSD's started to fail 1 by 1.

2021-05-17 Thread Andrius Jurkus

Thanks for fast responses.

For now I'm running with

bluestore_allocator = bitmap

and it doesn't crash anymore. (PS: the other messages didn't appear for some
reason.)
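
In case it helps someone else, this is roughly how I applied it via the
config database (setting it in ceph.conf under [osd] works as well); the
new allocator only takes effect after the OSDs are restarted:

ceph config set osd bluestore_allocator bitmap
ceph config dump | grep bluestore_allocator
# restart the OSDs one by one, e.g. with cephadm:
ceph orch daemon restart osd.2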


On 2021-05-15 02:10, Igor Fedotov wrote:

This looks similar to #50656 indeed.

Hopefully we will fix that next week.


Thanks,

Igor

On 5/14/2021 9:09 PM, Neha Ojha wrote:

On Fri, May 14, 2021 at 10:47 AM Andrius Jurkus
 wrote:

Hello, I will try to keep it sad and short :) :( (PS: sorry if this is a
duplicate, I tried to post it from the web as well.)

Today I upgraded from 16.2.3 to 16.2.4 and added a few hosts and OSDs.
After data migration for a few hours, one SSD failed, then another and
another, one by one. Now I have the cluster in pause and 5 failed SSDs.
The same hosts have SSDs and HDDs, but only the SSDs are failing, so I
think this has to be a balancing/refilling bug or something like that,
and probably not an upgrade bug.

The cluster has been in pause for 4 hours and no more OSDs are failing.

full trace
https://pastebin.com/UxbfFYpb

This looks very similar to https://tracker.ceph.com/issues/50656.
Adding Igor for more ideas.

Neha


Now I'm googling and learning, but is there a way to easily test, let's
say, a 15.2.x version on an OSD without losing anything?

Any help would be appreciated.

Error start like this

May 14 16:58:52 dragon-ball-radar systemd[1]: Starting Ceph osd.2 for
4e01640b-951b-4f75-8dca-0bad4faf1b11...
May 14 16:58:53 dragon-ball-radar podman[113650]: 2021-05-14
16:58:53.057836433 + UTC m=+0.454352919 container create
3b44520aa651b8196cd0bf0c96daa2bd03845ef5f8cfaf9a689410a1f98d84dd
(image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949,
name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2-activate,
GIT_BRANCH=HEAD, maintainer=D
May 14 16:58:53 dragon-ball-radar systemd[1]: Started libcrun 
container.

May 14 16:58:53 dragon-ball-radar podman[113650]: 2021-05-14
16:58:53.3394116 + UTC m=+0.735928098 container init
3b44520aa651b8196cd0bf0c96daa2bd03845ef5f8cfaf9a689410a1f98d84dd
(image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949,
name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2-activate,
maintainer=Dimitri Savineau  ceph-volume lvm
activate successful for osd ID: 2
May 14 16:58:53 dragon-ball-radar podman[113650]: 2021-05-14
16:58:53.8147653 + UTC m=+1.211281741 container died
3b44520aa651b8196cd0bf0c96daa2bd03845ef5f8cfaf9a689410a1f98d84dd
(image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949,
name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2-activate)
May 14 16:58:55 dragon-ball-radar podman[113650]: 2021-05-14
16:58:55.044964534 + UTC m=+2.441480996 container remove
3b44520aa651b8196cd0bf0c96daa2bd03845ef5f8cfaf9a689410a1f98d84dd
(image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949,
name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2-activate,
CEPH_POINT_RELEASE=-16.2.4, R
May 14 16:58:55 dragon-ball-radar podman[113909]: 2021-05-14
16:58:55.594265612 + UTC m=+0.369978347 container create
31364008fcb8b290643d6e892fba16d19618f5682f590373feabed23061749da
(image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949,
name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2, RELEASE=HEAD,
org.label-schema.build-d
May 14 16:58:55 dragon-ball-radar podman[113909]: 2021-05-14
16:58:55.864589286 + UTC m=+0.640302021 container init
31364008fcb8b290643d6e892fba16d19618f5682f590373feabed23061749da
(image=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949,
name=ceph-4e01640b-951b-4f75-8dca-0bad4faf1b11-osd.2,
org.label-schema.schema-version=1.0, GIT
May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:55.896+ 7fcf16aa2080 0 set uid:gid to 167:167
(ceph:ceph)
May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:55.896+ 7fcf16aa2080 0 ceph version 16.2.4
(3cbe25cde3cfa028984618ad32de9edc4c1eaed0) pacific (stable), process
ceph-osd, pid 2
May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:55.896+ 7fcf16aa2080 0 pidfile_write: ignore 
empty

--pid-file
May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:55.896+ 7fcf16aa2080 1 bdev(0x564ad3a8c800
/var/lib/ceph/osd/ceph-2/block) open path 
/var/lib/ceph/osd/ceph-2/block

May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:55.900+ 7fcf16aa2080 1 bdev(0x564ad3a8c800
/var/lib/ceph/osd/ceph-2/block) open size 500103643136 (0x747080,
466 GiB) block_size 4096 (4 KiB) non-rotational discard supported
May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:55.900+ 7fcf16aa2080 1
bluestore(/var/lib/ceph/osd/ceph-2) _set_cache_sizes cache_size
3221225472 meta 0.45 kv 0.45 data 0.06
May 14 16:58:55 dragon-ball-radar conmon[113957]: debug
2021-05-14T16:58:55.900+ 7fcf16aa2080 1 bdev(0x564ad3a8cc00

[ceph-users] Re: [Suspicious newsletter] RGW: Multiple Site does not sync olds data

2021-05-17 Thread 特木勒
Hi Istvan:

I think this issue may be related to the problem below. Currently I am using
tenant buckets.

https://tracker.ceph.com/issues/50785

Hope this will help you.
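
In case it is useful, the bucket rewrite workaround (quoted further down)
can be looped over every bucket with a rough sketch like this (untested;
tenant buckets may need the tenant/bucket form, and it should be tried on
a single bucket first):

# radosgw-admin bucket list prints a JSON array of bucket names
for b in $(radosgw-admin bucket list | jq -r '.[]'); do
    radosgw-admin bucket rewrite --bucket="$b" --min-rewrite-size 0
done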


Szabo, Istvan (Agoda) wrote on Tue, May 11, 2021 at 10:47 AM:

> OK, it will be challenging with an 800 million object bucket. But I might
> give it a try.
>
>
>
> Istvan Szabo
> Senior Infrastructure Engineer
> ---
> Agoda Services Co., Ltd.
> e: istvan.sz...@agoda.com
> ---
>
>
>
> *From:* 特木勒 
> *Sent:* Monday, May 10, 2021 6:53 PM
> *To:* Szabo, Istvan (Agoda) 
> *Cc:* Jean-Sebastien Landry ;
> ceph-users@ceph.io; Amit Ghadge 
> *Subject:* Re: [ceph-users] Re: [Suspicious newsletter] RGW: Multiple
> Site does not sync olds data
>
>
>
> Hi Istvan:
>
>
>
> Thanks for your help.
>
>
>
> After we rewrite all the objects that in buckets, the sync seems to work
> again.
>
>
>
> We are using this command to rewrite all the objects in specific bucket:
>
> `radosgw-admin bucket rewrite --bucket=BUCKET_NAME --min-rewrite-size 0`
>
>
>
> You can try to run this on 1 bucket and see if it could help you fix the
> problem.
>
>
>
> Thank you~
>
>
>
> Szabo, Istvan (Agoda) wrote on Mon, May 10, 2021 at 12:16 PM:
>
> So how are your multisite things going at the moment? Seems like with this
> rewrite you've moved further than me. Is it working properly now? If
> yes, what are the steps to make it work? Where is the magic?
>
>
>
> Istvan Szabo
> Senior Infrastructure Engineer
> ---
> Agoda Services Co., Ltd.
> e: istvan.sz...@agoda.com
> ---
>
>
>
> *From:* 特木勒 
> *Sent:* Thursday, May 6, 2021 11:27 AM
> *To:* Jean-Sebastien Landry 
> *Cc:* Szabo, Istvan (Agoda) ; ceph-users@ceph.io;
> Amit Ghadge 
> *Subject:* Re: [ceph-users] Re: [Suspicious newsletter] RGW: Multiple
> Site does not sync olds data
>
>
>
> Hi Jean:
>
>
>
> Thanks for your info.
>
>
>
> Unfortunately I checked the secondary cluster and no objects had been
> synced. The only way I have is to force-rewrite objects for whole buckets.
>
>
>
> I have tried to set up multisite between Nautilus and Octopus. It
> works pretty well. But after I upgraded the primary cluster to Octopus,
> we have this issue. :(
>
>
>
> Here is the issue: https://tracker.ceph.com/issues/49542#change-193975
>
>
>
> Thanks
>
>
>
> Jean-Sebastien Landry wrote on Tue, Apr 27, 2021 at 7:52 PM:
>
> Hi, I hit the same errors when doing multisite sync between Luminous and
> Octopus, but what I found is that my sync errors were mainly on old
> multipart and shadow objects, at the "rados level" if I might say
> (leftovers from Luminous bugs).
>
> So check at the "user level", using s3cmd/awscli and the objects' md5;
> you will probably find that you're pretty much in sync. Hopefully.
>
> Cheers!
>
> On 4/25/21 11:29 PM, 特木勒 wrote:
> > [Externe UL*]
> >
> > Another problem I notice: for a new bucket, the first object in the bucket
> > will not be synced; the sync will start with the second object. I tried to
> > fix the index on the bucket and manually rerun bucket sync, but the first
> > object still does not sync to the secondary cluster.
> >
> > Do you have any ideas for this issue?
> >
> > Thanks
> >
> > 特木勒 wrote on Mon, Apr 26, 2021 at 11:16 AM:
> >
> >> Hi Istvan:
> >>
> >> Thanks Amit's suggestion.
> >>
> >> I followed his suggestion to fix bucket index and re-do sync on buckets,
> >> but it still did not work for me.
> >>
> >> Then I tried to use bucket rewrite command to rewrite all the objects in
> >> buckets and it works for me. I think the reason is there's something
> wrong
> >> with bucket index and rewrite has rebuilt the index.
> >>
> >> Here's the command I use:
> >> `sudo radosgw-admin bucket rewrite -b BUCKET-NAME --min-rewrite-size 0`
> >>
> >> Maybe you can try this to fix the sync issues.
> >>
> >> @Amit Ghadge  Thanks for your suggestions. Without
> >> your suggestions, I will not notice something wrong with index part.
> >>
> >> Thanks :)
> >>
> >> Szabo, Istvan (Agoda) wrote on Mon, Apr 26, 2021 at 9:57 AM:
> >>
> >>> Hi,
> >>>
> >>>
> >>>
> >>> No, doesn’t work, now we will write our own sync app for ceph, I gave
> up.
> >>>
> >>>
> >>>
> >>> Istvan Szabo
> >>> Senior Infrastructure Engineer
> >>> ---
> >>> Agoda Services Co., Ltd.
> >>> e: istvan.sz...@agoda.com
> >>> ---
> >>>
> >>>
> >>>
> >>> *From:* 特木勒 
> >>> *Sent:* Friday, April 23, 2021 7:50 PM
> >>> *To:* Szabo, Istvan (Agoda) 
> >>> *Cc:* ceph-users@ceph.io
> >>> *Subject:* Re: [Suspicious newsletter] [ceph-users] RGW: Multiple Site
> >>> does not sync olds data
> >>>
> >>>
> >>>
> >>> Hi Istvan:
> >>>
> >>>
> >>>
> >>> We just upgraded the whole cluster to 15.2.10 and multisite still
> >>> cannot sync all objects to the secondary cluster.
> >>>
> >>>
> >>>
> >>> Do you have any suggestions on this? And I open another issues in 

[ceph-users] Re: [Suspicious newsletter] Re: bluefs_buffered_io turn to true

2021-05-17 Thread Janne Johansson
On Mon, 17 May 2021 at 08:15, Szabo, Istvan (Agoda)
<istvan.sz...@agoda.com> wrote:

> What happens if we are using buffered_io and the machine restarted due to
> some power failure? Will everything that was in the cache be lost, or how
> does Ceph handle this?
>

Not to be picky, but between any client writing data to storage and the
magnetic layers of a drive there are some 8-9-10 layers of caches and
buffers, and a varying degree of lying to the layer above on when data
actually hits the spinning rust. Just because we found yet another one here
which may or may not impact writes, it doesn't change the situation in the
larger scheme.

If power goes, any IO in flight will be disrupted in some way, either it
will be finalized if possible or reverted if not, where reverted might be
as late as "on the subsequent fsck/scrub over that part of the disk". The
WAL/DB/Journals help a bit in figuring out if it can be finalized and what
was in flight before the crash, but all in all, this is not totally
unexpected from a storage point of view.

Most often I guess the IO will be counted as not-done and when the OSD
comes back, it will get the correct data replayed from another PG replica
during recovery/backfill and that's more or less it.

-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RBD as a boot image [was: libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated 131072, skipping]

2021-05-17 Thread Kees Meijs | Nefos

Hi,

This is a chicken-and-egg problem, I guess. The boot process (whether UEFI
or BIOS, given x86) should be able to load boot loader code, a Linux
kernel and an initial RAM disk (although in some cases a kernel alone could
be enough).


So yes: use PXE to load a Linux kernel and RAM disk. The RAM disk should
include the RBD kernel module and a hook to map and mount the RBD device.
The code referred to seems like it could do the trick.
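
Very roughly, the hook inside the RAM disk boils down to something like
this (everything in angle brackets is a placeholder; the ltsp-rbd script
linked elsewhere in the thread is the complete, real version):

# load the kernel RBD client and map the image via the sysfs interface;
# the device then shows up as /dev/rbd0 and can be mounted as root
modprobe rbd
echo "<mon-ip>:6789 name=<cephx-user>,secret=<cephx-key> <pool> <image>" \
    > /sys/bus/rbd/add
# hand the mapped device to the normal root-mount logic
ROOT=/dev/rbd0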


But that's a real-world scenario that is usable right now.

In terms of crazy ideas: implement an iSCSI-like UEFI module supporting
RBD, acting as a software-based HBA. Or even cooler: a hardware HBA
similar to Coraid's adapters made for AoE.


Cheers,
Kees

On 16-05-2021 23:19, Nico Schottelius wrote:

Hey Markus, Ilya,

you don't know with how much interest I am following this thread,
because ...


Generally it would be great if you could include the proper initrd code for RBD 
and CephFS root filesystems to the Ceph project. You can happily use my code as 
a starting point.

https://github.com/trickkiste/ltsp/blob/feature-boot_method-rbd/debian/ltsp-rbd.initramfs-script

I think booting from CephFS would require kernel patches.  It looks
like NFS and CIFS are the only network filesystems supported by the
init/root infrastructure in the kernel.

... we have been looking for a while to a discussion about using RBD
(not cephfs) as a replacement for a hard disk. Linux can map RBD
devices, so should Linux not also be able to *boot* from an rbd device
similar to a regular disk?

I did not find any example of this yet, but I'd assume that conceptually
one would probably:

- preload a Linux kernel from the network (potentially via ipxe)
- specify root=rbd://fsid/pool/image

Or in a even *better* variant:

- the bootloader (ipxe?) can map RBD
- the bootloader pre-loads enough of the image for reading the partition
- the bootloader either loads the kernel + initramfs *or* chainloads
   another bootloader

What are your thoughts on this? Do-able or totally crazy?

Best regards,

Nico

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io