[ceph-users] Re: About lost disk with erasure code

2023-12-26 Thread Janne Johansson
On Tue, 26 Dec 2023 at 08:45, Phong Tran Thanh wrote:
>
> Hi community,
>
> I am running ceph with block rbd with 6 nodes, erasure code 4+2 with
> min_size of pool is 4.
>
> When three osd is down, and an PG is state down, some pools is can't write
> data, suppose three osd can't start and pg stuck in down state, how i can
> delete or recreate pg to replace down pg or another way to allow pool to
> write/read data?


Depending on how the data is laid out in this pool, you might lose
more or less all data from it.

RBD images get split into pieces of 2 or 4 MB, and those pieces end up
on different PGs, which in turn end up on different OSDs. This allows
load balancing over the whole cluster, but it also means that if you
lose a PG, then for a 40 GB RBD image (made up of roughly 10k pieces)
the chances are very high that the lost PG contained one or more of
those 10k pieces.
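
To see this for yourself (the pool, image and object names below are
only examples):

  # object size the image is striped into, typically 4 MiB ("order 22"):
  rbd info rbd_pool/vm-disk-1

  # which PG, and therefore which OSDs, one backing object maps to:
  ceph osd map rbd_pool rbd_data.abc123.0000000000000000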

So lost PGs would probably mean that every RBD image of decent size
will have holes in it, and how this affects the instances that mount
those images will be hard to tell.
If at all possible, use the offline OSD tools to try to get this PG out
of one of the bad OSDs.

https://hawkvelt.id.au/post/2022-4-5-ceph-pg-export-import/ might help;
it shows how to run the export + import commands.

If you can get it out, it can be injected (imported) into any other
running OSD and then replicas
will be recreated and moved to where they should be.
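
Roughly, the export and import are done with ceph-objectstore-tool while
the OSD daemons involved are stopped. A sketch only, with made-up ids
(osd.12 is one of the broken OSDs, osd.7 a healthy one, 5.1f the down
PG; for EC pools the PG id on an OSD carries a shard suffix such as
5.1fs2, which "--op list-pgs" will show you):

  # On the node with the broken OSD, with the daemon stopped:
  systemctl stop ceph-osd@12
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 --op list-pgs
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 \
      --pgid 5.1f --op export --file /tmp/pg-5.1f.export

  # On the node with the healthy OSD: stop it, import, start it again:
  systemctl stop ceph-osd@7
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-7 \
      --op import --file /tmp/pg-5.1f.export
  systemctl start ceph-osd@7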

If you have disks to spare, make sure to make full copies of the broken
OSDs' disks and work on the copies instead, to maximize the chances of
restoring your data.
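
For the copying, something like GNU ddrescue tends to cope better with
failing drives than plain dd (the device names below are placeholders,
double-check them before running anything):

  # /dev/sdc = failing OSD disk, /dev/sdd = spare disk of at least the same size
  ddrescue -f -n /dev/sdc /dev/sdd /root/osd12-rescue.map
  # second pass, retrying the bad areas a few times
  ddrescue -f -r3 /dev/sdc /dev/sdd /root/osd12-rescue.map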

If you are very sure that these three OSDs are never coming back, and
have marked the OSDs
as lost, then I guess

ceph pg force_create_pg <pgid>

would be the next step to have the cluster create empty PGs to replace
the lost ones, but I would
consider this only after trying all the possible options for repairing
at least one of the OSDs that held
the PGs that are missing.
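
For completeness, a sketch of that last-resort sequence with made-up ids
(osd.12/13/14 being the dead OSDs and 5.1f the down PG); this gives up
whatever data only existed on those OSDs:

  # tell the cluster these OSDs will never come back
  ceph osd lost 12 --yes-i-really-mean-it
  ceph osd lost 13 --yes-i-really-mean-it
  ceph osd lost 14 --yes-i-really-mean-it

  # recreate the missing PG as an empty one
  ceph pg force_create_pg 5.1f

Newer releases have "ceph osd force-create-pg <pgid>
--yes-i-really-mean-it" for the same purpose.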

--
May the most significant bit of your life be positive.


[ceph-users] Re: About lost disk with erasure code

2023-12-26 Thread Phong Tran Thanh
Thank you for sharing your knowledge. I have a question: which pool is
affected when a PG is down, and how can I see which one? When a PG is
down, is only one pool affected, or are multiple pools affected?


On Tue, 26 Dec 2023 at 16:15, Janne Johansson <icepic...@gmail.com> wrote:

> [quoted message trimmed]


-- 
Best regards,

Tran Thanh Phong

Email: tranphong...@gmail.com
Skype: tranphong079


[ceph-users] Re: About lost disk with erasure code

2023-12-28 Thread Kai Stian Olstad

On 27.12.2023 04:54, Phong Tran Thanh wrote:

> Thank you for sharing your knowledge. I have a question: which pool is
> affected when a PG is down, and how can I see which one? When a PG is
> down, is only one pool affected, or are multiple pools affected?


If only one PG is down, only one pool is affected.
A PG's name is {pool-num}.{pg-id}, and you can map the pool number to
the pool name with "ceph osd lspools".


ceph health detail
will show which PGs are down, along with any other issues.

ceph pg ls
will show all PGs, their status and the OSDs they are placed on.
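
A quick example of that workflow (the PG id 5.1f is made up):

  # list the problem PGs; the number before the dot is the pool number
  ceph health detail | grep down

  # map that pool number to a pool name
  ceph osd lspools

  # more detail on a single PG
  ceph pg 5.1f query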

Some useful links
https://docs.ceph.com/en/quincy/rados/operations/monitoring-osd-pg/#monitoring-pg-states
https://docs.ceph.com/en/quincy/rados/troubleshooting/troubleshooting-pg/
https://docs.ceph.com/en/latest/dev/placement-group/#user-visible-pg-states


--
Kai Stian Olstad


[ceph-users] Re: About lost disk with erasure code

2023-12-28 Thread Phong Tran Thanh
Dear Kai Stian Olstad,
Thank you for the information, it is very helpful to me.

On Thu, 28 Dec 2023 at 15:06, Kai Stian Olstad <ceph+l...@olstad.com> wrote:

> [quoted message trimmed]


-- 
Best regards,

Tran Thanh Phong

Email: tranphong...@gmail.com
Skype: tranphong079