[ceph-users] Re: Can I delete rgw log entries?

2023-04-20 Thread Richard Bade
Ok, cool. Thanks for clarifying that Daniel and Casey.
I'll clean up my sync logs now but leave the rest alone.

Rich

On Fri, 21 Apr 2023, 05:46 Daniel Gryniewicz wrote:

> On 4/20/23 10:38, Casey Bodley wrote:
> > On Sun, Apr 16, 2023 at 11:47 PM Richard Bade wrote:
> >>
> >> Hi Everyone,
> >> I've been having trouble finding an answer to this question. Basically
> >> I'm wanting to know if stuff in the .log pool is actively used for
> >> anything or if it's just logs that can be deleted.
> >> In particular I was wondering about sync logs.
> >> In my particular situation I have had some tests of zone sync setup,
> >> but now I've removed the secondary zone and pools. My primary zone is
> >> filled with thousands of logs like this:
> >> data_log.71
> >> data.full-sync.index.e2cf2c3e-7870-4fc4-8ab9-d78a17263b4f.47
> >> meta.full-sync.index.7
> >> datalog.sync-status.shard.e2cf2c3e-7870-4fc4-8ab9-d78a17263b4f.13
> >> bucket.sync-status.f3113d30-ecd3-4873-8537-aa006e54b884:{bucketname}:default.623958784.455
> >>
> >> I assume that because I'm not doing any sync anymore I can delete all
> >> the sync related logs? Is anyone able to confirm this?
> >
> > yes
> >
> >> What about if the sync is running? Are these being written and read
> >> from and therefore must be left alone?
> >
> > right. while a multisite configuration is operating, the replication
> > logs will be trimmed in the background. in addition to the replication
> > logs, the log pool also contains sync status objects. these track the
> > progress of replication, and removing those objects would generally
> > cause sync to start over from the beginning
> >
> >> It seems like these are more of a status than just a log and that
> >> deleting them might confuse the sync process. If so, does that mean
> >> that the log pool is not just output that can be removed as needed?
> >> Are there perhaps other things in there that need to stay?
> >
> > the log pool is used by several subsystems like multisite sync,
> > garbage collection, bucket notifications, and lifecycle. those
> > features won't work reliably if you delete their rados objects
> >
>
> Also, to be clear (in case you were confused), these logs are not data
> to be read by admins (like "log files") but structured data that
> represents changes to be used by syncing (like "log structured
> filesystem").  So deleting logs while sync is running will break sync.
>
> Daniel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Can I delete rgw log entries?

2023-04-20 Thread Daniel Gryniewicz

On 4/20/23 10:38, Casey Bodley wrote:
> On Sun, Apr 16, 2023 at 11:47 PM Richard Bade wrote:
>>
>> Hi Everyone,
>> I've been having trouble finding an answer to this question. Basically
>> I'm wanting to know if stuff in the .log pool is actively used for
>> anything or if it's just logs that can be deleted.
>> In particular I was wondering about sync logs.
>> In my particular situation I have had some tests of zone sync setup,
>> but now I've removed the secondary zone and pools. My primary zone is
>> filled with thousands of logs like this:
>> data_log.71
>> data.full-sync.index.e2cf2c3e-7870-4fc4-8ab9-d78a17263b4f.47
>> meta.full-sync.index.7
>> datalog.sync-status.shard.e2cf2c3e-7870-4fc4-8ab9-d78a17263b4f.13
>> bucket.sync-status.f3113d30-ecd3-4873-8537-aa006e54b884:{bucketname}:default.623958784.455
>>
>> I assume that because I'm not doing any sync anymore I can delete all
>> the sync related logs? Is anyone able to confirm this?
>
> yes
>
>> What about if the sync is running? Are these being written and read
>> from and therefore must be left alone?
>
> right. while a multisite configuration is operating, the replication
> logs will be trimmed in the background. in addition to the replication
> logs, the log pool also contains sync status objects. these track the
> progress of replication, and removing those objects would generally
> cause sync to start over from the beginning
>
>> It seems like these are more of a status than just a log and that
>> deleting them might confuse the sync process. If so, does that mean
>> that the log pool is not just output that can be removed as needed?
>> Are there perhaps other things in there that need to stay?
>
> the log pool is used by several subsystems like multisite sync,
> garbage collection, bucket notifications, and lifecycle. those
> features won't work reliably if you delete their rados objects

Also, to be clear (in case you were confused), these logs are not data 
to be read by admins (like "log files") but structured data that 
represents changes to be used by syncing (like "log structured 
filesystem").  So deleting logs while sync is running will break sync.


Daniel


[ceph-users] Re: Can I delete rgw log entries?

2023-04-20 Thread Casey Bodley
On Sun, Apr 16, 2023 at 11:47 PM Richard Bade wrote:
>
> Hi Everyone,
> I've been having trouble finding an answer to this question. Basically
> I'm wanting to know if stuff in the .log pool is actively used for
> anything or if it's just logs that can be deleted.
> In particular I was wondering about sync logs.
> In my particular situation I have had some tests of zone sync setup,
> but now I've removed the secondary zone and pools. My primary zone is
> filled with thousands of logs like this:
> data_log.71
> data.full-sync.index.e2cf2c3e-7870-4fc4-8ab9-d78a17263b4f.47
> meta.full-sync.index.7
> datalog.sync-status.shard.e2cf2c3e-7870-4fc4-8ab9-d78a17263b4f.13
> bucket.sync-status.f3113d30-ecd3-4873-8537-aa006e54b884:{bucketname}:default.623958784.455
>
> I assume that because I'm not doing any sync anymore I can delete all
> the sync related logs? Is anyone able to confirm this?

yes

> What about if the sync is running? Are these being written and read
> from and therefore must be left alone?

right. while a multisite configuration is operating, the replication
logs will be trimmed in the background. in addition to the replication
logs, the log pool also contains sync status objects. these track the
progress of replication, and removing those objects would generally
cause sync to start over from the beginning

> It seems like these are more of a status than just a log and that
> deleting them might confuse the sync process. If so, does that mean
> that the log pool is not just output that can be removed as needed?
> Are there perhaps other things in there that need to stay?

the log pool is used by several subsystems like multisite sync,
garbage collection, bucket notifications, and lifecycle. those
features won't work reliably if you delete their rados objects

>
> Regards,
> Richard
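
As a minimal sketch of the distinction drawn in this thread, the Python helper below separates the sync-related object names Richard listed from everything else found in the log pool. The prefixes are copied from the names quoted above; treating them as the complete set of sync-related objects is an assumption and may not hold for every Ceph release. The function name split_log_pool_objects and the sample name gc.3 are hypothetical. The helper only classifies names and deletes nothing, and per Casey's reply the sync-related objects are safe to remove only when multisite sync is no longer in use.

```python
# Prefixes copied from the object names quoted in this thread.
# Assumption: this list is illustrative, not an exhaustive inventory.
SYNC_PREFIXES = (
    "data_log.",
    "data.full-sync.index.",
    "meta.full-sync.index.",
    "datalog.sync-status.shard.",
    "bucket.sync-status.",
)


def split_log_pool_objects(names):
    """Partition object names into (sync_related, other), preserving order."""
    sync_related, other = [], []
    for name in names:
        # str.startswith accepts a tuple and matches any of the prefixes.
        if name.startswith(SYNC_PREFIXES):
            sync_related.append(name)
        else:
            other.append(name)
    return sync_related, other


if __name__ == "__main__":
    sample = [
        "data_log.71",
        "meta.full-sync.index.7",
        "gc.3",  # hypothetical name standing in for a non-sync object
    ]
    sync_related, other = split_log_pool_objects(sample)
    print("sync-related:", sync_related)
    print("leave alone:", other)
```

The input names could come from the output of `rados -p <log pool> ls`, one name per line; the actual pool name depends on the zone configuration. Anything that lands in the second list belongs to other subsystems (garbage collection, bucket notifications, lifecycle) and should stay.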