Re: Why WAL archives enabled by default?

Alex Plehanov Fri, 06 Nov 2020 02:45:45 -0800

Guys,

We already have FileWriteAheadLogManager#maxSegCountWithoutCheckpoint.
A checkpoint is triggered if there are too many WAL segments without a
checkpoint. It looks like this is the feature you are talking about.
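
A rough sketch of what such a check could look like. Only the name
maxSegCountWithoutCheckpoint comes from the code base; the class, the
methods and the exact comparison below are hypothetical and are not the
actual Ignite implementation.

```java
/** Hypothetical sketch of a "too many segments since the last checkpoint" check. */
final class SegmentCheckpointTrigger {
    /** Assumed threshold; plays the role of maxSegCountWithoutCheckpoint. */
    private final long maxSegCountWithoutCheckpoint;

    /** Index of the WAL segment that was current when the last checkpoint finished. */
    private long lastCheckpointSegIdx;

    SegmentCheckpointTrigger(long maxSegCountWithoutCheckpoint, long startSegIdx) {
        this.maxSegCountWithoutCheckpoint = maxSegCountWithoutCheckpoint;
        this.lastCheckpointSegIdx = startSegIdx;
    }

    /** Called on every rollover; returns true if a checkpoint should be forced. */
    boolean onRollover(long newSegIdx) {
        return newSegIdx - lastCheckpointSegIdx >= maxSegCountWithoutCheckpoint;
    }

    /** Called when a checkpoint finishes; resets the segment counter. */
    void onCheckpointFinished(long currentSegIdx) {
        lastCheckpointSegIdx = currentSegIdx;
    }

    public static void main(String[] args) {
        SegmentCheckpointTrigger trigger = new SegmentCheckpointTrigger(3, 0);

        for (long idx = 1; idx <= 4; idx++)
            System.out.println("segment " + idx + " -> force checkpoint: " + trigger.onRollover(idx));
    }
}
```

With a check like this, the number of segments that can pile up between
two checkpoints is bounded by the threshold, regardless of how rarely
time-based checkpoints are configured.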


Fri, 6 Nov 2020 at 13:21, Ivan Daschinsky <ivanda...@gmail.com>:

> Kirill and I discussed the proposed approach privately. As far as I
> understand, Kirill suggests implementing a heuristic that forces a
> checkpoint in some cases, so that the requested WAL archive size is
> preserved even if the user has misconfigured the cluster by mistake.
> To me, this approach is currently questionable because it can cause
> performance problems. Still, it could be offered as an option, and it
> should be switchable.
>
> Fri, 6 Nov 2020 at 12:36, Ivan Daschinsky <ivanda...@gmail.com>:
>
> > Kirill, how will your approach help if the user has tuned the cluster
> > to do checkpoints rarely under load?
> > No way.
> >
> > Fri, 6 Nov 2020 at 12:19, ткаленко кирилл <tkalkir...@yandex.ru>:
> >
> >> Ivan, I agree with you that the archive is primarily an optimization.
> >>
> >> If the size of the archive is critical for the user, we have no
> >> protection against exceeding it: we can always go beyond this limit.
> >> Thus, the user needs to remember this and configure the cluster
> >> accordingly.
> >>
> >> I suggest that we never exceed this limit, which gives the user the
> >> expected behavior. At the same time, the segments needed for recovery
> >> will remain and there will be no data loss.
> >>
> >> 06.11.2020, 11:29, "Ivan Daschinsky" <ivanda...@gmail.com>:
> >> > Guys, first of all, archiving is not for PITR at all; it is an
> >> > optimization. If we disable archiving, we need to create a new file
> >> > on every rollover. If we enable archiving, we reserve 10 (by default)
> >> > segments filled with zeroes.
> >> > We use mmap by default, so with the no-archiver approach:
> >> > 1. We first create a new empty file.
> >> > 2. We call sun.nio.ch.FileChannelImpl#map on it, which under the hood:
> >> >    a. If the file is shorter than the WAL segment size, calls
> >> >       sun.nio.ch.FileDispatcherImpl#truncate0, which under the hood is
> >> >       just the truncate system call [1].
> >> >    b. Then calls sun.nio.ch.FileChannelImpl#map0, which issues the
> >> >       mmap system call on this file; see [2].
> >> > These manipulations are neither free nor cheap, so rollover will be
> >> > much slower.
> >> > If archiving is enabled, 10 segments are already preallocated at the
> >> > moment the node starts.
> >> >
> >> > When archiving is enabled, the archiver just copies the previous
> >> > preallocated segment and moves it to the archive directory.
> >> > This archived segment is crucial for recovery. When a new checkpoint
> >> > finishes, all segments eligible for truncation are simply removed.
> >> >
> >> > If archiving is disabled, we still write WAL segments to the wal
> >> > directory, and disabling archiving doesn't prevent segments from
> >> > being stored if they are required for recovery.
> >> >
> >> >>> Before increasing the size of the WAL archive (transferring to the
> >> >>> archive/rollover, compression, decompression), we can make sure that
> >> >>> there will be enough space in the archive, and if there is not, we
> >> >>> will try to clean it. We cannot delete segments that are required
> >> >>> for recovery (between the last two checkpoints) or reserved, for
> >> >>> example, for historical rebalancing.
> >> >
> >> > First of all, compression/decompression is off-topic here.
> >> > Secondly, WAL segments are required only if their index is higher
> >> > than that of the LAST checkpoint marker.
> >> > Thirdly, archiving and rollover can happen during a checkpoint, and
> >> > we can accidentally break everything.
> >> > Fourthly, I see no benefit in overcomplicating already complicated
> >> > logic.
> >> > This is basically a problem of misunderstanding and tuning.
> >> > There are a lot of similar topics for almost every DB. [3]
> >> >
> >> > [1] -- https://man7.org/linux/man-pages/man2/ftruncate.2.html
> >> > [2] -- https://man7.org/linux/man-pages/man2/mmap.2.html
> >> > [3] -- https://www.google.com/search?q=pg_wal%2Fxlogtemp+no+space+left+on+device&oq=pg+wal+no
> >> >
> >> > Fri, 6 Nov 2020 at 10:42, ткаленко кирилл <tkalkir...@yandex.ru>:
> >> >
> >> >>  Hi, Ivan!
> >> >>
> >> >>  I have only described ideas. But here are a few more details.
> >> >>
> >> >>  We can take care not to go beyond
> >> >>  DataStorageConfiguration#maxWalArchiveSize.
> >> >>
> >> >>  Before increasing the size of the WAL archive (transferring to the
> >> >>  archive/rollover, compression, decompression), we can make sure that
> >> >>  there will be enough space in the archive, and if there is not, we
> >> >>  will try to clean it. We cannot delete segments that are required
> >> >>  for recovery (between the last two checkpoints) or reserved, for
> >> >>  example, for historical rebalancing.
> >> >>
> >> >>  We can receive notifications about checkpoint changes and about the
> >> >>  reservation / release of segments, so we can know how many segments
> >> >>  we can delete right now.
> >> >>
> >> >>  06.11.2020, 09:53, "Ivan Daschinsky" <ivanda...@gmail.com>:
> >> >>  >>> For example, when trying to move a segment to the archive.
> >> >>  >
> >> >>  > We cannot do this; we would lose data. We can truncate an archived
> >> >>  > segment if and only if it is not required for recovery. If the last
> >> >>  > checkpoint marker points to a segment with a lower index, we cannot
> >> >>  > delete any segment with a higher index. So the only moment when we
> >> >>  > can remove or truncate segments is the finish of a checkpoint.
> >> >>  >
> >> >>  > Fri, 6 Nov 2020 at 09:46, ткаленко кирилл <tkalkir...@yandex.ru>:
> >> >>  >
> >> >>  >> Hello, everybody!
> >> >>  >>
> >> >>  >> As far as I know, the WAL archive is used for PITR (a GridGain
> >> >>  >> feature) and historical rebalancing.
> >> >>  >>
> >> >>  >> Facundo seems to have a problem with running out of space in the
> >> >>  >> directory (/opt/work/walarchive).
> >> >>  >> Currently, the WAL archive is cleared at the end of a checkpoint.
> >> >>  >> Potentially, a long transaction may prevent a checkpoint from
> >> >>  >> starting, so the WAL archive is not cleaned, which leads to such
> >> >>  >> an error.
> >> >>  >> At the moment, the workaround I see is to increase the size of the
> >> >>  >> directory (/opt/work/walarchive) in k8s and to avoid long
> >> >>  >> transactions or anything similar that modifies data and runs for
> >> >>  >> a long time.
> >> >>  >>
> >> >>  >> And it is best to fix the logic of working with the WAL archive. I
> >> >>  >> think we should remove WAL archive cleanup from the end of the
> >> >>  >> checkpoint and do it on demand. For example, when trying to move a
> >> >>  >> segment to the archive.
> >> >>  >>
> >> >>  >> 06.11.2020, 01:58, "Denis Magda" <dma...@apache.org>:
> >> >>  >> > Folks,
> >> >>  >> >
> >> >>  >> > In my understanding, you need the archives only for features
> >> >>  >> > such as PITR. Considering that the PITR functionality is not
> >> >>  >> > provided in Ignite, why do we have the archives enabled by
> >> >>  >> > default?
> >> >>  >> >
> >> >>  >> > How about having this feature disabled by default to prevent the
> >> >>  >> > following issues experienced by our users:
> >> >>  >> > http://apache-ignite-users.70518.x6.nabble.com/WAL-and-WAL-Archive-volume-size-recommendation-td34458.html
> >> >>  >> >
> >> >>  >> > -
> >> >>  >> > Denis
> >> >>  >
> >> >>  > --
> >> >>  > Sincerely yours, Ivan Daschinskiy
> >> >
> >> > --
> >> > Sincerely yours, Ivan Daschinskiy
> >>
> >
> >
> > --
> > Sincerely yours, Ivan Daschinskiy
> >
>
>
> --
> Sincerely yours, Ivan Daschinskiy
>
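
For illustration only: the deletion rule discussed in the quoted messages
(an archived segment may be removed only if it is not needed for recovery
from the last checkpoint marker and is not reserved, for example for
historical rebalancing) could be sketched as below. The class, method and
parameter names are hypothetical, and keeping the segment that contains
the checkpoint marker itself is a conservative assumption of this sketch,
not something confirmed by the thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not Ignite's actual archive cleanup code. */
final class ArchiveCleanupSketch {
    /**
     * @param archivedIdxs       indexes of segments currently in the archive
     * @param lastCpMarkerSegIdx index of the segment the last checkpoint marker points to
     * @param minReservedSegIdx  lowest reserved segment index, or Long.MAX_VALUE if none
     * @return indexes of archived segments that are safe to delete
     */
    static List<Long> deletableSegments(List<Long> archivedIdxs,
                                        long lastCpMarkerSegIdx,
                                        long minReservedSegIdx) {
        // Everything from this index upwards must be kept.
        long keepFrom = Math.min(lastCpMarkerSegIdx, minReservedSegIdx);

        List<Long> res = new ArrayList<>();

        for (long idx : archivedIdxs) {
            if (idx < keepFrom)
                res.add(idx);
        }

        return res;
    }
}
```

Because keepFrom only moves forward when a checkpoint finishes or a
reservation is released, those are the only moments at which the set of
deletable segments can grow.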

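For illustration only: the no-archiver rollover steps described in the
quoted message (create a new empty file, then map it, which extends the
file to the segment size before the mmap call) correspond roughly to the
public NIO calls below. The class name, method name and the 1 MB size are
hypothetical and chosen for the example; this is not the code Ignite runs.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch of creating and mapping a fresh WAL segment file. */
final class SegmentMapSketch {
    /** Creates the file if needed and maps segmentSize bytes of it read-write. */
    static MappedByteBuffer createAndMap(Path file, int segmentSize) {
        try (FileChannel ch = FileChannel.open(file,
            StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Per the quoted message, mapping a range beyond the current file
            // size makes the JDK extend the file first (truncate system call)
            // and only then perform the mmap system call.
            // The mapping stays valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_WRITE, 0, segmentSize);
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("wal-segment-sketch", ".bin");

        long start = System.nanoTime();
        MappedByteBuffer buf = createAndMap(file, 1 << 20);
        long micros = (System.nanoTime() - start) / 1_000;

        System.out.println("mapped " + buf.capacity() + " bytes in " + micros + " us");
    }
}
```

With preallocated segments, the file creation and extension are paid once
at node start instead of on every rollover, which is the optimization the
thread refers to.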