Re: WAL Archive Issue
Hi Ivan,

Excellent idea, I've added this to the https://issues.apache.org/jira/browse/IGNITE-7730 ticket.

Sincerely,
Dmitriy Pavlov

On Wed, 14 Feb 2018 at 0:28, Ivan Rakov wrote:

> > - applying compressor to segments older than 1 completed checkpoint ago - saves space.
>
> By the way: WAL compression is already implemented that way. If there are any ".zip" segments in the archive dir, they are free to delete.
> This can be a safe workaround for users who run short of free space: just delete the compressed segments. We should mention it in the documentation for the 2.4 release.
Re: WAL Archive Issue
> - applying compressor to segments older than 1 completed checkpoint ago - saves space.

By the way: WAL compression is already implemented that way. If there are any ".zip" segments in the archive dir, they are free to delete.
This can be a safe workaround for users who run short of free space: just delete the compressed segments. We should mention it in the documentation for the 2.4 release.

Best Regards,
Ivan Rakov

On 13.02.2018 23:53, Dmitry Pavlov wrote:

> I see, it seems the subgoal 'gain predictable size' can be achieved with the following options:
> - https://issues.apache.org/jira/browse/IGNITE-6552 implementation (in the variant of '...WAL history size in time units and maximum size in GBytes'; here we probably should change the description or create a 2nd issue),
> - no-archiver mode (segments can still be deleted, but in the same directory they were written to) - maximum performance on ext* fs,
> - applying compressor to segments older than 1 completed checkpoint ago - saves space.
>
> Is it necessary to store data we can safely remove?
>
> Or maybe Ignite should handle this by itself and delete unnecessary segments when space on the device runs low, like Linux shrinks the page cache when there is no free RAM left.
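The workaround Ivan describes (deleting compressed ".zip" segments from the WAL archive directory) could look roughly like the sketch below. It is not Ignite code: the class name `CompressedSegmentCleaner` and the method are hypothetical, the archive path is whatever the node is configured with, and the sketch relies only on the thread's statement that ".zip" segments are free to delete.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Ignite.
public class CompressedSegmentCleaner {
    /**
     * Deletes every regular file matching "*.zip" directly inside archiveDir
     * and returns how many files were deleted. Uncompressed segments are left alone.
     */
    public static int deleteCompressedSegments(Path archiveDir) throws IOException {
        int deleted = 0;
        // The glob is matched against file names in this directory only (no recursion).
        try (DirectoryStream<Path> zips = Files.newDirectoryStream(archiveDir, "*.zip")) {
            for (Path segment : zips) {
                if (Files.isRegularFile(segment)) {
                    Files.delete(segment);
                    deleted++;
                }
            }
        }
        return deleted;
    }
}
```

A call such as `CompressedSegmentCleaner.deleteCompressedSegments(Paths.get("/path/to/wal/archive"))` would free the space; the path here is a placeholder.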
Re: WAL Archive Issue
I see, it seems the subgoal 'gain predictable size' can be achieved with the following options:
- https://issues.apache.org/jira/browse/IGNITE-6552 implementation (in the variant of '...WAL history size in time units and maximum size in GBytes'; here we probably should change the description or create a 2nd issue),
- no-archiver mode (segments can still be deleted, but in the same directory they were written to) - maximum performance on ext* fs,
- applying compressor to segments older than 1 completed checkpoint ago - saves space.

Is it necessary to store data we can safely remove?

Or maybe Ignite should handle this by itself and delete unnecessary segments when space on the device runs low, like Linux shrinks the page cache when there is no free RAM left.

On Tue, 13 Feb 2018 at 23:32, Ivan Rakov wrote:

> As far as I understand, the idea is a WAL archive with predictable size ("N checkpoints" is not a predictable size), which can be safely removed (e.g. if free disk space is urgently needed) without losing crash recovery.
>
> No-archiver mode makes sense as well - it should be faster than the current mode (at least on filesystems other than XFS). It will be useful for users who have lots of disk space and want maximum throughput.
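The "maximum size" option Dmitry lists can be pictured with a small calculation: given the sizes of archived segments, oldest first, how many of the oldest must go so that the rest fits a byte limit. This is only an illustrative sketch; `ArchiveSizeLimiter` and `segmentsToDrop` are made-up names, and it assumes the caller passes in only segments that are no longer needed for crash recovery.

```java
import java.util.List;

// Hypothetical helper illustrating a size-bounded archive; not Ignite's implementation.
public class ArchiveSizeLimiter {
    /**
     * Returns how many of the oldest segments have to be removed so that the
     * total size of the remaining segments is at most maxBytes.
     * sizesOldestFirst must contain only segments that are safe to delete.
     */
    public static int segmentsToDrop(List<Long> sizesOldestFirst, long maxBytes) {
        long total = 0;
        for (long size : sizesOldestFirst) {
            total += size;
        }
        int drop = 0;
        // Remove from the oldest end until the limit is met or nothing is left.
        while (drop < sizesOldestFirst.size() && total > maxBytes) {
            total -= sizesOldestFirst.get(drop);
            drop++;
        }
        return drop;
    }
}
```

With four 64-byte segments and a 150-byte limit, the two oldest would be dropped, leaving 128 bytes.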
Re: WAL Archive Issue
As far as I understand, the idea is a WAL archive with predictable size ("N checkpoints" is not a predictable size), which can be safely removed (e.g. if free disk space is urgently needed) without losing crash recovery.

No-archiver mode makes sense as well - it should be faster than the current mode (at least on filesystems other than XFS). It will be useful for users who have lots of disk space and want maximum throughput.

Best Regards,
Ivan Rakov

On 13.02.2018 23:14, Dmitry Pavlov wrote:

> Hi,
>
> I didn't get the point why it may be required to separate WAL work, WAL uncheckpointed archive (some work outside segment rotation) and checkpointed archive (which is better compressed using the new Ignite feature, the WAL compressor).
>
> Please consider the new no-archiver mode implemented recently.
>
> If the archive folder confuses end users, the grid admin may set up this mode (all segments are placed in 1 directory) instead of introducing folders.
Re: WAL Archive Issue
Hi,

I didn't get the point why it may be required to separate WAL work, WAL uncheckpointed archive (some work outside segment rotation) and checkpointed archive (which is better compressed using the new Ignite feature, the WAL compressor).

Please consider the new no-archiver mode implemented recently.

If the archive folder confuses end users, the grid admin may set up this mode (all segments are placed in 1 directory) instead of introducing folders.

On Tue, 13 Feb 2018 at 22:11, Ivan Rakov wrote:

> I think I got the point now.
> There's no need to copy files from the "temp" to the "archive" dir - we can just move them, which is a constant-time operation.
> Makes sense.
>
> The change is quite complex (we need to synchronize all moves thoroughly to avoid breaking existing WAL read iterators), but feasible.
Re: WAL Archive Issue
I think I got the point now.
There's no need to copy files from the "temp" to the "archive" dir - we can just move them, which is a constant-time operation.
Makes sense.

The change is quite complex (we need to synchronize all moves thoroughly to avoid breaking existing WAL read iterators), but feasible.

Best Regards,
Ivan Rakov

On 13.02.2018 22:06, Ivan Rakov wrote:

> Yakov,
>
> This will work. However, I expect performance degradation with this change. Disk storage has a limited number of I/O operations per second at the hardware level. The list of already existing disk I/O activities (writing to the WAL work dir, copying from the WAL work dir to the WAL archive dir, writing partition files during checkpoint) will be extended with a new one: copying from the WAL work dir to the temp dir.
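The copy-then-move idea discussed here can be sketched with plain `java.nio.file` calls. The sketch assumes the temp and archive directories live on the same file store, so the move is a rename rather than a data copy; `SegmentMover` and its methods are hypothetical names, and the synchronization with WAL read iterators that Ivan mentions is deliberately left out.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper illustrating the proposal; not Ignite's archiver.
public class SegmentMover {
    /** Copies a finished segment from the work dir into the temp dir (I/O proportional to file size). */
    public static Path copyToTemp(Path workSegment, Path tempDir) throws IOException {
        Path target = tempDir.resolve(workSegment.getFileName());
        return Files.copy(workSegment, target, StandardCopyOption.REPLACE_EXISTING);
    }

    /**
     * Moves a segment from the temp dir to the archive dir after checkpoint.
     * ATOMIC_MOVE makes the call fail with AtomicMoveNotSupportedException
     * instead of silently falling back to copy-and-delete across file stores.
     */
    public static Path moveToArchive(Path tempSegment, Path archiveDir) throws IOException {
        Path target = archiveDir.resolve(tempSegment.getFileName());
        return Files.move(tempSegment, target, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

Only `copyToTemp` transfers segment data; `moveToArchive` changes directory entries, which is why it does not add a second full copy to the I/O load.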
Re: WAL Archive Issue
Yakov,

This will work. However, I expect performance degradation with this change. Disk storage has a limited number of I/O operations per second at the hardware level. The list of already existing disk I/O activities (writing to the WAL work dir, copying from the WAL work dir to the WAL archive dir, writing partition files during checkpoint) will be extended with a new one: copying from the WAL work dir to the temp dir.

Best Regards,
Ivan Rakov

On 13.02.2018 21:35, Yakov Zhdanov wrote:

> Ivan,
>
> I do not want to create new files. As far as I know, we currently copy segments to the archive dir before they get checkpointed. What I suggest is to copy them to a temp dir under the wal directory and then move them to the archive. In my understanding, by the time we copy the files to the temp folder all changes to them are already fsynced.
>
> Correct?
Re: WAL Archive Issue
Ivan,

I do not want to create new files. As far as I know, we currently copy segments to the archive dir before they get checkpointed. What I suggest is to copy them to a temp dir under the wal directory and then move them to the archive. In my understanding, by the time we copy the files to the temp folder all changes to them are already fsynced.

Correct?

Yakov Zhdanov,
www.gridgain.com

2018-02-13 21:29 GMT+03:00 Ivan Rakov:

> Yakov,
>
> I see only one problem with your suggestion - the number of "uncheckpointed" segments is potentially unlimited.
> Right now we have a limited number (10) of file segments with immutable names in the WAL "work" directory. We have to keep this approach due to a known bug in XFS - fsync time is nearly twice as long for recently created files.
Re: WAL Archive Issue
Yakov,

I see only one problem with your suggestion - the number of "uncheckpointed" segments is potentially unlimited.
Right now we have a limited number (10) of file segments with immutable names in the WAL "work" directory. We have to keep this approach due to a known bug in XFS - fsync time is nearly twice as long for recently created files.

Best Regards,
Ivan Rakov

On 13.02.2018 21:22, Yakov Zhdanov wrote:

> I meant we will still be copying the segment once and then moving it to the archive, which should not affect the file system much.
>
> Thoughts?
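The "limited number (10) of file segments with immutable names" can be pictured as a ring: an ever-growing segment index is mapped onto a fixed set of preallocated files that get overwritten in turn, so no new files are created in the work directory. The sketch below only illustrates that mapping; the class name and the 16-digit ".wal" naming pattern are assumptions made for the example, not taken from the thread.

```java
// Hypothetical illustration of a fixed ring of work segments; not Ignite code.
public class WorkSegmentRing {
    private final int segments;

    public WorkSegmentRing(int segments) {
        if (segments <= 0) {
            throw new IllegalArgumentException("segments must be positive");
        }
        this.segments = segments;
    }

    /** Index of the preallocated work file that the absolute segment absIdx is written to. */
    public int workIndex(long absIdx) {
        return (int) (absIdx % segments);
    }

    /** File name in the work dir; the zero-padded pattern is an assumed convention. */
    public String workFileName(long absIdx) {
        return String.format("%016d.wal", workIndex(absIdx));
    }
}
```

With 10 segments, absolute segment 13 reuses the same work file as segment 3, which is why a segment has to be archived (or otherwise released) before its slot comes around again.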
Re: WAL Archive Issue
I meant we will still be copying the segment once and then moving it to the archive, which should not affect the file system much.

Thoughts?

--Yakov

2018-02-13 21:19 GMT+03:00 Yakov Zhdanov:

> Alex,
>
> I remember we had some confusing behavior for the WAL archive when archived segments were required for successful recovery.
>
> Is the issue still present?
>
> If yes, what if we copy "uncheckpointed" segments to a directory under the wal directory and then move the segments to the archive after checkpoint? Will this work?
>
> Thanks!
WAL Archive Issue
Alex,

I remember we had some confusing behavior for the WAL archive when archived segments were required for successful recovery.

Is the issue still present?

If yes, what if we copy "uncheckpointed" segments to a directory under the wal directory and then move the segments to the archive after checkpoint? Will this work?

Thanks!

--Yakov