Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
Hello,

Please find attached an updated patch.

> Flag isn't reset on error.

Corrected in the attached.

> + pgstat_reset_activityflag;
> Does this actually compile?

It does compile, but with no effect. It has been corrected.

> snprintf()? I don't think you need to keep track of schemaname_len at all.

memcpy() has been replaced by snprintf() to avoid calculating schemaname_len.

> In fact, I wonder if you need to send total_pages at all -- surely the client
> can add both total_heap_pages and total_index_pages by itself ...

This has been corrected in the attached patch.

> It seems a bit strange that the remaining progress_param entries are not
> initialized to anything. Also, why aren't the number of params of each type
> saved too?

The number of params for each command remains constant, hence it has been
hardcoded.

> In the receiving code you check whether each value equals 0, and if it does
> then report NULL, but imagine vacuuming a table with no indexes where the
> number of index pages is going to be zero.
> Shouldn't we display zero there rather than null?

Agreed. IIUC, NULL should rather be used when a value is invalid. For valid
values like "zero index pages" it is clearer to display 0. This has been
corrected in the attached.

> This patch lacks a comment somewhere explaining how this whole thing works.

A few lines have been added in pgstat.h inside the PgBackendStatus struct.

> I believe you don't need this include.

Corrected.

> This not only adds an unnecessary empty line at the end of the struct
> declaration, but also fails to preserve the "st_" prefix used in all the
> other fields.

Corrected.

Thank you,
Rahila Syed

__
Disclaimer: This email and any attachments are sent in strictest confidence for the sole use of the addressee and may contain legally privileged, confidential, and proprietary data. If you are not the intended recipient, please advise the sender by replying promptly to this email and then delete and destroy this email and any attachments without any further use, copying or forwarding.

Attachment: Vacuum_progress_checker_v6.patch

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
Hello,

> I think that you should add the flag or something which indicates whether
> this backend is running VACUUM or not, into PgBackendStatus.
> pg_stat_vacuum_progress should display the entries of only backends with
> that flag set true. This design means that you need to set the flag to true
> when starting VACUUM and reset at the end of VACUUM progressing.

Please find attached an updated patch which adds a flag in PgBackendStatus
indicating whether the backend is running VACUUM. Also, the
pgstat_report_progress function has been changed to make it generic for all
commands that report progress.

Thank you,
Rahila Syed

Attachment: Vacuum_progress_checker_v5.patch
Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
Hello Fujii-san,

> Here are another review comments.

Thank you for the review. Please find attached an updated patch.

> You removed some empty lines, for example, in vacuum.h.
> Which seems useless to me.

Has been corrected in the attached.

> Why did you use an array to store the progress information of VACUUM?
> I think that it's better to use separate specific variables for them for
> better code readability, for example, variables scanned_pages,
> heap_total_pages, etc.

It is designed this way to keep it generic for all the commands, which can
store different progress parameters in shared memory.

> Currently only progress_param_float[0] is used. So there is no need to use
> an array here.

Same as before. This is for later use by other commands.

> progress_param_float[0] saves the percentage of VACUUM progress.
> But why do we need to save that information into shared memory?
> We can just calculate the percentage whenever pg_stat_get_vacuum_progress()
> is executed, instead. There seems to be no need to save that information.

This has been corrected in the attached.

> char progress_message[PROGRESS_MESSAGE_LENGTH][N_PROGRESS_PARAM];
> As Sawada pointed out, there is no user of this variable.

It is used to store the table name in the updated patch. It can also be used
to display index names or the current phase of VACUUM. This has not been
included in the patch yet, to avoid cluttering the display with too much
information.

> For example, st_activity of autovacuum worker displays "autovacuum ...".
> So as Sawada reported, he could not find any entry for autovacuum in
> pg_stat_vacuum_progress.

In the attached patch, the check is performed for autovacuum as well.

> I think that you should add the flag or something which indicates whether
> this backend is running VACUUM or not, into PgBackendStatus.
> pg_stat_vacuum_progress should display the entries of only backends with
> that flag set true. This design means that you need to set the flag to true
> when starting VACUUM and reset at the end of VACUUM progressing.

This design seems better in the sense that we don't rely on the st_activity
entry to display progress values. A variable which stores flags for running
commands can be added to PgBackendStatus. These flags can then be used when
displaying the progress of a particular command.

> That is, non-superuser should not be allowed to see the
> pg_stat_vacuum_progress entry of superuser running VACUUM?

This has been included in the updated patch.

> This code may cause total_pages and total_heap_pages to decrease while
> VACUUM is running.

Yes. This is because the initial count of total pages to be vacuumed and the
pages actually vacuumed can differ depending on the visibility of tuples.
Pages that are all-visible are skipped and hence subtracted from the total
page count.

Thank you,
Rahila Syed

Attachment: Vacuum_progress_checker_v4.patch
Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
Hello,

Please check the attached patch, as the earlier one had a typo in the
regression test output.

> +#define PG_STAT_GET_PROGRESS_COLS 30
> Why did you use 30?

That comes from N_PROGRESS_PARAM * 3, where N_PROGRESS_PARAM = 10 is the
number of progress parameters of each type stored in shared memory. There are
three such types (int, float, string), hence the total number of progress
parameters can be 30.

Thank you,
Rahila Syed

Attachment: Vacuum_progress_checker_v4.patch
Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
Hello,

Please find attached a patch with the bugs reported by Thom and Sawada-san
fixed.

> * The progress of vacuum by autovacuum seems not to be displayed.

The progress is stored in shared variables during autovacuum. I guess the
reason it is not visible is that the entries are deleted as soon as the
process exits. The progress can, however, be viewed while the autovacuum
worker is running.

Thank you,
Rahila Syed

Attachment: Vacuum_progress_checker_v3.patch
Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
Hello Thom,

> Okay, I've just tested this with a newly-loaded table (1,252,973 rows of
> jsonb data),

Thanks a lot!

> but after it's finished, I end up with this:
>
> json=# select * from pg_stat_vacuum_progress;
> -[ RECORD 1 ]-------+-------
> pid                 | 5569
> total_pages         | 217941
> scanned_pages       | 175243
> total_heap_pages    | 175243
> scanned_heap_pages  | 175243
> total_index_pages   | 42698
> scanned_index_pages |
> percent_complete    | 80
>
> This was running with a VACUUM ANALYZE. This output seems to suggest that it
> didn't complete.

OK. The patch fails here because "total pages to be scanned" takes index pages
into account, while no index pages were actually scanned. So the scanned page
count never reaches the total page count. I will fix this. It seems that no
index pages were scanned because there were no dead tuples to clean up, the
table being newly loaded.

> After, I ran VACUUM FULL. pg_stat_vacuum_progress didn't change from before,
> so that doesn't appear to show up in the view.

The scope of this patch is to report the progress of plain VACUUM. It does not
take VACUUM FULL into account yet. I think this can be added once plain VACUUM
progress is done.

> I then deleted 40,000 rows from my table, and ran VACUUM ANALYZE again. This
> time it progressed and percent_complete reached 100.

OK.

Thank you,
Rahila Syed.
Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
Hello,

Please find attached an updated VACUUM progress checker patch. The following
have been accomplished in the patch:

1. Accounts for the index page count while calculating the total progress of
   VACUUM.
2. A common location for storing progress parameters for any command. The
   idea is that every command which needs to report progress can populate and
   interpret the shared variables in its own way. Each can display progress
   by implementing a separate view.
3. A separate VACUUM progress view displaying various progress parameters has
   been implemented. Progress of phases like heap scan and index scan, and
   the total pages scanned, is reported along with the completion percentage.
4. This view can display progress for all active backends running VACUUM.

Basic testing has been performed. Thorough testing is yet to be done. Marking
it as Needs Review in the September commitfest.

ToDo:
- Display the count of heap pages actually vacuumed (line pointers marked
  unused).
- Display the percentage of work_mem being used to store dead tuples.

Thank you,
Rahila Syed

Attachment: Vacuum_progress_checker_v2.patch
Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
Hello,

> Autovacuum knows what % of a table needs to be cleaned - that is how it is
> triggered. When a vacuum runs we should calculate how many TIDs we will
> collect and therefore how many trips to the indexes we need for given
> memory. We can use the VM to find out how many blocks we'll need to scan in
> the table. So overall, we know how many blocks we need to scan.

Total heap pages to be scanned can be obtained from the VM, as mentioned. To
figure out the number of index scans, we need an estimate of dead tuples.
IIUC, autovacuum learns that a table has to be cleaned by looking at the
pgstat entry for the table, i.e. n_dead_tuples. Hence, an initial estimate of
dead-tuple TIDs can be made using n_dead_tuples in pgstat.

n_dead_tuples in the pgstat table entry is the value updated by the last
analyze and may not be up to date. In cases where the pgstat entry for a
table is NULL, the number of dead-tuple TIDs cannot be estimated. In such
cases, we can start with an initial estimate of 1 index scan and later add
the index page count to the total page count (heap + index) whenever the
number of index scans exceeds the estimate.

Thank you,
Rahila Syed.
Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
Hello,

> When we're in Phase 2 or 3, don't we need to report the number of total
> pages scanned, or the percentage of table pages scanned, as well?

The total heap pages scanned need to be reported with phase 2 or 3. A complete
progress report needs to have numbers from each phase when applicable:

Phase 1. Report 2 integer counters: heap pages scanned and total heap pages;
1 float counter: percent_complete; and a progress message.
Phase 2. Report 2 integer counters: index pages scanned and total index pages
(across all indexes); and a progress message.
Phase 3. Report 1 integer counter: heap pages vacuumed.

Sorry for being unclear here. What I meant to say is that each phase of a
command will correspond to a slot in COMMAND_NUM_SLOTS. Each phase will be a
separate array element comprising n integers, n floats, and a string. So, in
the view reporting progress, the VACUUM command can have 3 entries, one for
each phase.

Thank you,
Rahila Syed
Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
Hello,

> Naming the GUC pgstat* seems a little inconsistent.

Sorry, there is a typo in the mail. The GUC name is 'track_activity_progress'.

> Also, adding the new GUC to src/backend/utils/misc/postgresql.conf.sample
> might be helpful.

Yes. I will update.

Thank you,
Rahila Syed
Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
Hello,

> TBH, I think that designing this as a hook-based solution is adding a whole
> lot of complexity for no value. The hard parts of the problem are collecting
> the raw data and making the results visible to users, and both of those
> require involvement of the core code. Where is the benefit from pushing some
> trivial intermediate arithmetic into an external module? If there's any at
> all, it's certainly not enough to justify problems such as you mention here.
> So I'd just create a pgstat_report_percent_done() type of interface in
> pgstat.c and then teach VACUUM to call it directly.

Thank you for the suggestion. I agree that adding the code to core will reduce
complexity with no additional overhead. Going by the consensus, I will update
the patch to collect and store progress information from VACUUM in pgstat.c,
with the UI provided by the pg_stat_activity view.

Thank you,
Rahila Syed
Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
Hello,

> Unless I am missing something, I guess you can still keep the actual code
> that updates counters outside the core if you adopt an approach that Simon
> suggests.

Yes. The code that extracts progress information from VACUUM and stores it in
shared memory can live outside core even with pg_stat_activity as the user
interface.

> Whatever the view (existing/new), any related counters would have a valid
> (non-NULL) value when read off the view iff hooks are set, perhaps because
> you have an extension that sets them. I guess that means any operation that
> supports progress tracking would have an extension with suitable hooks
> implemented.

Do you mean to say that any operation/application that wants the
progress-tracking feature would dynamically load the progress-checker module,
which would set the hooks for progress reporting? If yes, unless I am missing
something, such dynamic loading cannot happen if we use pg_stat_activity, as
it reads values from shared memory. The module has to be a
shared_preload_libraries entry to allocate shared memory, which means it can
be loaded only at server restart. Am I missing something?

Thank you,
Rahila Syed
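The restart constraint discussed above comes from how such a module would have
to be configured. A module that allocates shared memory must be listed in
shared_preload_libraries, which is only read at server start (the module name
below is hypothetical):

```
# postgresql.conf -- changing this setting requires a server restart,
# because shared memory is sized and allocated at startup.
shared_preload_libraries = 'progress_checker'
```

By contrast, a hook-only module with no shared-memory needs could be loaded
per-session with LOAD, which is what makes the pg_stat_activity approach
incompatible with purely dynamic loading.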
Re: [HACKERS] [Proposal] More Vacuum Statistics
Hello,

> Maybe, for DBAs, it might be better to show vacuum progress in
> pg_stat_activity (if we'd do that, add a free-style column like "progress"?).
> This column might also be usable for other long-running commands like
> ANALYZE, CREATE/REINDEX and COPY. To realize this feature, we certainly need
> to properly change pgstat_report_activity, use it more, and add a new
> track-activity parameter.

A very similar idea was proposed in the following:
http://www.postgresql.org/message-id/1284756643.25048.42.ca...@vanquo.pezone.net

IIUC, the problem with showing progress in pg_stat_activity is that it
introduces a compulsory progress-calculation overhead in core for every
command. As the work units of each command vary, a common infrastructure might
not be able to represent every command's progress effectively. An architecture
which displays progress only on the user's demand, for each command
separately, will be more efficient. So the suggestion was rather to have a
detailed progress report, including remaining time, for a command on the
user's demand.

FWIW, I am working on designing an approach to report VACUUM progress stats,
for which I will be posting a detailed proposal. The use case is reporting
progress for long-running VACUUMs. The approach involves using hooks to
extract VACUUM progress statistics. The progress can be displayed using a
psql view (e.g. pg_stat_maintenance).

Thank you,
Rahila Syed

-----Original Message-----
From: pgsql-hackers-ow...@postgresql.org
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Naoya Anzai
Sent: Tuesday, June 16, 2015 8:41 AM
To: Tomas Vondra
Cc: pgsql-hackers@postgresql.org; Akio Iwaasa; bench.cof...@gmail.com; Tom
Lane; Jeff Janes; Jim Nasby; Andres Freund; Alvaro Herrera
Subject: Re: [HACKERS] [Proposal] More Vacuum Statistics

Hi,

Thank you for comments, and sorry for my late response.

pg_stat_vacuum view

I understand it is not good to simply add more counters in
pg_stat_*_tables. For now, I'd like to suggest an extension which can report
vacuum statistics, like pg_stat_statements. A similar feature has already
been provided by the pg_statsinfo package, but that is a full-stack package
for PG stats: it needs pg_log redesigned and a repository database set up to
introduce it, and it is not a core extension for PostgreSQL. (I don't intend
to disparage pg_statsinfo; I think this package is a very convenient tool.)
Everyone should be able to tune VACUUM more easily. That's all I want.

> I'm still wondering whether these stats will really make the tuning any
> easier. What I do right now is looking at pg_stat_all_tables.n_dead_tup and
> if it exceeds some threshold, it's a sign that vacuum may need a bit of
> tuning. Sometimes it really requires tuning vacuum itself, but more often
> than not it's due to something else (a large bulk delete, autovacuum
> getting stuck on another table, ...). I don't see how the new stats would
> make this any easier. Can you give some examples on how the new stats might
> be used (and where the current stats are insufficient)? What use cases do
> you imagine for those stats?

pg_stat_vacuum can keep histories of vacuum statistics for each table/index
in shared memory. (They are not only the last vacuum; that can already be
checked using pg_stat_all_tables.) It makes analysis of vacuum histories
easier because this view can sort, aggregate, and filter. My use cases for
those stats are the following:

- examine the TRANSITION of vacuum execution time on any table (you can
  predict the future vacuum execution time);
- examine the EXECUTION INTERVAL of vacuum for each table (if too frequent,
  the vacuum threshold should be tuned up);
- examine the REST of the dead tuples just after vacuum (if dead tuples
  remain, it may be due to idle-in-transaction sessions).

> It might help differentiate the autovacuum activity from the rest of the
> system (e.g. there's a lot of I/O going on - how much of that is coming
> from autovacuum workers?). This would however require a more fine-grained
> reporting, because often the vacuums run for a very long time, especially
> on very large tables (which is exactly the case when this might be handy) -
> I just had a VACUUM that ran for 12 hours. These jobs should report the
> stats incrementally, not just once at the very end, because that makes it
> rather useless IMNSHO.

+1

Certainly, VACUUM often has a long execution time; I just had one too. At
present, we cannot predict when a vacuum will finish, what it is doing now,
or whether it has any problem. Maybe, for DBAs, it might be better to show
vacuum progress in pg_stat_activity (if we'd do that, add a free-style
column like "progress"?). This column might also be usable for other
long-running commands like ANALYZE, CREATE/REINDEX and COPY. To realize this
feature, we certainly need to properly change pgstat_report_activity, use it
more, and add a new track-activity parameter.

Regards,
Anzai Naoya
---
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
Hello,

Please find attached a patch. As discussed, the flags denoting compression
and the presence of a hole in the block image have been added to
XLogRecordImageHeader rather than the block header.

Following are WAL numbers based on the attached test script posted by Michael
earlier in the thread:

                         WAL generated
    FPW compression on   122.032 MB
    FPW compression off  155.223 MB
    HEAD                 155.236 MB

Compression: 21%
Number of block images generated in WAL: 63637

Thank you,
Rahila Syed

Attachments: Support-compression-of-full-page-writes-in-WAL_v23.patch,
compress_run.bash
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
Hello,

> Are there any other flag bits that we should or are planning to add into the
> WAL header newly, except the above two? If yes and they are required even by
> a block which doesn't have an image, I will change my mind and agree to add
> something like a chunk ID to the block header. But I guess the answer to the
> question is no. Since the flag bits we are now thinking of adding are
> required only by a block having an image, adding them into the block header
> (instead of the block image header) seems a waste of bytes in WAL. So I
> concur with Michael.

I agree. As per my understanding, this change of xlog format was to provide
for a future enhancement which would need flags relevant to the entire block.
But as mentioned, the flags currently being added relate to the block image
only. Hence, for this patch it makes sense to add a field to
XLogRecordImageHeader rather than the block header. This will also save bytes
per WAL record.

Thank you,
Rahila Syed
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
Hello,

> It would be good to get those problems fixed first. Could you send an
> updated patch?

Please find attached an updated patch with the WAL replay error fixed. The
patch follows the chunk-ID approach to the xlog format.

Following are brief measurement numbers:

                         WAL
    FPW compression on   122.032 MB
    FPW compression off  155.239 MB
    HEAD                 155.236 MB

Thank you,
Rahila Syed

Attachment: Support-compression-of-full-page-writes_v22.patch
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
Hello, I think we should change the xlog format so that the block_id (which currently is XLR_BLOCK_ID_DATA_SHORT/LONG or a actual block id) isn't the block id but something like XLR_CHUNK_ID. Which is used as is for XLR_CHUNK_ID_DATA_SHORT/LONG, but for backup blocks can be set to to XLR_CHUNK_BKP_WITH_HOLE, XLR_CHUNK_BKP_COMPRESSED, XLR_CHUNK_BKP_REFERENCE... The BKP blocks will then follow, storing the block id following the chunk id. Yes, that'll increase the amount of data for a backup block by 1 byte, but I think that's worth it. I'm pretty sure we will be happy about the added extensibility pretty soon. To clarify my understanding of the above change, Instead of a block id to reference different fragments of an xlog record , a single byte field chunk_id should be used. chunk_id will be same as XLR_BLOCK_ID_DATA_SHORT/LONG for main data fragments. But for block references, it will take store following values in order to store information about the backup blocks. #define XLR_CHUNK_BKP_COMPRESSED 0x01 #define XLR_CHUNK_BKP_WITH_HOLE 0x02 ... The new xlog format should look like follows, Fixed-size header (XLogRecord struct) Chunk_id(add a field before id field in XLogRecordBlockHeader struct) XLogRecordBlockHeader Chunk_id XLogRecordBlockHeader ... ... Chunk_id ( rename id field of the XLogRecordDataHeader struct) XLogRecordDataHeader[Short|Long] block data block data ... main data I will post a patch based on this. Thank you, Rahila Syed -Original Message- From: Andres Freund [mailto:and...@2ndquadrant.com] Sent: Monday, February 16, 2015 5:26 PM To: Syed, Rahila Cc: Michael Paquier; Fujii Masao; PostgreSQL mailing lists Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes On 2015-02-16 11:30:20 +, Syed, Rahila wrote: - * As a trivial form of data compression, the XLOG code is aware that - * PG data pages usually contain an unused hole in the middle, which - * contains only zero bytes. 
If hole_length 0 then we have removed - * such a hole from the stored data (and it's not counted in the - * XLOG record's CRC, either). Hence, the amount of block data actually - * present is BLCKSZ - hole_length bytes. + * Block images are able to do several types of compression: + * - When wal_compression is off, as a trivial form of compression, + the + * XLOG code is aware that PG data pages usually contain an unused hole + * in the middle, which contains only zero bytes. If length BLCKSZ + * then we have removed such a hole from the stored data (and it is + * not counted in the XLOG record's CRC, either). Hence, the amount + * of block data actually present is length bytes. The hole offset + * on page is defined using hole_offset. + * - When wal_compression is on, block images are compressed using a + * compression algorithm without their hole to improve compression + * process of the page. length corresponds in this case to the + length + * of the compressed block. hole_offset is the hole offset of the + page, + * and the length of the uncompressed block is defined by + raw_length, + * whose data is included in the record only when compression is + enabled + * and with_hole is set to true, see below. + * + * is_compressed is used to identify if a given block image is + compressed + * or not. Maximum page size allowed on the system being 32k, the + hole + * offset cannot be more than 15-bit long so the last free bit is + used to + * store the compression state of block image. If the maximum page + size + * allowed is increased to a value higher than that, we should + consider + * increasing this structure size as well, but this would increase + the + * length of block header in WAL records with alignment. + * + * with_hole is used to identify the presence of a hole in a block image. + * As the length of a block cannot be more than 15-bit long, the + extra bit in + * the length field is used for this identification purpose. 
If the
+ * block image has no hole, it is ensured that the raw size of a compressed
+ * block image is equal to BLCKSZ, hence the contents of
+ * XLogRecordBlockImageCompressionInfo are not necessary.
  */
 typedef struct XLogRecordBlockImageHeader
 {
-	uint16		hole_offset;	/* number of bytes before hole */
-	uint16		hole_length;	/* number of bytes in hole */
+	uint16		length:15,		/* length of block data in record */
+				with_hole:1;	/* status of hole in the block */
+
+	uint16		hole_offset:15,	/* number of bytes before hole */
+				is_compressed:1;	/* compression status of image */
+
+	/* Followed by the data related to compression if block is compressed */
 } XLogRecordBlockImageHeader;

Yikes, this is ugly. I think we should change the xlog format so that the block_id
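The two-by-two-byte bitfield layout quoted above can be sketched in standalone C. This is an illustration only: the struct mirrors the field names in the quoted diff, but `pack_roundtrip` is a hypothetical helper added here just to show that each 15-bit field caps out at 32767 (a 32k page), not part of the patch.

```c
#include <stdint.h>

/*
 * Sketch of the block image header discussed above: two uint16 words,
 * each split into a 15-bit value and a 1-bit flag. Field names follow
 * the quoted patch; this is not PostgreSQL's actual definition.
 */
typedef struct XLogRecordBlockImageHeader
{
	uint16_t	length:15,			/* length of block data in record */
				with_hole:1;		/* status of hole in the block */
	uint16_t	hole_offset:15,		/* number of bytes before hole */
				is_compressed:1;	/* compression status of image */
} XLogRecordBlockImageHeader;

/*
 * Hypothetical helper: store values in the bitfields and pack them back
 * into a 32-bit word, demonstrating that nothing above 15 bits survives.
 */
static uint32_t
pack_roundtrip(uint16_t len, int hole, uint16_t off, int comp)
{
	XLogRecordBlockImageHeader h;

	h.length = len & 0x7FFF;		/* 15-bit fields: max 32767 */
	h.with_hole = hole & 1;
	h.hole_offset = off & 0x7FFF;
	h.is_compressed = comp & 1;
	return ((uint32_t) h.length << 17) | ((uint32_t) h.with_hole << 16) |
		   ((uint32_t) h.hole_offset << 1) | h.is_compressed;
}
```

On common compilers the whole header still occupies only 4 bytes, which is the point of stealing the two flag bits instead of adding new fields.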
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
Hello, Thank you for reviewing and testing the patch.

>+ /* leave if data cannot be compressed */
>+ if (compressed_len == 0)
>+     return false;
>This should be < 0; pglz_compress returns -1 when compression fails.

>+ if (pglz_decompress(block_image, bkpb->bkp_len, record->uncompressBuf,
>+                     bkpb->bkp_uncompress_len) == 0)
>Similarly, this should be < 0.

These have been corrected in the attached.

>Regarding the sanity checks that have been added recently. I think that they are useful but I am suspecting as well that only a check on the record CRC is done because that's reliable enough, and not doing those checks accelerates replay a bit. So I am thinking that we should simply replace them by assertions.

Removing the checks makes sense as the CRC ensures correctness. Moreover, as an error message for an invalid record length is already present in the code, messages for an invalid block length would be redundant. The checks have been replaced by assertions in the attached patch.

The following if condition in XLogCompressBackupBlock has been modified as follows.

Previous:

+ /*
+  * We recheck the actual size even if pglz_compress() reports success and
+  * see if at least 2 bytes of length have been saved, as this corresponds
+  * to the additional amount of data stored in the WAL record for a compressed
+  * block via raw_length when the block contains a hole.
+  */
+ *len = (uint16) compressed_len;
+ if (*len >= orig_len - SizeOfXLogRecordBlockImageCompressionInfo)
+     return false;
+ return true;

Current:

+ if ((hole_length != 0) &&
+     (*len >= orig_len - SizeOfXLogRecordBlockImageCompressionInfo))
+     return false;
+ return true;

This is because the extra raw_length information is included only if the compressed block has a hole in it.

>Once we get those small issues fixed, I think that it is worth having a committer look at this patch, presumably Fujii-san.

Agree.
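The accept/reject rule above can be sketched as a tiny standalone function. This is a sketch, not the patch's actual code: the function name is made up here, and the constant mirrors what the thread calls SizeOfXLogRecordBlockImageCompressionInfo (the two raw_length bytes paid only when the block has a hole).

```c
#include <stdbool.h>

/* Stand-in for SizeOfXLogRecordBlockImageCompressionInfo: the two extra
 * raw_length bytes stored only when the block image has a hole. */
#define SIZE_OF_COMPRESSION_INFO 2

/*
 * Mirror of the "Current" condition quoted above: a compressed image is
 * rejected when the block has a hole and compression did not save at
 * least the extra header bytes; without a hole no extra bytes are paid.
 * Hypothetical helper name, for illustration only.
 */
static bool
compressed_image_is_worthwhile(int compressed_len, int orig_len, int hole_length)
{
	if (hole_length != 0 &&
		compressed_len >= orig_len - SIZE_OF_COMPRESSION_INFO)
		return false;
	return true;
}
```

For a BLCKSZ of 8192 with a hole, a compressed length of 8190 is rejected (it ties with the 2-byte overhead) while 8189 is accepted.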
I will mark this patch as ready for committer.

Thank you,
Rahila Syed

__ Disclaimer: This email and any attachments are sent in strictest confidence for the sole use of the addressee and may contain legally privileged, confidential, and proprietary data. If you are not the intended recipient, please advise the sender by replying promptly to this email and then delete and destroy this email and any attachments without any further use, copying or forwarding.

Support-compression-for-full-page-writes-in-WAL_v19.patch Description: Support-compression-for-full-page-writes-in-WAL_v19.patch

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
Thank you for comments. Please find attached the updated patch.

>This patch fails to compile:
>xlogreader.c:1049:46: error: extraneous ')' after condition, expected a statement
>                blk->with_hole && blk->hole_offset == 0))

This has been rectified.

>Note as well that at least clang does not like much how the sanity checks with with_hole are done. You should place parentheses around the '&&' expressions. Also, I would rather define with_hole == 0 or with_hole == 1 explicitly in those checks.

The expressions are modified accordingly.

>There is a typo: s/true,see/true, see/
>[nitpicky]Be as well aware of the 80-character limit per line that is normally respected by comment blocks.[/]

Have corrected the typo and changed the comments as mentioned. Also realigned certain lines to meet the 80-char limit.

Thank you,
Rahila Syed

Support-compression-for-full-page-writes-in-WAL_v18.patch Description: Support-compression-for-full-page-writes-in-WAL_v18.patch
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
Hello, A bug had been introduced in the latest versions of the patch: the order of parameters passed to pglz_decompress was wrong. Please find attached a patch with the following correction.

Original code:

+ if (pglz_decompress(block_image, record->uncompressBuf,
+                     bkpb->bkp_len, bkpb->bkp_uncompress_len) == 0)

Correction:

+ if (pglz_decompress(block_image, bkpb->bkp_len,
+                     record->uncompressBuf, bkpb->bkp_uncompress_len) == 0)

>For example here you should not have a space between the function definition and the variable declarations:
>+{
>+
>+    int orig_len = BLCKSZ - hole_length;

>This is as well incorrect in two places:
>if(hole_length != 0)
>There should be a space between the if and its condition in parentheses.

Also corrected the above code-format mistakes.

Thank you,
Rahila Syed

-Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Syed, Rahila Sent: Monday, February 09, 2015 6:58 PM To: Michael Paquier; Fujii Masao Cc: PostgreSQL mailing lists Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

Hello,

>Do we always need extra two bytes for compressed backup block? ISTM that extra bytes are not necessary when the hole length is zero. In this case the length of the original backup block (i.e., uncompressed) must be BLCKSZ, so we don't need to save the original size in the extra bytes.
>Yes, we would need an additional bit to identify that. We could steal it from length in XLogRecordBlockImageHeader.

This is implemented in the attached patch by dividing the length field as follows:

uint16 length:15, with_hole:1;

>2 should be replaced with the macro variable indicating the size of extra header for compressed backup block.

Macro SizeOfXLogRecordBlockImageCompressionInfo is used instead of 2.

>You can refactor XLogCompressBackupBlock() and move all the above code to it for more simplicity.

This is also implemented in the attached patch.
Thank you,
Rahila Syed

-Original Message- From: Michael Paquier [mailto:michael.paqu...@gmail.com] Sent: Friday, February 06, 2015 6:00 PM To: Fujii Masao Cc: Syed, Rahila; PostgreSQL mailing lists Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

On Fri, Feb 6, 2015 at 3:03 PM, Fujii Masao wrote:

>Do we always need extra two bytes for compressed backup block? ISTM that extra bytes are not necessary when the hole length is zero. In this case the length of the original backup block (i.e., uncompressed) must be BLCKSZ, so we don't need to save the original size in the extra bytes.

Yes, we would need an additional bit to identify that. We could steal it from length in XLogRecordBlockImageHeader.

>Furthermore, when fpw compression is disabled and the hole length is zero, we seem to be able to save one byte from the header of backup block. Currently we use 4 bytes for the header, 2 bytes for the length of backup block, 15 bits for the hole offset and 1 bit for the flag indicating whether block is compressed or not. But in that case, the length of backup block doesn't need to be stored because it must be BLCKSZ. Shouldn't we optimize the header in this way? Thought?

If we do it, that's something to tackle even before this patch on HEAD, because you could use the 16th bit of the first 2 bytes of XLogRecordBlockImageHeader to do necessary sanity checks, to actually not reduce the record by 1 byte, but 2 bytes, as hole-related data is not necessary. I imagine that a patch optimizing that wouldn't be that hard to write as well.
+    int    page_len = BLCKSZ - hole_length;
+    char   *scratch_buf;
+    if (hole_length != 0)
+    {
+        scratch_buf = compression_scratch;
+        memcpy(scratch_buf, page, hole_offset);
+        memcpy(scratch_buf + hole_offset,
+               page + (hole_offset + hole_length),
+               BLCKSZ - (hole_length + hole_offset));
+    }
+    else
+        scratch_buf = page;
+
+    /* Perform compression of block */
+    if (XLogCompressBackupBlock(scratch_buf,
+                                page_len,
+                                regbuf->compressed_page,
+                                &compress_len))
+    {
+        /* compression is done, add record */
+        is_compressed = true;
+    }

You can refactor XLogCompressBackupBlock() and move all the above code to it for more simplicity.

Sure.
--
Michael
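The hole-elision step quoted above can be sketched as self-contained, runnable C. This is an illustration under stated assumptions: the function name, the tiny 64-byte stand-in for BLCKSZ, and the self-test are all made up here; only the memcpy-around-the-hole shape comes from the quoted fragment.

```c
#include <string.h>

#define MY_BLCKSZ 64	/* tiny stand-in for PostgreSQL's BLCKSZ */

/*
 * Copy a page into scratch with its hole elided, as in the quoted
 * XLogRecordAssemble fragment; returns the number of meaningful bytes
 * that would then be handed to the compressor.
 */
static int
remove_page_hole(const char *page, char *scratch,
				 int hole_offset, int hole_length)
{
	if (hole_length != 0)
	{
		/* data before the hole, then data after the hole */
		memcpy(scratch, page, hole_offset);
		memcpy(scratch + hole_offset,
			   page + hole_offset + hole_length,
			   MY_BLCKSZ - (hole_offset + hole_length));
	}
	else
		memcpy(scratch, page, MY_BLCKSZ);
	return MY_BLCKSZ - hole_length;
}

/* Self-check: 16 'A's, a 32-byte zero hole, 16 'B's; the elided page
 * must be 32 bytes of 'A's immediately followed by 'B's. */
static int
remove_page_hole_selftest(void)
{
	char page[MY_BLCKSZ], scratch[MY_BLCKSZ];
	int  len;

	memset(page, 0, sizeof(page));
	memset(page, 'A', 16);
	memset(page + 48, 'B', 16);
	len = remove_page_hole(page, scratch, 16, 32);
	return len == 32 && scratch[0] == 'A' && scratch[15] == 'A' &&
		   scratch[16] == 'B' && scratch[31] == 'B';
}
```

The hole never reaches the compressor, which is why raw_length must travel in the record header for decompression to restore the full page.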
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
Hello,

>Do we always need extra two bytes for compressed backup block? ISTM that extra bytes are not necessary when the hole length is zero. In this case the length of the original backup block (i.e., uncompressed) must be BLCKSZ, so we don't need to save the original size in the extra bytes.
>Yes, we would need an additional bit to identify that. We could steal it from length in XLogRecordBlockImageHeader.

This is implemented in the attached patch by dividing the length field as follows:

uint16 length:15, with_hole:1;

>2 should be replaced with the macro variable indicating the size of extra header for compressed backup block.

Macro SizeOfXLogRecordBlockImageCompressionInfo is used instead of 2.

>You can refactor XLogCompressBackupBlock() and move all the above code to it for more simplicity.

This is also implemented in the attached patch.

Thank you,
Rahila Syed

-Original Message- From: Michael Paquier [mailto:michael.paqu...@gmail.com] Sent: Friday, February 06, 2015 6:00 PM To: Fujii Masao Cc: Syed, Rahila; PostgreSQL mailing lists Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

On Fri, Feb 6, 2015 at 3:03 PM, Fujii Masao wrote:

>Do we always need extra two bytes for compressed backup block? ISTM that extra bytes are not necessary when the hole length is zero. In this case the length of the original backup block (i.e., uncompressed) must be BLCKSZ, so we don't need to save the original size in the extra bytes.

Yes, we would need an additional bit to identify that. We could steal it from length in XLogRecordBlockImageHeader.

>Furthermore, when fpw compression is disabled and the hole length is zero, we seem to be able to save one byte from the header of backup block. Currently we use 4 bytes for the header, 2 bytes for the length of backup block, 15 bits for the hole offset and 1 bit for the flag indicating whether block is compressed or not. But in that case, the length of backup block doesn't need to be stored because it must be BLCKSZ.
>Shouldn't we optimize the header in this way? Thought?

If we do it, that's something to tackle even before this patch on HEAD, because you could use the 16th bit of the first 2 bytes of XLogRecordBlockImageHeader to do necessary sanity checks, to actually not reduce the record by 1 byte, but 2 bytes, as hole-related data is not necessary. I imagine that a patch optimizing that wouldn't be that hard to write as well.

+    int    page_len = BLCKSZ - hole_length;
+    char   *scratch_buf;
+    if (hole_length != 0)
+    {
+        scratch_buf = compression_scratch;
+        memcpy(scratch_buf, page, hole_offset);
+        memcpy(scratch_buf + hole_offset,
+               page + (hole_offset + hole_length),
+               BLCKSZ - (hole_length + hole_offset));
+    }
+    else
+        scratch_buf = page;
+
+    /* Perform compression of block */
+    if (XLogCompressBackupBlock(scratch_buf,
+                                page_len,
+                                regbuf->compressed_page,
+                                &compress_len))
+    {
+        /* compression is done, add record */
+        is_compressed = true;
+    }

You can refactor XLogCompressBackupBlock() and move all the above code to it for more simplicity.

Sure.
--
Michael

Support-compression-for-full-page-writes-in-WAL_v17.patch Description: Support-compression-for-full-page-writes-in-WAL_v17.patch
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
>In any case, those things have been introduced by what I did in previous versions... And attached is a new patch.

Thank you for the feedback.

/* allocate scratch buffer used for compression of block images */
+ if (compression_scratch == NULL)
+     compression_scratch = MemoryContextAllocZero(xloginsert_cxt,
+                                                  BLCKSZ);
}

The compression patch can use the latest interface MemoryContextAllocExtended to proceed without compression when sufficient memory is not available for the scratch buffer. The attached patch introduces an OutOfMem flag which is set when MemoryContextAllocExtended returns NULL.

Thank you,
Rahila Syed

-Original Message- From: Michael Paquier [mailto:michael.paqu...@gmail.com] Sent: Friday, February 06, 2015 12:46 AM To: Syed, Rahila Cc: PostgreSQL mailing lists Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

On Thu, Feb 5, 2015 at 11:06 PM, Syed, Rahila rahila.s...@nttdata.com wrote:

>/*
>+ * We recheck the actual size even if pglz_compress() report success,
>+ * because it might be satisfied with having saved as little as one byte
>+ * in the compressed data.
>+ */
>+ *len = (uint16) compressed_len;
>+ if (*len >= orig_len - 1)
>+     return false;
>+ return true;
>+}
>As per the latest code, when compression is 'on' we introduce additional 2 bytes in the header of each block image for storing the raw_length of the compressed block. In order to achieve compression while accounting for these two additional bytes, we must ensure that the compressed length is less than original length - 2. So, IIUC the above condition should rather be
>if (*len >= orig_len - 2)
>    return false;
>return true;
>The attached patch contains this. It also has a cosmetic change - renaming compressBuf to uncompressBuf as it is used to store the uncompressed page.

Agreed on both things. Just looking at your latest patch after some time to let it cool down, I noticed a couple of things.
#define MaxSizeOfXLogRecordBlockHeader \
    (SizeOfXLogRecordBlockHeader + \
-    SizeOfXLogRecordBlockImageHeader + \
+    SizeOfXLogRecordBlockImageHeader, \
+    SizeOfXLogRecordBlockImageCompressionInfo + \

There is a comma here instead of a sum sign. We should really sum up all those sizes to evaluate the maximum size of a block header.

+ * Permanently allocate readBuf uncompressBuf. We do it this way,
+ * rather than just making a static array, for two reasons:

This comment is just weird, "readBuf AND uncompressBuf" is more appropriate.

+ * We recheck the actual size even if pglz_compress() report success,
+ * because it might be satisfied with having saved as little as one byte
+ * in the compressed data. We add two bytes to store raw_length with the
+ * compressed image. So for compression to be effective compressed_len should
+ * be atleast orig_len - 2

This comment block should be reworked, and misses a dot at its end. I rewrote it like that, hopefully that's clearer:

+ /*
+  * We recheck the actual size even if pglz_compress() reports success and see
+  * if at least 2 bytes of length have been saved, as this corresponds to the
+  * additional amount of data stored in WAL record for a compressed block
+  * via raw_length.
+  */

In any case, those things have been introduced by what I did in previous versions... And attached is a new patch.
--
Michael
Support-compression-for-full-page-writes-in-WAL_v17.patch Description: Support-compression-for-full-page-writes-in-WAL_v17.patch
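The no-throw allocation fallback discussed above (allocate the scratch buffer once; permanently fall back to uncompressed full-page writes if the allocation fails) can be sketched as follows. Plain `malloc` stands in for `MemoryContextAllocExtended(..., MCXT_ALLOC_NO_OOM)`, and the function and flag names are illustrative, not the patch's.

```c
#include <stdlib.h>

#define SCRATCH_SIZE 8192	/* stand-in for BLCKSZ */

static char *compression_scratch = NULL;
static int   out_of_mem = 0;	/* mirrors the thread's OutOfMem flag */

/*
 * Lazily allocate the compression scratch buffer; on failure, record
 * that compression is unavailable instead of raising an error, so FPWs
 * are simply written uncompressed from then on.
 */
static int
compression_available(void)
{
	if (compression_scratch == NULL && !out_of_mem)
	{
		compression_scratch = malloc(SCRATCH_SIZE);
		if (compression_scratch == NULL)
			out_of_mem = 1;		/* fall back to uncompressed FPWs */
	}
	return compression_scratch != NULL;
}
```

The design choice here is that an out-of-memory condition degrades the feature (larger WAL) rather than failing the WAL insertion itself.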
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
Hello,

/*
+ * We recheck the actual size even if pglz_compress() report success,
+ * because it might be satisfied with having saved as little as one byte
+ * in the compressed data.
+ */
+ *len = (uint16) compressed_len;
+ if (*len >= orig_len - 1)
+     return false;
+ return true;
+}

As per the latest code, when compression is 'on' we introduce additional 2 bytes in the header of each block image for storing the raw_length of the compressed block. In order to achieve compression while accounting for these two additional bytes, we must ensure that the compressed length is less than original length - 2. So, IIUC, the above condition should rather be:

if (*len >= orig_len - 2)
    return false;
return true;

The attached patch contains this. It also has a cosmetic change: renaming compressBuf to uncompressBuf as it is used to store the uncompressed page.

Thank you,
Rahila Syed

-Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Michael Paquier Sent: Wednesday, January 07, 2015 9:32 AM To: Rahila Syed Cc: PostgreSQL mailing lists Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

On Wed, Jan 7, 2015 at 12:51 AM, Rahila Syed rahilasyed...@gmail.com wrote:

>Following are some comments,

Thanks for the feedback.

>uint16 hole_offset:15, /* number of bytes in hole */
>Typo in description of hole_offset

Fixed. That's "before hole".

> for (block_id = 0; block_id <= record->max_block_id; block_id++)
>-{
>-    if (XLogRecHasBlockImage(record, block_id))
>-        fpi_len += BLCKSZ - record->blocks[block_id].hole_length;
>-}
>+    fpi_len += record->blocks[block_id].bkp_len;
>IIUC, the if condition if (XLogRecHasBlockImage(record, block_id)) is incorrectly removed from the above for loop.

Fixed.

>typedef struct XLogRecordCompressedBlockImageHeader
>I am trying to understand the purpose behind declaration of the above struct.
>IIUC, it is defined in order to introduce the new field uint16 raw_length, and it has been declared as a separate struct from XLogRecordBlockImageHeader so as not to affect the size of the WAL record when compression is off. I wonder if it is ok to simply memcpy the uint16 raw_length into hdr_scratch when compression is on, and not have a separate header struct for it nor declare it in the existing header. raw_length can be a locally defined variable in XLogRecordAssemble, or it can be a field in the registered_buffer struct like compressed_page. I think this can simplify the code. Am I missing something obvious?

You are missing nothing. I just introduced this structure for a matter of readability, to show the two-byte difference between non-compressed and compressed header information. It is true that doing it my way makes the structures duplicated, so let's simply add the compression-related information as an extra structure added after XLogRecordBlockImageHeader if the block is compressed. I hope this addresses your concerns.

>/*
> * Fill in the remaining fields in the XLogRecordBlockImageHeader
> * struct and add new entries in the record chain.
> */
>bkpb.fork_flags |= BKPBLOCK_HAS_IMAGE;
>This code line seems to be misplaced with respect to the above comment. The comment indicates filling of XLogRecordBlockImageHeader fields while fork_flags is a field of XLogRecordBlockHeader. Is it better to place the code close to the following condition?
>if (needs_backup)
>{

Yes, this comment should not be here. I replaced it with the comment in HEAD.

>+ * the original length of the block without its page hole being deducible
>+ * from the compressed data itself.
>IIUC, this comment before XLogRecordBlockImageHeader seems to be no longer valid, as the original length is not deducible from the compressed data and is rather stored in the header.

Aah, true. This was originally present in the header of PGLZ that has been removed to make it available for frontends. Updated patches are attached.
Regards,
--
Michael

Support-compression-for-full-page-writes-in-WAL_v15.patch Description: Support-compression-for-full-page-writes-in-WAL_v15.patch
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
Hello, I would like to contribute a few points.

XLogInsertRecord(XLogRecData *rdata, XLogRecPtr fpw_lsn)
    RedoRecPtr = Insert->RedoRecPtr;
}
doPageWrites = (Insert->fullPageWrites || Insert->forcePageWrites);
doPageCompression = (Insert->fullPageWrites == FULL_PAGE_WRITES_COMPRESS);

Don't we need to initialize doPageCompression similar to doPageWrites in InitXLOGAccess?

Also, in the earlier patches compression was set 'on' even when the fpw GUC was 'off'. This was to facilitate compression of FPWs which are forcibly written even when the fpw GUC is turned off. doPageCompression in this patch is set to true only if the value of the fpw GUC is 'compress'. I think it is better to compress forcibly written full page writes.

Regards,
Rahila Syed

-Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Michael Paquier Sent: Wednesday, November 26, 2014 1:55 PM To: Alvaro Herrera Cc: Andres Freund; Robert Haas; Fujii Masao; Rahila Syed; Rahila Syed; PostgreSQL-development Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

So, here are reworked patches for the whole set, with the following changes:
- Found why replay was failing: xlogreader.c took into account BLCKSZ - hole while it should have taken into account the compressed data length when fetching a compressed block image.
- Reworked the pglz portion to have it return status errors instead of simple booleans. pglz stuff is as well moved to src/common as Alvaro suggested.

I am planning to run some tests to check how much compression can reduce WAL size with this new set of patches. I have been however able to check that those patches pass installcheck-world with a standby replaying the changes behind. Feel free to play with those patches...

Regards,
--
Michael

__ Disclaimer: This email and any attachments are sent in strictest confidence for the sole use of the addressee and may contain legally privileged, confidential, and proprietary data.
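The tri-state full_page_writes setting discussed above and the flags derived from it can be sketched as follows. The enum name and values mirror the thread; the function is a hypothetical stand-in for the assignments in XLogInsertRecord, kept here only to show that the patch enables compression solely for the 'compress' setting, which is the behavior being questioned.

```c
/* Sketch of the tri-state full_page_writes GUC; values illustrative. */
typedef enum
{
	FULL_PAGE_WRITES_OFF = 0,
	FULL_PAGE_WRITES_COMPRESS,
	FULL_PAGE_WRITES_ON
} FullPageWritesLevel;

/*
 * Mirrors the patch's doPageCompression assignment: compression applies
 * only when the GUC is 'compress', even for forcibly-written FPWs
 * (forcePageWrites), which is the point raised in the message above.
 */
static int
do_page_compression(FullPageWritesLevel fpw, int force_page_writes)
{
	(void) force_page_writes;	/* not consulted in the patch's logic */
	return fpw == FULL_PAGE_WRITES_COMPRESS;
}
```

Compressing forced FPWs as well would amount to dropping the dependency on the GUC value here and keying only on whether an image is written at all.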
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
Hello Fujii-san, Thank you for your comments.

>The patch isn't applied to the master cleanly. The compilation of the document failed with the following error message.
>openjade:config.sgml:2188:12:E: end tag for element TERM which is not open
>make[3]: *** [HTML.index] Error 1
>xlog.c:930: warning: ISO C90 forbids mixed declarations and code
>xlogreader.c:744: warning: ISO C90 forbids mixed declarations and code
>xlogreader.c:744: warning: ISO C90 forbids mixed declarations and code

Please find attached a patch with these rectified.

>Only backend calls CompressBackupBlocksPagesAlloc when SIGHUP is sent. Why does only backend need to do that? What about other processes which can write FPW, e.g., autovacuum?

I had overlooked this. I will correct it.

>Do we release the buffers for compressed data when fpw is changed from compress to on?

The current code does not do this.

>The memory is always (i.e., even when fpw=on) allocated to uncompressedPages, but not to compressedPages. Why? I guess that the test of fpw needs to be there.

uncompressedPages is also used to store the decompression output at the time of recovery. Hence, memory for uncompressedPages needs to be allocated even if fpw=on, which is not the case for compressedPages.

Thank you,
Rahila Syed

-Original Message- From: Fujii Masao [mailto:masao.fu...@gmail.com] Sent: Monday, October 27, 2014 6:50 PM To: Rahila Syed Cc: Andres Freund; Syed, Rahila; Alvaro Herrera; Rahila Syed; PostgreSQL-development; Tom Lane Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

On Fri, Oct 17, 2014 at 1:52 PM, Rahila Syed rahilasye...@gmail.com wrote:
>Hello, Please find the updated patch attached.

Thanks for updating the patch! Here are the comments.

The patch isn't applied to the master cleanly. I got the following compiler warnings.
xlog.c:930: warning: ISO C90 forbids mixed declarations and code
xlogreader.c:744: warning: ISO C90 forbids mixed declarations and code
xlogreader.c:744: warning: ISO C90 forbids mixed declarations and code

The compilation of the document failed with the following error message.

openjade:config.sgml:2188:12:E: end tag for element TERM which is not open
make[3]: *** [HTML.index] Error 1

Only backend calls CompressBackupBlocksPagesAlloc when SIGHUP is sent. Why does only backend need to do that? What about other processes which can write FPW, e.g., autovacuum?

Do we release the buffers for compressed data when fpw is changed from compress to on?

+    if (uncompressedPages == NULL)
+    {
+        uncompressedPages = (char *) malloc(XLR_TOTAL_BLCKSZ);
+        if (uncompressedPages == NULL)
+            outOfMem = 1;
+    }

The memory is always (i.e., even when fpw=on) allocated to uncompressedPages, but not to compressedPages. Why? I guess that the test of fpw needs to be there.

Regards,
--
Fujii Masao

compress_fpw_v2.patch Description: compress_fpw_v2.patch
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
Hello,

>Please find attached the patch to compress FPW.

Sorry, I had forgotten to attach. Please find the patch attached.

Thank you,
Rahila Syed

From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Rahila Syed Sent: Monday, September 22, 2014 3:16 PM To: Alvaro Herrera Cc: Rahila Syed; PostgreSQL-development; Tom Lane Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

Hello All,

>Well, from Rahila's point of view the patch is actually attached, but she's posting from the Nabble interface, which mangles it and turns into a link instead.

Yes.

>but the end result is the same: to properly submit a patch, you need to send an email to the mailing list, not join a group/forum from some intermediary newsgroup site that mirrors the list.

Thank you. I will take care of it henceforth.

Please find attached the patch to compress FPW. The patch submitted by Fujii-san earlier in the thread is used to merge the compression GUC with full_page_writes. I am reposting the measurement numbers.

Server Specification:
Processors: Intel® Xeon® Processor E5-2650 (2 GHz, 8C/16T, 20 MB) * 2 nos
RAM: 32GB
Disk: HDWD 450GB 10K Hot Plug 2.5-inch SAS HDD * 8 nos
      1 x 450 GB SAS HDD, 2.5-inch, 6Gb/s, 10,000 rpm
Checkpoint segments: 1024
Checkpoint timeout: 5 mins

pgbench -c 64 -j 64 -r -T 900 -M prepared
Scale factor: 1000

                 WAL generated (MB)   Throughput (tps)   Latency (ms)
On               9235.43              979.03             65.36
Compress (pglz)  6518.68              1072.34            59.66
Off              501.04               1135.17            56.34

The results show around a 30 percent decrease in WAL volume due to compression of FPWs.

Thank you,
Rahila Syed

Tom Lane wrote:
>Rahila Syed <rahilasyed...@gmail.com> writes:
>>Please find attached patch to compress FPW using pglz compression.
>Patch not actually attached AFAICS (no, a link is not good enough).
Well, from Rahila's point of view the patch is actually attached, but she's posting from the Nabble interface, which mangles it and turns it into a link instead. Not her fault, really -- but the end result is the same: to properly submit a patch, you need to send an email to the pgsql-hackers@postgresql.org mailing list, not join a group/forum from some intermediary newsgroup site that mirrors the list.
--
Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

compress_fpw_v1.patch Description: compress_fpw_v1.patch