Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-10-29 Thread Syed, Rahila
Hello,

Please find attached an updated patch.

>Flag isn't reset on error.
Corrected in the attached.

> + pgstat_reset_activityflag;
>Does this actually compile?
It does compile, but has no effect. It has been corrected.

>snprintf()?  I don't think you need to keep track of schemaname_len at all.
memcpy() has been replaced by snprintf() to avoid calculating schemaname_len.
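
A minimal sketch of the change, with hypothetical buffer and variable names
(the patch's actual identifiers may differ):

    char dest[NAMEDATALEN];

    /* Before: the copied length had to be tracked explicitly */
    int schemaname_len = strlen(schemaname);
    memcpy(dest, schemaname, schemaname_len + 1);

    /* After: snprintf bounds the copy and NUL-terminates by itself */
    snprintf(dest, NAMEDATALEN, "%s", schemaname);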

>In fact, I wonder if you need to send total_pages at all -- surely the client 
>can add both total_heap_pages and total_index_pages by itself ...
This has been corrected in the attached patch.

>It seems a bit strange that the remaining progress_param entries are not 
>initialized to anything.  Also, why aren't the number of params of each type 
>saved too?  
The number of params for each command remains constant, hence it has been
hardcoded.

>In the receiving code you check whether each value equals 0, and if it does 
>then report NULL, but imagine vacuuming a table with no indexes where the 
>number of index pages is going to be zero.
>Shouldn't we display zero there rather than null?
Agreed. IIUC, NULL should rather be used when a value is invalid, but for valid
values like 'zero index pages' it is clearer to display 0. It has been
corrected in the attached.

>This patch lacks a comment somewhere explaining how this whole thing works.
Have added a few lines in pgstat.h inside the PgBackendStatus struct.

>I believe you don't need this include.
Corrected.

>This not only adds an unnecessary empty line at the end of the struct 
>declaration, but also fails to preserve the "st_" prefix used in all the other 
>fields
Corrected.

Thank you,
Rahila Syed



Vacuum_progress_checker_v6.patch
Description: Vacuum_progress_checker_v6.patch



Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-10-20 Thread Syed, Rahila
Hello,

>I think that you should add the flag or something which indicates whether this 
>backend is running VACUUM or not, into PgBackendStatus.
>pg_stat_vacuum_progress should display the entries of only backends with that 
>flag set true. This design means that you need to set the flag to true when 
>starting VACUUM and reset at the end of VACUUM progressing.
Please find attached an updated patch which adds a flag in PgBackendStatus
indicating whether this backend is running VACUUM.
Also, the pgstat_report_progress function has been changed to make it generic
for all commands reporting progress.
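
A rough sketch of the design, with hypothetical names (the actual fields and
signature in the patch may differ):

    #define N_PROGRESS_PARAM 10

    typedef struct PgBackendStatus
    {
        /* ... existing fields ... */
        bool    st_activity_flag;   /* set while this backend runs VACUUM */
        uint32  st_progress_param[N_PROGRESS_PARAM];
    } PgBackendStatus;

    /* Generic entry point: any command stores its counters in its slot;
     * the progress view shows only backends whose flag is set. */
    void pgstat_report_progress(uint32 *param, int num_param);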

Thank you,
Rahila Syed





Vacuum_progress_checker_v5.patch
Description: Vacuum_progress_checker_v5.patch



Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-10-06 Thread Syed, Rahila
Hello Fujii-san,

>Here are another review comments
Thank you for the review. Please find attached an updated patch.

> You removed some empty lines, for example, in vacuum.h.
>Which seems useless to me.
Has been corrected in the attached.

>Why did you use an array to store the progress information of VACUUM?
>I think that it's better to use separate specific variables for them for 
>better code readability, for example, variables scanned_pages, 
>heap_total_pages, etc.
It is designed this way to keep it generic for all the commands which can store 
different progress parameters in shared memory.

>Currently only progress_param_float[0] is used. So there is no need to use an 
>array here.
Same as before. This is for later use by other commands.
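
A sketch of the generic layout being described, combining the declarations
quoted in this thread (exact types and dimensions are as per the patch and may
differ):

    #define N_PROGRESS_PARAM 10

    /* per-backend progress slots, interpreted by each command in its own way */
    uint32  progress_param[N_PROGRESS_PARAM];        /* e.g. page counts */
    float8  progress_param_float[N_PROGRESS_PARAM];  /* e.g. percentage  */
    char    progress_message[PROGRESS_MESSAGE_LENGTH][N_PROGRESS_PARAM];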

>progress_param_float[0] saves the percentage of VACUUM progress.
>But why do we need to save that information into shared memory?
>We can just calculate the percentage whenever pg_stat_get_vacuum_progress() is 
>executed, instead. There seems to be no need to save that information.
This has been corrected in the attached.

>char progress_message[PROGRESS_MESSAGE_LENGTH][N_PROGRESS_PARAM];
>As Sawada pointed out, there is no user of this variable.
Have used it to store the table name in the updated patch. It can also be used
to display index names or the current phase of VACUUM.
This has not been included in the patch yet, to avoid cluttering the display
with too much information.

>For example, st_activity of autovacuum worker displays "autovacuum ...".
>So as Sawada reported, he could not find any entry for autovacuum in 
>pg_stat_vacuum_progress.
In the attached patch, I have performed the check for autovacuum also.

>I think that you should add the flag or something which indicates whether this 
>backend is running VACUUM or not, into PgBackendStatus.
>pg_stat_vacuum_progress should display the entries of only backends with that 
>flag set true. This design means that you need to set the flag to true when 
>starting VACUUM and reset at the end of VACUUM progressing.
This design seems better in the sense that we don't rely on the st_activity
entry to display progress values.
A variable which stores flags for running commands can be created in
PgBackendStatus. These flags can be used at the time of displaying the progress
of a particular command.
 
>That is, non-superuser should not be allowed to see the 
>pg_stat_vacuum_progress entry of superuser running VACUUM?
This has been included in the updated patch.

>This code may cause total_pages and total_heap_pages to decrease while VACUUM 
>is running.
Yes. This is because the initial count of total pages to be vacuumed and the
pages which are actually vacuumed can vary depending on the visibility of
tuples.
Pages which are all-visible are skipped and hence have been subtracted from the
total page count.


Thank you,
Rahila Syed 



Vacuum_progress_checker_v4.patch
Description: Vacuum_progress_checker_v4.patch



Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-10-06 Thread Syed, Rahila
Hello,

Please check the attached patch, as the earlier one had a typo in the
regression test output.

>+#define PG_STAT_GET_PROGRESS_COLS 30
>Why did you use 30?
That comes from N_PROGRESS_PARAM * 3, where N_PROGRESS_PARAM = 10 is the
number of progress parameters of each type stored in shared memory.
There are three such types (int, float, string), hence the total number of
progress parameters can be 30.
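
In other words, with the constants as given:

    #define N_PROGRESS_PARAM          10
    /* one column per int, float and string parameter */
    #define PG_STAT_GET_PROGRESS_COLS (N_PROGRESS_PARAM * 3)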

Thank you,
Rahila Syed




Vacuum_progress_checker_v4.patch
Description: Vacuum_progress_checker_v4.patch



Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-09-22 Thread Syed, Rahila
Hello,

Please find attached a patch with the bugs reported by Thom and Sawada-san
fixed.

>* The progress of vacuum by autovacuum seems not to be displayed.
The progress is stored in shared variables during autovacuum. I guess the
reason they are not visible is that the entries are deleted as soon as the
process exits.
But the progress can be viewed while the autovacuum worker is running.

Thank you,
Rahila Syed



Vacuum_progress_checker_v3.patch
Description: Vacuum_progress_checker_v3.patch



Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-09-15 Thread Syed, Rahila

Hello Thom,

>Okay, I've just tested this with a newly-loaded table (1,252,973 of jsonb 
>data),
Thanks a lot!

>but after it's finished, I end up with this:
>json=# select * from pg_stat_vacuum_progress;
>-[ RECORD 1 ]-------+-------
>pid                 | 5569
>total_pages         | 217941
>scanned_pages       | 175243
>total_heap_pages    | 175243
>scanned_heap_pages  | 175243
>total_index_pages   | 42698
>scanned_index_pages |
>percent_complete    | 80
>This was running with a VACUUM ANALYZE.  This output seems to suggest that it 
>didn't complete.

OK. The patch fails here because 'total pages to be scanned' takes index pages
into account, and no index pages were actually scanned.
So the scanned pages count does not reach the total pages count. I will fix
this. It seems that no index pages were scanned in this run because there were
no dead tuples to be cleaned, as the table was newly loaded.

>After, I ran VACUUM FULL.  pg_stat_vacuum_progress didn't change from before, 
>so that doesn't appear to show up in the view.
The scope of this patch is to report the progress of basic VACUUM. It does not
take VACUUM FULL into account yet. I think this can be included after basic
VACUUM progress is done.

>I then deleted 40,000 rows from my table, and ran VACUUM ANALYZE again.  This 
>time it progressed and percent_complete reached 100
OK.

Thank you,
Rahila Syed.





Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-09-11 Thread Syed, Rahila

Hello,

Please find attached an updated VACUUM progress checker patch.
The following has been accomplished in the patch:

1. Accounts for the index pages count while calculating the total progress of
VACUUM.
2. A common location for storing progress parameters for any command. The idea
is that every command which needs to report progress can populate and interpret
the shared variables in its own way.
Each can display progress by implementing separate views.
3. A separate VACUUM progress view to display various progress parameters has
been implemented. Progress of various phases like heap scan and index scan, and
total pages scanned along with the completion percentage, is reported.
4. This view can display progress for all active backends running VACUUM.

Basic testing has been performed. Thorough testing is yet to be done. Marking
it as Needs Review in the September commitfest.

ToDo:
Display the count of heap pages actually vacuumed (marking line pointers
unused).
Display the percentage of work_mem being used to store dead tuples.

Thank you,
Rahila Syed



Vacuum_progress_checker_v2.patch
Description: Vacuum_progress_checker_v2.patch



Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-08-13 Thread Syed, Rahila
Hello,

> Autovacuum knows what % of a table needs to be cleaned - that is how it is
> triggered.
> When a vacuum runs we should calculate how many TIDs we will collect and
> therefore how many trips to the indexes we need for given memory.
> We can use the VM to find out how many blocks we'll need to scan in the table.
> So overall, we know how many blocks we need to scan.

Total heap pages to be scanned can be obtained from the VM as mentioned. To
figure out the number of index scans, we need an estimate of dead tuples.

IIUC, autovacuum acquires the information that a table has to be cleaned by
looking at the pgstat entry for the table, i.e. n_dead_tuples.
Hence, an initial estimate of dead-tuple TIDs can be made using n_dead_tuples
in pgstat.
n_dead_tuples in the pgstat table entry is the value updated by the last
analyze and may not be up to date.
In cases where the pgstat entry for the table is NULL, the number of dead-tuple
TIDs cannot be estimated.
In such cases, we can start with an initial estimate of 1 index scan, and later
add the number of index pages to the total count of pages (heap + index) if the
count of index scans exceeds the estimate; a rough sketch follows.
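
A sketch of the estimation logic described above, with hypothetical names;
tabentry is the table's pgstat entry (NULL when absent) and maxtuples is the
number of dead-tuple TIDs that fit in maintenance_work_mem:

    static int
    estimate_index_scans(PgStat_StatTabEntry *tabentry, int64 maxtuples)
    {
        if (tabentry == NULL)
            return 1;   /* TIDs cannot be estimated: assume one index scan */
        /* one index pass per maintenance_work_mem's worth of dead TIDs */
        return Max(1, (int) ((tabentry->n_dead_tuples + maxtuples - 1) /
                             maxtuples));
    }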

Thank you,
Rahila Syed.

 




Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-08-10 Thread Syed, Rahila
Hello,

> When we're in Phase2 or 3, don't we need to report the number of total pages
> scanned or the percentage of how many table pages were scanned, as well?
The total heap pages scanned need to be reported with phase 2 or 3. A complete
progress report needs to have numbers from each phase when applicable.

> Phase 1. Report 2 integer counters: heap pages scanned and total heap pages,
> 1 float counter: percentage_complete, and progress message.
> Phase 2. Report 2 integer counters: index pages scanned and total
> index pages (across all indexes), and progress message.
> Phase 3. 1 integer counter: heap pages vacuumed.

Sorry for being unclear here. What I meant to say is that each phase of a
command will correspond to a slot in COMMAND_NUM_SLOTS. Each phase will be a
separate array element and will comprise n integers, n floats, and a string, as
sketched below. So, in the view reporting progress, the VACUUM command can have
3 entries, one for each phase.
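
A sketch of the per-phase slot layout being described (names are hypothetical):

    typedef struct ProgressSlot
    {
        uint32  param_int[N_PROGRESS_PARAM];      /* n integers */
        float8  param_float[N_PROGRESS_PARAM];    /* n floats   */
        char    message[PROGRESS_MESSAGE_LENGTH]; /* one string */
    } ProgressSlot;

    /* one array element per phase; VACUUM would fill three of these */
    ProgressSlot progress_slots[COMMAND_NUM_SLOTS];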

Thank you,
Rahila Syed






Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-07-16 Thread Syed, Rahila
Hello,

> Naming the GUC pgstat* seems a little inconsistent.
Sorry, there is a typo in the mail. The GUC name is 'track_activity_progress'.

> Also, adding the new GUC to src/backend/utils/misc/postgresql.conf.sample
> might be helpful.
Yes. I will update.
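
Something along these lines, where the commented-out default and the
description are assumptions:

    #track_activity_progress = off    # collect progress information of
                                      # long-running commands such as VACUUM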

Thank you,
Rahila Syed




Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-07-03 Thread Syed, Rahila
Hello,

> TBH, I think that designing this as a hook-based solution is adding a whole
> lot of complexity for no value.  The hard parts of the problem are collecting
> the raw data and making the results visible to users, and both of those
> require involvement of the core code.  Where is the benefit from pushing some
> trivial intermediate arithmetic into an external module?
> If there's any at all, it's certainly not enough to justify problems such as
> you mention here.
>
> So I'd just create a pgstat_report_percent_done() type of interface in
> pgstat.c and then teach VACUUM to call it directly.

Thank you for the suggestion. I agree that adding the code to core will reduce
code complexity with no additional overhead.
Going by the consensus, I will update the patch with code to collect and store
progress information from VACUUM in pgstat.c, and a UI using the
pg_stat_activity view.
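
A sketch of what that interface might look like; the function name comes from
the quoted suggestion, while the body and the st_percent_done field are
assumptions:

    void
    pgstat_report_percent_done(int percent)
    {
        volatile PgBackendStatus *beentry = MyBEEntry;

        if (!pgstat_track_activities || !beentry)
            return;

        /* updates follow the st_changecount protocol for readers */
        pgstat_increment_changecount_before(beentry);
        beentry->st_percent_done = percent;     /* hypothetical field */
        pgstat_increment_changecount_after(beentry);
    }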

Thank you,
Rahila Syed





Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-07-02 Thread Syed, Rahila
Hello,

> Unless I am missing something, I guess you can still keep the actual code that
> updates counters outside the core if you adopt an approach that Simon suggests.
Yes. The code to extract progress information from VACUUM and store it in
shared memory can be outside core even with pg_stat_activity as the user
interface.

> Whatever the view (existing/new), any related counters would have a valid
> (non-NULL) value when read off the view iff hooks are set, perhaps because you
> have an extension that sets them.
> I guess that means any operation that supports progress tracking would have
> an extension with suitable hooks implemented.
Do you mean to say that any operation/application that wants the progress
tracking feature will dynamically load the progress-checker module, which will
set the hooks for progress reporting?
If yes, unless I am missing something, such dynamic loading cannot happen if we
use pg_stat_activity, as it gets its values from shared memory. The module has
to be a shared_preload_library to allocate shared memory, so it can be loaded
only at server restart. Am I missing something?
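
That is, the module would have to be listed at server start, e.g. (the module
name here is hypothetical):

    shared_preload_libraries = 'progress_checker'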

Thank you,
Rahila Syed







Re: [HACKERS] [Proposal] More Vacuum Statistics

2015-06-16 Thread Syed, Rahila
Hello,

> Maybe, for DBAs,
> it might be better to show vacuum progress in pg_stat_activity.
> (If we'd do that, add a free-style column like progress?) This column might
> also be usable for other long-running commands like ANALYZE, CREATE/RE INDEX
> and COPY. To realize this feature, we certainly need to properly change
> pgstat_report_activity, use it more and add a new track-activity parameter.
A very similar idea was proposed in the following:
http://www.postgresql.org/message-id/1284756643.25048.42.ca...@vanquo.pezone.net

IIUC, the problem with showing progress in pg_stat_activity is that it
introduces compulsory progress-calculation overhead in core for every command.
As the work units of each command vary, a common infrastructure might not be
able to represent every command's progress effectively.
An architecture which displays progress only on user demand, for each command
separately, will be more efficient.
So the suggestion was rather to have a detailed progress report, including the
remaining time, for a command on user demand.

FWIW, I am working on designing an approach to report VACUUM progress stats,
for which I will be posting a detailed proposal.
The use case is reporting progress for long-running VACUUMs. The approach
involves using hooks to extract VACUUM progress statistics.
The progress can be displayed using a view (e.g. pg_stat_maintenance).

Thank you,
Rahila Syed


-Original Message-
From: pgsql-hackers-ow...@postgresql.org 
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Naoya Anzai
Sent: Tuesday, June 16, 2015 8:41 AM
To: Tomas Vondra
Cc: pgsql-hackers@postgresql.org; Akio Iwaasa; bench.cof...@gmail.com; Tom 
Lane; Jeff Janes; Jim Nasby; Andres Freund; Alvaro Herrera
Subject: Re: [HACKERS] [Proposal] More Vacuum Statistics

Hi,

Thank you for the comments, and sorry for my late response.

 
> > pg_stat_vacuum view
> >
> > I understand it is not good to simply add more counters in
> > pg_stat_*_tables. For now, I'd like to suggest an extension which can
> > confirm vacuum statistics like pg_stat_statements.
> >
> > Similar feature has been already provided by pg_statsinfo package.
> > But it is a full-stack package for PG-stats and it needs to redesign
> > pg_log and design a repository database for introduction.
> > And it is not a core-extension for PostgreSQL.
> > (I don't intend to hate pg_statsinfo,
> >  I think this package is a very convenient tool)
> >
> > Everyone will be able to do tuning of VACUUM more easily.
> > That's all I want.

> I'm still wondering whether these stats will really make the tuning any
> easier. What I do right now is looking at pg_stat_all_tables.n_dead_tup
> and if it exceeds some threshold, it's a sign that vacuum may need a
> bit of tuning. Sometimes it really requires tuning vacuum itself, but
> more often than not it's due to something else (a large bulk delete,
> autovacuum getting stuck on another table, ...). I don't see how the
> new stats would make this any easier.
>
> Can you give some examples on how the new stats might be used (and
> where the current stats are insufficient)? What use cases do you
> imagine for those stats?

pg_stat_vacuum can keep histories of vacuum statistics for each table/index in
shared memory. (They are not only the last vacuum, which can already be
confirmed using pg_stat_all_tables.) It makes analysis of vacuum histories
easier because this view can sort, aggregate, or filter.

My use cases for those stats are the following:

- examine the TRANSITION of vacuum execution time on any table (you can
  predict the future vacuum execution time)
- examine the EXECUTION INTERVAL of vacuum for each table (if too frequent,
  the vacuum threshold should be tuned up)
- examine the REMAINING dead tuples just after vacuum (if dead tuples remain,
  it may be due to idle-in-transaction sessions)


> It might help differentiate the autovacuum activity from the rest of
> the system (e.g. there's a lot of I/O going on - how much of that is
> coming from autovacuum workers?). This would however require a more
> fine-grained reporting, because often the vacuums run for a very long
> time, especially on very large tables (which is exactly the case when
> this might be handy) - I just had a VACUUM that ran for 12 hours. These
> jobs should report the stats incrementally, not just once at the very
> end, because that makes it rather useless IMNSHO.

+1

Certainly, VACUUM often has a long execution time; I just had one too.
At present, we cannot predict when a vacuum will finish, what it is doing now,
or whether it has any problem.

Maybe, for DBAs,
it might be better to show vacuum progress in pg_stat_activity.
(If we'd do that, add a free-style column like progress?) This column might
also be usable for other long-running commands like ANALYZE, CREATE/RE INDEX
and COPY. To realize this feature, we certainly need to properly change
pgstat_report_activity, use it more and add a new track-activity parameter.

Regards,

Anzai Naoya
---

Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-03-05 Thread Syed, Rahila

Hello,

Please find attached a patch. As discussed, the flags denoting compression and
the presence of a hole in the block image have been added in
XLogRecordBlockImageHeader rather than the block header.

Following are WAL numbers based on the attached test script posted by Michael
earlier in the thread.

                      WAL generated
FPW compression on    122.032 MB
FPW compression off   155.223 MB
HEAD                  155.236 MB

Compression: 21% (i.e., (155.223 - 122.032) / 155.223)
Number of block images generated in WAL: 63637


Thank you,
Rahila Syed




Support-compression-of-full-page-writes-in-WAL_v23.patch
Description: Support-compression-of-full-page-writes-in-WAL_v23.patch


compress_run.bash
Description: compress_run.bash



Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-03-04 Thread Syed, Rahila
Hello,

> Are there any other flag bits that we should or are planning to add into the
> WAL header newly, except the above two? If yes and they are required even by a
> block which doesn't have an image, I will change my mind and agree to add
> something like a chunk ID to a block header.
> But I guess the answer to the question is No. Since the flag bits we are now
> thinking to add are required only by a block having an image, adding them into
> a block header (instead of the block image header) seems a waste of bytes in
> WAL. So I concur with Michael.
I agree.
As per my understanding, this change of xlog format was to provide for future
enhancements which would need flags relevant to the entire block.
But as mentioned, currently the flags being added are related to the block
image only. Hence, for this patch it makes sense to add a field to
XLogRecordBlockImageHeader rather than the block header.
This will also save bytes per WAL record.

Thank you,
Rahila Syed

 





Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-03-03 Thread Syed, Rahila
Hello,

> It would be good to get those problems fixed first. Could you send an updated
> patch?

Please find attached an updated patch with the WAL replay error fixed. The
patch follows the chunk ID approach for the xlog format.

Following are brief measurement numbers.

                      WAL
FPW compression on    122.032 MB
FPW compression off   155.239 MB
HEAD                  155.236 MB


Thank you,
Rahila Syed



Support-compression-of-full-page-writes_v22.patch
Description: Support-compression-of-full-page-writes_v22.patch



Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-02-18 Thread Syed, Rahila
Hello,

> I think we should change the xlog format so that the block_id (which currently
> is XLR_BLOCK_ID_DATA_SHORT/LONG or an actual block id) isn't the block id but
> something like XLR_CHUNK_ID. Which is used as is for
> XLR_CHUNK_ID_DATA_SHORT/LONG, but for backup blocks can be set to
> XLR_CHUNK_BKP_WITH_HOLE, XLR_CHUNK_BKP_COMPRESSED, XLR_CHUNK_BKP_REFERENCE...
> The BKP blocks will then follow, storing the block id following the chunk id.
>
> Yes, that'll increase the amount of data for a backup block by 1 byte, but I
> think that's worth it. I'm pretty sure we will be happy about the added
> extensibility pretty soon.

To clarify my understanding of the above change:

Instead of a block id to reference different fragments of an xlog record, a
single-byte field chunk_id should be used. chunk_id will be the same as
XLR_BLOCK_ID_DATA_SHORT/LONG for main data fragments.
But for block references, it will store the following values in order to carry
information about the backup blocks.
#define XLR_CHUNK_BKP_COMPRESSED  0x01
#define XLR_CHUNK_BKP_WITH_HOLE   0x02
...

The new xlog format should look as follows:

Fixed-size header (XLogRecord struct)
Chunk_id (add a field before the id field in the XLogRecordBlockHeader struct)
XLogRecordBlockHeader
Chunk_id
XLogRecordBlockHeader
...
Chunk_id (rename the id field of the XLogRecordDataHeader struct)
XLogRecordDataHeader[Short|Long]
block data
block data
...
main data

I will post a patch based on this.

Thank you,
Rahila Syed

-Original Message-
From: Andres Freund [mailto:and...@2ndquadrant.com] 
Sent: Monday, February 16, 2015 5:26 PM
To: Syed, Rahila
Cc: Michael Paquier; Fujii Masao; PostgreSQL mailing lists
Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

On 2015-02-16 11:30:20 +0000, Syed, Rahila wrote:
>  - * As a trivial form of data compression, the XLOG code is aware that
>  - * PG data pages usually contain an unused hole in the middle, which
>  - * contains only zero bytes.  If hole_length > 0 then we have removed
>  - * such a hole from the stored data (and it's not counted in the
>  - * XLOG record's CRC, either).  Hence, the amount of block data actually
>  - * present is BLCKSZ - hole_length bytes.
>  + * Block images are able to do several types of compression:
>  + * - When wal_compression is off, as a trivial form of compression, the
>  + * XLOG code is aware that PG data pages usually contain an unused hole
>  + * in the middle, which contains only zero bytes.  If length < BLCKSZ
>  + * then we have removed such a hole from the stored data (and it is
>  + * not counted in the XLOG record's CRC, either).  Hence, the amount
>  + * of block data actually present is length bytes.  The hole offset
>  + * on page is defined using hole_offset.
>  + * - When wal_compression is on, block images are compressed using a
>  + * compression algorithm without their hole to improve compression
>  + * process of the page. length corresponds in this case to the length
>  + * of the compressed block. hole_offset is the hole offset of the page,
>  + * and the length of the uncompressed block is defined by raw_length,
>  + * whose data is included in the record only when compression is enabled
>  + * and with_hole is set to true, see below.
>  + *
>  + * is_compressed is used to identify if a given block image is compressed
>  + * or not. Maximum page size allowed on the system being 32k, the hole
>  + * offset cannot be more than 15-bit long so the last free bit is used to
>  + * store the compression state of block image. If the maximum page size
>  + * allowed is increased to a value higher than that, we should consider
>  + * increasing this structure size as well, but this would increase the
>  + * length of block header in WAL records with alignment.
>  + *
>  + * with_hole is used to identify the presence of a hole in a block image.
>  + * As the length of a block cannot be more than 15-bit long, the extra bit
>  + * in the length field is used for this identification purpose. If the
>  + * block image has no hole, it is ensured that the raw size of a compressed
>  + * block image is equal to BLCKSZ, hence the contents of
>  + * XLogRecordBlockImageCompressionInfo are not necessary.
>     */
>    typedef struct XLogRecordBlockImageHeader
>    {
>  - uint16  hole_offset;    /* number of bytes before hole */
>  - uint16  hole_length;    /* number of bytes in hole */
>  + uint16  length:15,      /* length of block data in record */
>  +         with_hole:1;    /* status of hole in the block */
>  +
>  + uint16  hole_offset:15, /* number of bytes before hole */
>  +         is_compressed:1;    /* compression status of image */
>  +
>  + /* Followed by the data related to compression if block is compressed */
>    } XLogRecordBlockImageHeader;

Yikes, this is ugly.

I think we should change the xlog format so that the block_id

Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-02-16 Thread Syed, Rahila
Hello,

Thank you for reviewing and testing the patch.

> +   /* leave if data cannot be compressed */
> +   if (compressed_len == 0)
> +       return false;
> This should be < 0; pglz_compress returns -1 when compression fails.
>
> +   if (pglz_decompress(block_image, bkpb->bkp_len, record->uncompressBuf,
> +                       bkpb->bkp_uncompress_len) == 0)
> Similarly, this should be < 0.

These have been corrected in the attached.

> Regarding the sanity checks that have been added recently. I think that they
> are useful but I am suspecting as well that only a check on the record CRC is
> done because that's reliable enough and not doing those checks accelerates
> replay a bit. So I am thinking that we should simply replace them by
> assertions.
Removing the checks makes sense as the CRC ensures correctness. Moreover, as an
error message for an invalid record length is present in the code, messages for
an invalid block length would be redundant.
The checks have been replaced by assertions in the attached patch.

The following if condition in XLogCompressBackupBlock has been modified as
follows.

Previous:
+  /*
+   * We recheck the actual size even if pglz_compress() reports success and
+   * see if at least 2 bytes of length have been saved, as this corresponds
+   * to the additional amount of data stored in WAL record for a compressed
+   * block via raw_length when block contains hole.
+   */
+  *len = (uint16) compressed_len;
+  if (*len >= orig_len - SizeOfXLogRecordBlockImageCompressionInfo)
+      return false;
+  return true;

Current:
+  if ((hole_length != 0) &&
+      (*len >= orig_len - SizeOfXLogRecordBlockImageCompressionInfo))
+      return false;
+  return true;

This is because the extra information raw_length is included only if the
compressed block has a hole in it.

> Once we get those small issues fixed, I think that it is worth having a
> committer look at this patch, presumably Fujii-san.
Agreed. I will mark this patch as ready for committer.

Thank you,
Rahila Syed


Support-compression-for-full-page-writes-in-WAL_v19.patch
Description: Support-compression-for-full-page-writes-in-WAL_v19.patch



Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-02-12 Thread Syed, Rahila

Thank you for comments. Please find attached the updated patch.

This patch fails to compile:
>xlogreader.c:1049:46: error: extraneous ')' after condition, expected a
>statement
>                blk->with_hole && blk->hole_offset <= 0))
This has been rectified.

> Note as well that at least clang does not much like how the sanity checks with
> with_hole are done. You should place parentheses around the '&&' expressions.
> Also, I would rather define with_hole == 0 or with_hole == 1 explicitly in
> those checks.
The expressions are modified accordingly.

> There is a typo:
> s/true,see/true, see/
> [nitpicky]Be as well aware of the 80-character limit per line that is usually
> respected by comment blocks.[/]

Have corrected the typos and changed the comments as mentioned. Also realigned
certain lines to meet the 80-character limit.

Thank you,
Rahila Syed




Support-compression-for-full-page-writes-in-WAL_v18.patch
Description: Support-compression-for-full-page-writes-in-WAL_v18.patch



Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-02-09 Thread Syed, Rahila
Hello,

A bug had been introduced in the latest versions of the patch. The order of 
parameters passed to pglz_decompress was wrong.
Please find attached patch with following correction,

Original code:
+   if (pglz_decompress(block_image, record->uncompressBuf,
+                       bkpb->bkp_len, bkpb->bkp_uncompress_len) == 0)
Correction:
+   if (pglz_decompress(block_image, bkpb->bkp_len,
+                       record->uncompressBuf, bkpb->bkp_uncompress_len) == 0)


> For example here you should not have a space between the function definition
> and the variable declarations:
> +{
> +
> +    int orig_len = BLCKSZ - hole_length;
> This is as well incorrect in two places:
> if(hole_length != 0)
> There should be a space between the if and its condition in parentheses.

Also corrected the above code-format mistakes.

Thank you,
Rahila Syed


Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-02-09 Thread Syed, Rahila
Hello,

>> Do we always need extra two bytes for compressed backup block?
>> ISTM that extra bytes are not necessary when the hole length is zero.
>> In this case the length of the original backup block (i.e.,
>> uncompressed) must be BLCKSZ, so we don't need to save the original
>> size in the extra bytes.
>
> Yes, we would need an additional bit to identify that. We could steal it from
> length in XLogRecordBlockImageHeader.

This is implemented in the attached patch by dividing the length field as
follows:
uint16  length:15,
        with_hole:1;

> "2" should be replaced with the macro variable indicating the size of
> extra header for compressed backup block.
The macro SizeOfXLogRecordBlockImageCompressionInfo is used instead of 2.

> You can refactor XLogCompressBackupBlock() and move all the
> above code to it for more simplicity.
This is also implemented in the attached patch.

Thank you,
Rahila Syed


-Original Message-
From: Michael Paquier [mailto:michael.paqu...@gmail.com] 
Sent: Friday, February 06, 2015 6:00 PM
To: Fujii Masao
Cc: Syed, Rahila; PostgreSQL mailing lists
Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

On Fri, Feb 6, 2015 at 3:03 PM, Fujii Masao wrote:
 Do we always need extra two bytes for compressed backup block?
 ISTM that extra bytes are not necessary when the hole length is zero.
 In this case the length of the original backup block (i.e., 
 uncompressed) must be BLCKSZ, so we don't need to save the original 
 size in the extra bytes.

Yes, we would need a additional bit to identify that. We could steal it from 
length in XLogRecordBlockImageHeader.

 Furthermore, when fpw compression is disabled and the hole length is 
 zero, we seem to be able to save one byte from the header of backup 
 block. Currently we use 4 bytes for the header, 2 bytes for the length 
 of backup block, 15 bits for the hole offset and 1 bit for the flag 
 indicating whether block is compressed or not. But in that case, the 
 length of backup block doesn't need to be stored because it must be 
 BLCKSZ. Shouldn't we optimize the header in this way? Thought?

If we do it, that's something to tackle even before this patch on HEAD, because 
you could use the 16th bit of the first 2 bytes of XLogRecordBlockImageHeader 
to do necessary sanity checks, to actually not reduce record by 1 byte, but 2 
bytes as hole-related data is not necessary. I imagine that a patch optimizing 
that wouldn't be that hard to write as well.

 +int page_len = BLCKSZ - hole_length;
 +char *scratch_buf;
 +if (hole_length != 0)
 +{
 +scratch_buf = compression_scratch;
 +memcpy(scratch_buf, page, hole_offset);
 +memcpy(scratch_buf + hole_offset,
 +   page + (hole_offset + hole_length),
 +   BLCKSZ - (hole_length + hole_offset));
 +}
 +else
 +scratch_buf = page;
 +
 +/* Perform compression of block */
 +if (XLogCompressBackupBlock(scratch_buf,
 +page_len,
 +regbuf->compressed_page,
 +&compress_len))
 +{
 +/* compression is done, add record */
 +is_compressed = true;
 +}

 You can refactor XLogCompressBackupBlock() and move all the above code 
 to it for more simplicity.

Sure.
--
Michael


Support-compression-for-full-page-writes-in-WAL_v17.patch
Description: Support-compression-for-full-page-writes-in-WAL_v17.patch



Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-02-06 Thread Syed, Rahila

> In any case, those things have been introduced by what I did in previous
> versions... And attached is a new patch.
Thank you for the feedback.

>    /* allocate scratch buffer used for compression of block images */
> +  if (compression_scratch == NULL)
> +      compression_scratch = MemoryContextAllocZero(xloginsert_cxt,
> +                                                   BLCKSZ);
>  }
The compression patch can use the latest interface MemoryContextAllocExtended
to proceed without compression when sufficient memory is not available for the
scratch buffer.
The attached patch introduces an OutOfMem flag which is set when
MemoryContextAllocExtended returns NULL.
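
A sketch of the fallback being described; the OutOfMem flag name is from this
mail, the surrounding context is assumed:

    /* try to allocate the scratch buffer, but tolerate failure */
    compression_scratch = MemoryContextAllocExtended(xloginsert_cxt,
                                                     BLCKSZ,
                                                     MCXT_ALLOC_NO_OOM);
    if (compression_scratch == NULL)
        OutOfMem = true;   /* fall back to uncompressed full-page writes */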

Thank you,
Rahila Syed

-Original Message-
From: Michael Paquier [mailto:michael.paqu...@gmail.com] 
Sent: Friday, February 06, 2015 12:46 AM
To: Syed, Rahila
Cc: PostgreSQL mailing lists
Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

On Thu, Feb 5, 2015 at 11:06 PM, Syed, Rahila rahila.s...@nttdata.com wrote:
/*
+* We recheck the actual size even if pglz_compress() report success,
+* because it might be satisfied with having saved as little as one byte
+* in the compressed data.
+*/
+   *len = (uint16) compressed_len;
+   if (*len >= orig_len - 1)
+   return false;
+   return true;
+}

 As per latest code ,when compression is 'on' we introduce additional 2 bytes 
 in the header of each block image for storing raw_length of the compressed 
 block.
 In order to achieve compression while accounting for these two additional 
 bytes, we must ensure that compressed length is less than original length - 2.
 So , IIUC the above condition should rather be

 if (*len >= orig_len - 2)
 return false;
 return true;
 The attached patch contains this. It also has a cosmetic change-  renaming 
 compressBuf to uncompressBuf as it is used to store uncompressed page.

Agreed on both things.

Just looking at your latest patch after some time to let it cool down, I 
noticed a couple of things.

 #define MaxSizeOfXLogRecordBlockHeader \
 (SizeOfXLogRecordBlockHeader + \
- SizeOfXLogRecordBlockImageHeader + \
+ SizeOfXLogRecordBlockImageHeader, \
+ SizeOfXLogRecordBlockImageCompressionInfo + \
There is a comma here instead of a sum sign. We should really sum up all those 
sizes to evaluate the maximum size of a block header.

+ * Permanently allocate readBuf uncompressBuf.  We do it this way,
+ * rather than just making a static array, for two reasons:
This comment is a bit weird; "readBuf AND uncompressBuf" is more appropriate.

+ * We recheck the actual size even if pglz_compress() report success,
+ * because it might be satisfied with having saved as little as one byte
+ * in the compressed data. We add two bytes to store raw_length  with the
+ * compressed image. So for compression to be effective
compressed_len should
+ * be atleast  orig_len - 2
This comment block should be reworked, and misses a dot at its end. I rewrote 
it like that, hopefully that's clearer:
+   /*
+    * We recheck the actual size even if pglz_compress() reports success and
+    * see if at least 2 bytes of length have been saved, as this corresponds
+    * to the additional amount of data stored in the WAL record for a
+    * compressed block via raw_length.
+    */

In any case, those things have been introduced by what I did in previous 
versions... And attached is a new patch.
--
Michael


Support-compression-for-full-page-writes-in-WAL_v17.patch
Description: Support-compression-for-full-page-writes-in-WAL_v17.patch



Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-02-05 Thread Syed, Rahila
Hello,
 
> /*
> + * We recheck the actual size even if pglz_compress() report success,
> + * because it might be satisfied with having saved as little as one byte
> + * in the compressed data.
> + */
> +   *len = (uint16) compressed_len;
> +   if (*len >= orig_len - 1)
> +       return false;
> +   return true;
> +}

As per the latest code, when compression is 'on' we introduce an additional 2
bytes in the header of each block image for storing the raw_length of the
compressed block.
In order to achieve compression while accounting for these two additional
bytes, we must ensure that the compressed length is less than original
length - 2.
So, IIUC, the above condition should rather be

if (*len >= orig_len - 2)
    return false;
return true;
 
The attached patch contains this. It also has a cosmetic change: renaming
compressBuf to uncompressBuf, as it is used to store the uncompressed page.
 
Thank you,
Rahila Syed

-Original Message-
From: pgsql-hackers-ow...@postgresql.org 
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Michael Paquier
Sent: Wednesday, January 07, 2015 9:32 AM
To: Rahila Syed
Cc: PostgreSQL mailing lists
Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

On Wed, Jan 7, 2015 at 12:51 AM, Rahila Syed rahilasyed...@gmail.com wrote:
> Following are some comments,
Thanks for the feedback.

> uint16  hole_offset:15, /* number of bytes in hole */
> Typo in description of hole_offset.
Fixed. That's "before hole".

> for (block_id = 0; block_id <= record->max_block_id; block_id++)
> -   {
> -       if (XLogRecHasBlockImage(record, block_id))
> -           fpi_len += BLCKSZ - record->blocks[block_id].hole_length;
> -   }
> +   fpi_len += record->blocks[block_id].bkp_len;
>
> IIUC, the if condition, if (XLogRecHasBlockImage(record, block_id)), is
> incorrectly removed from the above for loop.
Fixed.

> typedef struct XLogRecordCompressedBlockImageHeader
> I am trying to understand the purpose behind declaration of the above
> struct. IIUC, it is defined in order to introduce new field uint16
> raw_length and it has been declared as a separate struct from
> XLogRecordBlockImageHeader to not affect the size of WAL record when
> compression is off.
> I wonder if it is ok to simply memcpy the uint16 raw_length in the
> hdr_scratch when compression is on and not have a separate header
> struct for it, nor declare it in the existing header. raw_length can be
> a locally defined variable in XLogRecordAssemble or it can be a field
> in registered_buffer struct like compressed_page.
> I think this can simplify the code.
> Am I missing something obvious?
You are missing nothing. I just introduced this structure for a matter of 
readability to show the two-byte difference between non-compressed and 
compressed header information. It is true that doing it my way makes the 
structures duplicated, so let's simply add the compression-related information 
as an extra structure added after XLogRecordBlockImageHeader if the block is 
compressed. I hope this addresses your concerns.

> /*
>  * Fill in the remaining fields in the XLogRecordBlockImageHeader
>  * struct and add new entries in the record chain.
>  */
>
>   bkpb.fork_flags |= BKPBLOCK_HAS_IMAGE;
>
> This code line seems to be misplaced with respect to the above comment.
> Comment indicates filling of XLogRecordBlockImageHeader fields while
> fork_flags is a field of XLogRecordBlockHeader.
> Is it better to place the code close to following condition?
>   if (needs_backup)
>   {
Yes, this comment should not be here. I replaced it with the comment in HEAD.


> + * the original length of the
> + * block without its page hole being deducible from the compressed data
> + * itself.
> IIUC, this comment before XLogRecordBlockImageHeader seems to be no
> longer valid as original length is not deducible from compressed data
> and rather stored in header.
Aah, true. This was originally present in the header of PGLZ that has been 
removed to make it available for frontends.

Updated patches are attached.
Regards,
--
Michael


Support-compression-for-full-page-writes-in-WAL_v15.patch
Description: Support-compression-for-full-page-writes-in-WAL_v15.patch



Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2014-11-26 Thread Syed, Rahila
Hello,
I would like to contribute a few points.

> XLogInsertRecord(XLogRecData *rdata, XLogRecPtr fpw_lsn)
>     RedoRecPtr = Insert->RedoRecPtr;
>     }
>     doPageWrites = (Insert->fullPageWrites || Insert->forcePageWrites);
>     doPageCompression = (Insert->fullPageWrites == FULL_PAGE_WRITES_COMPRESS);

Don't we need to initialize doPageCompression similar to doPageWrites in 
InitXLOGAccess?

Also, in the earlier patches compression was set 'on' even when the fpw GUC 
was 'off'. This was to facilitate compression of FPWs that are forcibly 
written even when the fpw GUC is turned off. doPageCompression in this patch 
is set to true only if the value of the fpw GUC is 'compress'. I think it is 
better to also compress forcibly written full-page writes.
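
Something along these lines, mirroring how doPageWrites is refreshed in
InitXLOGAccess (a sketch against the patch, not final code):

    static void
    InitXLOGAccess(void)
    {
        XLogCtlInsert *Insert = &XLogCtl->Insert;

        /* ... existing initialization ... */

        /* Update the backend-local copies of the shared flags. */
        doPageWrites = (Insert->fullPageWrites || Insert->forcePageWrites);
        /* Suggested addition: keep the compression flag in sync as well. */
        doPageCompression =
            (Insert->fullPageWrites == FULL_PAGE_WRITES_COMPRESS);
    }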


Regards,

Rahila Syed
-Original Message-
From: pgsql-hackers-ow...@postgresql.org 
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Michael Paquier
Sent: Wednesday, November 26, 2014 1:55 PM
To: Alvaro Herrera
Cc: Andres Freund; Robert Haas; Fujii Masao; Rahila Syed; Rahila Syed; 
PostgreSQL-development
Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

So, here are reworked patches for the whole set, with the following changes:
- Found why replay was failing: xlogreader.c took into account (BLCKSZ - 
hole) while it should have taken the compressed data length into account 
when fetching a compressed block image.
- Reworked the pglz portion to return status errors instead of simple 
booleans. The pglz code has also been moved to src/common, as Alvaro 
suggested.
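
Roughly, the reworked interface reports something like the following
instead of a bool (the names here are hypothetical, for illustration only;
see the attached patch for the actual definitions):

    /* Hypothetical status codes for the reworked pglz routines. */
    typedef enum
    {
        PGLZ_OK = 0,            /* operation succeeded */
        PGLZ_INCOMPRESSIBLE,    /* input did not compress usefully */
        PGLZ_CORRUPTED_DATA     /* decompression detected bad input */
    } PGLZ_Status;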

I am planning to run some tests to check how much compression can reduce WAL 
size with this new set of patches. I have, however, been able to verify that 
these patches pass installcheck-world with a standby replaying the changes 
behind. Feel free to play with the patches...
Regards,
--
Michael





Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2014-10-28 Thread Syed, Rahila
Hello Fujii-san,

Thank you for your comments.

The patch isn't applied to the master cleanly.
The compilation of the document failed with the following error message.
openjade:config.sgml:2188:12:E: end tag for element TERM which is not open
make[3]: *** [HTML.index] Error 1
xlog.c:930: warning: ISO C90 forbids mixed declarations and code
xlogreader.c:744: warning: ISO C90 forbids mixed declarations and code
xlogreader.c:744: warning: ISO C90 forbids mixed declarations and code

Please find attached the patch with these issues rectified.

Only backend calls CompressBackupBlocksPagesAlloc when SIGHUP is sent.
Why does only backend need to do that? What about other processes which can 
write FPW, e.g., autovacuum?
I had overlooked this. I will correct it.

Do we release the buffers for compressed data when fpw is changed from 
compress to on?
The current code does not do this.
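
A sketch of what that release could look like when the GUC is reloaded
(hypothetical placement in the SIGHUP handling path; names taken from the
quoted review):

    /* If full_page_writes moved away from 'compress', drop the buffer. */
    if (!doPageCompression && compressedPages != NULL)
    {
        free(compressedPages);
        compressedPages = NULL;
    }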

The memory is always (i.e., even when fpw=on) allocated to uncompressedPages, 
but not to compressedPages. Why? I guess that the test of fpw needs to be there.
uncompressedPages is also used to store the decompression output at recovery 
time. Hence, memory for uncompressedPages needs to be allocated even when 
fpw=on, which is not the case for compressedPages.
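
To summarize the allocation logic under discussion, a sketch using the names
from the quoted review (the fpw test guarding compressedPages, expressed
here via doPageCompression, is the suggested change, not the current code):

    /* Allocate scratch buffers used for FPW compression/decompression. */
    static void
    CompressBackupBlocksPagesAlloc(void)
    {
        /*
         * uncompressedPages also holds decompression output during
         * recovery, so it is needed regardless of the local
         * full_page_writes setting.
         */
        if (uncompressedPages == NULL)
        {
            uncompressedPages = (char *) malloc(XLR_TOTAL_BLCKSZ);
            if (uncompressedPages == NULL)
                outOfMem = 1;
        }

        /* compressedPages is only needed when compression is enabled. */
        if (doPageCompression && compressedPages == NULL)
        {
            compressedPages = (char *) malloc(XLR_TOTAL_BLCKSZ);
            if (compressedPages == NULL)
                outOfMem = 1;
        }
    }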

Thank you,
Rahila Syed

-Original Message-
From: Fujii Masao [mailto:masao.fu...@gmail.com] 
Sent: Monday, October 27, 2014 6:50 PM
To: Rahila Syed
Cc: Andres Freund; Syed, Rahila; Alvaro Herrera; Rahila Syed; 
PostgreSQL-development; Tom Lane
Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

On Fri, Oct 17, 2014 at 1:52 PM, Rahila Syed rahilasye...@gmail.com wrote:
 Hello,

 Please find the updated patch attached.

Thanks for updating the patch! Here are the comments.

The patch isn't applied to the master cleanly.

I got the following compiler warnings.

xlog.c:930: warning: ISO C90 forbids mixed declarations and code
xlogreader.c:744: warning: ISO C90 forbids mixed declarations and code
xlogreader.c:744: warning: ISO C90 forbids mixed declarations and code

The compilation of the document failed with the following error message.

openjade:config.sgml:2188:12:E: end tag for element TERM which is not open
make[3]: *** [HTML.index] Error 1

Only backend calls CompressBackupBlocksPagesAlloc when SIGHUP is sent.
Why does only backend need to do that? What about other processes which can 
write FPW, e.g., autovacuum?

Do we release the buffers for compressed data when fpw is changed from 
compress to on?

+    if (uncompressedPages == NULL)
+    {
+        uncompressedPages = (char *) malloc(XLR_TOTAL_BLCKSZ);
+        if (uncompressedPages == NULL)
+            outOfMem = 1;
+    }

The memory is always (i.e., even when fpw=on) allocated to uncompressedPages, 
but not to compressedPages. Why? I guess that the test of fpw needs to be there.

Regards,

--
Fujii Masao



compress_fpw_v2.patch
Description: compress_fpw_v2.patch



Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2014-09-22 Thread Syed, Rahila
Hello,

Please find attached the patch to compress FPW.

Sorry, I had forgotten to attach it. Please find the patch attached.

Thank you,
Rahila Syed





From: pgsql-hackers-ow...@postgresql.org 
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Rahila Syed
Sent: Monday, September 22, 2014 3:16 PM
To: Alvaro Herrera
Cc: Rahila Syed; PostgreSQL-development; Tom Lane
Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes


Hello All,


Well, from Rahila's point of view the patch is actually attached, but
she's posting from the Nabble interface, which mangles it and turns it into
a link instead.

Yes.



but the end result is the
same: to properly submit a patch, you need to send an email to the
pgsql-hackers mailing list, not join a group/forum from
some intermediary newsgroup site that mirrors the list.


Thank you. I will take care of it henceforth.

Please find attached the patch to compress FPW. The patch submitted by 
Fujii-san earlier in the thread is used to merge the compression GUC with 
full_page_writes.



I am reposting the measurement numbers.

Server Specification:
Processors: Intel® Xeon® Processor E5-2650 (2 GHz, 8C/16T, 20 MB) x 2
RAM: 32GB
Disk: HDWD 450GB 10K Hot Plug 2.5-inch SAS HDD x 8
1 x 450 GB SAS HDD, 2.5-inch, 6Gb/s, 10,000 rpm

Checkpoint segments: 1024
Checkpoint timeout: 5 mins

pgbench -c 64 -j 64 -r -T 900 -M prepared
Scale factor: 1000

                 WAL generated (MB)   Throughput (tps)   Latency (ms)
On               9235.43              979.03             65.36
Compress (pglz)  6518.68              1072.34            59.66
Off              501.04               1135.17            56.34

The results show around a 30 percent decrease in WAL volume due to 
compression of FPWs.
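
For reference, the savings work out to (9235.43 - 6518.68) / 9235.43 ≈ 29.4% 
less WAL, with throughput improving from 979.03 to 1072.34 tps (about 9.5%) 
and average latency dropping from 65.36 ms to 59.66 ms.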

Thank you,

Rahila Syed

Tom Lane wrote:
 Rahila Syed rahilasyed...@gmail.com writes:
  Please find attached patch to compress FPW using pglz compression.

 Patch not actually attached AFAICS (no, a link is not good enough).
 Patch not actually attached AFAICS (no, a link is not good enough).

Well, from Rahila's point of view the patch is actually attached, but
she's posting from the Nabble interface, which mangles it and turns it into
a link instead.  Not her fault, really -- but the end result is the
same: to properly submit a patch, you need to send an email to the
pgsql-hackers@postgresql.org mailing list, not join a group/forum from
some intermediary newsgroup site that mirrors the list.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



compress_fpw_v1.patch
Description: compress_fpw_v1.patch
