Re: [HACKERS] New gist vacuum.

2017-11-12 Thread Andrey Borodin
Hello!

> On 31 Jan 2016, at 17:18, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
> 
> Konstantin Kuznetsov wrote:
>> Thank you, Jeff. I'm reworking the patch now. All // warnings will
>> be deleted. Regarding memory consumption, the new version will control the size
>> of the stack and will operate with a small map, because I want to delete the
>> old-style vacuum (currently, if maintenance_work_mem is less than needed to build
>> the info map, we use old-style vacuum in logical order).
> 
> You never got around to submitting the updated version of this patch,
> and it's been a long time now, so I'm marking it as returned with
> feedback for now.  Please do submit a new version once you have it,
> since this seems to be very useful.

I've rebased the patch (see attachment), and it seems to work. It still requires a bit 
of styling, docs and tests, at least.
Also, I think that a hash table is not a very good option if we have all pages 
there: we should either use an array or not fill the table for every page.

If the author and the community do not object, I want to continue work on Konstantin's 
patch.

Best regards, Andrey Borodin.


0001-GiST-VACUUM-rebase.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Covering + unique indexes.

2017-11-12 Thread Andrey Borodin
Hi!
+1 for pushing this. I'm really looking forward to seeing this in 11.

> On 31 Oct 2017, at 13:21, Anastasia Lubennikova <a.lubennik...@postgrespro.ru> wrote:
> 
> Updated version is attached. It applies to the commit 
> e4fbf22831c2bbcf032ee60a327b871d2364b3f5.
> The first patch contains changes in general index routines
> and the second one contains btree-specific changes.
> 
> This version contains fixes for the issues mentioned in the thread above and 
> passes all existing tests.
> But it still requires review and testing, because the merge was quite tricky.
> Especially I worry about integration with partitioning. I'll add some more 
> tests in the next message.

I did benchmark tests a year ago, so I'll skip that part in this review.

I've done some stress tests with pgbench, replication, etc. Everything was fine 
until I plugged in amcheck.
If I create a cluster with this [0] and then
./pgbench -i -s 50

create index on pgbench_accounts (abalance) include (bid,aid,filler); 
create extension amcheck;

--and finally 
SELECT bt_index_check(c.oid), c.relname, c.relpages
FROM pg_index i
JOIN pg_opclass op ON i.indclass[0] = op.oid
JOIN pg_am am ON op.opcmethod = am.oid
JOIN pg_class c ON i.indexrelid = c.oid
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE am.amname = 'btree' AND n.nspname = 'public'
AND c.relpersistence != 't'
AND i.indisready AND i.indisvalid
ORDER BY c.relpages DESC LIMIT 100;
--just copy-pasted from the amcheck docs with minor corrections

Postgres crashes: 
TRAP: FailedAssertion("!(((const void*)() != ((void*)0)) && 
(scankey->sk_attno) > 0)", File: "nbtsearch.c", Line: 466)

Maybe I'm doing something wrong, or will amcheck support come in a different 
patch?

A few minor nitpicks:
0. PgAdmin fails to understand what is going on [1]. It is clearly a problem of 
PgAdmin; pg_dump works as expected.
1. ISTM index_truncate_tuple() can be optimized: we only need to reset the tuple 
length and infomask. But this should not be a hot path anyway, so I propose 
ignoring it for the current version.
2. I've done some grammar checking :) This comma seems redundant [2].
I don't think any of these items require fixing.

Thanks for working on this, I believe it is important.

Best regards, Andrey Borodin.

[0] https://github.com/x4m/pgscripts/blob/master/install.sh
[1] https://yadi.sk/i/ro9YKFqo3PcwFT
[2] 
https://github.com/x4m/postgres_g/commit/657c28952d923d8c150e6cabb3bdcbbc44a641b6?diff=unified#diff-640baf2937029728a8d51cccd554c2eeR1291



Re: [HACKERS] Index only scan for cube and seg

2017-10-29 Thread Andrey Borodin
Hi!
> On 29 Oct 2017, at 02:24, Alexander Korotkov <a.korot...@postgrespro.ru> wrote:
> 
> As I can see, cube GiST code always uses DatumGetNDBOX() macro to transform 
> Datum to (NDBOX *).
> 
> #define DatumGetNDBOX(x)  ((NDBOX *) PG_DETOAST_DATUM(x))
> 
> Thus, it should be safe to just remove both compress/decompress methods from 
> existing opclass.

Alexander, Tom, you are absolutely right. I was sure there was TOASTing code in 
cube's compress function, but it was never there.

Here is a patch for cube that drops those functions.

Best regards, Andrey Borodin.


0001-Enable-Index-Only-Scan-in-cube.patch
Description: Binary data


0001-Enable-Index-Only-Scan-in-seg.patch
Description: Binary data





Re: [HACKERS] Reading timeline from pg_control on replication slave

2017-10-28 Thread Andrey Borodin
Hi, Michael!

Thank you very much for these comments!

> On 28 Oct 2017, at 03:09, Michael Paquier <michael.paqu...@gmail.com> wrote:
> 
> ThisTimeLineID is not something you can rely on for standby backends
> as it is not set during recovery. That's the reason behind
> pg_walfile_name disabled during recovery. There are three things
> popping on top of my mind that one could think about:
> 1) Backups cannot be completed when started on a standby in recovery
> and when stopped after the standby has been promoted, meaning that its
> timeline has changed.
> 2) After a standby has been promoted, by using pg_start_backup, you
> issue a checkpoint which makes sure that the control file gets flushed
> with the new information, so when pg_start_backup returns to the
> caller you should have the correct timeline number because the outer
> function gets evaluated last.
> 3) Backups taken from cascading standbys, where a direct parent has
> been promoted.
> 
> 1) and 2) are actually not problems per the restrictions I am giving
> above, but 3) is. If I recall correctly, when a streaming standby does
> a timeline jump, a restart point is not immediately generated, so you
> could have the timeline on the control file not updated to the latest
> timeline value, meaning that you could have the WAL file name you use
> here referring to a previous timeline and not the newest one.
> 
> In short, yes, what you are doing is definitely risky in my opinion,
> particularly for complex cascading setups.

We use the TimeLineID from pg_control only to give a name to the backup. A slightly 
stale timeline ID will not cause significant problems as long as pg_control is 
picked up after backup finalization.

But from your words I see that the safest option is to check the timeline from 
pg_control after start and after stop. If these timelines differ, invalidate the 
backup entirely. That does not seem too harsh a condition for invalidation, does 
it?

Best regards, Andrey Borodin.



[HACKERS] Reading timeline from pg_control on replication slave

2017-10-27 Thread Andrey Borodin
Hi, hackers!

I'm working on backups from a replication slave in WAL-G [0].
Backups used the result of pg_walfile_name(pg_start_backup(...)). The call to 
pg_start_backup() works fine, but "pg_walfile_name() cannot be executed during 
recovery."
This function takes an LSN as argument and reads the TimeLineID from global state.
So I made a function [1] that, when on a replica, reads the timeline from the 
pg_control file and formats the WAL file name as if it were produced by 
pg_walfile_name(lsn).
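For reference, that formatting step can be sketched as follows (illustrative Python, not WAL-G's actual Go code): the WAL segment file name is three 8-digit uppercase hex fields — timeline, log id, segment — assuming the default 16 MB segment size.

```python
WAL_SEG_SIZE = 16 * 1024 * 1024                 # default wal_segment_size
SEGS_PER_XLOGID = 0x100000000 // WAL_SEG_SIZE   # 256 segments per "log id" at 16 MB

def walfile_name(tli: int, lsn: int) -> str:
    # Layout used by PostgreSQL WAL file names: TTTTTTTTXXXXXXXXSSSSSSSS
    segno = lsn // WAL_SEG_SIZE
    log = segno // SEGS_PER_XLOGID
    seg = segno % SEGS_PER_XLOGID
    return f"{tli:08X}{log:08X}{seg:08X}"

# An LSN inside the fourth 16 MB segment on timeline 1:
print(walfile_name(1, 0x3000028))  # 000000010000000000000003
```

Note that pg_walfile_name() itself behaves slightly differently at exact segment boundaries, so this sketch covers only the common case.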

Are there any serious dangers? Obviously, this hack is not crisp and clear. Is 
the risk of reading a stale timeline really a problem? By reading the TimeLineID from 
the file I'm bypassing those precautions in pg_walfile_name(..), which were 
implemented for a reason, I guess.

Thanks for reading this. I'll be happy to hear any comments on the matter.

[0] https://github.com/wal-g/wal-g/pull/32
[1] 
https://github.com/wal-g/wal-g/pull/32/files#diff-455316c14fb04c9f9748a0798ad151d9R158



[HACKERS] Index only scan for cube and seg

2017-10-26 Thread Andrey Borodin
Hi hackers!

Here are patches enabling Index Only Scan for cube and seg extensions.

These patches follow this discussion [0].

For cube there is a new default opclass. We cannot drop the old opclass, because it 
could TOAST some cube values on rare occasions. Index Only Scan is enabled only 
for newly created indexes. Btw, I can add a fetch function to the old opclass so that 
IOS would be enabled there too.
For seg, the compress and decompress functions are dropped from the opclass and the 
extension. Index Only Scan is enabled.

There are two more functions which could be deleted:
ghstore_decompress
g_intbig_decompress
but deleting them would not have any tangible effect.


[0] 
https://www.postgresql.org/message-id/flat/CAJEAwVELVx9gYscpE%3DBe6iJxvdW5unZ_LkcAaVNSeOwvdwtD%3DA%40mail.gmail.com#CAJEAwVELVx9gYscpE=Be6iJxvdW5unZ_LkcAaVNSeOwvdwtD=a...@mail.gmail.com



 

0001-Create-cube-opclass-without-toasting.patch
Description: Binary data


0001-Enable-Index-Only-Scan-in-seg.patch
Description: Binary data



Re: [HACKERS] Allow GiST opcalsses without compress\decompres functions

2017-10-23 Thread Andrey Borodin

> On 22 Oct 2017, at 21:21, Tom Lane <t...@sss.pgh.pa.us> wrote:
> 
> Andrey Borodin <x4...@yandex-team.ru> writes:
>> I was looking for a way to correctly drop compress\decompress functions from 
>> opclasses.
> 
> Making a new opclass seems like a pretty grotty answer; it won't
> help existing installations.

Unfortunately, in cube's case that's the only option: cube allows 
100-dimensional cubes, while already 94-dimensional cubes could be TOASTed 
(according to the code).

> I think what you need is to undo opclasscmds.c's decision that the
> dependencies should be INTERNAL.  I tried this:

Thanks, Tom, that's exactly what I need. I'll push patches with this to the November 
commitfest.

Best regards, Andrey Borodin.



Re: [HACKERS] Allow GiST opcalsses without compress\decompres functions

2017-10-22 Thread Andrey Borodin
Hi, Tom!
> On 20 Sep 2017, at 08:38, Tom Lane <t...@sss.pgh.pa.us> wrote:
> 
> Andrey Borodin <x4...@yandex-team.ru> writes:
>> [ 0001-Allow-uncompressed-GiST-4.patch ]
> 
> Pushed, with a bit more work on the documentation and some minor
> cosmetic changes.
> 
> I did not like the fact that the new code paths added by the patch
> were untested, so I went ahead and removed the no-longer-needed
> no-op functions in the core GiST opclasses.  There's still room
> to improve the contrib opclasses, but that's much more tedious
> (you need to write an extension update script) so I didn't feel
> like messing with those now.

I was looking for a way to correctly drop compress\decompress functions from 
opclasses.
There are two cases: when the functions do something unnecessary (like 
TOASTing) and when they are just no-ops.

The first case occurs, for example, in cube. Please see the attached patch. There is 
no DROP DEFAULT, so I'm doing
UPDATE pg_opclass SET opcdefault = FALSE WHERE opcname = 'gist_cube_ops';
I'm not sure this is the correct way, but I haven't found any other workaround. Then 
I just create a new default opclass without compress\decompress.

The second case occurs, for example, in seg. I have tried
alter operator family gist_seg_ops using gist drop function 3 (seg);
It does not work: the dependency between the opclass and the support function is 
DEPENDENCY_INTERNAL, so the command is refused.
Should I create a new opclass as with cube or, maybe, can I delete the functions 
from the system catalog?

Best regards, Andrey Borodin.



0001-Create-cube-opclass-without-compress-decompress.patch
Description: Binary data



Re: [HACKERS] On markers of changed data

2017-10-15 Thread Andrey Borodin
Hello!


> On 9 Oct 2017, at 10:23, Andrey Borodin <x4...@yandex-team.ru> wrote:
> 
> Thanks, Stephen, this actually pointed what to look for
> VM is WAL-logged [0]
> FSM is not [1]
> 
> [0] 
> https://github.com/postgres/postgres/blob/113b0045e20d40f726a0a30e33214455e4f1385e/src/backend/access/heap/visibilitymap.c#L315
> [1] 
> https://github.com/postgres/postgres/blob/1d25779284fe1ba08ecd57e647292a9deb241376/src/backend/storage/freespace/freespace.c#L593

After tests of binary equivalence before and after backup, I've come to the 
conclusion that the visibility map cannot be backed up incrementally: its bits are 
unset without a page LSN bump. This can lead to wrong results from Index Only Scans 
executed on freshly restored backups.

In my implementation of incremental backup in WAL-G I will disable incremental 
copying of the FSM, VM and XACT/CLOG files.

Posting this for the record, so that if someone goes down this path the information 
will be available. Thank you for your attention.

Best regards, Andrey Borodin.



Re: [HACKERS] On markers of changed data

2017-10-09 Thread Andrey Borodin
Hi Michael!
> On 9 Oct 2017, at 17:28, Michael Paquier <michael.paqu...@gmail.com> wrote:
>> VM is WAL-logged [0]
>> FSM is not [1]
> 
> If you are willing to go down this road, here are my takes on the matter:
> - Any LSN map should be in a different file than FSM and VM. The VM
> uses 2 bits per blocks now, and the FSM 8 bits. The FSM is wanted to
> remain small, so trying to plug into it more data is not welcome. The
> VM bits also are dedicated to purely visibility matters. We may not
> want to use that for the VM.
> - Those should not be logged, as we would end up with tracking down
> WAL records for things that themselves track the effects of other
> records.
I was asking about FSM and VM not because I wanted to place something there, 
but because I was looking for a way to back up the FSM and VM efficiently. The VM can 
be copied page-incrementally; the FSM cannot.

But the design you are describing resembles PTRACK [0]: a fork for tracking page 
changes, not WAL-logged, but crash-safe because it is recovered by replaying other 
WAL records.

Best regards, Andrey Borodin.


[0] https://gist.github.com/stalkerg/ab833d94e2f64df241f1835651e06e4b

Re: [HACKERS] On markers of changed data

2017-10-08 Thread Andrey Borodin

> On 8 Oct 2017, at 20:11, Stephen Frost <sfr...@snowman.net> wrote:
> * Andrey Borodin (x4...@yandex-team.ru) wrote:
>> But my other question still seems unanswered: can I use LSN logic for 
>> incremental copying of the FSM and VM? It seems like most of the time there is a valid LSN.
> 
> I haven't gone and audited it myself, but I would certainly expect you
> to be able to use the LSN for everything which is WAL'd.  If you have
> cases where that's not the case, it'd be useful to see them.

Thanks, Stephen, this actually pointed me at what to look for:
the VM is WAL-logged [0];
the FSM is not [1].

Now I have everything I wanted, so I'm going back to coding :)

Best regards, Andrey Borodin.

[0] 
https://github.com/postgres/postgres/blob/113b0045e20d40f726a0a30e33214455e4f1385e/src/backend/access/heap/visibilitymap.c#L315
[1] 
https://github.com/postgres/postgres/blob/1d25779284fe1ba08ecd57e647292a9deb241376/src/backend/storage/freespace/freespace.c#L593



Re: [HACKERS] On markers of changed data

2017-10-08 Thread Andrey Borodin
Tom, Alvaro, Michael, and especially Stephen, thank you for your valuable 
comments.

I feel enlightened about mtime.
My takeaways are:
1. Any other marker would be better (it could be a WAL scan during archiving, some 
new LSN-based mechanics*, etc.).
2. mtime could be used, provided the precautions described by Stephen are taken.

But my other question still seems unanswered: can I use LSN logic for incremental 
copying of the FSM and VM? It seems like most of the time there is a valid LSN.


* I like the idea of using something for both incremental (diff) backups and VACUUM; 
it is worth thinking about.

Best regards, Andrey Borodin.





[HACKERS] On markers of changed data

2017-10-06 Thread Andrey Borodin
Hi, hackers!

Currently I'm working on page-level incremental backups using the WAL-G 
codebase [0], and I have two questions that I cannot resolve myself.

An incremental backup is a set of changes that should be applied over a preexisting 
backup. I use the page LSN to decide whether a page should be backed up or not.

Question 1. FSM and VM.
As you can see here [1], FSM and VM files are exempt from incremental tracking 
and are backed up as whole files. I've done it this way because sanity checks 
[2] of page headers indicated a lot of "invalid" pages in FSM and VM 
files. But it seems that in some pages the headers are valid, with sane LSNs.
Can I use LSNs as a history marker on FSM and VM pages? On a 1 TB backup I get about 
150 MB of FSM+VM, and that's kind of a lot.
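To make the "LSN as history marker" idea concrete, here is a hedged sketch in Python (illustrative only; WAL-G is written in Go) of reading pd_lsn from a page header — the first 8 bytes of PageHeaderData, xlogid then xrecoff, little-endian on common platforms — and deciding whether a page belongs in the increment:

```python
import struct

BLCKSZ = 8192  # default page size

def page_lsn(page: bytes) -> int:
    # pd_lsn is the first field of PageHeaderData:
    # a 32-bit xlogid followed by a 32-bit xrecoff.
    xlogid, xrecoff = struct.unpack_from('<II', page, 0)
    return (xlogid << 32) | xrecoff

def page_needs_backup(page: bytes, prev_backup_start_lsn: int) -> bool:
    # Include the page if it was touched at or after the previous
    # backup's start LSN; an all-zero (never initialized) page is
    # included as well, since it carries no usable LSN.
    lsn = page_lsn(page)
    return lsn == 0 or lsn >= prev_backup_start_lsn
```

As the question itself points out, this only works for forks whose page LSNs are reliably bumped on every change.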

Question 2. File dates.
Is it safe to use the file modification time to detect that a file has changed since 
the previous backup? If the file's mtime is before the start of the previous backup, 
I just add it to the "skip list" [3].
My assumption is: every time a file is changed in the filesystem, its modification 
time is updated to a higher value.
Is this assumption valid for most platforms and filesystems in use? Or can I 
check whether the FS has this "capability"?
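The mtime heuristic of Question 2 can be sketched like this (illustrative Python; WAL-G itself is Go). The comparison is deliberately strict: a file is skipped only when its mtime is strictly older than the previous backup's start time, which leans on exactly the assumption being asked about:

```python
import os

def can_skip(path: str, prev_backup_start_ts: float) -> bool:
    # Skip only files whose modification time predates the start of the
    # previous backup; equal or newer mtimes force a re-copy. This is
    # safe only if the filesystem always moves mtime forward on change
    # (coarse mtime granularity and clock adjustments are the known hazards).
    return os.stat(path).st_mtime < prev_backup_start_ts
```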

Thank you for your attention. I'll be glad to receive any information or pointers 
on this matter.


Best regards, Andrey Borodin, Yandex.

[0] https://github.com/wal-g/wal-g/pull/29
[1] 
https://github.com/wal-g/wal-g/pull/29/files#diff-d77406e827f5f947d4d4a1e6d76c1f4eR114
[2] 
https://github.com/wal-g/wal-g/pull/29/files#diff-d77406e827f5f947d4d4a1e6d76c1f4eR50
[3] 
https://github.com/wal-g/wal-g/pull/29/files#diff-f5c8f0067297f98eb5acc6e2c6b1b234R87



Re: [HACKERS] Allow GiST opcalsses without compress\decompres functions

2017-09-20 Thread Andrey Borodin
Hello Tom! Thanks for committing the patch!

> On 20 Sep 2017, at 08:38, Tom Lane <t...@sss.pgh.pa.us> wrote:
> 
> Andrey Borodin <x4...@yandex-team.ru> writes:
>> [ 0001-Allow-uncompressed-GiST-4.patch ]
> 
> ... There's still room
> to improve the contrib opclasses

I have one important question regarding compress\decompress functions.

You mentioned that there probably cannot be TOASTed values in the index.
I need to settle this question in a more deterministic way.
Can you point me at the relevant code, or tell me whom to ask:
can there be TOASTed values in index tuples?

If the answer is NO, we can get rid of much more useless code.

Best regards, Andrey Borodin.



Re: [HACKERS] Index Only Scan support for cube

2017-09-19 Thread Andrey Borodin
Hi hackers!
> On 23 May 2017, at 14:53, Andrew Borodin <boro...@octonica.com> wrote:
> 
> Here's a small patch that implements fetch function necessary for
> Index Only Scans that use cube data type.

Tom Lane has just committed d3a4f89 (Allow no-op GiST support functions to be 
omitted). Thanks, Tom! :)
The "Index Only Scan support for cube" patch is now obsolete. I'm working on 
another similar patch for contrib modules, to support GiST IOS and remove no-op 
support functions.

Best regards, Andrey Borodin.



Re: [HACKERS] Allow GiST opcalsses without compress\decompres functions

2017-09-15 Thread Andrey Borodin
Dmitry and Alexander, thank you for looking into the patch!

> On 14 Sep 2017, at 18:42, Alexander Korotkov <a.korot...@postgrespro.ru> wrote:
> It would be good if someone wrote a patch removing useless 
> compress/decompress methods from builtin and contrib GiST opclasses, 
> especially when it gives the benefit of enabling IndexOnlyScan.
If the patch is accepted, I'll either do this myself for the November commitfest or 
charge some students from the Yandex Data School with it (they earn points in their 
algorithms course for contributing to OSS).

> BTW, this change in the patch looks suspicious to me.
> 
> We typically avoid changing formatting in fragments of code not directly 
> touched by the patch.

Thanks for spotting that. Here's a fixed version.

Best regards, Andrey Borodin.



0001-Allow-uncompressed-GiST-4.patch
Description: Binary data




Re: [HACKERS] Hooks to track changed pages for backup purposes

2017-09-13 Thread Andrey Borodin
Hi!
Thank you for your interest and experiment results.
> On 13 Sep 2017, at 15:43, Ants Aasma <ants.aa...@eesti.ee> wrote:
> 
> On Thu, Aug 31, 2017 at 9:02 AM, Andrey Borodin <x4...@yandex-team.ru> wrote:
>> When we have accumulated diff blocknumbers for most of segments we can 
>> significantly speed up method of WAL scanning. If we have blocknumbers for 
>> all segments we can skip WAL scanning at all.
> 
> Have you measured that the WAL scanning is actually a significant
> issue? As a quick experiment I hacked up pg_waldump to just dump block
> references to stdout in binary format. It scanned 2.8GB of WAL in 3.17
> seconds, outputting 9.3M block refs per second. WAL was generated with
> pgbench, synchronous commit off, using 4 cores for 10 minutes - making
> the ratio of work from generating WAL to parsing it be about 750:1.
> 

No, I have not done this measurement myself. Sure, parsing WAL while it is in 
RAM is not very expensive, though it can be even cheaper before the WAL is 
formatted.
I just want to figure out the best place for this, given that the backup executable 
shares CPUs with the postmaster.

Best regards, Andrey Borodin.





Re: [HACKERS] Hooks to track changed pages for backup purposes

2017-09-12 Thread Andrey Borodin
Hi Tomas! Thank you for looking into that patch.

> On 8 Sep 2017, at 01:53, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:
> 
> A few more comments:
> 
> * The patch defines wal_switch_hook, but it's never called.
That call was missing; that's a bug, thanks for spotting it.

> * I see there are conditions like this:
> 
>if(xlogreader->blocks[nblock].forknum == MAIN_FORKNUM)
> 
> Why is it enough to restrict the block-tracking code to main fork?
> Aren't we interested in all relation forks?
The FSM, VM and other forks are small enough to be taken as whole files.

> I guess you'll have to explain
> what the implementation of the hooks is supposed to do, and why these
> locations for hook calls are the right ones. It's damn impossible to
> validate the patch without that information.
> 
> Assuming you still plan to use the hook approach ...
Yes, I still think hooking is a good idea, but you are right - I need a prototype 
first. I'll mark the patch as Returned with Feedback until the prototype is 
implemented.

> 
>>> There
>>> are no arguments fed to this hook, so modules would not be able to
>>> analyze things in this context, except shared memory and process
>>> state?
>> 
>>> 
>>> Those hooks are put in hot code paths, and could impact performance of
>>> WAL insertion itself.
>> I do not think sending a few bytes to a cached array is comparable to a disk
>> write of an XLog record. Checking the func ptr is even cheaper with correct
>> branch prediction.
>> 
> 
> That seems somewhat suspicious, for two reasons. Firstly, I believe we
> only insert the XLOG records into WAL buffer here, so why should there
> be any disk write related? Or do you mean the final commit?
Yes, I mean that ultimately we will be waiting for the disk. A hundred empty pointer 
checks are negligible in comparison with a disk write.
> 
> But more importantly, doesn't this kind of information require some
> durability guarantees? I mean, if it gets lost during server crashes or
> restarts, doesn't that mean the incremental backups might miss some
> buffers? I'd guess the hooks will have to do some sort of I/O, to
> achieve that, no?
We need durability only at the level of one segment. If we do not have the info 
for a segment, we can just rescan it.
If we send a segment to S3 as one file, we are sure of its integrity. And this 
I/O can be async.

PTRACK, in its turn, switches bits in the fork's buffers, which are written by the 
checkpointer and... well... recovered during recovery, by the usual WAL replay.


> From this POV, the idea to collect this information on the backup system
> (WAL archive) by pre-processing the arriving WAL segments seems like the
> most promising. It moves the work to another system, the backup system
> can make it as durable as the WAL segments, etc.

Well, in some not so rare cases users encrypt backups and send them to S3, and there 
is no system with CPUs that can handle that WAL parsing. Currently I'm 
considering a mock prototype for WAL-G that works exactly this way.

Your comments were very valuable; thank you for looking into the patch and 
joining the discussion.

Best regards, Andrey Borodin.




Re: [HACKERS] Allow GiST opcalsses without compress\decompres functions

2017-09-11 Thread Andrey Borodin

> On 11 Sep 2017, at 12:57, Dmitriy Sarafannikov <dsarafanni...@yandex.ru> wrote:
> Hi Andrew! Thanks for the patch, but patch 
> 0001-allow-uncompressed-Gist-2.patch no longer applies on the current master 
> branch.
> Could you please rebase it?
Sure, see the attachment. Thanks for looking into the patch!

Best regards, Andrey Borodin. 


0001-Allow-uncompressed-GiST-3.patch
Description: Binary data



Re: [HACKERS] Hooks to track changed pages for backup purposes

2017-09-01 Thread Andrey Borodin
Thank you for your reply, Michael! Your comments are valuable, especially in 
the world of backups.

> On 31 Aug 2017, at 19:44, Michael Paquier <michael.paqu...@gmail.com> wrote:
> Such things are not Postgres-C like. 
Will be fixed.

> I don't understand what xlog_begin_insert_hook() is good for.
It memsets the control structures and an array of block numbers and relfilenodes of 
size XLR_MAX_BLOCK_ID.

> There
> are no arguments fed to this hook, so modules would not be able to
> analyze things in this context, except shared memory and process
> state?

> 
> Those hooks are put in hot code paths, and could impact performance of
> WAL insertion itself.
I do not think sending a few bytes to a cached array is comparable to the disk write 
of an XLog record. Checking the function pointer is even cheaper with correct branch 
prediction.

> So you basically move the cost of scanning WAL
> segments for those blocks from any backup solution to the WAL
> insertion itself. Really, wouldn't it be more simple to let for
> example the archiver process to create this meta-data if you just want
> to take faster backups with a set of segments? Even better, you could
> do a scan after archiving N segments, and then use M jobs to do this
> work more quickly. (A set of background workers could do this job as
> well).
I like the idea of doing this during archiving. It is a different trade-off 
between OLTP performance and backup performance. Essentially, it is parsing WAL some 
time before taking the backup. The best thing about it is that it uses CPUs that 
usually spin in an idle loop on backup machines.

> In the backup/restore world, backups can be allowed to be taken at a
> slow pace, what matters is to be able to restore them quickly.
Backups are taken much more often than they are restored.

> In short, anything moving performance from an external backup code path
> to a critical backend code path looks like a bad design to begin with.
> So I am dubious that what you are proposing here is a good idea.
I will think about it more. This proposal takes a vanishingly small part of 
backend performance, but, indeed, a nonzero part.

Again, thank you for your time and comments.

Best regards, Andrey Borodin.



[HACKERS] Hooks to track changed pages for backup purposes

2017-08-31 Thread Andrey Borodin
Hi hackers!

Here is a patch with hooks that I consider sufficient for implementing 
incremental backup with page tracking as an extension.

Recently I posted these things to the thread "Adding hook in BufferSync 
for backup purposes" [0], but here I start a separate thread since the Subject field 
of the previous discussion was technically wrong.

Currently, various incremental backup solutions can use one of these methods to take 
a diff of a cluster since some LSN:
1. Check the LSN of every page.
2. Scan WAL and collect the block numbers of changed pages.
2. Scan WAL and collect block numbers of changed pages

I propose adding hooks:
1. When a buffer is registered in WAL insertion.
This hook is supposed to place block numbers in temporary storage, like a 
backend-local static array.
2. When a WAL record insertion is started and finished, to transfer 
block numbers to more concurrency-protected storage.
3. When the WAL segment is switched, to initiate async transfer of accumulated 
block numbers to durable storage.

Once we have accumulated diff block numbers for most segments, we can 
significantly speed up the WAL-scanning method. If we have block numbers for all 
segments, we can skip WAL scanning altogether.
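A rough model of the data flow these three hooks imply (illustrative Python; the real hooks would be C callbacks inside the backend, and all names here are hypothetical):

```python
from collections import defaultdict

class ChangeTracker:
    def __init__(self):
        self.record_blocks = []                   # hook 1 target: backend-local scratch
        self.segment_changes = defaultdict(set)   # hook 2 target: per-segment sets

    def on_buffer_registered(self, relfilenode, blockno):
        # Hook 1: a buffer was registered for the WAL record being built.
        self.record_blocks.append((relfilenode, blockno))

    def on_record_insert_finished(self, segno):
        # Hook 2: the record went in; move its blocks to protected storage.
        self.segment_changes[segno].update(self.record_blocks)
        self.record_blocks.clear()

    def on_segment_switch(self, segno, flush):
        # Hook 3: the segment is done; ship its block set to durable storage
        # asynchronously (flush stands in for that transfer).
        flush(segno, self.segment_changes.pop(segno, set()))
```

A backup tool that later finds a durable block set for a segment can skip scanning that segment's WAL, which is the speedup described above.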

I think these proposed hooks can enable more efficient backups. What do you 
think?

Any ideas will be appreciated. This patch is influenced by the code of PTRACK 
(Yury Zhuravlev and Postgres Professional).

Best regards, Andrey Borodin.
 

[0] 
https://www.postgresql.org/message-id/flat/20051502087457%40webcorp01e.yandex-team.ru#20051502087...@webcorp01e.yandex-team.ru



0001-hooks-to-watch-for-changed-pages.patch
Description: Binary data



Re: [HACKERS] Adding hook in BufferSync for backup purposes

2017-08-28 Thread Andrey Borodin
Hi hackers!

> On 8 Aug 2017, at 11:27, Tom Lane <t...@sss.pgh.pa.us> wrote:
> 
> My point is not to claim that we mustn't put a hook there.  It's that what
> such a hook could safely do is tightly constrained, and you've not offered
> clear evidence that there's something useful to be done within those
> constraints.  Alvaro seems to think that the result might be better
> reached by hooking in at other places, and my gut reaction is similar.
> 


I was studying the work done by Marco and Gabriel on tracking pages for 
incremental backup, and I found the PTRACK patch by Yury Zhuravlev and 
PostgresPro [0]. This work seems solid and thoughtful. I have drafted a new 
prototype of hooks enabling incremental backup as an extension, based on PTRACK 
calls.

If you look at the original patch, you can see that the attached patch differs 
slightly: the functionality to track the end of critical sections is omitted. I 
have not included it in this draft because it changes a very sensitive place, 
even for those who do not need it.

A subscriber of the proposed hook must remember that it is invoked under a 
critical section. But there cannot be more than XLR_MAX_BLOCK_ID blocks for one 
WAL record. The proposed draft does not add any functionality to track finished 
transactions or any atomic unit of work; it just provides a flow of changed 
block numbers. Neither does this draft make any assumption about where to store 
this information. I suppose subscribers could trigger asynchronous writes 
somewhere once the info for a given segment is accumulated (do we need a hook on 
segment end then?). During incremental backup you can then skip scanning those 
WAL segments for which you have an accumulated changeset of block numbers.

The thing which is not clear to me: if we are accumulating block numbers under a 
critical section, then we must place them into a preallocated array. When is the 
best time to take these block numbers away, to empty that array and avoid 
overflow? PTRACK has an array of XLR_MAX_BLOCK_ID length and saves this array at 
the end of each critical section. But I want to avoid intervening in critical 
sections.

Thank you for your attention, any thoughts will be appreciated.

Best regards, Andrey Borodin.

[0] https://gist.github.com/stalkerg/ab833d94e2f64df241f1835651e06e4b


0001-hooks-to-watch-for-changed-pages.patch
Description: Binary data



Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-08-18 Thread Andrey Borodin
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

Oops, I missed those checkboxes. Sorry for the noise. Here's the correct 
checklist: installcheck-world passes, documented as expected. The feature is there.

Best regards, Andrey Borodin.


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-08-18 Thread Andrey Borodin
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:   tested, failed
Spec compliant:   tested, failed
Documentation:tested, failed

Hi! I've looked into the patch.

There is not much code. The code itself is pretty clear and well documented. 
I've found just one small typo, "transactionn", in the 
HeapTupleSatisfiesNonVacuumable() comments.

The chosen snapshot type is not a solution to the author's problem at hand. 
Still, the implemented SnapshotNonVacuumable is closer to a rational choice than 
the currently used SnapshotDirty for range sketching. As far as I can see, this 
is what get_actual_variable_range() is used for, despite the word "actual" in 
its name. The only place where get_actual_variable_range() is used for an actual 
range is preprocessed out in get_variable_range():
	/*
	 * XXX It's very tempting to try to use the actual column min and max, if
	 * we can get them relatively-cheaply with an index probe.  However, since
	 * this function is called many times during join planning, that could
	 * have unpleasant effects on planning speed.  Need more investigation
	 * before enabling this.
	 */
#ifdef NOT_USED
	if (get_actual_variable_range(root, vardata, sortop, min, max))
		return true;
#endif

I think it would be good if we could have something like "give me the actual 
values, but only if they are on the first index page", similar to conditional 
locks. But I think this patch is a step in the right direction.

Best regards, Andrey Borodin.


Re: [HACKERS] GSoC 2017 : Patch for predicate locking in Gist index

2017-08-16 Thread Andrey Borodin
Hi, hackers!

> 21 июня 2017 г., в 12:52, Shubham Barai <shubhambara...@gmail.com> написал(а):
> 
> Hi,
> ...
> I know that. This is the old version of the patch. I had sent updated patch 
> later. Please have a look at updated patch.
> 
> Regards,
> Shubham


Here is some information for reviewers. This applies to the patches for GiST, 
Hash, GIN and SP-GiST.

As a mentor of Shubham's GSoC project, for every patch regarding predicate 
locking I have checked:
0. Correctness of the implementation (the placement of predicate lock function 
calls; this point deserves special attention anyway)
1. The patch applies and installcheck-world passes
2. The patch contains new isolation tests
3. These tests fail if the indexes do not implement predicate locking
4. The patch updates the documentation

Shubham has also checked that the patches work (installcheck-world) on 1 kB, 
8 kB and 32 kB pages.

Best regards, Andrey Borodin, Yandex.

Re: [HACKERS] Adding hook in BufferSync for backup purposes

2017-08-07 Thread Andrey Borodin
Alvaro, Tom, thank you for your valuable comments.


> Alvaro:
> I remember discussing the topic of differential base-backups with
> somebody (probably Marco and Gabriele).  The idea we had was to have a
> new relation fork which stores an LSN for each group of pages,
> indicating the LSN of the newest change to those pages.  The backup tool
> "scans" the whole LSN fork, and grabs images of all pages that have LSNs
> newer than the one used for the previous base backup.
Thanks for the pointer. I've found the discussions and now I'm in the process of 
extracting the knowledge from there.

> I think it should be at the point where the buffer is
> modified (i.e. when WAL is written) rather than when it's checkpointed
> out.
WAL is flushed before the actual pages are written to disk (sent to the kernel). 
I'd like to notify extensions right after we are sure the pages were flushed.
But you are right, BufferSync is not a good place for this:
1. It lacks LSNs.
2. It's not the only place where flushing happens: bgwriter goes through the 
nearby function FlushBuffer(), and many AMs write directly to smgr (for example, 
when a metapage is created).

BufferSync() seemed a very convenient and efficient place for recording info on 
dirty pages, already sorted and grouped by tablespace, but it is absolutely 
incorrect to do it there. I'll look for a better place.

> 
> 7 авг. 2017 г., в 18:37, Tom Lane <t...@sss.pgh.pa.us> написал(а):
> 
> Yeah.  Keep in mind that if the extension does anything at all that could
> possibly throw an error, and if that error condition persists across
> multiple tries, you will have broken the database completely: it will
> be impossible to complete a checkpoint, and your WAL segment pool will
> grow until it exhausts disk.  So the idea of doing something that involves
> unspecified extension behavior, especially possible interaction with
> an external backup agent, right there is pretty terrifying.
I think that an API for extensions should tend to protect the developer from 
breaking everything, but may allow it, with precaution warnings in the docs and 
comments. Please let me know if this assumption is incorrect.

> 
> Other problems with the proposed patch: it misses coverage of
> BgBufferSync, and I don't like exposing an ad-hoc structure like
> CkptTsStatus as part of an extension API.  The algorithm used by
> BufferSync to schedule buffer writes has changed multiple times
> before and doubtless will again; if we're going to have a hook
> here it should depend as little as possible on those details.
OK, now I see that «buf_internals.h» had the word "internals" in it for a 
reason. Thanks for pointing that out; I didn't know about the changes in these 
algorithms.

Best regards, Andrey Borodin, Yandex.





Re: [HACKERS] pg_background contrib module proposal

2017-01-19 Thread Andrey Borodin
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, failed

I'll summarize here my thoughts as a reviewer on the current state of 
pg_background:
1. The current version of the code [1] is fine, from my point of view. I have no 
suggestions for improving it. There is no documentation, but the code is commented.
2. The patch depends on background sessions from the same commitfest.
3. More features could exist, but for v1 there are surely enough.
4. There is some controversy about where the implemented feature should live: in 
a separate extension (as in this patch), in dblink, in some PL API, in FDW, or 
somewhere else. I think a new extension is an appropriate place for the feature, 
but I'm not certain.
Summarizing these points, the appropriate statuses of the patch are 'Ready for 
Committer' or 'Rejected'. Between these two I choose 'Ready for Committer': I 
think the patch is committable (after background sessions).

Best regards, Andrey Borodin.

The new status of this patch is: Ready for Committer



Re: [HACKERS] autonomous transactions

2017-01-07 Thread Andrey Borodin
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

Here's a review of the Background Sessions v2 patch.
===Purpose===
The patch provides an API for controlling another backend. It also contains 
Python support for calling the C API.

===Overall status===
The patch applies cleanly to current HEAD.
It contains a new test and passes the existing tests.
It contains sufficient documentation.
It contains 2 TODO items; I am not sure it's OK to leave it so.
Another patch from this commitfest (pg_background) is based on this patch.

===Suggestions===
I haven't found a way to safely acquire the status of a session (without the 
possibility of an ereport(ERROR)).
I do not see how to pass massive data, except as one long char* SQL string; all 
the values have to be formatted as text.
The BackgroundSessionStart() result does not contain the PID. This functionality 
is expected by pg_background (though it can be added separately by 
pg_background). I suppose this is done to prevent API users from accessing the 
internals of the BackgroundSession structure, but some information has to be 
public anyway.
The bgsession.c code contains very few comments.
I do not think that a switch inside a switch (see bgsession_worker_main()) is 
easy to follow.

===Conclusion===
There’s always something to improve, but I think that this patch is ready for 
committer.

PS. I've read the dblink patches, but this review does not apply to them. I 
suppose the dblink refactoring would be useful and the added functionality 
valuable, so I think these patches deserve a separate commitfest entry.

Best regards, Andrey Borodin.

The new status of this patch is: Ready for Committer



Re: [HACKERS] GiST penalty functions [PoC]

2016-09-08 Thread Andrey Borodin
Thank you for your attention to details, Mikhail.

pack_float_good() looks good. But I'm not sure an inline struct initializer is 
allowed under ANSI C. Converting it to the regular ancient form, b.fp = v;, 
won't change the compiled result, would it?

Regards, Andrey Borodin.



Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]

2016-09-08 Thread Andrey Borodin
That storage assertion fired during a usual UPDATE table SET x = random() 
without conditions. Also, make check fails without it (for BRIN usage; GiST is 
OK with it).
I cannot type the quotation properly; I have been on a phone for a long time.

Regards, Andrey Borodin.



Re: [HACKERS] WIP: Covering + unique indexes.

2016-08-14 Thread Andrey Borodin
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, failed
Spec compliant:   tested, passed
Documentation:tested, passed

Hi hackers!

I've read the patch and here is my code review.

==PURPOSE
I've used this feature from time to time with MS SQL. From my experience, 
INCLUDE is a 'sugar on top' feature.
Some MS SQL classes do not even mention INCLUDE despite it being there since 
2005 (though classes omit lots of important things, so that is not a very 
valuable indicator).
But those who use it use it whenever possible. For example, the system view with 
recommended indices rarely lists one without INCLUDE columns.
So, this feature is very important from the perspective of converting MS SQL 
DBAs to PostgreSQL. This is how I see it.

SUGGESTIONS==
0. The index build is broken. This script 
https://github.com/x4m/pggistopt/blob/8ad65d2e305e98c836388a07909af5983dba9c73/test.sql
 SEGFAULTs and may cause a situation where you cannot insert anything into the 
table (I think dropping the index would help, but I didn't test this).
1. I think the MS SQL syntax INCLUDE instead of INCLUDING would be better (for 
the purpose listed above).
2. An empty line was added in ruleutils.c. Is it there for a reason?
3. Now we have indnatts and indnkeyatts instead of just indnatts. I think it is 
worth considering renaming indnatts to something different from the old name; 
someone somewhere could still suppose it is the number of key columns.

PERFORMANCE==
Due to suggestion number 0, I could not measure the performance of index build. 
The index crashes when there are more than 1.1 million rows in the table.
Performance test script is here 
https://github.com/x4m/pggistopt/blob/f206b4395baa15a2fa42897eeb27bd555619119a/test.sql
The test scenario is the following:
1. Create a table, then create an index, then add data.
2. Make a query touching data in the INCLUDING columns.
This scenario is tested against a table with:
A. An index that does not contain the touched columns, just the PK.
B. An index with all columns as keys.
C. An index with the PK as keys, INCLUDING all other columns.

Tests were executed 5 times on an Ubuntu VM under Hyper-V (i5-2500 CPU, 16 GB of 
RAM, SSD disk).
Time to insert 10M rows:
A. AVG 110 seconds STD 4.8
B. AVG 121 seconds STD 2.0
C. AVG 111 seconds STD 5.7
Inserts into the INCLUDING index are almost as fast as inserts into the index 
without extra columns.

Time to run SELECT query:
A. AVG 2864 ms STD 794
B. AVG 2329 ms STD 84
C. AVG 2293 ms STD 58
Selects with INCLUDING columns are almost as fast as with the full index.

Index size (deterministic measure, STD = 0)
A. 317 MB
B. 509 MB
C. 399 MB
The index size is in the middle between the full index and the minimal index.

I think these numbers agree with the expectations for the feature.

CONCLUSION==
This patch brings a useful and important feature. The index build must be 
repaired; my other points are only suggestions.



Best regards, Andrey Borodin, Octonica & Ural Federal University.

The new status of this patch is: Waiting on Author



Re: [HACKERS] Optimizing numeric SUM() aggregate

2016-08-03 Thread Andrey Borodin

> 3 авг. 2016 г., в 22:47, Andrey Borodin <amboro...@acm.org> написал(а):
> 
> make installcheck-world:  tested, failed
> Implements feature:   tested, failed
> Spec compliant:   tested, failed
> Documentation:tested, failed
> 
> This is a review of a patch "Optimizing numeric SUM() aggregate" by Heikki 
> Linnakangas
Oh, I failed to use those checkboxes in the web app. In my review, please read 
"tested, failed" as "tested, passed". I'm sorry.

Best regards, Andrey Borodin.



Re: [HACKERS] Optimizing numeric SUM() aggregate

2016-08-03 Thread Andrey Borodin
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:   tested, failed
Spec compliant:   tested, failed
Documentation:tested, failed

This is a review of a patch "Optimizing numeric SUM() aggregate" by Heikki 
Linnakangas
https://www.postgresql.org/message-id/flat/c0545351-a467-5b76-6d46-4840d1ea8...@iki.fi#c0545351-a467-5b76-6d46-4840d1ea8...@iki.fi
This review contains summarization of all my posts regarding this patch and a 
little bit of additional suggestions.

Contents & Purpose
==
This patch improves the performance of aggregate computation by delaying carry 
propagation in NBASE-digit arbitrary-length arithmetic. This is possible because 
a 32-bit int is used for the storage of every NBASE digit, where NBASE is 10000.
The patch changes only the file numeric.c. The folder of numeric.c does not 
contain a README, which is why all documentation of the patch is done in 
comments. I consider the documentation sufficient.

Initial Run
===
The patch can be applied cleanly to current HEAD. Regression tests pass before 
and after patch application.

Performance
===
I've tested patch with this query
CREATE TABLE avg_test AS SELECT (random() * 999)::decimal(5,2) as d FROM 
generate_series(1, 100) s;
SELECT avg(d) a , avg(d+0) s0 , avg(d+1) s1 , avg(d+2) s2, avg(d+3) s3 , 
avg(d+4) s4 , avg(d+5) s5, avg(d+6) s6 , avg(d+7) s7, avg(d+8) s8 , avg(d+9) s9 
FROM avg_test;

Test specs were: Ubuntu 14 LTS VM, dynamic RAM, hypervisor Windows
Server 2016, SSD disk, core i5-2500. Configuration: out of the box master make.

On 10 test runs, the timing of the select statement was: AVG 3739.8 ms, STD 435.4.
With the patch applied (as is): AVG 3017.8 ms, STD 319.9.

I suppose this proves the performance impact of the patch. I don't think sum 
calculation was a common bottleneck, but the patch will certainly slightly 
improve the performance of a very large number of queries.

Suggestions
===
1. Currently the overflow is carried every  addition. I suggest that it is 
possible to do it only every (INT32_MAX - INT32_MAX / NBASE) / (NBASE - 1) 
additions. Please note that with this overflow count it will be necessary to 
check whether two finishing carry passes are required.
2. The aggregate and numeric regression tests do not contain any test case 
covering overflows. I recommend adding a sum of 1 000 000 values of 99 999 999. 
Maybe you will come up with more clever overflow testing.

Conclusion
==
This patch is an important and thorough piece of work, but I'd suggest some 
polishing on the two points listed above.

The new status of this patch is: Waiting on Author
