Re: [HACKERS] patch proposal

2017-03-27 Thread David Steele
On 3/26/17 7:34 PM, Venkata B Nagothi wrote:
> Hi David, 
> 
> On Thu, Mar 23, 2017 at 4:21 AM, David Steele  > wrote:
> 
> On 3/21/17 8:45 PM, Venkata B Nagothi wrote:
> 
> On Tue, Mar 21, 2017 at 8:46 AM, David Steele
> mailto:da...@pgmasters.net>
> 
> Unfortunately, I don't think the first patch
> (recoveryStartPoint)
> will work as currently implemented.  The problem I see is
> that the
> new function recoveryStartsHere() depends on pg_control
> containing a
> checkpoint right at the end of the backup.  There's no guarantee
> that this is true, even if pg_control is copied last.  That
> means a
> time, lsn, or xid that occurs somewhere in the middle of the
> backup
> can be selected without complaint from this code depending
> on timing.
> 
> 
> Yes, that is true.  The latest best position, checkpoint
> position, xid
> and timestamp of the restored backup of the backup is shown up
> in the
> pg_controldata, which means, that is the position from which the
> recovery would start.
> 
> 
> Backup recovery starts from the checkpoint in the backup_label, not
> from the checkpoint in pg_control.  The original checkpoint that
> started the backup is generally overwritten in pg_control by the end
> of the backup.
> 
> 
> Yes, I totally agree. My initial intention was to compare the recovery
> target position(s) with the contents in the backup_label, but, then, the
> checks would fails if the recovery is performed without the backup_label
> file. Then, i decided to compare the recovery target positions with the
> contents in the pg_control file.
> 
> 
> Which in-turn means, WALs start getting replayed
> from that position towards --> minimum recovery position (which
> is the
> end backup, which again means applying WALs generated between
> start and
> the end backup) all the way through to  --> recovery target
> position.
> 
> 
> minRecoveryPoint is only used when recovering a backup that was made
> from a standby.  For backups taken on the master, the backup end WAL
> record is used.
> 
> The best start position to check with would the position shown
> up in the
> pg_control file, which is way much better compared to the current
> postgresql behaviour.
> 
> 
> I don't agree, for the reasons given previously.
> 
> 
> As explained above, my intention was to ensure that the recovery start
> positions checks are successfully performed irrespective of the presence
> of the backup_label file.
> 
> I did some analysis before deciding to use pg_controldata's output
> instead of backup_label file contents.
> 
> Comparing the output of the pg_controldata with the contents of
> backup_label contents.
> 
> *Recovery Target LSN*
> 
> START WAL LOCATION (which is 0/9C28) in the backup_label =
> pg_controldata's latest checkpoint's REDO location (Latest
> checkpoint's REDO location:  0/9C28)
> 
> *Recovery Target TIME*
> 
> backup start time in the backup_label (START TIME: 2017-03-26
> 11:55:46 AEDT) = pg_controldata's latest checkpoint time (Time of
> latest checkpoint :  Sun 26 Mar 2017 11:55:46 AM AEDT)
> 
> *Recovery Target XID*
> 
> To begin with backup_label does contain any start XID. So, the only
> option is to depend on pg_controldata's output. 
> After a few quick tests and thorough observation, i do notice that,
> the pg_control file information is copied as it is to the backup
> location at the pg_start_backup. I performed some quick tests by
> running few transactions between pg_start_backup and pg_stop_backup.
> So, i believe, this is ideal start point for WAL replay.
> 
> Am i missing anything here ?

You are making assumptions about the contents of pg_control vs.
backup_label based on trivial tests.  With PG defaults, the backup must
run about five minutes before the values in pg_control and backup_label
will diverge.  Even if pg_control and backup_label do match, those are
the wrong values to use, and will get more incorrect the longer the
backup runs.

I believe there is a correct way to go about this, at least for time and
LSN, and I don't think your very approximate solution will pass muster
with a committer.

Since we are nearly at the end of the CF, I have marked this submission
"Returned with Feedback".

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch proposal

2017-03-26 Thread Venkata B Nagothi
Hi David,

On Thu, Mar 23, 2017 at 4:21 AM, David Steele  wrote:

> On 3/21/17 8:45 PM, Venkata B Nagothi wrote:
>
>> On Tue, Mar 21, 2017 at 8:46 AM, David Steele >
>> Unfortunately, I don't think the first patch (recoveryStartPoint)
>> will work as currently implemented.  The problem I see is that the
>> new function recoveryStartsHere() depends on pg_control containing a
>> checkpoint right at the end of the backup.  There's no guarantee
>> that this is true, even if pg_control is copied last.  That means a
>> time, lsn, or xid that occurs somewhere in the middle of the backup
>> can be selected without complaint from this code depending on timing.
>>
>>
>> Yes, that is true.  The latest best position, checkpoint position, xid
>> and timestamp of the restored backup of the backup is shown up in the
>> pg_controldata, which means, that is the position from which the
>> recovery would start.
>>
>
> Backup recovery starts from the checkpoint in the backup_label, not from
> the checkpoint in pg_control.  The original checkpoint that started the
> backup is generally overwritten in pg_control by the end of the backup.


Yes, I totally agree. My initial intention was to compare the recovery
target position(s) with the contents in the backup_label, but, then, the
checks would fails if the recovery is performed without the backup_label
file. Then, i decided to compare the recovery target positions with the
contents in the pg_control file.


> Which in-turn means, WALs start getting replayed
>> from that position towards --> minimum recovery position (which is the
>> end backup, which again means applying WALs generated between start and
>> the end backup) all the way through to  --> recovery target position.
>>
>
> minRecoveryPoint is only used when recovering a backup that was made from
> a standby.  For backups taken on the master, the backup end WAL record is
> used.
>
> The best start position to check with would the position shown up in the
>> pg_control file, which is way much better compared to the current
>> postgresql behaviour.
>>
>
> I don't agree, for the reasons given previously.
>

As explained above, my intention was to ensure that the recovery start
positions checks are successfully performed irrespective of the presence of
the backup_label file.

I did some analysis before deciding to use pg_controldata's output instead
of backup_label file contents.

Comparing the output of the pg_controldata with the contents of
backup_label contents.

*Recovery Target LSN*

START WAL LOCATION (which is 0/9C28) in the backup_label =
pg_controldata's latest checkpoint's REDO location (Latest checkpoint's
REDO location:  0/9C28)

*Recovery Target TIME*

backup start time in the backup_label (START TIME: 2017-03-26 11:55:46
AEDT) = pg_controldata's latest checkpoint time (Time of latest checkpoint
:  Sun 26 Mar 2017 11:55:46 AM AEDT)

*Recovery Target XID*

To begin with backup_label does contain any start XID. So, the only option
is to depend on pg_controldata's output.
After a few quick tests and thorough observation, i do notice that, the
pg_control file information is copied as it is to the backup location at
the pg_start_backup. I performed some quick tests by running few
transactions between pg_start_backup and pg_stop_backup. So, i believe,
this is ideal start point for WAL replay.

Am i missing anything here ?



The tests pass (or rather fail as expected) because they are written
>> using values before the start of the backup.  It's actually the end
>> of the backup (where the database becomes consistent on recovery)
>> that defines where PITR can start and this distinction definitely
>> matters for very long backups.  A better test would be to start the
>> backup, get a time/lsn/xid after writing some data, and then make
>> sure that time/lsn/xid is flagged as invalid on recovery.
>>
>> The current behaviour is that, if the XID shown-up by the pg_controldata
>> is say 1400 and the specified recovery_target_xid is say 200, then,
>> postgresql just completes the recovery and promotes at the immediate
>> consistent recovery target, which means, the DBA has to all the way
>> start the restoration and recovery process again to promote the database
>> at the intended position.
>>
>
> I'm not saying that the current behavior is good, only that the proposed
> behavior is incorrect as far as I can tell.


Please consider my explanation above and share your thoughts.


>
> It is also problematic to assume that transaction IDs commit in
>> order. If checkPoint.latestCompletedXid contains 5 then a recovery
>> to xid 4 may or may not be successful depending on the commit
>> order.  However, it appears in this case the patch would disallow
>> recovery to 4.
>
>
>> If the txid 4 is the recovery target xid, then, the appropriate backup
>> taken previous to txid 4 must be used or as an alternative recovery
>> target like recovery_target

Re: [HACKERS] patch proposal

2017-03-22 Thread David Steele

On 3/21/17 8:45 PM, Venkata B Nagothi wrote:

On Tue, Mar 21, 2017 at 8:46 AM, David Steele 

Backup recovery starts from the checkpoint in the backup_label, not from 
the checkpoint in pg_control.  The original checkpoint that started the 
backup is generally overwritten in pg_control by the end of the backup.



Which in-turn means, WALs start getting replayed
from that position towards --> minimum recovery position (which is the
end backup, which again means applying WALs generated between start and
the end backup) all the way through to  --> recovery target position.


minRecoveryPoint is only used when recovering a backup that was made 
from a standby.  For backups taken on the master, the backup end WAL 
record is used.



The best start position to check with would the position shown up in the
pg_control file, which is way much better compared to the current
postgresql behaviour.


I don't agree, for the reasons given previously.


The tests pass (or rather fail as expected) because they are written
using values before the start of the backup.  It's actually the end
of the backup (where the database becomes consistent on recovery)
that defines where PITR can start and this distinction definitely
matters for very long backups.  A better test would be to start the
backup, get a time/lsn/xid after writing some data, and then make
sure that time/lsn/xid is flagged as invalid on recovery.

The current behaviour is that, if the XID shown-up by the pg_controldata
is say 1400 and the specified recovery_target_xid is say 200, then,
postgresql just completes the recovery and promotes at the immediate
consistent recovery target, which means, the DBA has to all the way
start the restoration and recovery process again to promote the database
at the intended position.


I'm not saying that the current behavior is good, only that the proposed 
behavior is incorrect as far as I can tell.



It is also problematic to assume that transaction IDs commit in
order. If checkPoint.latestCompletedXid contains 5 then a recovery
to xid 4 may or may not be successful depending on the commit
order.  However, it appears in this case the patch would disallow
recovery to 4.

If the txid 4 is the recovery target xid, then, the appropriate backup
taken previous to txid 4 must be used or as an alternative recovery
target like recovery_target_timestamp must be used to proceed to the
respective recovery target xid.


I'm not sure I follow you here, but I do know that using the last 
committed xid says little about which xids precede or follow it.



I think this logic would need to be baked into recoveryStopsAfter()
and/or recoveryStopsBefore() in order to work.  It's not clear to me
what that would look like, however, or if the xid check is even
practical.


The above two functions would determine if the recovery process has to
stop at a particular position or not and the patch has nothing to do
with it.

To summarise, this patch only determines if the WAL replay has to start
at all. In other words, if the specified recovery target falls prior to
the restored database position, then, the WAL replay should not start,
which prevent the database from getting promoted at an unintended
recovery target.


I understand what you are trying to do.  My point is that the 
information you are working from (whatever checkpoint happens to be in 
pg_control when recovery starts) is not sufficient for the task.  This 
method is inaccurate and would disallow valid recovery scenarios.


I suggest doing a thorough read-through of StartupXLOG(), particularly 
the section where the backup_label is loaded.


Thanks,
--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch proposal

2017-03-21 Thread Venkata B Nagothi
On Tue, Mar 21, 2017 at 8:46 AM, David Steele  wrote:

> Hi Venkata,
>
> On 2/28/17 11:59 PM, Venkata B Nagothi wrote:
>
>> On Wed, Mar 1, 2017 at 1:14 AM, Venkata B Nagothi > > wrote:
>> On Tue, Jan 31, 2017 at 6:49 AM, David Steele > > wrote:
>>
>> Do you know when those will be ready?
>>
>> Attached are both the patches with tests included.
>>
>
> Thanks for adding the tests.  They bring clarity to the patch.
>

Thank you again reviewing the patch and your comments !


> Unfortunately, I don't think the first patch (recoveryStartPoint) will
> work as currently implemented.  The problem I see is that the new function
> recoveryStartsHere() depends on pg_control containing a checkpoint right at
> the end of the backup.  There's no guarantee that this is true, even if
> pg_control is copied last.  That means a time, lsn, or xid that occurs
> somewhere in the middle of the backup can be selected without complaint
> from this code depending on timing.
>

Yes, that is true.  The latest best position, checkpoint position, xid and
timestamp of the restored backup of the backup is shown up in the
pg_controldata, which means, that is the position from which the recovery
would start. Which in-turn means, WALs start getting replayed from that
position towards --> minimum recovery position (which is the end backup,
which again means applying WALs generated between start and the end backup)
all the way through to  --> recovery target position. The best start
position to check with would the position shown up in the pg_control file,
which is way much better compared to the current postgresql behaviour.

The tests pass (or rather fail as expected) because they are written using
> values before the start of the backup.  It's actually the end of the backup
> (where the database becomes consistent on recovery) that defines where PITR
> can start and this distinction definitely matters for very long backups.  A
> better test would be to start the backup, get a time/lsn/xid after writing
> some data, and then make sure that time/lsn/xid is flagged as invalid on
> recovery.
>

Yes, i agree, the databases restored from the backup would start the
recovery and would become consistent once the end-backup is reached. The
point here is that, when the backup is restored and recovery proceeds
further, there is NO WAY to identify the next consistent position or
end-position of the backup. This patch is only implementing a check to
ensure recovery starts and proceeds at the right position instead of
pre-maturely getting promoted which is the current behaviour.

The current behaviour is that, if the XID shown-up by the pg_controldata is
say 1400 and the specified recovery_target_xid is say 200, then, postgresql
just completes the recovery and promotes at the immediate consistent
recovery target, which means, the DBA has to all the way start the
restoration and recovery process again to promote the database at the
intended position.

It is also problematic to assume that transaction IDs commit in order. If
> checkPoint.latestCompletedXid contains 5 then a recovery to xid 4 may or
> may not be successful depending on the commit order.  However, it appears
> in this case the patch would disallow recovery to 4.
>

If the txid 4 is the recovery target xid, then, the appropriate backup
taken previous to txid 4 must be used or as an alternative recovery target
like recovery_target_timestamp must be used to proceed to the respective
recovery target xid.


> I think this logic would need to be baked into recoveryStopsAfter() and/or
> recoveryStopsBefore() in order to work.  It's not clear to me what that
> would look like, however, or if the xid check is even practical.
>

The above two functions would determine if the recovery process has to stop
at a particular position or not and the patch has nothing to do with it.

To summarise, this patch only determines if the WAL replay has to start at
all. In other words, if the specified recovery target falls prior to the
restored database position, then, the WAL replay should not start, which
prevent the database from getting promoted at an unintended recovery target.

Any thoughts/comments ?

Regards,

Venkata Balaji N
Database Consultant


Re: [HACKERS] patch proposal

2017-03-20 Thread David Steele

Hi Venkata,

On 2/28/17 11:59 PM, Venkata B Nagothi wrote:

On Wed, Mar 1, 2017 at 1:14 AM, Venkata B Nagothi mailto:nag1...@gmail.com>> wrote:
On Tue, Jan 31, 2017 at 6:49 AM, David Steele mailto:da...@pgmasters.net>> wrote:

Do you know when those will be ready?

Attached are both the patches with tests included.


Thanks for adding the tests.  They bring clarity to the patch.

Unfortunately, I don't think the first patch (recoveryStartPoint) will 
work as currently implemented.  The problem I see is that the new 
function recoveryStartsHere() depends on pg_control containing a 
checkpoint right at the end of the backup.  There's no guarantee that 
this is true, even if pg_control is copied last.  That means a time, 
lsn, or xid that occurs somewhere in the middle of the backup can be 
selected without complaint from this code depending on timing.


The tests pass (or rather fail as expected) because they are written 
using values before the start of the backup.  It's actually the end of 
the backup (where the database becomes consistent on recovery) that 
defines where PITR can start and this distinction definitely matters for 
very long backups.  A better test would be to start the backup, get a 
time/lsn/xid after writing some data, and then make sure that 
time/lsn/xid is flagged as invalid on recovery.


It is also problematic to assume that transaction IDs commit in order. 
If checkPoint.latestCompletedXid contains 5 then a recovery to xid 4 may 
or may not be successful depending on the commit order.  However, it 
appears in this case the patch would disallow recovery to 4.


I think this logic would need to be baked into recoveryStopsAfter() 
and/or recoveryStopsBefore() in order to work.  It's not clear to me 
what that would look like, however, or if the xid check is even practical.


Regards,
--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch proposal

2017-02-28 Thread Venkata B Nagothi
On Wed, Mar 1, 2017 at 1:14 AM, Venkata B Nagothi  wrote:

> Hi David,
>
> On Tue, Jan 31, 2017 at 6:49 AM, David Steele  wrote:
>
>> On 1/27/17 3:19 AM, Venkata B Nagothi wrote:
>>
>> > I will be adding the tests in
>> > src/test/recovery/t/003_recovery_targets.pl
>> > . My tests would look more or less
>> > similar to recovery_target_action. I do not see any of the tests added
>> > for the parameter recovery_target_action ? Anyways, i will work on
>> > adding tests for recovery_target_incomplete.
>>
>> Do you know when those will be ready?
>>
>
> Attached are both the patches with tests included.
>
>
>>
>> >
>> >
>> > 4) I'm not convinced that fatal errors by default are the way to go.
>> > It's a pretty big departure from what we have now.
>> >
>> >
>> > I have changed the code to generate ERRORs instead of FATALs. Which is
>> > in the patch recoveryStartPoint.patch
>>
>> I think at this point in recovery any ERROR at all will probably be
>> fatal.  Once you have some tests in place we'll know for sure.
>>
>> Either way, the goal would be to keep recovery from proceeding and let
>> the user adjust their targets.  Since this is a big behavioral change
>> which will need buy in from the community.
>>
>
> Hoping to get some comments / feedback from the community on the second
> patch too.
>
> > Also, I don't think it's very intuitive how
>> recovery_target_incomplete
>> > works.  For instance, if I set recovery_target_incomplete=promote
>> and
>> > set recovery_target_time, but pick a backup after the specified
>> time,
>> > then there will be an error "recovery_target_time %s is older..."
>> rather
>> > than a promotion.
>> >
>> >
>> > Yes, that is correct and that is the expected behaviour. Let me explain
>> -
>>
>> My point was that this combination of options could lead to confusion on
>> the part of users.  However, I'd rather leave that alone for the time
>> being and focus on the first patch.
>>
>> > I have split the patch into two different
>> > pieces. One is to determine if the recovery can start at all and other
>> > patch deals with the incomplete recovery situation.
>>
>> I think the first patch looks promising and I would rather work through
>> that before considering the second patch.  Once there are tests for the
>> first patch I will complete my review.
>>
>
> I have added all the tests for the second patch and all seem to be working
> fine.
>
> Regarding the first patch, i have included all the tests except for the
> test case on recovery_target_time.
> I did write the test, but, the test is kind of behaving weird which i am
> working through to resolve.
>

This issue has been resolved. All good now. To avoid any further confusion,
i have attached the latest versions (4) of both the patches with all the
tests included.

I have already changed the patch status to "Needs review".

Thank you !

Regards,

Venkata B N
Database Consultant


recoveryStartPoint.patch-version4
Description: Binary data


recoveryTargetIncomplete.patch-version4
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch proposal

2017-02-28 Thread Venkata B Nagothi
Hi David,

On Tue, Jan 31, 2017 at 6:49 AM, David Steele  wrote:

> On 1/27/17 3:19 AM, Venkata B Nagothi wrote:
>
> > I will be adding the tests in
> > src/test/recovery/t/003_recovery_targets.pl
> > . My tests would look more or less
> > similar to recovery_target_action. I do not see any of the tests added
> > for the parameter recovery_target_action ? Anyways, i will work on
> > adding tests for recovery_target_incomplete.
>
> Do you know when those will be ready?
>

Attached are both the patches with tests included.


>
> >
> >
> > 4) I'm not convinced that fatal errors by default are the way to go.
> > It's a pretty big departure from what we have now.
> >
> >
> > I have changed the code to generate ERRORs instead of FATALs. Which is
> > in the patch recoveryStartPoint.patch
>
> I think at this point in recovery any ERROR at all will probably be
> fatal.  Once you have some tests in place we'll know for sure.
>
> Either way, the goal would be to keep recovery from proceeding and let
> the user adjust their targets.  Since this is a big behavioral change
> which will need buy in from the community.
>

Hoping to get some comments / feedback from the community on the second
patch too.

> Also, I don't think it's very intuitive how recovery_target_incomplete
> > works.  For instance, if I set recovery_target_incomplete=promote
> and
> > set recovery_target_time, but pick a backup after the specified time,
> > then there will be an error "recovery_target_time %s is older..."
> rather
> > than a promotion.
> >
> >
> > Yes, that is correct and that is the expected behaviour. Let me explain -
>
> My point was that this combination of options could lead to confusion on
> the part of users.  However, I'd rather leave that alone for the time
> being and focus on the first patch.
>
> > I have split the patch into two different
> > pieces. One is to determine if the recovery can start at all and other
> > patch deals with the incomplete recovery situation.
>
> I think the first patch looks promising and I would rather work through
> that before considering the second patch.  Once there are tests for the
> first patch I will complete my review.
>

I have added all the tests for the second patch and all seem to be working
fine.

Regarding the first patch, i have included all the tests except for the
test case on recovery_target_time.
I did write the test, but, the test is kind of behaving weird which i am
working through to resolve.

Regards,

Venkata B N
Database Consultant


recoveryStartPoint.patch-version3
Description: Binary data


recoveryTargetIncomplete.patch-version4
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch proposal

2017-02-21 Thread Venkata B Nagothi
On Tue, Jan 31, 2017 at 10:41 AM, Michael Paquier  wrote:

> On Tue, Jan 31, 2017 at 4:49 AM, David Steele  wrote:
> > On 1/27/17 3:19 AM, Venkata B Nagothi wrote:
> >> I have split the patch into two different
> >> pieces. One is to determine if the recovery can start at all and other
> >> patch deals with the incomplete recovery situation.
> >
> > I think the first patch looks promising and I would rather work through
> > that before considering the second patch.  Once there are tests for the
> > first patch I will complete my review.
>
> Based on that, I am moving the patch to next CF with "Needs Review".
> Venkata, please be careful in updating correctly the patch status, it
> was still on "Waiting on Author".
>

Apologies. Sure. Will make a note.

Regards,

Venkata B N
Database Consultant


Re: [HACKERS] patch proposal

2017-02-21 Thread Venkata B Nagothi
On Tue, Jan 31, 2017 at 6:49 AM, David Steele  wrote:

> On 1/27/17 3:19 AM, Venkata B Nagothi wrote:
>
> > I will be adding the tests in
> > src/test/recovery/t/003_recovery_targets.pl
> > . My tests would look more or less
> > similar to recovery_target_action. I do not see any of the tests added
> > for the parameter recovery_target_action ? Anyways, i will work on
> > adding tests for recovery_target_incomplete.
>
> Do you know when those will be ready?
>

Apologies for the delayed response. Actually, I am working through to add
the tests for both the patches and started with adding the tests to
recovery_target_incomplete parameter.
I have hit an issue while adding the tests which i have explained below.


> >
> >
> > 4) I'm not convinced that fatal errors by default are the way to go.
> > It's a pretty big departure from what we have now.
> >
> >
> > I have changed the code to generate ERRORs instead of FATALs. Which is
> > in the patch recoveryStartPoint.patch
>
> I think at this point in recovery any ERROR at all will probably be
> fatal.  Once you have some tests in place we'll know for sure.
>

Sure. I can add the tests for the first patch.

Either way, the goal would be to keep recovery from proceeding and let
> the user adjust their targets.  Since this is a big behavioral change
> which will need buy in from the community.
>

Sure. Agreed.


> > Also, I don't think it's very intuitive how
> recovery_target_incomplete
> > works.  For instance, if I set recovery_target_incomplete=promote
> and
> > set recovery_target_time, but pick a backup after the specified time,
> > then there will be an error "recovery_target_time %s is older..."
> rather
> > than a promotion.
> >
> >
> > Yes, that is correct and that is the expected behaviour. Let me explain -
>
> My point was that this combination of options could lead to confusion on
> the part of users.  However, I'd rather leave that alone for the time
> being and focus on the first patch.
>

Sure.


> > I have split the patch into two different
> > pieces. One is to determine if the recovery can start at all and other
> > patch deals with the incomplete recovery situation.
>
> I think the first patch looks promising and I would rather work through
> that before considering the second patch.  Once there are tests for the
> first patch I will complete my review.
>

Sure. Will submit the first patch along with tests once i can get through
the issue i am facing.

*The issue i am facing while writing the tests*

Below is the problem i am facing while adding the tests to
003_recovery_targets.pl

1. Test case : Test the Incomplete recovery with parameters
restore_command, recovery_target_xid and recovery_target_incomplete
='shutdown' (option 'pause' and 'promote' work fine)

2. Expected test Result : Standby node successfully shuts-down if the
recovery process reaches mid-way and does not reach the intended recovery
target due to unavailability of WALs

3. All good. Everything works as expected. The problem is that, the test
exits with the exit code 256, bails out and shows-up as "pg_ctl failed".

Is there anyway, i can ensure the test does not bail-out and exits with the
exit code '0' when you do "pg_ctl start" and the server starts and
shuts-down for any reason ?

If you look at the log file, pg_ctl actually starts successfully and then
shuts down cleanly because intended recovery target is not reached. The
messages in the log file are also appropriate.
Since it is a clean-start and clean-shutdown (which means the test is
successful), the test should return a success, which is not happening.

Below is the log (a test for the second patch). I believe Perl generates an
error code 256 because pg_ctl responds with an error - "pg_ctl: could not
start the server".
I would need my test case to return a success in this scenario.  Any
suggestions would be great.

[dba@buildhost ~]$  /data/PostgreSQL-Repo/postgresql/tmp_install/disk1/
pginstall/pgsql-10-dev-master-xlog-2-wal/bin/pg_ctl -D
/data/PostgreSQL-Repo/postgresql/src/test/recovery/
tmp_check/data_pitr_3_U_oA/pgdata start
waiting for server to start2017-02-22 12:06:41.773 AEDT [23858] LOG:
 database system was interrupted while in recovery at log time 2017-02-15
16:40:41 AEDT
2017-02-22 12:06:41.773 AEDT [23858] HINT:  If this has occurred more than
once some data might be corrupted and you might need to choose an earlier
recovery target.
2017-02-22 12:06:41.773 AEDT [23858] LOG:  starting point-in-time recovery
to XID 559
2017-02-22 12:06:41.782 AEDT [23858] LOG:  restored log file
"00010002" from archive
2017-02-22 12:06:41.784 AEDT [23858] LOG:  redo starts at 0/228
2017-02-22 12:06:41.794 AEDT [23858] LOG:  restored log file
"00010003" from archive
2017-02-22 12:06:41.814 AEDT [23858] LOG:  restored log file
"00010004" from archive
2017-02-22 12:06:41.835 AEDT [23858] LOG:  restored log file

Re: [HACKERS] patch proposal

2017-01-30 Thread Michael Paquier
On Tue, Jan 31, 2017 at 4:49 AM, David Steele  wrote:
> On 1/27/17 3:19 AM, Venkata B Nagothi wrote:
>> I have split the patch into two different
>> pieces. One is to determine if the recovery can start at all and other
>> patch deals with the incomplete recovery situation.
>
> I think the first patch looks promising and I would rather work through
> that before considering the second patch.  Once there are tests for the
> first patch I will complete my review.

Based on that, I am moving the patch to next CF with "Needs Review".
Venkata, please be careful in updating correctly the patch status, it
was still on "Waiting on Author".
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch proposal

2017-01-30 Thread David Steele
On 1/27/17 3:19 AM, Venkata B Nagothi wrote:

> I will be adding the tests in
> src/test/recovery/t/003_recovery_targets.pl
> . My tests would look more or less
> similar to recovery_target_action. I do not see any of the tests added
> for the parameter recovery_target_action ? Anyways, i will work on
> adding tests for recovery_target_incomplete. 

Do you know when those will be ready?

>  
> 
> 4) I'm not convinced that fatal errors by default are the way to go.
> It's a pretty big departure from what we have now.
> 
> 
> I have changed the code to generate ERRORs instead of FATALs. Which is
> in the patch recoveryStartPoint.patch

I think at this point in recovery any ERROR at all will probably be
fatal.  Once you have some tests in place we'll know for sure.

Either way, the goal would be to keep recovery from proceeding and let
the user adjust their targets.  Since this is a big behavioral change
which will need buy in from the community.

> Also, I don't think it's very intuitive how recovery_target_incomplete
> works.  For instance, if I set recovery_target_incomplete=promote and
> set recovery_target_time, but pick a backup after the specified time,
> then there will be an error "recovery_target_time %s is older..." rather
> than a promotion.
> 
> 
> Yes, that is correct and that is the expected behaviour. Let me explain -

My point was that this combination of options could lead to confusion on
the part of users.  However, I'd rather leave that alone for the time
being and focus on the first patch.

> I have split the patch into two different
> pieces. One is to determine if the recovery can start at all and other
> patch deals with the incomplete recovery situation.

I think the first patch looks promising and I would rather work through
that before considering the second patch.  Once there are tests for the
first patch I will complete my review.

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch proposal

2017-01-27 Thread Venkata B Nagothi
Hi David,

On Tue, Jan 24, 2017 at 9:22 AM, David Steele  wrote:

> Hi Venkata,
>
> On 11/8/16 5:47 PM, Venkata B Nagothi wrote:
> > Attached is the 2nd version of the patch with some enhancements.
>
> Here's my review of the patch.
>

Thank you very much for reviewing the patch.

1) There are a number of whitespace error when applying the patch:
>
> $ git apply ../other/recoveryTargetIncomplete.patch-version2
> ../other/recoveryTargetIncomplete.patch-version2:158: indent with spaces.
> else
> ../other/recoveryTargetIncomplete.patch-version2:160: space before tab
> in indent.
> ereport(LOG,
> ../other/recoveryTargetIncomplete.patch-version2:166: indent with spaces.
>   /*
> ../other/recoveryTargetIncomplete.patch-version2:167: indent with spaces.
>* This is the position where we can
> choose to shutdown, pause
> ../other/recoveryTargetIncomplete.patch-version2:168: indent with spaces.
>* or promote at the end-of-the-wal if the
> intended recovery
> warning: squelched 10 whitespace errors
> warning: 15 lines add whitespace errors.
>
> The build did succeed after applying the patch.
>

Sorry, my bad. The attached patch should fix those errors.


> 2) There is no documentation.  Serious changes to recovery are going to
> need to be documented very carefully.  It will also help during review
> to determine you intent.
>

Attached patches include the documentation as well.


>
> 3) There are no tests.  I believe that
> src/test/recovery/t/006_logical_decoding.pl would be the best place to
> insert your tests.
>

I will be adding the tests in src/test/recovery/t/003_recovery_targets.pl.
My tests would look more or less similar to recovery_target_action. I do
not see any of the tests added for the parameter recovery_target_action ?
Anyways, i will work on adding tests for recovery_target_incomplete.


> 4) I'm not convinced that fatal errors by default are the way to go.
> It's a pretty big departure from what we have now.
>

I have changed the code to generate ERRORs instead of FATALs. Which is in
the patch recoveryStartPoint.patch


> Also, I don't think it's very intuitive how recovery_target_incomplete
> works.  For instance, if I set recovery_target_incomplete=promote and
> set recovery_target_time, but pick a backup after the specified time,
> then there will be an error "recovery_target_time %s is older..." rather
> than a promotion.
>

Yes, that is correct and that is the expected behaviour. Let me explain -

a) When you use the parameter recovery_target_time and pick-up a backup
taken after the specified target time, then, recovery will not proceed
further and ideally it should not proceed further.
This is not an incomplete recovery situation either. This is a
situation where recovery process cannot start at all. With this patch,
"recovery_target_time" (similarly recovery_target_xid and
recovery_target_lsn) first checks if the specified target time is later
than the backup's time stamp if yes, then the recovery will proceed a-head
or else it would generate an error,
which is what has happened in your case. The ideal way to deal with
this situation to change the recovery_target_time to a much later time
(which is later than the backup's time)
or use the parameter "recovery_target=immediate" to complete the
recovery and start the database or choose the correct backup (which has a
time stamp prior to the recovery_target_time).

b) recovery_target_incomplete has no effect in the above situation as this
parameter only deals with incomplete recovery. A situation where-in
recovery process stops mid-way (Example : due to missing or corrupt
WALs ) without reaching the intended recovery target (XID, Time and LSN).

c) In real-time environment when a DBA is required to perform PITR to a
particular recovery target (say, recovery_target_time),
it is the DBA who knows till where the database needs to be recovered
and what backup needs to be used for the same. The first thing DBA would do
is, choose the
appropriate backup which is taken prior to the recovery target. This is
the basic first step needed.

It seems to me that there needs to be a separate setting to raise a
> fatal error.
>
> 5) There are really two patches here.  One deals with notifying the user
> that replay to their recovery target is not possible (implemented here
> with fatal errors) and the other allows options when their recovery
> target appears to be possible but never arrives.

As far as I can see, at this point, nobody has really signed on to this
> approach.  I believe you will need a consensus before you can proceed
> with a patch this disruptive.
>
> A better approach may be to pull the first part out and raise warnings
> instead.  This will allow you to write tests and make sure that your
> logic is correct without major behavioral changes to Postgres.
>

This

Re: [HACKERS] patch proposal

2017-01-24 Thread David Steele
On 1/23/17 10:52 PM, Michael Paquier wrote:
> On Tue, Jan 24, 2017 at 7:22 AM, David Steele  wrote:
>> 3) There are no tests.  I believe that
>> src/test/recovery/t/006_logical_decoding.pl would be the best place to
>> insert your tests.
> 
> 003_recovery_targets.pl is the test for recovery targets. The routines
> it has would be helpful.

Thanks for the correction, that's what I meant.  Auto-complete is not my
friend.

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch proposal

2017-01-23 Thread Michael Paquier
On Tue, Jan 24, 2017 at 7:22 AM, David Steele  wrote:
> 3) There are no tests.  I believe that
> src/test/recovery/t/006_logical_decoding.pl would be the best place to
> insert your tests.

003_recovery_targets.pl is the test for recovery targets. The routines
it has would be helpful.

> I'm marking this as "Waiting on Author", but honestly I think "Returned
> with Feedback" might be more appropriate.  I'll leave that up to the CFM.

We could wait a couple of days before that, this patch just got a
review. (Thanks!)
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch proposal

2017-01-23 Thread David Steele
Hi Venkata,

On 11/8/16 5:47 PM, Venkata B Nagothi wrote:
> Attached is the 2nd version of the patch with some enhancements.

Here's my review of the patch.

1) There are a number of whitespace error when applying the patch:

$ git apply ../other/recoveryTargetIncomplete.patch-version2
../other/recoveryTargetIncomplete.patch-version2:158: indent with spaces.
else
../other/recoveryTargetIncomplete.patch-version2:160: space before tab
in indent.
ereport(LOG,
../other/recoveryTargetIncomplete.patch-version2:166: indent with spaces.
  /*
../other/recoveryTargetIncomplete.patch-version2:167: indent with spaces.
   * This is the position where we can
choose to shutdown, pause
../other/recoveryTargetIncomplete.patch-version2:168: indent with spaces.
   * or promote at the end-of-the-wal if the
intended recovery
warning: squelched 10 whitespace errors
warning: 15 lines add whitespace errors.

The build did succeed after applying the patch.

2) There is no documentation.  Serious changes to recovery are going to
need to be documented very carefully.  It will also help during review
to determine you intent.

3) There are no tests.  I believe that
src/test/recovery/t/006_logical_decoding.pl would be the best place to
insert your tests.

4) I'm not convinced that fatal errors by default are the way to go.
It's a pretty big departure from what we have now.

Also, I don't think it's very intuitive how recovery_target_incomplete
works.  For instance, if I set recovery_target_incomplete=promote and
set recovery_target_time, but pick a backup after the specified time,
then there will be an error "recovery_target_time %s is older..." rather
than a promotion.

It seems to me that there needs to be a separate setting to raise a
fatal error.

5) There are really two patches here.  One deals with notifying the user
that replay to their recovery target is not possible (implemented here
with fatal errors) and the other allows options when their recovery
target appears to be possible but never arrives.

As far as I can see, at this point, nobody has really signed on to this
approach.  I believe you will need a consensus before you can proceed
with a patch this disruptive.

A better approach may be to pull the first part out and raise warnings
instead.  This will allow you to write tests and make sure that your
logic is correct without major behavioral changes to Postgres.

I'm marking this as "Waiting on Author", but honestly I think "Returned
with Feedback" might be more appropriate.  I'll leave that up to the CFM.

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch proposal

2016-12-02 Thread Haribabu Kommi
On Tue, Nov 22, 2016 at 10:17 PM, Haribabu Kommi 
wrote:

>
> Hi Abhijit,
>
> This is a gentle reminder.
>
> you assigned as reviewer to the current patch in the 11-2016 commitfest.
> But you haven't shared your review yet. Please share your review about
> the patch. This will help us in smoother operation of commitfest.
>
> Please Ignore if you already shared your review.
>

Moved to next CF with "needs review" status.

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] patch proposal

2016-11-22 Thread Haribabu Kommi
Hi Abhijit,

This is a gentle reminder.

you assigned as reviewer to the current patch in the 11-2016 commitfest.
But you haven't shared your review yet. Please share your review about
the patch. This will help us in smoother operation of commitfest.

Please Ignore if you already shared your review.

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] patch proposal

2016-11-09 Thread Jaime Casanova
On 16 August 2016 at 09:06, Stephen Frost  wrote:
> Greetings,
>
> * Venkata B Nagothi (nag1...@gmail.com) wrote:
>> The above said parameters can be configured to pause, shutdown or prevent
>> promotion only after reaching the recovery target point.
>> To clarify, I am referring to a scenario where recovery target point is not
>> reached at all ( i mean, half-complete or in-complete recovery) and there
>> are lots of WALs still pending to be replayed - in this situation,
>
> PG doesn't know that there are still WALs to be replayed.
>
>> PostgreSQL just completes the archive recovery until the end of the last
>> available WAL (WAL file "0001001E" in my case) and
>> starts-up the cluster by generating an error message (saying
>> "0001001F" not found).
>
> That's not a PG error, that's an error from cp.  From PG's perspective,
> your restore command has said that all of the WAL has been replayed.
>
> If that's not what you want then change your restore command to return
> an exit code > 125, which tells PG that it's unable to restore that WAL
> segment.
>

Ah! Is this documented somewhere?
if we expect people to use correct exit codes we should document them, no?

-- 
Jaime Casanova  www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch proposal

2016-11-08 Thread Venkata B Nagothi
Attached is the 2nd version of the patch with some enhancements.

*Scenario 2 :*
>
> Generates Errors, Hints when the specified recovery target is prior to the
> backup's current position (xid, time and lsn). This behaviour is integrated
> with the parameters "recovery_target_time","recovery_target_lsn" and
> "recovery_target_xid".
>

In the second version of the patch, recovery_target_xid will be compared
with the "latestCompletedXid" instead of "oldestXid" of the backup and if
the specified recovery_target_xid is prior to the "latestCompletedXid" of
the backup, then errors/hints would be generated indicating the DBA to use
the appropriate backup. Now, with this patch, pg_controldata also displays
the "latestCompletedXid" of the cluster.


> Backup started at: 4/22D3B8E0, time: 2016-08-26 08:29:08
>>> Backup completed (consistency) at: 4/22D3B8EF, 2016-08-26 08:30:00
>>> recovery_target is not set
>>> recovery_target_time = 2016-08-26 08:29:30
>>> recovery_target_inclusive = false
>>>
>>> In such a case, we will necessairly commit transactions which happened
>>> between 8:29:30 and 8:30:00 and only stop (I believe..) once we've
>>> reached consistency.  We could possibly detect that case and throw an
>>> error instead.  The same could happen with xid.
>>>
>>
>> Yes, currently, PG just recovers until the consistency is reached. It
>> makes sense to throw an error instead.
>>
>
> This has not been addressed yet. Currently, the patch only considers
> generating an error if the recovery target position is prior the current
> backup's position and this is irrespective of weather backup_label is
> present or not.
> I will share my updates on how this can be addressed.
>

This does not seem to be a possibility as the information in the
backup_label is not enough to handle this scenario in specific cases like
xid or time and it does not seem possible
to add the backup stop info to the backup_label. I would prefer leaving
this to the original behaviour at the moment which is : PostgreSQL
generates
errors and exits when *EndOfLog < minRecoveryPoint *during recovery

Documentation is still pending from my end and will be submitting the same
when the patch design/approach is agreed.

Regards,

Venkata B N
Database Consultant

Fujitsu Australia


recoveryTargetIncomplete.patch-version2
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch proposal

2016-10-25 Thread Venkata B Nagothi
Attached is the patch which introduces the new parameter
"recovery_target_incomplete" -

Currently this patch addresses two scenarios -

*Scenario 1 :*

Provide options to DBA when the recovery target is not reached and has
stopped mid-way due to unavailability of WALs

When performing recovery to a specific recovery target,
"recovery_target_incomplete" parameter can be used to either promote, pause
or shutdown when recovery does not reach the specified recovery target and
reaches end-of-the-wal.


*Scenario 2 :*

Generates Errors, Hints when the specified recovery target is prior to the
backup's current position (xid, time and lsn). This behaviour is integrated
with the parameters "recovery_target_time","recovery_target_lsn" and
"recovery_target_xid".

I would like to know if the approach this patch incorporates sounds ok ?

My other comments are inline

On Mon, Aug 29, 2016 at 3:17 PM, Venkata B Nagothi 
wrote:

>
> On Fri, Aug 26, 2016 at 10:58 PM, Stephen Frost 
> wrote:
>
>> * Venkata B Nagothi (nag1...@gmail.com) wrote:
>> > On Fri, Aug 26, 2016 at 12:30 PM, Stephen Frost 
>> wrote:
>> > > * Venkata B Nagothi (nag1...@gmail.com) wrote:
>> > > > This behaviour will be similar to that of
>> recovery_target="immediate" and
>> > > > can be aliased.
>> > >
>> > > I don't believe we're really going at this the right way.  Clearly,
>> > > there will be cases where we'd like promotion at the end of the WAL
>> > > stream (as we currently have) even if the recovery point is not found,
>> > > but if the new option's "promote" is the same as "immediate" then we
>> > > don't have that.
>> >
>> > My apologies for the confusion. Correction - I meant, "promote" option
>> > would promote the database once the consistent point is reached at the
>> > end-of-the-WAL.
>>
>> "consistent point" and "end-of-the-WAL" are not the same.
>>
>> > > recovery_target = immediate, other
>> > >
>> > > recovery_action_target_found = promote, pause, shutdown
>> >
>> > This is currently supported by the existing parameter called
>> > "recovery_target_action"
>>
>> Right, sure, we could possibly use that, or create an alias, etc.
>>
>> > > recovery_action_target_not_found = promote, pause, shutdown
>> >
>> > This is exactly what newly proposed parameter will do.
>>
>> Then it isn't the same as 'recovery_target = immediate'.
>>
>> > > One question to ask is if we need to support an option for xid and
>> time
>> > > related to when we realize that we won't find the recovery target.  If
>> > > consistency is reached at a time which is later than the recovery
>> target
>> > > for time, what then?  Do we go through the rest of the WAL and perform
>> > > the "not found" action at the end of the WAL stream?  If we use that
>> > > approach, then at least all of the recovery target types are handled
>> the
>> > > same, but I can definitely see cases where an administrator might
>> prefer
>> > > an "error" option.
>> >
>> > Currently, this situation is handled by recovery_target_inclusive
>> > parameter.
>>
>> No, that's not the same.
>>
>> > Administrator can choose to stop the recovery at any consistent
>> > point before or after the specified recovery target. Is this what you
>> were
>> > referring to ?
>>
>> Not quite.
>>
>> > I might need a bit of clarification here, the parameter i am proposing
>> > would be effective only if the specified recovery target is not reached
>> and
>> > may not be effective if the recovery goes beyond recovery target point.
>>
>> Ok, let's consider this scenario:
>>
>> Backup started at: 4/22D3B8E0, time: 2016-08-26 08:29:08
>> Backup completed (consistency) at: 4/22D3B8EF, 2016-08-26 08:30:00
>> recovery_target is not set
>> recovery_target_time = 2016-08-26 08:29:30
>> recovery_target_inclusive = false
>>
>> In such a case, we will necessairly commit transactions which happened
>> between 8:29:30 and 8:30:00 and only stop (I believe..) once we've
>> reached consistency.  We could possibly detect that case and throw an
>> error instead.  The same could happen with xid.
>>
>
> Yes, currently, PG just recovers until the consistency is reached. It
> makes sense to throw an error instead.
>

This has not been addressed yet. Currently, the patch only considers
generating an error if the recovery target position is prior the current
backup's position and this is irrespective of weather backup_label is
present or not.
I will share my updates on how this can be addressed.


>
>
>> Working through more scenarios would be useful, I believe.  For example,
>> what if the xid or time specified happened before the backup started?
>>
>> Basically, what I was trying to get at is that we might be able to
>> detect that we'll never find a given recovery target point without
>> actually replaying all of WAL and wondering if we should provide options
>> to control what to do in such a case.
>>
>
> Ah, Yes, I got the point and I agree. Thanks for the clarification.
>
> Yes, good idea and it is important to ensure PG errors out and w

Re: [HACKERS] patch proposal

2016-08-28 Thread Venkata B Nagothi
On Fri, Aug 26, 2016 at 10:58 PM, Stephen Frost  wrote:

> * Venkata B Nagothi (nag1...@gmail.com) wrote:
> > On Fri, Aug 26, 2016 at 12:30 PM, Stephen Frost 
> wrote:
> > > * Venkata B Nagothi (nag1...@gmail.com) wrote:
> > > > This behaviour will be similar to that of
> recovery_target="immediate" and
> > > > can be aliased.
> > >
> > > I don't believe we're really going at this the right way.  Clearly,
> > > there will be cases where we'd like promotion at the end of the WAL
> > > stream (as we currently have) even if the recovery point is not found,
> > > but if the new option's "promote" is the same as "immediate" then we
> > > don't have that.
> >
> > My apologies for the confusion. Correction - I meant, "promote" option
> > would promote the database once the consistent point is reached at the
> > end-of-the-WAL.
>
> "consistent point" and "end-of-the-WAL" are not the same.
>
> > > recovery_target = immediate, other
> > >
> > > recovery_action_target_found = promote, pause, shutdown
> >
> > This is currently supported by the existing parameter called
> > "recovery_target_action"
>
> Right, sure, we could possibly use that, or create an alias, etc.
>
> > > recovery_action_target_not_found = promote, pause, shutdown
> >
> > This is exactly what newly proposed parameter will do.
>
> Then it isn't the same as 'recovery_target = immediate'.
>
> > > One question to ask is if we need to support an option for xid and time
> > > related to when we realize that we won't find the recovery target.  If
> > > consistency is reached at a time which is later than the recovery
> target
> > > for time, what then?  Do we go through the rest of the WAL and perform
> > > the "not found" action at the end of the WAL stream?  If we use that
> > > approach, then at least all of the recovery target types are handled
> the
> > > same, but I can definitely see cases where an administrator might
> prefer
> > > an "error" option.
> >
> > Currently, this situation is handled by recovery_target_inclusive
> > parameter.
>
> No, that's not the same.
>
> > Administrator can choose to stop the recovery at any consistent
> > point before or after the specified recovery target. Is this what you
> were
> > referring to ?
>
> Not quite.
>
> > I might need a bit of clarification here, the parameter i am proposing
> > would be effective only if the specified recovery target is not reached
> and
> > may not be effective if the recovery goes beyond recovery target point.
>
> Ok, let's consider this scenario:
>
> Backup started at: 4/22D3B8E0, time: 2016-08-26 08:29:08
> Backup completed (consistency) at: 4/22D3B8EF, 2016-08-26 08:30:00
> recovery_target is not set
> recovery_target_time = 2016-08-26 08:29:30
> recovery_target_inclusive = false
>
> In such a case, we will necessairly commit transactions which happened
> between 8:29:30 and 8:30:00 and only stop (I believe..) once we've
> reached consistency.  We could possibly detect that case and throw an
> error instead.  The same could happen with xid.
>

Yes, currently, PG just recovers until the consistency is reached. It makes
sense to throw an error instead.


> Working through more scenarios would be useful, I believe.  For example,
> what if the xid or time specified happened before the backup started?
>
> Basically, what I was trying to get at is that we might be able to
> detect that we'll never find a given recovery target point without
> actually replaying all of WAL and wondering if we should provide options
> to control what to do in such a case.
>

Ah, Yes, I got the point and I agree. Thanks for the clarification.

Yes, good idea and it is important to ensure PG errors out and warn the
administrator with appropriate message indicating that... "a much earlier
backup must be used..."
if the specified recovery target xid, name or time is earlier than the
backup time.

Regards,

Venkata B N
Fujitsu Australia


Re: [HACKERS] patch proposal

2016-08-26 Thread Stephen Frost
* Venkata B Nagothi (nag1...@gmail.com) wrote:
> On Fri, Aug 26, 2016 at 12:30 PM, Stephen Frost  wrote:
> > * Venkata B Nagothi (nag1...@gmail.com) wrote:
> > > This behaviour will be similar to that of recovery_target="immediate" and
> > > can be aliased.
> >
> > I don't believe we're really going at this the right way.  Clearly,
> > there will be cases where we'd like promotion at the end of the WAL
> > stream (as we currently have) even if the recovery point is not found,
> > but if the new option's "promote" is the same as "immediate" then we
> > don't have that.
> 
> My apologies for the confusion. Correction - I meant, "promote" option
> would promote the database once the consistent point is reached at the
> end-of-the-WAL.

"consistent point" and "end-of-the-WAL" are not the same.

> > recovery_target = immediate, other
> >
> > recovery_action_target_found = promote, pause, shutdown
> 
> This is currently supported by the existing parameter called
> "recovery_target_action"

Right, sure, we could possibly use that, or create an alias, etc.

> > recovery_action_target_not_found = promote, pause, shutdown
> 
> This is exactly what newly proposed parameter will do.

Then it isn't the same as 'recovery_target = immediate'.

> > One question to ask is if we need to support an option for xid and time
> > related to when we realize that we won't find the recovery target.  If
> > consistency is reached at a time which is later than the recovery target
> > for time, what then?  Do we go through the rest of the WAL and perform
> > the "not found" action at the end of the WAL stream?  If we use that
> > approach, then at least all of the recovery target types are handled the
> > same, but I can definitely see cases where an administrator might prefer
> > an "error" option.
> 
> Currently, this situation is handled by recovery_target_inclusive
> parameter. 

No, that's not the same.

> Administrator can choose to stop the recovery at any consistent
> point before or after the specified recovery target. Is this what you were
> referring to ?

Not quite.

> I might need a bit of clarification here, the parameter i am proposing
> would be effective only if the specified recovery target is not reached and
> may not be effective if the recovery goes beyond recovery target point.

Ok, let's consider this scenario:

Backup started at: 4/22D3B8E0, time: 2016-08-26 08:29:08
Backup completed (consistency) at: 4/22D3B8EF, 2016-08-26 08:30:00
recovery_target is not set
recovery_target_time = 2016-08-26 08:29:30
recovery_target_inclusive = false

In such a case, we will necessairly commit transactions which happened
between 8:29:30 and 8:30:00 and only stop (I believe..) once we've
reached consistency.  We could possibly detect that case and throw an
error instead.  The same could happen with xid.

Working through more scenarios would be useful, I believe.  For example,
what if the xid or time specified happened before the backup started?

Basically, what I was trying to get at is that we might be able to
detect that we'll never find a given recovery target point without
actually replaying all of WAL and wondering if we should provide options
to control what to do in such a case.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] patch proposal

2016-08-26 Thread Venkata B Nagothi
On Fri, Aug 26, 2016 at 12:30 PM, Stephen Frost  wrote:

> * Venkata B Nagothi (nag1...@gmail.com) wrote:
> > On Thu, Aug 25, 2016 at 10:59 PM, Stephen Frost 
> wrote:
> > > I'm not a fan of the "recovery_target" option, particularly as it's
> only
> > > got one value even though it can mean two things (either "immediate" or
> > > "not set"), but we need a complete solution before we can consider
> > > deprecating it.  Further, we could consider making it an alias for
> > > whatever better name we come up with.
> >
> > The new parameter will accept options : "pause", "shutdown" and "promote"
> >
> > *"promote"*
> >
> > This option will ensure database starts up once the "immediate"
> consistent
> > recovery point is reached even if it is well before the mentioned
> recovery
> > target point (XID, Name or time).
> > This behaviour will be similar to that of recovery_target="immediate" and
> > can be aliased.
>
> I don't believe we're really going at this the right way.  Clearly,
> there will be cases where we'd like promotion at the end of the WAL
> stream (as we currently have) even if the recovery point is not found,
> but if the new option's "promote" is the same as "immediate" then we
> don't have that.
>

My apologies for the confusion. Correction - I meant, "promote" option
would promote the database once the consistent point is reached at the
end-of-the-WAL.


> We need to break this down into all the different possible combinations
> and then come up with names for the options to define them.  I don't
> believe a single option is going to be able to cover all of the cases.
>
> The cases which I'm considering are:
>
> recovery target is immediate (as soon as we have consistency)
> recovery target is a set point (name, xid, time, whatever)
>
> action to take if recovery target is found
> action to take if recovery target is not found
>
> Generally, "action" is one of "promote", "pause", or "shutdown".
> Clearly, not all actions are valid for all recovery target cases- in
> particular, "immediate" with "recovery target not found" can not support
> the "promote" or "pause" options.  Otherwise, we can support:
>

I Agree. This is a valid point to consider. I might have few questions over
this at a later stage.

Recovery Target  |  Found  |  Action
> -|-|--
> immediate|  Yes| promote
> immediate|  Yes| pause
> immediate|  Yes| shutdown
>
> immediate|  No | shutdown
>
> name/xid/time|  Yes| promote
> name/xid/time|  Yes| pause
> name/xid/time|  Yes| shutdown
>
> name/xid/time|  No | promote
> name/xid/time|  No | pause
> name/xid/time|  No | shutdown
>
> We could clearly support this with these options:
>


> recovery_target = immediate, other
>
recovery_action_target_found = promote, pause, shutdown
>

This is currently supported by the existing parameter called
"recovery_target_action"


> recovery_action_target_not_found = promote, pause, shutdown
>

This is exactly what newly proposed parameter will do.


> One question to ask is if we need to support an option for xid and time
> related to when we realize that we won't find the recovery target.  If
> consistency is reached at a time which is later than the recovery target
> for time, what then?  Do we go through the rest of the WAL and perform
> the "not found" action at the end of the WAL stream?  If we use that
> approach, then at least all of the recovery target types are handled the
> same, but I can definitely see cases where an administrator might prefer
> an "error" option.
>

Currently, this situation is handled by recovery_target_inclusive
parameter. Administrator can choose to stop the recovery at any consistent
point before or after the specified recovery target. Is this what you were
referring to ?
I might need a bit of clarification here, the parameter i am proposing
would be effective only if the specified recovery target is not reached and
may not be effective if the recovery goes beyond recovery target point.

Regards,
Venkata B N

Fujitsu Australia


Re: [HACKERS] patch proposal

2016-08-25 Thread Stephen Frost
* Venkata B Nagothi (nag1...@gmail.com) wrote:
> On Thu, Aug 25, 2016 at 10:59 PM, Stephen Frost  wrote:
> > I'm not a fan of the "recovery_target" option, particularly as it's only
> > got one value even though it can mean two things (either "immediate" or
> > "not set"), but we need a complete solution before we can consider
> > deprecating it.  Further, we could consider making it an alias for
> > whatever better name we come up with.
> 
> The new parameter will accept options : "pause", "shutdown" and "promote"
> 
> *"promote"*
> 
> This option will ensure database starts up once the "immediate" consistent
> recovery point is reached even if it is well before the mentioned recovery
> target point (XID, Name or time).
> This behaviour will be similar to that of recovery_target="immediate" and
> can be aliased.

I don't believe we're really going at this the right way.  Clearly,
there will be cases where we'd like promotion at the end of the WAL
stream (as we currently have) even if the recovery point is not found,
but if the new option's "promote" is the same as "immediate" then we
don't have that.

We need to break this down into all the different possible combinations
and then come up with names for the options to define them.  I don't
believe a single option is going to be able to cover all of the cases.

The cases which I'm considering are:

recovery target is immediate (as soon as we have consistency)
recovery target is a set point (name, xid, time, whatever)

action to take if recovery target is found
action to take if recovery target is not found

Generally, "action" is one of "promote", "pause", or "shutdown".
Clearly, not all actions are valid for all recovery target cases- in
particular, "immediate" with "recovery target not found" can not support
the "promote" or "pause" options.  Otherwise, we can support:

Recovery Target  |  Found  |  Action
-|-|--
immediate|  Yes| promote
immediate|  Yes| pause
immediate|  Yes| shutdown

immediate|  No | shutdown

name/xid/time|  Yes| promote
name/xid/time|  Yes| pause
name/xid/time|  Yes| shutdown

name/xid/time|  No | promote
name/xid/time|  No | pause
name/xid/time|  No | shutdown

We could clearly support this with these options:

recovery_target = immediate, other
recovery_action_target_found = promote, pause, shutdown
recovery_action_target_not_found = promote, pause, shutdown

One question to ask is if we need to support an option for xid and time
related to when we realize that we won't find the recovery target.  If
consistency is reached at a time which is later than the recovery target
for time, what then?  Do we go through the rest of the WAL and perform
the "not found" action at the end of the WAL stream?  If we use that
approach, then at least all of the recovery target types are handled the
same, but I can definitely see cases where an administrator might prefer
an "error" option.

I'd suggest we attempt to support that also.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] patch proposal

2016-08-25 Thread Venkata B Nagothi
On Thu, Aug 25, 2016 at 10:59 PM, Stephen Frost  wrote:

> * Venkata B Nagothi (nag1...@gmail.com) wrote:
> > *Query 1*
> >
> > What about the existing parameter called "recovery_target" which accepts
> > only one value "immediate", which will be similar to the "promote" option
> > with the to-be-introduced new parameter.
> > Since this parameter's behaviour will be incorporated into the new
> > parameter, I think, this parameter can be deprecated from the next
> > PostgreSQL version ?
>
> I don't think we can really consider that without having a good answer
> for what the "new parameter" is, in particular...
>

Sure. I Agree.


> > *Query 2*
> >
> > I am thinking that the new parameter name should be
> > "recovery_target_incomplete" or "recovery_target_incomplete_action"
> which
> > (by name) suggests that recovery target point is not yet reached and
> > accepts options "pause","promote" and "shutdown".
> >
> > The other alternative name i thought of was -
> > "recovery_target_immediate_action", which (by name) suggests the action
> to
> > be taken when the recovery does not reach the actual set recovery target
> > and reaches immediate consistent point.
>
> I don't really care for any of those names.  Note that "immediate" and
> "the point at which we realize that we didn't find the recovery target"
> are *not* necessairly the same.  Whatever we do here needs to also cover
> the 'recovery_target_name' option, where we're only going to realize
> that we didn't find the restore point when we reach the end of the WAL
> stream.
>

Yes, all the recovery targets (including recovery_target_name) will be
taken into consideration.
The whole idea about this patch is to ensure PG realises that the recovery
is incomplete if the specified recovery target point is not found at the
end of the WAL stream.


> I'm not a fan of the "recovery_target" option, particularly as it's only
> got one value even though it can mean two things (either "immediate" or
> "not set"), but we need a complete solution before we can consider
> deprecating it.  Further, we could consider making it an alias for
> whatever better name we come up with.
>

The new parameter will accept options : "pause", "shutdown" and "promote"

*"promote"*

This option will ensure database starts up once the "immediate" consistent
recovery point is reached even if it is well before the mentioned recovery
target point (XID, Name or time).
This behaviour will be similar to that of recovery_target="immediate" and
can be aliased.

*"pause"*

This option ensures database recovery pauses if any of the specified
recovery target ("XID", "Name" or "time") is not reached. So that WAL
replay can be resumed if required.

*"shutdown"*

This option ensures database shuts down if any of the mentioned recovery
target points (XID, Name or time) is not reached.


Regards,
Venkata B N

Fujitsu Australia


Re: [HACKERS] patch proposal

2016-08-25 Thread Stephen Frost
* Venkata B Nagothi (nag1...@gmail.com) wrote:
> *Query 1*
> 
> What about the existing parameter called "recovery_target" which accepts
> only one value "immediate", which will be similar to the "promote" option
> with the to-be-introduced new parameter.
> Since this parameter's behaviour will be incorporated into the new
> parameter, I think, this parameter can be deprecated from the next
> PostgreSQL version ?

I don't think we can really consider that without having a good answer
for what the "new parameter" is, in particular...

> *Query 2*
> 
> I am thinking that the new parameter name should be
> "recovery_target_incomplete" or "recovery_target_incomplete_action" which
> (by name) suggests that recovery target point is not yet reached and
> accepts options "pause","promote" and "shutdown".
> 
> The other alternative name i thought of was -
> "recovery_target_immediate_action", which (by name) suggests the action to
> be taken when the recovery does not reach the actual set recovery target
> and reaches immediate consistent point.

I don't really care for any of those names.  Note that "immediate" and
"the point at which we realize that we didn't find the recovery target"
are *not* necessairly the same.  Whatever we do here needs to also cover
the 'recovery_target_name' option, where we're only going to realize
that we didn't find the restore point when we reach the end of the WAL
stream.

I'm not a fan of the "recovery_target" option, particularly as it's only
got one value even though it can mean two things (either "immediate" or
"not set"), but we need a complete solution before we can consider
deprecating it.  Further, we could consider making it an alias for
whatever better name we come up with.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] patch proposal

2016-08-24 Thread Venkata B Nagothi
On Thu, Aug 18, 2016 at 3:37 PM, Michael Paquier 
wrote:

> On Tue, Aug 16, 2016 at 11:06 PM, Stephen Frost 
> wrote:
> > I could see supporting an additional "pause" option that means "pause at
> > the end of WAL if you don't reach the recovery target point".  I'd also
> > be happy with a warning being emitted in the log if the recovery target
> > point isn't reached before reaching the end of WAL, but I don't think it
> > makes sense to change the existing behavior.
>
> Indeed, let's not change the existing behavior. A warning showing up
> by default would be useful in itself, even if there are people that I
> think set up overly high recovery targets to be sure to replay WAL as
> much as possible. As recovery_target_action has meaning when a
> recovery target has been reached, I would guess that we would want a
> new option that has the same mapping value as recovery_target_action,
> except that it activates when the target recovery is *not* reached.
> Hence it would be possible to shutdown, pause or promote at will when
> recovery completes, and be able to take a separate action is the
> recovery target is indeed reached. The default of this parameter would
> be "promote", which is what happens now.
>

Yes, a new parameter with same options as recovery_target_action is the
idea i had in mind as well and i have the following queries while working
through the patch design -

*Query 1*

What about the existing parameter called "recovery_target" which accepts
only one value "immediate", which will be similar to the "promote" option
with the to-be-introduced new parameter.
Since this parameter's behaviour will be incorporated into the new
parameter, I think, this parameter can be deprecated from the next
PostgreSQL version ?

*Query 2*

I am thinking that the new parameter name should be
"recovery_target_incomplete" or "recovery_target_incomplete_action" which
(by name) suggests that recovery target point is not yet reached and
accepts options "pause","promote" and "shutdown".

The other alternative name i thought of was -
"recovery_target_immediate_action", which (by name) suggests the action to
be taken when the recovery does not reach the actual set recovery target
and reaches immediate consistent point.

Any comments, suggestions ?

Regards,
Venkata B N

Fujitsu Australia


Re: [HACKERS] patch proposal

2016-08-19 Thread Venkata B Nagothi
On Thu, Aug 18, 2016 at 7:20 PM, Stephen Frost  wrote:

> * Venkata B Nagothi (nag1...@gmail.com) wrote:
> > On Wed, Aug 17, 2016 at 11:27 PM, Stephen Frost 
> wrote:
> > > * Venkata B Nagothi (nag1...@gmail.com) wrote:
> > > > Agreed. Additional option like "pause" would. As long as there is an
> > > option
> > > > to ensure following happens if the recovery target is not reached -
> > > >
> > > >  a) PG pauses the recovery at the end of the WAL
> > > >  b) Generates a warning in the log file saying that recovery target
> point
> > > > is not reached (there is a patch being worked upon on by Thom on
> this)
> > > >  c) Does not open-up the database exiting from the recovery process
> by
> > > > giving room to resume the replay of WALs
> > >
> > > One thing to consider is just how different this is from simply
> bringing
> > > PG up as a warm standby instead, with the warning added to indicate if
> > > the recovery point wasn't reached.
> >
> > I am referring to a specific scenario (performing point-in time recovery)
> > where-in a DBA attempts to bring up a standalone PG instance by restoring
> > the backup and performing recovery to a particular recover target (XID,
> > time or a named restore point) in the past by replaying all the available
> > WAL archives, which is not quite possible through a warm-standby setup.
> >
> > Warm standby is more of a high-availability solution and i do not think
> so,
> > it addresses PITR kind of situation.
>
> No, a warm standby is one which just plays through the WAL but doesn't
> bring the database up or start its own set of WAL, which is what you're
> asking about.
>

I understand that, in warm-standby, PG does continuously replay WAL without
bringing up the database as the database will be in standby mode, which
sounds ideal.

While recovering the database to a particular recovery target point ( and
there is no requirement to setup standby ), then, there is a risk that
database will get promoted due to missing or corrupt WALs. Which means,
there is no way to avoid promotion and resume WAL recovery. An option like
"pause" with a new parameter would be ideal to prevent database promotion
at a default recovery target, which is "immediate" or "end-of-WAL".

Regards,

Venkata B N
Fujitsu Australia


Re: [HACKERS] patch proposal

2016-08-18 Thread Stephen Frost
* Venkata B Nagothi (nag1...@gmail.com) wrote:
> On Wed, Aug 17, 2016 at 11:27 PM, Stephen Frost  wrote:
> > * Venkata B Nagothi (nag1...@gmail.com) wrote:
> > > Agreed. Additional option like "pause" would. As long as there is an
> > option
> > > to ensure following happens if the recovery target is not reached -
> > >
> > >  a) PG pauses the recovery at the end of the WAL
> > >  b) Generates a warning in the log file saying that recovery target point
> > > is not reached (there is a patch being worked upon on by Thom on this)
> > >  c) Does not open-up the database exiting from the recovery process by
> > > giving room to resume the replay of WALs
> >
> > One thing to consider is just how different this is from simply bringing
> > PG up as a warm standby instead, with the warning added to indicate if
> > the recovery point wasn't reached.
> 
> I am referring to a specific scenario (performing point-in time recovery)
> where-in a DBA attempts to bring up a standalone PG instance by restoring
> the backup and performing recovery to a particular recover target (XID,
> time or a named restore point) in the past by replaying all the available
> WAL archives, which is not quite possible through a warm-standby setup.
> 
> Warm standby is more of a high-availability solution and i do not think so,
> it addresses PITR kind of situation.

No, a warm standby is one which just plays through the WAL but doesn't
bring the database up or start its own set of WAL, which is what you're
asking about.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] patch proposal

2016-08-18 Thread Venkata B Nagothi
On Thu, Aug 18, 2016 at 3:37 PM, Michael Paquier 
wrote:

> On Tue, Aug 16, 2016 at 11:06 PM, Stephen Frost 
> wrote:
> > I could see supporting an additional "pause" option that means "pause at
> > the end of WAL if you don't reach the recovery target point".  I'd also
> > be happy with a warning being emitted in the log if the recovery target
> > point isn't reached before reaching the end of WAL, but I don't think it
> > makes sense to change the existing behavior.
>
> Indeed, let's not change the existing behavior. A warning showing up
> by default would be useful in itself, even if there are people that I
> think set up overly high recovery targets to be sure to replay WAL as
> much as possible. As recovery_target_action has meaning when a
> recovery target has been reached, I would guess that we would want a
> new option that has the same mapping value as recovery_target_action,
> except that it activates when the target recovery is *not* reached.
> Hence it would be possible to shutdown, pause or promote at will when
> recovery completes, and be able to take a separate action is the
> recovery target is indeed reached. The default of this parameter would
> be "promote", which is what happens now.
>

Agreed. I understand the complexities with backward compatibility on
changing the existing behaviour.
Even, I was more inclined towards introducing a new parameter and as
suggested, will consider the options pause, shutdown or promote for new
parameter.
Thanks for the inputs and advises.

Regards,
Venkata B N

Fujitsu Australia


Re: [HACKERS] patch proposal

2016-08-17 Thread Michael Paquier
On Tue, Aug 16, 2016 at 11:06 PM, Stephen Frost  wrote:
> I could see supporting an additional "pause" option that means "pause at
> the end of WAL if you don't reach the recovery target point".  I'd also
> be happy with a warning being emitted in the log if the recovery target
> point isn't reached before reaching the end of WAL, but I don't think it
> makes sense to change the existing behavior.

Indeed, let's not change the existing behavior. A warning showing up
by default would be useful in itself, even if there are people that I
think set up overly high recovery targets to be sure to replay WAL as
much as possible. As recovery_target_action has meaning when a
recovery target has been reached, I would guess that we would want a
new option that has the same mapping value as recovery_target_action,
except that it activates when the target recovery is *not* reached.
Hence it would be possible to shutdown, pause or promote at will when
recovery completes, and be able to take a separate action is the
recovery target is indeed reached. The default of this parameter would
be "promote", which is what happens now.
--
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch proposal

2016-08-17 Thread Venkata B Nagothi
On Wed, Aug 17, 2016 at 11:27 PM, Stephen Frost  wrote:

> Venkata,
>
> * Venkata B Nagothi (nag1...@gmail.com) wrote:
> > Agreed. Additional option like "pause" would. As long as there is an
> option
> > to ensure following happens if the recovery target is not reached -
> >
> >  a) PG pauses the recovery at the end of the WAL
> >  b) Generates a warning in the log file saying that recovery target point
> > is not reached (there is a patch being worked upon on by Thom on this)
> >  c) Does not open-up the database exiting from the recovery process by
> > giving room to resume the replay of WALs
>
> One thing to consider is just how different this is from simply bringing
> PG up as a warm standby instead, with the warning added to indicate if
> the recovery point wasn't reached.
>

I am referring to a specific scenario (performing point-in time recovery)
where-in a DBA attempts to bring up a standalone PG instance by restoring
the backup and performing recovery to a particular recover target (XID,
time or a named restore point) in the past by replaying all the available
WAL archives, which is not quite possible through a warm-standby setup.

Warm standby is more of a high-availability solution and i do not think so,
it addresses PITR kind of situation.

I will share more details defining PG behaviour across various recovery
scenarios (as asked by David Steele) when using various recovery*
parameters. Will also explain what difference the proposed patch could make.

Regards,

Venkata B N
Fujitsu Australia


Re: [HACKERS] patch proposal

2016-08-17 Thread Stephen Frost
Venkata,

* Venkata B Nagothi (nag1...@gmail.com) wrote:
> Agreed. Additional option like "pause" would. As long as there is an option
> to ensure following happens if the recovery target is not reached -
> 
>  a) PG pauses the recovery at the end of the WAL
>  b) Generates a warning in the log file saying that recovery target point
> is not reached (there is a patch being worked upon on by Thom on this)
>  c) Does not open-up the database exiting from the recovery process by
> giving room to resume the replay of WALs

One thing to consider is just how different this is from simply bringing
PG up as a warm standby instead, with the warning added to indicate if
the recovery point wasn't reached.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] patch proposal

2016-08-16 Thread Venkata B Nagothi
On Wed, Aug 17, 2016 at 12:06 AM, Stephen Frost  wrote:

> Greetings,
>
> * Venkata B Nagothi (nag1...@gmail.com) wrote:
> > The above said parameters can be configured to pause, shutdown or prevent
> > promotion only after reaching the recovery target point.
> > To clarify, I am referring to a scenario where recovery target point is
> not
> > reached at all ( i mean, half-complete or in-complete recovery) and there
> > are lots of WALs still pending to be replayed - in this situation,
>
> PG doesn't know that there are still WALs to be replayed.
>

PG doesn't know that there are still WALs to be replayed. Since, i have
given an particular recovery target and PG knows the current replay
position,
I would say, it would be good if PG warns and pauses there by saying
recovery target point is not reached.

> It would be nice if PostgreSQL pauses the recovery in-case its not
> complete
> > (because of missing or corrupt WAL), shutdown the cluster and allows the
> > DBA to restart the replay of the remaining WAL Archive files to continue
> > recovery (from where it stopped previously) until the recovery target
> point
> > is reached.
>

Agreed. Reaching end-of-WAL is not an error. It sounds more like a
limitation in certain scenarios.

Reaching the end of WAL isn't an error and I don't believe it makes any
> sense to treat it like it is.  You can specify any recovery target point
> you wish, including ones that don't exist, and that's not an error
> either.
>
> I could see supporting an additional "pause" option that means "pause at
> the end of WAL if you don't reach the recovery target point".  I'd also
> be happy with a warning being emitted in the log if the recovery target
> point isn't reached before reaching the end of WAL, but I don't think it
> makes sense to change the existing behavior.
>

Agreed. Additional option like "pause" would. As long as there is an option
to ensure following happens if the recovery target is not reached -

 a) PG pauses the recovery at the end of the WAL
 b) Generates a warning in the log file saying that recovery target point
is not reached (there is a patch being worked upon on by Thom on this)
 c) Does not open-up the database exiting from the recovery process by
giving room to resume the replay of WALs



Regards,
Venkata B N

Fujitsu Australia


Re: [HACKERS] patch proposal

2016-08-16 Thread Stephen Frost
Greetings,

* Venkata B Nagothi (nag1...@gmail.com) wrote:
> The above said parameters can be configured to pause, shutdown or prevent
> promotion only after reaching the recovery target point.
> To clarify, I am referring to a scenario where recovery target point is not
> reached at all ( i mean, half-complete or in-complete recovery) and there
> are lots of WALs still pending to be replayed - in this situation,

PG doesn't know that there are still WALs to be replayed.

> PostgreSQL just completes the archive recovery until the end of the last
> available WAL (WAL file "0001001E" in my case) and
> starts-up the cluster by generating an error message (saying
> "0001001F" not found).

That's not a PG error, that's an error from cp.  From PG's perspective,
your restore command has said that all of the WAL has been replayed.

If that's not what you want then change your restore command to return
an exit code > 125, which tells PG that it's unable to restore that WAL
segment.

> It would be nice if PostgreSQL pauses the recovery in-case its not complete
> (because of missing or corrupt WAL), shutdown the cluster and allows the
> DBA to restart the replay of the remaining WAL Archive files to continue
> recovery (from where it stopped previously) until the recovery target point
> is reached.

Reaching the end of WAL isn't an error and I don't believe it makes any
sense to treat it like it is.  You can specify any recovery target point
you wish, including ones that don't exist, and that's not an error
either.

I could see supporting an additional "pause" option that means "pause at
the end of WAL if you don't reach the recovery target point".  I'd also
be happy with a warning being emitted in the log if the recovery target
point isn't reached before reaching the end of WAL, but I don't think it
makes sense to change the existing behavior.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] patch proposal

2016-08-16 Thread David Steele

On 8/16/16 1:08 AM, Venkata B Nagothi wrote:


> The issue here is, if by any chance, the required WALs are not available
> or if there is any WAL missing or corrupted at the restore_command
> location, then PostgreSQL recovers until the end of the last available
> WAL and starts-up the cluster.

You can use pause_at_recovery_target/recovery_target_action (depending
on your version) to prevent promotion.  That would work for your stated
scenario but not for the scenario where replay starts (or the database
reaches consistency) after the recovery target.

The above said parameters can be configured to pause, shutdown or
prevent promotion only after reaching the recovery target point.
To clarify, I am referring to a scenario where recovery target point is
not reached at all ( i mean, half-complete or in-complete recovery) and
there are lots of WALs still pending to be replayed - in this situation,
PostgreSQL just completes the archive recovery until the end of the last
available WAL (WAL file "0001001E" in my case) and
starts-up the cluster by generating an error message (saying
"0001001F" not found).

Note: I am testing in PostgreSQL-9.5

LOG:  restored log file "0001001E" from archive
cp: cannot stat ‘/data/pgrestore9531/0001001F’: No
such file or directory
LOG:  redo done at 0/1EFFDBB8
LOG:  last completed transaction was at log time 2016-08-15
11:04:26.795902+10

I have used the following recovery* parameters in the recovery.conf file
here and have intentionally not supplied all the WAL archives needed for
the recovery process to reach the target xid.

recovery_target_xid = ,
recovery_target_inclusive = true
recovery_target_action = pause

It would be nice if PostgreSQL pauses the recovery in-case its not
complete (because of missing or corrupt WAL), shutdown the cluster and
allows the DBA to restart the replay of the remaining WAL Archive files
to continue recovery (from where it stopped previously) until the
recovery target point is reached.


OK, I see what you mean.  I think it would be a good idea to work 
through the various scenarios and define what Postgres would do in each 
scenario before actually creating a patch.


--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch proposal

2016-08-15 Thread Venkata B Nagothi
On Tue, Aug 16, 2016 at 2:50 AM, David Steele  wrote:

> On 8/15/16 2:33 AM, Venkata B Nagothi wrote:
>
> > During the recovery process, It would be nice if PostgreSQL generates an
> > error by aborting the recovery process (instead of starting-up the
> > cluster) if the intended recovery target point is not reached and give
> > an option to DBA to resume the recovery process from where it exactly
> > stopped.
>
> Thom wrote a patch [1] recently that gives warnings in this case.  You
> might want to have a look at that first.
>

That is good to know. Yes, this patch is about generating a more meaningful
output messages for recovery process, which makes sense.


> > The issue here is, if by any chance, the required WALs are not available
> > or if there is any WAL missing or corrupted at the restore_command
> > location, then PostgreSQL recovers until the end of the last available
> > WAL and starts-up the cluster.
>
> You can use pause_at_recovery_target/recovery_target_action (depending
> on your version) to prevent promotion.  That would work for your stated
> scenario but not for the scenario where replay starts (or the database
> reaches consistency) after the recovery target.
>

The above said parameters can be configured to pause, shutdown or prevent
promotion only after reaching the recovery target point.
To clarify, I am referring to a scenario where recovery target point is not
reached at all ( i mean, half-complete or in-complete recovery) and there
are lots of WALs still pending to be replayed - in this situation,
PostgreSQL just completes the archive recovery until the end of the last
available WAL (WAL file "0001001E" in my case) and
starts-up the cluster by generating an error message (saying
"0001001F" not found).

Note: I am testing in PostgreSQL-9.5

LOG:  restored log file "0001001E" from archive
cp: cannot stat ‘/data/pgrestore9531/0001001F’: No such
file or directory
LOG:  redo done at 0/1EFFDBB8
LOG:  last completed transaction was at log time 2016-08-15
11:04:26.795902+10


I have used the following recovery* parameters in the recovery.conf file
here and have intentionally not supplied all the WAL archives needed for
the recovery process to reach the target xid.

recovery_target_xid = ,
recovery_target_inclusive = true
recovery_target_action = pause


It would be nice if PostgreSQL pauses the recovery in-case its not complete
(because of missing or corrupt WAL), shutdown the cluster and allows the
DBA to restart the replay of the remaining WAL Archive files to continue
recovery (from where it stopped previously) until the recovery target point
is reached.

Regards,
Venkata B N

Fujitsu Australia


Re: [HACKERS] patch proposal

2016-08-15 Thread David Steele
On 8/15/16 2:33 AM, Venkata B Nagothi wrote:

> During the recovery process, It would be nice if PostgreSQL generates an
> error by aborting the recovery process (instead of starting-up the
> cluster) if the intended recovery target point is not reached and give
> an option to DBA to resume the recovery process from where it exactly
> stopped.

Thom wrote a patch [1] recently that gives warnings in this case.  You
might want to have a look at that first.

> The issue here is, if by any chance, the required WALs are not available
> or if there is any WAL missing or corrupted at the restore_command
> location, then PostgreSQL recovers until the end of the last available
> WAL and starts-up the cluster. 

You can use pause_at_recovery_target/recovery_target_action (depending
on your version) to prevent promotion.  That would work for your stated
scenario but not for the scenario where replay starts (or the database
reaches consistency) after the recovery target.

[1]
https://www.postgresql.org/message-id/flat/CAA-aLv4K2-9a%2BcvK75dkZkYD1etxpaH%2B9HC0vm9Ebw2up9Co2A%40mail.gmail.com#caa-alv4k2-9a+cvk75dkzkyd1etxpah+9hc0vm9ebw2up9c...@mail.gmail.com

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] patch proposal

2016-08-14 Thread Venkata B Nagothi
Hi,

During the recovery process, It would be nice if PostgreSQL generates an
error by aborting the recovery process (instead of starting-up the cluster)
if the intended recovery target point is not reached and give an option to
DBA to resume the recovery process from where it exactly stopped.

The issue here is, if by any chance, the required WALs are not available or
if there is any WAL missing or corrupted at the restore_command location,
then PostgreSQL recovers until the end of the last available WAL and
starts-up the cluster. At a later point-of-time, if the missing WAL is
found, then, it is not possible to resume the recovery process from where
it stopped previously. The whole backup restoration + recovery process must
be initiated from the beginning which can be time consuming especially when
recovering huge clusters sizing up to Giga bytes and Tera Bytes requiring
1000s of WALs to be copied over for recovery.

Any thoughts ? comments?

I can work on this patch based on the comments/feedback.

Regards,
Venkata B N

Fujitsu Australia


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-04-09 Thread Daniel Verite
Alvaro Herrera wrote:

> I pushed it.

That's awesome, thanks! Also thanks to Pavel who reviewed and helped
continuously when iterating on this feature, and all others who
participed in this thread.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-04-08 Thread Alvaro Herrera
Daniel Verite wrote:

> ISTM that this could be avoided by erroring out for lack of an
> explicit 3rd column as argument. IOW, we wouldn't assume
> that "no column specified" means "show all columns". 
> 
> About simply ripping out the possibility of having multiple
> columns into cells, it's more radical but if that part turns out to
> be more confusing than useful, I don't have a problem
> with removing it.

Okay, I've ripped that out since I wasn't comfortable with the general
idea.  Once you have two data values for the same cell, the new code
raises an error, indicating the corresponding vertical and horizontal
header values; that way it's easy to spot where the problem is.

I also removed the FETCH_COUNT bits; it didn't make a lot of sense to
me.  Like \gexec, the query is executed to completion when in
\crosstabview regardless of FETCH_COUNT.

> The other case of stringing multiple contents into the same cell
> is when different tuples carry (row,column) duplicates.
> I'm not inclined to disallow that case, I think it would go too far
> in guessing what the user expects.
> My expectation for a viewer is that it displays the results as far as
> possible, whatever they are. 

The reason I made this case throw an error is that we can tweak the
behavior later.  I think separating them with newlines is too cute and
will be unusable when you have values that have embedded newlines; you
can imitate that behavior with string_agg(val, E'\n') as I've done in
the regression tests.  One option for improving it would be to have it
add another record, but that requires shifting the values of all cells
by the number of columns (you can see that if you change the border
options, or in HTML output etc).  We can do that later.

> Also, showing such contents in vertically-growing cells as it
> does now allows the user to spot these easily in the grid when
> they happen to be outliers. I'm seeing it as useful in that case.

It's useful, no doubt.

I pushed it.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-04-08 Thread Alvaro Herrera
Robert Haas wrote:

> This seems like it might be converging on some sort of consensus, but
> I'm wondering if we shouldn't push it to 9.7, instead of rushing
> decisions that we will later have trouble changing on
> backward-compatibility grounds.

My intention is to commit this afternoon in the next couple of hours,
and only the most basic case is going to be supported, and the rest of
the cases (concatenation of several fields and several rows, etc) are
just going to throw errors; that way, it will be easy to add more
features later as they are agreed upon.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-04-08 Thread Robert Haas
On Fri, Apr 8, 2016 at 7:23 AM, Daniel Verite  wrote:
> Alvaro Herrera wrote:
>
>> I wonder if the business of appending values of multiple columns
>> separated with spaces is doing us any good.  Why not require that
>> there's a single column in the cell?  If the user wants to put things
>> together, they can use format() or just || the fields together.  What
>> benefit is there to the ' '?  When I ran my first test queries over
>> pg_class I was surprised about this behavior:
>> alvherre=# select * from pg_class
>> alvherre=# \crosstabview relnatts relkind
>
> ISTM that this could be avoided by erroring out for lack of an
> explicit 3rd column as argument. IOW, we wouldn't assume
> that "no column specified" means "show all columns".
>
> About simply ripping out the possibility of having multiple
> columns into cells, it's more radical but if that part turns out to
> be more confusing than useful, I don't have a problem
> with removing it.
>
> The other case of stringing multiple contents into the same cell
> is when different tuples carry (row,column) duplicates.
> I'm not inclined to disallow that case, I think it would go too far
> in guessing what the user expects.
> My expectation for a viewer is that it displays the results as far as
> possible, whatever they are.
> Also, showing such contents in vertically-growing cells as it
> does now allows the user to spot these easily in the grid when
> they happen to be outliers. I'm seeing it as useful in that case.

This seems like it might be converging on some sort of consensus, but
I'm wondering if we shouldn't push it to 9.7, instead of rushing
decisions that we will later have trouble changing on
backward-compatibility grounds.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-04-08 Thread Daniel Verite
Alvaro Herrera wrote:

> I wonder if the business of appending values of multiple columns
> separated with spaces is doing us any good.  Why not require that
> there's a single column in the cell?  If the user wants to put things
> together, they can use format() or just || the fields together.  What
> benefit is there to the ' '?  When I ran my first test queries over
> pg_class I was surprised about this behavior:
> alvherre=# select * from pg_class
> alvherre=# \crosstabview relnatts relkind

ISTM that this could be avoided by erroring out for lack of an
explicit 3rd column as argument. IOW, we wouldn't assume
that "no column specified" means "show all columns". 

About simply ripping out the possibility of having multiple
columns into cells, it's more radical but if that part turns out to
be more confusing than useful, I don't have a problem
with removing it.

The other case of stringing multiple contents into the same cell
is when different tuples carry (row,column) duplicates.
I'm not inclined to disallow that case, I think it would go too far
in guessing what the user expects.
My expectation for a viewer is that it displays the results as far as
possible, whatever they are. 
Also, showing such contents in vertically-growing cells as it
does now allows the user to spot these easily in the grid when
they happen to be outliers. I'm seeing it as useful in that case.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-04-08 Thread Daniel Verite
Alvaro Herrera wrote:

> I messed with that code some more, as it looked unnecessarily
> complicated; please see attached and verify that it still behaves
> sanely.  This needs those regression tests you promised.  I tested a few
> cases and it seems good to me.

I've fixed a couple things over v16:
- avoid passing every cell through psprintf, which happened due
  to cont.cells being pre-initialized to empty strings.
- adjusted the loop freeing allocated_cells

and added the regression tests.

Attached is the diff over v16, tested with make check and valgrind.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/src/bin/psql/crosstabview.c b/src/bin/psql/crosstabview.c
index 0d70e47..a20296e 100644
--- a/src/bin/psql/crosstabview.c
+++ b/src/bin/psql/crosstabview.c
@@ -360,7 +360,6 @@ printCrosstab(const PGresult *results,
 	printQueryOpt popt = pset.popt;
 	printTableContent cont;
 	int			i,
-j,
 rn;
 	char		col_align;
 	int		   *horiz_map;
@@ -414,9 +413,6 @@ printCrosstab(const PGresult *results,
 		cont.cells[k * (num_columns + 1)] = piv_rows[i].name ?
 			piv_rows[i].name :
 			(popt.nullPrint ? popt.nullPrint : "");
-		/* Initialize all cells inside the grid to an empty value */
-		for (j = 0; j < num_columns; j++)
-			cont.cells[k * (num_columns + 1) + j + 1] = "";
 	}
 	cont.cellsadded = num_rows * (num_columns + 1);
 
@@ -506,14 +502,10 @@ printCrosstab(const PGresult *results,
 		 * first column of each row, separate with a newline
 		 * instead.
 		 */
-		if (allocated_cells[idx] != NULL)
-			new_content = psprintf("%s%s%s",
-   allocated_cells[idx],
-   i == 0 ? "\n" : " ",
-   content);
-		else
-			new_content = psprintf("%s", content);
-
+		new_content = psprintf("%s%s%s",
+			   cont.cells[idx],
+			   i == 0 ? "\n" : " ",
+			   content);
 		cont.cells[idx] = new_content;
 		if (allocated_cells[idx] != NULL)
 			pg_free(allocated_cells[idx]);
@@ -528,10 +520,20 @@ printCrosstab(const PGresult *results,
 		}
 	}
 
+	/*
+	 * The non-initialized cells must be set to an empty string for the print
+	 * functions
+	 */
+	for (i = 0; i < cont.cellsadded; i++)
+	{
+		if (cont.cells[i] == NULL)
+			cont.cells[i] = "";
+	}
+
 	printTable(&cont, pset.queryFout, false, pset.logfile);
 	printTableCleanup(&cont);
 
-	for (i = 0; i < num_rows * num_columns; i++)
+	for (i = 0; i < (num_rows + 1) * (num_columns + 1); i++)
 	{
 		if (allocated_cells[i] != NULL)
 			pg_free(allocated_cells[i]);
diff --git a/src/test/regress/expected/psql_crosstabview.out b/src/test/regress/expected/psql_crosstabview.out
new file mode 100644
index 000..df3824a
--- /dev/null
+++ b/src/test/regress/expected/psql_crosstabview.out
@@ -0,0 +1,158 @@
+--
+-- tests for \crosstabview
+--
+CREATE VIEW vct_data as 
+select * from ( values
+   ('v1','h2','foo', 3, '2015-04-01'::date),
+   ('v2','h1','bar', 3, '2015-01-02'),
+   ('v1','h0','baz', NULL, '2015-07-12'),
+   ('v0','h4','qux', 4, '2015-07-15'),
+   ('v0','h4','dbl', -3, '2014-12-15'),
+   ('v0',NULL,'qux', 5, '2014-03-15')
+ ) as l(v,h,c,i,d);
+-- 2 columns with implicit 'X' as 3rd column
+select v,i from vct_data order by 1,2 \crosstabview v i
+ v  | -3 | 4 | 5 | 3 |   
+++---+---+---+---
+ v0 | X  | X | X |   | 
+ v1 ||   |   | X | X
+ v2 ||   |   | X | 
+(3 rows)
+
+-- basic usage with 3 columns
+select v, extract(year from d),count(*) from vct_data
+ group by 1, 2 order by 1,2
+ \crosstabview
+ v  | 2014 | 2015 
++--+--
+ v0 |2 |1
+ v1 |  |2
+ v2 |  |1
+(3 rows)
+
+-- ordered months in horizontal header, enclosed column name
+select v, to_char(d,'Mon') as "month name", extract(month from d) as num,
+ count(*) from vct_data  group by 1,2,3 order by 1
+ \crosstabview v "month name":num 4
+ v  | Jan | Mar | Apr | Jul | Dec 
++-+-+-+-+-
+ v0 | |   1 | |   1 |   1
+ v1 | | |   1 |   1 |
+ v2 |   1 | | | |
+(3 rows)
+
+-- combine contents vertically into the same cell (V/H duplicates)
+select v,h,c from vct_data order by 1,2,3
+ \crosstabview 1 2 3
+ v  | h4  | | h0  | h2  | h1  
++-+-+-+-+-
+ v0 | dbl+| qux | | | 
+| qux | | | | 
+ v1 | | | baz | foo | 
+ v2 | | | | | bar
+(3 rows)
+
+-- horizontal ASC order from window function
+select v,h,c, row_number() over(order by h) as r from vct_data order by 1,3,2
+ \crosstabview v h:r c
+ v  | h0  | h1  | h2  | h4  | 
++-+-+-+-+-
+ v0 | | | | dbl+| qux
+| | | | qux | 
+ v1 | baz | | foo | | 
+ v2 | | bar | | | 
+(3 rows)
+
+-- horizontal DESC order from window function
+select v,h,c, row_number() over(order by h DESC) as r from vct_data order by 1,3,2
+ \crosstabview v h:r c
+ v  |   

Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-04-07 Thread Alvaro Herrera
Daniel Verite wrote:

> > * In the "if (cont.cells[idx] != NULL && cont.cells[idx][0] != '\0')"
> > block (line 497 in the attached), can't we do the same thing by using
> > psprintf?
> 
> In that block, we can't pass a cell contents as a valist and be done with
> that cell, because duplicates of (col value,row value) may happen
> at any iteration of the upper loop over PQntuples(results). Any cell really
> may need reallocation unpredictably until that loop is done, whereas
> psprintf starts by allocating a new buffer unconditionally, so it doesn't
> look
> to me like it could help to simplify that block.

I messed with that code some more, as it looked unnecessarily
complicated; please see attached and verify that it still behaves
sanely.  This needs those regression tests you promised.  I tested a few
cases and it seems good to me.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index d8b9a03..9c5a915 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -990,6 +990,113 @@ testdb=>
   
 
   
+\crosstabview [
+colV
+colH
+[:scolH]
+[colG1[,colG2...]]
+] 
+
+
+Execute the current query buffer (like \g) and shows
+the results inside a crosstab grid.
+The output column colV
+becomes a vertical header
+and the output column colH
+becomes a horizontal header, optionally sorted by ranking data obtained
+from scolH.
+
+colG1[,colG2...]
+is the list of output columns to project into the grid.
+By default, all output columns of the query except 
+colV and
+colH
+are included in this list.
+
+
+
+All columns can be refered to by their position (starting at 1), or by
+their name. Normal case folding and quoting rules apply on column
+names. By default,
+colV corresponds to column 1
+and colH to column 2.
+A query having only one output column cannot be viewed in crosstab, and
+colH must differ from
+colV.
+
+
+
+The vertical header, displayed as the leftmost column,
+contains the deduplicated values found in
+column colV, in the same
+order as in the query results.
+
+
+The horizontal header, displayed as the first row,
+contains the deduplicated values found in
+column colH, in
+the order of appearance in the query results.
+If specified, the optional scolH
+argument refers to a column whose values should be integer numbers
+by which colH will be sorted
+to be positioned in the horizontal header.
+
+
+
+Inside the crosstab grid,
+given a query output with N columns
+(including colV and
+colH),
+for each distinct value x of
+colH
+and each distinct value y of
+colV,
+the contents of a cell located at the intersection
+(x,y) is determined by these rules:
+
+
+
+ if there is no corresponding row in the query results such that the
+ value for colH
+ is x and the value
+ for colV
+ is y, the cell is empty.
+
+
+
+
+
+ if there is exactly one row such that the value
+ for colH
+ is x and the value
+ for colV
+ is y, then the N-2 other
+ columns or the columns listed in
+ colG1[,colG2...]
+ are displayed in the cell, separated between each other by
+ a space character if needed.
+
+ If N=2, the letter X is displayed
+ in the cell as if a virtual third column contained that character.
+
+
+
+
+
+ if there are several corresponding rows, the behavior is identical to
+ the case of one row except that the values coming from different rows
+ are stacked vertically, the different source rows being separated by
+ newline characters inside the cell.
+
+
+
+
+
+
+
+  
+
+  
 \d[S+] [ pattern ]
 
 
@@ -4066,6 +4173,47 @@ first  | 4
 second | four
 
 
+
+  When suitable, query results can be shown in a crosstab representation
+  with the \crosstabview command:
+
+testdb=> SELECT first, second, first > 2 AS gt2 FROM my_table;
+ first | second | ge2 
+---++-
+ 1 | one| f
+ 2 | two| f
+ 3 | three  | t
+ 4 | four   | t
+(4 rows)
+
+testdb=> \crosstabview first second
+ first | one | two | three | four 
+---+-+-+---+--
+ 1 | f   | |   | 
+ 2 | | f   |   | 
+ 3 |

Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-04-07 Thread David G. Johnston
On Thu, Apr 7, 2016 at 1:26 PM, Alvaro Herrera 
wrote:

> I wonder if the business of appending values of multiple columns
> separated with spaces is doing us any good.  Why not require that
> there's a single column in the cell?  If the user wants to put things
> together, they can use format() or just || the fields together.  What
> benefit is there to the ' '?  When I ran my first test queries over
> pg_class I was surprised about this behavior:
>
> alvherre=# select * from pg_class
> alvherre=# \crosstabview relnatts relkind
>
>  relnatts |
>r
> |   t
>   |
> i |
>v
>
> --+++--+---
>26 | pg_statistic 11 11397 0 10 0 2619 0 15 380 15 2840 t f p 0 f f
> f f f f f t n 540 1 {alvherre=arwdDxt/alvherre} (null)
> |
>   |
>   |
>30 | pg_type 11 71 0 10 0 0 0 9 358 9 0 t f p 0 t f f f f f f t n
> 540 1 {=r/alvherre} (null)
>   |
> |
> |
> 3 | pg_user_mapping 11 11633 0 10 0 1418 0 0 0 0 0 t f p 0 t f f f
> f f f t n 540 1 {alvherre=arwdDxt/alvherre} (null)
>+| pg_toast_2604 99 11642 0 10 0 2830 0 0 0 0 0 t f p 0 f f f f f f f t
> n 540 1 (null) (null)+| pg_amop_opr_fam_index 11 0 0 10 403 2654 0 5
> 688 0 0 f f p 0 f f f f f f f t n 0 0 (null) (null)+|
> pg_group 11 11661 0 10 0 11660 0 0 0 0 0 f f p 0 f f t f f f f t n 0 0
> {=r/alvherre} (null)
>   +
>
>
> I'm tempted to rip that out, unless you have a reason not to.
>
> In fact, I think even the grouping of values of multiple rows with \n is
> not terribly great either.  Why not just require people to group the
> values beforehand?  You can use "string_agg(column, E'\n')" to get the
> same behavior, plus you can do other things such as sum() etc.
>

​Went and looked at the examples page and at first blush it seems like this
module only understands text.  My specific concern here is dealing with
"numbers-as-text" sorting.​

​As to the question of behavior when multiple columns (and rows?) are
present: ​we need some sort of default do we not.  Nothing is precluding
the user from doing their own aggregates and limiting the select-list.
That said I'm more inclined to error if the input data in not unique on
(v,h).  I feel the possibility of a user query bug going unnoticed in that
scenario is reasonably large since its likely that only some combinations
of duplicates appear.  I'm a bit less tentative regarding column
concatenation since I would expect that nearly every cell involved in the
output would be noticeably affected.  Though, if we are going to protect
against extra rows extending that to protect against extra columns seems
fair.

Another option is, possibly conditioned on the first two columns being the
headers, to only take the column in the third position (or, the first
unassigned column).and display it.

Otherwise if multiple candidate columns are present and none are chosen for
the cell we could just error and force the user to explicitly choose.

The concatenation behavior seems like the least useful default.  I'm
inclined to favor the first unassigned input column.  And never allow (v,h)
is violate uniqueness.

David J.


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-04-07 Thread Alvaro Herrera
Alvaro Herrera wrote:
> Daniel Verite wrote:

> > > * A few examples in docs.  The psql manpage should have at least two new
> > > examples showing the crosstab features, one with the simplest case you
> > > can think of, and another one showing all the features.
> > 
> > Added that in the EXAMPLES section at the very end of the manpage.
> 
> Ok.  Seems a bit too short to me, and I don't like the fact that you
> can't actually run it because you need to create the my_table
> beforehand.  I think it'd be better if you used a VALUES clause there,
> so that the reader can cut'n paste into psql to start to play with the
> feature.

Oh, I noticed now that my_table was created by previous examples.
Nevermind.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-04-07 Thread Alvaro Herrera
I wonder if the business of appending values of multiple columns
separated with spaces is doing us any good.  Why not require that
there's a single column in the cell?  If the user wants to put things
together, they can use format() or just || the fields together.  What
benefit is there to the ' '?  When I ran my first test queries over
pg_class I was surprised about this behavior:

alvherre=# select * from pg_class
alvherre=# \crosstabview relnatts relkind

 relnatts | 
  r|
   t
|i  
   |
   v
   
--+++--+---
   26 | pg_statistic 11 11397 0 10 0 2619 0 15 380 15 2840 t f p 0 f f f f 
f f f t n 540 1 {alvherre=arwdDxt/alvherre} (null)  |   

 |  
| 
   30 | pg_type 11 71 0 10 0 0 0 9 358 9 0 t f p 0 t f f f f f f t n 540 1 
{=r/alvherre} (null)|   

 |  
| 
3 | pg_user_mapping 11 11633 0 10 0 1418 0 0 0 0 0 t f p 0 t f f f f f 
f t n 540 1 {alvherre=arwdDxt/alvherre} (null) +| 
pg_toast_2604 99 11642 0 10 0 2830 0 0 0 0 0 t f p 0 f f f f f f f t n 540 1 
(null) (null)+| pg_amop_opr_fam_index 11 0 0 10 403 2654 0 5 688 0 0 f f p 
0 f f f f f f f t n 0 0 (null) (null)+| pg_group 11 11661 0 10 
0 11660 0 0 0 0 0 f f p 0 f f t f f f f t n 0 0 {=r/alvherre} (null)
  +


I'm tempted to rip that out, unless you have a reason not to.

In fact, I think even the grouping of values of multiple rows with \n is
not terribly great either.  Why not just require people to group the
values beforehand?  You can use "string_agg(column, E'\n')" to get the
same behavior, plus you can do other things such as sum() etc.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-04-07 Thread Alvaro Herrera
Daniel Verite wrote:

> > regression=# select * from pg_class \crosstabview relnatts 
> > \crosstabview: missing second argument
> > regression-# 
> 
> Fixed. This was modelled after the behavior of:
>  select 1 \badcommand
> but I've changed to mimic what happens with:
>   select 1 \g /some/invalid/path
> the query buffer is not discarded by the error but the prompt
> is ready for a fresh new command.

Works for me.

> > * A few examples in docs.  The psql manpage should have at least two new
> > examples showing the crosstab features, one with the simplest case you
> > can think of, and another one showing all the features.
> 
> Added that in the EXAMPLES section at the very end of the manpage.

Ok.  Seems a bit too short to me, and I don't like the fact that you
can't actually run it because you need to create the my_table
beforehand.  I think it'd be better if you used a VALUES clause there,
so that the reader can cut'n paste into psql to start to play with the
feature.

> > * In the "if (cont.cells[idx] != NULL && cont.cells[idx][0] != '\0')"
> > block (line 497 in the attached), can't we do the same thing by using
> > psprintf?
> 
> In that block, we can't pass a cell contents as a valist and be done with
> that cell, because duplicates of (col value,row value) may happen
> at any iteration of the upper loop over PQntuples(results). Any cell really
> may need reallocation unpredictably until that loop is done, whereas
> psprintf starts by allocating a new buffer unconditionally, so it doesn't
> look to me like it could help to simplify that block.

I don't know what you mean, but here's what I meant.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/bin/psql/crosstabview.c b/src/bin/psql/crosstabview.c
index b3510b9..0216dae 100644
--- a/src/bin/psql/crosstabview.c
+++ b/src/bin/psql/crosstabview.c
@@ -496,12 +496,12 @@ printCrosstab(const PGresult *results, int num_columns,
 {
 	src_col = colsG[i];
 
-	content = (!PQgetisnull(results, rn, src_col)) ?
+	content = !PQgetisnull(results, rn, src_col) ?
 		PQgetvalue(results, rn, src_col) :
 		(popt.nullPrint ? popt.nullPrint : "");
 }
 
-if (cont.cells[idx] != NULL && cont.cells[idx][0] != '\0')
+if (cont.cells[idx] != NULL)
 {
 	/*
 	 * Multiple values for the same (row,col) are projected
@@ -509,12 +509,9 @@ printCrosstab(const PGresult *results, int num_columns,
 	 * previous content of the cell from the new value by a
 	 * newline.
 	 */
-	int			content_size;
 	char	   *new_content;
 	int			idx2;
 
-	content_size = strlen(cont.cells[idx]) + 2 + strlen(content) + 1;
-
 	/*
 	 * idx2 is an index into allocated_cells. It differs from
 	 * idx (index into cont.cells), because vertical and
@@ -524,34 +521,17 @@ printCrosstab(const PGresult *results, int num_columns,
 	idx2 = (row_number * num_columns) + col_number;
 
 	if (allocated_cells[idx2] != NULL)
-	{
-		new_content = pg_realloc(allocated_cells[idx2], content_size);
-	}
+		new_content = psprintf("%s%s%s",
+			   allocated_cells[idx2],
+			   i == 0 ? "\n" : " ",
+			   content);
 	else
-	{
-		/*
-		 * At this point, cont.cells[idx] still contains a
-		 * PQgetvalue() pointer.  Just after, it will contain
-		 * a new pointer maintained in allocated_cells[], and
-		 * freed at the end of this function.
-		 */
-		new_content = pg_malloc(content_size);
-		strcpy(new_content, cont.cells[idx]);
-	}
-	cont.cells[idx] = new_content;
-	allocated_cells[idx2] = new_content;
+		new_content = psprintf("%s", content);
 
-	/*
-	 * Contents that are on adjacent columns in the source
-	 * results get separated by one space in the target.
-	 * Contents that are on different rows in the source get
-	 * separated by newlines in the target.
-	 */
-	if (i == 0)
-		strcat(new_content, "\n");
-	else
-		strcat(new_content, " ");
-	strcat(new_content, content);
+	cont.cells[idx] = new_content;
+	if (allocated_cells[idx2] != NULL)
+		pg_free(allocated_cells[idx2]);
+	allocated_cells[idx2] = new_content;
 }
 else
 {

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-04-07 Thread Daniel Verite
Alvaro Herrera wrote:


Thanks for looking into that patch!

> regression=# select * from pg_class \crosstabview relnatts 
> \crosstabview: missing second argument
> regression-# 

Fixed. This was modelled after the behavior of:
 select 1 \badcommand
but I've changed to mimic what happens with:
  select 1 \g /some/invalid/path
the query buffer is not discarded by the error but the prompt
is ready for a fresh new command.

> alvherre=# select * from pg_class \crosstabview relnatts relkinda
> Invalid column name: relkinda
> alvherre=# select 1;
> The query must return at least two columns to be shown in crosstab

Definitely a bug. Fixed.

Also fixed a one-off bug with quoted columns: in parseColumnRefs(),
first call to PQmblen(), I wrongly assumed that
PQmblen("", ..) returns 0, whereas in fact it returns 1.

> * A few examples in docs.  The psql manpage should have at least two new
> examples showing the crosstab features, one with the simplest case you
> can think of, and another one showing all the features.

Added that in the EXAMPLES section at the very end of the manpage.

> * Add regression test cases somewhere for the regression database.
> Probably use "FROM tenk1 WHERE hundred < 5", which provides you with 500
> rows, enough for many interesting games.  Make sure to test all the
> provided features.  I would use a new psql.sql file for this.

Looking into regression tests, not yet done.

> * How did you come up with the 1600 value?  Whatever it is, please use a
> #define instead of hardcoding it.

Done with accompanying comment in crosstabview.h 

> * In the "if (cont.cells[idx] != NULL && cont.cells[idx][0] != '\0')"
> block (line 497 in the attached), can't we do the same thing by using
> psprintf?

In that block, we can't pass a cell contents as a valist and be done with
that cell, because duplicates of (col value,row value) may happen
at any iteration of the upper loop over PQntuples(results). Any cell really
may need reallocation unpredictably until that loop is done, whereas
psprintf starts by allocating a new buffer unconditionally, so it doesn't
look
to me like it could help to simplify that block.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index d8b9a03..1072733 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -990,6 +990,113 @@ testdb=>
   
 
   
+\crosstabview [
+colV
+colH
+[:scolH]
+[colG1[,colG2...]]
+] 
+
+
+Execute the current query buffer (like \g) and shows
+the results inside a crosstab grid.
+The output column colV
+becomes a vertical header
+and the output column colH
+becomes a horizontal header, optionally sorted by ranking data obtained
+from scolH.
+
+colG1[,colG2...]
+is the list of output columns to project into the grid.
+By default, all output columns of the query except 
+colV and
+colH
+are included in this list.
+
+
+
+All columns can be refered to by their position (starting at 1), or by
+their name. Normal case folding and quoting rules apply on column
+names. By default,
+colV corresponds to column 1
+and colH to column 2.
+A query having only one output column cannot be viewed in crosstab, and
+colH must differ from
+colV.
+
+
+
+The vertical header, displayed as the leftmost column,
+contains the deduplicated values found in
+column colV, in the same
+order as in the query results.
+
+
+The horizontal header, displayed as the first row,
+contains the deduplicated values found in
+column colH, in
+the order of appearance in the query results.
+If specified, the optional scolH
+argument refers to a column whose values should be integer numbers
+by which colH will be sorted
+to be positioned in the horizontal header.
+
+
+
+Inside the crosstab grid,
+given a query output with N columns
+(including colV and
+colH),
+for each distinct value x of
+colH
+and each distinct value y of
+colV,
+the contents of a cell located at the intersection
+(x,y) is determined by these rules:
+
+
+
+ if there is no corresponding row in the query results such that the
+ value for colH
+ is x and the value
+ for colV
+ is y, the cell is empty.
+
+
+
+
+
+ if there is exactly one row such that the value
+ for colH
+ is x and the value
+ for colV
+ is y, then the N-2 other
+ 

Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-04-06 Thread Alvaro Herrera
I've been looking at this patch.  First thing was to rebase on top of
recent psql code restructuring; second, pgindent; third, reordered the
code in crosstabview.c more sensibly (had to add prototypes).  New
version attached.

Then I looked at the docs to try to figure out exactly how it works.
I'm surprised that there's not a single example added to the psql
manpage.  Please add one.

I then tested it a bit, "kick the tires" so to speak.  I noticed that
error handling is broken.  For instance, observe the query prompt after
the error:

regression=# select * from pg_class \crosstabview relnatts 
\crosstabview: missing second argument
regression-# 

At this point the query buffer contains the query (you can see it with
\e), which seems bogus to me.  The query buffer needs to be reset.
Compare \gexec:
alvherre=# select 1 \gexec
ERROR:  error de sintaxis en o cerca de «1»
LÍNEA 1: 1
 ^
alvherre=# 


Also, using bogus column names as arguments cause state to get all
bogus:

alvherre=# select * from pg_class \crosstabview relnatts relkinda
Invalid column name: relkinda
alvherre=# select 1;
The query must return at least two columns to be shown in crosstab

Note that the second query is not crosstab at all, yet the error message
is entirely bogus.  This one is probably the same bug:

alvherre=# select 'one', 'two';
Invalid column name: relnatts

Apparently, once in that state, not even a successful query crosstab
display resets the state correctly:

alvherre=# select * from pg_class \crosstabview relnatts relkinda
Invalid column name: relkinda
alvherre=# select 'one' as relnatts, 'two' as relkinda \crosstabview
 relnatts | two 
--+-
 one  | X
(1 fila)

alvherre=# select 1;
The query must return at least two columns to be shown in crosstab

Please fix this.


Some additional items:

* A few examples in docs.  The psql manpage should have at least two new
examples showing the crosstab features, one with the simplest case you
can think of, and another one showing all the features.

* Add regression test cases somewhere for the regression database.
Probably use "FROM tenk1 WHERE hundred < 5", which provides you with 500
rows, enough for many interesting games.  Make sure to test all the
provided features.  I would use a new psql.sql file for this.

* How did you come up with the 1600 value?  Whatever it is, please use a
#define instead of hardcoding it.

* In the "if (cont.cells[idx] != NULL && cont.cells[idx][0] != '\0')"
block (line 497 in the attached), can't we do the same thing by using
psprintf?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index d8b9a03..536141c 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -990,6 +990,113 @@ testdb=>
   
 
   
+\crosstabview [
+colV
+colH
+[:scolH]
+[colG1[,colG2...]]
+] 
+
+
+Execute the current query buffer (like \g) and shows
+the results inside a crosstab grid.
+The output column colV
+becomes a vertical header
+and the output column colH
+becomes a horizontal header, optionally sorted by ranking data obtained
+from scolH.
+
+colG1[,colG2...]
+is the list of output columns to project into the grid.
+By default, all output columns of the query except 
+colV and
+colH
+are included in this list.
+
+
+
+All columns can be refered to by their position (starting at 1), or by
+their name. Normal case folding and quoting rules apply on column
+names. By default,
+colV corresponds to column 1
+and colH to column 2.
+A query having only one output column cannot be viewed in crosstab, and
+colH must differ from
+colV.
+
+
+
+The vertical header, displayed as the leftmost column,
+contains the deduplicated values found in
+column colV, in the same
+order as in the query results.
+
+
+The horizontal header, displayed as the first row,
+contains the deduplicated values found in
+column colH, in
+the order of appearance in the query results.
+If specified, the optional scolH
+argument refers to a column whose values should be integer numbers
+by which colH will be sorted
+to be positioned in the horizontal header.
+
+
+
+Inside the crosstab grid,
+given a query output with N columns
+(including colV and
+colH),
+for each distinct value x of
+colH
+and each distinct value y of
+colV,
+the contents of a cell located at the intersection
+(x,y) is determined by these rules:
+ 

Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-21 Thread Alvaro Herrera
Robert Haas wrote:
> On Sun, Mar 20, 2016 at 5:27 PM, Pavel Stehule  
> wrote:
> > From my perspective, it is ready for commiter. Daniel solved the most big
> > issues.
> 
> OK, so that brings us back to: is there any committer who likes this
> enough to want to look at committing it?  My view hasn't changed much
> since 
> http://www.postgresql.org/message-id/ca+tgmoz4yaduq9j8xtgrbh868jh2nj_nw_qgkxb32cedsvt...@mail.gmail.com

I volunteer for that, but it'll be a few days so if anyone else is
interested, feel free.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-21 Thread Robert Haas
On Sun, Mar 20, 2016 at 5:27 PM, Pavel Stehule  wrote:
> From my perspective, it is ready for commiter. Daniel solved the most big
> issues.

OK, so that brings us back to: is there any committer who likes this
enough to want to look at committing it?  My view hasn't changed much
since 
http://www.postgresql.org/message-id/ca+tgmoz4yaduq9j8xtgrbh868jh2nj_nw_qgkxb32cedsvt...@mail.gmail.com

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-20 Thread Pavel Stehule
Hi

2016-03-19 16:31 GMT+01:00 Alvaro Herrera :

> Pavel Stehule wrote:
>
> > Can I do review?
>
> Of course.
>

I did review of last patch. I had to do small changes to run the code due
last Tom's changes in psql. Updated patch is attached.

The last changes in this patch are two:

1. Remove strange server side sorting
2. Cleaning/reducing interface

Other code is +/- without changes. There was lot of discussion in this
thread, I would not to repeat it.

I'll comment the changes:

@1 using server side sorting was really generic, but strange. Now, the
crosstabview works without it without any significant functionality
degradation.

@2 interface is minimalist - but good enough - I am thinking so it is good
start point. I was able to run my examples without problems. The previous
API was more comfortable - "+","-" symbols allows to specify order quickly,
but without a agreement we can live without this feature. Now, a order of
data is controlled fully by SQL. crosstabview does data visualization only.
I have not any objection to this last design. It is reduced to minimum, but
still it works well.

* All regress tests passed
* A code is well and well commented
* No new warnings or compilation issues
* Documentation is clean

I have two minor notes, can be fixed simply, if we accept this last design:

1. can be nice if documentation will contains one example
2. some regress tests

>From my perspective, it is ready for commiter. Daniel solved the most big
issues.

Regards

Pavel


> --
> Álvaro Herrerahttp://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
new file mode 100644
index 8a85804..da0621b
*** a/doc/src/sgml/ref/psql-ref.sgml
--- b/doc/src/sgml/ref/psql-ref.sgml
*** testdb=>
*** 990,995 
--- 990,1102 

  

+ \crosstabview [
+ colV
+ colH
+ [:scolH]
+ [colG1[,colG2...]]
+ ] 
+ 
+ 
+ Execute the current query buffer (like \g) and 
shows
+ the results inside a crosstab grid.
+ The output column colV
+ becomes a vertical header
+ and the output column colH
+ becomes a horizontal header, optionally sorted by ranking data 
obtained
+ from scolH.
+ 
+ colG1[,colG2...]
+ is the list of output columns to project into the grid.
+ By default, all output columns of the query except 
+ colV and
+ colH
+ are included in this list.
+ 
+ 
+ 
+ All columns can be refered to by their position (starting at 1), or by
+ their name. Normal case folding and quoting rules apply on column
+ names. By default,
+ colV corresponds to 
column 1
+ and colH to column 2.
+ A query having only one output column cannot be viewed in crosstab, 
and
+ colH must differ from
+ colV.
+ 
+ 
+ 
+ The vertical header, displayed as the leftmost column,
+ contains the deduplicated values found in
+ column colV, in the same
+ order as in the query results.
+ 
+ 
+ The horizontal header, displayed as the first row,
+ contains the deduplicated values found in
+ column colH, in
+ the order of appearance in the query results.
+ If specified, the optional scolH
+ argument refers to a column whose values should be integer numbers
+ by which colH will be 
sorted
+ to be positioned in the horizontal header.
+ 
+ 
+ 
+ Inside the crosstab grid,
+ given a query output with N columns
+ (including colV and
+ colH),
+ for each distinct value x of
+ colH
+ and each distinct value y of
+ colV,
+ the contents of a cell located at the intersection
+ (x,y) is determined by these rules:
+ 
+ 
+ 
+  if there is no corresponding row in the query results such that the
+  value for colH
+  is x and the value
+  for colV
+  is y, the cell is empty.
+ 
+ 
+ 
+ 
+ 
+  if there is exactly one row such that the value
+  for colH
+  is x and the value
+  for colV
+  is y, then the N-2 other
+  columns or the columns listed in
+  colG1[,colG2...]
+  are displayed in the cell, separated between each other by
+  a space character if needed.
+ 
+  If N=2, the letter X is 
displayed
+  in the cell as if a virtual third column contained that character.
+ 
+ 
+ 
+ 
+ 
+  if there are several corresponding rows, the behavior is identical to
+  the case of one row except that the values coming from dif

Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-19 Thread Alvaro Herrera
Pavel Stehule wrote:
 
> Can I do review?

Of course.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-19 Thread Pavel Stehule
2016-03-19 15:45 GMT+01:00 Robert Haas :

> On Mon, Mar 14, 2016 at 2:55 PM, Robert Haas 
> wrote:
> > On Sat, Mar 12, 2016 at 10:34 AM, Daniel Verite 
> wrote:
> >>> But worse than either of  those things, there is no real
> >>> agreement on what the overall design of this feature
> >>> should be.
> >>
> >> The part in the design that raised concerns upthread is
> >> essentially how headers sorting is exposed to the user and
> >> implemented.
> >>
> >> As suggested in [1], I've made some drastic changes in the
> >> attached patch to take the comments (from Dean R., Tom L.)
> >> into account.
> >> [ ... lengthy explanation ... ]
> >> - also NULLs are no longer excluded from headers, per Peter E.
> >>   comment in [2].
> >
> > Dean, Tom, Peter, what do you think of the new version?
>
> Is anyone up for re-reviewing this?  If not, I think we're going to
> have to reject this for lack of interest.
>

Can I do review?

Pavel


>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-19 Thread Robert Haas
On Mon, Mar 14, 2016 at 2:55 PM, Robert Haas  wrote:
> On Sat, Mar 12, 2016 at 10:34 AM, Daniel Verite  
> wrote:
>>> But worse than either of  those things, there is no real
>>> agreement on what the overall design of this feature
>>> should be.
>>
>> The part in the design that raised concerns upthread is
>> essentially how headers sorting is exposed to the user and
>> implemented.
>>
>> As suggested in [1], I've made some drastic changes in the
>> attached patch to take the comments (from Dean R., Tom L.)
>> into account.
>> [ ... lengthy explanation ... ]
>> - also NULLs are no longer excluded from headers, per Peter E.
>>   comment in [2].
>
> Dean, Tom, Peter, what do you think of the new version?

Is anyone up for re-reviewing this?  If not, I think we're going to
have to reject this for lack of interest.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-14 Thread Robert Haas
On Sat, Mar 12, 2016 at 10:34 AM, Daniel Verite  wrote:
>> But worse than either of  those things, there is no real
>> agreement on what the overall design of this feature
>> should be.
>
> The part in the design that raised concerns upthread is
> essentially how headers sorting is exposed to the user and
> implemented.
>
> As suggested in [1], I've made some drastic changes in the
> attached patch to take the comments (from Dean R., Tom L.)
> into account.
> [ ... lengthy explanation ... ]
> - also NULLs are no longer excluded from headers, per Peter E.
>   comment in [2].

Dean, Tom, Peter, what do you think of the new version?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-14 Thread Daniel Verite
Jim Nasby wrote:

> Ultimately I'd really like some way to remove/reduce the restriction of 
> result set definitions needing to be determined at plan time. That would 
> open the door for server-side crosstab/pivot as well a a host of other 
> things (such as dynamically turning a hstore/json/xml field into a 
> recordset).

> Ultimately I'd really like some way to remove/reduce the restriction of 
> result set definitions needing to be determined at plan time. That would 
> open the door for server-side crosstab/pivot as well a a host of other 
> things (such as dynamically turning a hstore/json/xml field into a 
> recordset).

That would go against a basic expectation of prepared statements, the
fact that queries can be parsed/prepared without any part of them
being executed.

For a dynamic pivot, but probably also for the other examples you
have in mind, the SQL engine wouldn't be able to determine the output
columns without executing a least a subselect to look inside some
table(s).

I suspect that the implications of this would be so far reaching and
problematic that it will just not happen.

It seems to me that a dynamic pivot will always consist of
two SQL queries that can never be combined into one,
unless using a workaround à la Oracle, which encapsulates the
entire dynamic resultset into an XML blob as output.
The problem here being that the client-side tools
that people routinely use are not equipped to process it anyway;
at least that's what  I find by anecdotal evidence for instance in:
https://community.oracle.com/thread/2133154?tstart=0
 or
http://stackoverflow.com/questions/19298424
 or
https://community.oracle.com/thread/2388982?tstart=0


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-14 Thread Alvaro Herrera
Jim Nasby wrote:
> On 3/13/16 12:48 AM, Pavel Stehule wrote:
> >crosstabview is really visualization tool. **But now, there are not any
> >other tool available from terminal.** So this can be significant help to
> >all people who would to use this functionality.
> 
> Not just the terminal either. Offhand I'm not aware of *any* fairly simple
> tool that provides crosstab. There's a bunch of complicated/expensive BI
> tools that do, but unless you've gone through the trouble of getting one of
> those setup you're currently pretty stuck.

I'm definitely +1 for this feature in psql also.

Some years ago we had a discussion about splitting psql in two parts, a
bare-bones one which would help script-writing and another one with
fancy features; we decided to keep one tool to rule them all and made
the implicit decision that we would grow exotic, sophisticated features
into psql.  ISTM that this patch is going in that direction.

> Ultimately I'd really like some way to remove/reduce the restriction of
> result set definitions needing to be determined at plan time. That would
> open the door for server-side crosstab/pivot as well a a host of other
> things (such as dynamically turning a hstore/json/xml field into a
> recordset).

That seems so far down the road that I don't think it should block the
psql feature being proposed in this thread, but yes I would like that
one too.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-13 Thread Jim Nasby

On 3/13/16 12:48 AM, Pavel Stehule wrote:

crosstabview is really visualization tool. **But now, there are not any
other tool available from terminal.** So this can be significant help to
all people who would to use this functionality.


Not just the terminal either. Offhand I'm not aware of *any* fairly 
simple tool that provides crosstab. There's a bunch of 
complicated/expensive BI tools that do, but unless you've gone through 
the trouble of getting one of those setup you're currently pretty stuck.


Ultimately I'd really like some way to remove/reduce the restriction of 
result set definitions needing to be determined at plan time. That would 
open the door for server-side crosstab/pivot as well a a host of other 
things (such as dynamically turning a hstore/json/xml field into a 
recordset).

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-12 Thread Pavel Stehule
2016-03-11 14:49 GMT+01:00 Robert Haas :

> On Thu, Feb 18, 2016 at 9:23 AM, Daniel Verite 
> wrote:
> > Dean Rasheed wrote:
> >
> >> If I want to sort the rows coming out of a query, my first thought
> >> is always going to be to add/adjust the query's ORDER BY clause, not
> >> use some weird +/- psql syntax.
> >
> > About the vertical sort, I agree on all your points.
> > It's best to rely on ORDER BY for all the reasons mentioned,
> > as opposed to a separate sort in a second step.
> >
> > But you're considering the case when a user is designing
> > or adapting a query for the purpose of crosstab
> > viewing. As mentioned in my previous reply (about the
> > methods to achieve horizontal sort), that scenario is not really
> > what motivates the feature in the first place.
> >
> > If removing that sort option is required to move forward
> > with the patch because it's controversial, so be it,
> > but overall I don't see this as a benefit for the end user,
> > it's just an option.
>
> Discussion on this patch seems to have died off.  I'm probably not
> going to win any popularity contests for saying this, but I think we
> should reject this patch.  I don't feel like this is really a psql
> feature: it's a powerful data visualization tool which we're proposing
> to jam into psql.  I don't think that's psql's purpose.  I also think
> it's quite possible that there could be an unbounded number of
> slightly different things that people want here, and if we take this
> one and a couple more, the code for these individual features could
> come to be larger than all of psql, even though probably 95% of psql
> users would never use any of those.
>

crosstabview is really visualization tool. **But now, there are not any
other tool available from terminal.** So this can be significant help to
all people who would to use this functionality.

The psql has lot of features for 5% users. Currently it is famous not as
"bloated software" but like most comfortable sql console on the world. The
implementation of crosstabview is not complex and with last Daniel's
modification the complexity is less.

The crosstabview is not 100% equal to ANSI SQL PIVOT clause. The ANSI SQL
command is much more rigid (it is one stage statement with predefined
columns), so argument of duplicate implementation one things is not valid.
Probably we would not implement non ANSI SQL feature on server.

Regards

Pavel


>
> Now, that having been said, if other people want this feature to go in
> and are willing to do the work to get it in, I've said my piece and
> won't complain further.  There are a couple of committers who have
> taken positive interest in this thread, so that's good.  However,
> there are also a couple of committers who have expressed doubts
> similar to mine, so that's not so good.  But worse than either of
> those things, there is no real agreement on what the overall design of
> this feature should be.  Everybody wants something a little different,
> for different reasons.  If we can't come to an agreement, more or less
> immediately, on what to try to get into 9.6, then this can't go into
> this release.  Whether it should go into a future release is a
> question we can leave for another time.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-12 Thread Daniel Verite
Robert Haas wrote:

> But worse than either of  those things, there is no real
> agreement on what the overall design of this feature
> should be.

The part in the design that raised concerns upthread is
essentially how headers sorting is exposed to the user and
implemented.

As suggested in [1], I've made some drastic changes in the
attached patch to take the comments (from Dean R., Tom L.)
into account. The idea is to limit to the bare minimum
the involvement of psql in sorting:

- the +/- syntax goes away

- the possibility of post-sorting the values through a backdoor
  query goes away too, for both headers.

- the vertical order of the crosstab view is now driven solely by the
  order  in the query

- the order of the horizontal header can be optionally specified
  by a column expected to contain an integer, with the syntax
  \crosstabview colv colh:scolh [other cols]
  which means "colh" will be sorted by "scolh".
  It still defaults to whatever order "colh" comes in from the results

  Concerning the optional "scolh", there are cases where it might pre-exist
  naturally, such as a month number going in pair with a month name.
  In other cases,  a user may add it as a kind of "synthetic column"
  by way of a window function, for example:
SELECT ...other columns...,
   (row_number() over(order by something [order options]) as scolh
   FROM...
   Only the relative order of scolh values is taken into account, the value
itself
   has no meaning for crosstabview.

- also NULLs are no longer excluded from headers, per Peter E.
  comment in [2].


[1]
http://www.postgresql.org/message-id/3d513263-104b-41e3-b1c7-4ad4bd99c491@mm

[2] http://www.postgresql.org/message-id/56c4e344.6070...@gmx.net


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 8a85804..da0621b 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -990,6 +990,113 @@ testdb=>
   
 
   
+\crosstabview [
+colV
+colH
+[:scolH]
+[colG1[,colG2...]]
+] 
+
+
+Execute the current query buffer (like \g) and shows
+the results inside a crosstab grid.
+The output column colV
+becomes a vertical header
+and the output column colH
+becomes a horizontal header, optionally sorted by ranking data obtained
+from scolH.
+
+colG1[,colG2...]
+is the list of output columns to project into the grid.
+By default, all output columns of the query except 
+colV and
+colH
+are included in this list.
+
+
+
+All columns can be refered to by their position (starting at 1), or by
+their name. Normal case folding and quoting rules apply on column
+names. By default,
+colV corresponds to column 1
+and colH to column 2.
+A query having only one output column cannot be viewed in crosstab, and
+colH must differ from
+colV.
+
+
+
+The vertical header, displayed as the leftmost column,
+contains the deduplicated values found in
+column colV, in the same
+order as in the query results.
+
+
+The horizontal header, displayed as the first row,
+contains the deduplicated values found in
+column colH, in
+the order of appearance in the query results.
+If specified, the optional scolH
+argument refers to a column whose values should be integer numbers
+by which colH will be sorted
+to be positioned in the horizontal header.
+
+
+
+Inside the crosstab grid,
+given a query output with N columns
+(including colV and
+colH),
+for each distinct value x of
+colH
+and each distinct value y of
+colV,
+the contents of a cell located at the intersection
+(x,y) is determined by these rules:
+
+
+
+ if there is no corresponding row in the query results such that the
+ value for colH
+ is x and the value
+ for colV
+ is y, the cell is empty.
+
+
+
+
+
+ if there is exactly one row such that the value
+ for colH
+ is x and the value
+ for colV
+ is y, then the N-2 other
+ columns or the columns listed in
+ colG1[,colG2...]
+ are displayed in the cell, separated between each other by
+ a space character if needed.
+
+ If N=2, the letter X is displayed
+ in the cell as if a virtual third column contained that character.
+
+
+
+
+
+ if there are several corresponding rows, the behavior is identical to
+ 

Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-11 Thread Robert Haas
On Thu, Feb 18, 2016 at 9:23 AM, Daniel Verite  wrote:
> Dean Rasheed wrote:
>
>> If I want to sort the rows coming out of a query, my first thought
>> is always going to be to add/adjust the query's ORDER BY clause, not
>> use some weird +/- psql syntax.
>
> About the vertical sort, I agree on all your points.
> It's best to rely on ORDER BY for all the reasons mentioned,
> as opposed to a separate sort in a second step.
>
> But you're considering the case when a user is designing
> or adapting a query for the purpose of crosstab
> viewing. As mentioned in my previous reply (about the
> methods to achieve horizontal sort), that scenario is not really
> what motivates the feature in the first place.
>
> If removing that sort option is required to move forward
> with the patch because it's controversial, so be it,
> but overall I don't see this as a benefit for the end user,
> it's just an option.

Discussion on this patch seems to have died off.  I'm probably not
going to win any popularity contests for saying this, but I think we
should reject this patch.  I don't feel like this is really a psql
feature: it's a powerful data visualization tool which we're proposing
to jam into psql.  I don't think that's psql's purpose.  I also think
it's quite possible that there could be an unbounded number of
slightly different things that people want here, and if we take this
one and a couple more, the code for these individual features could
come to be larger than all of psql, even though probably 95% of psql
users would never use any of those.

Now, that having been said, if other people want this feature to go in
and are willing to do the work to get it in, I've said my piece and
won't complain further.  There are a couple of committers who have
taken positive interest in this thread, so that's good.  However,
there are also a couple of committers who have expressed doubts
similar to mine, so that's not so good.  But worse than either of
those things, there is no real agreement on what the overall design of
this feature should be.  Everybody wants something a little different,
for different reasons.  If we can't come to an agreement, more or less
immediately, on what to try to get into 9.6, then this can't go into
this release.  Whether it should go into a future release is a
question we can leave for another time.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-18 Thread Daniel Verite
Dean Rasheed wrote:

> If I want to sort the rows coming out of a query, my first thought
> is always going to be to add/adjust the query's ORDER BY clause, not
> use some weird +/- psql syntax.

About the vertical sort, I agree on all your points.
It's best to rely on ORDER BY for all the reasons mentioned,
as opposed to a separate sort in a second step.

But you're considering the case when a user is designing
or adapting a query for the purpose of crosstab
viewing. As mentioned in my previous reply (about the
methods to achieve horizontal sort), that scenario is not really
what motivates the feature in the first place.

If removing that sort option is required to move forward
with the patch because it's controversial, so be it,
but overall I don't see this as a benefit for the end user,
it's just an option.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-18 Thread Daniel Verite
Peter Eisentraut wrote:

> On 2/9/16 11:21 AM, Daniel Verite wrote:
> > Note that NULL values in the column that pivots are discarded
> > by \crosstabview, because NULL as the name of a column does not
> > make sense.
> 
> Why not?
> 
> All you're doing is printing it out, and psql is quite capable of
> printing a null value.

Initially it's by analogy with the crosstab SRF, but it's true
that the same principle does not have to apply to crosstabview.

The code could set in the header whatever text "pset null" is set to,
at the place where a pivoted NULL would be supposed to go
if it was not filtered out in the first place.

I'll consider implementing that change if there's no objection.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-18 Thread Daniel Verite
Daniel Verite wrote:

> > >   ORDER BY name
> > > \crosstabview cols = (select to_char(d, 'Mon') from
> > > generate_series('2000-01-01'::date, '2000-12-01', '1 month') d)
> > 
> > My concern with that is that often you don't know what the columns will 
> > be, because you don't know what exact data the query will produce. So to 
> > use this syntax you'd have to re-create a huge chunk of the original 
> > query. :(
> 
> Also, if that additional query refers to tables, it should be executed
> with the same data visibility as the main query. Doesn't that mean
> that both queries should happen within the same repeatable
> read transaction?
> 
> Another  impractical aspect of this approach is that a
> meta-command invocation in psql must fit on a single line, so
> queries containing newlines are not acceptable as argument.
> This problem exists with "\copy (select...) to ..."  already.

Thinking more about that, it occurs to me that if the sort must come
from a user-supplied bit of SQL, it would be simpler to just direct the
user to submit it in the main query, in an additional dedicated column.

For instance, to get a specific, separate order on "h",
let the user change this:

  SELECT v, h, c FROM v_data ORDER BY v;

into that:

  SELECT v, h, row_number() over(order by h) as hn, c
   FROM v_data ORDER BY v;

then with a relatively simple modification to the patch,
this invocation:

 \crosstabview v h:hn c

would display "h" in the horizontal header ordered by "hn".

ISTM this handles two objections raised upthread:

1. The ORDER BY inside OVER() can be augmented with additional
clauses such as lc_collate, desc, nulls last, etc... contrary to
the controversed "+/-" syntax.

2. a post-sort "backdoor" query is no longer necessary.

The drawback for me is that this change doesn't play out with
my original scenario for the command, which is to give the ability to
scrutinize query results in crosstab mode, playing with variations on
what column is pivoted and how headers for both directions get sorted,
while ideally not changing _at all_ the original query in the query
buffer, but just invoking  successive \crosstabview [args] commands
with varying arguments.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-17 Thread Jim Nasby

On 2/17/16 9:03 AM, Dean Rasheed wrote:

I'm not totally opposed to specifying a column sort order in psql, and
perhaps there's a way to support both 'cols' and 'col_order' options
in psql, since there are different situations where one or the other
might be more useful.


Yeah. If there was some magic way to reference the underlying data with 
your syntax it probably wouldn't be that bad. AIUI normally we're just 
dumping data into a Portal and there's no option to read back from it, 
but if the query results were first put in a tuplestore then I suspect 
it wouldn't be that hard to query against it and produce another result set.



What I am opposed to is specifying the row order in psql, because IMO
that's something that should be done entirely in the SQL query.


+1
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-17 Thread Peter Eisentraut
On 2/9/16 11:21 AM, Daniel Verite wrote:
> Note that NULL values in the column that pivots are discarded
> by \crosstabview, because NULL as the name of a column does not
> make sense.

Why not?

All you're doing is printing it out, and psql is quite capable of
printing a null value.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-17 Thread Daniel Verite
Jim Nasby wrote:

> >   ORDER BY name
> > \crosstabview cols = (select to_char(d, 'Mon') from
> > generate_series('2000-01-01'::date, '2000-12-01', '1 month') d)
> 
> My concern with that is that often you don't know what the columns will 
> be, because you don't know what exact data the query will produce. So to 
> use this syntax you'd have to re-create a huge chunk of the original 
> query. :(

Also, if that additional query refers to tables, it should be executed
with the same data visibility as the main query. Doesn't that mean
that both queries should happen within the same repeatable
read transaction?

Another  impractical aspect of this approach is that a
meta-command invocation in psql must fit on a single line, so
queries containing newlines are not acceptable as argument.
This problem exists with "\copy (select...) to ..."  already.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-17 Thread Dean Rasheed
On 17 February 2016 at 02:32, Jim Nasby  wrote:
> On 2/11/16 4:21 AM, Dean Rasheed wrote:
>>
>> Thinking about this some more though, perhaps*sorting*  the columns is
>> the wrong way to be thinking about it. Perhaps a better approach would
>> be to allow the columns to be*listed*  (optionally, using a separate
>> query). Something like the following (don't get too hung up on the
>> syntax):
>>
>> SELECT name,
>> to_char(date, 'Mon') AS month,
>> sum(amount) AS amount
>>   FROM invoices
>>   GROUP BY 1,2
>>   ORDER BY name
>> \crosstabview cols = (select to_char(d, 'Mon') from
>> generate_series('2000-01-01'::date, '2000-12-01', '1 month') d)
>
>
> My concern with that is that often you don't know what the columns will be,
> because you don't know what exact data the query will produce. So to use
> this syntax you'd have to re-create a huge chunk of the original query. :(
>

Yeah, that's a reasonable concern.

On the flip side, one of the advantages of the above syntax is that
you have absolute control over the columns, whereas with the
sort-based syntax you might find some columns missing (e.g., if there
were no invoices in August) and that could lead to confusion parsing
the results.

I'm not totally opposed to specifying a column sort order in psql, and
perhaps there's a way to support both 'cols' and 'col_order' options
in psql, since there are different situations where one or the other
might be more useful.

What I am opposed to is specifying the row order in psql, because IMO
that's something that should be done entirely in the SQL query.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-17 Thread Dean Rasheed
On 15 February 2016 at 14:08, Daniel Verite  wrote:
> Dean Rasheed wrote:
>
>> My biggest problem is with the sorting, for all the reasons discussed
>> above. There is absolutely no reason for \crosstabview to be
>> re-sorting rows -- they should just be left in the original query
>> result order
>
> Normal top-down display:
>
> select v,to_char(d,'Mon') as m, c  from v_data order by d;
>
>  v  |  m  |  c
> +-+-
>  v2 | Jan | bar
>  v1 | Apr | foo
>  v1 | Jul | baz
>  v0 | Jul | qux
>
> At this point, it seems to me that it's perfectly reasonable for our user
> to expect the possibility of sorting additionally by "v" , without
> changing the query and without changing the order of the horizontal
> header:
>
>  \crosstabview +v m c
>
>  v  | Jan | Apr | Jul
> +-+-+-
>  v0 | | | qux
>  v1 | | foo | baz
>  v2 | bar | |
>

I don't find that example particularly compelling. If I want to sort
the rows coming out of a query, my first thought is always going to be
to add/adjust the query's ORDER BY clause, not use some weird +/- psql
syntax.

The crux of the problem here is that in a pivoted query resultset SQL
can be used to control the order of the rows or the columns, but not
both at the same time. IMO it is more natural to use SQL to control
the order of the rows. The columns are the result of the psql
pivoting, so it's reasonable to control them via psql options.

A couple of other points to bear in mind:

The number of columns is always going to be quite limited (at most
1600, and usually far less than that), whereas the number of rows
could be arbitrarily large. So sorting the rows client-side in the way
that you are could get very inefficient, whereas that's not such a
problem for the columns.

The column values are non-NULL, so they require a more limited set of
sort options, whereas the rows could be anything, and people will want
all the sort options to be available.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-16 Thread Jim Nasby

On 2/11/16 4:21 AM, Dean Rasheed wrote:

Thinking about this some more though, perhaps*sorting*  the columns is
the wrong way to be thinking about it. Perhaps a better approach would
be to allow the columns to be*listed*  (optionally, using a separate
query). Something like the following (don't get too hung up on the
syntax):

SELECT name,
to_char(date, 'Mon') AS month,
sum(amount) AS amount
  FROM invoices
  GROUP BY 1,2
  ORDER BY name
\crosstabview cols = (select to_char(d, 'Mon') from
generate_series('2000-01-01'::date, '2000-12-01', '1 month') d)


My concern with that is that often you don't know what the columns will 
be, because you don't know what exact data the query will produce. So to 
use this syntax you'd have to re-create a huge chunk of the original 
query. :(

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-15 Thread Daniel Verite
Dean Rasheed wrote:

> My biggest problem is with the sorting, for all the reasons discussed
> above. There is absolutely no reason for \crosstabview to be
> re-sorting rows -- they should just be left in the original query
> result order

We want the option to sort the vertical the header in a late additional
step when the ORDER BY of the query is already assigned to another
purpose. 

I've submitted this example on the wiki:
https://wiki.postgresql.org/wiki/Crosstabview

create view v_data as 
select * from ( values
   ('v1','h2','foo', '2015-04-01'::date),
   ('v2','h1','bar', '2015-01-02'),
   ('v1','h0','baz', '2015-07-12'),
   ('v0','h4','qux', '2015-07-15')
 ) as l(v,h,c,d);


Normal top-down display:

select v,to_char(d,'Mon') as m, c  from v_data order by d;

 v  |  m  |  c  
+-+-
 v2 | Jan | bar
 v1 | Apr | foo
 v1 | Jul | baz
 v0 | Jul | qux

Crosstabview display without any additional sort:

 \crosstabview v m c

 v  | Jan | Apr | Jul 
+-+-+-
 v2 | bar | | 
 v1 | | foo | baz
 v0 | | | qux

"d" is not present the resultset but it drives the sort
so that month names come out in the natural order.

\crosstabview does not discard the order of colH nor the order of colV,
it follows both, so that we get v2,v1,v0 in this order in the leftmost
column (vertical header) just like in the resultset.

At this point, it seems to me that it's perfectly reasonable for our user
to expect the possibility of sorting additionally by "v" , without
changing the query and without changing the order of the horizontal
header:

 \crosstabview +v m c

 v  | Jan | Apr | Jul 
+-+-+-
 v0 | | | qux
 v1 | | foo | baz
 v2 | bar | | 



Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-15 Thread Daniel Verite
Alvaro Herrera wrote:

> So please can we have that wiki page so that the syntax can be hammered
> out a bit more.

I've added a wiki page with explanation and examples here:

https://wiki.postgresql.org/wiki/Crosstabview

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-11 Thread Pavel Stehule
Thinking about this some more though, perhaps *sorting* the columns is
> the wrong way to be thinking about it. Perhaps a better approach would
> be to allow the columns to be *listed* (optionally, using a separate
> query). Something like the following (don't get too hung up on the
> syntax):
>
> SELECT name,
>to_char(date, 'Mon') AS month,
>sum(amount) AS amount
>  FROM invoices
>  GROUP BY 1,2
>  ORDER BY name
> \crosstabview cols = (select to_char(d, 'Mon') from
> generate_series('2000-01-01'::date, '2000-12-01', '1 month') d)
>

The idea is ok, but this design cannot be described as user friendly. The
work with time dimension is pretty common, and should be supported by some
short user friendly syntax.

Regards

Pavel


>
> Regards,
> Dean
>


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-11 Thread Dean Rasheed
On 11 February 2016 at 08:43, Andres Freund  wrote:
> On 2016-02-09 09:27:04 +, Dean Rasheed wrote:
>> Looking at this patch, I have mixed feelings about it. On the one hand
>> I really like the look of the output, and I can see that the non-fixed
>> nature of the output columns makes this hard to achieve server-side.
>
>> But on the other hand, this seems to be going way beyond the normal
>> level of result formatting that something like \x does, and I find the
>> syntax for sorting particularly ugly.
>
> I've pretty similar doubts. Addinging features to psql which are complex
> enough that it's likely that people will be forced to parse psql
> output...  On the other hand, a proper server side solution won't be
> easy; so maybe this is a okay enough stopgap.
>

Well to be clear, I like the idea of this feature, and I'm not trying
to stand in the way of progressing it. However, I can't see myself
committing it in its current form.

My biggest problem is with the sorting, for all the reasons discussed
above. There is absolutely no reason for \crosstabview to be
re-sorting rows -- they should just be left in the original query
result order. Sorting columns is a little more understandable, since
there is no way for the original query to control the order in which
the colV values come out, but Tom raises a good point -- there are far
too many bells and whistles when it comes to sorting, and we don't
want to be adding all of them to the psql syntax.

Thinking about this some more though, perhaps *sorting* the columns is
the wrong way to be thinking about it. Perhaps a better approach would
be to allow the columns to be *listed* (optionally, using a separate
query). Something like the following (don't get too hung up on the
syntax):

SELECT name,
   to_char(date, 'Mon') AS month,
   sum(amount) AS amount
 FROM invoices
 GROUP BY 1,2
 ORDER BY name
\crosstabview cols = (select to_char(d, 'Mon') from
generate_series('2000-01-01'::date, '2000-12-01', '1 month') d)

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-11 Thread Andres Freund
On 2016-02-09 09:27:04 +, Dean Rasheed wrote:
> Looking at this patch, I have mixed feelings about it. On the one hand
> I really like the look of the output, and I can see that the non-fixed
> nature of the output columns makes this hard to achieve server-side.

> But on the other hand, this seems to be going way beyond the normal
> level of result formatting that something like \x does, and I find the
> syntax for sorting particularly ugly.

I've pretty similar doubts. Addinging features to psql which are complex
enough that it's likely that people will be forced to parse psql
output...  On the other hand, a proper server side solution won't be
easy; so maybe this is a okay enough stopgap.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Jim Nasby

On 2/9/16 8:40 AM, Daniel Verite wrote:

Alvaro Herrera wrote:


While I understand that you may think that "silence is consent",
what I am afraid of is that some committer will look at this two
months from now and say "I hate this Hcol+ stuff, -1 from me" and
send the patch back for syntax rework.  IMO it's better to have more
people chime in here so that the patch that we discuss during the
next commitfest is really the best one we can think of.


Yes, but on the other hand we can't force people to participate.
If a patch is moving forward and being discussed here between
one author and one reviewer, and nothing particularly wrong
pops out in what is discussed, the reality if that other people will
not intervene.


The problem is that assumes people are still reading the thread. This is 
a feature I'm very interested, but at some point I just gave up on 
trying to follow it because of the volume of messages. I bet a lot of 
others did the same.


I think in this case, what should have happened is that once an issue 
with the design of the feature itself was identified, a new thread 
should have been started to discuss that part in particular. That would 
have re-raised attention and made it easier for people to follow that 
specific part of the discussion, even if they don't care about some if 
the code intricacies.



Besides, as it being mentioned here frequently, all patches, even
much more important ones, are short on reviews and reviewers
and testing, still new stuff must keep getting in the source tree
to progress.


Sure, and new stuff will be making it in. The question is: will *your* 
new stuff be making it in?


Believe me, I know how burdensome getting new features pushed is. 
Frankly it shouldn't be this hard, and I certainly don't blame you for 
being frustrated. But none of that changes the fact that the bar for 
including code is very high and if you don't meet it then your stuff 
won't make it in.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Daniel Verite
Tom Lane wrote:

> I do not think we want any client-side sorting in this feature at all,
> because the minute you have any such thing, you are going to have an
> absolutely never-ending stream of demands for more sorting features:
> multi column, numeric vs text, ASC vs DESC, locale-aware, etc etc etc.

It doesn't really do any client-side sorting, the rest of the thread might
refer to it like that by oversimplification, but if the command requests
a header to be sorted,  a "backdoor-style" query of this form
is sent to the server, with PQexecParams():

SELECT n FROM (VALUES ($1,1),($2,2),($3,3)...) ) AS l(x,n) 
  ORDER BY x [DESC]
where the values to display in the header are bound to 
$1,$2,.. and the type associated with these parameters is
the PQftype() of the field from which these values come.
Then  the  values coming back ordered by  tell us
where to position the values corresponding to $1,$2... in the
sorted header.

There are some cases when this sort cannot work.
For example if the field is an anonymous type or a ROW().
Or if the field is POINT(x,y), because our "point" type
does not support order by.
I believe these are corner cases for this feature. In these
cases, psql just displays the error message that PQexecParams()
emits.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Pavel Stehule
I haven't been paying attention to this thread ... but it is sure
> sounding like this feature has gotten totally out of hand.  Suggest
> reconsidering your design goals.
>
> > Or said otherwise, having the [+/-] colV sorting is a way to
> > avoid the question:
> > "we can sort the horizontal header, so why can't we sort the
> > vertical header just the same?"
>
> I would turn that around, and ask why not remove *both* those things.
>
> I do not think we want any client-side sorting in this feature at all,
> because the minute you have any such thing, you are going to have an
> absolutely never-ending stream of demands for more sorting features:
> multi column, numeric vs text, ASC vs DESC, locale-aware, etc etc etc.
> I'd rather reject the feature altogether than expect that psql is going
> to have to grow all of that.
>

I am thinking so without possibility to sort data on client side, this
feature will be significantly limited. You cannot do server side sort for
both dimensions. Working with 2d report when one dimension is unsorted is
not friendly.

But the client side sorting can be limited to number's or C locale sorting.
I don't think so full sort possibilities are necessary.

Regards

Pavel


> regards, tom lane
>


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Tom Lane
"Daniel Verite"  writes:
>   Dean Rasheed wrote:
>> I don't think we should allow sorting colV values client-side,
>> overriding a server-side ORDER BY clause in the query.

> I shared that opinion until (IIRC) the v8 or v9 of the patch.
> Most of the evolution of this patch has been to go
> from no client-side sorting option at all, to the full range
> of possibilities, ascending or descending, and in both
> vertical and horizontal directions.

I haven't been paying attention to this thread ... but it is sure
sounding like this feature has gotten totally out of hand.  Suggest
reconsidering your design goals.

> Or said otherwise, having the [+/-] colV sorting is a way to
> avoid the question:
> "we can sort the horizontal header, so why can't we sort the
> vertical header just the same?"

I would turn that around, and ask why not remove *both* those things.

I do not think we want any client-side sorting in this feature at all,
because the minute you have any such thing, you are going to have an
absolutely never-ending stream of demands for more sorting features:
multi column, numeric vs text, ASC vs DESC, locale-aware, etc etc etc.
I'd rather reject the feature altogether than expect that psql is going
to have to grow all of that.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Daniel Verite
Dean Rasheed wrote:

> Note that I might also want to pass additional sort options, such as
> "ORDER BY name NULLS LAST", which the existing syntax doesn't allow.
> In the new syntax, such sort options could be trivially supported in
> both the server- and client-side sorts:

Note that NULL values in the column that pivots are discarded
by \crosstabview, because NULL as the name of a column does not
make sense.

The doc (in the patch) says:

"The horizontal header, displayed as the first row, contains the set of
all distinct non-null values found in column colH"


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Daniel Verite
Dean Rasheed wrote:

> I don't think we should allow sorting colV values client-side,
> overriding a server-side ORDER BY clause in the query.

I shared that opinion until (IIRC) the v8 or v9 of the patch.
Most of the evolution of this patch has been to go
from no client-side sorting option at all, to the full range
of possibilities, ascending or descending, and in both
vertical and horizontal directions.

I agree that colV sorting can be achieved through the
query's ORDER BY, which additionally is more efficient
so it should be the primary choice.

The reason to allow [+/-]colV in \crosstabview is because
I think the average user will expect it, by symmetry with colH.
As the display is reorganized to be like a "grid" instead of a "list
with several columns", we shift the focus to the symmetry
between horizontal and vertical headers, rather than on
the pre-crosstab form of the resultset, even if it's the
same data.
It's easier for the user to just stick a + in front of a column
reference than to figure out that the same result could
be achieved by editing the query and changing/adding
an ORDER BY.

Or said otherwise, having the [+/-] colV sorting is a way to
avoid the question:
"we can sort the horizontal header, so why can't we sort the
vertical header just the same?"


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Daniel Verite
Alvaro Herrera wrote:

> Also, what about the business of putting "x" if there's no third column?
> Three months from now some Czech psql hacker will say "we should use
> Unicode chars for this" and we will be forever stuck with \pset
> unicode_crosstab_marker to change the character to a ☑ BALLOT BOX WITH
> CZECH.  Maybe we should think that a bit harder -- for example, what
> about just rejecting the case with no third column and forcing the user
> to add a third column with the character they choose?  That way you
> avoid that mess.

Yes,  that implicit "X" with 2 column resultsets is not essential, it may
be removed without real damage.

About the possible suggestion to have a \pset unicode_crosstab_marker,
my opinion would be that it's not important enough to justify a new
\pset setting.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Daniel Verite
Alvaro Herrera wrote:

> While I understand that you may think that "silence is consent",
> what I am afraid of is that some committer will look at this two
> months from now and say "I hate this Hcol+ stuff, -1 from me" and
> send the patch back for syntax rework.  IMO it's better to have more
> people chime in here so that the patch that we discuss during the
> next commitfest is really the best one we can think of.

Yes, but on the other hand we can't force people to participate.
If a patch is moving forward and being discussed here between
one author and one reviewer, and nothing particularly wrong
pops out in what is discussed, the reality if that other people will
not intervene.

Besides, as it being mentioned here frequently, all patches, even
much more important ones, are short on reviews and reviewers
and testing, still new stuff must keep getting in the source tree
to progress.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Daniel Verite
Alvaro Herrera wrote:

> So please can we have that wiki page so that the syntax can be hammered
> out a bit more.

Sure, I'm on it.

> I'm closing this as returned-with-feedback for now.

Well,  the feedback it got during months was incorporated into
the patch in the form of significant improvements, and
at the end of this CF it was at the point that it has really been
polished, and no other feedback was coming.

I'll resubmit.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Dean Rasheed
On 9 February 2016 at 11:06, Pavel Stehule  wrote:
>   + respect SQL clauses ordering, allows pretty complex ORDER BY clause

That, to me is the key point. SQL already allows very powerful
sorting, so psql should not just throw away the query's sort order and
replace it with something much more basic and limited. The exact
syntax can be debated, but I don't think psql should be doing row
sorting.

I also don't believe that extending the +/- sort syntax to support
more advanced options will be particularly easy, and the result is
likely to be even less readable. It also requires the user to learn
another syntax, when they will already be familiar with SQL's sort
syntax.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Pavel Stehule
> >
> > SELECT name, to_char(date, 'mon') AS month, extract(month from date) AS
> > month_order, sum(amount) AS amount FROM invoices GROUP BY 1,2,3;
> >
> > and crosstabview command (per Daniel proposal)
> >
> > \crosstabview +name  +month:month_order amount
> >
> > But if I don't need column header in human readable form, I can do
> >
> > \crosstabview +name +month_order amount
> >
> > What is solution of this use case with your proposal??
> >
>
> So it would just be
>
> SELECT name,
>to_char(date, 'mon') AS month,
>sum(amount) AS amount,
>extract(month from date) AS month_order
>  FROM invoices
>  GROUP BY 1,2,3
>  ORDER BY name
> \crosstabview name month amount month_order
>

Warning: :) Now I am subjective. The Daniel syntax "\crosstabview +name
+month:month_order amount" looks more readable for me, because related
things are near to self.


>
> Note that I might also want to pass additional sort options, such as
> "ORDER BY name NULLS LAST", which the existing syntax doesn't allow.
> In the new syntax, such sort options could be trivially supported in
> both the server- and client-side sorts:


> SELECT name, to_char(date, 'mon') AS month,
>extract(month from date) AS month_order, sum(amount) AS amount
>   FROM invoices
>  GROUP BY 1,2,3
>  ORDER BY name NULLS LAST
> \crosstabview name month amount month_order asc nulls last
>

I understand - if I compare these two syntaxes I and I am trying be
objective, then I see

your:
  + respect SQL clauses ordering, allows pretty complex ORDER BY clause
  - possible to fail on unexpected syntax errors
  +/- more verbose
  - allow only one client side sort
  - less expressive

Daniel:
  + cannot to fail on syntax error
  + more compacts (not necessary to specify ORDER BY clauses)
  + allow to specify sort in both dimensions
  + more expressive (+colH is more expressive than colV colH col colH
  - doesn't allow to complex order clauses in both dimensions


>
> This is probably not an issue in this example, but it might well be in
> other cases. The +/-scol syntax is always going to be limited in what
> it can support.
>

the +/- syntax can be enhanced by additional attributes - this is only
syntax (but then there is a risk of possible syntax errors)


>
>
> > I agree so this syntax is pretty raw. But it is consistent with other
> psql
> > statements and there are not possible conflicts.
> >
> > What I mean? Your syntax is not unambiguous: \crosstabview [colV] [colH]
> > [colG1[,colG2...]] [sortCol [asc|desc]] - when I would to enter sort
> order
> > column, I have to enter one or more colG1,... or I have to enter
> explicitly
> > asc, desc keyword.
> >
>
> That is resolved by the comma that precedes colG2, etc. isn't it?
>

but colG1 is optional. What if you miss any colGx ?

Regards

Pavel


>
> Regards,
> Dean
>


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Dean Rasheed
On 9 February 2016 at 10:09, Pavel Stehule  wrote:
> The sorting on client side is necessary - minimally in one direction,
> because you cannot to create perfect sorting for both dimensions.
> Possibility to order in second dimension is just pretty comfortable -
> because you don't need to think two steps forward - when you create SQL
> query.
>
> I have a basic use case that should be supported well, and it is supported
> well by last version of this patch. The evaluation of syntax is subjective.
> We can compare Daniel's syntax and your proposal.
>
> The use case: I have a table with the invoices with attributes (date, name
> and amount). I would to take a report of amounts across months and
> customers. Horizontal dimension is month (name), vertical dimension is name
> of customers. I need sorting of months in semantic order and customers in
> alphabet order.
>
> So my query is:
>
> SELECT name, to_char(date, 'mon') AS month, extract(month from date) AS
> month_order, sum(amount) AS amount FROM invoices GROUP BY 1,2,3;
>
> and crosstabview command (per Daniel proposal)
>
> \crosstabview +name  +month:month_order amount
>
> But if I don't need column header in human readable form, I can do
>
> \crosstabview +name +month_order amount
>
> What is solution of this use case with your proposal??
>

So it would just be

SELECT name,
   to_char(date, 'mon') AS month,
   sum(amount) AS amount,
   extract(month from date) AS month_order
 FROM invoices
 GROUP BY 1,2,3
 ORDER BY name
\crosstabview name month amount month_order

Note that I might also want to pass additional sort options, such as
"ORDER BY name NULLS LAST", which the existing syntax doesn't allow.
In the new syntax, such sort options could be trivially supported in
both the server- and client-side sorts:

SELECT name, to_char(date, 'mon') AS month,
   extract(month from date) AS month_order, sum(amount) AS amount
  FROM invoices
 GROUP BY 1,2,3
 ORDER BY name NULLS LAST
\crosstabview name month amount month_order asc nulls last

This is probably not an issue in this example, but it might well be in
other cases. The +/-scol syntax is always going to be limited in what
it can support.


> I agree so this syntax is pretty raw. But it is consistent with other psql
> statements and there are not possible conflicts.
>
> What I mean? Your syntax is not unambiguous: \crosstabview [colV] [colH]
> [colG1[,colG2...]] [sortCol [asc|desc]] - when I would to enter sort order
> column, I have to enter one or more colG1,... or I have to enter explicitly
> asc, desc keyword.
>

That is resolved by the comma that precedes colG2, etc. isn't it?

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Pavel Stehule
Hi

Looking at this patch, I have mixed feelings about it. On the one hand
> I really like the look of the output, and I can see that the non-fixed
> nature of the output columns makes this hard to achieve server-side.
>
> But on the other hand, this seems to be going way beyond the normal
> level of result formatting that something like \x does, and I find the
> syntax for sorting particularly ugly. I can understand the need to
> sort the colH values, but it seems to me that the result rows should
> just be returned in the order the server returns them -- i.e., I don't
> think we should allow sorting colV values client-side, overriding a
> server-side ORDER BY clause in the query.
>

This feature has zero relation with \x option, and any link to this option
is confusing. This is important, elsewhere we are on start again, where I
did long discuss with Daniel about the name, when I blocked the name
"rotate".


> Client-side sorting makes me uneasy in general, and I think it should
> be restricted to just sorting the columns that appear in the output
> (the colH values). This would also allow the syntax to be simplified:
>
> \crosstabview [colV] [colH] [colG1[,colG2...]] [sortCol [asc|desc]]
>

The sorting on client side is necessary - minimally in one direction,
because you cannot to create perfect sorting for both dimensions.
Possibility to order in second dimension is just pretty comfortable -
because you don't need to think two steps forward - when you create SQL
query.

I have a basic use case that should be supported well, and it is supported
well by last version of this patch. The evaluation of syntax is subjective.
We can compare Daniel's syntax and your proposal.

The use case: I have a table with the invoices with attributes (date, name
and amount). I would to take a report of amounts across months and
customers. Horizontal dimension is month (name), vertical dimension is name
of customers. I need sorting of months in semantic order and customers in
alphabet order.

So my query is:

SELECT name, to_char(date, 'mon') AS month, extract(month from date) AS
month_order, sum(amount) AS amount FROM invoices GROUP BY 1,2,3;

and crosstabview command (per Daniel proposal)

\crosstabview +name  +month:month_order amount

But if I don't need column header in human readable form, I can do

\crosstabview +name +month_order amount

What is solution of this use case with your proposal??

I agree so this syntax is pretty raw. But it is consistent with other psql
statements and there are not possible conflicts.

What I mean? Your syntax is not unambiguous: \crosstabview [colV] [colH]
[colG1[,colG2...]] [sortCol [asc|desc]] - when I would to enter sort order
column, I have to enter one or more colG1,... or I have to enter explicitly
asc, desc keyword.

Regards

Pavel








>
> Overall, I like the feature, but I'm not convinced that it's ready in
> its current form.
>
> For the future (not in this first version of the patch), since the
> transformation is more than just a \x-type formatting of the query
> results, a nice-to-have feature would be a way to save the results
> somewhere -- say by making it play nicely with \g or \copy somehow,
> but I admit that I don't know exactly how that would work.
>
> Regards,
> Dean
>


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Dean Rasheed
On 9 February 2016 at 05:24, Pavel Stehule  wrote:
> I have not a feeling so we did some with Daniel privately. All work was
> public (I checked my mailbox) - but what is unhappy - in more mailing list
> threads (not sure how it is possible, because subjects looks same). The
> discus about the design was public, I am sure. It was relative longer
> process, with good progress (from my perspective), because Daniel accepts
> and fixed all my objection. The proposed syntax is fully consistent with
> other psql commands - hard to create something new there, because psql
> parser is pretty limited. Although I am thinking so syntax is good, clean
> and useful I am open to discuss about it. Please, try the last design, last
> patch - I spent lot of hours (and I am sure so Daniel much more) in thinking
> how this can be designed better.
>

Looking at this patch, I have mixed feelings about it. On the one hand
I really like the look of the output, and I can see that the non-fixed
nature of the output columns makes this hard to achieve server-side.

But on the other hand, this seems to be going way beyond the normal
level of result formatting that something like \x does, and I find the
syntax for sorting particularly ugly. I can understand the need to
sort the colH values, but it seems to me that the result rows should
just be returned in the order the server returns them -- i.e., I don't
think we should allow sorting colV values client-side, overriding a
server-side ORDER BY clause in the query.

Client-side sorting makes me uneasy in general, and I think it should
be restricted to just sorting the columns that appear in the output
(the colH values). This would also allow the syntax to be simplified:

\crosstabview [colV] [colH] [colG1[,colG2...]] [sortCol [asc|desc]]

Overall, I like the feature, but I'm not convinced that it's ready in
its current form.

For the future (not in this first version of the patch), since the
transformation is more than just a \x-type formatting of the query
results, a nice-to-have feature would be a way to save the results
somewhere -- say by making it play nicely with \g or \copy somehow,
but I admit that I don't know exactly how that would work.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-08 Thread Pavel Stehule
Hi


> I just rechecked the thread.  In my reading, lots of people argued
> whether it should be called \rotate or \pivot or \crosstab; it seems the
> \crosstabview proposal was determined to be best.  I can support that
> decision.  But once the details were discussed, it was only you and
> Daniel left in the thread; nobody else participated.  While I understand
> that you may think that "silence is consent", what I am afraid of is
> that some committer will look at this two months from now and say "I
> hate this Hcol+ stuff, -1 from me" and send the patch back for syntax
> rework.  IMO it's better to have more people chime in here so that the
> patch that we discuss during the next commitfest is really the best one
> we can think of.
>

I have not a feeling so we did some with Daniel privately. All work was
public (I checked my mailbox) - but what is unhappy - in more mailing list
threads (not sure how it is possible, because subjects looks same). The
discus about the design was public, I am sure. It was relative longer
process, with good progress (from my perspective), because Daniel accepts
and fixed all my objection. The proposed syntax is fully consistent with
other psql commands - hard to create something new there, because psql
parser is pretty limited. Although I am thinking so syntax is good, clean
and useful I am open to discuss about it. Please, try the last design, last
patch - I spent lot of hours (and I am sure so Daniel much more) in
thinking how this can be designed better.


> Also, what about the business of putting "x" if there's no third column?
> Three months from now some Czech psql hacker will say "we should use
> Unicode chars for this" and we will be forever stuck with \pset
> unicode_crosstab_marker to change the character to a ☑ BALLOT BOX WITH
> CZECH.  Maybe we should think that a bit harder -- for example, what
> about just rejecting the case with no third column and forcing the user
> to add a third column with the character they choose?  That way you
> avoid that mess.
>

These features are in category "nice to have". There are no problem to do
in last commitfest or in next release cycle. It is not reason why to block
commit of this feature, and I am sure so lot of users can be pretty happy
with "basic" version of this patch. The all necessary functionality is
there and working. We can continue on development in other cycles, but for
this cycle, I am sure, so all necessary work is done.


>
> > This feature has only small relation to SQL PIVOTING feature - it is just
> > form of view and I am agree with Daniel about sense of this feature.
>
> Yes, I don't disagree there.  Robert Haas and David Fetter also
> expressed their support for psql-side processing, so I think we're good
> there.
>
>
great. Personally, I have not any objection against current state. If
anybody has, please do it early. We can move to forward. This is nice
feature, good patch, and there is not reason why stop here.

Regards

Pavel


> --
> Álvaro Herrerahttp://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-08 Thread Alvaro Herrera
Pavel Stehule wrote:

> > FWIW I think the general idea of this feature (client-side resultset
> > "pivoting") is a good one, but I don't really have an opinion regarding
> > your specific proposal.  I think you should first seek some more
> > consensus about the proposed design; in your original thread [1] several
> > guys defended the idea of having this be a psql feature, and the idea of
> > this being a parallel to \x seems a very sensible one,

Sorry, I meant \q here, not \x.

> > but there's really been no discussion on whether your proposed "+/-"
> > syntax to change sort order makes sense, for one thing.
> 
> I am sorry, but I disagree - the discussion about implementation was more
> than two months, and I believe so anybody who would to discuss had enough
> time to discuss. This feature and design was changed significantly and
> there was not anybody who sent feature design objection.

I just rechecked the thread.  In my reading, lots of people argued
whether it should be called \rotate or \pivot or \crosstab; it seems the
\crosstabview proposal was determined to be best.  I can support that
decision.  But once the details were discussed, it was only you and
Daniel left in the thread; nobody else participated.  While I understand
that you may think that "silence is consent", what I am afraid of is
that some committer will look at this two months from now and say "I
hate this Hcol+ stuff, -1 from me" and send the patch back for syntax
rework.  IMO it's better to have more people chime in here so that the
patch that we discuss during the next commitfest is really the best one
we can think of.

Also, what about the business of putting "x" if there's no third column?
Three months from now some Czech psql hacker will say "we should use
Unicode chars for this" and we will be forever stuck with \pset
unicode_crosstab_marker to change the character to a ☑ BALLOT BOX WITH
CZECH.  Maybe we should think that a bit harder -- for example, what
about just rejecting the case with no third column and forcing the user
to add a third column with the character they choose?  That way you
avoid that mess.

> This feature has only small relation to SQL PIVOTING feature - it is just
> form of view and I am agree with Daniel about sense of this feature.

Yes, I don't disagree there.  Robert Haas and David Fetter also
expressed their support for psql-side processing, so I think we're good
there.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-08 Thread Pavel Stehule
Hi



> FWIW I think the general idea of this feature (client-side resultset
> "pivoting") is a good one, but I don't really have an opinion regarding
> your specific proposal.  I think you should first seek some more
> consensus about the proposed design; in your original thread [1] several
> guys defended the idea of having this be a psql feature, and the idea of
> this being a parallel to \x seems a very sensible one, but there's
> really been no discussion on whether your proposed "+/-" syntax to
> change sort order makes sense, for one thing.
>

I am sorry, but I disagree - the discussion about implementation was more
than two months, and I believe so anybody who would to discuss had enough
time to discuss. This feature and design was changed significantly and
there was not anybody who sent feature design objection.

This feature has only small relation to SQL PIVOTING feature - it is just
form of view and I am agree with Daniel about sense of this feature.

Regards

Pavel


>
> So please can we have that wiki page so that the syntax can be hammered
> out a bit more.
>
> I'm closing this as returned-with-feedback for now.
>
> [1] It's a good idea to add links to previous threads where things were
> discussed.  I had to search for
> www.postgresql.org/message-id/flat/78543039-c708-4f5d-a66f-0c0fbcda1f76@mm
> because you didn't provide a link to it when you started the second
> thread.
>
> --
> Álvaro Herrerahttp://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-08 Thread Alvaro Herrera
Daniel Verite wrote:
>   Teodor Sigaev wrote:
> 
> > Interesting feature, but it's not very obvious how to use it. I'd like to
> > see  some example(s) in documentation.
> 
> I'm thinking of making a wiki page, because examples pretty much
> require showing resultsets, and I'm not sure this would fly
> with our current psql documentation, which is quite compact.

Yeah, we need to keep in mind that the psql doc is processed as a
manpage also, so it may not be a great idea to add too many things
there.  But I also agree that some good examples would be useful.

FWIW I think the general idea of this feature (client-side resultset
"pivoting") is a good one, but I don't really have an opinion regarding
your specific proposal.  I think you should first seek some more
consensus about the proposed design; in your original thread [1] several
guys defended the idea of having this be a psql feature, and the idea of
this being a parallel to \x seems a very sensible one, but there's
really been no discussion on whether your proposed "+/-" syntax to
change sort order makes sense, for one thing.

So please can we have that wiki page so that the syntax can be hammered
out a bit more.

I'm closing this as returned-with-feedback for now.

[1] It's a good idea to add links to previous threads where things were
discussed.  I had to search for
www.postgresql.org/message-id/flat/78543039-c708-4f5d-a66f-0c0fbcda1f76@mm
because you didn't provide a link to it when you started the second
thread.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-08 Thread Daniel Verite
Teodor Sigaev wrote:

> Interesting feature, but it's not very obvious how to use it. I'd like to
> see  some example(s) in documentation.

I'm thinking of making a wiki page, because examples pretty much
require showing resultsets, and I'm not sure this would fly
with our current psql documentation, which is quite compact.

The current bit of doc I've produced is 53 lines long in manpage
format already. The text has not been proofread by a native
English speaker yet, so part of the problem might be that
it's just me struggling with english :)

> And I see an implementation of AVL tree in psql source code
> (src/bin/psql/crosstabview.c). Postgres already has a Red-Black tree
> implementation in src/include/lib/rbtree.h and
> src/backend/lib/rbtree.c. Is any advantage of using AVL tree here? I
> have some doubt about that and this implementation, obviously, will
> not be well tested. But I see in comments that implementation is
> reduced to insert only and it doesn't use the fact of ordering tree,
> so, even hash could be used.

Yes. I expect too that a RB tree or a hash-based algorithm would do
the job and perform well.

The AVL implementation in crosstabview is purposely simplified
and specialized for this job, resulting in ~185 lines of code
versus ~850 lines for rb-tree.c
But I understand the argument that the existing rb-tree has been
battle-tested, whereas this code hasn't.

I'm looking at rb-tree.c and thinking what it would take to
incorporate it:
1. duplicating or linking from backend/lib/rb-tree.c into psql/
2. replacing the elog() calls with something else in the case of psql
3. updating the crosstabview data structures and call sites.

While I'm OK with #3, #1 and #2 seem wrong.
I could adapt rb-tree.c so that the same code can be used
both client-side and server-side, but touching server-side
code for this feature and creating links in the source tree
feels invasive and overkill.

Another approach is to replace AVL with an hash-based algorithm,
but that raises the same sort of question. If crosstabview comes
with its specific implementation, why use that rather than existing
server-side code? But at a glance, utils/hash/dynahash.c seems quite
hard to convert for client-side use.

I'm open to ideas on this. In particular, if we have a hash table
implementation that is already blessed by the project and small enough
to make sense in psql, I'd be happy to consider it.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-08 Thread Teodor Sigaev

Hi!

Interesting feature, but it's not very obvious how to use it. I'd like to see 
some example(s) in documentation.


And I see an implementation of AVL tree in psql source code 
(src/bin/psql/crosstabview.c). Postgres already has a Red-Black tree 
implementation in
src/include/lib/rbtree.h and src/backend/lib/rbtree.c. Is any advantage of using 
AVL tree here? I have some doubt about that and this implementation, obviously, 
will not be well tested. But I see in comments that implementation is reduced to 
insert only and it doesn't use the fact of ordering tree, so, even hash could be 
used.


Daniel Verite wrote:

Pavel Stehule wrote:


1. maybe we can decrease name to shorter "crossview" ?? I am happy with
crosstabview too, just crossview is correct too, and shorter


I'm in favor of keeping crosstabview. It's more explicit, only
3 characters longer and we have tab completion anyway.


2. Columns used for ordering should not be displayed by default. I can live
with current behave, but hiding ordering columns is much more practical for
me


I can see why, but I'm concerned about a consequence:
say we have 4 columns A,B,C,D and user does \crosstabview +A:B +C:D
If B and D are excluded by default, then there's nothing
left to display inside the grid.
It doesn't feel quite right. There's something counter-intuitive
in the fact that values in the grid would disappear depending on
whether and how headers are sorted.
With the 3rd argument, we let the user decide what they want
to see.


3. This code is longer, so some regress tests are recommended - attached
simple test case


I've added a few regression tests to the psql testsuite
based on your sample data. New patch with these tests
included is attached, make check passes.

Best regards,






--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-04 Thread Pavel Stehule
Hi

I tested last version, v11 and I have not any objection

It is working as expected

all regress tests passed, there is related documentation and new test is
attached.

This patch is ready form commiter.

Daniel, thank you very much, it is interesting feature.

Regards

Pavel


  1   2   >