Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

2016-03-01 Thread Daniel Verite
Alvaro Herrera wrote:

> If others can try this patch to ensure it enables pg_dump to work on
> their databases, it would be great.

It doesn't seem to help if one field exceeds 1Gb, for instance when
inflated by a bin->hex translation.

postgres=# create table big as 
   select pg_read_binary_file('data') as binarycol;

postgres=# select octet_length(binarycol) from big;
 octet_length 
--
   107370

postgres=# copy big to '/var/tmp/big.copy';
ERROR:  XX000: invalid memory alloc request size 214743
LOCATION:  palloc, mcxt.c:903

Same problem with pg_dump.

OTOH, it improves the case where the cumulative size of field contents
for a row exceeds 1 Gb, but not  any single field exceeds that size.

If splitting the table into 3 fields, each smaller than 512MB:

postgres=# create table big2 as select
 substring(binarycol from 1 for 300*1024*1024) as b1,
 substring(binarycol from 1+300*1024*1024 for 300*1024*1024) as b2 ,
 substring(binarycol from 1+600*1024*1024 for 400*1024*1024) as b3
from big;

postgres=# copy big2 to '/var/tmp/big.copy';
COPY 1

then that works, producing a single line of 2097152012 chars
in the output file.

By contrast, it fails with an unpatched 9.5:

postgres=# copy big2 to '/var/tmp/big.copy';
ERROR:  54000: out of memory
DETAIL:  Cannot enlarge string buffer containing 629145605 bytes by 629145602
more bytes.
LOCATION:  enlargeStringInfo, stringinfo.c:260

If setting bytea_output to 'escape', it also fails with the patch applied,
as it tries to allocate 4x the binary field size, and it exceeds 1GB again.

postgres=# set bytea_output =escape;
SET
postgres=# copy big2 to '/var/tmp/big.copy';
ERROR:  invalid memory alloc request size 1258291201
LOCATION:  palloc, mcxt.c:821

1258291201 = 300*1024*1024*4+1

Also, the COPY of both tables work fine if using (FORMAT BINARY),
on both the patched and unpatched server.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

2016-03-01 Thread Daniel Verite
I wrote:

> If splitting the table into 3 fields, each smaller than 512MB:
> 
> postgres=# create table big2 as select
>  substring(binarycol from 1 for 300*1024*1024) as b1,
>  substring(binarycol from 1+300*1024*1024 for 300*1024*1024) as b2 ,
>  substring(binarycol from 1+600*1024*1024 for 400*1024*1024) as b3
> from big;

I've tried adding another large field to see what happens if the whole row
exceeds 2GB, and data goes to the client rather than to a file.

postgres=# alter table big2 add b4 bytea;
postgres=# update big2 set b4=b1;

So big2 has 4 bytea columns with 300+300+400+300 MB on a single row.

Then I'm trying to \copy this from a 9.5.1 backend with patch applied,
configured with --enable-debug, on Debian8 x86-64 with 64GB of RAM
and all defaults in postgresql.conf

My idea was to check if the client side was OK with that much data on
a single COPY row, but it turns out that the server is not OK anyway.

postgres=# \copy big2 to /dev/null
lost synchronization with server: got message type "d", length -1568669676

The backend aborts with the following backtrace:

Program terminated with signal 6, Aborted.
#0  0x7f82ee68e165 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x7f82ee6913e0 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x7f82ee6c839b in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x7f82ee6d1be6 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#4  0x7f82ee6d698c in free () from /lib/x86_64-linux-gnu/libc.so.6
#5  0x007b5a89 in AllocSetDelete (context=0x)
at aset.c:643
#6  0x007b63e3 in MemoryContextDelete (context=0x1fa58c8) at
mcxt.c:229
#7  0x0055fa25 in CopyTo (cstate=0x1fb1050) at copy.c:1967
#8  DoCopyTo (cstate=cstate@entry=0x1fb1050) at copy.c:1778
#9  0x00562ea9 in DoCopy (stmt=stmt@entry=0x1f796d0,
queryString=queryString@entry=0x1f78c60 "COPY  big2 TO STDOUT ",
processed=processed@entry=0x7fff22103338) at copy.c:961
#10 0x006b4440 in standard_ProcessUtility (parsetree=0x1f796d0,
queryString=0x1f78c60 "COPY  big2 TO STDOUT ", context=,
params=0x0, dest=0x1f79a30, completionTag=0x7fff22103680 "")
at utility.c:541
#11 0x006b1937 in PortalRunUtility (portal=0x1f26680,
utilityStmt=0x1f796d0, isTopLevel=1 '\001', dest=0x1f79a30,
completionTag=0x7fff22103680 "") at pquery.c:1183
#12 0x006b2535 in PortalRunMulti (portal=portal@entry=0x1f26680,
isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0x1f79a30,
altdest=altdest@entry=0x1f79a30,
completionTag=completionTag@entry=0x7fff22103680 "") at pquery.c:1314
#13 0x006b3022 in PortalRun (portal=portal@entry=0x1f26680,
count=count@entry=9223372036854775807,
isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0x1f79a30,
altdest=altdest@entry=0x1f79a30,
completionTag=completionTag@entry=0x7fff22103680 "") at pquery.c:812
#14 0x006b0393 in exec_simple_query (
query_string=0x1f78c60 "COPY  big2 TO STDOUT ") at postgres.c:1104
#15 PostgresMain (argc=, argv=argv@entry=0x1f0d240,
dbname=0x1f0d0f0 "postgres", username=) at postgres.c:4030
#16 0x0047072b in BackendRun (port=0x1f2d230) at postmaster.c:4239
#17 BackendStartup (port=0x1f2d230) at postmaster.c:3913
#18 ServerLoop () at postmaster.c:1684
#19 0x0065b8cd in PostmasterMain (argc=argc@entry=3,
argv=argv@entry=0x1f0c330) at postmaster.c:1292
#20 0x00471161 in main (argc=3, argv=0x1f0c330) at main.c:223



The server log has this:

STATEMENT:  COPY  big2 TO STDOUT
*** glibc detected *** postgres: postgres postgres [local] COPY: double free
or corruption (out): 0x7f8234929010 ***
=== Backtrace: =
/lib/x86_64-linux-gnu/libc.so.6(+0x75be6)[0x7f82ee6d1be6]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x6c)[0x7f82ee6d698c]
postgres: postgres postgres [local] COPY[0x7b5a89]
postgres: postgres postgres [local] COPY(MemoryContextDelete+0x43)[0x7b63e3]
postgres: postgres postgres [local] COPY[0x55fa25]
postgres: postgres postgres [local] COPY(DoCopy+0x479)[0x562ea9]
postgres: postgres postgres [local]
COPY(standard_ProcessUtility+0x590)[0x6b4440]
postgres: postgres postgres [local] COPY[0x6b1937]
postgres: postgres postgres [local] COPY[0x6b2535]
postgres: postgres postgres [local] COPY(PortalRun+0x202)[0x6b3022]
postgres: postgres postgres [local] COPY(PostgresMain+0x1493)[0x6b0393]
postgres: postgres postgres [local] COPY[0x47072b]
postgres: postgres postgres [local] COPY(PostmasterMain+0xe7d)[0x65b8cd]
postgres: postgres postgres [local] COPY(main+0x3e1)[0x471161]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfd)[0x7f82ee67aead]
postgres: postgres postgres [local] COPY[0x4711c9]


I will try other tests without bytea exported in text format but with
several huge text columns.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)

Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

2016-03-02 Thread Daniel Verite
I wrote:

> postgres=# \copy big2 to /dev/null
> lost synchronization with server: got message type "d", length -1568669676
> 
> The backend aborts with the following backtrace:
> 
> Program terminated with signal 6, Aborted.
> #0  0x7f82ee68e165 in raise () from /lib/x86_64-linux-gnu/libc.so.6
> #1  0x7f82ee6913e0 in abort () from /lib/x86_64-linux-gnu/libc.so.6
> #2  0x7f82ee6c839b in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> #3  0x7f82ee6d1be6 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> #4  0x7f82ee6d698c in free () from /lib/x86_64-linux-gnu/libc.so.6
> #5  0x007b5a89 in AllocSetDelete (context=0x)
>at aset.c:643
> #6  0x007b63e3 in MemoryContextDelete (context=0x1fa58c8) at
> mcxt.c:229
> #7  0x0055fa25 in CopyTo (cstate=0x1fb1050) at copy.c:1967

The cause of the crash turns out to be, in enlargeStringInfo():

needed += str->len + 1; /* total space required now */

needed is an int and str->len is an int64, so it overflows when the
size has to grow beyond 2^31 bytes, fails to enlarge the buffer and
overwrites memory after it.

When fixing it with a local int64 copy of the variable, the backend
no longer crashes and COPY big2 TO 'file' appears to work.

However, getting it to the client with \copy big2 to 'file'
still produces the error in psql:
   lost synchronization with server: got message type "d"
and leaves an empty file, so there are more problems to solve to
go beyond 2GB text per row.

Or maybe another approach would be to advertise that this is the maximum
for a row in text mode, and limit the backend's string buffer to this size
for the time being? Notwithstanding that there's still the other direction
client->server to test.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

2016-03-02 Thread Daniel Verite
Tomas Vondra wrote:

> My guess is this is a problem at the protocol level - the 'd' message is 
> CopyData, and all the messages use int32 to define length. So if there's 
> a 2GB row, it's likely to overflow.

Yes. Besides, the full message includes a negative length:

> postgres=# \copy big2 to /dev/null
> lost synchronization with server: got message type "d", length -1568669676

which happens to be the correct size if interpreted as an unsigned int32

-1568669676 = (int) (1300UL*1024*1024*2 + 3 + 3*4 + 1 + 4)

One interpretation would be that putting an unsigned length in
CopyData message is a protocol violation.

However it's not clear to me that Int32 in the doc necessarily designates
a signed integer.

Int32 is defined as:
   Intn(i) 

   An n-bit integer in network byte order (most significant byte
   first). If i is specified it is the exact value that will appear,
   otherwise the value is variable. Eg. Int16, Int32(42).

There's a least one example when we use Int16 as unsigned:
the number of parameters in Bind (F) can be up to 65535.
This maximum is tested explicitly and refered to at several
places in fe-exec.

In some instances, Int32 is clearly signed, because -1 is accepted
to indicate NULLness, such as again in Bind (F) for the length of
the parameter value.

From this it seems to me that Intn is to be interpreted as
signed or unsigned on a case by case basis.

Back to CopyData (F & B), it's documented as:

  Byte1('d')
  Identifies the message as COPY data.

  Int32
  Length of message contents in bytes, including self.

  Byten
  Data that forms part of a COPY data stream. Messages sent from the
  backend will always correspond to single data rows, but messages sent
  by frontends might divide the data stream arbitrarily.

I don't see any hint that this length is signed, nor any reason of having
it signed.

I guess before the patch it didn't matter, for the B case at least,
because the backend never sent more than 1GB.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

2016-03-03 Thread Daniel Verite
Tom Lane wrote:

> BTW, is anyone checking the other side of this, ie "COPY IN" with equally
> wide rows?  There doesn't seem to be a lot of value in supporting dump
> if you can't reload ...

Indeed, the lines bigger than 1GB can't be reloaded :(

My test: with a stock 9.5.1 plus Alvaro's patch with my additional
int64 fix mentioned upthread, creating this table and filling
all columns with 200MB of text each:

create table bigtext(t1 text ,t2 text, t3 text, t4 text,
 t5 text, t6 text, t7 text);

# \copy bigtext to '/var/tmp/bigtext.sql'

That part does work as expected, producing this huge single line:
$ wc /var/tmp/bigtext.sql
 1  7 1468006407 /var/tmp/bigtext.sql

But reloading it fails:

# create table bigtext2 (like bigtext);
CREATE TABLE

# copy bigtext2 from '/var/tmp/bigtext.sql';
ERROR:  54000: out of memory
DETAIL:  Cannot enlarge string buffer containing 1073676288 bytes by 65536
more bytes.
CONTEXT:  COPY bigtext2, line 1
LOCATION:  enlargeStringInfo, stringinfo.c:278


# \copy bigtext2 from '/var/tmp/bigtext.sql'
ERROR:  54000: out of memory
DETAIL:  Cannot enlarge string buffer containing 1073741808 bytes by 8191
more bytes.
CONTEXT:  COPY bigtext2, line 1
LOCATION:  enlargeStringInfo, stringinfo.c:278


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] raw output from copy

2016-03-04 Thread Daniel Verite
Corey Huinker wrote:

> So, for me, RAW is the right solution, or at least *a* right solution.

Questions on how to extract from a bytea column come up on a regular
basis, as in [1] [2] [3], or [4] a few days ago, and so far the answers
are to encode the contents in text and decode them in an additional
step, or use COPY BINARY and filter out the headers.

But none of this is as straightforward and efficient as the proposed
COPY RAW.
Also the conversion to text can't be used at all on very large
contents (>512MB), as mentioned in another recent thread [5]
(this is the same reason why pg_dump can't dump such rows),
but COPY RAW doesn't have this limitation.

Technically COPY BINARY should be sufficient, but it seems that
people dislike having to deal with its headers.
Also it's not supported by any of the drivers of popular
script languages that otherwise provide COPY in text format
(DBD::Pg, php, psycopg2...)
Maybe the RAW format would have a better chance to get support
there, because of its simplicity.

[1]
http://www.postgresql.org/message-id/038517CEB6DE43BD8422D7947B6BE8D8@fanliji
ng

[2] http://www.postgresql.org/message-id/4c8272c4.1000...@arcor.de

[3] http://stackoverflow.com/questions/6730729

[4] http://www.postgresql.org/message-id/56c66565.50...@consistentstate.com

[5] http://www.postgresql.org/message-id/14620.1456851...@sss.pgh.pa.us


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-12 Thread Daniel Verite
Robert Haas wrote:

> But worse than either of  those things, there is no real
> agreement on what the overall design of this feature
> should be.

The part in the design that raised concerns upthread is
essentially how headers sorting is exposed to the user and
implemented.

As suggested in [1], I've made some drastic changes in the
attached patch to take the comments (from Dean R., Tom L.)
into account. The idea is to limit to the bare minimum
the involvement of psql in sorting:

- the +/- syntax goes away

- the possibility of post-sorting the values through a backdoor
  query goes away too, for both headers.

- the vertical order of the crosstab view is now driven solely by the
  order  in the query

- the order of the horizontal header can be optionally specified
  by a column expected to contain an integer, with the syntax
  \crosstabview colv colh:scolh [other cols]
  which means "colh" will be sorted by "scolh".
  It still defaults to whatever order "colh" comes in from the results

  Concerning the optional "scolh", there are cases where it might pre-exist
  naturally, such as a month number going in pair with a month name.
  In other cases,  a user may add it as a kind of "synthetic column"
  by way of a window function, for example:
SELECT ...other columns...,
   (row_number() over(order by something [order options]) as scolh
   FROM...
   Only the relative order of scolh values is taken into account, the value
itself
   has no meaning for crosstabview.

- also NULLs are no longer excluded from headers, per Peter E.
  comment in [2].


[1]
http://www.postgresql.org/message-id/3d513263-104b-41e3-b1c7-4ad4bd99c491@mm

[2] http://www.postgresql.org/message-id/56c4e344.6070...@gmx.net


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 8a85804..da0621b 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -990,6 +990,113 @@ testdb=>
   
 
   
+\crosstabview [
+colV
+colH
+[:scolH]
+[colG1[,colG2...]]
+] 
+
+
+Execute the current query buffer (like \g) and shows
+the results inside a crosstab grid.
+The output column colV
+becomes a vertical header
+and the output column colH
+becomes a horizontal header, optionally sorted by ranking data obtained
+from scolH.
+
+colG1[,colG2...]
+is the list of output columns to project into the grid.
+By default, all output columns of the query except 
+colV and
+colH
+are included in this list.
+
+
+
+All columns can be refered to by their position (starting at 1), or by
+their name. Normal case folding and quoting rules apply on column
+names. By default,
+colV corresponds to column 1
+and colH to column 2.
+A query having only one output column cannot be viewed in crosstab, and
+colH must differ from
+colV.
+
+
+
+The vertical header, displayed as the leftmost column,
+contains the deduplicated values found in
+column colV, in the same
+order as in the query results.
+
+
+The horizontal header, displayed as the first row,
+contains the deduplicated values found in
+column colH, in
+the order of appearance in the query results.
+If specified, the optional scolH
+argument refers to a column whose values should be integer numbers
+by which colH will be sorted
+to be positioned in the horizontal header.
+
+
+
+Inside the crosstab grid,
+given a query output with N columns
+(including colV and
+colH),
+for each distinct value x of
+colH
+and each distinct value y of
+colV,
+the contents of a cell located at the intersection
+(x,y) is determined by these rules:
+
+
+
+ if there is no corresponding row in the query results such that the
+ value for colH
+ is x and the value
+ for colV
+ is y, the cell is empty.
+
+
+
+
+
+ if there is exactly one row such that the value
+ for colH
+ is x and the value
+ for colV
+ is y, then the N-2 other
+ columns or the columns listed in
+ colG1[,colG2...]
+ are displayed in the cell, separated between each other by
+ a space character if needed.
+
+ If N=2, the letter X is displayed
+ in the cell as if a virtual third column contained that character.
+
+
+
+
+
+ if there are several corresponding rows, the behavior is identical to
+ 

Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-03-14 Thread Daniel Verite
Jim Nasby wrote:

> Ultimately I'd really like some way to remove/reduce the restriction of 
> result set definitions needing to be determined at plan time. That would 
> open the door for server-side crosstab/pivot as well a a host of other 
> things (such as dynamically turning a hstore/json/xml field into a 
> recordset).

> Ultimately I'd really like some way to remove/reduce the restriction of 
> result set definitions needing to be determined at plan time. That would 
> open the door for server-side crosstab/pivot as well a a host of other 
> things (such as dynamically turning a hstore/json/xml field into a 
> recordset).

That would go against a basic expectation of prepared statements, the
fact that queries can be parsed/prepared without any part of them
being executed.

For a dynamic pivot, but probably also for the other examples you
have in mind, the SQL engine wouldn't be able to determine the output
columns without executing a least a subselect to look inside some
table(s).

I suspect that the implications of this would be so far reaching and
problematic that it will just not happen.

It seems to me that a dynamic pivot will always consist of
two SQL queries that can never be combined into one,
unless using a workaround à la Oracle, which encapsulates the
entire dynamic resultset into an XML blob as output.
The problem here being that the client-side tools
that people routinely use are not equipped to process it anyway;
at least that's what  I find by anecdotal evidence for instance in:
https://community.oracle.com/thread/2133154?tstart=0
 or
http://stackoverflow.com/questions/19298424
 or
https://community.oracle.com/thread/2388982?tstart=0


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

2016-03-19 Thread Daniel Verite
Daniel Verite wrote:

> # \copy bigtext2 from '/var/tmp/bigtext.sql'
> ERROR:  54000: out of memory
> DETAIL:  Cannot enlarge string buffer containing 1073741808 bytes by 8191
> more bytes.
> CONTEXT:  COPY bigtext2, line 1
> LOCATION:  enlargeStringInfo, stringinfo.c:278

To go past that problem, I've tried tweaking the StringInfoData
used for COPY FROM, like the original patch does in CopyOneRowTo. 

It turns out that it fails a bit later when trying to make a tuple
from the big line, in heap_form_tuple():

  tuple = (HeapTuple) palloc0(HEAPTUPLESIZE + len);

which fails because (HEAPTUPLESIZE + len) is again considered
an invalid size, the  size being 1468006476 in my test.

At this point it feels like a dead end, at least for the idea that extending
StringInfoData might suffice to enable COPYing such large rows.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

2016-03-23 Thread Daniel Verite
Alvaro Herrera wrote:

> >   tuple = (HeapTuple) palloc0(HEAPTUPLESIZE + len);
> > 
> > which fails because (HEAPTUPLESIZE + len) is again considered
> > an invalid size, the  size being 1468006476 in my test.
> 
> Um, it seems reasonable to make this one be a huge-zero-alloc:
> 
>   MemoryContextAllocExtended(CurrentMemoryContext,
>  HEAPTUPLESIZE + len,
> MCXT_ALLOC_HUGE | MCXT_ALLOC_ZERO)

Good, this allows the tests to go to completion! The tests in question
are dump/reload of a row with several fields totalling 1.4GB (deflated),
with COPY TO/FROM file and psql's \copy in both directions, as well as
pg_dump followed by pg_restore|psql.

The modified patch is attached.

It provides a useful mitigation to dump/reload databases having
rows in the 1GB-2GB range, but it works under these limitations:

- no single field has a text representation exceeding 1GB.
- no row as text exceeds 2GB (\copy from fails beyond that. AFAICS we
  could push this to 4GB with limited changes to libpq, by
  interpreting the Int32 field in the CopyData message as unsigned).

It's also possible to go beyond 4GB per row with this patch, but
when not using the protocol. I've managed to get a 5.6GB single-row
file with COPY TO file. That doesn't help with pg_dump, but that might
be useful in other situations.

In StringInfo, I've changed int64 to Size, because on 32 bits platforms
the downcast from int64 to Size is problematic, and as the rest of the
allocation routines seems to favor Size, it seems more consistent
anyway.

I couldn't test on 32 bits though, as I seem to never have enough
free contiguous memory available on a 32 bits VM to handle
that kind of data.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index 6d0f3f3..ed9cabd 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -741,7 +741,9 @@ heap_form_tuple(TupleDesc tupleDescriptor,
 	 * Allocate and zero the space needed.  Note that the tuple body and
 	 * HeapTupleData management structure are allocated in one chunk.
 	 */
-	tuple = (HeapTuple) palloc0(HEAPTUPLESIZE + len);
+	tuple = MemoryContextAllocExtended(CurrentMemoryContext,
+	   HEAPTUPLESIZE + len,
+	   MCXT_ALLOC_HUGE | MCXT_ALLOC_ZERO);
 	tuple->t_data = td = (HeapTupleHeader) ((char *) tuple + HEAPTUPLESIZE);
 
 	/*
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 3201476..ce681d7 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -446,6 +446,15 @@ SendCopyEnd(CopyState cstate)
 	}
 }
 
+/*
+ * Prepare for output
+ */
+static void
+CopyStartSend(CopyState cstate)
+{
+	allowLongStringInfo(cstate->fe_msgbuf);
+}
+
 /*--
  * CopySendData sends output data to the destination (file or frontend)
  * CopySendString does the same for null-terminated strings
@@ -2015,6 +2024,8 @@ CopyOneRowTo(CopyState cstate, Oid tupleOid, Datum *values, bool *nulls)
 	MemoryContextReset(cstate->rowcontext);
 	oldcontext = MemoryContextSwitchTo(cstate->rowcontext);
 
+	CopyStartSend(cstate);
+
 	if (cstate->binary)
 	{
 		/* Binary per-tuple header */
@@ -3203,6 +3214,7 @@ CopyReadLine(CopyState cstate)
 	bool		result;
 
 	resetStringInfo(&cstate->line_buf);
+	allowLongStringInfo(&cstate->line_buf);
 	cstate->line_buf_valid = true;
 
 	/* Mark that encoding conversion hasn't occurred yet */
@@ -3272,6 +3284,7 @@ CopyReadLine(CopyState cstate)
 		{
 			/* transfer converted data back to line_buf */
 			resetStringInfo(&cstate->line_buf);
+			allowLongStringInfo(&cstate->line_buf);
 			appendBinaryStringInfo(&cstate->line_buf, cvt, strlen(cvt));
 			pfree(cvt);
 		}
@@ -3696,7 +3709,7 @@ CopyReadAttributesText(CopyState cstate)
 	}
 
 	resetStringInfo(&cstate->attribute_buf);
-
+	allowLongStringInfo(&cstate->attribute_buf);
 	/*
 	 * The de-escaped attributes will certainly not be longer than the input
 	 * data line, so we can just force attribute_buf to be large enough and
diff --git a/src/backend/lib/stringinfo.c b/src/backend/lib/stringinfo.c
index 7382e08..6e451b2 100644
--- a/src/backend/lib/stringinfo.c
+++ b/src/backend/lib/stringinfo.c
@@ -47,12 +47,24 @@ initStringInfo(StringInfo str)
 {
 	int			size = 1024;	/* initial default buffer size */
 
-	str->data = (char *) palloc(size);
+	str->data = (char *) palloc(size);	/* no need for "huge" at this point */
 	str->maxlen = size;
+	str->allowlong = false;
 	resetStringInfo(str);
 }
 
 /*
+ * allocLongStringInfo
+ *
+ * Mark the StringInfo as a candidate for a "huge" allocation
+ */
+void
+allowLongStringInfo(StringInfo str)
+{
+	str->allowlong = true;
+}
+
+/*
  * resetStringInfo
  *
  * Reset the StringInfo: the data buffer remains valid, but its
@@ -64,6 +76,7 @@ resetStringInfo(StringInfo str)
 	str->data[0] = '\0';
 	str->len = 0;
 	str->cursor = 0;
+	str->allowl

Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-01-22 Thread Daniel Verite
  Hi,

Here's an updated patch improving on how the horizontal and vertical
headers can be sorted.

The discussion upthread went into how it was desirable
to have independant sorts for these headers, possibly driven
by another column, in addition to the query's ORDER BY.

Thus the options now accepted are:

\crosstabview [ [-|+]colV[:scolV] [-|+]colH[:scolH]  [colG1[,colG2...]] ]

The optional scolV/scolH columns drive sorts for respectively
colV/colH (colV:scolV somehow means SELECT colV from... order by scolV)

colG1,... in 3rd arg indicate the columns whose contents form the grid
cells, the typical use case being that there's only one such column.
By default it's all columns minus colV and colH.

For example,

SELECT
  cust_id,
  cust_name,
  cust_date,
  date_part('month, sales_date),
  to_char(sales_date, 'Mon') as month,
  amount
FROM sales_view
WHERE [predicates]
[ORDER BY ...]

If we want to look at  in a grid with months names across, sorted
by month number, and customer name in the vertical header, sorted by date of 
acquisition, we could do this:

\crosstabview +cust_name:cust_date +5:4 amount

or letting the vertical header being sorted by the query's ORDER BY,
and the horizontal header same as above:

\crosstabview cust_name +5:4 amount

or sorting vertically by name, if it happens that the ORDER BY is missing or
is on something else:

\crosstabview +cust_name +5:4 amount


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 6d0cb3d..a242ec4 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -990,6 +990,123 @@ testdb=>
   
 
   
+\crosstabview [
+[-|+]colV
+[:scolV]
+[-|+]colH
+[:scolH]
+[colG1[,colG2...]]
+] 
+
+
+Execute the current query buffer (like \g) and shows
+the results inside a crosstab grid.
+The output column colV
+becomes a vertical header, optionally sorted by scolV,
+and the output column colH
+becomes a horizontal header, optionally sorted by
+scolH.
+
+colG1[,colG2...]
+is the list of output columns to project into the grid.
+By default, all output columns of the query except 
+colV and
+colH
+are included in this list.
+
+
+
+All columns can be refered to by their position (starting at 1), or by
+their name. Normal case folding and quoting rules apply on column
+names. By default,
+colV corresponds to column 1
+and colH to column 2.
+A query having only one output column cannot be viewed in crosstab, and
+colH must differ from
+colV.
+
+
+
+The vertical header, displayed as the leftmost column,
+contains the set of all distinct values found in
+column colV, in the order
+of their first appearance in the query results,
+or in ascending order if a plus (+) sign precedes
+colV,
+or in descending order if it's a minus (-) sign.
+
+
+The horizontal header, displayed as the first row,
+contains the set of all distinct non-null values found in
+column colH.  They come
+by default in their order of appearance in the query results,
+or in ascending order if a plus (+) sign precedes
+colH,
+or in descending order if it's a minus (-) sign.
+
+Also, they can optionally be sorted by another column, if
+colV
+(respectively colH) is
+immediately followed by a colon and a reference to another column
+scolV
+(respectively scolH).
+
+
+
+Inside the crosstab grid,
+given a query output with N columns
+(including colV and
+colH),
+for each distinct value x of
+colH
+and each distinct value y of
+colV,
+the contents of a cell located at the intersection
+(x,y) is determined by these rules:
+
+
+
+ if there is no corresponding row in the query results such that the
+ value for colH
+ is x and the value
+ for colV
+ is y, the cell is empty.
+
+
+
+
+
+ if there is exactly one row such that the value
+ for colH
+ is x and the value
+ for colV
+ is y, then the N-2 other
+ columns or the columns listed in
+ colG1[,colG2...]
+ are displayed in the cell, separated between each other by
+ a space character if needed.
+
+ If N=2, the letter X is displayed
+ in the cell as if a virtual third column contained that character.
+
+
+
+
+
+ if there are s

Re: [HACKERS] Why format() adds double quote?

2016-01-26 Thread Daniel Verite
Tatsuo Ishii wrote:

> IMO, it's a bug or at least an inconsistency

Personally I don't see this change being good for everything.

Let's play devil's advocate:

create table abc(U&"foo\2003" int);

U+2003 is 'EM SPACE', in Unicode's General Punctuation block.

With the current version, format('%I', attname) on this column is:
"foo "

With the patched version, it produces this:
foo 

So the visual hint that there are more characters at the end is lost.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why format() adds double quote?

2016-01-27 Thread Daniel Verite
Tatsuo Ishii wrote:

> 2) What does the SQL standard say? Do they say that non ASCII white
>   spaces should be treated as ASCII white spaces?

I've used white space in the example, but I'm concerned about
punctuation too.

unicode.org has this helpful paper:
http://www.unicode.org/L2/L2000/00260-sql.pdf
which studies Unicode in SQL-99 identifiers.

The relevant BNF they extracted from the standard looks like this:

identifier body> ::=
   
   [ {  |  }... ]

 ::=
   
   | 

 ::=

| 
| 
| 
| 
| 
| 
| 
| 

 ::=
 

 ::= ...

 ::=
   
   | 



The current version of quote_ident() plays it safe by implementing
the rule that, as soon it encounters a character outside
of US-ASCII, it surrounds the identifier with double quotes, no matter
to which category or block this character belongs.
So its output is guaranteed to be compatible with the above grammar.

The change in the patch is that multibyte characters just don't imply
quoting.

But according to the points 1 and 2 of the paper, the first character
must have the Unicode alphabetic property, and it must not
have the Unicode combining property.

I'm mostly ignorant in Unicode so I'm not sure of the precise
implications of having such Unicode properties, but still my
understanding is that the new quote_ident() ignores these rules,
so in this sense it could produce outputs that wouldn't be
compatible with SQL-99.

Also, here's what we say in the manual about non quoted identifiers:
http://www.postgresql.org/docs/current/static/sql-syntax-lexical.html

"SQL identifiers and key words must begin with a letter (a-z, but also
letters with diacritical marks and non-Latin letters) or an underscore
(_). Subsequent characters in an identifier or key word can be
letters, underscores, digits (0-9), or dollar signs ($)"

So it explicitly allows letters in general  (and also seems less
strict than SQL-99 about underscore), but it makes no promise about
Unicode punctuation or spaces, for instance, even though in practice
the parser seems to accept them just fine.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why format() adds double quote?

2016-01-27 Thread Daniel Verite
Tatsuo Ishii wrote:

> What is the "visual hint"? If you are talking about psql's output, it 
> never adds "visual hint" (double quotations). 
> 
> If you are talking about the string handling in a program, what kind 
> of program cares about "visiual"? 

Sure. The scenario I'm thinking about would be someone inspecting
generated queries, say for controlling or troubleshooting,
the queries containing identifiers injected with the help of
quote_ident/format. That could be from the logs, or
monitoring or audit tools.

If identifiers contain weird Unicode characters, currently 
that's relatively easy to spot because they get surrounded by
double quotes.

If I see something like this: UPDATE "test․table" SET...
I immediately think that there's something fishy. It looks like test
should be a schema, but the surrounding quotes say otherwise.
In any case, it's clear that it updates a table in the current schema.

But if I see that: UPDATE test․table SET...
is seems legit and seems to update "table" inside the "test" schema
even though that's not what it does, it actually updates the
"test․table" table in the current schema, because the dot between
test and table is not the US-ASCII U+002E,  it's U+2024,
'ONE DOT LEADER'
On my display, they are almost indiscernible.

This boils down to the fact that the current quote_ident gives:

=# select quote_ident('test․table');
 quote_ident  
--
 "test․table"

whereas the quote_ident patched as proposed gives:

=# select quote_ident('test․table');
 quote_ident 
-
 test․table


So this is what I don't feel good about.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-01 Thread Daniel Verite
Pavel Stehule wrote:

> 1. maybe we can decrease name to shorter "crossview" ?? I am happy with
> crosstabview too, just crossview is correct too, and shorter

I'm in favor of keeping crosstabview. It's more explicit, only 
3 characters longer and we have tab completion anyway.

> 2. Columns used for ordering should not be displayed by default. I can live
> with current behave, but hiding ordering columns is much more practical for
> me

I can see why, but I'm concerned about a consequence:
say we have 4 columns A,B,C,D and user does \crosstabview +A:B +C:D
If B and D are excluded by default, then there's nothing
left to display inside the grid.
It doesn't feel quite right. There's something counter-intuitive
in the fact that values in the grid would disappear depending on
whether and how headers are sorted.
With the 3rd argument, we let the user decide what they want
to see.

> 3. This code is longer, so some regress tests are recommended - attached
> simple test case

I've added a few regression tests to the psql testsuite
based on your sample data. New patch with these tests
included is attached, make check passes.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 6d0cb3d..a242ec4 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -990,6 +990,123 @@ testdb=>
   
 
   
+\crosstabview [
+[-|+]colV
+[:scolV]
+[-|+]colH
+[:scolH]
+[colG1[,colG2...]]
+] 
+
+
+Execute the current query buffer (like \g) and shows
+the results inside a crosstab grid.
+The output column colV
+becomes a vertical header, optionally sorted by scolV,
+and the output column colH
+becomes a horizontal header, optionally sorted by
+scolH.
+
+colG1[,colG2...]
+is the list of output columns to project into the grid.
+By default, all output columns of the query except 
+colV and
+colH
+are included in this list.
+
+
+
+All columns can be refered to by their position (starting at 1), or by
+their name. Normal case folding and quoting rules apply on column
+names. By default,
+colV corresponds to column 1
+and colH to column 2.
+A query having only one output column cannot be viewed in crosstab, and
+colH must differ from
+colV.
+
+
+
+The vertical header, displayed as the leftmost column,
+contains the set of all distinct values found in
+column colV, in the order
+of their first appearance in the query results,
+or in ascending order if a plus (+) sign precedes
+colV,
+or in descending order if it's a minus (-) sign.
+
+
+The horizontal header, displayed as the first row,
+contains the set of all distinct non-null values found in
+column colH.  They come
+by default in their order of appearance in the query results,
+or in ascending order if a plus (+) sign precedes
+colH,
+or in descending order if it's a minus (-) sign.
+
+Also, they can optionally be sorted by another column, if
+colV
+(respectively colH) is
+immediately followed by a colon and a reference to another column
+scolV
+(respectively scolH).
+
+
+
+Inside the crosstab grid,
+given a query output with N columns
+(including colV and
+colH),
+for each distinct value x of
+colH
+and each distinct value y of
+colV,
+the contents of a cell located at the intersection
+(x,y) is determined by these rules:
+
+
+
+ if there is no corresponding row in the query results such that the
+ value for colH
+ is x and the value
+ for colV
+ is y, the cell is empty.
+
+
+
+
+
+ if there is exactly one row such that the value
+ for colH
+ is x and the value
+ for colV
+ is y, then the N-2 other
+ columns or the columns listed in
+ colG1[,colG2...]
+ are displayed in the cell, separated between each other by
+ a space character if needed.
+
+ If N=2, the letter X is displayed
+ in the cell as if a virtual third column contained that character.
+
+
+
+
+
+ if there are several corresponding rows, the behavior is identical to
+ the case of one row except that the values coming from different rows
+ are stacked vertically, the different source rows being separated by
+ newline characters inside the cell.
+   

Re: [HACKERS] 2016-01 Commitfest

2016-02-03 Thread Daniel Verite
Alvaro Herrera wrote:

>  https://commitfest.postgresql.org/8/372/
> \crosstabview (previously: \rotate) in psql for crosstab-style display

About this one, the code is no longer moving, the latest addition was
regression tests a couple days ago.

I think it should be moved to the next CF, to be hopefully promoted to
Ready for Committer soon.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add schema-qualified relnames in constraint error messages.

2016-02-08 Thread Daniel Verite
Shulgin, Oleksandr wrote:

> Added to the Open commitfest: https://commitfest.postgresql.org/9/475/

Here's a review. Note that the patch tested and submitted
is not the initial one in the thread, so it doesn't exactly
match  $subject now.
What's tested here is a client-side approach, suggested by Tom
upthread as an alternative, and implemented by Oleksandr in 
0001-POC-errverbose-in-psql.patch

The patch has two parts: one part in libpq exposing a new function
named PQresultBuildErrorMessage, and a second part implementing an
\errverbose command in psql, essentially displaying the result of the
function.
The separation makes sense if we consider that other clients might benefit
from the function, or that libpq is a better place than psql to maintain
this in the future, as the list of error fields available in a PGresult
might grow.
OTOH maybe adding a new libpq function just for that is overkill, but this
is subjective.

psql part
=
Compiles and works as intended.
For instance with \set VERBOSITY default, an FK violation produces:

  # insert into table2 values(10);
  ERROR:  insert or update on table "table2" violates foreign key constraint
"table2_id_tbl1_fkey"
  DETAIL:  Key (id_tbl1)=(10) is not present in table "table1".

Then \errverbose just displays the verbose form of the error:
  # \errverbose
ERROR:  23503: insert or update on table "table2" violates foreign
  key constraint "table2_id_tbl1_fkey"
DETAIL:  Key (id_tbl1)=(10) is not present in table "table1".
SCHEMA NAME:  public
TABLE NAME:  table2
CONSTRAINT NAME:  table2_id_tbl1_fkey
LOCATION:  ri_ReportViolation, ri_triggers.c:3326

- make check passes
- valgrind test is OK (no obvious leak after using the command).

Missing bits:
- the command is not mentioned in the help (\? command, see help.c)
- it should be added to tab completions (see tab-complete.c)
- it needs to be described in the SGML documentation

libpq part
==
The patch implements and exports a new PQresultBuildErrorMessage()
function.

AFAICS its purpose is to produce a result similar to what
PQresultErrorMessage() would have produced, if PQERRORS_VERBOSE
was the verbosity in effect at execution time.

My comments:

- the name of the function does not really hint at what it does.
Maybe something like PQresultVerboseErrorMessage() would be more
explicit?

I would expect the new function to have the same interface than
PQresultErrorMessage(), but it's not the case.

- it takes a PGconn* and PGresult* as input parameters, but
PQresultErrorMessage takes only a  as input.
It's not clear why PGconn* is necessary.

- it returns a pointer to a strdup'ed() buffer, which
has to be freed separately by the caller. That differs from
PQresultErrorMessage() which keeps this managed inside the
PGresult struct.

- if protocol version < 3, an error message is returned. It's not
clear to the caller that this error is emitted by the function itself
rather than the query. Besides, are we sure it's necessary?
Maybe the function could just produce an output with whatever
error fields it has, even minimally with older protocol versions,
rather than failing?

- if the fixed error message is kept, it should pass through
libpq_gettext() for translation.

- a description of the function should be added to the SGML doc.
There's a sentence in PQsetErrorVerbosity() that says, currently:

  "Changing the verbosity does not affect the messages available from
   already-existing PGresult objects, only subsequently-created ones."

As it's precisely the point of that new function, that bit could
be altered to refer to it.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-08 Thread Daniel Verite
Teodor Sigaev wrote:

> Interesting feature, but it's not very obvious how to use it. I'd like to
> see  some example(s) in documentation.

I'm thinking of making a wiki page, because examples pretty much
require showing resultsets, and I'm not sure this would fly
with our current psql documentation, which is quite compact.

The current bit of doc I've produced is 53 lines long in manpage
format already. The text has not been proofread by a native
English speaker yet, so part of the problem might be that
it's just me struggling with english :)

> And I see an implementation of AVL tree in psql source code
> (src/bin/psql/crosstabview.c). Postgres already has a Red-Black tree
> implementation in src/include/lib/rbtree.h and
> src/backend/lib/rbtree.c. Is any advantage of using AVL tree here? I
> have some doubt about that and this implementation, obviously, will
> not be well tested. But I see in comments that implementation is
> reduced to insert only and it doesn't use the fact of ordering tree,
> so, even hash could be used.

Yes. I expect too that a RB tree or a hash-based algorithm would do
the job and perform well.

The AVL implementation in crosstabview is purposely simplified
and specialized for this job, resulting in ~185 lines of code
versus ~850 lines for rb-tree.c
But I understand the argument that the existing rb-tree has been
battle-tested, whereas this code hasn't.

I'm looking at rb-tree.c and thinking what it would take to
incorporate it:
1. duplicating or linking from backend/lib/rb-tree.c into psql/
2. replacing the elog() calls with something else in the case of psql
3. updating the crosstabview data structures and call sites.

While I'm OK with #3, #1 and #2 seem wrong.
I could adapt rb-tree.c so that the same code can be used
both client-side and server-side, but touching server-side
code for this feature and creating links in the source tree
feels invasive and overkill.

Another approach is to replace AVL with an hash-based algorithm,
but that raises the same sort of question. If crosstabview comes
with its specific implementation, why use that rather than existing
server-side code? But at a glance, utils/hash/dynahash.c seems quite
hard to convert for client-side use.

I'm open to ideas on this. In particular, if we have a hash table
implementation that is already blessed by the project and small enough
to make sense in psql, I'd be happy to consider it.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Daniel Verite
Alvaro Herrera wrote:

> So please can we have that wiki page so that the syntax can be hammered
> out a bit more.

Sure, I'm on it.

> I'm closing this as returned-with-feedback for now.

Well,  the feedback it got during months was incorporated into
the patch in the form of significant improvements, and
at the end of this CF it was at the point that it has really been
polished, and no other feedback was coming.

I'll resubmit.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Daniel Verite
Alvaro Herrera wrote:

> While I understand that you may think that "silence is consent",
> what I am afraid of is that some committer will look at this two
> months from now and say "I hate this Hcol+ stuff, -1 from me" and
> send the patch back for syntax rework.  IMO it's better to have more
> people chime in here so that the patch that we discuss during the
> next commitfest is really the best one we can think of.

Yes, but on the other hand we can't force people to participate.
If a patch is moving forward and being discussed here between
one author and one reviewer, and nothing particularly wrong
pops out in what is discussed, the reality if that other people will
not intervene.

Besides, as it being mentioned here frequently, all patches, even
much more important ones, are short on reviews and reviewers
and testing, still new stuff must keep getting in the source tree
to progress.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Daniel Verite
Alvaro Herrera wrote:

> Also, what about the business of putting "x" if there's no third column?
> Three months from now some Czech psql hacker will say "we should use
> Unicode chars for this" and we will be forever stuck with \pset
> unicode_crosstab_marker to change the character to a ☑ BALLOT BOX WITH
> CZECH.  Maybe we should think that a bit harder -- for example, what
> about just rejecting the case with no third column and forcing the user
> to add a third column with the character they choose?  That way you
> avoid that mess.

Yes,  that implicit "X" with 2 column resultsets is not essential, it may
be removed without real damage.

About the possible suggestion to have a \pset unicode_crosstab_marker,
my opinion would be that it's not important enough to justify a new
\pset setting.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Daniel Verite
Dean Rasheed wrote:

> I don't think we should allow sorting colV values client-side,
> overriding a server-side ORDER BY clause in the query.

I shared that opinion until (IIRC) the v8 or v9 of the patch.
Most of the evolution of this patch has been to go
from no client-side sorting option at all, to the full range
of possibilities, ascending or descending, and in both
vertical and horizontal directions.

I agree that colV sorting can be achieved through the
query's ORDER BY, which additionally is more efficient
so it should be the primary choice.

The reason to allow [+/-]colV in \crosstabview is because
I think the average user will expect it, by symmetry with colH.
As the display is reorganized to be like a "grid" instead of a "list
with several columns", we shift the focus to the symmetry
between horizontal and vertical headers, rather than on
the pre-crosstab form of the resultset, even if it's the
same data.
It's easier for the user to just stick a + in front of a column
reference than to figure out that the same result could
be achieved by editing the query and changing/adding
an ORDER BY.

Or said otherwise, having the [+/-] colV sorting is a way to
avoid the question:
"we can sort the horizontal header, so why can't we sort the
vertical header just the same?"


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Daniel Verite
Dean Rasheed wrote:

> Note that I might also want to pass additional sort options, such as
> "ORDER BY name NULLS LAST", which the existing syntax doesn't allow.
> In the new syntax, such sort options could be trivially supported in
> both the server- and client-side sorts:

Note that NULL values in the column that pivots are discarded
by \crosstabview, because NULL as the name of a column does not
make sense.

The doc (in the patch) says:

"The horizontal header, displayed as the first row, contains the set of
all distinct non-null values found in column colH"


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-09 Thread Daniel Verite
Tom Lane wrote:

> I do not think we want any client-side sorting in this feature at all,
> because the minute you have any such thing, you are going to have an
> absolutely never-ending stream of demands for more sorting features:
> multi column, numeric vs text, ASC vs DESC, locale-aware, etc etc etc.

It doesn't really do any client-side sorting, the rest of the thread might
refer to it like that by oversimplification, but if the command requests
a header to be sorted,  a "backdoor-style" query of this form
is sent to the server, with PQexecParams():

SELECT n FROM (VALUES ($1,1),($2,2),($3,3)...) ) AS l(x,n) 
  ORDER BY x [DESC]
where the values to display in the header are bound to 
$1,$2,.. and the type associated with these parameters is
the PQftype() of the field from which these values come.
Then  the  values coming back ordered by  tell us
where to position the values corresponding to $1,$2... in the
sorted header.

There are some cases when this sort cannot work.
For example if the field is an anonymous type or a ROW().
Or if the field is POINT(x,y), because our "point" type
does not support order by.
I believe these are corner cases for this feature. In these
cases, psql just displays the error message that PQexecParams()
emits.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add schema-qualified relnames in constraint error messages.

2016-02-09 Thread Daniel Verite
Shulgin, Oleksandr wrote:

> Most importantly, I'd like to learn of better options than storing the
> whole last_result in psql's pset structure.

I guess that you could, each time a query fails, gather silently the
result of \errverbose, store it in a buffer, discard the PGresult,
and in case the user does \errverbose before running another query,
output what was in that buffer.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-15 Thread Daniel Verite
Alvaro Herrera wrote:

> So please can we have that wiki page so that the syntax can be hammered
> out a bit more.

I've added a wiki page with explanation and examples here:

https://wiki.postgresql.org/wiki/Crosstabview

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-15 Thread Daniel Verite
Dean Rasheed wrote:

> My biggest problem is with the sorting, for all the reasons discussed
> above. There is absolutely no reason for \crosstabview to be
> re-sorting rows -- they should just be left in the original query
> result order

We want the option to sort the vertical the header in a late additional
step when the ORDER BY of the query is already assigned to another
purpose. 

I've submitted this example on the wiki:
https://wiki.postgresql.org/wiki/Crosstabview

create view v_data as 
select * from ( values
   ('v1','h2','foo', '2015-04-01'::date),
   ('v2','h1','bar', '2015-01-02'),
   ('v1','h0','baz', '2015-07-12'),
   ('v0','h4','qux', '2015-07-15')
 ) as l(v,h,c,d);


Normal top-down display:

select v,to_char(d,'Mon') as m, c  from v_data order by d;

 v  |  m  |  c  
+-+-
 v2 | Jan | bar
 v1 | Apr | foo
 v1 | Jul | baz
 v0 | Jul | qux

Crosstabview display without any additional sort:

 \crosstabview v m c

 v  | Jan | Apr | Jul 
+-+-+-
 v2 | bar | | 
 v1 | | foo | baz
 v0 | | | qux

"d" is not present the resultset but it drives the sort
so that month names come out in the natural order.

\crosstabview does not discard the order of colH nor the order of colV,
it follows both, so that we get v2,v1,v0 in this order in the leftmost
column (vertical header) just like in the resultset.

At this point, it seems to me that it's perfectly reasonable for our user
to expect the possibility of sorting additionally by "v" , without
changing the query and without changing the order of the horizontal
header:

 \crosstabview +v m c

 v  | Jan | Apr | Jul 
+-+-+-
 v0 | | | qux
 v1 | | foo | baz
 v2 | bar | | 



Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-17 Thread Daniel Verite
Jim Nasby wrote:

> >   ORDER BY name
> > \crosstabview cols = (select to_char(d, 'Mon') from
> > generate_series('2000-01-01'::date, '2000-12-01', '1 month') d)
> 
> My concern with that is that often you don't know what the columns will 
> be, because you don't know what exact data the query will produce. So to 
> use this syntax you'd have to re-create a huge chunk of the original 
> query. :(

Also, if that additional query refers to tables, it should be executed
with the same data visibility as the main query. Doesn't that mean
that both queries should happen within the same repeatable
read transaction?

Another  impractical aspect of this approach is that a
meta-command invocation in psql must fit on a single line, so
queries containing newlines are not acceptable as argument.
This problem exists with "\copy (select...) to ..."  already.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-18 Thread Daniel Verite
Daniel Verite wrote:

> > >   ORDER BY name
> > > \crosstabview cols = (select to_char(d, 'Mon') from
> > > generate_series('2000-01-01'::date, '2000-12-01', '1 month') d)
> > 
> > My concern with that is that often you don't know what the columns will 
> > be, because you don't know what exact data the query will produce. So to 
> > use this syntax you'd have to re-create a huge chunk of the original 
> > query. :(
> 
> Also, if that additional query refers to tables, it should be executed
> with the same data visibility as the main query. Doesn't that mean
> that both queries should happen within the same repeatable
> read transaction?
> 
> Another  impractical aspect of this approach is that a
> meta-command invocation in psql must fit on a single line, so
> queries containing newlines are not acceptable as argument.
> This problem exists with "\copy (select...) to ..."  already.

Thinking more about that, it occurs to me that if the sort must come
from a user-supplied bit of SQL, it would be simpler to just direct the
user to submit it in the main query, in an additional dedicated column.

For instance, to get a specific, separate order on "h",
let the user change this:

  SELECT v, h, c FROM v_data ORDER BY v;

into that:

  SELECT v, h, row_number() over(order by h) as hn, c
   FROM v_data ORDER BY v;

then with a relatively simple modification to the patch,
this invocation:

 \crosstabview v h:hn c

would display "h" in the horizontal header ordered by "hn".

ISTM this handles two objections raised upthread:

1. The ORDER BY inside OVER() can be augmented with additional
clauses such as lc_collate, desc, nulls last, etc... contrary to
the controversed "+/-" syntax.

2. a post-sort "backdoor" query is no longer necessary.

The drawback for me is that this change doesn't play out with
my original scenario for the command, which is to give the ability to
scrutinize query results in crosstab mode, playing with variations on
what column is pivoted and how headers for both directions get sorted,
while ideally not changing _at all_ the original query in the query
buffer, but just invoking  successive \crosstabview [args] commands
with varying arguments.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-18 Thread Daniel Verite
Peter Eisentraut wrote:

> On 2/9/16 11:21 AM, Daniel Verite wrote:
> > Note that NULL values in the column that pivots are discarded
> > by \crosstabview, because NULL as the name of a column does not
> > make sense.
> 
> Why not?
> 
> All you're doing is printing it out, and psql is quite capable of
> printing a null value.

Initially it's by analogy with the crosstab SRF, but it's true
that the same principle does not have to apply to crosstabview.

The code could set in the header whatever text "pset null" is set to,
at the place where a pivoted NULL would be supposed to go
if it was not filtered out in the first place.

I'll consider implementing that change if there's no objection.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-18 Thread Daniel Verite
Dean Rasheed wrote:

> If I want to sort the rows coming out of a query, my first thought
> is always going to be to add/adjust the query's ORDER BY clause, not
> use some weird +/- psql syntax.

About the vertical sort, I agree on all your points.
It's best to rely on ORDER BY for all the reasons mentioned,
as opposed to a separate sort in a second step.

But you're considering the case when a user is designing
or adapting a query for the purpose of crosstab
viewing. As mentioned in my previous reply (about the
methods to achieve horizontal sort), that scenario is not really
what motivates the feature in the first place.

If removing that sort option is required to move forward
with the patch because it's controversial, so be it,
but overall I don't see this as a benefit for the end user,
it's just an option.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql metaqueries with \gexec

2016-02-22 Thread Daniel Verite
Corey Huinker wrote:

> ...and query text visibility, and result visibility, and error handling,
> etc. In this case, we're leveraging the psql environment we'd already set
> up, and if there's an error, \set ECHO queries shows us the errant SQL as
> if we typed it ourselves..

BTW, about error handling, shouldn't it honor ON_ERROR_STOP ?

With the patch when trying this:

=> set ON_ERROR_STOP on
=> select * from (values ('select 1/0', 'select 1/0')) AS n \gexec

it produces two errors:
ERROR:  division by zero
ERROR:  division by zero

I'd rather have the execution stop immediately after the first error,
like it's the case with successive queries entered normally via the
query buffer:

=> \set ON_ERROR_STOP on
=> select 1/0; select 1/0;
ERROR:  division by zero

as opposed to:

=> \set ON_ERROR_STOP off
=> select 1/0; select 1/0;
ERROR:  division by zero
ERROR:  division by zero


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Renaming of pg_xlog and pg_clog

2016-08-29 Thread Daniel Verite
Craig Ringer wrote:

> People won't see a README in amongst 5000 xlog segments while
> freaking out about the sever being down.

Maybe they're more likely to google "pg_xlog".
BTW, renaming it will not help with respect to that, as
there's a pretty good corpus on knowledge linked to that
particular keyword.

The first google results of "pg_xlog" are, for me:

- Solving pg_xlog out of disk space problem on Postgres 
- PostgreSQL: Documentation: 9.1: WAL Internals
- Why is my pg_xlog directory so huge? - PostgreSQL
- Database Soup: Don't delete pg_xlog
...

That's spot-on. For each person deleting stuff in pg_xlog and then
regretting it, how many look it up in the above results and avoid
making the mistake? Will they find comparable results if the
directory is "wal" ?

Aside from that, we might also question how much of the excuse
"pg_xlog looked like it was just deletable logs" is a lie made up
after the fact, because anybody wrecking a database is not against
deflecting a bit of the blame to the software, that's human.

The truth being that they took the gamble that postgres
will somehow recover from the deletion, at the risk of possibly
loosing the latest transactions. That doesn't necessarily look
so bad when current transactions are failing anyway for lack of disk
space, users are yelling at you, and you're frantically searching for
anything to delete in the FS to come back online quickly.
Personally I'm quite skeptical of the *name* of the directory
playing that much of a role in this scenario.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Renaming of pg_xlog and pg_clog

2016-08-29 Thread Daniel Verite
Joshua D. Drake wrote:

> You log in, see that all the space and you find that you are using a
> ton of disk space. You look around for anything you can delete. You
> find a directory called pg_xlog, it says log, junior ignorant, don't
> want to be a sysadmin 101 says, "delete logs".

Yes, it happens. I don't deny the problem, but I'm wondering
about the wishful thinking we're possibly falling into here
concerning the solution.

Let's imagine that pg_xlog is named wal instead.
How does that help our user with the disk space problem?
Does that point to a path of resolution? I don't see it.
What do we think that user's next move will be?
After all, WAL means Write Ahead *Log*.

On the other hand, by decommissioning pg_xlog as a name,
that makes it obsolete in all presentations, tutorials, docs
that refer to that directory, and there are many of them.
There are years of confusion ahead with questions like
"where is that pg_xlog directory that I'm supposed to
monitor, or move into a different partition, etc...",
for a benefit that remains to be seen.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Surprising behaviour of \set AUTOCOMMIT ON

2016-09-03 Thread Daniel Verite
Rahila Syed wrote:

> >Above error coming because in below code block change, you don't have
> check for
> >command (you should check opt0 for AUTOCOMMIT command)

> Corrected in the attached.

There are other values than ON: true/yes and theirs abbreviations,
the value 1, and anything that doesn't resolve to OFF is taken as ON.
ParseVariableBool() in commands.c already does the job of converting
these to bool, the new code should probably just call that function
instead of parsing the value itself.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Surprising behaviour of \set AUTOCOMMIT ON

2016-09-03 Thread Daniel Verite
Rushabh Lathia wrote:

> It might happen that SetVariable() can be called from other places in
> future,
> so its good to have all the set variable related checks inside
> SetVariable().

There's already another way to skip the \set check:
  select 'on' as "AUTOCOMMIT" \gset

But there's a function in startup.c which might be the ideal location
for the check, as it looks like the one and only place where the
autocommit flag actually changes:

static void
autocommit_hook(const char *newval)
{
 pset.autocommit = ParseVariableBool(newval, "AUTOCOMMIT");
}



Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2016-09-06 Thread Daniel Verite
Craig Ringer wrote:

> Updated patch attached.

Please find attached a couple fixes for typos I've came across in
the doc part.

Also it appears that PQqueriesInBatch() doesn't work as documented.
It says:
 "Returns the number of queries still in the queue for this batch"
but in fact it's implemented as a boolean:

+/* PQqueriesInBatch
+ *   Return true if there are queries currently pending in batch mode
+ */+int
+PQqueriesInBatch(PGconn *conn)
+{
+   if (!PQisInBatchMode(conn))
+   return false;
+
+   return conn->cmd_queue_head != NULL;
+}

However, is this function really needed? It doesn't seem essential to
the API. You don't call it in the test program either.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 73c6c03..af4f922 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -4653,7 +4653,7 @@ int PQflush(PGconn *conn);
 
 could be much more efficiently done with:
 
- UPDATE mytable SET x = x + 1;
+ UPDATE mytable SET x = x + 1 WHERE id = 42;
 

 
@@ -4696,7 +4696,7 @@ int PQflush(PGconn *conn);
  non-blocking mode. If used 
in
  blocking mode it is possible for a client/server deadlock to occur. The
  client will block trying to send queries to the server, but the server 
will
- block trying to send results from queries it's already processed to the
+ block trying to send results from queries it has already processed to the
  client. This only occurs when the client sends enough queries to fill its
  output buffer and the server's receive buffer before switching to
  processing input from the server, but it's hard to predict exactly when
@@ -5015,7 +5015,7 @@ int PQgetNextQuery(PGconn *conn);
  
   
   Returns the number of queries still in the queue for this batch, not
-  including any query that's currently having results being processsed.
+  including any query that's currently having results being processed.
   This is the number of times PQgetNextQuery has to be
   called before the query queue is empty again.
 
@@ -5037,7 +5037,7 @@ int PQqueriesInBatch(PGconn *conn);
 
  
   
-   Returns 1 if the batch curently being received on a
+   Returns 1 if the batch currently being received on a
libpq connection in batch mode is
aborted, 0

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Surprising behaviour of \set AUTOCOMMIT ON

2016-09-08 Thread Daniel Verite
Rahila Syed wrote:

> However, including the check here will require returning the status
> of command from this hook. i.e if we throw error inside this
> hook we will need to return false indicating the value has not changed.

Looking at the other variables hooks, they already emit errors and can
deny the effect of a change corresponding to a new value, without
informing the caller. Why would autocommit be different?

For example echo_hook does this:

/* ...if the value is in (queries,errors,all,none) then
   assign pset.echo accordingly ... */
else
{
  psql_error("unrecognized value \"%s\" for \"%s\"; assuming \"%s\"\n",
   newval, "ECHO", "none");
  pset.echo = PSQL_ECHO_NONE;
}


If the user issues \set ECHO FOOBAR, then it produces the above error
message and makes the same internal change as if \set ECHO none
had been issued.

But, the value of the variable as seen by the user is still FOOBAR:

\set
[...other variables...]
ECHO = 'FOOBAR'

The proposed patch doesn't follow that behavior, as it denies
a new value for AUTOCOMMIT. You might argue that it's better,
but on the other side, there are two reasons against it:

- it's not consistent with the other uses of \set that affect psql,
such as ECHO, ECHO_HIDDEN,  ON_ERROR_ROLLBACK,
 COMP_KEYWORD_CASE... and even AUTOCOMMIT as
 \set AUTOCOMMIT FOOBAR is not denied, just reinterpreted.

- it doesn't address the case of another method than \set being used
 to set the variable, as in : SELECT 'on' as "AUTOCOMMIT" \gset
 whereas the hook would work in that case.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Surprising behaviour of \set AUTOCOMMIT ON

2016-09-15 Thread Daniel Verite
Tom Lane wrote:

> I think changing the hook API is a pretty reasonable thing to do here
> (though I'd make it its own patch and then add the autocommit change
> on top).  When was the last time you actually wanted to set VERBOSITY
> to "fubar"?

It looks easy to make the hooks return a bool, and when returning false,
cancel the assignment of the variable.
I volunteer to write that patch.

It would close the hazard that exists today of an internal psql flag
getting decorrelated from the corresponding variable, due to feeding
it with a wrong value, or in the case of autocommit, in the wrong
context.

Also with that, the behavior of ParseVariableBool() assuming "on"
when it's being fed with junk could be changed. Instead it could just
reject the assignment rather than emit a warning, and both
the internal flag and the variable would keep the values they had
at the point of the bogus assignment.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Improvements in psql hooks for variables

2016-09-16 Thread Daniel Verite
 Hi,

Following the discussion on forbidding an AUTOCOMMIT off->on
switch mid-transaction [1], attached is a patch that let the hooks
return a boolean indicating whether a change is allowed.

Using the hooks, bogus assignments to built-in variables can
be dealt with more strictly.

For example, pre-patch behavior:

  =# \set ECHO errors
  =# \set ECHO on
  unrecognized value "on" for "ECHO"; assuming "none"
  =# \echo :ECHO
  on

which has two problems:
- we have to assume a value, even though we can't know what the user meant.
- after assignment, the user-visible value of the variable diverges from its
internal counterpart (pset.echo in this case).


Post-patch:
  =# \set ECHO errors
  =# \set ECHO on
  unrecognized value "on" for "ECHO"
  \set: error while setting variable
  =# \echo :ECHO
  errors

Both the internal pset.* state and the user-visible value are kept unchanged
is the input value is incorrect.

Concerning AUTOCOMMIT, autocommit_hook() could return false to forbid
a switch when the conditions are not met.


Another user-visible effect of the patch is that, using a bogus value
for a built-in variable on the command-line becomes a fatal error
that prevents psql to continue.
This is not directly intended by the patch but is a consequence
of SetVariable() failing.

Example:
  $ ./psql -vECHO=bogus
  unrecognized value "bogus" for "ECHO"
  psql: could not set variable "ECHO"
  $ echo $?
  1

The built-in vars concerned by the change are:

booleans: AUTOCOMMIT, ON_ERROR_STOP, QUIET, SINGLELINE, SINGLESTEP

non-booleans: ECHO, ECHO_HIDDEN, ON_ERROR_ROLLBACK, COMP_KEYWORD_CASE,
 HISTCONTROL, VERBOSITY, SHOW_CONTEXT

We could go further to close the gap between pset.* and the built-in
variables,
by changing how they're initialized and forbidding deletion as Tom
suggests in [2], but if there's negative feedback on the above changes,
I think we should hear it first.

[1]
https://www.postgresql.org/message-id/f2cb5838-0ee9-4fe3-acc0-df77aeb7d4c7%40mm
[2]
https://www.postgresql.org/message-id/4695.1473961140%40sss.pgh.pa.us


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/src/bin/psql/startup.c b/src/bin/psql/startup.c
index 7ce05fb..9dcfc0a 100644
--- a/src/bin/psql/startup.c
+++ b/src/bin/psql/startup.c
@@ -786,43 +786,59 @@ showVersion(void)
  * This isn't an amazingly good place for them, but neither is anywhere else.
  */
 
-static void
+/*
+ * Hook to set an internal flag from a user-supplied string value.
+ * If the syntax is correct, affect *flag and return true.
+ * Otherwise, keep *flag untouched and return false.
+ */
+static bool
+generic_boolean_hook(const char *newval, const char* varname, bool *flag)
+{
+   bool isvalid;
+   bool val = ParseVariableBool(newval, varname, &isvalid);
+   if (isvalid)
+   *flag = val;
+   return isvalid;
+}
+
+static bool
 autocommit_hook(const char *newval)
 {
-   pset.autocommit = ParseVariableBool(newval, "AUTOCOMMIT");
+   return generic_boolean_hook(newval, "AUTOCOMMIT", &pset.autocommit);
 }
 
-static void
+static bool
 on_error_stop_hook(const char *newval)
 {
-   pset.on_error_stop = ParseVariableBool(newval, "ON_ERROR_STOP");
+   return generic_boolean_hook(newval, "ON_ERROR_STOP", 
&pset.on_error_stop);
 }
 
-static void
+static bool
 quiet_hook(const char *newval)
 {
-   pset.quiet = ParseVariableBool(newval, "QUIET");
+   return generic_boolean_hook(newval, "QUIET", &pset.quiet);
 }
 
-static void
+static bool
 singleline_hook(const char *newval)
 {
-   pset.singleline = ParseVariableBool(newval, "SINGLELINE");
+   return generic_boolean_hook(newval, "SINGLELINE", &pset.singleline);
 }
 
-static void
+static bool
 singlestep_hook(const char *newval)
 {
-   pset.singlestep = ParseVariableBool(newval, "SINGLESTEP");
+   return generic_boolean_hook(newval, "SINGLESTEP", &pset.singlestep);
 }
 
-static void
+static bool
 fetch_count_hook(const char *newval)
 {
pset.fetch_count = ParseVariableNum(newval, -1, -1, false);
+   return true;
 }
 
-static void
+static bool
 echo_hook(const char *newval)
 {
if (newval == NULL)
@@ -837,39 +853,52 @@ echo_hook(const char *newval)
pset.echo = PSQL_ECHO_NONE;
else
{
-   psql_error("unrecognized value \"%s\" for \"%s\"; assuming 
\"%s\"\n",
-  newval, "ECHO", "none");
-   pset.echo = PSQL_ECHO_NONE;
+   psql_error("unrecognized value \"%s\" for \"%s\"\n",
+  newval, "ECHO");
+   return false;
}
+   return true;
 }
 
-static void
+static bool
 echo_hidden_hook(const char *newval)
 {
if (newval == NULL)
pset.echo_hidden = PSQL_ECHO_HIDDEN_OFF;
else if (pg_strcasecmp(newval, "noexec") == 0)
pset.echo_hidden = PSQL_ECHO_HIDDEN_NOEXEC;
-   else if (ParseV

Re: [HACKERS] Improvements in psql hooks for variables

2016-09-19 Thread Daniel Verite
Rahila Syed wrote:


> I am beginning to review this patch. Initial comment. I got following
> compilation error when I applied the patch on latest sources and run make.

Sorry about that, I forgot to make clean, here's an updated patch.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index a9a2fdb..61b2304 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -254,7 +254,7 @@ exec_command(const char *cmd,
if (opt1 != NULL && strncmp(opt1, prefix, sizeof(prefix) - 1) 
== 0)
{
reuse_previous =
-   ParseVariableBool(opt1 + sizeof(prefix) - 1, 
prefix) ?
+   ParseVariableBool(opt1 + sizeof(prefix) - 1, 
prefix, NULL) ?
TRI_YES : TRI_NO;
 
free(opt1);
@@ -1548,7 +1548,7 @@ exec_command(const char *cmd,

 OT_NORMAL, NULL, false);
 
if (opt)
-   pset.timing = ParseVariableBool(opt, "\\timing");
+   pset.timing = ParseVariableBool(opt, "\\timing", NULL);
else
pset.timing = !pset.timing;
if (!pset.quiet)
@@ -2660,7 +2660,7 @@ do_pset(const char *param, const char *value, 
printQueryOpt *popt, bool quiet)
if (value && pg_strcasecmp(value, "auto") == 0)
popt->topt.expanded = 2;
else if (value)
-   popt->topt.expanded = ParseVariableBool(value, param);
+   popt->topt.expanded = ParseVariableBool(value, param, 
NULL);
else
popt->topt.expanded = !popt->topt.expanded;
}
@@ -2669,7 +2669,7 @@ do_pset(const char *param, const char *value, 
printQueryOpt *popt, bool quiet)
else if (strcmp(param, "numericlocale") == 0)
{
if (value)
-   popt->topt.numericLocale = ParseVariableBool(value, 
param);
+   popt->topt.numericLocale = ParseVariableBool(value, 
param, NULL);
else
popt->topt.numericLocale = !popt->topt.numericLocale;
}
@@ -2724,7 +2724,7 @@ do_pset(const char *param, const char *value, 
printQueryOpt *popt, bool quiet)
else if (strcmp(param, "t") == 0 || strcmp(param, "tuples_only") == 0)
{
if (value)
-   popt->topt.tuples_only = ParseVariableBool(value, 
param);
+   popt->topt.tuples_only = ParseVariableBool(value, 
param, NULL);
else
popt->topt.tuples_only = !popt->topt.tuples_only;
}
@@ -2756,7 +2756,7 @@ do_pset(const char *param, const char *value, 
printQueryOpt *popt, bool quiet)
popt->topt.pager = 2;
else if (value)
{
-   if (ParseVariableBool(value, param))
+   if (ParseVariableBool(value, param, NULL))
popt->topt.pager = 1;
else
popt->topt.pager = 0;
@@ -2778,7 +2778,7 @@ do_pset(const char *param, const char *value, 
printQueryOpt *popt, bool quiet)
else if (strcmp(param, "footer") == 0)
{
if (value)
-   popt->topt.default_footer = ParseVariableBool(value, 
param);
+   popt->topt.default_footer = ParseVariableBool(value, 
param, NULL);
else
popt->topt.default_footer = !popt->topt.default_footer;
}
diff --git a/src/bin/psql/startup.c b/src/bin/psql/startup.c
index 7ce05fb..9dcfc0a 100644
--- a/src/bin/psql/startup.c
+++ b/src/bin/psql/startup.c
@@ -786,43 +786,59 @@ showVersion(void)
  * This isn't an amazingly good place for them, but neither is anywhere else.
  */
 
-static void
+/*
+ * Hook to set an internal flag from a user-supplied string value.
+ * If the syntax is correct, affect *flag and return true.
+ * Otherwise, keep *flag untouched and return false.
+ */
+static bool
+generic_boolean_hook(const char *newval, const char* varname, bool *flag)
+{
+   bool isvalid;
+   bool val = ParseVariableBool(newval, varname, &isvalid);
+   if (isvalid)
+   *flag = val;
+   return isvalid;
+}
+
+static bool
 autocommit_hook(const char *newval)
 {
-   pset.autocommit = ParseVariableBool(newval, "AUTOCOMMIT");
+   return generic_boolean_hook(newval, "AUTOCOMMIT", &pset.autocommit);
 }
 
-static void
+static bool
 on_error_stop_hook(const char *newval)
 {
-   pset.on_error_stop = ParseVariableBool(newval, "ON_ERROR_STOP");
+   return generic_boolean_hook(newval, "ON_ERROR_STOP", 
&ps

Re: [HACKERS] Improvements in psql hooks for variables

2016-09-20 Thread Daniel Verite
Ashutosh Bapat wrote:

> You may want to add this to the November commitfest
> https://commitfest.postgresql.org/11/.

Done. It's at https://commitfest.postgresql.org/11/799/

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] get_home_path: use HOME

2016-09-22 Thread Daniel Verite
Tom Lane wrote:

> If we take this patch, what's to stop someone from complaining that we
> broke *their* badly-designed system that abuses the HOME variable?

POSIX warns against doing that, listing HOME in the variables that
should be left to their intended usage:
http://pubs.opengroup.org/onlinepubs/9699919799/


  If the variables in the following two sections are present in the
  environment during the execution of an application or utility, they
  shall be given the meaning described below
  [...]
  HOME
  The system shall initialize this variable at the time of login to
  be a pathname of the user's home directory. See .


psql is indirectly using $HOME already for readline and terminfo:

$ HOME=/tmp/home2 strace psql 2>tr ; grep home2 tr
...
stat("/tmp/home2/.terminfo", 0x7ff985bf4730) = -1 ENOENT (No such file or
directory)
stat("/tmp/home2/.inputrc", 0x7fff3f641d70) = -1 ENOENT (No such file or
directory)

Also when using Debian's psql, the wrapper looks for it own config file in
$HOME:
open("/tmp/home2/.postgresqlrc", O_RDONLY) = -1 ENOENT (No such file or
directory)
Being written in Perl, it could use getpwuid(), but it doesn't, like I
believe
the majority of programs that just want the home directory.

+1 on using HOME for being consistent with other pieces of code around
postgres, and for the easiness of locally overriding it when
troubleshooting problems with dot files.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

2016-09-28 Thread Daniel Verite
Tomas Vondra wrote:

> 4) HandleParallelMessage needs a tweak, as it uses msg->len in a log 
> message, but with '%d' and not '%ld' (as needed after changing the type 
> to Size).

This could be %zu for the Size type. It's supported by elog().

However, it occurs to me that if we don't claim the 2GB..4GB range for
the CopyData message, because its Int32 length is not explicitly
unsigned as mentioned upthread, then it should follow that we don't
need to change StringInfo.{len,maxlen} from int type to Size type.

We should just set the upper limit for StringInfo.maxlen to
(0x7fff-1) instead of MaxAllocHugeSize for stringinfos with the
allow_long flag set, and leave the len and maxlen fields to their
original, int type.

Does that make sense?


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Dynamic result sets from procedures

2017-11-02 Thread Daniel Verite
Peter Eisentraut wrote:

> CREATE PROCEDURE pdrstest1()
> LANGUAGE SQL
> AS $$
> DECLARE c1 CURSOR WITH RETURN FOR SELECT * FROM cp_test2;
> DECLARE c2 CURSOR WITH RETURN FOR SELECT * FROM cp_test3;
> $$;
> 
> CALL pdrstest1();
> 
> and that returns those two result sets to the client.

If applied to plpgsql, to return a dynamic result, the following
does work:

CREATE PROCEDURE test()
LANGUAGE plpgsql
AS $$
DECLARE
 query text:= 'SELECT 1 AS col1, 2 AS col2';
BEGIN
 EXECUTE 'DECLARE c CURSOR WITH RETURN FOR ' || query;
END;
$$;

This method could be used, for instance, to build a pivot with dynamic
columns in a single client-server round-trip, which is not possible today
with the query-calling-functions interface.
More generally, I guess this should help in the whole class of situations
where the client needs polymorphic results, which is awesome.

But instead of having procedures not return anything,
couldn't they return whatever resultset(s) they want to
("no resultset" being just a particular case of "anything"),
so that we could leave out cursors and simply write:

CREATE PROCEDURE test()
LANGUAGE plpgsql
AS $$
  RETURN QUERY  EXECUTE 'SELECT 1 AS col1, 2 AS col2';
END;
$$;

Or is that not possible or not desirable?

Similarly, for the SQL language, I wonder if the above example
could be simplified to:

CREATE PROCEDURE pdrstest1()
LANGUAGE SQL
AS $$
 SELECT * FROM cp_test2;
 SELECT * FROM cp_test3;
$$;
by which the two result sets would go back to the client again
without declaring explicit cursors.
Currently, it does not error out and no result set is sent.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Dynamic result sets from procedures

2017-11-04 Thread Daniel Verite
Peter Eisentraut wrote:

> > CREATE PROCEDURE test()
> > LANGUAGE plpgsql
> > AS $$
> >   RETURN QUERY  EXECUTE 'SELECT 1 AS col1, 2 AS col2';
> > END;
> > $$;
> > 
> > Or is that not possible or not desirable?
> 
> RETURN means the execution ends there, so how would you return multiple
> result sets?

RETURN alone yes, but RETURN QUERY continues the execution, appending
rows to the single result set of the function. In the case of a
procedure, I guess each RETURN QUERY could generate an independant
result set.

> But maybe you don't want to return all those results, so you'd need a
> way to designate which ones, e.g.,
> 
> AS $$
> SELECT set_config('something', 'value');
> SELECT * FROM interesting_table;  -- return only this one
> SELECT set_config('something', 'oldvalue');
> $$;

Yes, in that case, lacking PERFORM in SQL, nothing simple comes to
mind on how to return certain results and not others.
But if it was in an SQL function, it wouldn't return the rows of
"interesting_table" either. I think it would be justified to say to just
use plpgsql for that kind of sequence.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

2016-09-30 Thread Daniel Verite
Tomas Vondra wrote:

> A few minor comments regarding the patch:
> 
> 1) CopyStartSend seems pretty pointless - It only has one function call 
> in it, and is called on exactly one place (and all other places simply 
> call allowLongStringInfo directly). I'd get rid of this function and 
> replace the call in CopyOneRowTo(() with allowLongStringInfo().
> 
> 2) allowlong seems awkward, allowLong or allow_long would be better
> 
> 3) Why does resetStringInfo reset the allowLong flag? Are there any 
> cases when we want/need to forget the flag value? I don't think so, so 
> let's simply not reset it and get rid of the allowLongStringInfo() 
> calls. Maybe it'd be better to invent a new makeLongStringInfo() method 
> instead, which would make it clear that the flag value is permanent.
> 
> 4) HandleParallelMessage needs a tweak, as it uses msg->len in a log 
> message, but with '%d' and not '%ld' (as needed after changing the type 
> to Size).
>
> 5) The comment at allowLongStringInfo talks about allocLongStringInfo 
> (i.e. wrong function name).

Here's an updated patch. Compared to the previous version:

- removed CopyStartSend (per comment #1 in review)

- renamed flag to allow_long (comment #2)

- resetStringInfo no longer resets the flag (comment #3).

- allowLongStringInfo() is removed (comment #3 and #5),
makeLongStringInfo() and initLongStringInfo() are provided
instead, as alternatives to makeStringInfo() and initStringInfo().

- StringInfoData.len is back to int type, 2GB max.
(addresses comment #4 incidentally).
This is safer because  many routines peeking
into StringInfoData use int for offsets into the buffer,
for instance most of the stuff in backend/libpq/pqformat.c
Altough these routines are not supposed to be called on long
buffers, this assertion was not enforced in the code, and
with a 64-bit length effectively over 2GB, they would misbehave
pretty badly.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
diff --git a/src/backend/access/common/heaptuple.c 
b/src/backend/access/common/heaptuple.c
index 6d0f3f3..ed9cabd 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -741,7 +741,9 @@ heap_form_tuple(TupleDesc tupleDescriptor,
 * Allocate and zero the space needed.  Note that the tuple body and
 * HeapTupleData management structure are allocated in one chunk.
 */
-   tuple = (HeapTuple) palloc0(HEAPTUPLESIZE + len);
+   tuple = MemoryContextAllocExtended(CurrentMemoryContext,
+  
HEAPTUPLESIZE + len,
+  
MCXT_ALLOC_HUGE | MCXT_ALLOC_ZERO);
tuple->t_data = td = (HeapTupleHeader) ((char *) tuple + HEAPTUPLESIZE);
 
/*
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 432b0ca..7419362 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -397,7 +397,7 @@ ReceiveCopyBegin(CopyState cstate)
pq_sendint(&buf, format, 2);/* per-column 
formats */
pq_endmessage(&buf);
cstate->copy_dest = COPY_NEW_FE;
-   cstate->fe_msgbuf = makeStringInfo();
+   cstate->fe_msgbuf = makeLongStringInfo();
}
else if (PG_PROTOCOL_MAJOR(FrontendProtocol) >= 2)
{
@@ -1892,7 +1892,7 @@ CopyTo(CopyState cstate)
cstate->null_print_client = cstate->null_print; /* default */
 
/* We use fe_msgbuf as a per-row buffer regardless of copy_dest */
-   cstate->fe_msgbuf = makeStringInfo();
+   cstate->fe_msgbuf = makeLongStringInfo();
 
/* Get info about the columns we need to process. */
cstate->out_functions = (FmgrInfo *) palloc(num_phys_attrs * 
sizeof(FmgrInfo));
@@ -2707,8 +2707,8 @@ BeginCopyFrom(ParseState *pstate,
cstate->cur_attval = NULL;
 
/* Set up variables to avoid per-attribute overhead. */
-   initStringInfo(&cstate->attribute_buf);
-   initStringInfo(&cstate->line_buf);
+   initLongStringInfo(&cstate->attribute_buf);
+   initLongStringInfo(&cstate->line_buf);
cstate->line_buf_converted = false;
cstate->raw_buf = (char *) palloc(RAW_BUF_SIZE + 1);
cstate->raw_buf_index = cstate->raw_buf_len = 0;
diff --git a/src/backend/lib/stringinfo.c b/src/backend/lib/stringinfo.c
index 7382e08..0125047 100644
--- a/src/backend/lib/stringinfo.c
+++ b/src/backend/lib/stringinfo.c
@@ -37,6 +37,24 @@ makeStringInfo(void)
 }
 
 /*
+ * makeLongStringInfo
+ *
+ * Same as makeStringInfo for larger strings.
+ */
+StringInfo
+makeLongStringInfo(void)
+{
+   StringInfo  res;
+
+   res = (StringInfo) palloc(sizeof(StringInfoData));
+
+   initLongStringInfo(res);
+
+   return res;
+}
+
+
+/*
  * initStringInfo
  *
  * Initialize a StringInfoData struct (with previously undefine

Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2016-10-03 Thread Daniel Verite
Craig Ringer wrote:

> I think it's mostly of interest to app authors and driver developers
> and that's what it's aimed at. pg_restore could benefit a lot too.

Wouldn't pgbench benefit from it?
It was mentioned some time ago [1], in relationship to the 
\into construct, how client-server latency was important enough to
justify the use of a "\;" separator between statements, to send them
as a group.

But with the libpq batch API, maybe this could be modernized
with meta-commands like this:
  \startbatch
  ...
  \endbatch
which would essentially call PQbeginBatchMode() and PQsendEndBatch().
Inside the batch section, collecting results would be interleaved
with sending queries.
Interdepencies between results and subsequent queries could
be handled or ignored, depending on how sophisticated we'd want
this.

This might also draw more users to the batch API, because
it would make it easier to check how exactly it affects
the performance of specific sequences of SQL statements to be
grouped in a batch.
For instance it would make sense for programmers to benchmark mock-ups
of their code with pgbench with/without batching, before embarking on
adapting it from blocking mode to asynchronous/non-blocking mode.


[1]
https://www.postgresql.org/message-id/alpine.DEB.2.20.1607140925400.1962%40sto

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: psql \setfileref

2016-10-10 Thread Daniel Verite
Gilles Darold wrote:

>postgres=# \setfileref b /dev/random
>postgres=# insert into test (:b);
> 
> Process need to be killed using SIGTERM.

This can be fixed by setting sigint_interrupt_enabled to true
before operating on the file. Then ctrl-C would be able to cancel
the command.

See comment in common.c, above the declaration of 
sigint_interrupt_enabled:

/*
 []
 * SIGINT is supposed to abort all long-running psql operations, not only
 * database queries.  In most places, this is accomplished by checking
 * cancel_pressed during long-running loops.  However, that won't work when
 * blocked on user input (in readline() or fgets()).  In those places, we
 * set sigint_interrupt_enabled TRUE while blocked, instructing the signal
 * catcher to longjmp through sigint_interrupt_jmp.  We assume readline and
 * fgets are coded to handle possible interruption.  (XXX currently this does
 * not work on win32, so control-C is less useful there)
 */


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Better logging of COPY queries if log_statement='all'

2016-10-17 Thread Daniel Verite
Aleksander Alekseev wrote:

> According to my colleagues it would be very nice to have this feature.
> For instance, if you are trying to optimize PostgreSQL for application
> that uses COPY and you don't have access to or something like this. 
> It could also be useful in some other cases.

Outside of the app, what can be already set up is an AFTER INSERT
FOR EACH ROW trigger that essentially does:

  raise LOG, '%', NEW;

The main drawback of this approach is that, for each line of data
emitted to the log, there's another STATEMENT: copy... line added.
But that might be not too bad for certain uses.

Ideally we should be able to access the new rowset as a whole through
a statement-level trigger. In that case, the data could be logged in
a one-shot operation by that trigger.
There's a related item in the TODO list:
  "Allow statement-level triggers to access modified rows"
and an old thread on -hackers here:
https://www.postgresql.org/message-id/20060522150647.ge24...@svana.org
that discussed this topic in relation to MSSQL having this functionality.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improvements in psql hooks for variables

2016-11-10 Thread Daniel Verite
Rahila Syed wrote:

> I have applied this patch on latest HEAD and have done basic testing which
> works fine.

Thanks for reviewing this patch!

> >if (current->assign_hook)
> >-   (*current->assign_hook) (current->value);
> >-   return true;
> >+   {
> >+   confirmed = (*current->assign_hook) (value);
> >+   }
> >+   if (confirmed)
> 
> Spurious brackets

OK.

> >static bool
> >+generic_boolean_hook(const char *newval, const char* varname, bool *flag)
> >+{
> 
> Contrary to what name suggests this function does not seem to have other
> implementations as in a hook.
> Also this takes care of rejecting a syntactically wrong value only for
> boolean variable hooks like autocommit_hook,
> on_error_stop_hook. However, there are other variable hooks which call
> ParseVariableBool.
> For instance, echo_hidden_hook which is handled separately in the patch.
> Thus there is some duplication of code between generic_boolean_hook and
> echo_hidden_hook.
> Similarly for on_error_rollback_hook.

The purpose of generic_boolean_hook() is to handle the case of a
boolean variable that only accepts ON or OFF, and has its pset.varname
declared as bool. I thought of this case as "generic" because that's
the base case and several variables need no more than that.

ECHO_HIDDEN differs from the generic boolean case because it also
accepts "noexec" and pset.echo_hidden is an enum, not a boolean. When
considering refactoring echo_hidden_hook() to call
generic_boolean_hook() instead of ParseVariableBool() after 
having established that the value is not "noexec", I don't see
any advantage in clarity or code size, so I'm not in favor of that change.

The same applies to on_error_rollback_hook(), which has to deal
with a specific enum as well.

> >-static void
> >+static bool
> > fetch_count_hook(const char *newval)
> > {
> >pset.fetch_count = ParseVariableNum(newval, -1, -1, false);
> >+   return true;
> > }
> 
> Shouldn't invalid numeric string assignment for numeric variables be
> handled too?

Agreed. Assignments like "\set FETCH_COUNT bogus" don't provoke any
user feedback currently, which is not ideal. I'll add this in a
v3 of the patch tomorrow.

> Instead of generic_boolean_hook cant we have something like follows which
> like generic_boolean_hook can be called from specific variable assignment
> hooks,
> 
> static bool
> ParseVariable(newval, VariableName, &pset.var)
> {
> if (VariableName == ‘AUTOCOMMIT’ || ECHO_HIDDEN || other variable with
> hooks which call ParseVariableBool )
>   ON_ERROR_ROLLBACK>
> else if (VariableName == ‘FETCH_COUNT’)
> ParseVariableNum();
> }

It's not possible because pset.var corresponds to different fields from
struct _psqlSettings that have different types: bool, int and some
enum types.
Besides, I don't think it would go well with hooks. If we wanted one
big function that knows all about parsing all built-in variables, we
could just as well dispense with hooks, since their current purpose in
psql is to achieve this parsing, but in a decentralized way.
Or if we keep them, our various built-in variables would be
essentially tied to the same one-big-hook-that-does-all, but isn't
that an antipattern for hooks?


> >@@ -260,7 +276,7 @@ SetVariableAssignHook(VariableSpace space, const char
> *name, VariableAssignHook
> >current->assign_hook = hook;
> >current->next = NULL;
> >previous->next = current;
> >-   (*hook) (NULL);
> >+   (void)(*hook) (NULL);   /* ignore return value */
> 
> Sorry for my lack of understanding, can you explain why is above change
> needed?

"hook" is changed by the patch from [pointer to function returning
void], to [pointer to function returning bool]. The cast to void is
not essential, it just indicates that we deliberately want to
ignore the return value here. I expect some compilers might
complain under a high level of warnings without this cast, although
TBH if you ask me, I wouldn't know which compiler with which flags
exactly.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improvements in psql hooks for variables

2016-11-15 Thread Daniel Verite
  Hi,

I'm attaching v3 of the patch with error reporting for
FETCH_COUNT as mentioned upthread, and rebased
on the most recent master.

> I was suggesting change on the lines of extending generic_boolean_hook to
> include enum(part in enum_hooks which calls ParseVariableBool) and
> integer parsing as well.

Well, generic_boolean_hook() is meant to change this, for instance:

  static void
  on_error_stop_hook(const char *newval)
  {
 pset.on_error_stop = ParseVariableBool(newval, "ON_ERROR_STOP");
  }

into that:

  static bool
  on_error_stop_hook(const char *newval)
  {
 return generic_boolean_hook(newval, "ON_ERROR_STOP",
   &pset.on_error_stop);
   }

with the goal that the assignment does not occur if "newval" is bogus.
The change is really minimal.

When we're dealing with enum-or-bool variables, such as for instance
ECHO_HIDDEN, the patch replaces this:

  static void
  echo_hidden_hook(const char *newval)
  {
if (newval == NULL)
  pset.echo_hidden = PSQL_ECHO_HIDDEN_OFF;
else if (pg_strcasecmp(newval, "noexec") == 0)
  pset.echo_hidden = PSQL_ECHO_HIDDEN_NOEXEC;
else if (ParseVariableBool(newval, "ECHO_HIDDEN"))
  pset.echo_hidden = PSQL_ECHO_HIDDEN_ON;
else  /* ParseVariableBool printed msg if needed */
  pset.echo_hidden = PSQL_ECHO_HIDDEN_OFF;
  }

with that:

  static bool
  echo_hidden_hook(const char *newval)
  {
if (newval == NULL)
  pset.echo_hidden = PSQL_ECHO_HIDDEN_OFF;
else if (pg_strcasecmp(newval, "noexec") == 0)
  pset.echo_hidden = PSQL_ECHO_HIDDEN_NOEXEC;
else
{
  bool isvalid;
  bool val = ParseVariableBool(newval, "ECHO_HIDDEN", &isvalid);
  if (!isvalid)
return false; /* ParseVariableBool printed msg */
  pset.echo_hidden = val ? PSQL_ECHO_HIDDEN_ON : PSQL_ECHO_HIDDEN_OFF;
}
return true;
  }

The goal being again to reject a bogus assignment, as opposed to replacing
it with any hardwired value.
Code-wise, we can't call generic_boolean_hook() here because we need
to assign a non-boolean specific value after having parsed the ON/OFF
user-supplied string.

More generally, it turns out that the majority of hooks are concerned
by this patch, as they parse user-supplied values, but there
are 4 distinct categories of variables:

1- purely ON/OFF vars:
   AUTOCOMMIT, ON_ERROR_STOP, QUIET, SINGLELINE, SINGLESTEP

2- ON/OFF mixed with enum values:
  ECHO_HIDDEN, ON_ERROR_ROLLBACK

3- purely enum values:
  COMP_KEYWORD_CASE, HISTCONTROL, ECHO, VERBOSITY, SHOW_CONTEXT

4- numeric values:
  FETCH_COUNT

If you suggest that the patch should refactor the implementation
of hooks for case #2, only two hooks are concerned and they consist
of non-mergeable enum-specific code interleaved with generic code,
so I don't foresee any gain in fusioning. I have the same opinion about
merging any of #1, #2, #3, #4 together.
But feel free to post code implementing your idea if you disagree,
maybe I just don't figure out what you have in mind.
For case #3, these hooks clearly follow a common pattern, but I also
don't see any benefit in an opportunistic rewrite given the nature of
the functions.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index a9a2fdb..61b2304 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -254,7 +254,7 @@ exec_command(const char *cmd,
if (opt1 != NULL && strncmp(opt1, prefix, sizeof(prefix) - 1) 
== 0)
{
reuse_previous =
-   ParseVariableBool(opt1 + sizeof(prefix) - 1, 
prefix) ?
+   ParseVariableBool(opt1 + sizeof(prefix) - 1, 
prefix, NULL) ?
TRI_YES : TRI_NO;
 
free(opt1);
@@ -1548,7 +1548,7 @@ exec_command(const char *cmd,

 OT_NORMAL, NULL, false);
 
if (opt)
-   pset.timing = ParseVariableBool(opt, "\\timing");
+   pset.timing = ParseVariableBool(opt, "\\timing", NULL);
else
pset.timing = !pset.timing;
if (!pset.quiet)
@@ -2660,7 +2660,7 @@ do_pset(const char *param, const char *value, 
printQueryOpt *popt, bool quiet)
if (value && pg_strcasecmp(value, "auto") == 0)
popt->topt.expanded = 2;
else if (value)
-   popt->topt.expanded = ParseVariableBool(value, param);
+   popt->topt.expanded = ParseVariableBool(value, param, 
NULL);
else
popt->topt.expanded = !popt->topt.expanded;
}
@@ -2669,7 +2669,7 @@ do_pset(const char *param, const char *value, 
printQueryOpt *popt, bool quiet)
else if (strcmp(param, "numericlocale") ==

Re: [HACKERS] proposal: psql \setfileref

2016-11-15 Thread Daniel Verite
Corey Huinker wrote:

> I am not asking for this feature now, but do you see any barriers to later
> adding x/xq/xb/xbq equivalents for executing a shell command?

For text contents, we already have this (pasted right from the doc):

testdb=> \set content `cat my_file.txt`
testdb=> INSERT INTO my_table VALUES (:'content');

Maybe we might look at why it doesn't work for binary string and fix that?

I can think of three reasons:

- psql use C-style '\0' terminated string implying no nul byte in variables.
That could potentially be fixed by tweaking the data structures and
accessors.

- the backtick operator trims ending '\n' from the result of the command
(like the shell), but we could add a variant that doesn't have this behavior.

- we don't have "interpolate as binary", an operator that will
essentially apply PQescapeByteaConn rather than PQescapeStringConn.

For example, I'd suggest this syntax:

-- slurp a file into a variable, with no translation whatsoever
\set content ``cat my_binary_file``

-- interpolate as binary (backquotes instead of quotes)
INSERT INTO my_table VALUES(:`content`);

If we had something like this, would we still need filerefs? I feel like
the point of filerefs is mainly to work around lack of support for
binary in variables, but maybe supporting the latter directly would
be an easier change to accept.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improvements in psql hooks for variables

2016-11-21 Thread Daniel Verite
Tom Lane wrote:

> Stephen Frost  writes:
> > In reviewing this patch, I also noticed that it's set up to assume a
> > 'true' result when a variable can't be parsed by ParseVariableBool.
> 
> I suppose that's meant to be backwards-compatible with the current
> behavior:
> 
> regression=# \timing foo
> unrecognized value "foo" for "\timing"; assuming "on"
> Timing is on.

Exactly. The scope of the patch is limited to the effect
of \set assignments to built-in variables.

> but I agree that if we're changing things in this area, that would
> be high on my list of things to change.  I think what we want going
> forward is to disallow setting "special" variables to invalid values,
> and that should hold both for regular variables that have special
> behaviors, and for the special-syntax cases like \timing.

+1

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improvements in psql hooks for variables

2016-11-21 Thread Daniel Verite
Stephen Frost wrote:

> Just fyi, there seems to be some issues with this patch because setting
> my PROMPT1 and PROMPT2 variables result in rather odd behavior.

Good catch! The issue is that the patch broke the assumption
of prompt hooks that their argument points to a long-lived buffer.
The attached v4 fixes the bug by passing to hooks a pointer to the final
strdup'ed value in VariableSpace rather than temp space, as commented
in SetVariable().

Also I've changed something else in ParseVariableBool(). The code assumes
"false" when value==NULL, but when value is an empty string, the result
was true and considered valid, due to the following test being
positive when len==0 (both with HEAD or the v3 patch from this thread):

if (pg_strncasecmp(value, "true", len) == 0)
return true;
It happens that "" as a value yields the same result than "true" for this
test so it passes, but it seems rather unintentional.

The resulting problem from the POV of the user is that we can do that,
for instance:

   test=> \set AUTOCOMMIT

without error message or feedback, and that leaves us without much
clue about autocommit being enabled:

   test=> \echo :AUTOCOMMIT

   test=> 

So I've changed ParseVariableBool() in v4 to reject empty string as an
invalid boolean (but not NULL since the startup logic requires NULL
meaning false in the early initialization of these variables).

"make check" seems OK with that, I hope it doesn't cause any regression
elsewhere.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index a9a2fdb..61b2304 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -254,7 +254,7 @@ exec_command(const char *cmd,
if (opt1 != NULL && strncmp(opt1, prefix, sizeof(prefix) - 1) 
== 0)
{
reuse_previous =
-   ParseVariableBool(opt1 + sizeof(prefix) - 1, 
prefix) ?
+   ParseVariableBool(opt1 + sizeof(prefix) - 1, 
prefix, NULL) ?
TRI_YES : TRI_NO;
 
free(opt1);
@@ -1548,7 +1548,7 @@ exec_command(const char *cmd,

 OT_NORMAL, NULL, false);
 
if (opt)
-   pset.timing = ParseVariableBool(opt, "\\timing");
+   pset.timing = ParseVariableBool(opt, "\\timing", NULL);
else
pset.timing = !pset.timing;
if (!pset.quiet)
@@ -2660,7 +2660,7 @@ do_pset(const char *param, const char *value, 
printQueryOpt *popt, bool quiet)
if (value && pg_strcasecmp(value, "auto") == 0)
popt->topt.expanded = 2;
else if (value)
-   popt->topt.expanded = ParseVariableBool(value, param);
+   popt->topt.expanded = ParseVariableBool(value, param, 
NULL);
else
popt->topt.expanded = !popt->topt.expanded;
}
@@ -2669,7 +2669,7 @@ do_pset(const char *param, const char *value, 
printQueryOpt *popt, bool quiet)
else if (strcmp(param, "numericlocale") == 0)
{
if (value)
-   popt->topt.numericLocale = ParseVariableBool(value, 
param);
+   popt->topt.numericLocale = ParseVariableBool(value, 
param, NULL);
else
popt->topt.numericLocale = !popt->topt.numericLocale;
}
@@ -2724,7 +2724,7 @@ do_pset(const char *param, const char *value, 
printQueryOpt *popt, bool quiet)
else if (strcmp(param, "t") == 0 || strcmp(param, "tuples_only") == 0)
{
if (value)
-   popt->topt.tuples_only = ParseVariableBool(value, 
param);
+   popt->topt.tuples_only = ParseVariableBool(value, 
param, NULL);
else
popt->topt.tuples_only = !popt->topt.tuples_only;
}
@@ -2756,7 +2756,7 @@ do_pset(const char *param, const char *value, 
printQueryOpt *popt, bool quiet)
popt->topt.pager = 2;
else if (value)
{
-   if (ParseVariableBool(value, param))
+   if (ParseVariableBool(value, param, NULL))
popt->topt.pager = 1;
else
popt->topt.pager = 0;
@@ -2778,7 +2778,7 @@ do_pset(const char *param, const char *value, 
printQueryOpt *popt, bool quiet)
else if (strcmp(param, "footer") == 0)
{
if (value)
-   popt->topt.default_footer = ParseVariableBool(value, 
param);
+   popt->topt.default_footer = ParseVariableBool(value, 
param, NULL);
  

Re: [HACKERS] Improvements in psql hooks for variables

2016-11-21 Thread Daniel Verite
Stephen Frost wrote:

> Are you working to make those changes?  I'd rather we make the changes
> to this code once rather than push what you have now only to turn around
> and change it significantly again.

If, as a maintainer, you prefer this together in one patch, I'm fine
with that. So I'll update it shortly with changes in \timing and
a few other callers of ParseVariableBool().

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improvements in psql hooks for variables

2016-11-21 Thread Daniel Verite
Stephen Frost wrote:

> Are you working to make those changes?  I'd rather we make the changes
> to this code once rather than push what you have now only to turn around
> and change it significantly again.

So I went through the psql commands which don't fail on parse errors
for booleans. I'd like to attract attention on \c, because I see
several options.

\c [-reuse-previous=BOOL] ...other args..

Current: if we can't parse BOOL, assume it's ON, and go on with reconnecting.

Option1: if we can't parse BOOL, stop here, don't reconnect, set
the command result as "failed", also ignoring the other arguments.

Option2: maybe we want to create a difference between interactive
and non-interactive use, as there's already one in handling
the failure to connect through \c.
The manpage says:
   If the connection attempt failed (wrong user name, access denied,
   etc.), the previous connection will only be kept if psql is in
   interactive mode. When executing a non-interactive script,
   processing will immediately stop with an error.

Proposal: if interactive, same as Option1.
If not interactive, -reuse-previous=bogus has the same outcome
as a failure to connect. Which I think means two subcases:
if it's through \i then kill the connection but don't quit psql
if it's through -f then quit psql.

Option3: leave this command unchanged, avoiding trouble.

\timing BOOL

Current: non-parseable BOOL interpreted as TRUE. Empty BOOL toggles the
state.

Proposal: on non-parseable BOOL, command failure and pset.timing is 
left unchanged.

\pset [x | expanded | vertical ] BOOL
\pset numericlocale BOOL
\pset [tuples_only | t] BOOL
\pset footer BOOL
\t BOOL (falls into pset_do("tuples_only", ...))
\pset pager BOOL

Current: non-parseable non-empty BOOL interpreted as TRUE. Empty BOOL
toggles the on/off state. In some cases, BOOL interpretation is attempted
only after specific built-in values have been ruled out first.

Proposal: non-parseable BOOL implies command failure and unchanged state.

About the empty string when a BOOL is expected. Only \c -reuse-previous
seems concerned:

#= \c -reuse-previous=
acts the same as
#= \c -reuse-previous=ON

Proposal: handle empty as when the value is bogus.

The other commands interpret this lack of value specifically to toggle
the state, so it's a non-issue for them.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improvements in psql hooks for variables

2016-11-22 Thread Daniel Verite
Stephen Frost wrote:

> That certainly doesn't feel right.  I'm thinking that if we're going to
> throw an error back to the user about a value being invalid then we
> shouldn't change the current value.
> 
> My initial thought was that perhaps we should pass the current value to
> ParseVariableBool() and let it return whatever the "right" answer is,
> however, your use of ParseVariableBool() for enums that happen to accept
> on/off means that won't really work.

That's not needed once ParseVariableBool() informs the caller about
the validity of the boolean expression, which is what the patch already
does.

For instance I just implemented it for \timing and the diff consists of just
that:

if (opt)
-   pset.timing = ParseVariableBool(opt, "\\timing",
NULL);
+   {
+   boolnewval = ParseVariableBool(opt, "\\timing",
&success);
+   if (success)
+   pset.timing = newval;
+   }
else
pset.timing = !pset.timing;

That makes \timing foobar being rejected as a bad command with a
proper error message and no change of state, which is just what we want.

> Perhaps the right answer is to flip this around a bit and treat booleans
> as a special case of enums and have a generic solution for enums.
> Consider something like:
> 
> ParseVariableEnum(valid_enums, str_value, name, curr_value);
> 
> 'valid_enums' would be a simple list of what the valid values are for a
> given variable and their corresponding value, str_value the string the
> user typed, name the name of the variable, and curr_value the current
> value of the variable.

Firstly I'd like to insist that such a refactoring is not necessary
for this patch and I feel like it would be out of place in it.

That being said, if we wanted this, I think it would be successful
only if we'd first change our internal variables pset.* from a struct 
of different types to a list of variables from some kind of common
abstract type and an abstraction layer to access them.
That would be an order of magnitude more sophisticated than what we
have.

Otherwise as I tried to explain in [1], I don't see how we could write
a ParseVariableEnum() that would return different types
and take variable inputs.
Or if we say that ParseVariableEnum should not return the value
but affect the variable directly, that would require refactoring
all call sites, and what's the benefit that would justify
such large changes?
Plus we have two different non-mergeable concepts of variables
that need this parser:
psql variables from VariableSpace stored as strings,
and C variables directly instantiated as native types.


Also, the argument that bools are just another type of enums
is legitimate in theory, but as in psql we accept any left-anchored
match of true/false/on/off/0/1, it means that the enumeration
of values is in fact:

0
1
t
tr
tru
true
f
fa
fal
fals
false
on
of
off

I don't see that it would help if the code treated the above like just a
vanilla list of values, comparable to the other qualifiers like "auto",
"expanded", "vertical", an so on, notwithstanding the fact
that they don't share the same types.

I think that the current code with ParseVariableBool() that only
handles booleans is better in terms of separation of concerns
and readability.

[1]
https://www.postgresql.org/message-id/fc879967-da93-43b6-aa5a-92f2d825e786@mm

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-02-23 Thread Daniel Verite
Tom Lane wrote:

> Ah, I see why *that* wants to know about it ... I think.  I suppose you're
> arguing that variable expansion shouldn't be able to insert, say, an \else
> in a non-active branch?  Maybe, but if it can insert an \else in an active
> branch, then why not non-active too?  Seems a bit inconsistent.

Are we sold on the idea that conditionals should be implemented
by meta-commands, rather than for example terminal symbols of
a new grammar on top of the existing?

To recall the context, psql variables are really macros that may
contain meta-commands, and when they do they're essentially
injected and executed at the point of interpolation. That's more
or less what started this thread: demo'ing how we could exit
conditionally by injecting '\q' or nothing into a variable, and
saying that even if doable it was pretty weird, and it would be
better to have real conditional structures instead.

But when conditional structures are implemented as
meta-commands, there's the problem that this structure
can be generated on the fly too, which in a way is no less weird.
While I think that the introduction of conditionals in
psql is great, I'm getting doubtful about that part.
Are there popular script languages or preprocessors
that accept variables/macros instead of symbols to structure
the flow of instructions? I can't think of any myself.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] One-shot expanded output in psql using \gx

2017-03-03 Thread Daniel Verite
Christoph Berg wrote:

> The new version tests \g and \gx with a new query, and
> re-running it on the last query buffer.

Thanks, here's a review:

The patch compiles and works as expected.

The code follows the same pattern as other one-shot command
modifiers, setting a flag in the global "pset" struct
in exec_command() and resetting it at the end of SendQuery().

- make installcheck-world: ok

- sgml doc: ok

- help text: ok

- includes regression tests: ok

- tab-completion: works but the list in tab-complete.c:backslash_commands[]
is sorted alphabetically so "\\gx" should come after "\\gset"

- another nitpick: in PrintQueryTuples() it assigns a bool:
+   /* one-shot expanded output requested via \gx */
+   if (pset.g_expanded)
+   my_popt.topt.expanded = true;
+ 

"expanded" is defined as a tri-valued short int (print.h:98):
  typedef struct printTableOpt
  {
//
unsigned short int expanded;/* expanded/vertical output (if supported
by
 * output format); 0=no, 1=yes, 2=auto */
Although there is still code that puts true/false in this variable
(probably because it was a bool before?), I assume that for new
code, assigning 0/1/2 should be preferred.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] One-shot expanded output in psql using \gx

2017-03-07 Thread Daniel Verite
Christoph Berg wrote:

> Both fixed, thanks for the review! Version 3 attached.

It looks good to me.

Stephen Frost is also reviewer on the patch, so I'm moving the
status back to "Needs review" at
https://commitfest.postgresql.org/13/973/
and let him proceed.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2017-03-07 Thread Daniel Verite
Vaishnavi Prabakaran wrote:

> Yes, I have created a new patch entry into the commitfest 2017-03 and
> attached the latest patch with this e-mail.

Please find attached a companion patch implementing the batch API in
pgbench, exposed as \beginbatch and \endbatch meta-commands
(without documentation).

The idea for now is to make it easier to exercise the API and test
how batching performs. I guess I'll submit the patch separately in
a future CF, depending on when/if the libpq patch goes in.

While developing this, I noted a few things with 0001-v4:

1. lack of initialization for count in PQbatchQueueCount.
Trivial fix:

--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1874,7 +1874,7 @@ PQisBusy(PGconn *conn)
 int
 PQbatchQueueCount(PGconn *conn)
 {
-   int count;
+   int count = 0;
PGcommandQueueEntry *entry;

2. misleading error message in PQexecStart. It gets called by a few other
functions than PQexec, such as PQprepare. As I understand it, the intent
here is to forbid the synchronous functions in batch mode, so this error
message should not single out PQexec.

@@ -1932,6 +2425,13 @@ PQexecStart(PGconn *conn)
if (!conn)
return false;
 
+   if (conn->asyncStatus == PGASYNC_QUEUED || conn->batch_status !=
PQBATCH_MODE_OFF)
+   {
+   printfPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("cannot
PQexec in batch mode\n"));
+   return false;
+   }
+

3. In relation to #2, PQsendQuery() is not forbidden in batch mode
although I don't think it can work with it, as it's based on the old
protocol. 


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index f6cb5d4..9b2fce8 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -269,7 +269,8 @@ typedef enum
 *
 * CSTATE_START_COMMAND starts the execution of a command.  On a SQL
 * command, the command is sent to the server, and we move to
-* CSTATE_WAIT_RESULT state.  On a \sleep meta-command, the timer is 
set,
+* CSTATE_WAIT_RESULT state unless in batch mode.
+* On a \sleep meta-command, the timer is set,
 * and we enter the CSTATE_SLEEP state to wait for it to expire. Other
 * meta-commands are executed immediately.
 *
@@ -1882,11 +1883,24 @@ sendCommand(CState *st, Command *command)
if (commands[j]->type != SQL_COMMAND)
continue;
preparedStatementName(name, st->use_file, j);
-   res = PQprepare(st->con, name,
- commands[j]->argv[0], 
commands[j]->argc - 1, NULL);
-   if (PQresultStatus(res) != PGRES_COMMAND_OK)
-   fprintf(stderr, "%s", 
PQerrorMessage(st->con));
-   PQclear(res);
+   if (PQbatchStatus(st->con) != PQBATCH_MODE_ON)
+   {
+   res = PQprepare(st->con, name,
+   
commands[j]->argv[0], commands[j]->argc - 1, NULL);
+   if (PQresultStatus(res) != 
PGRES_COMMAND_OK)
+   fprintf(stderr, "%s", 
PQerrorMessage(st->con));
+   PQclear(res);
+   }
+   else
+   {
+   /*
+* In batch mode, we use asynchronous 
functions. If a server-side
+* error occurs, it will be processed 
later among the other results.
+*/
+   if (!PQsendPrepare(st->con, name,
+  
commands[j]->argv[0], commands[j]->argc - 1, NULL))
+   fprintf(stderr, "%s", 
PQerrorMessage(st->con));
+   }
}
st->prepared[st->use_file] = true;
}
@@ -2165,7 +2179,13 @@ doCustom(TState *thread, CState *st, StatsData *agg)
return;
}
else
-   st->state = CSTATE_WAIT_RESULT;
+   {
+   /* Wait for results, unless in 
batch mode */
+  

Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2017-03-09 Thread Daniel Verite
Vaishnavi Prabakaran wrote:

>if (PQbatchStatus(st->con) != PQBATCH_MODE_ON)
> 
>But, it is better to use if (PQbatchStatus(st->con) ==
> PQBATCH_MODE_OFF) for this verification. Reason is there is one more state
> in batch mode - PQBATCH_MODE_ABORTED. So, if the batch mode is in aborted
> status, this check will still assume that the connection is not in batch
> mode.
> In a same way, "if(PQbatchStatus(st->con) == PQBATCH_MODE_ON)" check is
> better to be modified as "PQbatchStatus(st->con) != PQBATCH_MODE_OFF".

Agreed, these two tests need to be changed to account for the
PQBATCH_MODE_ABORTED case. Fixed in new patch.

> 2.  @@ -2207,7 +2227,47 @@ doCustom(TState *thread, CState *st, StatsData
> *agg)
> + if (PQbatchStatus(st->con) != PQBATCH_MODE_OFF)
> + {
> + commandFailed(st, "already in batch mode");
> + break;
> + }
> 
> This is not required as below PQbatchBegin() internally does this
> verification.
> 
> + PQbatchBegin(st->con);

The point of this test is to specifically forbid a sequence like this
\beginbatch
...(no \endbatch)
\beginbatch
...
and according to the doc for PQbatchBegin, it looks like the return
code won't tell us:
   "Causes a connection to enter batch mode if it is currently idle or
   already in batch mode.
int PQbatchBegin(PGconn *conn);
   Returns 1 for success"

My understanding is that "already in batch mode" is not an error.

   "Returns 0 and has no effect if the connection is not currently
   idle, i.e. it has a result ready, is waiting for more input from
   the server, etc. This function does not actually send anything to
   the server, it just changes the libpq connection state"

> 3. It is better to check the return value of PQbatchBegin() rather than
> assuming success. E.g: PQbatchBegin() will return false if the connection
> is in copy in/out/both states.

Given point #2 above, I think both tests are needed:
   if (PQbatchStatus(st->con) != PQBATCH_MODE_OFF)
{
   commandFailed(st, "already in batch mode");
   break;
}
if (PQbatchBegin(st->con) == 0)
{
   commandFailed(st, "failed to start a batch");
   break;
}

> > 3. In relation to #2, PQsendQuery() is not forbidden in batch mode
> > although I don't think it can work with it, as it's based on the old
> > protocol.
> >
> It is actually forbidden and you can see the error message "cannot
> PQsendQuery in batch mode, use PQsendQueryParams" when you use
> PQsendQuery().

Sorry for blaming the innocent piece of code, looking closer
it's pgbench that does not display the message.
With the simple query protocol, sendCommand() does essentially:

  r = PQsendQuery(st->con, sql);

... other stuff...

  if (r == 0)
  {
if (debug)
  fprintf(stderr, "client %d could not send %s\n",
  st->id, command->argv[0]);
st->ecnt++;
return false;
  }

So only in debug mode does it inform the user that it failed, and
even then, it does not display PQerrorMessage(st->con).

In non-debug mode it's opaque to the user. If it always fail, it appears
to loop forever. The caller has this comment that is relevant to the problem:

if (!sendCommand(st, command))
  {
/*
 * Failed. Stay in CSTATE_START_COMMAND state, to
 * retry. ??? What the point or retrying? Should
 * rather abort?
 */
return;
  }

I think it doesn't bother anyone up to now because the immediate
failure modes of PQsendQuery() are resource allocation or protocol
failures that tend to never happen.

Anyway currently \beginbatch avoids the problem by checking
whether querymode == QUERY_SIMPLE, to fail gracefully
instead of letting the endless loop happen.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index f6cb5d4..f93673e 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -269,7 +269,8 @@ typedef enum
 *
 * CSTATE_START_COMMAND starts the execution of a command.  On a SQL
 * command, the command is sent to the server, and we move to
-* CSTATE_WAIT_RESULT state.  On a \sleep meta-command, the timer is 
set,
+* CSTATE_WAIT_RESULT state unless in batch mode.
+* On a \sleep meta-command, the timer is set,
 * and we enter the CSTATE_SLEEP state to wait for it to expire. Other
 * meta-commands are executed immediately.
 *
@@ -1882,11 +1883,24 @@ sendCommand(CState *st, Command *command)
if (commands[j]->type != SQL_COMMAND)
continue;
preparedStatementName(name, st->use_file, j);
-   res = PQprepare(st->con, name,
- commands[j]->argv[0], 
commands[j]->argc - 1, NULL);
-   if (PQresultSt

Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2017-03-10 Thread Daniel Verite
  Hi,

I notice that PQsetSingleRowMode() doesn't work when getting batch results.  

The function is documented as:
" int PQsetSingleRowMode(PGconn *conn);

  This function can only be called immediately after PQsendQuery or one
  of its sibling functions, before any other operation on the connection
  such as PQconsumeInput or PQgetResult"

But PQbatchQueueProcess() unconditionally clears conn->singleRowMode,
so whatever it was when sending the query gets lost, and besides
other queries might have been submitted in the meantime.

Also if trying to set that mode when fetching like this

 while (QbatchQueueProcess(conn)) {
   r = PQsetSingleRowMode(conn);
   if (r!=1) {
  fprintf(stderr, "PQsetSingleRowMode() failed");
   }
   .. 

it might work the first time, but on the next iterations, conn->asyncStatus
might be PGASYNC_READY, which is a failure condition for
PQsetSingleRowMode(), so that won't do.

ISTM that the simplest fix would be that when in batch mode,
PQsetSingleRowMode() should register that the last submitted query
does request that mode, and when later QbatchQueueProcess dequeues
the batch entry for the corresponding query, this flag would be popped off
and set as the current mode.

Please find attached the suggested fix, against the v5 of the patch.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index 3c0be46..8ddf63d 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1425,7 +1425,7 @@ PQmakePipelinedCommand(PGconn *conn)
 	}
 	entry->next = NULL;
 	entry->query = NULL;
-
+	entry->singleRowMode = false;
 	return entry;
 }
 
@@ -1783,16 +1783,28 @@ PQsetSingleRowMode(PGconn *conn)
 	/*
 	 * Only allow setting the flag when we have launched a query and not yet
 	 * received any results.
+	 * In batch mode, store the flag in the queue for applying it later.
 	 */
 	if (!conn)
 		return 0;
-	if (conn->asyncStatus != PGASYNC_BUSY)
-		return 0;
 	if (conn->queryclass != PGQUERY_SIMPLE &&
 		conn->queryclass != PGQUERY_EXTENDED)
 		return 0;
 	if (conn->result)
 		return 0;
+	if (conn->batch_status == PQBATCH_MODE_OFF)
+	{
+		if (conn->asyncStatus != PGASYNC_BUSY)
+			return 0;
+	}
+	else
+	{
+		/* apply to the last submitted query in the batch, or fail */
+		if (conn->cmd_queue_tail != NULL)
+			conn->cmd_queue_tail->singleRowMode = true;
+		else
+			return 0;
+	}
 
 	/* OK, set flag */
 	conn->singleRowMode = true;
@@ -2120,9 +2132,8 @@ PQbatchQueueProcess(PGconn *conn)
 	 * Initialize async result-accumulation state */
 	pqClearAsyncResult(conn);
 
-	/* reset single-row processing mode */
-	conn->singleRowMode = false;
-
+	/* Set single-row processing mode */
+	conn->singleRowMode = next_query->singleRowMode;
 
 	conn->last_query = next_query->query;
 	next_query->query = NULL;
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 33f212f..af4f753 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -315,6 +315,7 @@ typedef struct pgCommandQueueEntry
 {
 	PGQueryClass queryclass;	/* Query type; PGQUERY_SYNC for sync msg */
 	char	   *query;			/* SQL command, or NULL if unknown */
+	bool	   singleRowMode;   /* apply single row mode to this query */
 	struct pgCommandQueueEntry *next;
 }	PGcommandQueueEntry;
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2017-03-13 Thread Daniel Verite
Vaishnavi Prabakaran wrote:

> >  while (QbatchQueueProcess(conn)) {
> >r = PQsetSingleRowMode(conn);
> >if (r!=1) {
> >   fprintf(stderr, "PQsetSingleRowMode() failed");
> >}
> >..

> Thanks for investigating the problem, and could you kindly explain what
> "next iteration" you mean here? Because I don't see any problem in
> following sequence of calls - PQbatchQueueProcess(),PQsetSingleRowMode()
> , PQgetResult()

I mean the next iteration of the above while statement. Referring
to the doc, that would be the "next batch entry":

  " To get the result of the first batch entry the client must call
   PQbatchQueueProcess. It must then call PQgetResult and handle the
   results until PQgetResult returns null (or would return null if
   called). The result from the next batch entry may then be retrieved
   using PQbatchQueueProcess and the cycle repeated"

Attached is a bare-bones testcase showing the problem.
As it is, it works, retrieving results for three "SELECT 1"
in the same batch.  Now in order to use the single-row
fetch mode, consider adding this:

r = PQsetSingleRowMode(conn);
if (r!=1) {
  fprintf(stderr, "PQsetSingleRowMode() failed for i=%d\n", i);
}

When inserted after the call to PQbatchQueueProcess,
which is what I understand you're saying works for you,
it fails for me when starting to get the results of the 2nd query
and after. 


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


test-singlerow-batch.c
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-13 Thread Daniel Verite
Tom Lane wrote:

> when we see \if is that we do nothing but absorb text
> until we see the matching \endif.  At that point we could bitch and throw
> everything away if, say, there's \elif after \else, or anything else you
> want to regard as a "compile time error".  Otherwise we start execution,
> and from there on it probably has to behave as we've been discussing.
> But this'd be pretty unfriendly from an interactive standpoint, and I'm
> not really convinced that it makes for significantly better error
> reporting.

This is basically what bash does. In an if/else/fi block
in an interactive session, the second prompt is displayed at every new
line and nothing gets executed until it recognizes the end of the
block and it's valid as a whole. Otherwise, nothing of the block
gets executed. That doesn't strike me as unfriendly.

When non-interactive, in addition to the block not being executed,
the fact that it fails implies that the execution of the current script
is ended, independently of the errexit setting.
If errexit is set, the interpreter terminates. If it was
an included script and errexit is not set, the execution resumes
after the point of the inclusion.

On the whole, isn't that a reasonable model to follow for psql?

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2017-03-16 Thread Daniel Verite
Vaishnavi Prabakaran wrote:

> So, attached the alternative fix for this issue.
> Please share me your thoughts.

I assume you prefer the alternative fix because it's simpler.

> I would also like to hear Craig's opinion on it before applying this fix
> to the original patch, just to make sure am not missing anything here.

+1

The main question is whether the predicates enforced
by PQsetSingleRowMode() apply in batch mode in all cases
when it's legit to call that function. Two predicates
that may be problematic are:
if (conn->asyncStatus != PGASYNC_BUSY)
return 0;
and
if (conn->result)
return 0;

The general case with batch mode is that, from the doc:
"The client interleaves result processing with sending batch queries"
Note that I've not even tested that here, I've tested
batching a bunch of queries in a first step and getting the results
in a second step.
I am not confident that the above predicates will be true
in all cases. Also your alternative fix assumes that we add
a user-visible exception to PQsetSingleRowMode in batch mode,
whereby it must not be called as currently documented:
  "call PQsetSingleRowMode immediately after a successful call of 
   PQsendQuery (or a sibling function)"
My gut feeling is that it's not the right direction, I prefer making
the single-row a per-query attribute internally and keep
PQsetSingleRowMode's contract unchanged from the
user's perspective.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-17 Thread Daniel Verite
Tom Lane wrote:

> OT_WHOLE_LINE is not what you want because that results in verbatim
> copying, without variable expansion or anything

But if we want to implement "\if defined :foo" in the future
isn't it just what we need?

Also we could leave open the option to accept an SQL expression
here. I expect people will need SQL as the evaluator in a lot of cases.
So far we need to do that:

  SELECT sql_expr ... AS varname \gset
  \if :varname
  ...
  \endif

Surely users will wonder right away why they can't write it like this
instead:

  \if (sql_expr)
  ...
  \endif

There's a precedent with \copy accepting a query inside parentheses,
using OT_WHOLE_LINE.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2017-03-27 Thread Daniel Verite
Vaishnavi Prabakaran wrote:

> Please let me know if you think this is not enough but wanted to update
> section 33.6 also?

Yes, if the right place to call PQsetSingleRowMode() is immediately
after PQbatchQueueProcess(), I think we need to update
"33.6. Retrieving Query Results Row-By-Row"
with that information, otherwise what it says is just wrong
when in batch mode.

Also, in 33.5, I suggest to not use the "currently executing
query" as a synonym for the "query currently being processed"
to avoid any confusion between what the backend is executing
and a prior query whose results are being collected by the client
at the same moment.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2017-03-27 Thread Daniel Verite
Vaishnavi Prabakaran wrote:

> Am going to include the test which you shared in the test patch. Please let
> me know if you want to cover anymore specific cases to gain confidence.

One bit of functionality that does not work in batch mode and is left
as is by the patch is the PQfn() fast path interface and the large object
functions that use it.

Currently, calling lo_* functions inside a batch will fail with a message
that depends on whether the internal lo_initialize() has been successfully
called before.

If it hasn't, PQerrorMessage() will be:
   "Synchronous command execution functions are not allowed in batch mode"
which is fine, but it comes by happenstance from lo_initialize()
calling PQexec() to fetch the function OIDs from pg_catalog.pg_proc.

OTOH, if lo_initialize() has succeeded before, a call to a large
object function will fail with a different message:
  "connection in wrong state"
which is emitted by PQfn() based on conn->asyncStatus != PGASYNC_IDLE

Having an unified error message would be more user friendly.

Concerning the doc, when saying in 33.5.2:
 "In batch mode only asynchronous operations are permitted"
the server-side functions are indeed ruled out, since PQfn() is
synchronous, but maybe we should be a bit more explicit
about that?

Chapter 34.3 (Large Objects / Client Interfaces) could also
mention the incompatibility with batch mode.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2017-03-28 Thread Daniel Verite
Michael Paquier wrote:

> # Running: testlibpqbatch dbname=postgres 1 copyfailure
> dispatching SELECT query failed: cannot queue commands during COPY
> 
> COPYBUF: 5
> 
> Error status and message got from server due to COPY command failure
> are : PGRES_FATAL_ERROR ERROR:  unexpected message type 0x50 during
> COPY from stdin
> CONTEXT:  COPY batch_demo, line 1

Same result here.

BTW the doc says: 
  "In batch mode only asynchronous operations are permitted, and COPY is
   not recommended as it most likely will trigger failure in batch
processing"
Yet it seems that the test assumes that it should work nonetheless.
I don't quite understand what we expect of this test, given what's
documented. Or what am I missing?

While looking at the regress log, I noticed multiple spurious
test_singlerowmode tests among the others. Should be fixed with:

--- a/src/test/modules/test_libpq/testlibpqbatch.c
+++ b/src/test/modules/test_libpq/testlibpqbatch.c
@@ -1578,6 +1578,7 @@ main(int argc, char **argv)
run_batch_abort = 0;
run_timings = 0;
run_copyfailure = 0;
+   run_singlerowmode = 0;
if (strcmp(argv[3], "disallowed_in_batch") == 0)
run_disallowed_in_batch = 1;
else if (strcmp(argv[3], "simple_batch") == 0)

There's also the fact that this test doesn't care whether it fails 
(compare with all the "goto fail" of the other tests).

And it happens that PQsetSingleRowMode() fails after the call to
PQbatchQueueProcess() that instantiates a PGresult
of status PGRES_BATCH_END to indicate the end of the batch,

To me this shows how this way of setting the single row mode is not ideal.
In order to avoid the failure, the loop should know in advance 
what kind of result it's going to dequeue before calling PQgetResult(),
which doesn't feel right.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2017-03-30 Thread Daniel Verite
Vaishnavi Prabakaran wrote:

> Hmm, With batch mode, after sending COPY command to server(and server
> started processing the query and goes into COPY state) , client does not
> immediately read the result , but it keeps sending other queries to the
> server. By this time, server already encountered the error
> scenario(Receiving different message during COPY state) and sent error
> messages

IOW, the test intentionally violates the protocol and then all goes wonky
because of that.
That's why I was wondering upthread what's it's supposed to test.
I mean, regression tests are meant to warn against a desirable behavior
being unknowingly changed by new code into an undesirable behavior.
Here we have the undesirable behavior to start with.
What kind of regression could we fear from that?

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Variable substitution in psql backtick expansion

2017-03-31 Thread Daniel Verite
Tom Lane wrote:

> Thoughts?

ISTM that expr is too painful to use to be seen as the
idiomatic way of achieving comparison in psql.

Among its disadvantages, it won't work on windows, and its
interface is hard to work with due to the necessary
quoting of half its operators, and the mandatory spacing
between arguments.

Also the quoting rules and command line syntax
depend on the underlying shell.
Isn't it going to be tricky to produce code that works
across different families of shells, like bourne and csh?

I think that users would rather have the option to just put
an SQL expression behind \if. That implies a working connection
to evaluate, which expr doesn't, but that's no
different from the other backslash commands that read
the database.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Suggested fix for \p and \r in psql

2017-04-02 Thread Daniel Verite
  Hi,

I've noticed two issues with the query buffer post-commit e984ef5
(Support \if ... \elif ... \else ... \endif in psql scripting):

1. \p ignores the "previous buffer". Example:

postgres=# select 1;
 ?column? 
--
1
(1 row)

postgres=# \p
Query buffer is empty.

That doesn't match the pre-commit behavior, and is not
consistent with \e or \w


2. \r keeps the "previous buffer". I think it should clear it. Example:

postgres=# select 1;
 ?column? 
--
1
(1 row)

postgres=# select 2 \r
Query buffer reset (cleared).
postgres=# \w /tmp/buffer
postgres=# \! cat /tmp/buffer
select 1;

I suggest the attached fix, with a few new regression tests.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 94a3cfc..6554268 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -106,14 +106,14 @@ static backslashResult exec_command_lo(PsqlScanState scan_state, bool active_bra
 const char *cmd);
 static backslashResult exec_command_out(PsqlScanState scan_state, bool active_branch);
 static backslashResult exec_command_print(PsqlScanState scan_state, bool active_branch,
-   PQExpBuffer query_buf);
+   PQExpBuffer query_buf, PQExpBuffer previous_buf);
 static backslashResult exec_command_password(PsqlScanState scan_state, bool active_branch);
 static backslashResult exec_command_prompt(PsqlScanState scan_state, bool active_branch,
 	const char *cmd);
 static backslashResult exec_command_pset(PsqlScanState scan_state, bool active_branch);
 static backslashResult exec_command_quit(PsqlScanState scan_state, bool active_branch);
 static backslashResult exec_command_reset(PsqlScanState scan_state, bool active_branch,
-   PQExpBuffer query_buf);
+   PQExpBuffer query_buf, PQExpBuffer previous_buf);
 static backslashResult exec_command_s(PsqlScanState scan_state, bool active_branch);
 static backslashResult exec_command_set(PsqlScanState scan_state, bool active_branch);
 static backslashResult exec_command_setenv(PsqlScanState scan_state, bool active_branch,
@@ -192,8 +192,8 @@ static void checkWin32Codepage(void);
  * execution of the backslash command (for example, \r clears it).
  *
  * previous_buf contains the query most recently sent to the server
- * (empty if none yet).  This should not be modified here, but some
- * commands copy its content into query_buf.
+ * (empty if none yet).  This should not be modified here (except by
+ * \r), but some commands copy its content into query_buf.
  *
  * query_buf and previous_buf will be NULL when executing a "-c"
  * command-line option.
@@ -362,7 +362,8 @@ exec_command(const char *cmd,
 	else if (strcmp(cmd, "o") == 0 || strcmp(cmd, "out") == 0)
 		status = exec_command_out(scan_state, active_branch);
 	else if (strcmp(cmd, "p") == 0 || strcmp(cmd, "print") == 0)
-		status = exec_command_print(scan_state, active_branch, query_buf);
+		status = exec_command_print(scan_state, active_branch,
+	query_buf, previous_buf);
 	else if (strcmp(cmd, "password") == 0)
 		status = exec_command_password(scan_state, active_branch);
 	else if (strcmp(cmd, "prompt") == 0)
@@ -372,7 +373,8 @@ exec_command(const char *cmd,
 	else if (strcmp(cmd, "q") == 0 || strcmp(cmd, "quit") == 0)
 		status = exec_command_quit(scan_state, active_branch);
 	else if (strcmp(cmd, "r") == 0 || strcmp(cmd, "reset") == 0)
-		status = exec_command_reset(scan_state, active_branch, query_buf);
+		status = exec_command_reset(scan_state, active_branch,
+	query_buf, previous_buf);
 	else if (strcmp(cmd, "s") == 0)
 		status = exec_command_s(scan_state, active_branch);
 	else if (strcmp(cmd, "set") == 0)
@@ -1827,12 +1829,15 @@ exec_command_out(PsqlScanState scan_state, bool active_branch)
  */
 static backslashResult
 exec_command_print(PsqlScanState scan_state, bool active_branch,
-   PQExpBuffer query_buf)
+   PQExpBuffer query_buf, PQExpBuffer previous_buf)
 {
 	if (active_branch)
 	{
 		if (query_buf && query_buf->len > 0)
 			puts(query_buf->data);
+		/* Applies to previous query if current buffer is empty */
+		else if (previous_buf && previous_buf->len > 0)
+			puts(previous_buf->data);
 		else if (!pset.quiet)
 			puts(_("Query buffer is empty."));
 		fflush(stdout);
@@ -2056,10 +2061,11 @@ exec_command_quit(PsqlScanState scan_state, bool active_branch)
  */
 static backslashResult
 exec_command_reset(PsqlScanState scan_state, bool active_branch,
-   PQExpBuffer query_buf)
+   PQExpBuffer query_buf, PQExpBuffer previous_buf)
 {
 	if (active_branch)
 	{
+		resetPQExpBuffer(previous_buf);
 		resetPQExpBuffer(query_buf);
 		psql_scan_reset(scan_state);
 		if (!pset.quiet)
diff --git a/src/test/regress/expected/psql.out b/src/test/regress/expected/psql.out
index 8aa914f..f1c516a 100644
--- a/src/test/regress/expected/psql.out
+++ b/src/test/regress/expected/psql.out
@@ -2932,3 +2932,30 @@ NOTICE:

Re: [HACKERS] Variable substitution in psql backtick expansion

2017-04-02 Thread Daniel Verite
Fabien COELHO wrote:

> Note that this is already available indirectly, as show in the 
> documentation.
> 
>   SELECT some-boolean-expression AS okay \gset
>   \if :okay
> \echo boolean expression was true
>   \else
> \echo boolean expression was false
>   \endif

Yes, the question was whether we leave it as that for v10,
or if it's worth a last-minute improvement for usability,
assuming it's doable, similarly to what $subject does to backtick
expansion for external evaluation.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Suggested fix for \p and \r in psql

2017-04-02 Thread Daniel Verite
Tom Lane wrote:

> > 1. \p ignores the "previous buffer". Example:
> 
> Yeah, I did that intentionally, thinking that the old behavior was
> confusing.  We can certainly discuss it though.  I'd tend to agree
> with your point that \p and \w should print the same thing, but
> maybe neither of them should look at the previous_buf.
> 
> > 2. \r keeps the "previous buffer". I think it should clear it.
> 
> I don't really agree with this.  The fact that it used to clear both
> buffers was an implementation accident that probably nobody had even
> understood clearly.  ISTM that loses functionality because you can't
> do \g anymore.

I don't have a strong opinion on \r. As a user I tend to use Ctrl+C
rather than \r for the edit in progress.
The documentation over-simplifies things as if there
was only one query buffer, instead of two of them.

  \r or \reset
  Resets (clears) the query buffer. 

From just that reading, the behavior doesn't seem right when
we "clear the query buffer", and then print it, and it doesn't
come out as empty.

About \p alone, if it doesn't output what \g is about to run, I
think that's a problem because ISTM that it's the main use
case of \p

Following the chain of consistency, the starting point seems to be
that \g sends the previous query if the edit-in-progress input
buffer is empty, instead of saying, empty buffer = empty query,
so nothing to send.
Presumably the usage of issuing  \g alone to repeat the last
query is well established, so we can't change it and need
to adjust the other commands to be as less surprising
as possible with our two buffers.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Suggested fix for \p and \r in psql

2017-04-02 Thread Daniel Verite
Tom Lane wrote:

> If we do phrase it like that, I think that the most natural behavior
> for \r is the way I have it in HEAD --- you'd expect it to clear
> the query buffer, but you would not expect it to forget the most
> recent command.

Okay, leaving out \r as it is and changing only \p works for me.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Variable substitution in psql backtick expansion

2017-04-03 Thread Daniel Verite
Fabien COELHO wrote:

> My 0.02 € about server-side expressions: ISTM that there is nothing 
> obvious/easy to do to include these:
> 
>   - how would it work, both with \set ... and \if ...?

The idea is that the need to have two command (a SELECT .. \gset
followed by an \if) and a temporary variable in-between would be
lifted by implementing a close equivalent in one command.
It would behave essentially the same as the two commands.

I don't see that \set must have to be involved in that improvement,
although it could be indirectly, depending on what exactly we
implement.

\set differs in that it already exists in released versions,
so we have the backward compatibility to consider.
With \if we are not bound by that, but what \if will be at feature
freeze needs to be as convenient as we can make it in this release,
and not block progress in v11 and later, as these future improvements
will have to be backward-compatible against v10.


>   - should it be just simple expressions or may it allow complex
> queries?

Let's imagine that psql would support a syntax like this:
  \if [select current_setting('server_version_num')::int < 11]
or
  \if [select 1 from pg_catalog.pg_extension where extname='pgcrypto']

where by convention [ and ] enclose an SQL query that's assumed to
return a single-row, single-column bool-ish value, and in which
psql variables would be expanded, like they are now in
backtick expressions.
Queries can be as complex as necessary, they just have to fit in one line.

>   - how would error detection and handling work from a script?

The same as SELECT..\gset followed by \if, when the SELECT fails.

>   - should it have some kind of continuation, as expressions are
> likely to be longer than a constant?

No, to me that falls into the issue of continuation of backslash
commands in general, which is discussed separately.

>   - how would they interact with possible client-side expressions?

In no way at all,in the sense that, either you choose to use an SQL
evaluator, or a client-side evaluator (if it exists), or a backtick
expression.
They are mutually exclusive for a single \if invocation.

Client-side evaluation would have to be called with a syntax
that is unambiguously different. For example it could be
\if (:SQLSTATE = '23505')
  \echo A record with the unique key :key_id already exists
  rollback
\endif

AFAIK we don't have :SQLSTATE yet, but we should :)

Maybe the parentheses are not required, or we could require a different set
of brackets to enclose an expression to evaluate internally, like {}, or
whatever provided it's not ambiguous.

> (on this point, I think that client-side is NOT needed for "psql".
>  It makes sense for "pgbench" in a benchmarking context where the
>  client must interact with the server in some special meaningful
>  way, but for simple scripting the performance requirement and
>  logic is not the same, so server-side could be enough).

Client-side evaluation is useful for the example above, where
you expect that you might be in a failed transaction, or even
not connected, and you need to do quite simple tests.
We can do that already with backtick expansion and a shell evaluation
command, but it's a bit heavy/inelegant and creates a dependency on
external commands that is detrimental to portability.
I agree that we don't need a full-featured built-in evaluator, because
the cases where it's needed will be rare enough that it's reasonable
to have to defer to an external evaluator in those cases.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Variable substitution in psql backtick expansion

2017-04-03 Thread Daniel Verite
Tom Lane wrote:

> I really dislike this syntax proposal.  It would get us into having
> to count nested brackets, and distinguish quoted from unquoted brackets,
> and so on ... and for what?  It's not really better than
> 
>   \if sql select 1 from pg_catalog.pg_extension where extname='pgcrypto'
> 
> (ie, "\if sql ...text to send to server...").

That's fine by me. The advantage of enclosing the query is to leave
open the possibility of accepting additional contents after the query,
like options (as \copy does), or other expression terms to combine
with the query's result. But we can live without that.

> If you're worried about shaving keystrokes, make the "select" be implicit.
> That would be particularly convenient for the common case where you're
> just trying to evaluate a boolean expression with no table reference.

These expressions look more natural without the SELECT keyword,
besides the size, but OTOH "\if sql 1 from table where expr"
looks awkward. Given an implicit select, I would prefer
"\if exists (select 1 from table where expr)" but now it's not shorter.

An advantage of prepending the SELECT automatically, is that it
would prevent people from abusing this syntax by putting
update/insert/delete or even DDL in there, imagining that this would
be a success/failure test for these operations.
Having these fail to execute in the first place, when called by \if,
seems like a sane failure mode that we would gain incidentally.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-04-03 Thread Daniel Verite
  Hi,

In interactive mode, the warning in untaken branches is misleading
when \endif is on the same line as the commands that
are skipped. For instance:

  postgres=# \if false \echo NOK \endif
  \echo command ignored; use \endif or Ctrl-C to exit current \if block
  postgres=# 

From the point of view of the user, the execution flow has exited
the branch already when this warning is displayed.
Of course issuing the recommended \endif at this point doesn't work:

  postgres=# \endif
  \endif: no matching \if

Maybe that part of the message: 
"use \endif or Ctrl-C to exit current \if block"
should be displayed only when coming back at the prompt,
and if the flow is still in an untaken branch at this point?

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Variable substitution in psql backtick expansion

2017-04-03 Thread Daniel Verite
Fabien COELHO wrote:

> Now it could be decided that \set in psql stays simplistic because it is 
> not needed as much as it is with pgbench. Fine with me.

It's not just that. It's that currently, if we do in psql:

\set d sqrt(1 + random(1000) * 17)

then we get that:

\echo :d
sqrt(1+random(1000)*17)

I assume we want to keep that pre-existing behavior of \set in
psql, that is, making a copy of that string and associating a
name to it, whereas I guess pgbench computes a value from it and
stores that value.

Certainly if we want the same sort of evaluator in pgbench and psql
we'd better share the code between them, but I don't think it will be
exposed by the same backslash commands in both programs,
if only for that backward compatibility concern.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Variable substitution in psql backtick expansion

2017-04-14 Thread Daniel Verite
Fabien COELHO wrote:

> Pavel also suggested some support for TEXT, although I would like to see a 
> use case. That could be another extension to the engine.

SQLSTATE is text.

Also when issuing "psql -v someoption=value -f script", it's reasonable
to want to compare :someoptionvar to 'somevalue' in the script.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2017-06-20 Thread Daniel Verite
Andres Freund wrote:

> FWIW, I still think this needs a pgbench or similar example integration,
> so we can actually properly measure the benefits.

Here's an updated version of the patch I made during review,
adding \beginbatch and \endbatch to pgbench.
The performance improvement appears clearly
with a custom script of this kind:
  \beginbatch
 UPDATE pgbench_branches SET bbalance = bbalance + 1 WHERE bid = 0;
 ..above repeated 1000 times...
  \endbatch

versus the same with a BEGIN; END; pair instead of \beginbatch \endbatch

On localhost on my desktop I tend to see a 30% difference in favor
of the batch mode with that kind of test.
On slower networks there are much bigger differences.

The latest main patch (v10) must also be slightly updated for HEAD,
because of this:
error: patch failed: src/interfaces/libpq/exports.txt:171
v11 attached without any other change.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index c3bd4f9..366f278 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -4656,6 +4656,526 @@ int PQflush(PGconn *conn);
 
  
 
+ 
+  Batch mode and query pipelining
+
+  
+   libpq
+   batch mode
+  
+
+  
+   libpq
+   pipelining
+  
+
+  
+   libpq supports queueing up queries into
+   a pipeline to be executed as a batch on the server. Batching queries allows
+   applications to avoid a client/server round-trip after each query to get
+   the results before issuing the next query.
+  
+
+  
+   An example of batch use may be found in the source distribution in
+   src/test/modules/test_libpq/testlibpqbatch.c.
+  
+
+  
+   When to use batching
+
+   
+Much like asynchronous query mode, there is no performance disadvantage to
+using batching and pipelining. It increases client application complexity
+and extra caution is required to prevent client/server deadlocks but
+can sometimes offer considerable performance improvements.
+   
+
+   
+Batching is most useful when the server is distant, i.e. network latency
+(ping time) is high, and when many small operations are 
being performed in
+rapid sequence. There is usually less benefit in using batches when each
+query takes many multiples of the client/server round-trip time to execute.
+A 100-statement operation run on a server 300ms round-trip-time away would 
take
+30 seconds in network latency alone without batching; with batching it may 
spend
+as little as 0.3s waiting for results from the server.
+   
+
+   
+Use batches when your application does lots of small
+INSERT, UPDATE and
+DELETE operations that can't easily be transformed into
+operations on sets or into a
+COPY operation.
+   
+
+   
+Batching is less useful when information from one operation is required by 
the
+client before it knows enough to send the next operation. The client must
+introduce a synchronisation point and wait for a full client/server
+round-trip to get the results it needs. However, it's often possible to
+adjust the client design to exchange the required information server-side.
+Read-modify-write cycles are especially good candidates; for example:
+
+ BEGIN;
+ SELECT x FROM mytable WHERE id = 42 FOR UPDATE;
+ -- result: x=2
+ -- client adds 1 to x:
+ UPDATE mytable SET x = 3 WHERE id = 42;
+ COMMIT;
+
+could be much more efficiently done with:
+
+ UPDATE mytable SET x = x + 1 WHERE id = 42;
+
+   
+
+   
+
+ The batch API was introduced in PostgreSQL 10.0, but clients using 
PostgresSQL 10.0 version of libpq can
+ use batches on server versions 8.4 and newer. Batching works on any server
+ that supports the v3 extended query protocol.
+
+   
+
+  
+
+  
+   Using batch mode
+
+   
+To issue batches the application must switch
+a connection into batch mode. Enter batch mode with PQenterBatchMode(conn)
 or test
+whether batch mode is active with PQbatchStatus(conn). 
In batch mode only asynchronous operations are permitted, and
+COPY is not recommended as it most likely will trigger 
failure in batch processing. 
+Using any synchronous command execution functions such as 
PQfn,
+PQexec or one of its sibling functions are error 
conditions.
+Functions allowed in batch mode are described in . 
+   
+
+   
+The client uses libpq's asynchronous query functions to dispatch work,
+marking the end of each batch with PQbatchSyncQueue.
+And to get results, it uses PQgetResult and
+PQbatchProcessQueue. It may eventually exit
+batch mode with PQexitBatchMode once all results are
+processed.
+   
+
+   
+
+ It is best to use batch mode with libpq in
+ non-blocking mode. If used 
in
+ blocking mode it is possible for a client/server deadlock to occur. The
+ client will block trying to

Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2017-06-22 Thread Daniel Verite
Craig Ringer wrote:

> The kernel will usually do some packet aggregation unless we use
> TCP_NODELAY (which we don't and shouldn't)

Not sure. As a point of comparison, Oracle has it as a tunable
parameter (TCP.NODELAY), and they changed its default from
No to Yes starting from their 10g R2.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2017-06-22 Thread Daniel Verite
Andres Freund wrote:

-   if (pqFlush(conn) < 0)
-   goto sendFailed;
+   if (conn->batch_status == PQBATCH_MODE_OFF)
+   {
+   /*
+* Give the data a push.  In nonblock mode, don't complain if
we're unable
+* to send it all; PQgetResult() will do any additional
flushing needed.
+*/
+   if (pqFlush(conn) < 0)
+   goto sendFailed;
+   }

Seems to be responsible for roughly an 1.7x speedup in tps and equivalent
decrease in latency, based on the "progress" info.
I wonder how much of that corresponds to a decrease in the number of
packets versus the number of syscalls. Both matter, I guess.

But OTOH there are certainly batch workloads where it will be preferrable
for the first query to reach the server ASAP, rather than waiting to be
coalesced with the next ones.
libpq is not going to know what's best.
One option may be to leave that decision to the user by providing a
PQBatchAutoFlush(true|false) property, along with a PQBatchFlush()
function. Maybe we could even let the user set the size of the sending
buffer, so those who really want to squeeze performance may tune it
for their network and workload.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2017-06-22 Thread Daniel Verite
Andres Freund wrote:

> > One option may be to leave that decision to the user by providing a
> > PQBatchAutoFlush(true|false) property, along with a PQBatchFlush()
> > function.
> 
> What'd be the difference between PQflush() and PQbatchFlush()?

I guess no difference, I was just not seeing that libpq already provides
this functionality...

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PG 10 release notes

2017-04-27 Thread Daniel Verite
Fabien COELHO wrote:

>Fix psql \p to always print what would be executed by \g or \w (Daniel
>Vérité)
> 
>Previously \p didn't properly print the reverted-to command after a
>buffer contents reset. CLARIFY?
> 
> The fix is linked to the change introduced by Tom when 
> refactoring/cleaning up in e984ef586 (\if) which change psql's \p 
> behavior.

That's my recollection as well. The "Previously" does not refer to 9.6,
but to that commit.

> I'm not sure how this should appear in the release notes. Maybe not at 
> all, associated to the feature in which the behavioral change was 
> introduced...

There is small change of behavior coming as a by-product of the
introduction of  /if.../endif blocks.

When doing in 9.x:

select '1st buffer' \g
followed by \e
and editing with select '2nd buffer' (without ending the query)
and then back in psql doing '\r' and '\p', the result is
select '2nd buffer'

The same with v10 leads instead to
select '1st buffer'

I'm not sure whether it's above the level of detail worth being mentioned
in the release notes.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] export import bytea from psql

2017-05-09 Thread Daniel Verite
Pavel Stehule wrote:

> 1. using psql variables - we just need to write commands \setfrom
> \setfrombf - this should be very simple implementation. The value can be
> used more times. On second hand - the loaded variable can hold lot of
> memory (and it can be invisible for user).

This could be simplified by using the variable only for the filename,
and then injecting the contents of the file into the PQexec'd query
as the interpolation of the variable.
We already have:
 :var for verbatim injection
 :'var' for injection quoted-as-text
 :"var" for injection quoted-as-identifier

What if we add new forms of dereferencing, for instance
(not necessarily with this exact syntax):
 : for injecting the quoted-as-text contents of the file pointed
  to by var.
 :{var} same thing but with file contents quoted as binary
 (PQescapeByteaConn)

then we could write:

\set img '/path/to/image.png'
insert into image(binary) values(:{img});

We could also go further  in that direction. More new interpolation
syntax could express that a variable is to be passed as a
parameter rather than injected (assuming a parametrized query),
and whether the value is directly the contents or it's a filename
pointing to the contents, and whether its format is binary or text,
and even support an optional oid or typename coupled to
the variable.
That would be a lot of new syntax for interpolation but no new
backslash command and no change to \set itself.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] export import bytea from psql

2017-05-11 Thread Daniel Verite
Pavel Stehule wrote:

> It is similar to my first or second proposal - rejected by Tom :(

Doesn't it differ? ISTM that in the patches/discussion related to:
https://commitfest.postgresql.org/11/787/
it was proposed to change \set in one way or another,
and also in the point #1 of this present thread:

> > 1. using psql variables - we just need to write commands \setfrom
> > \setfrombf - this should be very simple implementation. The value can be
> > used more times. On second hand - the loaded variable can hold lot of
> > memory (and it can be invisible for user).


My comment is: let's not change \set or even create a variant
of it just for importing verbatim file contents, in text or binary,
because interpolating psql variables differently in the query itself
is all we need.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Disallowing multiple queries per PQexec()

2017-06-12 Thread Daniel Verite
Surafel Temesgen wrote:

> I modified the patch as such and added to commitfest 2017-07.

A couple comments:

+   {"disallow_multiple_queries", PGC_POSTMASTER,
CLIENT_CONN_OTHER,
+   gettext_noop("Disallow multiple queries per query
string."),
+   NULL
+   },

PGC_POSTMASTER implies that it's an instance-wide setting.
Is is intentional? I can understand that it's more secure for this not to
be changeable in an existing session, but it's also much less usable if you
can't set it per-database and per-user.
Maybe it should be PGC_SUSET ?


+   if ((strcmp(commandTagHead, "BEGIN") != 0) ||
(strcmp(commandTagTail, "COMMIT") != 0) )
+   ereport(ERROR,
+   (errcode(ERRCODE_SYNTAX_ERROR),
errmsg("cannot execute multiple commands unless it is a transaction
block")));

Shouldn't ROLLBACK be considered too as ending a transaction block?
Also, can it use a more specific code than ERRCODE_SYNTAX_ERROR?
It feels more like a rule violation than a syntax error.



Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Disallowing multiple queries per PQexec()

2017-06-12 Thread Daniel Verite
Tom Lane wrote:

> Bearing in mind that I'm not really for this at all...

It's a band-aid, but certainly there are cases
where a DBA confronted to a badly written website
would just want to be able to:
  ALTER USER webuser SET allow_multiple_queries TO off;

> But if an attacker is able to inject a SET command,
> he's already found a way around it.  So there's no real
> point in locking down the GUC to prevent that.

I can think of the following case, where given the SQL-injectable
   select id from users where email='$email';
an attacker would submit this string in $email:
  foo' AND set_config('allow_multiple_queries', 'on', false)='on
which opens the rest of the session for a second injection, this
time in the form of several colon-separated commands that
would do the actual damage.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Disallowing multiple queries per PQexec()

2017-06-15 Thread Daniel Verite
Fabien COELHO wrote:

> I'm not fully convinced by this feature: using multiple queries is a 
> useful trick to reduce network-related latency by combining several 
> queries in one packet. Devs and even ORMs could use this trick.

It's proposed as an option. For apps that intentionally put multiple
commands in single PQexec calls, or for apps for which the deployer doesn't
know or care whether they do that, the setting should be kept to its default
value that let such calls pass, as they pass today.

In my understanding of the proposal, there is no implication that
intentionally using such multiple commands is bad, or that
it should be obsoleted in the future.

It's only bad in the specific case when this possibility is not used
normally by the app, but it's available to an attacker to make an
attack both easier to mount and more potent than a single-query injection.
This reasoning is also based on the assumption that the app has
bugs concerning quoting of parameters, but it's a longstanding class of
bugs that plague tons of low-quality code in production, and it shows no
sign of going away.

> Many SQL injection attacks focus on retrieving sensitive data, 
> in which case "' UNION SELECT ... --" would still work nicely. Should we 
> allow to forbid UNION? And/or disallow comments as well, which are 
> unlikely to be used from app code, and would make this harder? If pg is to 
> provide SQL injection guards by removing some features on some 
> connections, maybe it could be more complete about it.

It looks more like the "training wheel" patch that
was discussed  in https://commitfest.postgresql.org/13/948/
rather than something that should be in this patch.
This is a different direction because allowing or disallowing
compound statements is a clear-cut thing, whereas making
a list of SQL constructs to cripple to mitigate bugs is more like opening
a Pandora box.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Disallowing multiple queries per PQexec()

2017-06-15 Thread Daniel Verite
Andres Freund wrote:

> Since it's an application writer's choice whether to use it,
> it seems to make not that much sense to have a
> serverside guc - it can't really be sensible set.

The application writers who are concerned by this wouldn't
know that they have a choice. If there were informed, 
supposedly they would grok the SQL syntax to begin with,
understanding the necessity and workings of proper quoting, and
the problem would not exist.

What is proposed AFAIU is an optional policy to be set on already
developed client apps, not a setting that is meant to be played with
by them.

An analogy I can find in existing GUCs, and that incidentally is
actually relevant to me as an app writer, is lo_compat_privileges
It's SUSET, it's not GUC_REPORT. Either it's on and the app
is not subject to permission checking for large objects,
or it's off and it is subject to them.
It's something that is relevant at deployment time, and not really besides
that, and it's the DBA's problem to set the policy depending on the app
requirements and the security requirements, rather than the app's problem
to adjust to whatever value there is in there.
As an example of app requirement, if the app has to let a user create a
large object and a different user to delete it, this GUC must be on,
otherwise such a scenario is not allowed, as unlinking is not a grantable
privilege, just like drop table.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Variable substitution in psql backtick expansion

2017-09-04 Thread Daniel Verite
Tom Lane wrote:

> Since we've blown past 80 columns on some of the other
> output, maybe that doesn't matter.  Or maybe we should shorten this
> variable name so it doesn't force reformatting of all this text.

The two-space left margin on the entire block does not add that
much to readability, IMV, so maybe we could reclaim these
two characters.

Another idea: since there are already several variables for which
the text + spacing + presentation don't fit anyway, 
we could forget about the tabular presentation and grow
vertically.

That would look like the following, for example, with a 3-space margin
for the description:

AUTOCOMMIT
   If set, successful SQL commands are automatically committed
COMP_KEYWORD_CASE
   Determines the case used to complete SQL key words
   [lower, upper, preserve-lower, preserve-upper]
DBNAME
   The currently connected database name
ECHO
   Controls what input is written to standard output
   [all, errors, none, queries]
ECHO_HIDDEN
   If set, display internal queries executed by backslash commands;
   if set to "noexec", just show without execution
ENCODING
   Current client character set encoding
FETCH_COUNT
   The number of result rows to fetch and display at a time
   (default: 0=unlimited)
etc..

To me that looks like a good trade-off: it eases the size constraints
for both the description and the name of the variable, at the cost
of consuming one more line per variable, but that's why the pager
is for.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql: new help related to variables are not too readable

2017-09-09 Thread Daniel Verite
Tomas Vondra wrote:

> That's not what I meant. I've never felt a strong urge to avoid wrapping
> at 80 chars when translating strings - simply translate in the most
> meaningful way, if it gets longer than 80 chars and can't be easily
> shortened, just wrap. If there are 60 or 80 characters does not change
> this much - 80 chars may allow more unwrapped strings, of course, but
> it's a minor change for the translators. Or at least me, others may
> disagree, of course.

The difficulty varies across languages since some are more compact
than others, and the choice of wrapping or not is not
consistent across translations.

As an example, in the previous version, the first variable comes
out as:

en
  AUTOCOMMIT if set, successful SQL commands are automatically
committed

de
  AUTOCOMMIT wenn gesetzt werden alle erfolgreichen SQL-Befehle
 automatisch committet
es
  AUTOCOMMIT si está definida, órdenes SQL exitosas se comprometen
 automáticamente
fr
  AUTOCOMMIT si activé, les commandes SQL réussies sont
automatiquement validées

he
  AUTOCOMMITאם הופעל, פקודות SQL מחויבות באופן אוטומטי

it
  AUTOCOMMIT se impostato, i comandi SQL riusciti sono salvati
automaticamente

ko
  AUTOCOMMIT 지정 하면 SQL 명령이 성공하면 자동으로 커밋

pl
  AUTOCOMMIT gdy ustawiony, polecenia SQL zakończone powodzeniem są
automatycznie zatwierdzone

pt_BR
  AUTOCOMMIT se definido, comandos SQL bem sucedidos são
automaticamente efetivados

ru
  AUTOCOMMIT если установлен, успешные SQL-команды фиксируются
автоматически

sv
  AUTOCOMMIT om satt, efterföljande SQL-kommandon commit:as
automatiskt

zh_CN
  AUTOCOMMIT 如果被设置,成功的SQL命令将会被自动提交

The original line in english has exactly 80 characters.
When translated in other latin languages, it tends to be a bit
longer.

German and spanish translators have taken the trouble
to break the message into two lines of less than
80 chars with the second one nicely indented, also
aligned in the .po file:

msgid "  AUTOCOMMIT if set, successful SQL commands are automatically
committed\n"
msgstr ""
"  AUTOCOMMIT si está definida, órdenes SQL exitosas se
comprometen\n"
" automáticamente\n"


But other translations are not necessarily done this way, or
at least not consistently. If only for that, having more
characters in the description before screen wrap should help
a bit. In the above example, I think with the new presentation
only the polish version wouldn't fit in 80 chars without
wrapping.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [patch] A \pivot command for psql

2015-08-09 Thread Daniel Verite
  Hi,

I want to suggest a client-side \pivot command in psql, implemented
in the attached patch.

\pivot takes the current query in the buffer, execute it and
display it pivoted by interpreting the result as:

column1 => row in pivoted output
column2 => column in pivoted output
column3 => value at (row,column) in pivoted ouput

The advantage over the server-side crosstab() from contrib is in
ease of use, mostly because \pivot doesn't need in advance the
list of columns that result from pivoting.

Typical use cases are queries that produce values
along two axes:
 SELECT a,b,aggr(something) FROM tbl GROUP a,b;
\pivot displays immediately the matrix-like representation

Or displaying "pairing" between columns, as in
 SELECT a,b,'X' FROM tblA [LEFT|RIGHT|FULL] JOIN tblB...
which once pivoted shows in an easily readable way
what "a" is/isn't in relation with any "b".

Columns are sorted with strcmp(). I think a more adequate sort could
be obtained through a separate query with ORDER BY just for these
values (casted to their original type), but the patch doesn't do that
yet.

Also, \pivot could take optionally the query as an argument instead
of getting it only from the query buffer.

Anyway, does it look useful enough to be part of psql? 
I guess I should push this to commitfest if that's the case.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/src/bin/psql/Makefile b/src/bin/psql/Makefile
index f1336d5..7e1ac24 100644
--- a/src/bin/psql/Makefile
+++ b/src/bin/psql/Makefile
@@ -23,7 +23,7 @@ override CPPFLAGS := -I. -I$(srcdir) -I$(libpq_srcdir) -I$(top_srcdir)/src/bin/p
 OBJS=	command.o common.o help.o input.o stringutils.o mainloop.o copy.o \
 	startup.o prompt.o variables.o large_obj.o print.o describe.o \
 	tab-complete.o mbprint.o dumputils.o keywords.o kwlookup.o \
-	sql_help.o \
+	sql_help.o pivot.o \
 	$(WIN32RES)
 
 
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 6181a61..e348028 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -48,6 +48,7 @@
 #include "psqlscan.h"
 #include "settings.h"
 #include "variables.h"
+#include "pivot.h"
 
 /*
  * Editable database object types.
@@ -1083,6 +1084,12 @@ exec_command(const char *cmd,
 		free(pw2);
 	}
 
+	/* \pivot -- execute a query and show pivoted results */
+	else if (strcmp(cmd, "pivot") == 0)
+	{
+		success = doPivot(query_buf);
+	}
+
 	/* \prompt -- prompt and set variable */
 	else if (strcmp(cmd, "prompt") == 0)
 	{
diff --git a/src/bin/psql/pivot.c b/src/bin/psql/pivot.c
new file mode 100644
index 000..cb8ffdf
--- /dev/null
+++ b/src/bin/psql/pivot.c
@@ -0,0 +1,213 @@
+/*
+ * psql - the PostgreSQL interactive terminal
+ *
+ * Copyright (c) 2000-2015, PostgreSQL Global Development Group
+ *
+ * src/bin/psql/pivot.c
+ */
+
+#include "common.h"
+#include "pqexpbuffer.h"
+#include "settings.h"
+#include "pivot.h"
+
+#include 
+
+static int
+headerCompare(const void *a, const void *b)
+{
+	return strcmp( ((struct pivot_field*)a)->name,
+   ((struct pivot_field*)b)->name);
+}
+
+static void
+accumHeader(char* name, int* count, struct pivot_field **sorted_tab, int row_number)
+{
+	struct pivot_field *p;
+
+	/*
+	 * Search for name in sorted_tab. If it doesn't exist, insert it,
+	 * otherwise do nothing.
+	 */
+
+	if (*count >= 1)
+	{
+		p = (char**) bsearch(&name,
+			 *sorted_tab,
+			 *count,
+			 sizeof(struct pivot_field),
+			 headerCompare);
+	}
+	else
+		p=NULL;
+
+	if (!p)
+	{
+		*sorted_tab = pg_realloc(*sorted_tab, sizeof(struct pivot_field) * (1+*count));
+		(*sorted_tab)[*count].name = name;
+		(*sorted_tab)[*count].rank = *count;
+		(*count)++;
+
+		qsort(*sorted_tab,
+			  *count,
+			  sizeof(struct pivot_field),
+			  headerCompare);
+	}
+}
+
+static void
+printPivotedTuples(const PGresult *results,
+   int num_columns,
+   struct pivot_field *piv_columns,
+   int num_rows,
+   struct pivot_field *piv_rows)
+{
+	printQueryOpt popt = pset.popt;
+	printTableContent cont;
+	int	i, j, k, rn;
+
+	popt.title = _("Pivoted query results");
+	printTableInit(&cont, &popt.topt, popt.title,
+   num_columns+1, num_rows);
+
+	/* Pivoting keeps the name of the first column */
+	printTableAddHeader(&cont, PQfname(results, 0),
+		popt.translate_header, 'l');
+
+	for (i = 0; i < num_columns; i++)
+	{
+		printTableAddHeader(&cont, piv_columns[i].name,
+			popt.translate_header, 'l');
+	}
+
+	/* Set row names in the first output column */
+	for (i = 0; i < num_rows; i++)
+	{
+		k = piv_rows[i].rank;
+		cont.cells[k*(num_columns+1)] = piv_rows[i].name;
+		for (j = 0; j < num_columns; j++)
+			cont.cells[k*(num_columns+1)+j+1] = "";
+	}
+	cont.cellsadded = num_rows * (num_columns+1);
+
+	/* Set all the "inside" cells */
+	for (rn = 0; rn < PQntuples(results); rn++)
+	{
+		char* row_name;
+		char* col_name;
+		int row_number;
+		int col_number;
+		struct pivot_field *p;
+
+		row_number = col_number =

Re: [HACKERS] [patch] A \pivot command for psql

2015-08-09 Thread Daniel Verite
Tom Lane wrote:

> Is there a way to implement pivoting as a set-returning function?

Not with the same ease of use. We have crosstab functions
in contrib/tablefunc already, but the killer problem with PIVOT
is that truly dynamic columns are never reachable directly.

If we could do this:

  SELECT * FROM crosstab('select a,b,c FROM tbl');

and the result came back pivoted, with a column for each distinct value
of b, there will be no point in a client-side pivot. But postgres (or
I suppose any SQL interpreter) won't execute this, for not knowing
beforehand what structure "*" is going to have.

So what is currently required from the user, with dynamic columns,
is more like:

1st pass: identify the columns
 SELECT DISTINCT a FROM tbl;

2nd pass: inject the columns, in a second embedded query
and in a record definition, with careful quoting:

  select * from crosstab(
'SELECT a,b,c FROM tbl ORDER BY 1', 
' VALUES (col1),(col2),(col3)...' -- or 'select distinct...' again
  ) AS ct(b type, "col1" type, "col2" type, "col3" type)


Compared to this, \pivot limited to the psql  interpreter 
is a no-brainer, we could just write instead:

=> select a,b,c FROM tbl;
=> \pivot

This simplicity is the whole point. It's the result of doing
the operation client-side, where the record structure can be
pivoted without the target structure being formally declared.

Some engines have a built-in PIVOT syntax (Oracle, SQL server).
I have looked only at their documentation.
Their pivot queries look nicer and are possibly more efficient than
with SET functions, but AFAIK one still needs to programmatically 
inject the list of column values into them, when that list
is not static.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] A \pivot command for psql

2015-08-10 Thread Daniel Verite
David Fetter wrote:

> I'm working up a proposal to add (UN)PIVOT support to the back-end.

I was under the impression that a server-side PIVOT *with dynamic
columns* was just unworkable as an SQL query, because it couldn't
be prepared if it existed.

I am wrong on that? I feel like you guys are all telling me that \pivot
should happen on the server, but the point that it would not be
realistic to begin with is not considered.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] A \pivot command for psql

2015-08-10 Thread Daniel Verite
Tom Lane wrote:

> I'm not sure how pushing it out to psql makes that better.  There is
> no way to do further processing on something that psql has printed,
> so you've punted on solving that issue just as much if not more.

It's the same spirit as \x : the only thing it achieves is better
readability in certain cases, I don't expect more from it.
But I admit that \pivot would be much more of a "niche use case"
than \x

> Besides which, psql is not all that great for looking at very wide tables,
> so I'm not sure that this would be very useful for dynamic column sets
> anyway.

I use \x all the time with wide tables and \x and \pivot happen to play
out well together (In fact I was pleasantly surprised by this)

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] A \pivot command for psql

2015-08-11 Thread Daniel Verite
David Fetter wrote:

> That depends on what you mean by "dynamic columns."  The approach
> taken in the tablefunc extension is to use functions which return
> SETOF RECORD, which in turn need to be cast at runtime.

For me, "PIVOT with dynamic columns" would be a pivot query
whose output columns are not enumerated as input in the
SQL query itself, in any form.

> The second, more on point, is to specify a serialization for the rows
> in the "dynamic columns" case.  Their syntax is "PIVOT XML", but I
> would rather do something more like "PIVOT (SERIALIZATION XML)".

The SERIALIZATION  looks interesting, but I believe these days JSON
would make more sense than XML, both as easier for the client-side and
because of all the json_* functions we now have to mix json with
relational structures.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


  1   2   >