Re: [HACKERS] KNN-GiST with recheck

2014-09-17 Thread Emre Hasegeli
  While looking at it I found a bug.  It returns the second column
  in the wrong order when both of the distance functions return recheck = true.
  Test script attached to run on the regression database.  I tried to
  fix it but could not.  The searchTreeItemDistanceRecheck function is not
  very easy to follow.  I think it deserves more comments.
 
 Fixed, thanks. It was a logical error in the comparison function implementation.

I managed to break it again by ordering rows only by the second column
of the index.  Test script attached.
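
Roughly, the shape of the query is this (the table and index here are made
up for illustration; the actual script is in the attachment):

create table pts (p1 point, p2 point);
create index pts_gist_idx on pts using gist (p1, p2);

select * from pts order by p2 <-> point '(0,0)' limit 10;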


knn-gist-recheck-test-secondcolumn.sql
Description: application/sql



Re: [HACKERS] KNN-GiST with recheck

2014-09-17 Thread Emre Hasegeli
 I managed to break it again by ordering rows only by the second column
 of the index.  Test script attached.

I was confused.  It is undefined behavior.  Sorry for the noise.




[HACKERS] Patch to Remove INNER JOINs

2014-09-17 Thread David Rowley
Over here -
http://www.postgresql.org/message-id/caaphdvqd99i2eesy6phueo8cmkkudhenzsa-edamswszhu2...@mail.gmail.com
I posted a patch to add support for removing SEMI and ANTI joins, where the
join could be proved useless by the existence of a foreign key.

The patch was part of my incremental work with the end goal of allowing
INNER JOIN removal using similar logic. Though it seems that nobody was
interested in the extra code for such a small use case as semi/anti join
removal.

Over here -
http://www.postgresql.org/message-id/2261.1409163...@sss.pgh.pa.us Tom said
he'd rather see the INNER JOIN removal stuff working before getting all the
required infrastructure into the code base... (or at least that's how I
understood it) I then went off and started working on the INNER JOIN
removal stuff and posted it here -
http://www.postgresql.org/message-id/CAApHDvpDXXvKE+=ug1kA--nKfa=bjrjvk8gp9g8uywv6nhc...@mail.gmail.com
, so there's now little point in continuing the discussion on the thread
that was meant for semi/anti join removals. Hence I'm starting a new
one to talk about the INNER JOIN removal patch. I'm hoping this subject
will generate enough interest for a review at some point.

I've attached a patch that I've done a bit more work on based on a few
comments gathered from the semi/anti join thread.

For anyone who hasn't clicked the links above, here's a demo of what the
patch does.

test=# create table c (id int primary key);
CREATE TABLE
test=# create table b (id int primary key, c_id int not null references
c(id));
CREATE TABLE
test=# create table a (id int primary key, b_id int not null references
b(id));
CREATE TABLE
test=#
test=# explain select a.* from a inner join b on a.b_id = b.id inner join c
on b.c_id = c.id;
 QUERY PLAN
------------------------------------------------------
 Seq Scan on a  (cost=0.00..31.40 rows=2140 width=8)
 Planning time: 1.061 ms
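
For comparison, a case where I wouldn't expect the join to be removed, since
columns from b are actually needed by the query (same tables as above):

test=# explain select a.*, b.c_id from a inner join b on a.b_id = b.id;

Here b still has to be scanned, so only joins whose relation contributes
nothing to the query result (and is proved to always find a match via the
foreign key) are candidates for removal.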

Regards

David Rowley


inner_join_removals_2014-09-17_d17e608.patch
Description: Binary data



Re: [HACKERS] KNN-GiST with recheck

2014-09-17 Thread Alexander Korotkov
On Wed, Sep 17, 2014 at 12:30 PM, Emre Hasegeli e...@hasegeli.com wrote:

  I managed to break it again by ordering rows only by the second column
  of the index.  Test script attached.

 I was confused.  It is undefined behavior.  Sorry for the noise.


No problem. Thanks a lot for testing.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] [TODO] Process pg_hba.conf keywords as case-insensitive

2014-09-17 Thread Kyotaro HORIGUCHI
Hi,

At Thu, 11 Sep 2014 08:10:54 -0400, Robert Haas robertmh...@gmail.com wrote 
in ca+tgmoz9xinc_ca23-p1dmihmv0zhckef6_rv6v3s+oxrla...@mail.gmail.com
 On Wed, Sep 10, 2014 at 4:54 AM, Kyotaro HORIGUCHI
 horiguchi.kyot...@lab.ntt.co.jp wrote:
  Finally I think that we need case-insensitive versions of
  get_role_id() and get_database_id() to accomplish this patch's
  objective. (This runs full-scans on pg_database or pg_authid X(
 
 Any such thing is certainly grounds for rejecting the patch outright.
 It may be that pg_hba.conf should follow the same case-folding rules
 we use elsewhere, but it should not invent novel semantics, especially
 ones that make connecting to the database a far more expensive
 operation than it is today.

No wonder.  I wondered why such things were needed for this
'case-insensitive matching'.  I had misunderstood the meaning of
'case-insensitive'.  There's no need to scan the catalogs for
'case-insensitive' matching.  Thank you for the suggestion.

 - Non-quoted names are matched against the names in the catalog
   after being lowercased.

 - Quoted names are matched as is.

This is achieved by simply downcasing the identifier unless it is in
quoted notation, and removing the case-insensitive versions of the
catalog lookup functions.
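
For reference, that follows the same folding rule as SQL identifiers; a
small sketch with made-up role names (the names written in pg_hba.conf
would be matched the same way):

  CREATE ROLE MyUser;      -- unquoted: folded, stored as myuser
  CREATE ROLE "MyUser";    -- quoted: stored as MyUser, case preserved
  SELECT rolname FROM pg_roles WHERE rolname IN ('myuser', 'MyUser');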

I'll post a more reasonable version soon.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-17 Thread Kyotaro HORIGUCHI
Sorry for the mistake...

At Wed, 10 Sep 2014 18:53:03 +0300, Heikki Linnakangas 
hlinnakan...@vmware.com wrote in 541073df.70...@vmware.com
 Wrong thread...
 
 On 09/10/2014 03:04 AM, Kyotaro HORIGUCHI wrote:
  Hmm. Sorry, I misunderstood the specification.
 
  Your approach of coloring tokens seems right, but you have
  broken the parse logic with the added code.

  Other than the mistakes others pointed out, I found that

  - non-SQL-ident-like tokens are matched ignoring their token style,
 quoted or not, so the following line works.
 
  | local All aLL trust

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




[HACKERS] Join consolidation / Removing duplicate joins

2014-09-17 Thread David Rowley
I've been looking into improving the performance of queries that have
correlated subqueries, such as:

SELECT id,(SELECT value FROM t2 WHERE t2.id = t1.id) FROM t1;

Where currently we produce a plan that executes the subquery as a sub plan,
like:

QUERY PLAN
--------------------------------------------------------------------------
 Seq Scan on t1  (cost=0.00..8456925.00 rows=100 width=4) (actual
time=0.139..4071.598 rows=100 loops=1)
   SubPlan 1
 ->  Index Scan using t2_pkey on t2  (cost=0.42..8.44 rows=1 width=4)
(actual time=0.003..0.003 rows=1 loops=100)
   Index Cond: (id = t1.id)
 Planning time: 0.169 ms
 Execution time: 4107.809 ms

Though, if the subquery could be proved only to ever return 1 record, then
this could be re-written to become:

explain analyze SELECT t1.id,t2.value FROM t1 LEFT OUTER JOIN t2 ON t1.id =
t2.id;
QUERY PLAN
--------------------------------------------------------------------------
 Hash Left Join  (cost=30832.00..70728.00 rows=100 width=8) (actual
time=384.337..1452.387 rows=100 loops=1)
   Hash Cond: (t1.id = t2.id)
   ->  Seq Scan on t1  (cost=0.00..14425.00 rows=100 width=4) (actual
time=0.015..199.716 rows=100 loops=1)
   ->  Hash  (cost=14425.00..14425.00 rows=100 width=8) (actual
time=382.387..382.387 rows=100 loops=1)
 Buckets: 131072  Batches: 16  Memory Usage: 2463kB
 ->  Seq Scan on t2  (cost=0.00..14425.00 rows=100 width=8)
(actual time=0.010..179.911 rows=100 loops=1)
 Planning time: 0.396 ms
 Execution time: 1473.392 ms
(8 rows)

(notice the performance increase)

Of course, you might ask, why not just write the 2nd query in the first
place? Well, good idea, but it's not always that simple, take the following
update statement:

explain update t1
set value = (select value from t2 where t1.id=t2.id),
  value2 = (select value2 from t2 where t1.id = t2.id)
where exists(select 1 from t2 where t1.id=t2.id);

We end up with quite a few extra subqueries where 1 join would probably
have done the trick. We could have written this using the
UPDATE/FROM syntax, but if you like to write standard SQL then that might
not fly.
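
(For the record, the UPDATE/FROM form would look roughly like this, which is
the single-join shape it would be nice if the planner could reach by itself:)

update t1
set value = t2.value,
  value2 = t2.value2
from t2
where t1.id = t2.id;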

Anyway... I've been thinking of writing some code that converts these sub
plans into left joins where it can be proved that the subquery would
produce at most 1 row. But for a case like the UPDATE statement above, I
didn't really want the code to create a new join for each subplan that it
pulls into the outer query; it would be nice if the code could detect that
there's a suitable join there already, and make use of it.

I started to think about what an efficient way to do this might be, and
thought maybe it would be good if, on RelOptInfo or so, we could add a bool
flag named something like uniquejoin, which would be set to true if we
could detect that a unique index existed on the relation whose columns were
a subset of the join condition (similar to the requirement for LEFT JOIN
removals). We could then have a function that tried to pull up all these
subplans into the outer query; when it could detect that at most 1 subplan
row could exist for each outer plan row, the code could convert these to a
LEFT OUTER JOIN on the outer query.

A bit later in planning, once all the subqueries are planned or pulled up,
a join consolidation function could be run that merges all of these
duplicate joins into 1 join for each relation. It could do this by just
looking at relations that have this uniquejoin flag set to true, and then
digging deeper to ensure that the join conditions also match. If we also
went to the trouble of setting this flag for outer rels, and not just the
ones we add from this pullup operation, then we could also merge duplicate
INNER JOINs too.

Another example of where removing these duplicated joins could be useful is
when 2 views get joined that share a non-empty intersection of their
relations, where the join conditions are the same.
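
For example, with the t1/t2 tables from above, something like this ends up
with two identical t1/t2 joins that could in principle be merged into one
(assuming unique indexes on the join columns, as described above):

create view v1 as select t1.id, t2.value from t1 inner join t2 on t1.id = t2.id;
create view v2 as select t1.id, t2.value2 from t1 inner join t2 on t1.id = t2.id;

select * from v1 inner join v2 on v1.id = v2.id;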

I kind of thought that perhaps all of this extra processing to look
for unique indexes might just be more processing than it's worth, as the
success cases of join consolidation might be few, but then I started
looking at the executor code around merge and hash join. It seems that
there might be some performance gains in here too.  I see that with SEMI
joins we move to the next outer row once we've found 1 matching inner row.
I believe that with these uniquejoin relations we could also skip to
the next outer row, the same as with semi joins. I did some quick hacks in
the merge join code to test whether there was much performance to be had
here and found that, on a join condition involving 2 varchar fields of
about 16 chars, I could get the query to run in 97% of the time
(saving 1 comparison on the join 

Re: [HACKERS] Immediate standby promotion

2014-09-17 Thread Simon Riggs
On 14 August 2014 20:27, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Aug 14, 2014 at 10:12 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Fujii Masao masao.fu...@gmail.com writes:
 I'd like to propose to add new option --immediate to pg_ctl promote.
 When this option is set, recovery ignores any WAL data which have not
 been replayed yet and exits immediately. Patch attached.

 This promotion is faster than normal one but can cause data loss.

 TBH, I cannot imagine a situation where that would be a sane thing to do.
 If you have WAL, why would you not replay what you have?  The purpose
 of a database is to preserve your data, not randomly throw it away.

 I've wanted this a number of times, so I think it's quite sane.

fast promotion was actually a supported option in r8 of Postgres but
this option was removed when we implemented streaming replication in
r9.0

The *rough* requirement is sane, but that's not the same thing as
saying this exact patch makes sense.

If you are paused and you can see that WAL up ahead is damaged, then
YES, you do want to avoid applying it. That is possible by setting a
PITR target so that recovery stops at a precise location specified by
you. As an existing option it is better than the blunt force trauma
suggested here.

If you really don't care, just shut down the server, run pg_resetxlog and
start her up - again, no need for a new option.

Anyway, my view is that this is a blunt instrument to do something we
can already do.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Enable WAL archiving even in standby

2014-09-17 Thread Simon Riggs
On 13 August 2014 11:42, Fujii Masao masao.fu...@gmail.com wrote:

 I'd propose the attached WIP patch which allows us to enable WAL archiving
 even in standby. The patch adds always as the valid value of archive_mode.
 If it's set to always, the archiver is started when the server is in standby
 mode and all the WAL files that walreceiver wrote to the disk are archived by
 using archive_command. Then, even after the server is promoted to master,
 the archiver keeps archiving WAL files. The patch doesn't change the meanings
 of the setting values on and off of archive_mode.

Seems like an OK idea.

Perhaps it would be better to have a new parameter called
   archive_role = Origin | Replica | Always
so that we match up with how triggers/rules work,
rather than abuse archive_mode.

We can pass an attribute called %m (mode?) which tells the
archive_command whether we are Master or Standby when called.

Remember to bump the number of background procs by 1 since we
previously assumed the archiver would never run in recovery.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] removing volatile qualifiers from lwlock.c

2014-09-17 Thread Andres Freund
Hi,

On 2014-09-10 14:53:07 -0400, Robert Haas wrote:
 As discussed on the thread Spinlocks and compiler/memory barriers,
 now that we've made the spinlock primitives function as compiler
 barriers (we think), it should be possible to remove volatile
 qualifiers from many places in the source code.  The attached patch
 does this in lwlock.c.  If the changes in commit
 0709b7ee72e4bc71ad07b7120acd117265ab51d0 (and follow-on commits) are
 correct and complete, applying this shouldn't break anything, while
 possibly giving the compiler room to optimize things better than it
 does today.
 
 However, demonstrating the necessity of that commit for these changes
 seems to be non-trivial.  I tried applying this patch and reverting
 commits 5b26278822c69dd76ef89fd50ecc7cdba9c3f035,
 b4c28d1b92c81941e4fc124884e51a7c110316bf, and
 0709b7ee72e4bc71ad07b7120acd117265ab51d0 on a PPC64 POWER8 box with a
 whopping 192 hardware threads (thanks, IBM!).  I then ran the
 regression tests repeatedly, and I ran several long pgbench runs with
 as many as 350 concurrent clients.  No failures.

There's actually one more commit to revert. What I used was:
git revert 5b26278822c69dd76ef89fd50ecc7cdba9c3f035 \
b4c28d1b92c81941e4fc124884e51a7c110316bf \
68e66923ff629c324e219090860dc9e0e0a6f5d6 \
0709b7ee72e4bc71ad07b7120acd117265ab51d0

 So I'm posting this patch in the hope that others can help.  The
 relevant tests are:
 
 1. If you apply this patch to master and run tests of whatever kind
 strikes your fancy, does anything break under high concurrency?  If it
 does, then the above commits weren't enough to make this safe on your
 platform.
 
 2. If you apply this patch to master, revert the commits mentioned
 above, and again run tests, does anything now break?  If it does (but
 the first tests were OK), then that shows that those commits did
 something useful on your platform.

I just tried this on my normal x86 workstation. I applied your lwlock
patch and on top of it I removed most volatiles (there's a couple still
required) from xlog.c. Works for 100 seconds. Then I reverted the above
commits. Breaks within seconds:
master:
LOG:  request to flush past end of generated WAL; request 2/E5EC3DE0, currpos 
2/E5EC1E60
standby:
LOG:  record with incorrect prev-link 4/684C3108 at 4/684C3108
and similar.

So at least for x86 the compiler barriers are obviously required and
seemingly working.

I've attached the very quickly written xlog.c de-volatizing patch.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 2b68a134925e4b2fb6ff282f5ed83b8f57b10732 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Wed, 17 Sep 2014 13:21:20 +0200
Subject: [PATCH] xlog.c-remove-volatile

---
 src/backend/access/transam/xlog.c | 473 ++
 1 file changed, 176 insertions(+), 297 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 34f2fc0..103f077 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1219,16 +1219,13 @@ begin:;
 	 */
 	if (StartPos / XLOG_BLCKSZ != EndPos / XLOG_BLCKSZ)
 	{
-		/* use volatile pointer to prevent code rearrangement */
-		volatile XLogCtlData *xlogctl = XLogCtl;
-
-		SpinLockAcquire(&xlogctl->info_lck);
+		SpinLockAcquire(&XLogCtl->info_lck);
 		/* advance global request to include new block(s) */
-		if (xlogctl->LogwrtRqst.Write < EndPos)
-			xlogctl->LogwrtRqst.Write = EndPos;
+		if (XLogCtl->LogwrtRqst.Write < EndPos)
+			XLogCtl->LogwrtRqst.Write = EndPos;
 		/* update local result copy while I have the chance */
-		LogwrtResult = xlogctl->LogwrtResult;
-		SpinLockRelease(&xlogctl->info_lck);
+		LogwrtResult = XLogCtl->LogwrtResult;
+		SpinLockRelease(&XLogCtl->info_lck);
 	}
 
 	/*
@@ -1323,7 +1320,7 @@ static void
 ReserveXLogInsertLocation(int size, XLogRecPtr *StartPos, XLogRecPtr *EndPos,
 		  XLogRecPtr *PrevPtr)
 {
-	volatile XLogCtlInsert *Insert = &XLogCtl->Insert;
+	XLogCtlInsert *Insert = &XLogCtl->Insert;
 	uint64		startbytepos;
 	uint64		endbytepos;
 	uint64		prevbytepos;
@@ -1378,7 +1375,7 @@ ReserveXLogInsertLocation(int size, XLogRecPtr *StartPos, XLogRecPtr *EndPos,
 static bool
 ReserveXLogSwitch(XLogRecPtr *StartPos, XLogRecPtr *EndPos, XLogRecPtr *PrevPtr)
 {
-	volatile XLogCtlInsert *Insert = &XLogCtl->Insert;
+	XLogCtlInsert *Insert = &XLogCtl->Insert;
 	uint64		startbytepos;
 	uint64		endbytepos;
 	uint64		prevbytepos;
@@ -1696,7 +1693,7 @@ WaitXLogInsertionsToFinish(XLogRecPtr upto)
 	uint64		bytepos;
 	XLogRecPtr	reservedUpto;
 	XLogRecPtr	finishedUpto;
-	volatile XLogCtlInsert *Insert = &XLogCtl->Insert;
+	XLogCtlInsert *Insert = &XLogCtl->Insert;
 	int			i;
 
 	if (MyProc == NULL)
@@ -2131,16 +2128,11 @@ AdvanceXLInsertBuffer(XLogRecPtr upto, bool opportunistic)
 break;
 
 			/* Before waiting, get info_lck and update LogwrtResult */
-			{
-/* use volatile pointer 

[HACKERS] [Windows,PATCH] Use faster, higher precision timer API

2014-09-17 Thread Craig Ringer
Hi all

Attached is a patch to switch 9.5 over to using the
GetSystemTimeAsFileTime call instead of separate GetSystemTime and
SystemTimeToFileTime calls.

This patch is the first step in improving PostgreSQL's support for Windows
high(er) resolution time.

In addition to requiring one less call into the platform libraries, this
change permits capture of timestamps at up to 100ns precision, instead
of the current 1ms limit. Unfortunately due to platform timer resolution
limitations it will in practice only report with 1ms resolution and
0.1ms precision - or sometimes even as much as 15ms resolution. (If you
want to know more, see the README for
https://github.com/2ndQuadrant/pg_sysdatetime).

On Windows 2012 and Windows 8 I'd like to use the new
GetSystemTimePreciseAsFileTime call instead. As this requires some extra
hoop-jumping to safely and efficiently use it without breaking support
for older platforms I suggest that we start with just switching over to
GetSystemTimeAsFileTime, which has been supported since Windows 2000.
Then more precise time capture can be added in a later patch.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 337f1e4894bf7071c6d0c24a77c433a0b9e0cc81 Mon Sep 17 00:00:00 2001
From: Craig Ringer cr...@2ndquadrant.com
Date: Fri, 12 Sep 2014 12:41:35 +0800
Subject: [PATCH] Use GetSystemTimeAsFileTime directly in windows gettimeofday
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

PostgreSQL was calling GetSystemTime followed by SystemTimeToFileTime in the
win32 port gettimeofday function. This is not necessary and limits the reported
precision to the 1ms granularity that the SYSTEMTIME struct can represent. By
using GetSystemTimeAsFileTime we avoid unnecessary conversions and capture
timestamps at 100ns granularity, which is then rounded to 1µs granularity for
storage in a PostgreSQL timestamp.

On most Windows systems this change will actually have no significant effect as
the system timer tick is typically between 1ms and 15ms depending on what timer
resolution currently running applications have requested. You can check this
with clockres.exe from sysinternals. Despite the platform limitation this change
still permits capture of finer timestamps where the system is capable of
producing them and it gets rid of an unnecessary syscall.

Future work may permit use of GetSystemTimePreciseAsFileTime on Windows 8 and
Windows Server 2012 for higher resolution time capture. This call has the same
interface as GetSystemTimeAsFileTime.
---
 src/port/gettimeofday.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/port/gettimeofday.c b/src/port/gettimeofday.c
index 75a9199..73ec406 100644
--- a/src/port/gettimeofday.c
+++ b/src/port/gettimeofday.c
@@ -44,16 +44,14 @@ int
 gettimeofday(struct timeval * tp, struct timezone * tzp)
 {
 	FILETIME	file_time;
-	SYSTEMTIME	system_time;
 	ULARGE_INTEGER ularge;
 
-	GetSystemTime(&system_time);
-	SystemTimeToFileTime(&system_time, &file_time);
+	GetSystemTimeAsFileTime(&file_time);
 	ularge.LowPart = file_time.dwLowDateTime;
 	ularge.HighPart = file_time.dwHighDateTime;
 
 	tp->tv_sec = (long) ((ularge.QuadPart - epoch) / 10000000L);
-	tp->tv_usec = (long) (system_time.wMilliseconds * 1000);
+	tp->tv_usec = (long) (((ularge.QuadPart - epoch) % 10000000L) / 10);
 
 	return 0;
 }
-- 
1.9.3




Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Martijn van Oosterhout
On Tue, Sep 16, 2014 at 02:57:00PM -0700, Peter Geoghegan wrote:
 On Tue, Sep 16, 2014 at 2:07 PM, Peter Eisentraut pete...@gmx.net wrote:
  Clearly, this is worth documenting, but I don't think we can completely
  prevent the problem.  There has been talk of a built-in index integrity
  checking tool.  That would be quite useful.
 
 We could at least use the GNU facility for versioning collations where
 available, LC_IDENTIFICATION [1]. By not versioning collations, we are
 going against the express advice of the Unicode consortium (they also
 advise to do a strcmp() tie-breaker, something that I think we
 independently discovered in 2005, because of a bug report - this is
 what I like to call the Hungarian issue. They know what our
 constraints are.). I recognize it's a tricky problem, because of our
 historic dependence on OS collations, but I think we should definitely
 do something. That said, I'm not volunteering for the task, because I
 don't have time. While I'm not sure of what the long term solution
 should be, it *is not* okay that we don't version collations. I think
 that even the best possible B-Tree check tool is a not a solution.

Personally I think we should just support ICU as an option. FreeBSD has
been maintaining an out of tree patch for 10 years now so we know it
works.

The FreeBSD patch is not optimal though; these days ICU supports UTF-8
directly, so many of the push-ups FreeBSD does are no longer necessary.
It is often faster than glibc and the key sizes for strxfrm are more
compact [1], which is relevant for the recent optimisation patch.

Let's solve this problem once and for all.

[1] http://site.icu-project.org/charts/collation-icu4c48-glibc

-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer


signature.asc
Description: Digital signature


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Matthew Kelly
Here is where I think the timezone and PostGIS cases are fundamentally 
different:
I can pretty easily make sure that all my servers run in the same timezone.  
That's just good practice.  I'm also going to install the same version of 
PostGIS everywhere in a cluster.  I'll build PostGIS and its dependencies from 
the exact same source files, regardless of when I build the machine.

Timezone is a user level setting; PostGIS is a user level library used by a 
subset.

glibc is a system level library, and text is a core data type, however.  
Changing versions to something that doesn't match the kernel can lead to system 
level instability, broken linkers, etc.  (I know because I tried).  Here are 
some subtle other problems that fall out:

 * Upgrading glibc, the kernel, and linker through the package manager in order 
to get security updates can cause the corruption.
 * A basebackup that is taken in production and placed on a backup server might 
not be valid on that server, or your desktop machine, or on the spare you keep 
to do PITR when someone screws up.
 * Unless you keep _all_ of your clusters on the same OS, machines from your 
database spare pool probably won't be the right OS when you add them to the 
cluster because a member failed.

Keep in mind here, by OS I mean CentOS versions.  (we're running a mix of late 
5.x and 6.x, because of our numerous issues with the 6.x kernel)

The problem with LC_IDENTIFICATION is that every machine I have seen reports 
revision 1.0, date 2000-06-24.  It doesn't seem like the versioning is 
being actively maintained.

I'm with Martijn here, let's go ICU, if only because it moves sorting to a user 
level library, instead of a system level.  Martijn, do you have a link to the 
out of tree patch?  If not I'll find it.  I'd like to apply it to a branch and 
start playing with it.

- Matt K


On Sep 17, 2014, at 7:39 AM, Martijn van Oosterhout klep...@svana.org
 wrote:

 On Tue, Sep 16, 2014 at 02:57:00PM -0700, Peter Geoghegan wrote:
 On Tue, Sep 16, 2014 at 2:07 PM, Peter Eisentraut pete...@gmx.net wrote:
 Clearly, this is worth documenting, but I don't think we can completely
 prevent the problem.  There has been talk of a built-in index integrity
 checking tool.  That would be quite useful.
 
 We could at least use the GNU facility for versioning collations where
 available, LC_IDENTIFICATION [1]. By not versioning collations, we are
 going against the express advice of the Unicode consortium (they also
 advise to do a strcmp() tie-breaker, something that I think we
 independently discovered in 2005, because of a bug report - this is
 what I like to call the Hungarian issue. They know what our
 constraints are.). I recognize it's a tricky problem, because of our
 historic dependence on OS collations, but I think we should definitely
 do something. That said, I'm not volunteering for the task, because I
 don't have time. While I'm not sure of what the long term solution
 should be, it *is not* okay that we don't version collations. I think
 that even the best possible B-Tree check tool is a not a solution.
 
 Personally I think we should just support ICU as an option. FreeBSD has
 been maintaining an out of tree patch for 10 years now so we know it
 works.
 
 The FreeBSD patch is not optimal though, these days ICU supports UTF-8
 directly so many of the push-ups FreeBSD does are no longer necessary.
 It is often faster than glibc and the key sizes for strxfrm are more
 compact [1] which is relevent for the recent optimisation patch.
 
 Lets solve this problem for once and for all.
 
 [1] http://site.icu-project.org/charts/collation-icu4c48-glibc
 
 -- 
 Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer





Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Robert Haas
On Wed, Sep 17, 2014 at 9:07 AM, Matthew Kelly mke...@tripadvisor.com wrote:
 Here is where I think the timezone and PostGIS cases are fundamentally 
 different:
 I can pretty easily make sure that all my servers run in the same timezone.  
 That's just good practice.  I'm also going to install the same version of 
 PostGIS everywhere in a cluster.  I'll build PostGIS and its dependencies 
 from the exact same source files, regardless of when I build the machine.

 Timezone is a user level setting; PostGIS is a user level library used by a 
 subset.

 glibc is a system level library, and text is a core data type, however.  
 Changing versions to something that doesn't match the kernel can lead to 
 system level instability, broken linkers, etc.  (I know because I tried).  
 Here are some subtle other problems that fall out:

  * Upgrading glibc, the kernel, and linker through the package manager in 
 order to get security updates can cause the corruption.
  * A basebackup that is taken in production and placed on a backup server 
 might not be valid on that server, or your desktop machine, or on the spare 
 you keep to do PITR when someone screws up.
  * Unless you keep _all_ of your clusters on the same OS, machines from your 
 database spare pool probably won't be the right OS when you add them to the 
 cluster because a member failed.

 Keep in mind here, by OS I mean CentOS versions.  (we're running a mix of 
 late 5.x and 6.x, because of our numerous issues with the 6.x kernel)

 The problem with LC_IDENTIFICATION is that every machine I have seen reports 
 revision 1.0, date 2000-06-24.  It doesn't seem like the versioning is 
 being actively maintained.

 I'm with Martjin here, lets go ICU, if only because it moves sorting to a 
 user level library, instead of a system level.  Martjin do you have a link to 
 the out of tree patch?  If not I'll find it.  I'd like to apply it to a 
 branch and start playing with it.

What I find astonishing is that whoever maintains glibc (or the Red
Hat packaging for it) thinks it's OK to change the collation order in
a minor release.  I'd understand changing it between, say, RHEL 6 and
RHEL 7.  But the idea that minor releases, supposedly safe updates,
think they can whack this around without breaking applications really
kind of blows my mind.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Matthew Kelly
Let me double check that assertion before we go too far with it.

Most of the problems I've seen are across 5 and 6 boundaries.  I thought I had 
a case where it was within a minor release but I can't find it right now.  I'm 
going to dig.

That being said, the sort order changes depending on whether you statically or 
dynamically link (demonstrated on 4+ machines running different linux flavors), 
so at this point I have no reason to trust the stability of the sort across any 
build.  I legitimately question whether strcoll is buggy.  Ex. I have cases 
where for three strings a, b and c:  a < b, but (a || c) > (b || c).  That's 
right, postfixing doesn't hold.  It actually calls into question the index scan 
optimization that occurs when you do LIKE 'test%' even on a single machine, but 
I don't want to bite that off at the moment.
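
For concreteness, this is the shape of the check I mean (placeholder strings 
here; on an affected build the two columns can disagree):

  select a < b as plain_order,
         (a || c) < (b || c) as postfixed_order
  from (values ('foo'::text, 'bar'::text, 'baz'::text)) v(a, b, c);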

My mentality has switched to 'don't trust any change until shown otherwise', so 
that may have bled into my last email.

- Matt K.




On Sep 17, 2014, at 8:17 AM, Robert Haas robertmh...@gmail.com
 wrote:

 On Wed, Sep 17, 2014 at 9:07 AM, Matthew Kelly mke...@tripadvisor.com wrote:
 Here is where I think the timezone and PostGIS cases are fundamentally 
 different:
 I can pretty easily make sure that all my servers run in the same timezone.  
 That's just good practice.  I'm also going to install the same version of 
 PostGIS everywhere in a cluster.  I'll build PostGIS and its dependencies 
 from the exact same source files, regardless of when I build the machine.
 
 Timezone is a user level setting; PostGIS is a user level library used by a 
 subset.
 
 glibc is a system level library, and text is a core data type, however.  
 Changing versions to something that doesn't match the kernel can lead to 
 system level instability, broken linkers, etc.  (I know because I tried).  
 Here are some subtle other problems that fall out:
 
 * Upgrading glibc, the kernel, and linker through the package manager in 
 order to get security updates can cause the corruption.
 * A basebackup that is taken in production and placed on a backup server 
 might not be valid on that server, or your desktop machine, or on the spare 
 you keep to do PITR when someone screws up.
 * Unless you keep _all_ of your clusters on the same OS, machines from your 
 database spare pool probably won't be the right OS when you add them to the 
 cluster because a member failed.
 
 Keep in mind here, by OS I mean CentOS versions.  (we're running a mix of 
 late 5.x and 6.x, because of our numerous issues with the 6.x kernel)
 
 The problem with LC_IDENTIFICATION is that every machine I have seen reports 
 revision 1.0, date 2000-06-24.  It doesn't seem like the versioning is 
 being actively maintained.
 
 I'm with Martjin here, lets go ICU, if only because it moves sorting to a 
 user level library, instead of a system level.  Martjin do you have a link 
 to the out of tree patch?  If not I'll find it.  I'd like to apply it to a 
 branch and start playing with it.
 
 What I find astonishing is that whoever maintains glibc (or the Red
 Hat packaging for it) thinks it's OK to change the collation order in
 a minor release.  I'd understand changing it between, say, RHEL 6 and
 RHEL 7.  But the idea that minor release, supposedly safe updates
 think they can whack this around without breaking applications really
 kind of blows my mind.
 
 -- 
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company





Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Greg Stark
On Tue, Sep 16, 2014 at 11:41 PM, Peter Geoghegan p...@heroku.com wrote:
 The timezone case you highlight here seems quite distinct from what
 Matthew is talking about, because in point of fact the on-disk
 representation is merely *interpreted* with reference to the timezone
 database. So, you could have an inconsistency between standbys
 concerning what the time was in a particular timezone at a particular
 timestamp value as reported by the timestamptz output function, but
 both standbys would be correct on their own terms, which isn't too
 bad.

You could have a problem if you have an expression index on (timestamp
AT TIME ZONE '...'). I may have the expression slightly wrong but I
believe it is possible to write an immutable expression that depends
on the tzdata data as long as it doesn't depend on the user's
current time zone (which would be stable but not immutable). The
actual likelihood of that situation might be much lower and the
ability to avoid it higher but in theory I think Peter's right that
it's the same class of problem.

Generally speaking we try to protect against most environment
dependencies that lead to corrupt databases by encoding them in the
control file. Obviously we can't encode an entire collation in the
controlfile though. We could conceivably have a corpus of
representative strings that we sort and then checksum in the
controlfile. It wouldn't be foolproof but if we collect interesting
examples as we find them it might be a worthwhile safety check.
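
For instance, something along these lines, purely as a sketch of the idea
(it assumes an en_US collation exists; the real check would have to live in
C next to the control file code):

  SELECT md5(string_agg(s, '|' ORDER BY s COLLATE "en_US"))
  FROM (VALUES ('a'), ('A'), ('b'), ('ss'), ('SS'), ('co-op'), ('coop')) v(s);

If that checksum changes after a library update, the sort order has changed
underneath any indexes built with that collation.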

Just brainstorming... I wonder if it would be possible to include any
collation comparisons made in handling an index insert in the xlog
record and have the standby verify those comparisons are valid on the
standby. I guess that would be pretty hard to arrange code-wise since
the comparisons could be coming from anywhere to say nothing of the
wal bloat.

Peter G, could you go into more detail about collation versioning? What
would the implications be for Postgres?

-- 
greg




Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Tatsuo Ishii
Why don't we have our own collation data? It seems MySQL has already done this.

http://dev.mysql.com/doc/refman/5.0/en/charset-collation-implementations.html

I don't think it's something we cannot achieve, because even MySQL accomplishes it :-)

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Greg Stark
On Wed, Sep 17, 2014 at 3:47 PM, Tatsuo Ishii is...@postgresql.org wrote:
 I don't think we cannot achieve that because even MySQL accomplishes:-)

We've always considered it an advantage that we're consistent with the
collations in the rest of the system - generally speaking, the fact that
Postgres integrates with the system rather than being a separate system
unto itself.

Consider bug reports like "I've configured my system to use
fr_FR.UTF-8 and sort produces output in this order; why is Postgres
producing output in a different order?" Or extension authors using
strcoll and being surprised that the module gets inconsistent data
from the database.

Separately we always had a huge problem with ICU that it depended on
storing everything in a UCS-16 native encoding and required converting
to and from UTF-8 using an iterator interface. I heard that improved
somewhat but from what I understand it would be a struggle to avoid
copying every string before using it and consuming twice as much
memory. No more using strings directly out of disk buffers.

Then there's the concern that ICU is a *huge* dependency. ICU is
itself larger than the entire Postgres install. It's a big burden on
users to have to install and configure a second collation library in
addition to the system library and a complete non-starter for embedded
systems or low-memory systems.


-- 
greg




Re: [HACKERS] printing table in asciidoc with psql

2014-09-17 Thread Emanuel Calvo

On 16/09/14 16:52, Szymon Guz wrote:
 Hi,
 I've been working a little bit on a patch for printing tables in
 asciidoc with psql.

 It's not finished yet, I'm not sure if there is any sense in
 supporting border types etc. The code is not cleared so far, but any
 remarks on the style not playing well with the normal postgres style
 of code are welcomed.

 The code just works. With extended and normal modes. With table
 columns made of funny characters, with alignment of data in table
 cells. I was trying to implement it similar to the html export
 function, however escaping of the strings was much easier, as the
 normal html-way substitution is not easy to implement in asciidoc.

 I'd like to ask you for any advices for this code.

 thanks,
 Szymon



Please add asciidoc in src/bin/psql/help.c[354]

 354,96-103
fprintf(output, _("  format set output format
[unaligned, aligned, wrapped, html, latex, ..]\n"));

Tested the patch and it worked fine.

-- 
--
Emanuel Calvo http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] [Windows,PATCH] Use faster, higher precision timer API

2014-09-17 Thread Andrew Dunstan


On 09/17/2014 08:27 AM, Craig Ringer wrote:

Hi all

Attached is a patch to switch 9.5 over to using the
GetSystemTimeAsFileTime call instead of separate GetSystemTime and
SystemTimeToFileTime calls.

This patch the first step in improving PostgreSQL's support for Windows
high(er) resolution time.

In addition to requiring one less call into the platform libraries, this
change permits capture of timestamps at up to 100ns precision, instead
of the current 1ms limit. Unfortunately due to platform timer resolution
limitations it will in practice only report with 1ms resolution and
0.1ms precision - or sometimes even as much as 15ms resolution. (If you
want to know more, see the README for
https://github.com/2ndQuadrant/pg_sysdatetime).

On Windows 2012 and Windows 8 I'd like to use the new
GetSystemTimePreciseAsFileTime call instead. As this requires some extra
hoop-jumping to safely and efficiently use it without breaking support
for older platforms I suggest that we start with just switching over to
GetSystemTimeAsFileTime, which has been supported since Windows 2000.
Then more precise time capture can be added in a later patch.





That will presumably break XP. I know XP has been declared EOL, but 
there are still a heck of a lot of such systems out there, especially in 
places like ATMs, and I saw it in use recently at a US surgical facility 
(which is slightly scary, although this wasn't for life-sustaining 
functionality). My XP system is still actually getting some security 
updates sent from Microsoft.


I'm fine with doing this - frogmouth and currawong would retire on the 
buildfarm.


Just wanted to be up front about it.

cheers

andrew




Re: [HACKERS] printing table in asciidoc with psql

2014-09-17 Thread Pavel Stehule
2014-09-16 21:52 GMT+02:00 Szymon Guz mabew...@gmail.com:

 Hi,
 I've been working a little bit on a patch for printing tables in asciidoc
 with psql.

 It's not finished yet, I'm not sure if there is any sense in supporting
 border types etc. The code is not cleared so far, but any remarks on the
 style not playing well with the normal postgres style of code are welcomed.

 The code just works. With extended and normal modes. With table columns
 made of funny characters, with alignment of data in table cells. I was
 trying to implement it similar to the html export function, however
 escaping of the strings was much easier, as the normal html-way
 substitution is not easy to implement in asciidoc.

 I'd like to ask you for any advices for this code.


nice +1

Pavel



 thanks,
 Szymon






Re: [HACKERS] Windows exit code 128 ... it's baaack

2014-09-17 Thread Andrew Dunstan


On 04/07/2014 10:26 AM, Andres Freund wrote:

On 2014-04-05 11:05:09 -0400, Tom Lane wrote:

Andres Freund and...@2ndquadrant.com writes:

On 2014-02-27 19:14:13 -0500, Tom Lane wrote:

I looked at the postmaster log for the ongoing issue on narwhal
(to wit, that the contrib/dblink test dies the moment it tries
to do anything dblink-y), and looky here what the postmaster
has logged:

One interesting bit about this is that it seems to work in 9.3...

Well, yeah, it seems to have been broken somehow by the Windows
linking changes we did awhile back to try to ensure that missing
PGDLLIMPORT markers would be detected reliably.  Which we did not
back-patch.

Hard to say since there's been no working builds for HEAD for so long on
narwahl :(.




This issue has been hanging around for many months, possibly much longer 
since the last successful build on narwhal was 2012-08-01 and then it 
went quiet until 2014-02-03, when it came back with this error.


If we don't care to find a fix, I think we need to declare narwhal's 
fairly ancient compiler out of support and decommission it. Other gcc 
systems we have with more modern compilers are not getting this issue.


cheers

andrew




Re: [HACKERS] Join consolidation / Removing duplicate joins

2014-09-17 Thread Marti Raudsepp
On Wed, Sep 17, 2014 at 2:00 PM, David Rowley dgrowle...@gmail.com wrote:
 Anyway... I've been thinking of writing some code that converts these sub
 plans into left joins where it can be proved that the subquery would only at
 most produce 1 row

 Does anyone have any thoughts on this?

+1, I've thought about this part before. There is already precedent
for inlining FROM clause subqueries into the main query, it would be
nice to do that for correlated subqueries as well. It seems we've been
adding features to the planner without fully exploiting opportunities
for normalization and consolidation of optimization techniques.

I think it's not even necessary to prove uniqueness of the subquery as
you describe. Now that 9.3 added LATERAL, a correlated subquery can be
seen as a special case of LATERAL LEFT JOIN with an additional check
to raise an error if more than 1 row is returned from the inner side. And
you could optionally elide the error check if you can prove uniqueness.
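
Using David's t1/t2 example from upthread, the rewrite I have in mind is
roughly this (leaving out the more-than-one-row check, which would be
internal to the planner/executor):

SELECT t1.id, x.value
FROM t1
LEFT JOIN LATERAL (SELECT value FROM t2 WHERE t2.id = t1.id) x ON true;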

Advantages:
1. Sufficiently simple lateral subqueries are already normalized into
ordinary JOINs with hash/merge support, so you would get that for free
(probably requires eliding the 1-row check).
2. We get rid of silliness like the explosion of SubPlan nodes for
each reference (see examples below).
3. Based on some naive testing, it seems that 9.5devel performs
slightly better with NestLoop LATERAL subqueries than SubPlan
correlated ones.
4. EXPLAIN output is easier to read, I find.

I suppose EXISTS/IN with correlated subqueries needs some different
treatment, as it can currently take advantage of the hashed SubPlan
optimization.

Can anyone see any downsides? Perhaps one day we can get rid of
SubPlan entirely, would anyone miss it?


Example of SubPlan explosion:

regression=# create view foo1 as select *, (select ten as ten2 from
tenk2 where tenk1.unique1=tenk2.unique1) from tenk1;

regression=# explain analyze select * from foo1 where ten2 between 1 and 3;
 Seq Scan on tenk1  (cost=0.00..175782.08 rows= width=244) (actual
time=0.052..49.288 rows=3000 loops=1)
   Filter: (((SubPlan 2) >= 1) AND ((SubPlan 3) <= 3))
   Rows Removed by Filter: 7000
   SubPlan 1
 ->  Index Scan using tenk2_unique1 on tenk2  (cost=0.29..8.30
rows=1 width=4) (actual time=0.001..0.002 rows=1 loops=3000)
   Index Cond: (tenk1.unique1 = unique1)
   SubPlan 2
 ->  Index Scan using tenk2_unique1 on tenk2 tenk2_1
(cost=0.29..8.30 rows=1 width=4) (actual time=0.002..0.002 rows=1
loops=1)
   Index Cond: (tenk1.unique1 = unique1)
   SubPlan 3
 ->  Index Scan using tenk2_unique1 on tenk2 tenk2_2
(cost=0.29..8.30 rows=1 width=4) (actual time=0.001..0.002 rows=1
loops=9000)
   Index Cond: (tenk1.unique1 = unique1)
 Execution time: 49.508 ms


LATERAL is a win even when using OFFSET 0 to prevent inlining:

regression=# create view foo3 as select * from tenk1 left join lateral
(select ten as ten2 from tenk2 where tenk1.unique1=tenk2.unique1
offset 0) x on true;

regression=# explain analyze select * from foo3 where ten2 between 1 and 3;
 Nested Loop  (cost=0.29..83733.00 rows=1 width=248) (actual
time=0.043..28.963 rows=3000 loops=1)
   ->  Seq Scan on tenk1  (cost=0.00..458.00 rows=1 width=244)
(actual time=0.008..1.177 rows=1 loops=1)
   ->  Subquery Scan on x  (cost=0.29..8.32 rows=1 width=4) (actual
time=0.002..0.002 rows=0 loops=1)
 Filter: ((x.ten2 >= 1) AND (x.ten2 <= 3))
 Rows Removed by Filter: 1
 ->  Index Scan using tenk2_unique1 on tenk2  (cost=0.29..8.30
rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=1)
   Index Cond: (tenk1.unique1 = unique1)
 Execution time: 29.186 ms


And if you could prove uniqueness of the inner side and inline it,
WHERE clauses can also be pushed down trivially:

regression=# create view foo2 as select * from tenk1 left join lateral
(select ten as ten2 from tenk2 where tenk1.unique1=tenk2.unique1) x on
true;

regression=# explain analyze select * from foo2 where ten2 between 1 and 3;
 Hash Join  (cost=532.50..1083.00 rows=3000 width=248) (actual
time=1.848..4.480 rows=3000 loops=1)
   Hash Cond: (tenk1.unique1 = tenk2.unique1)
   ->  Seq Scan on tenk1  (cost=0.00..458.00 rows=1 width=244)
(actual time=0.002..0.617 rows=1 loops=1)
   ->  Hash  (cost=495.00..495.00 rows=3000 width=8) (actual
time=1.837..1.837 rows=3000 loops=1)
 Buckets: 1024  Batches: 1  Memory Usage: 118kB
 ->  Seq Scan on tenk2  (cost=0.00..495.00 rows=3000 width=8)
(actual time=0.004..1.562 rows=3000 loops=1)
   Filter: ((ten >= 1) AND (ten <= 3))
   Rows Removed by Filter: 7000
 Execution time: 4.591 ms


Regards,
Marti




Re: [HACKERS] [Windows,PATCH] Use faster, higher precision timer API

2014-09-17 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 09/17/2014 08:27 AM, Craig Ringer wrote:
 Attached is a patch to switch 9.5 over to using the
 GetSystemTimeAsFileTime call instead of separate GetSystemTime and
 SystemTimeToFileTime calls.

 That will presumably breaK XP. I know XP has been declared at EOL, but 
 there are still a heck of a lot of such systems out there,

Yeah.  Do we really think more precise timestamps are worth dropping
XP support?  On the Unix side, I know exactly what would happen to a
patch proposing that we replace gettimeofday() with clock_gettime()
with no thought for backwards compatibility.  Why would we expect
less on the Windows side?

Quite aside from XP ... AFAICS from the patch description, this patch
in itself moves us to a place that's a net negative in terms of
functionality.  Maybe it's a stepping stone to something better,
but I think we should just go directly to the something better.
I don't care for committing regressions on the promise that they'll
get fixed later.

Or in short: let's do the work needed to adapt our code to what's
available on the particular Windows version *first*.  Once we've
got that configuration support done, it shouldn't be much extra
work to continue XP support here.

regards, tom lane




Re: [HACKERS] [Windows,PATCH] Use faster, higher precision timer API

2014-09-17 Thread Andres Freund
On 2014-09-17 11:19:36 -0400, Andrew Dunstan wrote:
 
 On 09/17/2014 08:27 AM, Craig Ringer wrote:
 Hi all
 
 Attached is a patch to switch 9.5 over to using the
 GetSystemTimeAsFileTime call instead of separate GetSystemTime and
 SystemTimeToFileTime calls.
 
 This patch the first step in improving PostgreSQL's support for Windows
 high(er) resolution time.
 
 In addition to requiring one less call into the platform libraries, this
 change permits capture of timestamps at up to 100ns precision, instead
 of the current 1ms limit. Unfortunately due to platform timer resolution
 limitations it will in practice only report with 1ms resolution and
 0.1ms precision - or sometimes even as much as 15ms resolution. (If you
 want to know more, see the README for
 https://github.com/2ndQuadrant/pg_sysdatetime).
 
 On Windows 2012 and Windows 8 I'd like to use the new
 GetSystemTimePreciseAsFileTime call instead. As this requires some extra
 hoop-jumping to safely and efficiently use it without breaking support
 for older platforms I suggest that we start with just switching over to
 GetSystemTimeAsFileTime, which has been supported since Windows 2000.
 Then more precise time capture can be added in a later patch.
 
 
 
 
 That will presumably breaK XP.

The proposed patch? I don't really see why? GetSystemTimeAsFileTime() is
documented to be available since win2k?

Or do you mean GetSystemTimePreciseAsFileTime()? That'd surely - as
indicated by Craig - have to be optional since it's not available
anywhere but 2012 and Windows 8?

 I know XP has been declared at EOL, but there
 are still a heck of a lot of such systems out there, especially in places
 like ATMs, but I saw it in use recently at a US surgical facility (which is
 slightly scary, although this wasn't for life-sustaining functionality). My
 XP system is still actually getting some security updates sent from
 Microsoft.

I unfortunately have to agree, dropping XP is probably at least a year
or two out.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [Windows,PATCH] Use faster, higher precision timer API

2014-09-17 Thread Andrew Dunstan


On 09/17/2014 12:51 PM, Andres Freund wrote:

On 2014-09-17 11:19:36 -0400, Andrew Dunstan wrote:

On 09/17/2014 08:27 AM, Craig Ringer wrote:

Hi all

Attached is a patch to switch 9.5 over to using the
GetSystemTimeAsFileTime call instead of separate GetSystemTime and
SystemTimeToFileTime calls.

This patch the first step in improving PostgreSQL's support for Windows
high(er) resolution time.

In addition to requiring one less call into the platform libraries, this
change permits capture of timestamps at up to 100ns precision, instead
of the current 1ms limit. Unfortunately due to platform timer resolution
limitations it will in practice only report with 1ms resolution and
0.1ms precision - or sometimes even as much as 15ms resolution. (If you
want to know more, see the README for
https://github.com/2ndQuadrant/pg_sysdatetime).

On Windows 2012 and Windows 8 I'd like to use the new
GetSystemTimePreciseAsFileTime call instead. As this requires some extra
hoop-jumping to safely and efficiently use it without breaking support
for older platforms I suggest that we start with just switching over to
GetSystemTimeAsFileTime, which has been supported since Windows 2000.
Then more precise time capture can be added in a later patch.




That will presumably breaK XP.

The proposed patch? I don't really see why? GetSystemTimeAsFileTime() is
documented to be available since win2k?



Oh, hmm, yes, you're right. For some reason I was thinking W2K was later 
than XP. I get more random memory errors as I get older ...


cheers

andrew






Re: [HACKERS] printing table in asciidoc with psql

2014-09-17 Thread Peter Eisentraut
On 9/16/14 3:52 PM, Szymon Guz wrote:
  It's not finished yet, I'm not sure if there is any sense in supporting
 border types etc.

AFAICT, Asciidoc doesn't support border types, so (if so) you should
just ignore that setting.




Re: [HACKERS] [Windows,PATCH] Use faster, higher precision timer API

2014-09-17 Thread Andres Freund
On 2014-09-17 09:38:59 -0700, Tom Lane wrote:
 On the Unix side, I know exactly what would happen to a
 patch proposing that we replace gettimeofday() with clock_gettime()
 with no thought for backwards compatibility.

Btw, do you plan to pursue clock_gettime()? It'd be really neat to have
it...

 
 Quite aside from XP ... AFAICS from the patch description, this patch
 in itself moves us to a place that's a net negative in terms of
 functionality.  Maybe it's a stepping stone to something better, but I
 think we should just go directly to the something better.  I don't
 care for committing regressions on the promise that they'll get fixed
 later.

I don't think there's any regressions in that patch? Rather the
contrary. I understand the comment about the timer tick to be just as
applicable to the current code as the new version. Just that the old
code can't possibly have a precision lower than 1ms, but the new one
can.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] printing table in asciidoc with psql

2014-09-17 Thread Szymon Guz
On 17 September 2014 19:30, Peter Eisentraut pete...@gmx.net wrote:

 On 9/16/14 3:52 PM, Szymon Guz wrote:
  It's not finished yet, I'm not sure it there is any sense in supporting
  border types etc.

 AFAICT, Asciidoc doesn't support border types, so (if so) you should
 just ignore that setting.


Too late, I've done something like this:

border=0
[frame=none,grid=none]

border=1
[frame=all,grid=none]

border=2
[frame=all,grid=all]

thanks,
Szymon


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Peter Geoghegan
On Wed, Sep 17, 2014 at 6:17 AM, Robert Haas robertmh...@gmail.com wrote:
 What I find astonishing is that whoever maintains glibc (or the Red
 Hat packaging for it) thinks it's OK to change the collation order in
 a minor release.  I'd understand changing it between, say, RHEL 6 and
 RHEL 7.  But the idea that minor release, supposedly safe updates
 think they can whack this around without breaking applications really
 kind of blows my mind.

Why wouldn't they feel entitled to? To quote UTS #10 [1]:


Collation order is not fixed.

Over time, collation order will vary: there may be fixes needed as
more information becomes available about languages; there may be new
government or industry standards for the language that require
changes; and finally, new characters added to the Unicode Standard
will interleave with the previously-defined ones. This means that
collations must be carefully versioned.


Indeed, they do version collations with LC_IDENTIFICATION. We just
don't make any attempt to use the version information. In short, this
is our fault.   :-(

[1] http://www.unicode.org/reports/tr10/#Stability
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Peter Eisentraut
On 9/17/14 10:47 AM, Tatsuo Ishii wrote:
 Why don't we have our collation data? It seems MySQL has already done this.

Where would you get the source data from?  How would you maintain it?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Peter Eisentraut
On 9/16/14 5:57 PM, Peter Geoghegan wrote:
 On Tue, Sep 16, 2014 at 2:07 PM, Peter Eisentraut pete...@gmx.net wrote:
 Clearly, this is worth documenting, but I don't think we can completely
 prevent the problem.  There has been talk of a built-in index integrity
 checking tool.  That would be quite useful.
 
 We could at least use the GNU facility for versioning collations where
 available, LC_IDENTIFICATION [1].

It looks like the revisions or dates reported by LC_IDENTIFICATION
aren't ever updated for most locales.
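
(For anyone who wants to check their own system, here is a minimal sketch -
not PostgreSQL code - assuming glibc's GNU-specific _NL_IDENTIFICATION_*
langinfo items; it prints what the installed locale claims as its revision
and date:)

#define _GNU_SOURCE
#include <stdio.h>
#include <locale.h>
#include <langinfo.h>

int
main(void)
{
	/* adopt the locale from the environment */
	setlocale(LC_ALL, "");

	/* these items come from the locale's LC_IDENTIFICATION section */
	printf("title:    %s\n", nl_langinfo(_NL_IDENTIFICATION_TITLE));
	printf("revision: %s\n", nl_langinfo(_NL_IDENTIFICATION_REVISION));
	printf("date:     %s\n", nl_langinfo(_NL_IDENTIFICATION_DATE));
	return 0;
}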



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Martijn van Oosterhout
On Wed, Sep 17, 2014 at 01:07:56PM +, Matthew Kelly wrote:
 I'm with Martjin here, lets go ICU, if only because it moves sorting
 to a user level library, instead of a system level.  Martjin do you
 have a link to the out of tree patch?  If not I'll find it.  I'd like
 to apply it to a branch and start playing with it.

http://people.freebsd.org/~girgen/postgresql-icu/README.html
http://people.freebsd.org/~girgen/postgresql-icu/

Note I said optional. It is a large library for sure, but for some
installations I think the benefits are sufficient.

Mvg,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer


signature.asc
Description: Digital signature


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Peter Eisentraut
On 9/17/14 9:07 AM, Matthew Kelly wrote:
 Here is where I think the timezone and PostGIS cases are fundamentally 
 different:
 I can pretty easily make sure that all my servers run in the same timezone.  
 That's just good practice.  I'm also going to install the same version of 
 PostGIS everywhere in a cluster.  I'll build PostGIS and its dependencies 
 from the exact same source files, regardless of when I build the machine.

I wrote time zone *database*, not time zone.  The time zone database  is
(in some configurations) part of glibc.

I also wrote PostGIS dependent libraries, not PostGIS itself.  If you
are comparing RHEL 5 and 6, as you wrote elsewhere, then some of those
will most likely be different.  (Heck, glibc could be different.  Is
glibc never allowed to fix insufficiencies in its floating-point
implementation, for example?)

Also, there is nothing that guarantees that the PostGIS version will be
the same on both sides.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Peter Geoghegan
On Wed, Sep 17, 2014 at 11:05 AM, Peter Eisentraut pete...@gmx.net wrote:
 We could at least use the GNU facility for versioning collations where
 available, LC_IDENTIFICATION [1].

 It looks like the revisions or dates reported by LC_IDENTIFICATION
 aren't ever updated for most locales.


That's not surprising. There is zero controversy about how to
correctly sort English text, for example. For other languages, that
might be much less true.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Peter Eisentraut
On 9/17/14 10:46 AM, Greg Stark wrote:
 You could have a problem if you have an expression index on (timestamp
 AT TIME ZONE '...'). I may have the expression slightly wrong but I
 believe it is posisble to write an immutable expression that depends
 on the tzdata data as long as it doesn't depend on not the user's
 current time zone (which would be stable but not immutable). The
 actual likelihood of that situation might be much lower and the
 ability to avoid it higher but in theory I think Peter's right that
 it's the same class of problem.

I was thinking of something like a text column with natural input of
time stamp information, and an index on that_column::timestamp.
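
As an illustration of how such an index typically comes about (table and
function names here are invented): the text-to-timestamp cast is only STABLE,
so the usual trick is to hide it behind a wrapper declared IMMUTABLE, and that
lie is exactly where the environment dependency sneaks in.

CREATE TABLE events (raw_ts text);

-- A direct index on (raw_ts::timestamp) is rejected because the cast is not
-- immutable (its result can depend on settings such as DateStyle), so the
-- wrapper is declared IMMUTABLE even though that is not really true.
CREATE FUNCTION parse_ts(text) RETURNS timestamp
    LANGUAGE sql IMMUTABLE AS $$ SELECT $1::timestamp $$;

CREATE INDEX events_ts_idx ON events (parse_ts(raw_ts));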

 Generally speaking we try to protect against most environment
 dependencies that lead to corrupt databases by encoding them in the
 control file. Obviously we can't encode an entire collation in the
 controlfile though. We could conceivably have a corpus of
 representative strings that we sort and then checksum in the
 controlfile. It wouldn't be foolproof but if we collect interesting
 examples as we find them it might be a worthwhile safety check.

I think it could be useful in a number of situations if a type could
stick some arbitrary additional information into a new column in
pg_type, such as versions of libraries it depends on or storage format
versions.

Then again, collation isn't actually a property of any single type.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Peter Eisentraut
On 9/17/14 2:07 PM, Peter Geoghegan wrote:
 On Wed, Sep 17, 2014 at 11:05 AM, Peter Eisentraut pete...@gmx.net wrote:
 We could at least use the GNU facility for versioning collations where
 available, LC_IDENTIFICATION [1].

 It looks like the revisions or dates reported by LC_IDENTIFICATION
 aren't ever updated for most locales.
 
 
 That's not surprising. There is zero controversy about how to
 correctly sort English text, for example. For other languages, that
 might be much less true.
 

a) There is plenty of controversy about how to sort English text on
Stack Overflow. ;-)

b) Even in an English locale you have to maintain a sort order for all
Unicode characters, and that changes more than zero times.  But it's
quite clear from looking at the glibc git logs that no one is
maintaining these version numbers.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Peter Geoghegan
On Wed, Sep 17, 2014 at 11:08 AM, Peter Eisentraut pete...@gmx.net wrote:
 I also wrote PostGIS dependent libraries, not PostGIS itself.  If you
 are comparing RHEL 5 and 6, as you wrote elsewhere, then some of those
 will most likely be different.  (Heck, glibc could be different.  Is
 glibc never allowed to fix insufficiencies in its floating-point
 implementation, for example?)


The operator class author has a responsibility to make sure that
doesn't happen. If he or she should fail, then it's a bug, and
possibly a failure of imagination on their part. This is the only way
of thinking about it that makes sense. If you want to use a library
feature in your opclass B-Tree support function 1, then you'd better
be damned sure that it implies immutability insofar as that's
possible. Sure, it's also possible that your users could be the victim
on an unfortunate upstream bug that you couldn't reasonably predict,
but when is that not true?

In general, I am totally unconvinced by this line of argument. It
implies that everyone has to be an expert on everything just to use
Postgres.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Windows,PATCH] Use faster, higher precision timer API

2014-09-17 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-09-17 09:38:59 -0700, Tom Lane wrote:
 On the Unix side, I know exactly what would happen to a
 patch proposing that we replace gettimeofday() with clock_gettime()
 with no thought for backwards compatibility.

 Btw, do you plan to pursue clock_gettime()? It'd be really neat to have
 it...

It's on my TODO list, but not terribly close to the top.  If you're
excited about that, feel free to take it up.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-17 Thread Emanuel Calvo

On 15/09/14 18:13, Simon Riggs wrote:
 On 15 September 2014 17:09, Robert Haas robertmh...@gmail.com wrote:

 Do we really want to disable HOT for all catalog scans?
 The intention of the patch is that catalog scans are treated
 identically to non-catalog scans. The idea here is that HOT cleanup
 only occurs on scans on target relations, so only INSERT, UPDATE and
 DELETE do HOT cleanup.

 It's possible that many catalog scans don't follow the normal target
 relation logic, so we might argue we should use HOT every time. OTOH,
 since we now have separate catalog xmins we may find that using HOT on
 catalogs is no longer effective. So I could go either way on how to
 proceed; its an easy change either way.


I set up a more concurrent scenario and the difference is much larger:

Without patch:
1st concurrent with writes:
tps = 5705.261620 (including connections establishing)
tps = 5945.338557 (excluding connections establishing)
2nd no writes being executed:
tps = 9988.792575 (including connections establishing)
tps = 11059.671351 (excluding connections establishing)


Patched version:
1st concurrent with writes:
tps = 9476.741707 (including connections establishing)
tps = 10274.831185 (excluding connections establishing)
2nd no writes being executed:
tps = 12993.644808 (including connections establishing)
tps = 15171.214744 (excluding connections establishing)


Stats (writes have been run with a time limit, not by tx):

hotcleanup=# select relname ,n_live_tup, n_dead_tup, n_tup_hot_upd from
pg_stat_user_tables where relname ~ 'pgbench';
 relname  | n_live_tup | n_dead_tup | n_tup_hot_upd
--+++---
 pgbench_tellers  |500 |  0 |   2044192
 pgbench_accounts |5109728 | 310842 |   1969264
 pgbench_history  |2265882 |  0 | 0
 pgbench_branches | 50 |  0 |   2237167
(4 rows)

hotcleanup=# select relname ,n_live_tup, n_dead_tup, n_tup_hot_upd from
pg_stat_user_tables where relname ~ 'pgbench';
 relname  | n_live_tup | n_dead_tup | n_tup_hot_upd
--+++---
 pgbench_history  |2148946 |  0 | 0
 pgbench_tellers  |500 |  0 |   1969675
 pgbench_branches | 50 |  0 |   2150655
 pgbench_accounts |5098774 | 300123 |   1897484
(4 rows)

  

I ran the regression tests over the patched version and they passed ok.


-- 
--
Emanuel Calvo http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: plpgsql - Assert statement

2014-09-17 Thread Peter Eisentraut
On 9/16/14 12:01 AM, Alvaro Herrera wrote:
 Jan Wieck wrote:
 I think that most data integrity issues can be handled by a well
 designed database schema that uses UNIQUE, NOT NULL, REFERENCES and
 CHECK constraints. Assertions are usually found inside of complex
 code constructs to check values of local variables. I don't think it
 is even a good idea to implement assertions that can query arbitrary
 data.
 
 Actually Peter Eisentraut posted a patch for SQL assertions:
 http://www.postgresql.org/message-id/1384486216.5008.17.ca...@vanquo.pezone.net

SQL assertions are just a kind of CHECK constraint, so fully
Jan-compliant. ;-)

I don't mind PL/pgSQL having an assert statement like many programming
languages, but I find a lot of the proposed details dubious.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: plpgsql - Assert statement

2014-09-17 Thread Peter Eisentraut
On 9/14/14 2:49 PM, Jan Wieck wrote:
 I don't think it is even a good idea to implement assertions that can
 query arbitrary data.

In a normal programming language, an assertion is usually a static fault
in your program.  If the assertion ever fails, you fix your program and
then it hopefully never happens again.

Assertions that query the state of the database or result row counts are
pushing that concept quite a bit.  Those are not assertions; those are
just plain old error handling.
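
(For illustration - the table names are invented - this is what such a data
"assertion" amounts to today when written as ordinary error handling:)

DO $$
BEGIN
    -- check that every foo.fk has a matching bar.pk
    IF EXISTS (SELECT 1 FROM foo WHERE fk NOT IN (SELECT pk FROM bar)) THEN
        RAISE EXCEPTION 'foo contains rows with no matching row in bar';
    END IF;
END
$$;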



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: plpgsql - Assert statement

2014-09-17 Thread Marko Tiikkaja

On 9/17/14, 9:00 PM, Peter Eisentraut wrote:

On 9/14/14 2:49 PM, Jan Wieck wrote:

I don't think it is even a good idea to implement assertions that can
query arbitrary data.


In a normal programming language, an assertion is usually a static fault
in your program.  If the assertion ever fails, you fix your program and
then it hopefully never happens again.

Assertions that query the state of the database or result row counts are
pushing that concept quite a bit.  Those are not assertions; those are
just plain old error handling.


*shrug*  I don't see them as error handling if they're just checking 
conditions which should never happen.


That said, in PL/PgSQL these expressions would likely have to be SQL 
expressions, and then you'd have to go out of your way to implement 
assertions which *can't* query arbitrary data.  And that just seems silly.



.marko


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: plpgsql - Assert statement

2014-09-17 Thread Pavel Stehule
2014-09-17 21:00 GMT+02:00 Peter Eisentraut pete...@gmx.net:

 On 9/14/14 2:49 PM, Jan Wieck wrote:
  I don't think it is even a good idea to implement assertions that can
  query arbitrary data.

 In a normal programming language, an assertion is usually a static fault
 in your program.  If the assertion ever fails, you fix your program and
 then it hopefully never happens again.

  Assertions that query the state of the database or result row counts are
  pushing that concept quite a bit.  Those are not assertions; those are
  just plain old error handling.


What is the difference between the content of a variable and the content of
the database? You can test any prerequisite, but when this prerequisite is not
satisfied, then the resulting exception is very, very hard without any
possibility of handling it.

Pavel


Re: [HACKERS] printing table in asciidoc with psql

2014-09-17 Thread Szymon Guz
On 17 September 2014 19:55, Szymon Guz mabew...@gmail.com wrote:



 On 17 September 2014 19:30, Peter Eisentraut pete...@gmx.net wrote:

 On 9/16/14 3:52 PM, Szymon Guz wrote:
  It's not finished yet, I'm not sure it there is any sense in supporting
  border types etc.

 AFAICT, Asciidoc doesn't support border types, so (if so) you should
 just ignore that setting.


 Too late, I've done something like this:

 border=0
 [frame=none,grid=none]

 border=1
 [frame=all,grid=none]

 border=2
 [frame=all,grid=all]

 thanks,
 Szymon



Hi,
thanks for all the remarks.

I've attached another version of this patch.

I think it's done.

- This works: `\pset format asciidoc`

- Output is formatted as asciidoc tables.

- There is support for borders {0,1,2}. The attached html file was made by
running tests for psql, taking the asciidoc tables from it, converting to
html with `asciidoc file`.
-- border = 0 - [frame=none,grid=none]
-- border = 1 - [frame=none,grid=all]
-- border = 2 - [frame=all,grid=all]

- There are also tests.
-- For normal and extended mode combined with each of the border values.
-- With column names made of characters which need escaping
-- With values: (with escape needed characters, string '11' and integer 11
- they should have different right-left alignment).

- Documentation for psql is updated.

- According to Emanuel's advice: help.c is updated.

The attached html file contains tables from the test in this order:

normal, border 0
normal, border 1
normal, border 2
expanded, border 0
expanded, border 1
expanded, border 2
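
(For anyone trying the patch: the new format is selected like any other psql
output format, and border uses the existing \pset setting - a hypothetical
session, with the cell value chosen to exercise escaping:)

\pset format asciidoc
\pset border 2
SELECT 1 AS id, 'one | two' AS note;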

regards,
Szymon
asciidoc_output.html
Description: text/html

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index e7fcc73..cd64b88 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -2092,8 +2092,8 @@ lo_import 152801
   <literal>aligned</literal>, <literal>wrapped</literal>,
   <literal>html</literal>,
   <literal>latex</literal> (uses <literal>tabular</literal>),
-  <literal>latex-longtable</literal>, or
-  <literal>troff-ms</literal>.
+  <literal>latex-longtable</literal>,
+  <literal>troff-ms</literal>, or <literal>asciidoc</literal>.
   Unique abbreviations are allowed.  (That would mean one letter
   is enough.)
   </para>
@@ -2120,7 +2120,8 @@ lo_import 152801
 
   <para>
   The <literal>html</>, <literal>latex</>,
-  <literal>latex-longtable</literal>, and <literal>troff-ms</>
+  <literal>latex-longtable</literal>, <literal>troff-ms</>,
+  and <literal>asciidoc</>
   formats put out tables that are intended to
   be included in documents using the respective mark-up
   language. They are not complete documents! This might not be
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 2227db4..ae6b106 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -2247,6 +2247,9 @@ _align2string(enum printFormat in)
 		case PRINT_TROFF_MS:
 			return "troff-ms";
 			break;
+		case PRINT_ASCIIDOC:
+			return "asciidoc";
+			break;
 	}
 	return "unknown";
 }
@@ -2316,9 +2319,11 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet)
 			popt->topt.format = PRINT_LATEX_LONGTABLE;
 		else if (pg_strncasecmp("troff-ms", value, vallen) == 0)
 			popt->topt.format = PRINT_TROFF_MS;
+		else if (pg_strncasecmp("asciidoc", value, vallen) == 0)
+			popt->topt.format = PRINT_ASCIIDOC;
 		else
 		{
-			psql_error("\\pset: allowed formats are unaligned, aligned, wrapped, html, latex, troff-ms\n");
+			psql_error("\\pset: allowed formats are unaligned, aligned, wrapped, html, latex, troff-ms, asciidoc\n");
 			return false;
 		}
 
diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index 6035a77..66da6ec 100644
--- a/src/bin/psql/help.c
+++ b/src/bin/psql/help.c
@@ -351,7 +351,7 @@ helpVariables(unsigned short int pager)
 	fprintf(output, _("  expanded (or x)    toggle expanded output\n"));
 	fprintf(output, _("  fieldsep           field separator for unaligned output (default '|')\n"));
 	fprintf(output, _("  fieldsep_zero      set field separator in unaligned mode to zero\n"));
-	fprintf(output, _("  format             set output format [unaligned, aligned, wrapped, html, latex, ..]\n"));
+	fprintf(output, _("  format             set output format [unaligned, aligned, wrapped, html, latex, asciidoc ..]\n"));
 	fprintf(output, _("  footer             enable or disable display of the table footer [on, off]\n"));
 	fprintf(output, _("  linestyle          set the border line drawing style [ascii, old-ascii, unicode]\n"));
 	fprintf(output, _("  null               set the string to be printed in place of a null value\n"));
diff --git a/src/bin/psql/print.c b/src/bin/psql/print.c
index 3b3c3b7..956bbb1 100644
--- a/src/bin/psql/print.c
+++ b/src/bin/psql/print.c
@@ -2475,6 +2475,200 @@ print_troff_ms_vertical(const printTableContent *cont, FILE *fout)
 	}
 }
 
+/*/
+/* ASCIIDOC **/

Re: [HACKERS] proposal: plpgsql - Assert statement

2014-09-17 Thread Peter Eisentraut
On 9/17/14 3:04 PM, Pavel Stehule wrote:
 What is difference between content of variable or content of database?
 You can test any prerequisite, but when this prerequisite is not solved,
 than exception is very very hard without possible handling.

If the assertion tests arbitrary Boolean expressions, then we can't stop
the user from abusing them.

But it's another thing if we design specific syntax that encourages such
abuse, as proposed earlier.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-17 Thread Simon Riggs
On 15 September 2014 22:13, Simon Riggs si...@2ndquadrant.com wrote:
 On 15 September 2014 17:09, Robert Haas robertmh...@gmail.com wrote:

 Do we really want to disable HOT for all catalog scans?

 The intention of the patch is that catalog scans are treated
 identically to non-catalog scans. The idea here is that HOT cleanup
 only occurs on scans on target relations, so only INSERT, UPDATE and
 DELETE do HOT cleanup.

Since INSERT, UPDATE and DELETE can only be called when
!RecoveryInProgress(), we can completely avoid making this test at the
top of each heap_page_prune_opt() call.

I very much like the simplicity of saying no target, no cleanup.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: plpgsql - Assert statement

2014-09-17 Thread Pavel Stehule
2014-09-17 21:36 GMT+02:00 Peter Eisentraut pete...@gmx.net:

 On 9/17/14 3:04 PM, Pavel Stehule wrote:
  What is difference between content of variable or content of database?
  You can test any prerequisite, but when this prerequisite is not solved,
  than exception is very very hard without possible handling.

 If the assertion tests arbitrary Boolean expressions, then we can't stop
 the user from abusing them.


I am thinking that an unhandled signal can be a good defence (together with
the possibility to disable assertions).

We design a database system, so we should reflect that - plpgsql (or any
PL environment) is not a classic language. There are a lot of
database-specific constructs.


 But it's another thing if we design specific syntax that encourages such
 abuse, as proposed earlier.


On another note - I think ANSI SQL assertions and PL assertions are
independent features, although they can have some common goals.


Re: [HACKERS] Anonymous code block with parameters

2014-09-17 Thread Vik Fearing
On 09/16/2014 10:09 AM, Heikki Linnakangas wrote:
 On 09/16/2014 10:57 AM, Craig Ringer wrote:
 On 09/16/2014 03:15 PM, Pavel Stehule wrote:

 Why we don't introduce a temporary functions instead?

 I think that'd be a lot cleaner and simpler. It's something I've
 frequently wanted, and as Hekki points out it's already possible by
 creating the function in pg_temp, there just isn't the syntax sugar for
 CREATE TEMPORARY FUNCTION.

 So why not just add CREATE TEMPORARY FUNCTION?
 
 Sure, why not.

Because you still have to do

SELECT pg_temp.my_temp_function(blah);

to execute it.

 It means two steps:

 CREATE TEMPORARY FUNCTION ... $$ $$;

 SELECT my_temp_function(blah);

That won't work; see above.
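
(For reference, a minimal sketch of the pg_temp workaround being discussed;
the function name and body are invented:)

CREATE FUNCTION pg_temp.add_one(i int) RETURNS int
    LANGUAGE sql AS $$ SELECT i + 1 $$;

-- the call has to be schema-qualified; the function is dropped at session end
SELECT pg_temp.add_one(41);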
-- 
Vik


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Anonymous code block with parameters

2014-09-17 Thread Pavel Stehule
2014-09-17 22:07 GMT+02:00 Vik Fearing vik.fear...@dalibo.com:

 On 09/16/2014 10:09 AM, Heikki Linnakangas wrote:
  On 09/16/2014 10:57 AM, Craig Ringer wrote:
  On 09/16/2014 03:15 PM, Pavel Stehule wrote:
 
  Why we don't introduce a temporary functions instead?
 
  I think that'd be a lot cleaner and simpler. It's something I've
  frequently wanted, and as Hekki points out it's already possible by
  creating the function in pg_temp, there just isn't the syntax sugar for
  CREATE TEMPORARY FUNCTION.
 
  So why not just add CREATE TEMPORARY FUNCTION?
 
  Sure, why not.

 Because you still have to do

 SELECT pg_temp.my_temp_function(blah);

 to execute it.


this problem should be solvable. I can use temporary tables without
schema-qualifying them with pg_temp.

Pavel



  It means two steps:
 
  CREATE TEMPORARY FUNCTION ... $$ $$;
 
  SELECT my_temp_function(blah);

 That won't work; see above.
 --
 Vik



Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Peter Geoghegan
On Wed, Sep 17, 2014 at 7:46 AM, Greg Stark st...@mit.edu wrote:
 You could have a problem if you have an expression index on (timestamp
 AT TIME ZONE '...'). I may have the expression slightly wrong but I
 believe it is posisble to write an immutable expression that depends
 on the tzdata data as long as it doesn't depend on not the user's
 current time zone (which would be stable but not immutable). The
 actual likelihood of that situation might be much lower and the
 ability to avoid it higher but in theory I think Peter's right that
 it's the same class of problem.

Really? If you have an IMMUTABLE expression predicated on something
that isn't immutable, then it isn't actually immutable, regardless of
the ability of Postgres to determine that. It's that simple. So, if
that's true, then it sounds like a bug to me. I didn't check, but I
think that AT TIME ZONE just changes the display format, and the
relevant output function is only stable, and so this shouldn't be a
problem.

 Peter G, could go into more detail about collation versioning? What
 would the implications be for Postgres?

Well, I think one implication might be that Postgres won't start, just
because you updated the OS (we have no way to get back results
consistent with the old collation file, I would think). At least your
database isn't corrupt, but that's still a really bad loss of
availability. This makes me lean towards pursuing ICU support as part
of any versioning scheme. There is a reason why everyone else does
something similar. Apparently DB2 uses ICU.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Final Patch for GROUPING SETS

2014-09-17 Thread Marti Raudsepp
On Fri, Sep 12, 2014 at 9:41 PM, Andrew Gierth
and...@tao11.riddles.org.uk wrote:
 gsp1.patch - phase 1 code patch (full syntax, limited functionality)
 gsp2.patch - phase 2 code patch (adds full functionality using the
  new chained aggregate mechanism)

I gave these a try by converting my current CTE-based queries into
CUBEs and it works as expected; query time is cut in half and lines of
code is 1/4 of original. Thanks!

I only have a few trivial observations; if I'm getting too nitpicky
let me know. :)


Since you were asking for feedback on the EXPLAIN output on IRC, I'd
weigh in and say that having the groups on separate lines would be
significantly more readable. It took me a while to understand what's
going on in my queries due to longer table and column names and
wrapping; The comma separators between groups are hard to distinguish.
If that can be made to work with the EXPLAIN printer without too much
trouble.

So instead of:
 GroupAggregate
   Output: four, ten, hundred, count(*)
   Grouping Sets: (onek.four, onek.ten, onek.hundred), (onek.four,
onek.ten), (onek.four), ()

Perhaps print:
   Grouping Sets: (onek.four, onek.ten, onek.hundred)
  (onek.four, onek.ten)
  (onek.four)
  ()

Or maybe:
   Grouping Set: (onek.four, onek.ten, onek.hundred)
   Grouping Set: (onek.four, onek.ten)
   Grouping Set: (onek.four)
   Grouping Set: ()

Both seem to work with the explain.depesz.com parser, although the 1st
won't be aligned as nicely.


Do you think it would be reasonable to normalize single-set grouping
sets into a normal GROUP BY? Such queries would be capable of using
HashAggregate, but the current code doesn't allow that. For example:

set enable_sort=off;
explain select two, count(*) from onek group by grouping sets (two);
Could be equivalent to:
explain select two, count(*) from onek group by two;


I'd expect GROUP BY () to be fully equivalent to having no GROUP BY
clause, but there's a difference in explain output. The former
displays Grouping Sets: () which is odd, since none of the grouping
set keywords were used.

# explain select count(*) from onek group by ();
 Aggregate  (cost=77.78..77.79 rows=1 width=0)
   Grouping Sets: ()
   ->  Index Only Scan using onek_stringu1 on onek  (cost=0.28..75.28
rows=1000 width=0)

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Tatsuo Ishii
 On 9/17/14 10:47 AM, Tatsuo Ishii wrote:
 Why don't we have our collation data? It seems MySQL has already done this.
 
 Where would you get the source data from?  How would you maintain it?

Don't know. However, seeing that MySQL manages it, it should be
possible for us.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.5] Custom Plan API

2014-09-17 Thread Kouhei Kaigai
   Why does it need to know that?  I don't see that it's doing
   anything that requires knowing the size of that node, and if it is,
   I think it shouldn't be.  That should get delegated to the callback
   provided by the custom plan provider.
  
   Sorry, my explanation might be confusable. The create_custom_scan()
   does not need to know the exact size of the CustomScan (or its
   inheritance) because of the two separated hooks; one is node
   allocation time, the other is the tail of the series of initialization.
   If we have only one hook here, we need to have a mechanism to
   informs
   create_custom_scan() an exact size of the CustomScan node; including
   private fields managed by the provider, instead of the first hook on
   node allocation time. In this case, node allocation shall be
   processed by create_custom_scan() and it has to know exact size of
   the node to be
  allocated.
  
   How do I implement the feature here? Is the combination of static
   node size and callback on the tail more simple than the existing
   design that takes two individual hooks on create_custom_scan()?
 
  I still don't get it.  Right now, the logic in create_custom_scan(),
  which I think should really be create_custom_plan() or
  create_plan_from_custom_path(), basically looks like this:
 
  1. call hook function CreateCustomPlan
  2. compute values for tlist and clauses
  3. pass those values to hook function InitCustomScan()
  4. call copy_path_costsize
 
  What I think we should do is:
 
  1. compute values for tlist and clauses
  2. pass those values to hook function PlanCustomPath(), which will return a Plan
  3. call copy_path_costsize
 
  create_custom_scan() does not need to allocate the node.  You don't
  need the node to be allocated before computing tlist and clauses.
 
 Thanks, I could get the point.
 I'll revise the patch according to the suggestion above.
 
At this moment, I have revised the above portion of the patches.
create_custom_plan() was modified to call the PlanCustomPath callback
right after the initialization of tlist and clauses.

It's probably the same as what you suggested.

 It seems to me, we can also apply similar manner on ExecInitCustomScan().
 The current implementation doing is:
 1. call CreateCustomScanState() to allocate a CustomScanState node
 2. common initialization of the fields on CustomScanState, but not private fields.
 3. call BeginCustomScan() to initialize remaining stuffs and begin execution.
 
 If BeginCustomScan() is re-defined to accept values for common
 initialization portions and to return a CustomScanState node, we may be
 able to eliminate the CreateCustomScanState() hook.
 
 Unlike create_custom_scan() case, it takes more number of values for common
 initialization portions; expression tree of tlist and quals, scan and result
 tuple-slot, projection info and relation handler. It may mess up the
 interface specification.
 In addition, BeginCustomScan() has to belong to CustomScanMethods, not
 CustomexecMethods. I'm uncertain whether it is straightforward location.
 (a whisper: It may not need to be separate tables. CustomScan always
 populates CustomScanState, unlike relationship between CustomPath and
 CustomScan.)
 
 How about your opinion to apply the above manner on ExecInitCustomScan()
 also?
 
I have kept the existing implementation around ExecInitCustomScan() for now.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com


pgsql-v9.5-custom-scan.part-3.v10.patch
Description: pgsql-v9.5-custom-scan.part-3.v10.patch


pgsql-v9.5-custom-scan.part-2.v10.patch
Description: pgsql-v9.5-custom-scan.part-2.v10.patch


pgsql-v9.5-custom-scan.part-1.v10.patch
Description: pgsql-v9.5-custom-scan.part-1.v10.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Tatsuo Ishii
 On Wed, Sep 17, 2014 at 3:47 PM, Tatsuo Ishii is...@postgresql.org wrote:
 I don't think we cannot achieve that because even MySQL accomplishes:-)
 
 We've always considered it an advantage that we're consistent with the
 collations in the rest of the system. Generally speaking the fact that
 Postgres integrates with the system rather than be a separate system
 unto itself.

 Consider bug reports like I've configured my system to use
 fr_FR.UTF-8 and sort produces output in this order why is Postgres
 producing output in a different order? Or extension authors using
 strcoll and being surprised that the module gets inconsistent data
 from the database.

I doubt it. glibc takes the liberty to change the collation data release
by release, but people don't seem to complain about it. Then why would
people complain about a collation difference between PostgreSQL and glibc,
if there is any?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Robert Haas
On Wed, Sep 17, 2014 at 10:06 AM, Matthew Kelly mke...@tripadvisor.com wrote:
 Let me double check that assertion before we go too far with it.

 Most of the problems I've seen are across 5 and 6 boundaries.  I thought I 
 had case where it was within a minor release but I can't find it right now.  
 I'm going to dig.

 That being said the sort order changes whether you statically or dynamically 
 link (demonstrated on 4+ machines running different linux flavors), so at the 
 point I have no reason to trust the stability of the sort across any build.  
 I legitimately question whether strcoll is buggy.  Ex. I have cases where for 
 three strings a, b and c:  a < b, but (a || c) > (b || c).  That's right 
 postfixing doesn't hold.  It actually calls into question the index scan 
 optimization that occurs when you do LIKE 'test%' even on a single machine, 
 but I don't want to bite that off at the moment.

 My mentality has switched to 'don't trust any change until shown otherwise', 
 so that may have bled into my last email.

Of course, there's also the question of whether ICU would have similar
issues.  You're assuming that they *don't* whack the collation order
around in minor releases, or at least that they do so to some lesser
degree than glibc, but is that actually true?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Final Patch for GROUPING SETS

2014-09-17 Thread Josh Berkus
On 09/17/2014 03:02 PM, Marti Raudsepp wrote:
 So instead of:
  GroupAggregate
Output: four, ten, hundred, count(*)
Grouping Sets: (onek.four, onek.ten, onek.hundred), (onek.four,
 onek.ten), (onek.four), ()
 
 Perhaps print:
Grouping Sets: (onek.four, onek.ten, onek.hundred)
   (onek.four, onek.ten)
   (onek.four)
   ()

So:

  Grouping Sets: [
[ onek.four, onek.ten, onek.hundred ],
[ onek.four, onek.ten ],
[ onek.four ],
[]
]

.. in JSON?

Seems to me that we need a better way to display the grand total
grouping set.

 
 Or maybe:
Grouping Set: (onek.four, onek.ten, onek.hundred)
Grouping Set: (onek.four, onek.ten)
Grouping Set: (onek.four)
Grouping Set: ()

The latter won't work with JSON and YAML output.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Immediate standby promotion

2014-09-17 Thread Robert Haas
On Wed, Sep 17, 2014 at 7:23 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 14 August 2014 20:27, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Aug 14, 2014 at 10:12 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Fujii Masao masao.fu...@gmail.com writes:
 I'd like to propose to add new option --immediate to pg_ctl promote.
 When this option is set, recovery ignores any WAL data which have not
 been replayed yet and exits immediately. Patch attached.

 This promotion is faster than normal one but can cause data loss.

 TBH, I cannot imagine a situation where that would be a sane thing to do.
 If you have WAL, why would you not replay what you have?  The purpose
 of a database is to preserve your data, not randomly throw it away.

 I've wanted this a number of times, so I think it's quite sane.

 fast promotion was actually a supported option in r8 of Postgres but
 this option was removed when we implemented streaming replication in
 r9.0

 The *rough* requirement is sane, but that's not the same thing as
 saying this exact patch makes sense.

Granted.  Fair point.

 If you are paused and you can see that WAL up ahead is damaged, then
 YES, you do want to avoid applying it. That is possible by setting a
 PITR target so that recovery stops at a precise location specified by
 you. As an existing option is it better than the blunt force trauma
 suggested here.

You can pause at a recovery target, but then what if you want to go
read/write at that point?  Or what if you've got a time-delayed
standby and you want to break replication so that it doesn't replay
the DROP TABLE students that somebody ran on the master?  It doesn't
have to be that WAL is unreadable or corrupt; it's enough for it to
contain changes you wish to avoid replaying.

 If you really don't care, just shutdown server, resetxlog and start
 her up - again, no need for new option.

To me, being able to say pg_ctl promote_right_now -m yes_i_mean_it
seems like a friendlier interface than making somebody shut down the
server, run pg_resetxlog, and start it up again.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Peter Geoghegan
On Wed, Sep 17, 2014 at 5:16 PM, Robert Haas robertmh...@gmail.com wrote:
 Of course, there's also the question of whether ICU would have similar
 issues.  You're assuming that they *don't* whack the collation order
 around in minor releases, or at least that they do so to some lesser
 degree than glibc, but is that actually true?

No, but they're disciplined about it. They clearly do versioning
properly, which seems to not be the case with glibc, based on Peter's
remarks: http://userguide.icu-project.org/collation/architecture#TOC-Versioning
(they talk about a 32-bit identifier here).

PostgreSQL's problems in this area are exactly the same as every other
database system's (the Unicode consortium anticipated these problems
too, and as I pointed out have commented on these problems.). A bunch
of prominent database systems are listed as using ICU.
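
(For reference, a minimal sketch of how a collation version can be read back
through ICU's C API - the locale name is arbitrary, and this is not PostgreSQL
code; build with something like cc demo.c -licuuc -licui18n:)

#include <stdio.h>
#include <unicode/ucol.h>
#include <unicode/uversion.h>

int
main(void)
{
	UErrorCode	status = U_ZERO_ERROR;
	UCollator  *coll = ucol_open("fr_FR", &status);
	UVersionInfo version;
	char		buf[U_MAX_VERSION_STRING_LENGTH];

	if (U_FAILURE(status))
		return 1;

	/* the version a database could store alongside anything sorted with it */
	ucol_getVersion(coll, version);
	u_versionToString(version, buf);
	printf("collator version for fr_FR: %s\n", buf);

	ucol_close(coll);
	return 0;
}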
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: plpgsql - Assert statement

2014-09-17 Thread Jan Wieck

On 09/17/2014 03:36 PM, Peter Eisentraut wrote:

On 9/17/14 3:04 PM, Pavel Stehule wrote:

What is difference between content of variable or content of database?
You can test any prerequisite, but when this prerequisite is not solved,
than exception is very very hard without possible handling.


If the assertion tests arbitrary Boolean expressions, then we can't stop
the user from abusing them.


Exactly. Doing something like

ASSERT (select count(*) from foo
where fk not in (select pk from bar)) = 0;

is a perfectly fine, arbitrary boolean expression. It will probably work 
well in a development environment too. And I am very sure that it will 
not scale well once that code gets deployed. And I know how DBAs react 
to the performance problem that is guaranteed to follow. They will disable ALL 
assert ... or was there some sort of assert class system proposed that I 
missed?




But it's another thing if we design specific syntax that encourages such
abuse, as proposed earlier.


The design should explicitly discourage that sort of nonsense.


Jan


--
Jan Wieck
Senior Software Engineer
http://slony.info


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Windows,PATCH] Use faster, higher precision timer API

2014-09-17 Thread Craig Ringer
On 09/17/2014 11:19 PM, Andrew Dunstan wrote:
 
 On 09/17/2014 08:27 AM, Craig Ringer wrote:
 Hi all
 On Windows 2012 and Windows 8 I'd like to use the new
 GetSystemTimePreciseAsFileTime call instead. As this requires some extra
 hoop-jumping to safely and efficiently use it without breaking support
 for older platforms I suggest that we start with just switching over to
 GetSystemTimeAsFileTime, which has been supported since Windows 2000.
 Then more precise time capture can be added in a later patch.

 That will presumably breaK XP.

Yes, and Windows 7. But this patch doesn't do that, it just makes
adjustments that make it easier.

The next step is to use LoadLibrary and GetProcAddress to resolve
GetSystemTimePreciseAsFileTime *if it is available*, during backend
start. Then use it if possible, and fall back to GetSystemTimeAsFileTime
if it isn't.
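
(A rough sketch of the shape of that fallback - this is not the submitted
patch, and the helper names are invented:)

#include <windows.h>

/* Both calls share this signature; the precise variant only exists on
 * Windows 8 / Server 2012 and later, so it has to be resolved at runtime. */
typedef VOID (WINAPI *PgGetSystemTimeFn) (LPFILETIME);

static PgGetSystemTimeFn pg_get_system_time = GetSystemTimeAsFileTime;

static void
init_system_time(void)
{
	HMODULE kernel32 = GetModuleHandle(TEXT("kernel32.dll"));

	if (kernel32 != NULL)
	{
		PgGetSystemTimeFn precise = (PgGetSystemTimeFn)
			GetProcAddress(kernel32, "GetSystemTimePreciseAsFileTime");

		if (precise != NULL)
			pg_get_system_time = precise;
	}
}

/* Callers then use pg_get_system_time(&ft).  FILETIME counts 100 ns ticks
 * since 1601-01-01, so converting to the Unix epoch means subtracting the
 * usual 11644473600-second offset. */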

This patch does not introduce any BC changes. At all. I should've
omitted all mention of the next step I want to take, but I thought it
was a useful explanation of why this change makes a bigger improvement
easier.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Windows,PATCH] Use faster, higher precision timer API

2014-09-17 Thread Craig Ringer
On 09/18/2014 12:58 AM, Andrew Dunstan wrote:
 
 Oh, hmm, yes, you're right. For some reason I was thinking W2K was later
 than XP. I get more random memory errors as I get older ...

It's because people say Win2k3 / Win2k8 / Win2k8r2 / Win2k12 a lot as
shorthand for Windows Server 2003 (XP-based), Windows Server 2008 (Vista
based), Windows Server 2008 R2 (Windows 7 based) and Windows Server 2012
(Windows 8 based) respectively.

Win2k is just Windows 2000, the release before Windows XP, released in
December 1999. Needless to say, if it's compatible even as far back as
Win2k it's not going to worry anybody.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Protocol TODO: Identify server charset in handshake

2014-09-17 Thread Craig Ringer
Hi all

In the wire protocol, currently if you get an error from the server
before you know it's processed your startup packet successfully you
don't know what character encoding that error is in.

If the error came from the postmaster then it's in the system's default
encoding or whatever locale the postmaster was started under.

If the error came from the DB backend it's in the DB backend's default
encoding, switched to during backend startup. Assuming it got that far.

If the error came after client_encoding was applied, it's in your
requested client_encoding.

This leaves the client unable to reliably interpret error messages. The
4.1 protocol should probably explicitly signal the encoding in the first
message from the server, and thereafter whenever it changes.

(This is somewhat related to the mess we make of text encodings in the
log files, where the postmaster writes in one encoding and DB backends
write in another).


Example psql session, in a terminal with en_AU.UTF-8 locale, connecting
to a postmaster started with:



$ LC_ALL=ru_RU.ISO-8859-5 LANG=ru_RU.ISO-8859-5 PATH=$HOME/pg/pg94/bin
postgres -D pg94_ru -p 9595

$ locale
LANG=en_AU.UTF-8
LC_CTYPE=en_AU.UTF-8
...
LC_ALL=

$ psql -p 9595
psql: �:   �� craig �� ��

$ psql -q -p 9595 postgres
postgres=# \c nosuch
�:   �� nosuch �� ��
Previous connection kept
postgres=# select indb_error();
ОШИБКА:  функция indb_error() не существует
LINE 1: select indb_error();
   ^
HINT:  Функция с данными именем и типами аргументов не найдена.
Возможно, вам следует добавить явные преобразования типов.



Note the garbage where psql happily dumps an ISO-8859-5 message to the
terminal because it has no way of knowing it isn't in the current
client_encoding, and no way of telling what encoding it is anyway.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Oleg Bartunov
We have used ICU with Postgres for many years in our mchar extension, which
provides a case-insensitive text data type for a popular Russian financial
system.  I don't know if we may ask ICU to give us a special BSD-compatible
license?


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Peter Geoghegan
On Wed, Sep 17, 2014 at 9:06 PM, Oleg Bartunov obartu...@gmail.com wrote:
 We use ICU with postgres for many years in our  mchar extension, which
 provides  case-insensitive  text data type for popular  russian financial
 system.  I don't know if we may ask ICU to give us special BSD-compatible
 license ?

I don't think that's necessary. Firebird uses ICU, and has similar
licensing terms to PostgreSQL.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Craig Ringer
On 09/17/2014 09:17 PM, Robert Haas wrote:
 What I find astonishing is that whoever maintains glibc (or the Red
 Hat packaging for it) thinks it's OK to change the collation order in
 a minor release.  I'd understand changing it between, say, RHEL 6 and
 RHEL 7.  But the idea that minor release, supposedly safe updates
 think they can whack this around without breaking applications really
 kind of blows my mind.


If confirmed, it certainly requires some ... firm ... bug reports.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Oleg Bartunov
On Thu, Sep 18, 2014 at 1:09 PM, Peter Geoghegan p...@heroku.com wrote:

 On Wed, Sep 17, 2014 at 9:06 PM, Oleg Bartunov obartu...@gmail.com
 wrote:
  We use ICU with postgres for many years in our  mchar extension, which
  provides  case-insensitive  text data type for popular  russian financial
  system.  I don't know if we may ask ICU to give us special BSD-compatible
  license ?

 I don't think that's necessary. Firebird uses ICU, and has similar
 licensing terms to PostgreSQL.


Firebird uses MPL license, which is compatible with GPL
http://en.wikipedia.org/wiki/Mozilla_Public_License


 --
 Peter Geoghegan



Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Tatsuo Ishii
 On Wed, Sep 17, 2014 at 9:06 PM, Oleg Bartunov obartu...@gmail.com wrote:
 We use ICU with postgres for many years in our  mchar extension, which
 provides  case-insensitive  text data type for popular  russian financial
 system.  I don't know if we may ask ICU to give us special BSD-compatible
 license ?
 
 I don't think that's necessary. Firebird uses ICU, and has similar
 licensing terms to PostgreSQL.

http://source.icu-project.org/repos/icu/icu/trunk/license.html


 Permission is hereby granted, free of charge, to any person obtaining
 a copy of this software and associated documentation files (the
 "Software"), to deal in the Software without restriction, including
 without limitation the rights to use, copy, modify, merge, publish,
 distribute, and/or sell copies of the Software, and to permit persons
 to whom the Software is furnished to do so, provided that the above
 copyright notice(s) and this permission notice appear in all copies
 of the Software and that both the above copyright notice(s) and this
 permission notice appear in supporting documentation.


In my understanding, PostgreSQL's manual MUST include the ICU license
terms (this is not a problem).  What I am not so sure about is whether any
software that uses PostgreSQL MUST also include the ICU license or not.
If yes, I think this is surely a problem.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Peter Geoghegan
On Wed, Sep 17, 2014 at 9:35 PM, Tatsuo Ishii is...@postgresql.org wrote:
 In my understanding PostgreSQL's manual MUST include the ICU license
 term (this is not a problem).  What I am not so sure is, any software
 uses PostgreSQL also MUST include the ICU license or not. If yes, I
 think this is surely a problem.


It uses the MIT license, which is less restrictive than even the BSD
license. I believe one part (the Chinese/Japanese Word Break
Dictionary Data) is BSD Licensed, though:

https://ssl.icu-project.org/repos/icu/icu/trunk/license.html

I don't think licensing would be a problem.
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers