Re: [HACKERS] KNN-GiST with recheck

2014-09-17 Thread Emre Hasegeli
  While looking at it I found a bug.  It returns the second column
  in the wrong order when both of the distance functions return recheck = true.
  Test script attached to run on the regression database.  I tried to
  fix it but could not.  The searchTreeItemDistanceRecheck function is not
  very easy to follow.  I think it deserves more comments.
 
 Fixed, thanks. It was a logical error in the comparison function implementation.

I managed to break it again by ordering rows only by the second column
of the index.  Test script attached.
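
Roughly, the shape of the query is this (the table and index here are made
up for illustration; the actual script is in the attachment):

create table pts (p1 point, p2 point);
create index pts_gist_idx on pts using gist (p1, p2);

select * from pts order by p2 <-> point '(0,0)' limit 10;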


knn-gist-recheck-test-secondcolumn.sql
Description: application/sql



Re: [HACKERS] KNN-GiST with recheck

2014-09-17 Thread Emre Hasegeli
 I managed to break it again by ordering rows only by the second column
 of the index.  Test script attached.

I was confused.  It is undefined behavior.  Sorry for the noise.




[HACKERS] Patch to Remove INNER JOINs

2014-09-17 Thread David Rowley
Over here -
http://www.postgresql.org/message-id/caaphdvqd99i2eesy6phueo8cmkkudhenzsa-edamswszhu2...@mail.gmail.com
I posted a patch to add support for removing SEMI and ANTI joins, where the
join could be proved useless by the existence of a foreign key.

The patch was part of my incremental work with the end goal of allowing
INNER JOIN removal using similar logic. Though it seems that nobody was
interested in the extra code for such a small use case as semi/anti join
removal.

Over here -
http://www.postgresql.org/message-id/2261.1409163...@sss.pgh.pa.us Tom said
he'd rather see the INNER JOIN removal stuff working before getting all the
required infrastructure into the code base... (or at least that's how I
understood it) I then went off and started working on the INNER JOIN
removal stuff and posted it here -
http://www.postgresql.org/message-id/CAApHDvpDXXvKE+=ug1kA--nKfa=bjrjvk8gp9g8uywv6nhc...@mail.gmail.com
, so there's now little point in continuing the discussion on the thread
that was meant for semi/anti join removals. Hence I'm starting a new
one to talk about the INNER JOIN removal patch. I'm hoping this subject
will generate enough interest for a review at some point.

I've attached a patch that I've done a bit more work on based on a few
comments gathered from the semi/anti join thread.

For anyone who hasn't clicked the links above, here's a demo of what the
patch does.

test=# create table c (id int primary key);
CREATE TABLE
test=# create table b (id int primary key, c_id int not null references
c(id));
CREATE TABLE
test=# create table a (id int primary key, b_id int not null references
b(id));
CREATE TABLE
test=#
test=# explain select a.* from a inner join b on a.b_id = b.id inner join c
on b.c_id = c.id;
 QUERY PLAN
------------------------------------------------------
 Seq Scan on a  (cost=0.00..31.40 rows=2140 width=8)
 Planning time: 1.061 ms
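
For comparison, a case where I wouldn't expect the join to be removed, since
columns from b are actually needed by the query (same tables as above):

test=# explain select a.*, b.c_id from a inner join b on a.b_id = b.id;

Here b still has to be scanned, so only joins whose relation contributes
nothing to the query result (and is proved to always find a match via the
foreign key) are candidates for removal.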

Regards

David Rowley


inner_join_removals_2014-09-17_d17e608.patch
Description: Binary data



Re: [HACKERS] KNN-GiST with recheck

2014-09-17 Thread Alexander Korotkov
On Wed, Sep 17, 2014 at 12:30 PM, Emre Hasegeli e...@hasegeli.com wrote:

  I managed to break it again by ordering rows only by the second column
  of the index.  Test script attached.

 I was confused.  It is undefined behavior.  Sorry for the noise.


No problem. Thanks a lot for testing.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] [TODO] Process pg_hba.conf keywords as case-insensitive

2014-09-17 Thread Kyotaro HORIGUCHI
Hi,

At Thu, 11 Sep 2014 08:10:54 -0400, Robert Haas robertmh...@gmail.com wrote 
in ca+tgmoz9xinc_ca23-p1dmihmv0zhckef6_rv6v3s+oxrla...@mail.gmail.com
 On Wed, Sep 10, 2014 at 4:54 AM, Kyotaro HORIGUCHI
 horiguchi.kyot...@lab.ntt.co.jp wrote:
  Finally I think that we need case-insensitive versions of
  get_role_id() and get_database_id() to accomplish this patch's
  objective. (This runs full-scans on pg_database or pg_authid X(
 
 Any such thing is certainly grounds for rejecting the patch outright.
 It may be that pg_hba.conf should follow the same case-folding rules
 we use elsewhere, but it should not invent novel semantics, especially
 ones that make connecting to the database a far more expensive
 operation than it is today.

No wonder.  I wondered why such things were needed for this
'case-insensitive matching'.  I had misunderstood the meaning of
'case-insensitive'.  There's no need to scan the catalogs for
'case-insensitive' matching.  Thank you for the suggestion.

 - Non-quoted names are matched against the names in the catalog
   after being lowercased.

 - Quoted names are matched as is.

This is achieved by simply downcasing the identifier unless it is in
quoted notation, and removing the case-insensitive versions of the
catalog lookup functions.
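
For reference, that follows the same folding rule as SQL identifiers; a
small sketch with made-up role names (the names written in pg_hba.conf
would be matched the same way):

  CREATE ROLE MyUser;      -- unquoted: folded, stored as myuser
  CREATE ROLE "MyUser";    -- quoted: stored as MyUser, case preserved
  SELECT rolname FROM pg_roles WHERE rolname IN ('myuser', 'MyUser');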

I'll post a more reasonable version soon.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] Escaping from blocked send() reprised.

2014-09-17 Thread Kyotaro HORIGUCHI
Sorry for the mistake...

At Wed, 10 Sep 2014 18:53:03 +0300, Heikki Linnakangas 
hlinnakan...@vmware.com wrote in 541073df.70...@vmware.com
 Wrong thread...
 
 On 09/10/2014 03:04 AM, Kyotaro HORIGUCHI wrote:
  Hmm. Sorry, I misunderstood the specification.
 
  Your approach of coloring tokens seems right, but you have
  broken the parse logic with the added code.

  Other than the mistakes others pointed out, I found that

  - non-SQL-ident-like tokens are matched ignoring their token style,
 quoted or not, so the following line works.
 
  | local All aLL trust

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




[HACKERS] Join consolidation / Removing duplicate joins

2014-09-17 Thread David Rowley
I've been looking into improving the performance of queries that have
correlated subqueries, such as:

SELECT id,(SELECT value FROM t2 WHERE t2.id = t1.id) FROM t1;

Where currently we produce a plan that executes the subquery as a sub plan,
like:

QUERY PLAN
--------------------------------------------------------------------------
 Seq Scan on t1  (cost=0.00..8456925.00 rows=100 width=4) (actual
time=0.139..4071.598 rows=100 loops=1)
   SubPlan 1
 ->  Index Scan using t2_pkey on t2  (cost=0.42..8.44 rows=1 width=4)
(actual time=0.003..0.003 rows=1 loops=100)
   Index Cond: (id = t1.id)
 Planning time: 0.169 ms
 Execution time: 4107.809 ms

Though, if the subquery could be proved only to ever return 1 record, then
this could be re-written to become:

explain analyze SELECT t1.id,t2.value FROM t1 LEFT OUTER JOIN t2 ON t1.id =
t2.id;
QUERY PLAN
--------------------------------------------------------------------------
 Hash Left Join  (cost=30832.00..70728.00 rows=100 width=8) (actual
time=384.337..1452.387 rows=100 loops=1)
   Hash Cond: (t1.id = t2.id)
   ->  Seq Scan on t1  (cost=0.00..14425.00 rows=100 width=4) (actual
time=0.015..199.716 rows=100 loops=1)
   ->  Hash  (cost=14425.00..14425.00 rows=100 width=8) (actual
time=382.387..382.387 rows=100 loops=1)
 Buckets: 131072  Batches: 16  Memory Usage: 2463kB
 ->  Seq Scan on t2  (cost=0.00..14425.00 rows=100 width=8)
(actual time=0.010..179.911 rows=100 loops=1)
 Planning time: 0.396 ms
 Execution time: 1473.392 ms
(8 rows)

(notice the performance increase)

Of course, you might ask, why not just write the 2nd query in the first
place? Well, good idea, but it's not always that simple, take the following
update statement:

explain update t1
set value = (select value from t2 where t1.id=t2.id),
  value2 = (select value2 from t2 where t1.id = t2.id)
where exists(select 1 from t2 where t1.id=t2.id);

We end up with quite a few extra subqueries where 1 join would probably
have done the trick. We could have written this using the
UPDATE/FROM syntax, but if you like to write standard SQL then that might
not fly.
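
(For the record, the UPDATE/FROM form would look roughly like this, which is
the single-join shape it would be nice if the planner could reach by itself:)

update t1
set value = t2.value,
  value2 = t2.value2
from t2
where t1.id = t2.id;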

Anyway... I've been thinking of writing some code that converts these sub
plans into left joins where it can be proved that the subquery would
produce at most 1 row. But for a case like the UPDATE statement above, I
didn't really want the code to create a new join for each subplan that it
pulls into the outer query; it would be nice if the code could detect that
there's a suitable join there already, and make use of it.

I started to think about what an efficient way to do this might be, and
thought maybe it would be good if, on RelOptInfo or so, we could add a bool
flag named something like uniquejoin, which would be set to true if we
could detect that a unique index existed on the relation whose columns were
a subset of the join condition (similar to the requirement for LEFT JOIN
removals). We could then have a function that tried to pull up all these
subplans into the outer query; when it could detect that at most 1 subplan
row could exist for each outer plan row, the code could convert these to a
LEFT OUTER JOIN on the outer query.

A bit later in planning, once all the subqueries are planned or pulled up,
a join consolidation function could be run that merges all of these
duplicate joins into 1 join for each relation. It could do this by just
looking at relations that have this uniquejoin flag set to true, and then
digging deeper to ensure that the join conditions also match. If we also
went to the trouble of setting this flag for outer rels, and not just the
ones we add from this pullup operation, then we could also merge duplicate
INNER JOINs too.

Another example of where removing these duplicated joins could be useful is
when 2 views get joined that share a non-empty intersection of their
relations, where the join conditions are the same.
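
For example, with the t1/t2 tables from above, something like this ends up
with two identical t1/t2 joins that could in principle be merged into one
(assuming unique indexes on the join columns, as described above):

create view v1 as select t1.id, t2.value from t1 inner join t2 on t1.id = t2.id;
create view v2 as select t1.id, t2.value2 from t1 inner join t2 on t1.id = t2.id;

select * from v1 inner join v2 on v1.id = v2.id;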

I kind of thought that perhaps all of this extra processing to look
for unique indexes might just be more processing than it's worth, as the
success cases of join consolidation might be few, but then I started
looking at the executor code around merge and hash join. It seems that
there might be some performance gains in here too.  I see that with SEMI
joins we move to the next outer row once we've found 1 matching inner row.
I believe that with these uniquejoin relations we could also skip to
the next outer row, the same as with semi joins. I did some quick hacks in
the merge join code to test whether there was much performance to be had
here and found that, on a join condition involving 2 varchar fields of
about 16 chars, I could get the query to run in 97% of the time
(saving 1 comparison on the join 

Re: [HACKERS] Immediate standby promotion

2014-09-17 Thread Simon Riggs
On 14 August 2014 20:27, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Aug 14, 2014 at 10:12 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Fujii Masao masao.fu...@gmail.com writes:
 I'd like to propose to add new option --immediate to pg_ctl promote.
 When this option is set, recovery ignores any WAL data which have not
 been replayed yet and exits immediately. Patch attached.

 This promotion is faster than normal one but can cause data loss.

 TBH, I cannot imagine a situation where that would be a sane thing to do.
 If you have WAL, why would you not replay what you have?  The purpose
 of a database is to preserve your data, not randomly throw it away.

 I've wanted this a number of times, so I think it's quite sane.

fast promotion was actually a supported option in r8 of Postgres but
this option was removed when we implemented streaming replication in
r9.0

The *rough* requirement is sane, but that's not the same thing as
saying this exact patch makes sense.

If you are paused and you can see that WAL up ahead is damaged, then
YES, you do want to avoid applying it. That is possible by setting a
PITR target so that recovery stops at a precise location specified by
you. As an existing option it is better than the blunt force trauma
suggested here.

If you really don't care, just shut down the server, run pg_resetxlog and
start her up - again, no need for a new option.

Anyway, my view is that this is a blunt instrument to do something we
can already do.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Enable WAL archiving even in standby

2014-09-17 Thread Simon Riggs
On 13 August 2014 11:42, Fujii Masao masao.fu...@gmail.com wrote:

 I'd propose the attached WIP patch which allows us to enable WAL archiving
 even in standby. The patch adds always as the valid value of archive_mode.
 If it's set to always, the archiver is started when the server is in standby
 mode and all the WAL files that walreceiver wrote to the disk are archived by
 using archive_command. Then, even after the server is promoted to master,
 the archiver keeps archiving WAL files. The patch doesn't change the meanings
 of the setting values on and off of archive_mode.

Seems like an OK idea.

Perhaps it would be better to have a new parameter called
   archive_role = Origin | Replica | Always
so that we match up with how triggers/rules work,
rather than abuse archive_mode.

We can pass an attribute called %m (mode?) which tells the
archive_command whether we are Master or Standby when called.

Remember to bump the number of background procs by 1 since we
previously assumed the archiver would never run in recovery.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] removing volatile qualifiers from lwlock.c

2014-09-17 Thread Andres Freund
Hi,

On 2014-09-10 14:53:07 -0400, Robert Haas wrote:
 As discussed on the thread Spinlocks and compiler/memory barriers,
 now that we've made the spinlock primitives function as compiler
 barriers (we think), it should be possible to remove volatile
 qualifiers from many places in the source code.  The attached patch
 does this in lwlock.c.  If the changes in commit
 0709b7ee72e4bc71ad07b7120acd117265ab51d0 (and follow-on commits) are
 correct and complete, applying this shouldn't break anything, while
 possibly giving the compiler room to optimize things better than it
 does today.
 
 However, demonstrating the necessity of that commit for these changes
 seems to be non-trivial.  I tried applying this patch and reverting
 commits 5b26278822c69dd76ef89fd50ecc7cdba9c3f035,
 b4c28d1b92c81941e4fc124884e51a7c110316bf, and
 0709b7ee72e4bc71ad07b7120acd117265ab51d0 on a PPC64 POWER8 box with a
 whopping 192 hardware threads (thanks, IBM!).  I then ran the
 regression tests repeatedly, and I ran several long pgbench runs with
 as many as 350 concurrent clients.  No failures.

There's actually one more commit to revert. What I used was:
git revert 5b26278822c69dd76ef89fd50ecc7cdba9c3f035 \
b4c28d1b92c81941e4fc124884e51a7c110316bf \
68e66923ff629c324e219090860dc9e0e0a6f5d6 \
0709b7ee72e4bc71ad07b7120acd117265ab51d0

 So I'm posting this patch in the hope that others can help.  The
 relevant tests are:
 
 1. If you apply this patch to master and run tests of whatever kind
 strikes your fancy, does anything break under high concurrency?  If it
 does, then the above commits weren't enough to make this safe on your
 platform.
 
 2. If you apply this patch to master, revert the commits mentioned
 above, and again run tests, does anything now break?  If it does (but
 the first tests were OK), then that shows that those commits did
 something useful on your platform.

I just tried this on my normal x86 workstation. I applied your lwlock
patch and on top of it I removed most volatiles (there's a couple still
required) from xlog.c. Works for 100 seconds. Then I reverted the above
commits. Breaks within seconds:
master:
LOG:  request to flush past end of generated WAL; request 2/E5EC3DE0, currpos 
2/E5EC1E60
standby:
LOG:  record with incorrect prev-link 4/684C3108 at 4/684C3108
and similar.

So at least for x86 the compiler barriers are obviously required and
seemingly working.

I've attached the very quickly written xlog.c de-volatizing patch.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 2b68a134925e4b2fb6ff282f5ed83b8f57b10732 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Wed, 17 Sep 2014 13:21:20 +0200
Subject: [PATCH] xlog.c-remove-volatile

---
 src/backend/access/transam/xlog.c | 473 ++
 1 file changed, 176 insertions(+), 297 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 34f2fc0..103f077 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1219,16 +1219,13 @@ begin:;
 	 */
 	if (StartPos / XLOG_BLCKSZ != EndPos / XLOG_BLCKSZ)
 	{
-		/* use volatile pointer to prevent code rearrangement */
-		volatile XLogCtlData *xlogctl = XLogCtl;
-
-		SpinLockAcquire(&xlogctl->info_lck);
+		SpinLockAcquire(&XLogCtl->info_lck);
 		/* advance global request to include new block(s) */
-		if (xlogctl->LogwrtRqst.Write < EndPos)
-			xlogctl->LogwrtRqst.Write = EndPos;
+		if (XLogCtl->LogwrtRqst.Write < EndPos)
+			XLogCtl->LogwrtRqst.Write = EndPos;
 		/* update local result copy while I have the chance */
-		LogwrtResult = xlogctl->LogwrtResult;
-		SpinLockRelease(&xlogctl->info_lck);
+		LogwrtResult = XLogCtl->LogwrtResult;
+		SpinLockRelease(&XLogCtl->info_lck);
 	}
 
 	/*
@@ -1323,7 +1320,7 @@ static void
 ReserveXLogInsertLocation(int size, XLogRecPtr *StartPos, XLogRecPtr *EndPos,
 		  XLogRecPtr *PrevPtr)
 {
-	volatile XLogCtlInsert *Insert = &XLogCtl->Insert;
+	XLogCtlInsert *Insert = &XLogCtl->Insert;
 	uint64		startbytepos;
 	uint64		endbytepos;
 	uint64		prevbytepos;
@@ -1378,7 +1375,7 @@ ReserveXLogInsertLocation(int size, XLogRecPtr *StartPos, XLogRecPtr *EndPos,
 static bool
 ReserveXLogSwitch(XLogRecPtr *StartPos, XLogRecPtr *EndPos, XLogRecPtr *PrevPtr)
 {
-	volatile XLogCtlInsert *Insert = &XLogCtl->Insert;
+	XLogCtlInsert *Insert = &XLogCtl->Insert;
 	uint64		startbytepos;
 	uint64		endbytepos;
 	uint64		prevbytepos;
@@ -1696,7 +1693,7 @@ WaitXLogInsertionsToFinish(XLogRecPtr upto)
 	uint64		bytepos;
 	XLogRecPtr	reservedUpto;
 	XLogRecPtr	finishedUpto;
-	volatile XLogCtlInsert *Insert = &XLogCtl->Insert;
+	XLogCtlInsert *Insert = &XLogCtl->Insert;
 	int			i;
 
 	if (MyProc == NULL)
@@ -2131,16 +2128,11 @@ AdvanceXLInsertBuffer(XLogRecPtr upto, bool opportunistic)
 break;
 
 			/* Before waiting, get info_lck and update LogwrtResult */
-			{
-/* use volatile pointer 

[HACKERS] [Windows,PATCH] Use faster, higher precision timer API

2014-09-17 Thread Craig Ringer
Hi all

Attached is a patch to switch 9.5 over to using the
GetSystemTimeAsFileTime call instead of separate GetSystemTime and
SystemTimeToFileTime calls.

This patch is the first step in improving PostgreSQL's support for Windows
high(er) resolution time.

In addition to requiring one less call into the platform libraries, this
change permits capture of timestamps at up to 100ns precision, instead
of the current 1ms limit. Unfortunately due to platform timer resolution
limitations it will in practice only report with 1ms resolution and
0.1ms precision - or sometimes even as much as 15ms resolution. (If you
want to know more, see the README for
https://github.com/2ndQuadrant/pg_sysdatetime).

On Windows 2012 and Windows 8 I'd like to use the new
GetSystemTimePreciseAsFileTime call instead. As this requires some extra
hoop-jumping to safely and efficiently use it without breaking support
for older platforms I suggest that we start with just switching over to
GetSystemTimeAsFileTime, which has been supported since Windows 2000.
Then more precise time capture can be added in a later patch.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 337f1e4894bf7071c6d0c24a77c433a0b9e0cc81 Mon Sep 17 00:00:00 2001
From: Craig Ringer cr...@2ndquadrant.com
Date: Fri, 12 Sep 2014 12:41:35 +0800
Subject: [PATCH] Use GetSystemTimeAsFileTime directly in windows gettimeofday
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

PostgreSQL was calling GetSystemTime followed by SystemTimeToFileTime in the
win32 port gettimeofday function. This is not necessary and limits the reported
precision to the 1ms granularity that the SYSTEMTIME struct can represent. By
using GetSystemTimeAsFileTime we avoid unnecessary conversions and capture
timestamps at 100ns granularity, which is then rounded to 1µs granularity for
storage in a PostgreSQL timestamp.

On most Windows systems this change will actually have no significant effect as
the system timer tick is typically between 1ms and 15ms depending on what timer
resolution currently running applications have requested. You can check this
with clockres.exe from sysinternals. Despite the platform limitation this change
still permits capture of finer timestamps where the system is capable of
producing them and it gets rid of an unnecessary syscall.

Future work may permit use of GetSystemTimePreciseAsFileTime on Windows 8 and
Windows Server 2012 for higher resolution time capture. This call has the same
interface as GetSystemTimeAsFileTime.
---
 src/port/gettimeofday.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/port/gettimeofday.c b/src/port/gettimeofday.c
index 75a9199..73ec406 100644
--- a/src/port/gettimeofday.c
+++ b/src/port/gettimeofday.c
@@ -44,16 +44,14 @@ int
 gettimeofday(struct timeval * tp, struct timezone * tzp)
 {
 	FILETIME	file_time;
-	SYSTEMTIME	system_time;
 	ULARGE_INTEGER ularge;
 
-	GetSystemTime(&system_time);
-	SystemTimeToFileTime(&system_time, &file_time);
+	GetSystemTimeAsFileTime(&file_time);
 	ularge.LowPart = file_time.dwLowDateTime;
 	ularge.HighPart = file_time.dwHighDateTime;
 
 	tp->tv_sec = (long) ((ularge.QuadPart - epoch) / 10000000L);
-	tp->tv_usec = (long) (system_time.wMilliseconds * 1000);
+	tp->tv_usec = (long) (((ularge.QuadPart - epoch) % 10000000L) / 10);
 
 	return 0;
 }
-- 
1.9.3




Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Martijn van Oosterhout
On Tue, Sep 16, 2014 at 02:57:00PM -0700, Peter Geoghegan wrote:
 On Tue, Sep 16, 2014 at 2:07 PM, Peter Eisentraut pete...@gmx.net wrote:
  Clearly, this is worth documenting, but I don't think we can completely
  prevent the problem.  There has been talk of a built-in index integrity
  checking tool.  That would be quite useful.
 
 We could at least use the GNU facility for versioning collations where
 available, LC_IDENTIFICATION [1]. By not versioning collations, we are
 going against the express advice of the Unicode consortium (they also
 advise to do a strcmp() tie-breaker, something that I think we
 independently discovered in 2005, because of a bug report - this is
 what I like to call the Hungarian issue. They know what our
 constraints are.). I recognize it's a tricky problem, because of our
 historic dependence on OS collations, but I think we should definitely
 do something. That said, I'm not volunteering for the task, because I
 don't have time. While I'm not sure of what the long term solution
 should be, it *is not* okay that we don't version collations. I think
 that even the best possible B-Tree check tool is a not a solution.

Personally I think we should just support ICU as an option. FreeBSD has
been maintaining an out of tree patch for 10 years now so we know it
works.

The FreeBSD patch is not optimal though; these days ICU supports UTF-8
directly, so many of the push-ups FreeBSD does are no longer necessary.
It is often faster than glibc and the key sizes for strxfrm are more
compact [1], which is relevant for the recent optimisation patch.

Let's solve this problem once and for all.

[1] http://site.icu-project.org/charts/collation-icu4c48-glibc

-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer


signature.asc
Description: Digital signature


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Matthew Kelly
Here is where I think the timezone and PostGIS cases are fundamentally 
different:
I can pretty easily make sure that all my servers run in the same timezone.  
That's just good practice.  I'm also going to install the same version of 
PostGIS everywhere in a cluster.  I'll build PostGIS and its dependencies from 
the exact same source files, regardless of when I build the machine.

Timezone is a user level setting; PostGIS is a user level library used by a 
subset.

glibc is a system level library, and text is a core data type, however.  
Changing versions to something that doesn't match the kernel can lead to system 
level instability, broken linkers, etc.  (I know because I tried).  Here are 
some subtle other problems that fall out:

 * Upgrading glibc, the kernel, and linker through the package manager in order 
to get security updates can cause the corruption.
 * A basebackup that is taken in production and placed on a backup server might 
not be valid on that server, or your desktop machine, or on the spare you keep 
to do PITR when someone screws up.
 * Unless you keep _all_ of your clusters on the same OS, machines from your 
database spare pool probably won't be the right OS when you add them to the 
cluster because a member failed.

Keep in mind here, by OS I mean CentOS versions.  (we're running a mix of late 
5.x and 6.x, because of our numerous issues with the 6.x kernel)

The problem with LC_IDENTIFICATION is that every machine I have seen reports 
revision 1.0, date 2000-06-24.  It doesn't seem like the versioning is 
being actively maintained.

I'm with Martijn here, let's go ICU, if only because it moves sorting to a user 
level library, instead of a system level.  Martijn, do you have a link to the 
out of tree patch?  If not I'll find it.  I'd like to apply it to a branch and 
start playing with it.

- Matt K


On Sep 17, 2014, at 7:39 AM, Martijn van Oosterhout klep...@svana.org
 wrote:

 On Tue, Sep 16, 2014 at 02:57:00PM -0700, Peter Geoghegan wrote:
 On Tue, Sep 16, 2014 at 2:07 PM, Peter Eisentraut pete...@gmx.net wrote:
 Clearly, this is worth documenting, but I don't think we can completely
 prevent the problem.  There has been talk of a built-in index integrity
 checking tool.  That would be quite useful.
 
 We could at least use the GNU facility for versioning collations where
 available, LC_IDENTIFICATION [1]. By not versioning collations, we are
 going against the express advice of the Unicode consortium (they also
 advise to do a strcmp() tie-breaker, something that I think we
 independently discovered in 2005, because of a bug report - this is
 what I like to call the Hungarian issue. They know what our
 constraints are.). I recognize it's a tricky problem, because of our
 historic dependence on OS collations, but I think we should definitely
 do something. That said, I'm not volunteering for the task, because I
 don't have time. While I'm not sure of what the long term solution
 should be, it *is not* okay that we don't version collations. I think
 that even the best possible B-Tree check tool is a not a solution.
 
 Personally I think we should just support ICU as an option. FreeBSD has
 been maintaining an out of tree patch for 10 years now so we know it
 works.
 
 The FreeBSD patch is not optimal though, these days ICU supports UTF-8
 directly so many of the push-ups FreeBSD does are no longer necessary.
 It is often faster than glibc and the key sizes for strxfrm are more
 compact [1] which is relevent for the recent optimisation patch.
 
 Lets solve this problem for once and for all.
 
 [1] http://site.icu-project.org/charts/collation-icu4c48-glibc
 
 -- 
 Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer





Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Robert Haas
On Wed, Sep 17, 2014 at 9:07 AM, Matthew Kelly mke...@tripadvisor.com wrote:
 Here is where I think the timezone and PostGIS cases are fundamentally 
 different:
 I can pretty easily make sure that all my servers run in the same timezone.  
 That's just good practice.  I'm also going to install the same version of 
 PostGIS everywhere in a cluster.  I'll build PostGIS and its dependencies 
 from the exact same source files, regardless of when I build the machine.

 Timezone is a user level setting; PostGIS is a user level library used by a 
 subset.

 glibc is a system level library, and text is a core data type, however.  
 Changing versions to something that doesn't match the kernel can lead to 
 system level instability, broken linkers, etc.  (I know because I tried).  
 Here are some subtle other problems that fall out:

  * Upgrading glibc, the kernel, and linker through the package manager in 
 order to get security updates can cause the corruption.
  * A basebackup that is taken in production and placed on a backup server 
 might not be valid on that server, or your desktop machine, or on the spare 
 you keep to do PITR when someone screws up.
  * Unless you keep _all_ of your clusters on the same OS, machines from your 
 database spare pool probably won't be the right OS when you add them to the 
 cluster because a member failed.

 Keep in mind here, by OS I mean CentOS versions.  (we're running a mix of 
 late 5.x and 6.x, because of our numerous issues with the 6.x kernel)

 The problem with LC_IDENTIFICATION is that every machine I have seen reports 
 revision 1.0, date 2000-06-24.  It doesn't seem like the versioning is 
 being actively maintained.

 I'm with Martjin here, lets go ICU, if only because it moves sorting to a 
 user level library, instead of a system level.  Martjin do you have a link to 
 the out of tree patch?  If not I'll find it.  I'd like to apply it to a 
 branch and start playing with it.

What I find astonishing is that whoever maintains glibc (or the Red
Hat packaging for it) thinks it's OK to change the collation order in
a minor release.  I'd understand changing it between, say, RHEL 6 and
RHEL 7.  But the idea that minor releases, supposedly safe updates,
think they can whack this around without breaking applications really
kind of blows my mind.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Matthew Kelly
Let me double check that assertion before we go too far with it.

Most of the problems I've seen are across 5 and 6 boundaries.  I thought I had 
a case where it was within a minor release but I can't find it right now.  I'm 
going to dig.

That being said, the sort order changes depending on whether you statically or 
dynamically link (demonstrated on 4+ machines running different linux flavors), 
so at this point I have no reason to trust the stability of the sort across any 
build.  I legitimately question whether strcoll is buggy.  Ex. I have cases 
where for three strings a, b and c:  a < b, but (a || c) > (b || c).  That's 
right, postfixing doesn't hold.  It actually calls into question the index scan 
optimization that occurs when you do LIKE 'test%' even on a single machine, but 
I don't want to bite that off at the moment.
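
For concreteness, this is the shape of the check I mean (placeholder strings 
here; on an affected build the two columns can disagree):

  select a < b as plain_order,
         (a || c) < (b || c) as postfixed_order
  from (values ('foo'::text, 'bar'::text, 'baz'::text)) v(a, b, c);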

My mentality has switched to 'don't trust any change until shown otherwise', so 
that may have bled into my last email.

- Matt K.




On Sep 17, 2014, at 8:17 AM, Robert Haas robertmh...@gmail.com
 wrote:

 On Wed, Sep 17, 2014 at 9:07 AM, Matthew Kelly mke...@tripadvisor.com wrote:
 Here is where I think the timezone and PostGIS cases are fundamentally 
 different:
 I can pretty easily make sure that all my servers run in the same timezone.  
 That's just good practice.  I'm also going to install the same version of 
 PostGIS everywhere in a cluster.  I'll build PostGIS and its dependencies 
 from the exact same source files, regardless of when I build the machine.
 
 Timezone is a user level setting; PostGIS is a user level library used by a 
 subset.
 
 glibc is a system level library, and text is a core data type, however.  
 Changing versions to something that doesn't match the kernel can lead to 
 system level instability, broken linkers, etc.  (I know because I tried).  
 Here are some subtle other problems that fall out:
 
 * Upgrading glibc, the kernel, and linker through the package manager in 
 order to get security updates can cause the corruption.
 * A basebackup that is taken in production and placed on a backup server 
 might not be valid on that server, or your desktop machine, or on the spare 
 you keep to do PITR when someone screws up.
 * Unless you keep _all_ of your clusters on the same OS, machines from your 
 database spare pool probably won't be the right OS when you add them to the 
 cluster because a member failed.
 
 Keep in mind here, by OS I mean CentOS versions.  (we're running a mix of 
 late 5.x and 6.x, because of our numerous issues with the 6.x kernel)
 
 The problem with LC_IDENTIFICATION is that every machine I have seen reports 
 revision 1.0, date 2000-06-24.  It doesn't seem like the versioning is 
 being actively maintained.
 
 I'm with Martjin here, lets go ICU, if only because it moves sorting to a 
 user level library, instead of a system level.  Martjin do you have a link 
 to the out of tree patch?  If not I'll find it.  I'd like to apply it to a 
 branch and start playing with it.
 
 What I find astonishing is that whoever maintains glibc (or the Red
 Hat packaging for it) thinks it's OK to change the collation order in
 a minor release.  I'd understand changing it between, say, RHEL 6 and
 RHEL 7.  But the idea that minor release, supposedly safe updates
 think they can whack this around without breaking applications really
 kind of blows my mind.
 
 -- 
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company





Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Greg Stark
On Tue, Sep 16, 2014 at 11:41 PM, Peter Geoghegan p...@heroku.com wrote:
 The timezone case you highlight here seems quite distinct from what
 Matthew is talking about, because in point of fact the on-disk
 representation is merely *interpreted* with reference to the timezone
 database. So, you could have an inconsistency between standbys
 concerning what the time was in a particular timezone at a particular
 timestamp value as reported by the timestamptz output function, but
 both standbys would be correct on their own terms, which isn't too
 bad.

You could have a problem if you have an expression index on (timestamp
AT TIME ZONE '...'). I may have the expression slightly wrong but I
believe it is possible to write an immutable expression that depends
on the tzdata data as long as it doesn't depend on the user's
current time zone (which would be stable but not immutable). The
actual likelihood of that situation might be much lower and the
ability to avoid it higher but in theory I think Peter's right that
it's the same class of problem.

Generally speaking we try to protect against most environment
dependencies that lead to corrupt databases by encoding them in the
control file. Obviously we can't encode an entire collation in the
controlfile though. We could conceivably have a corpus of
representative strings that we sort and then checksum in the
controlfile. It wouldn't be foolproof but if we collect interesting
examples as we find them it might be a worthwhile safety check.
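
For instance, something along these lines, purely as a sketch of the idea
(it assumes an en_US collation exists; the real check would have to live in
C next to the control file code):

  SELECT md5(string_agg(s, '|' ORDER BY s COLLATE "en_US"))
  FROM (VALUES ('a'), ('A'), ('b'), ('ss'), ('SS'), ('co-op'), ('coop')) v(s);

If that checksum changes after a library update, the sort order has changed
underneath any indexes built with that collation.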

Just brainstorming... I wonder if it would be possible to include any
collation comparisons made in handling an index insert in the xlog
record and have the standby verify those comparisons are valid on the
standby. I guess that would be pretty hard to arrange code-wise since
the comparisons could be coming from anywhere to say nothing of the
wal bloat.

Peter G, could you go into more detail about collation versioning? What
would the implications be for Postgres?

-- 
greg




Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Tatsuo Ishii
Why don't we have our own collation data? It seems MySQL has already done this.

http://dev.mysql.com/doc/refman/5.0/en/charset-collation-implementations.html

I don't think it's something we cannot achieve, because even MySQL accomplishes it :-)

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Greg Stark
On Wed, Sep 17, 2014 at 3:47 PM, Tatsuo Ishii is...@postgresql.org wrote:
 I don't think we cannot achieve that because even MySQL accomplishes:-)

We've always considered it an advantage that we're consistent with the
collations in the rest of the system - generally speaking, the fact that
Postgres integrates with the system rather than being a separate system
unto itself.

Consider bug reports like "I've configured my system to use
fr_FR.UTF-8 and sort produces output in this order; why is Postgres
producing output in a different order?" Or extension authors using
strcoll and being surprised that the module gets inconsistent data
from the database.

Separately we always had a huge problem with ICU that it depended on
storing everything in a UCS-16 native encoding and required converting
to and from UTF-8 using an iterator interface. I heard that improved
somewhat but from what I understand it would be a struggle to avoid
copying every string before using it and consuming twice as much
memory. No more using strings directly out of disk buffers.

Then there's the concern that ICU is a *huge* dependency. ICU is
itself larger than the entire Postgres install. It's a big burden on
users to have to install and configure a second collation library in
addition to the system library and a complete non-starter for embedded
systems or low-memory systems.


-- 
greg




Re: [HACKERS] printing table in asciidoc with psql

2014-09-17 Thread Emanuel Calvo

On 16/09/14 16:52, Szymon Guz wrote:
 Hi,
 I've been working a little bit on a patch for printing tables in
 asciidoc with psql.

 It's not finished yet, I'm not sure if there is any sense in
 supporting border types etc. The code is not cleared so far, but any
 remarks on the style not playing well with the normal postgres style
 of code are welcomed.

 The code just works. With extended and normal modes. With table
 columns made of funny characters, with alignment of data in table
 cells. I was trying to implement it similar to the html export
 function, however escaping of the strings was much easier, as the
 normal html-way substitution is not easy to implement in asciidoc.

 I'd like to ask you for any advices for this code.

 thanks,
 Szymon



Please add asciidoc in src/bin/psql/help.c[354]

 354,96-103
fprintf(output, _("  format set output format
[unaligned, aligned, wrapped, html, latex, ..]\n"));

Tested the patch and it worked fine.

-- 
--
Emanuel Calvo http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] [Windows,PATCH] Use faster, higher precision timer API

2014-09-17 Thread Andrew Dunstan


On 09/17/2014 08:27 AM, Craig Ringer wrote:

Hi all

Attached is a patch to switch 9.5 over to using the
GetSystemTimeAsFileTime call instead of separate GetSystemTime and
SystemTimeToFileTime calls.

This patch the first step in improving PostgreSQL's support for Windows
high(er) resolution time.

In addition to requiring one less call into the platform libraries, this
change permits capture of timestamps at up to 100ns precision, instead
of the current 1ms limit. Unfortunately due to platform timer resolution
limitations it will in practice only report with 1ms resolution and
0.1ms precision - or sometimes even as much as 15ms resolution. (If you
want to know more, see the README for
https://github.com/2ndQuadrant/pg_sysdatetime).

On Windows 2012 and Windows 8 I'd like to use the new
GetSystemTimePreciseAsFileTime call instead. As this requires some extra
hoop-jumping to safely and efficiently use it without breaking support
for older platforms I suggest that we start with just switching over to
GetSystemTimeAsFileTime, which has been supported since Windows 2000.
Then more precise time capture can be added in a later patch.





That will presumably break XP. I know XP has been declared EOL, but 
there are still a heck of a lot of such systems out there, especially in 
places like ATMs, and I saw it in use recently at a US surgical facility 
(which is slightly scary, although this wasn't for life-sustaining 
functionality). My XP system is still actually getting some security 
updates sent from Microsoft.


I'm fine with doing this - frogmouth and currawong would retire on the 
buildfarm.


Just wanted to be up front about it.

cheers

andrew




Re: [HACKERS] printing table in asciidoc with psql

2014-09-17 Thread Pavel Stehule
2014-09-16 21:52 GMT+02:00 Szymon Guz mabew...@gmail.com:

 Hi,
 I've been working a little bit on a patch for printing tables in asciidoc
 with psql.

 It's not finished yet, I'm not sure if there is any sense in supporting
 border types etc. The code is not cleared so far, but any remarks on the
 style not playing well with the normal postgres style of code are welcomed.

 The code just works. With extended and normal modes. With table columns
 made of funny characters, with alignment of data in table cells. I was
 trying to implement it similar to the html export function, however
 escaping of the strings was much easier, as the normal html-way
 substitution is not easy to implement in asciidoc.

 I'd like to ask you for any advices for this code.


nice +1

Pavel



 thanks,
 Szymon






Re: [HACKERS] Windows exit code 128 ... it's baaack

2014-09-17 Thread Andrew Dunstan


On 04/07/2014 10:26 AM, Andres Freund wrote:

On 2014-04-05 11:05:09 -0400, Tom Lane wrote:

Andres Freund and...@2ndquadrant.com writes:

On 2014-02-27 19:14:13 -0500, Tom Lane wrote:

I looked at the postmaster log for the ongoing issue on narwhal
(to wit, that the contrib/dblink test dies the moment it tries
to do anything dblink-y), and looky here what the postmaster
has logged:

One interesting bit about this is that it seems to work in 9.3...

Well, yeah, it seems to have been broken somehow by the Windows
linking changes we did awhile back to try to ensure that missing
PGDLLIMPORT markers would be detected reliably.  Which we did not
back-patch.

Hard to say since there's been no working builds for HEAD for so long on
narwahl :(.




This issue has been hanging around for many months, possibly much longer 
since the last successful build on narwhal was 2012-08-01 and then it 
went quiet until 2014-02-03, when it came back with this error.


If we don't care to find a fix, I think we need to declare narwhal's 
fairly ancient compiler out of support and decommission it. Other gcc 
systems we have with more modern compilers are not getting this issue.


cheers

andrew




Re: [HACKERS] Join consolidation / Removing duplicate joins

2014-09-17 Thread Marti Raudsepp
On Wed, Sep 17, 2014 at 2:00 PM, David Rowley dgrowle...@gmail.com wrote:
 Anyway... I've been thinking of writing some code that converts these sub
 plans into left joins where it can be proved that the subquery would only at
 most produce 1 row

 Does anyone have any thoughts on this?

+1, I've thought about this part before. There is already precedent
for inlining FROM clause subqueries into the main query, it would be
nice to do that for correlated subqueries as well. It seems we've been
adding features to the planner without fully exploiting opportunities
for normalization and consolidation of optimization techniques.

I think it's not even necessary to prove uniqueness of the subquery as
you describe. Now that 9.3 added LATERAL, a correlated subquery can be
seen as a special case of LATERAL LEFT JOIN with an additional check
to raise an error if more than 1 row is returned from the inner side. And
you could optionally elide the error check if you can prove uniqueness.
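
Using David's t1/t2 example from upthread, the rewrite I have in mind is
roughly this (leaving out the more-than-one-row check, which would be
internal to the planner/executor):

SELECT t1.id, x.value
FROM t1
LEFT JOIN LATERAL (SELECT value FROM t2 WHERE t2.id = t1.id) x ON true;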

Advantages:
1. Sufficiently simple lateral subqueries are already normalized into
ordinary JOINs with hash/merge support, so you would get that for free
(probably requires eliding the 1-row check).
2. We get rid of silliness like the explosion of SubPlan nodes for
each reference (see examples below).
3. Based on some naive testing, it seems that 9.5devel performs
slightly better with NestLoop LATERAL subqueries than SubPlan
correlated ones.
4. EXPLAIN output is easier to read, I find.

I suppose EXISTS/IN with correlated subqueries needs some different
treatment, as it can currently take advantage of the hashed SubPlan
optimization.

Can anyone see any downsides? Perhaps one day we can get rid of
SubPlan entirely, would anyone miss it?


Example of SubPlan explosion:

regression=# create view foo1 as select *, (select ten as ten2 from
tenk2 where tenk1.unique1=tenk2.unique1) from tenk1;

regression=# explain analyze select * from foo1 where ten2 between 1 and 3;
 Seq Scan on tenk1  (cost=0.00..175782.08 rows= width=244) (actual
time=0.052..49.288 rows=3000 loops=1)
   Filter: (((SubPlan 2) >= 1) AND ((SubPlan 3) <= 3))
   Rows Removed by Filter: 7000
   SubPlan 1
 ->  Index Scan using tenk2_unique1 on tenk2  (cost=0.29..8.30
rows=1 width=4) (actual time=0.001..0.002 rows=1 loops=3000)
   Index Cond: (tenk1.unique1 = unique1)
   SubPlan 2
 ->  Index Scan using tenk2_unique1 on tenk2 tenk2_1
(cost=0.29..8.30 rows=1 width=4) (actual time=0.002..0.002 rows=1
loops=1)
   Index Cond: (tenk1.unique1 = unique1)
   SubPlan 3
 ->  Index Scan using tenk2_unique1 on tenk2 tenk2_2
(cost=0.29..8.30 rows=1 width=4) (actual time=0.001..0.002 rows=1
loops=9000)
   Index Cond: (tenk1.unique1 = unique1)
 Execution time: 49.508 ms


LATERAL is a win even when using OFFSET 0 to prevent inlining:

regression=# create view foo3 as select * from tenk1 left join lateral
(select ten as ten2 from tenk2 where tenk1.unique1=tenk2.unique1
offset 0) x on true;

regression=# explain analyze select * from foo3 where ten2 between 1 and 3;
 Nested Loop  (cost=0.29..83733.00 rows=1 width=248) (actual
time=0.043..28.963 rows=3000 loops=1)
   ->  Seq Scan on tenk1  (cost=0.00..458.00 rows=1 width=244)
(actual time=0.008..1.177 rows=1 loops=1)
   ->  Subquery Scan on x  (cost=0.29..8.32 rows=1 width=4) (actual
time=0.002..0.002 rows=0 loops=1)
 Filter: ((x.ten2 >= 1) AND (x.ten2 <= 3))
 Rows Removed by Filter: 1
 ->  Index Scan using tenk2_unique1 on tenk2  (cost=0.29..8.30
rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=1)
   Index Cond: (tenk1.unique1 = unique1)
 Execution time: 29.186 ms


And if you could prove uniqueness of the inner side and inline it,
WHERE clauses can also be pushed down trivially:

regression=# create view foo2 as select * from tenk1 left join lateral
(select ten as ten2 from tenk2 where tenk1.unique1=tenk2.unique1) x on
true;

regression=# explain analyze select * from foo2 where ten2 between 1 and 3;
 Hash Join  (cost=532.50..1083.00 rows=3000 width=248) (actual
time=1.848..4.480 rows=3000 loops=1)
   Hash Cond: (tenk1.unique1 = tenk2.unique1)
   ->  Seq Scan on tenk1  (cost=0.00..458.00 rows=1 width=244)
(actual time=0.002..0.617 rows=1 loops=1)
   ->  Hash  (cost=495.00..495.00 rows=3000 width=8) (actual
time=1.837..1.837 rows=3000 loops=1)
 Buckets: 1024  Batches: 1  Memory Usage: 118kB
 ->  Seq Scan on tenk2  (cost=0.00..495.00 rows=3000 width=8)
(actual time=0.004..1.562 rows=3000 loops=1)
   Filter: ((ten >= 1) AND (ten <= 3))
   Rows Removed by Filter: 7000
 Execution time: 4.591 ms


Regards,
Marti




Re: [HACKERS] [Windows,PATCH] Use faster, higher precision timer API

2014-09-17 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 09/17/2014 08:27 AM, Craig Ringer wrote:
 Attached is a patch to switch 9.5 over to using the
 GetSystemTimeAsFileTime call instead of separate GetSystemTime and
 SystemTimeToFileTime calls.

 That will presumably breaK XP. I know XP has been declared at EOL, but 
 there are still a heck of a lot of such systems out there,

Yeah.  Do we really think more precise timestamps are worth dropping
XP support?  On the Unix side, I know exactly what would happen to a
patch proposing that we replace gettimeofday() with clock_gettime()
with no thought for backwards compatibility.  Why would we expect
less on the Windows side?

Quite aside from XP ... AFAICS from the patch description, this patch
in itself moves us to a place that's a net negative in terms of
functionality.  Maybe it's a stepping stone to something better,
but I think we should just go directly to the something better.
I don't care for committing regressions on the promise that they'll
get fixed later.

Or in short: let's do the work needed to adapt our code to what's
available on the particular Windows version *first*.  Once we've
got that configuration support done, it shouldn't be much extra
work to continue XP support here.

regards, tom lane




Re: [HACKERS] [Windows,PATCH] Use faster, higher precision timer API

2014-09-17 Thread Andres Freund
On 2014-09-17 11:19:36 -0400, Andrew Dunstan wrote:
 
 On 09/17/2014 08:27 AM, Craig Ringer wrote:
 Hi all
 
 Attached is a patch to switch 9.5 over to using the
 GetSystemTimeAsFileTime call instead of separate GetSystemTime and
 SystemTimeToFileTime calls.
 
 This patch the first step in improving PostgreSQL's support for Windows
 high(er) resolution time.
 
 In addition to requiring one less call into the platform libraries, this
 change permits capture of timestamps at up to 100ns precision, instead
 of the current 1ms limit. Unfortunately due to platform timer resolution
 limitations it will in practice only report with 1ms resolution and
 0.1ms precision - or sometimes even as much as 15ms resolution. (If you
 want to know more, see the README for
 https://github.com/2ndQuadrant/pg_sysdatetime).
 
 On Windows 2012 and Windows 8 I'd like to use the new
 GetSystemTimePreciseAsFileTime call instead. As this requires some extra
 hoop-jumping to safely and efficiently use it without breaking support
 for older platforms I suggest that we start with just switching over to
 GetSystemTimeAsFileTime, which has been supported since Windows 2000.
 Then more precise time capture can be added in a later patch.
 
 
 
 
 That will presumably breaK XP.

The proposed patch? I don't really see why? GetSystemTimeAsFileTime() is
documented to be available since win2k?

Or do you mean GetSystemTimePreciseAsFileTime()? That'd surely - as
indicated by Craig - have to be optional since it's not available
anywhere but 2012 and Windows 8?

 I know XP has been declared at EOL, but there
 are still a heck of a lot of such systems out there, especially in places
 like ATMs, but I saw it in use recently at a US surgical facility (which is
 slightly scary, although this wasn't for life-sustaining functionality). My
 XP system is still actually getting some security updates sent from
 Microsoft.

I unfortunately have to agree, dropping XP is probably at least a year
or two out.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [Windows,PATCH] Use faster, higher precision timer API

2014-09-17 Thread Andrew Dunstan


On 09/17/2014 12:51 PM, Andres Freund wrote:

On 2014-09-17 11:19:36 -0400, Andrew Dunstan wrote:

On 09/17/2014 08:27 AM, Craig Ringer wrote:

Hi all

Attached is a patch to switch 9.5 over to using the
GetSystemTimeAsFileTime call instead of separate GetSystemTime and
SystemTimeToFileTime calls.

This patch the first step in improving PostgreSQL's support for Windows
high(er) resolution time.

In addition to requiring one less call into the platform libraries, this
change permits capture of timestamps at up to 100ns precision, instead
of the current 1ms limit. Unfortunately due to platform timer resolution
limitations it will in practice only report with 1ms resolution and
0.1ms precision - or sometimes even as much as 15ms resolution. (If you
want to know more, see the README for
https://github.com/2ndQuadrant/pg_sysdatetime).

On Windows 2012 and Windows 8 I'd like to use the new
GetSystemTimePreciseAsFileTime call instead. As this requires some extra
hoop-jumping to safely and efficiently use it without breaking support
for older platforms I suggest that we start with just switching over to
GetSystemTimeAsFileTime, which has been supported since Windows 2000.
Then more precise time capture can be added in a later patch.




That will presumably breaK XP.

The proposed patch? I don't really see why? GetSystemTimeAsFileTime() is
documented to be available since win2k?



Oh, hmm, yes, you're right. For some reason I was thinking W2K was later 
than XP. I get more random memory errors as I get older ...


cheers

andrew






Re: [HACKERS] printing table in asciidoc with psql

2014-09-17 Thread Peter Eisentraut
On 9/16/14 3:52 PM, Szymon Guz wrote:
  It's not finished yet, I'm not sure if there is any sense in supporting
 border types etc.

AFAICT, Asciidoc doesn't support border types, so (if so) you should
just ignore that setting.




Re: [HACKERS] [Windows,PATCH] Use faster, higher precision timer API

2014-09-17 Thread Andres Freund
On 2014-09-17 09:38:59 -0700, Tom Lane wrote:
 On the Unix side, I know exactly what would happen to a
 patch proposing that we replace gettimeofday() with clock_gettime()
 with no thought for backwards compatibility.

Btw, do you plan to pursue clock_gettime()? It'd be really neat to have
it...

 
 Quite aside from XP ... AFAICS from the patch description, this patch
 in itself moves us to a place that's a net negative in terms of
 functionality.  Maybe it's a stepping stone to something better, but I
 think we should just go directly to the something better.  I don't
 care for committing regressions on the promise that they'll get fixed
 later.

I don't think there's any regressions in that patch? Rather the
contrary. I understand the comment about the timer tick to be just as
applicable to the current code as the new version. Just that the old
code can't possibly have a precision lower than 1ms, but the new one
can.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] printing table in asciidoc with psql

2014-09-17 Thread Szymon Guz
On 17 September 2014 19:30, Peter Eisentraut pete...@gmx.net wrote:

 On 9/16/14 3:52 PM, Szymon Guz wrote:
  It's not finished yet, I'm not sure it there is any sense in supporting
  border types etc.

 AFAICT, Asciidoc doesn't support border types, so (if so) you should
 just ignore that setting.


Too late, I've done something like this:

border=0
[frame=none,grid=none]

border=1
[frame=all,grid=none]

border=2
[frame=all,grid=all]

thanks,
Szymon


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Peter Geoghegan
On Wed, Sep 17, 2014 at 6:17 AM, Robert Haas robertmh...@gmail.com wrote:
 What I find astonishing is that whoever maintains glibc (or the Red
 Hat packaging for it) thinks it's OK to change the collation order in
 a minor release.  I'd understand changing it between, say, RHEL 6 and
 RHEL 7.  But the idea that minor release, supposedly safe updates
 think they can whack this around without breaking applications really
 kind of blows my mind.

Why wouldn't they feel entitled to? To quote UTS #10 [1]:


Collation order is not fixed.

Over time, collation order will vary: there may be fixes needed as
more information becomes available about languages; there may be new
government or industry standards for the language that require
changes; and finally, new characters added to the Unicode Standard
will interleave with the previously-defined ones. This means that
collations must be carefully versioned.


Indeed, they do version collations with LC_IDENTIFICATION. We just
don't make any attempt to use the version information. In short, this
is our fault.   :-(

[1] http://www.unicode.org/reports/tr10/#Stability
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Peter Eisentraut
On 9/17/14 10:47 AM, Tatsuo Ishii wrote:
 Why don't we have our collation data? It seems MySQL has already done this.

Where would you get the source data from?  How would you maintain it?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Peter Eisentraut
On 9/16/14 5:57 PM, Peter Geoghegan wrote:
 On Tue, Sep 16, 2014 at 2:07 PM, Peter Eisentraut pete...@gmx.net wrote:
 Clearly, this is worth documenting, but I don't think we can completely
 prevent the problem.  There has been talk of a built-in index integrity
 checking tool.  That would be quite useful.
 
 We could at least use the GNU facility for versioning collations where
 available, LC_IDENTIFICATION [1].

It looks like the revisions or dates reported by LC_IDENTIFICATION
aren't ever updated for most locales.
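
(For anyone who wants to check their own system, here is a minimal sketch -
not PostgreSQL code - assuming glibc's GNU-specific _NL_IDENTIFICATION_*
langinfo items; it prints what the installed locale claims as its revision
and date:)

#define _GNU_SOURCE
#include <stdio.h>
#include <locale.h>
#include <langinfo.h>

int
main(void)
{
	/* adopt the locale from the environment */
	setlocale(LC_ALL, "");

	/* these items come from the locale's LC_IDENTIFICATION section */
	printf("title:    %s\n", nl_langinfo(_NL_IDENTIFICATION_TITLE));
	printf("revision: %s\n", nl_langinfo(_NL_IDENTIFICATION_REVISION));
	printf("date:     %s\n", nl_langinfo(_NL_IDENTIFICATION_DATE));
	return 0;
}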



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Martijn van Oosterhout
On Wed, Sep 17, 2014 at 01:07:56PM +, Matthew Kelly wrote:
 I'm with Martjin here, lets go ICU, if only because it moves sorting
 to a user level library, instead of a system level.  Martjin do you
 have a link to the out of tree patch?  If not I'll find it.  I'd like
 to apply it to a branch and start playing with it.

http://people.freebsd.org/~girgen/postgresql-icu/README.html
http://people.freebsd.org/~girgen/postgresql-icu/

Note I said optional. It is a large library for sure, but for some
installations I think the benefits are sufficient.

Mvg,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer


signature.asc
Description: Digital signature


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Peter Eisentraut
On 9/17/14 9:07 AM, Matthew Kelly wrote:
 Here is where I think the timezone and PostGIS cases are fundamentally 
 different:
 I can pretty easily make sure that all my servers run in the same timezone.  
 That's just good practice.  I'm also going to install the same version of 
 PostGIS everywhere in a cluster.  I'll build PostGIS and its dependencies 
 from the exact same source files, regardless of when I build the machine.

I wrote time zone *database*, not time zone.  The time zone database  is
(in some configurations) part of glibc.

I also wrote PostGIS dependent libraries, not PostGIS itself.  If you
are comparing RHEL 5 and 6, as you wrote elsewhere, then some of those
will most likely be different.  (Heck, glibc could be different.  Is
glibc never allowed to fix insufficiencies in its floating-point
implementation, for example?)

Also, there is nothing that guarantees that the PostGIS version will be
the same on both sides.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Peter Geoghegan
On Wed, Sep 17, 2014 at 11:05 AM, Peter Eisentraut pete...@gmx.net wrote:
 We could at least use the GNU facility for versioning collations where
 available, LC_IDENTIFICATION [1].

 It looks like the revisions or dates reported by LC_IDENTIFICATION
 aren't ever updated for most locales.


That's not surprising. There is zero controversy about how to
correctly sort English text, for example. For other languages, that
might be much less true.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Peter Eisentraut
On 9/17/14 10:46 AM, Greg Stark wrote:
 You could have a problem if you have an expression index on (timestamp
 AT TIME ZONE '...'). I may have the expression slightly wrong but I
 believe it is posisble to write an immutable expression that depends
 on the tzdata data as long as it doesn't depend on not the user's
 current time zone (which would be stable but not immutable). The
 actual likelihood of that situation might be much lower and the
 ability to avoid it higher but in theory I think Peter's right that
 it's the same class of problem.

I was thinking of something like a text column with natural input of
time stamp information, and an index on that_column::timestamp.
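
As an illustration of how such an index typically comes about (table and
function names here are invented): the text-to-timestamp cast is only STABLE,
so the usual trick is to hide it behind a wrapper declared IMMUTABLE, and that
lie is exactly where the environment dependency sneaks in.

CREATE TABLE events (raw_ts text);

-- A direct index on (raw_ts::timestamp) is rejected because the cast is not
-- immutable (its result can depend on settings such as DateStyle), so the
-- wrapper is declared IMMUTABLE even though that is not really true.
CREATE FUNCTION parse_ts(text) RETURNS timestamp
    LANGUAGE sql IMMUTABLE AS $$ SELECT $1::timestamp $$;

CREATE INDEX events_ts_idx ON events (parse_ts(raw_ts));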

 Generally speaking we try to protect against most environment
 dependencies that lead to corrupt databases by encoding them in the
 control file. Obviously we can't encode an entire collation in the
 controlfile though. We could conceivably have a corpus of
 representative strings that we sort and then checksum in the
 controlfile. It wouldn't be foolproof but if we collect interesting
 examples as we find them it might be a worthwhile safety check.

I think it could be useful in a number of situations if a type could
stick some arbitrary additional information into a new column in
pg_type, such as versions of libraries it depends on or storage format
versions.

Then again, collation isn't actually a property of any single type.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Peter Eisentraut
On 9/17/14 2:07 PM, Peter Geoghegan wrote:
 On Wed, Sep 17, 2014 at 11:05 AM, Peter Eisentraut pete...@gmx.net wrote:
 We could at least use the GNU facility for versioning collations where
 available, LC_IDENTIFICATION [1].

 It looks like the revisions or dates reported by LC_IDENTIFICATION
 aren't ever updated for most locales.
 
 
 That's not surprising. There is zero controversy about how to
 correctly sort English text, for example. For other languages, that
 might be much less true.
 

a) There is plenty of controversy about how to sort English text on
Stack Overflow. ;-)

b) Even in an English locale you have to maintain a sort order for all
Unicode characters, and that changes more than zero times.  But it's
quite clear from looking at the glibc git logs that no one is
maintaining these version numbers.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Peter Geoghegan
On Wed, Sep 17, 2014 at 11:08 AM, Peter Eisentraut pete...@gmx.net wrote:
 I also wrote PostGIS dependent libraries, not PostGIS itself.  If you
 are comparing RHEL 5 and 6, as you wrote elsewhere, then some of those
 will most likely be different.  (Heck, glibc could be different.  Is
 glibc never allowed to fix insufficiencies in its floating-point
 implementation, for example?)


The operator class author has a responsibility to make sure that
doesn't happen. If he or she should fail, then it's a bug, and
possibly a failure of imagination on their part. This is the only way
of thinking about it that makes sense. If you want to use a library
feature in your opclass B-Tree support function 1, then you'd better
be damned sure that it implies immutability insofar as that's
possible. Sure, it's also possible that your users could be the victim
on an unfortunate upstream bug that you couldn't reasonably predict,
but when is that not true?

In general, I am totally unconvinced by this line of argument. It
implies that everyone has to be an expert on everything just to use
Postgres.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Windows,PATCH] Use faster, higher precision timer API

2014-09-17 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-09-17 09:38:59 -0700, Tom Lane wrote:
 On the Unix side, I know exactly what would happen to a
 patch proposing that we replace gettimeofday() with clock_gettime()
 with no thought for backwards compatibility.

 Btw, do you plan to pursue clock_gettime()? It'd be really neat to have
 it...

It's on my TODO list, but not terribly close to the top.  If you're
excited about that, feel free to take it up.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-17 Thread Emanuel Calvo

On 15/09/14 18:13, Simon Riggs wrote:
 On 15 September 2014 17:09, Robert Haas robertmh...@gmail.com wrote:

 Do we really want to disable HOT for all catalog scans?
 The intention of the patch is that catalog scans are treated
 identically to non-catalog scans. The idea here is that HOT cleanup
 only occurs on scans on target relations, so only INSERT, UPDATE and
 DELETE do HOT cleanup.

 It's possible that many catalog scans don't follow the normal target
 relation logic, so we might argue we should use HOT every time. OTOH,
 since we now have separate catalog xmins we may find that using HOT on
 catalogs is no longer effective. So I could go either way on how to
 proceed; its an easy change either way.


I set up a more concurrent scenario and the difference is much larger:

Without patch:
1st concurrent with writes:
tps = 5705.261620 (including connections establishing)
tps = 5945.338557 (excluding connections establishing)
2nd no writes being executed:
tps = 9988.792575 (including connections establishing)
tps = 11059.671351 (excluding connections establishing)


Patched version:
1st concurrent with writes:
tps = 9476.741707 (including connections establishing)
tps = 10274.831185 (excluding connections establishing)
2nd no writes being executed:
tps = 12993.644808 (including connections establishing)
tps = 15171.214744 (excluding connections establishing)


Stats (writes have been run with a time limit, not by tx):

hotcleanup=# select relname ,n_live_tup, n_dead_tup, n_tup_hot_upd from
pg_stat_user_tables where relname ~ 'pgbench';
 relname  | n_live_tup | n_dead_tup | n_tup_hot_upd
--+++---
 pgbench_tellers  |500 |  0 |   2044192
 pgbench_accounts |5109728 | 310842 |   1969264
 pgbench_history  |2265882 |  0 | 0
 pgbench_branches | 50 |  0 |   2237167
(4 rows)

hotcleanup=# select relname ,n_live_tup, n_dead_tup, n_tup_hot_upd from
pg_stat_user_tables where relname ~ 'pgbench';
 relname  | n_live_tup | n_dead_tup | n_tup_hot_upd
--+++---
 pgbench_history  |2148946 |  0 | 0
 pgbench_tellers  |500 |  0 |   1969675
 pgbench_branches | 50 |  0 |   2150655
 pgbench_accounts |5098774 | 300123 |   1897484
(4 rows)

  

I ran the regression tests over the patched version and they passed ok.


-- 
--
Emanuel Calvo http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: plpgsql - Assert statement

2014-09-17 Thread Peter Eisentraut
On 9/16/14 12:01 AM, Alvaro Herrera wrote:
 Jan Wieck wrote:
 I think that most data integrity issues can be handled by a well
 designed database schema that uses UNIQUE, NOT NULL, REFERENCES and
 CHECK constraints. Assertions are usually found inside of complex
 code constructs to check values of local variables. I don't think it
 is even a good idea to implement assertions that can query arbitrary
 data.
 
 Actually Peter Eisentraut posted a patch for SQL assertions:
 http://www.postgresql.org/message-id/1384486216.5008.17.ca...@vanquo.pezone.net

SQL assertions are just a kind of CHECK constraint, so fully
Jan-compliant. ;-)

I don't mind PL/pgSQL having an assert statement like many programming
languages, but I find a lot of the proposed details dubious.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: plpgsql - Assert statement

2014-09-17 Thread Peter Eisentraut
On 9/14/14 2:49 PM, Jan Wieck wrote:
 I don't think it is even a good idea to implement assertions that can
 query arbitrary data.

In a normal programming language, an assertion is usually a static fault
in your program.  If the assertion ever fails, you fix your program and
then it hopefully never happens again.

Assertions that query the state of the database or result row counts are
pushing that concept quite a bit.  Those are not assertions; those are
just plain old error handling.
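
(For illustration - the table names are invented - this is what such a data
"assertion" amounts to today when written as ordinary error handling:)

DO $$
BEGIN
    -- check that every foo.fk has a matching bar.pk
    IF EXISTS (SELECT 1 FROM foo WHERE fk NOT IN (SELECT pk FROM bar)) THEN
        RAISE EXCEPTION 'foo contains rows with no matching row in bar';
    END IF;
END
$$;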



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: plpgsql - Assert statement

2014-09-17 Thread Marko Tiikkaja

On 9/17/14, 9:00 PM, Peter Eisentraut wrote:

On 9/14/14 2:49 PM, Jan Wieck wrote:

I don't think it is even a good idea to implement assertions that can
query arbitrary data.


In a normal programming language, an assertion is usually a static fault
in your program.  If the assertion ever fails, you fix your program and
then it hopefully never happens again.

Assertions that query the state of the database or result row counts are
pushing that concept quite a bit.  Those are not assertions; those are
just plain old error handling.


*shrug*  I don't see them as error handling if they're just checking 
conditions which should never happen.


That said, in PL/PgSQL these expressions would likely have to be SQL 
expressions, and then you'd have to go out of your way to implement 
assertions which *can't* query arbitrary data.  And that just seems silly.



.marko


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: plpgsql - Assert statement

2014-09-17 Thread Pavel Stehule
2014-09-17 21:00 GMT+02:00 Peter Eisentraut pete...@gmx.net:

 On 9/14/14 2:49 PM, Jan Wieck wrote:
  I don't think it is even a good idea to implement assertions that can
  query arbitrary data.

 In a normal programming language, an assertion is usually a static fault
 in your program.  If the assertion ever fails, you fix your program and
 then it hopefully never happens again.

  Assertions that query the state of the database or result row counts are
  pushing that concept quite a bit.  Those are not assertions; those are
  just plain old error handling.


What is the difference between the content of a variable and the content of
the database? You can test any prerequisite, but when this prerequisite is not
satisfied, then the resulting exception is very, very hard without any
possibility of handling it.

Pavel


Re: [HACKERS] printing table in asciidoc with psql

2014-09-17 Thread Szymon Guz
On 17 September 2014 19:55, Szymon Guz mabew...@gmail.com wrote:



 On 17 September 2014 19:30, Peter Eisentraut pete...@gmx.net wrote:

 On 9/16/14 3:52 PM, Szymon Guz wrote:
  It's not finished yet, I'm not sure it there is any sense in supporting
  border types etc.

 AFAICT, Asciidoc doesn't support border types, so (if so) you should
 just ignore that setting.


 Too late, I've done something like this:

 border=0
 [frame=none,grid=none]

 border=1
 [frame=all,grid=none]

 border=2
 [frame=all,grid=all]

 thanks,
 Szymon



Hi,
thanks for all the remarks.

I've attached another version of this patch.

I think it's done.

- This works: `\pset format asciidoc`

- Output is formatted as asciidoc tables.

- There is support for borders {0,1,2}. The attached html file was made by
running tests for psql, taking the asciidoc tables from it, converting to
html with `asciidoc file`.
-- border = 0 - [frame=none,grid=none]
-- border = 1 - [frame=none,grid=all]
-- border = 2 - [frame=all,grid=all]

- There are also tests.
-- For normal and extended mode combined with each of the border values.
-- With column names made of characters which need escaping
-- With values: (with escape needed characters, string '11' and integer 11
- they should have different right-left alignment).

- Documentation for psql is updated.

- According to Emanuel's advice: help.c is updated.

The attached html file contains tables from the test in this order:

normal, border 0
normal, border 1
normal, border 2
expanded, border 0
expanded, border 1
expanded, border 2
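
(For anyone trying the patch: the new format is selected like any other psql
output format, and border uses the existing \pset setting - a hypothetical
session, with the cell value chosen to exercise escaping:)

\pset format asciidoc
\pset border 2
SELECT 1 AS id, 'one | two' AS note;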

regards,
Szymon
asciidoc_output.html
Description: text/html

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index e7fcc73..cd64b88 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -2092,8 +2092,8 @@ lo_import 152801
   <literal>aligned</literal>, <literal>wrapped</literal>,
   <literal>html</literal>,
   <literal>latex</literal> (uses <literal>tabular</literal>),
-  <literal>latex-longtable</literal>, or
-  <literal>troff-ms</literal>.
+  <literal>latex-longtable</literal>,
+  <literal>troff-ms</literal>, or <literal>asciidoc</literal>.
   Unique abbreviations are allowed.  (That would mean one letter
   is enough.)
   </para>
@@ -2120,7 +2120,8 @@ lo_import 152801
 
   <para>
   The <literal>html</>, <literal>latex</>,
-  <literal>latex-longtable</literal>, and <literal>troff-ms</>
+  <literal>latex-longtable</literal>, <literal>troff-ms</>,
+  and <literal>asciidoc</>
   formats put out tables that are intended to
   be included in documents using the respective mark-up
   language. They are not complete documents! This might not be
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 2227db4..ae6b106 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -2247,6 +2247,9 @@ _align2string(enum printFormat in)
 		case PRINT_TROFF_MS:
 			return "troff-ms";
 			break;
+		case PRINT_ASCIIDOC:
+			return "asciidoc";
+			break;
 	}
 	return "unknown";
 }
@@ -2316,9 +2319,11 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet)
 			popt->topt.format = PRINT_LATEX_LONGTABLE;
 		else if (pg_strncasecmp("troff-ms", value, vallen) == 0)
 			popt->topt.format = PRINT_TROFF_MS;
+		else if (pg_strncasecmp("asciidoc", value, vallen) == 0)
+			popt->topt.format = PRINT_ASCIIDOC;
 		else
 		{
-			psql_error("\\pset: allowed formats are unaligned, aligned, wrapped, html, latex, troff-ms\n");
+			psql_error("\\pset: allowed formats are unaligned, aligned, wrapped, html, latex, troff-ms, asciidoc\n");
 			return false;
 		}
 
diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index 6035a77..66da6ec 100644
--- a/src/bin/psql/help.c
+++ b/src/bin/psql/help.c
@@ -351,7 +351,7 @@ helpVariables(unsigned short int pager)
 	fprintf(output, _("  expanded (or x)    toggle expanded output\n"));
 	fprintf(output, _("  fieldsep           field separator for unaligned output (default '|')\n"));
 	fprintf(output, _("  fieldsep_zero      set field separator in unaligned mode to zero\n"));
-	fprintf(output, _("  format             set output format [unaligned, aligned, wrapped, html, latex, ..]\n"));
+	fprintf(output, _("  format             set output format [unaligned, aligned, wrapped, html, latex, asciidoc ..]\n"));
 	fprintf(output, _("  footer             enable or disable display of the table footer [on, off]\n"));
 	fprintf(output, _("  linestyle          set the border line drawing style [ascii, old-ascii, unicode]\n"));
 	fprintf(output, _("  null               set the string to be printed in place of a null value\n"));
diff --git a/src/bin/psql/print.c b/src/bin/psql/print.c
index 3b3c3b7..956bbb1 100644
--- a/src/bin/psql/print.c
+++ b/src/bin/psql/print.c
@@ -2475,6 +2475,200 @@ print_troff_ms_vertical(const printTableContent *cont, FILE *fout)
 	}
 }
 
+/*/
+/* ASCIIDOC **/

Re: [HACKERS] proposal: plpgsql - Assert statement

2014-09-17 Thread Peter Eisentraut
On 9/17/14 3:04 PM, Pavel Stehule wrote:
 What is difference between content of variable or content of database?
 You can test any prerequisite, but when this prerequisite is not solved,
 than exception is very very hard without possible handling.

If the assertion tests arbitrary Boolean expressions, then we can't stop
the user from abusing them.

But it's another thing if we design specific syntax that encourages such
abuse, as proposed earlier.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-17 Thread Simon Riggs
On 15 September 2014 22:13, Simon Riggs si...@2ndquadrant.com wrote:
 On 15 September 2014 17:09, Robert Haas robertmh...@gmail.com wrote:

 Do we really want to disable HOT for all catalog scans?

 The intention of the patch is that catalog scans are treated
 identically to non-catalog scans. The idea here is that HOT cleanup
 only occurs on scans on target relations, so only INSERT, UPDATE and
 DELETE do HOT cleanup.

Since INSERT, UPDATE and DELETE can only be called when
!RecoveryInProgress(), we can completely avoid making this test at the
top of each heap_page_prune_opt() call.

I very much like the simplicity of saying no target, no cleanup.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: plpgsql - Assert statement

2014-09-17 Thread Pavel Stehule
2014-09-17 21:36 GMT+02:00 Peter Eisentraut pete...@gmx.net:

 On 9/17/14 3:04 PM, Pavel Stehule wrote:
  What is difference between content of variable or content of database?
  You can test any prerequisite, but when this prerequisite is not solved,
  than exception is very very hard without possible handling.

 If the assertion tests arbitrary Boolean expressions, then we can't stop
 the user from abusing them.


I am thinking that an unhandled signal can be a good defence (together with
the possibility to disable assertions).

We design a database system, so we should reflect that - plpgsql (or any
PL environment) is not a classic language. There are a lot of
database-specific constructs.


 But it's another thing if we design specific syntax that encourages such
 abuse, as proposed earlier.


On another note - I think ANSI SQL assertions and PL assertions are
independent features, although they can have some common goals.


Re: [HACKERS] Anonymous code block with parameters

2014-09-17 Thread Vik Fearing
On 09/16/2014 10:09 AM, Heikki Linnakangas wrote:
 On 09/16/2014 10:57 AM, Craig Ringer wrote:
 On 09/16/2014 03:15 PM, Pavel Stehule wrote:

 Why we don't introduce a temporary functions instead?

 I think that'd be a lot cleaner and simpler. It's something I've
 frequently wanted, and as Hekki points out it's already possible by
 creating the function in pg_temp, there just isn't the syntax sugar for
 CREATE TEMPORARY FUNCTION.

 So why not just add CREATE TEMPORARY FUNCTION?
 
 Sure, why not.

Because you still have to do

SELECT pg_temp.my_temp_function(blah);

to execute it.

 It means two steps:

 CREATE TEMPORARY FUNCTION ... $$ $$;

 SELECT my_temp_function(blah);

That won't work; see above.
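
(For reference, a minimal sketch of the pg_temp workaround being discussed;
the function name and body are invented:)

CREATE FUNCTION pg_temp.add_one(i int) RETURNS int
    LANGUAGE sql AS $$ SELECT i + 1 $$;

-- the call has to be schema-qualified; the function is dropped at session end
SELECT pg_temp.add_one(41);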
-- 
Vik


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Anonymous code block with parameters

2014-09-17 Thread Pavel Stehule
2014-09-17 22:07 GMT+02:00 Vik Fearing vik.fear...@dalibo.com:

 On 09/16/2014 10:09 AM, Heikki Linnakangas wrote:
  On 09/16/2014 10:57 AM, Craig Ringer wrote:
  On 09/16/2014 03:15 PM, Pavel Stehule wrote:
 
  Why we don't introduce a temporary functions instead?
 
  I think that'd be a lot cleaner and simpler. It's something I've
  frequently wanted, and as Hekki points out it's already possible by
  creating the function in pg_temp, there just isn't the syntax sugar for
  CREATE TEMPORARY FUNCTION.
 
  So why not just add CREATE TEMPORARY FUNCTION?
 
  Sure, why not.

 Because you still have to do

 SELECT pg_temp.my_temp_function(blah);

 to execute it.


this problem should be solvable. I can use temporary tables without
schema-qualifying them with pg_temp.

Pavel



  It means two steps:
 
  CREATE TEMPORARY FUNCTION ... $$ $$;
 
  SELECT my_temp_function(blah);

 That won't work; see above.
 --
 Vik



Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Peter Geoghegan
On Wed, Sep 17, 2014 at 7:46 AM, Greg Stark st...@mit.edu wrote:
 You could have a problem if you have an expression index on (timestamp
 AT TIME ZONE '...'). I may have the expression slightly wrong but I
 believe it is posisble to write an immutable expression that depends
 on the tzdata data as long as it doesn't depend on not the user's
 current time zone (which would be stable but not immutable). The
 actual likelihood of that situation might be much lower and the
 ability to avoid it higher but in theory I think Peter's right that
 it's the same class of problem.

Really? If you have an IMMUTABLE expression predicated on something
that isn't immutable, then it isn't actually immutable, regardless of
the ability of Postgres to determine that. It's that simple. So, if
that's true, then it sounds like a bug to me. I didn't check, but I
think that AT TIME ZONE just changes the display format, and the
relevant output function is only stable, and so this shouldn't be a
problem.

 Peter G, could go into more detail about collation versioning? What
 would the implications be for Postgres?

Well, I think one implication might be that Postgres won't start, just
because you updated the OS (we have no way to get back results
consistent with the old collation file, I would think). At least your
database isn't corrupt, but that's still a really bad loss of
availability. This makes me lean towards pursuing ICU support as part
of any versioning scheme. There is a reason why everyone else does
something similar. Apparently DB2 uses ICU.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Final Patch for GROUPING SETS

2014-09-17 Thread Marti Raudsepp
On Fri, Sep 12, 2014 at 9:41 PM, Andrew Gierth
and...@tao11.riddles.org.uk wrote:
 gsp1.patch - phase 1 code patch (full syntax, limited functionality)
 gsp2.patch - phase 2 code patch (adds full functionality using the
  new chained aggregate mechanism)

I gave these a try by converting my current CTE-based queries into
CUBEs and it works as expected; query time is cut in half and lines of
code is 1/4 of original. Thanks!

I only have a few trivial observations; if I'm getting too nitpicky
let me know. :)


Since you were asking for feedback on the EXPLAIN output on IRC, I'd
weigh in and say that having the groups on separate lines would be
significantly more readable. It took me a while to understand what's
going on in my queries due to longer table and column names and
wrapping; The comma separators between groups are hard to distinguish.
If that can be made to work with the EXPLAIN printer without too much
trouble.

So instead of:
 GroupAggregate
   Output: four, ten, hundred, count(*)
   Grouping Sets: (onek.four, onek.ten, onek.hundred), (onek.four,
onek.ten), (onek.four), ()

Perhaps print:
   Grouping Sets: (onek.four, onek.ten, onek.hundred)
  (onek.four, onek.ten)
  (onek.four)
  ()

Or maybe:
   Grouping Set: (onek.four, onek.ten, onek.hundred)
   Grouping Set: (onek.four, onek.ten)
   Grouping Set: (onek.four)
   Grouping Set: ()

Both seem to work with the explain.depesz.com parser, although the 1st
won't be aligned as nicely.


Do you think it would be reasonable to normalize single-set grouping
sets into a normal GROUP BY? Such queries would be capable of using
HashAggregate, but the current code doesn't allow that. For example:

set enable_sort=off;
explain select two, count(*) from onek group by grouping sets (two);
Could be equivalent to:
explain select two, count(*) from onek group by two;


I'd expect GROUP BY () to be fully equivalent to having no GROUP BY
clause, but there's a difference in explain output. The former
displays Grouping Sets: () which is odd, since none of the grouping
set keywords were used.

# explain select count(*) from onek group by ();
 Aggregate  (cost=77.78..77.79 rows=1 width=0)
   Grouping Sets: ()
   ->  Index Only Scan using onek_stringu1 on onek  (cost=0.28..75.28
rows=1000 width=0)

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Tatsuo Ishii
 On 9/17/14 10:47 AM, Tatsuo Ishii wrote:
 Why don't we have our collation data? It seems MySQL has already done this.
 
 Where would you get the source data from?  How would you maintain it?

Don't know. However, seeing that MySQL manages it, it should be
possible for us.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.5] Custom Plan API

2014-09-17 Thread Kouhei Kaigai
   Why does it need to know that?  I don't see that it's doing
   anything that requires knowing the size of that node, and if it is,
   I think it shouldn't be.  That should get delegated to the callback
   provided by the custom plan provider.
  
   Sorry, my explanation might be confusable. The create_custom_scan()
   does not need to know the exact size of the CustomScan (or its
   inheritance) because of the two separated hooks; one is node
   allocation time, the other is the tail of the series of initialization.
   If we have only one hook here, we need to have a mechanism to
   informs
   create_custom_scan() an exact size of the CustomScan node; including
   private fields managed by the provider, instead of the first hook on
   node allocation time. In this case, node allocation shall be
   processed by create_custom_scan() and it has to know exact size of
   the node to be
  allocated.
  
   How do I implement the feature here? Is the combination of static
   node size and callback on the tail more simple than the existing
   design that takes two individual hooks on create_custom_scan()?
 
  I still don't get it.  Right now, the logic in create_custom_scan(),
  which I think should really be create_custom_plan() or
  create_plan_from_custom_path(), basically looks like this:
 
  1. call hook function CreateCustomPlan
  2. compute values for tlist and clauses
  3. pass those values to hook function InitCustomScan()
  4. call copy_path_costsize
 
  What I think we should do is:
 
  1. compute values for tlist and clauses
  2. pass those values to hook function PlanCustomPath(), which will return a Plan
  3. call copy_path_costsize
 
  create_custom_scan() does not need to allocate the node.  You don't
  need the node to be allocated before computing tlist and clauses.
 
 Thanks, I could get the point.
 I'll revise the patch according to the suggestion above.
 
At this moment, I have revised the above portion of the patches.
create_custom_plan() was modified to call the PlanCustomPath callback
right after the initialization of tlist and clauses.

It's probably the same as what you suggested.

 It seems to me, we can also apply similar manner on ExecInitCustomScan().
 The current implementation doing is:
 1. call CreateCustomScanState() to allocate a CustomScanState node
 2. common initialization of the fields on CustomScanState, but not private fields.
 3. call BeginCustomScan() to initialize remaining stuffs and begin execution.
 
 If BeginCustomScan() is re-defined to accept values for common
 initialization portions and to return a CustomScanState node, we may be
 able to eliminate the CreateCustomScanState() hook.
 
 Unlike create_custom_scan() case, it takes more number of values for common
 initialization portions; expression tree of tlist and quals, scan and result
 tuple-slot, projection info and relation handler. It may mess up the
 interface specification.
 In addition, BeginCustomScan() has to belong to CustomScanMethods, not
 CustomexecMethods. I'm uncertain whether it is straightforward location.
 (a whisper: It may not need to be separate tables. CustomScan always
 populates CustomScanState, unlike relationship between CustomPath and
 CustomScan.)
 
 How about your opinion to apply the above manner on ExecInitCustomScan()
 also?
 
I have kept the existing implementation around ExecInitCustomScan() for now.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com


pgsql-v9.5-custom-scan.part-3.v10.patch
Description: pgsql-v9.5-custom-scan.part-3.v10.patch


pgsql-v9.5-custom-scan.part-2.v10.patch
Description: pgsql-v9.5-custom-scan.part-2.v10.patch


pgsql-v9.5-custom-scan.part-1.v10.patch
Description: pgsql-v9.5-custom-scan.part-1.v10.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Tatsuo Ishii
 On Wed, Sep 17, 2014 at 3:47 PM, Tatsuo Ishii is...@postgresql.org wrote:
 I don't think we cannot achieve that because even MySQL accomplishes:-)
 
 We've always considered it an advantage that we're consistent with the
 collations in the rest of the system. Generally speaking the fact that
 Postgres integrates with the system rather than be a separate system
 unto itself.

 Consider bug reports like I've configured my system to use
 fr_FR.UTF-8 and sort produces output in this order why is Postgres
 producing output in a different order? Or extension authors using
 strcoll and being surprised that the module gets inconsistent data
 from the database.

I doubt it. glibc takes the liberty to change the collation data release
by release, but people don't seem to complain about it. Then why would
people complain about a collation difference between PostgreSQL and glibc,
if there is any?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Robert Haas
On Wed, Sep 17, 2014 at 10:06 AM, Matthew Kelly mke...@tripadvisor.com wrote:
 Let me double check that assertion before we go too far with it.

 Most of the problems I've seen are across 5 and 6 boundaries.  I thought I 
 had case where it was within a minor release but I can't find it right now.  
 I'm going to dig.

 That being said the sort order changes whether you statically or dynamically 
 link (demonstrated on 4+ machines running different linux flavors), so at the 
 point I have no reason to trust the stability of the sort across any build.  
 I legitimately question whether strcoll is buggy.  Ex. I have cases where for 
 three strings a, b and c:  a < b, but (a || c) > (b || c).  That's right 
 postfixing doesn't hold.  It actually calls into question the index scan 
 optimization that occurs when you do LIKE 'test%' even on a single machine, 
 but I don't want to bite that off at the moment.

 My mentality has switched to 'don't trust any change until shown otherwise', 
 so that may have bled into my last email.

Of course, there's also the question of whether ICU would have similar
issues.  You're assuming that they *don't* whack the collation order
around in minor releases, or at least that they do so to some lesser
degree than glibc, but is that actually true?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Final Patch for GROUPING SETS

2014-09-17 Thread Josh Berkus
On 09/17/2014 03:02 PM, Marti Raudsepp wrote:
 So instead of:
  GroupAggregate
Output: four, ten, hundred, count(*)
Grouping Sets: (onek.four, onek.ten, onek.hundred), (onek.four,
 onek.ten), (onek.four), ()
 
 Perhaps print:
Grouping Sets: (onek.four, onek.ten, onek.hundred)
   (onek.four, onek.ten)
   (onek.four)
   ()

So:

  Grouping Sets: [
[ onek.four, onek.ten, onek.hundred ],
[ onek.four, onek.ten ],
[ onek.four ],
[]
]

.. in JSON?

Seems to me that we need a better way to display the grand total
grouping set.

 
 Or maybe:
Grouping Set: (onek.four, onek.ten, onek.hundred)
Grouping Set: (onek.four, onek.ten)
Grouping Set: (onek.four)
Grouping Set: ()

The latter won't work with JSON and YAML output.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Immediate standby promotion

2014-09-17 Thread Robert Haas
On Wed, Sep 17, 2014 at 7:23 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 14 August 2014 20:27, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Aug 14, 2014 at 10:12 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Fujii Masao masao.fu...@gmail.com writes:
 I'd like to propose to add new option --immediate to pg_ctl promote.
 When this option is set, recovery ignores any WAL data which have not
 been replayed yet and exits immediately. Patch attached.

 This promotion is faster than normal one but can cause data loss.

 TBH, I cannot imagine a situation where that would be a sane thing to do.
 If you have WAL, why would you not replay what you have?  The purpose
 of a database is to preserve your data, not randomly throw it away.

 I've wanted this a number of times, so I think it's quite sane.

 fast promotion was actually a supported option in r8 of Postgres but
 this option was removed when we implemented streaming replication in
 r9.0

 The *rough* requirement is sane, but that's not the same thing as
 saying this exact patch makes sense.

Granted.  Fair point.

 If you are paused and you can see that WAL up ahead is damaged, then
 YES, you do want to avoid applying it. That is possible by setting a
 PITR target so that recovery stops at a precise location specified by
 you. As an existing option is it better than the blunt force trauma
 suggested here.

You can pause at a recovery target, but then what if you want to go
read/write at that point?  Or what if you've got a time-delayed
standby and you want to break replication so that it doesn't replay
the DROP TABLE students that somebody ran on the master?  It doesn't
have to be that WAL is unreadable or corrupt; it's enough for it to
contain changes you wish to avoid replaying.

 If you really don't care, just shutdown server, resetxlog and start
 her up - again, no need for new option.

To me, being able to say pg_ctl promote_right_now -m yes_i_mean_it
seems like a friendlier interface than making somebody shut down the
server, run pg_resetxlog, and start it up again.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Peter Geoghegan
On Wed, Sep 17, 2014 at 5:16 PM, Robert Haas robertmh...@gmail.com wrote:
 Of course, there's also the question of whether ICU would have similar
 issues.  You're assuming that they *don't* whack the collation order
 around in minor releases, or at least that they do so to some lesser
 degree than glibc, but is that actually true?

No, but they're disciplined about it. They clearly do versioning
properly, which seems to not be the case with glibc, based on Peter's
remarks: http://userguide.icu-project.org/collation/architecture#TOC-Versioning
(they talk about a 32-bit identifier here).

PostgreSQL's problems in this area are exactly the same as every other
database system's (the Unicode consortium anticipated these problems
too, and as I pointed out have commented on these problems.). A bunch
of prominent database systems are listed as using ICU.
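
(For reference, a minimal sketch of how a collation version can be read back
through ICU's C API - the locale name is arbitrary, and this is not PostgreSQL
code; build with something like cc demo.c -licuuc -licui18n:)

#include <stdio.h>
#include <unicode/ucol.h>
#include <unicode/uversion.h>

int
main(void)
{
	UErrorCode	status = U_ZERO_ERROR;
	UCollator  *coll = ucol_open("fr_FR", &status);
	UVersionInfo version;
	char		buf[U_MAX_VERSION_STRING_LENGTH];

	if (U_FAILURE(status))
		return 1;

	/* the version a database could store alongside anything sorted with it */
	ucol_getVersion(coll, version);
	u_versionToString(version, buf);
	printf("collator version for fr_FR: %s\n", buf);

	ucol_close(coll);
	return 0;
}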
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: plpgsql - Assert statement

2014-09-17 Thread Jan Wieck

On 09/17/2014 03:36 PM, Peter Eisentraut wrote:

On 9/17/14 3:04 PM, Pavel Stehule wrote:

What is difference between content of variable or content of database?
You can test any prerequisite, but when this prerequisite is not solved,
than exception is very very hard without possible handling.


If the assertion tests arbitrary Boolean expressions, then we can't stop
the user from abusing them.


Exactly. Doing something like

ASSERT (select count(*) from foo
where fk not in (select pk from bar)) = 0;

is a perfectly fine, arbitrary boolean expression. It will probably work 
well in a development environment too. And I am very sure that it will 
not scale well once that code gets deployed. And I know how DBAs react 
to the performance problem that is guaranteed to follow. They will disable ALL 
assert ... or was there some sort of assert class system proposed that I 
missed?




But it's another thing if we design specific syntax that encourages such
abuse, as proposed earlier.


The design should explicitly discourage that sort of nonsense.


Jan


--
Jan Wieck
Senior Software Engineer
http://slony.info


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Windows,PATCH] Use faster, higher precision timer API

2014-09-17 Thread Craig Ringer
On 09/17/2014 11:19 PM, Andrew Dunstan wrote:
 
 On 09/17/2014 08:27 AM, Craig Ringer wrote:
 Hi all
 On Windows 2012 and Windows 8 I'd like to use the new
 GetSystemTimePreciseAsFileTime call instead. As this requires some extra
 hoop-jumping to safely and efficiently use it without breaking support
 for older platforms I suggest that we start with just switching over to
 GetSystemTimeAsFileTime, which has been supported since Windows 2000.
 Then more precise time capture can be added in a later patch.

 That will presumably breaK XP.

Yes, and Windows 7. But this patch doesn't do that, it just makes
adjustments that make it easier.

The next step is to use LoadLibrary and GetProcAddress to resolve
GetSystemTimePreciseAsFileTime *if it is available*, during backend
start. Then use it if possible, and fall back to GetSystemTimeAsFileTime
if it isn't.
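
(A rough sketch of the shape of that fallback - this is not the submitted
patch, and the helper names are invented:)

#include <windows.h>

/* Both calls share this signature; the precise variant only exists on
 * Windows 8 / Server 2012 and later, so it has to be resolved at runtime. */
typedef VOID (WINAPI *PgGetSystemTimeFn) (LPFILETIME);

static PgGetSystemTimeFn pg_get_system_time = GetSystemTimeAsFileTime;

static void
init_system_time(void)
{
	HMODULE kernel32 = GetModuleHandle(TEXT("kernel32.dll"));

	if (kernel32 != NULL)
	{
		PgGetSystemTimeFn precise = (PgGetSystemTimeFn)
			GetProcAddress(kernel32, "GetSystemTimePreciseAsFileTime");

		if (precise != NULL)
			pg_get_system_time = precise;
	}
}

/* Callers then use pg_get_system_time(&ft).  FILETIME counts 100 ns ticks
 * since 1601-01-01, so converting to the Unix epoch means subtracting the
 * usual 11644473600-second offset. */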

This patch does not introduce any BC changes. At all. I should've
omitted all mention of the next step I want to take, but I thought it
was a useful explanation of why this change makes a bigger improvement
easier.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Windows,PATCH] Use faster, higher precision timer API

2014-09-17 Thread Craig Ringer
On 09/18/2014 12:58 AM, Andrew Dunstan wrote:
 
 Oh, hmm, yes, you're right. For some reason I was thinking W2K was later
 than XP. I get more random memory errors as I get older ...

It's because people say Win2k3 / Win2k8 / Win2k8r2 / Win2k12 a lot as
shorthand for Windows Server 2003 (XP-based), Windows Server 2008 (Vista
based), Windows Server 2008 R2 (Windows 7 based) and Windows Server 2012
(Windows 8 based) respectively.

Win2k is just Windows 2000, the release before Windows XP, released in
December 1999. Needless to say, if it's compatible even as far back as
Win2k it's not going to worry anybody.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Protocol TODO: Identify server charset in handshake

2014-09-17 Thread Craig Ringer
Hi all

In the wire protocol, currently if you get an error from the server
before you know it's processed your startup packet successfully you
don't know what character encoding that error is in.

If the error came from the postmaster then it's in the system's default
encoding or whatever locale the postmaster was started under.

If the error came from the DB backend it's in the DB backend's default
encoding, switched to during backend startup. Assuming it got that far.

If the error came after client_encoding was applied, it's in your
requested client_encoding.

This leaves the client unable to reliably interpret error messages. The
4.1 protocol should probably explicitly signal the encoding in the first
message from the server, and thereafter whenever it changes.

(This is somewhat related to the mess we make of text encodings in the
log files, where the postmaster writes in one encoding and DB backends
write in another).


Example psql session, in a terminal with en_AU.UTF-8 locale, connecting
to a postmaster started with:



$ LC_ALL=ru_RU.ISO-8859-5 LANG=ru_RU.ISO-8859-5 PATH=$HOME/pg/pg94/bin
postgres -D pg94_ru -p 9595

$ locale
LANG=en_AU.UTF-8
LC_CTYPE=en_AU.UTF-8
...
LC_ALL=

$ psql -p 9595
psql: �:   �� craig �� ��

$ psql -q -p 9595 postgres
postgres=# \c nosuch
�:   �� nosuch �� ��
Previous connection kept
postgres=# select indb_error();
ОШИБКА:  функция indb_error() не существует
LINE 1: select indb_error();
   ^
HINT:  Функция с данными именем и типами аргументов не найдена.
Возможно, вам следует добавить явные преобразования типов.



Note the garbage where psql happily dumps an ISO-8859-5 message to the
terminal because it has no way of knowing it isn't in the current
client_encoding, and no way of telling what encoding it is anyway.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Oleg Bartunov
We have used ICU with Postgres for many years in our mchar extension, which
provides a case-insensitive text data type for a popular Russian financial
system.  I don't know if we may ask ICU to give us a special BSD-compatible
license?


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Peter Geoghegan
On Wed, Sep 17, 2014 at 9:06 PM, Oleg Bartunov obartu...@gmail.com wrote:
 We use ICU with postgres for many years in our  mchar extension, which
 provides  case-insensitive  text data type for popular  russian financial
 system.  I don't know if we may ask ICU to give us special BSD-compatible
 license ?

I don't think that's necessary. Firebird uses ICU, and has similar
licensing terms to PostgreSQL.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Craig Ringer
On 09/17/2014 09:17 PM, Robert Haas wrote:
 What I find astonishing is that whoever maintains glibc (or the Red
 Hat packaging for it) thinks it's OK to change the collation order in
 a minor release.  I'd understand changing it between, say, RHEL 6 and
 RHEL 7.  But the idea that minor release, supposedly safe updates
 think they can whack this around without breaking applications really
 kind of blows my mind.


If confirmed, it certainly requires some ... firm ... bug reports.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Oleg Bartunov
On Thu, Sep 18, 2014 at 1:09 PM, Peter Geoghegan p...@heroku.com wrote:

 On Wed, Sep 17, 2014 at 9:06 PM, Oleg Bartunov obartu...@gmail.com
 wrote:
  We use ICU with postgres for many years in our  mchar extension, which
  provides  case-insensitive  text data type for popular  russian financial
  system.  I don't know if we may ask ICU to give us special BSD-compatible
  license ?

 I don't think that's necessary. Firebird uses ICU, and has similar
 licensing terms to PostgreSQL.


Firebird uses MPL license, which is compatible with GPL
http://en.wikipedia.org/wiki/Mozilla_Public_License


 --
 Peter Geoghegan



Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Tatsuo Ishii
 On Wed, Sep 17, 2014 at 9:06 PM, Oleg Bartunov obartu...@gmail.com wrote:
 We use ICU with postgres for many years in our  mchar extension, which
 provides  case-insensitive  text data type for popular  russian financial
 system.  I don't know if we may ask ICU to give us special BSD-compatible
 license ?
 
 I don't think that's necessary. Firebird uses ICU, and has similar
 licensing terms to PostgreSQL.

http://source.icu-project.org/repos/icu/icu/trunk/license.html


 Permission is hereby granted, free of charge, to any person obtaining
 a copy of this software and associated documentation files (the
 "Software"), to deal in the Software without restriction, including
 without limitation the rights to use, copy, modify, merge, publish,
 distribute, and/or sell copies of the Software, and to permit persons
 to whom the Software is furnished to do so, provided that the above
 copyright notice(s) and this permission notice appear in all copies
 of the Software and that both the above copyright notice(s) and this
 permission notice appear in supporting documentation.


In my understanding, PostgreSQL's manual MUST include the ICU license
terms (this is not a problem).  What I am not so sure about is whether any
software that uses PostgreSQL MUST also include the ICU license or not.
If yes, I think this is surely a problem.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations and Replication; Next Steps

2014-09-17 Thread Peter Geoghegan
On Wed, Sep 17, 2014 at 9:35 PM, Tatsuo Ishii is...@postgresql.org wrote:
 In my understanding PostgreSQL's manual MUST include the ICU license
 term (this is not a problem).  What I am not so sure is, any software
 uses PostgreSQL also MUST include the ICU license or not. If yes, I
 think this is surely a problem.


It uses the MIT license, which is less restrictive than even the BSD
license. I believe one part (the Chinese/Japanese Word Break
Dictionary Data) is BSD Licensed, though:

https://ssl.icu-project.org/repos/icu/icu/trunk/license.html

I don't think licensing would be a problem.
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers