Re: [HACKERS] [PATCH] backend: compare word-at-a-time in bcTruelen

2009-06-26 Thread Dimitri Fontaine

Hi,

On 26 June 2009 at 05:20, Jeremy Kerr wrote:

Unfortunately, the cases with lots of padding spaces are probably
much less probable than the cases with fewer.  It would be unpleasant
for example if this patch resulted in a severe performance
degradation for a canonical example of char(n) being used properly,
such as char(2) for US state abbreviations.


Yep, makes sense. The other consideration is stock-ticker symbols, I
assume they may also be stored in CHAR(small n) columns.



Could this optimisation only kick in when n is big enough?
I don't know if this part of the code knows the typmod...

Regards,
--
dim


[HACKERS] Query progress indication - an implementation

2009-06-26 Thread Scara Maccai
Hi all,

following the link in 

http://wiki.postgresql.org/wiki/Query_progress_indication

but mostly:

http://www.postech.ac.kr/~swhwang/progress2.pdf [1]

I'm trying to write an implementation of the dne method in PostgreSQL.

I added another column to the pg_stat_get_activity function to report the 
percentage of work done for the query (of course, any other method could be 
used... the way the percentage is reported to the user can be easily changed).

I attached a first patch (just to see if anyone is interested, the work is by 
no means finished).

I guess I made a lot of mistakes, since I don't know anything about the 
PostgreSQL code...

1) the progress indicator can be disabled at runtime; this could be done with 
another runtime flag (at the moment it is always on)

2) I added a new structure (Progress) to PlanState to keep all the info about 
execution progress

3) I needed a pointer to the root of the PlanStates, to be able to calculate 
the total progress of the query tree (I bet this pointer was already available 
somewhere, but I couldn't find where...)

4) sub-plans are not included yet (well, to be honest, I don't really know 
what PostgreSQL means by those... :) )

5) the percentage is updated at most once per second (this can be easily changed)

6) the methods to adjust upper/lower bounds in [1] are not implemented yet (but 
that shouldn't be a problem)

7) the spilled tuples handling in [1] is not supported yet

8) only hash join, nested-loop join, aggregate, and sequential scans are 
implemented at the moment

9) I added another flag (EXEC_FLAG_DRIVER_BRANCH) in executor.h to signal to 
the sub-nodes whether they are part of a branch that will contain a driver node 
(for example, the inner subtree of a Nested Loop join is not a driver branch). 
I guess this could be done better at the Plan level (instead of PlanState), but 
this way less code has to be changed

10) at the moment all driver nodes have the same work_per_tuple = 1, but this 
could be changed (for example, CPU-intensive driver nodes could have a smaller 
work_per_tuple value); a rough sketch of the resulting computation follows below
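
In rough pseudo-C, the reported figure amounts to the following (the field 
names mirror the attached patch; progress_walk itself is hypothetical, since 
the attached ProgressUpdate is truncated):

/* Sketch: sum expected vs. completed work over all driver nodes. */
static void
progress_walk(PlanState *node, double *expected, double *done)
{
	if (node == NULL)
		return;
	if (node->progress && node->progress->is_driver_node)
	{
		*expected += node->progress->lower_bound * node->progress->work_per_tuple;
		*done += node->progress->operations_so_far * node->progress->work_per_tuple;
	}
	progress_walk(node->lefttree, expected, done);
	progress_walk(node->righttree, expected, done);
}

/* ... then, at most once per second: */
if (expected > 0)
	pgstat_report_progress_percentage(done * 100.0 / expected);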

Well, some (very early) tests on a TPC-D database showed it works as expected 
(though I only ran a very few tests...)

Hope someone is interested


  Index: src/include/pgstat.h
===
RCS file: /projects/cvsroot/pgsql/src/include/pgstat.h,v
retrieving revision 1.83
diff -r1.83 pgstat.h
568a569,571
> 
> 	/* current percentage of progress */
> 	float		st_progress_perc;
646a650,651
> extern void pgstat_report_progress_percentage(double perc);
> 
Index: src/backend/executor/Makefile
===
RCS file: /projects/cvsroot/pgsql/src/backend/executor/Makefile,v
retrieving revision 1.29
diff -r1.29 Makefile
25c25
< 	nodeWindowAgg.o tstoreReceiver.o spi.o
---
> 	nodeWindowAgg.o tstoreReceiver.o spi.o progress.o
Index: src/backend/executor/execProcnode.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/executor/execProcnode.c,v
retrieving revision 1.65
diff -r1.65 execProcnode.c
109a110,111
> #include "executor/progress.h"
> #include "pgstat.h"
111a114
> void ProgressUpdate(PlanState *node, double *tot_operations_expected, double *tot_operations_so_far);
132a136
> 	bool		is_driver_node_candidate = false;
175a180,181
> 	is_driver_node_candidate = true;
> 
261a268
> 	is_driver_node_candidate = true;
314a322,325
> 	/* Set up progress info for this node if requested */
> 	if (result->state->es_progress)
> 		ProgressSetInfo(result, node, eflags, is_driver_node_candidate);
> 
328a340,343
> 	struct timeval t;
> 	double		tot_operations_expected = 0;
> 	double		tot_operations_so_far = 0;
> 
462a478,493
> 	/* progress calcs (only if required) */
> 	if (node->state->es_progress && node->progress != NULL &&
> 		node->progress->is_driver_node)
> 	{
> 		node->progress->operations_so_far++;
> 		gettimeofday(&t, NULL);
> 		if (t.tv_sec > node->state->es_progress_last_update.tv_sec)
> 		{
> 			ProgressUpdate(node->state->es_root_planstate,
> 						   &tot_operations_expected, &tot_operations_so_far);
> 			if (tot_operations_expected != 0)
> 			{
> 				pgstat_report_progress_percentage(tot_operations_so_far * 100 / tot_operations_expected);
> 				node->state->es_progress_last_update = t;
> 			}
> 		}
> 	}
> 
466a498,524
> void
> ProgressUpdate(PlanState *node, double *tot_operations_expected, double *tot_operations_so_far)
> {
> 	/* TODO: a switch (nodeTag(node)) is needed here in case we want upper/lower limit updates */
> 	if (node->progress->is_driver_node)
> 	{
> 		*tot_operations_expected += node->progress->lower_bound;
> 		*tot_operations_so_far += 

[HACKERS] Proposal: More portable way to support 64bit platforms

2009-06-26 Thread Tsutomu Yamada
Proposal: More portable way to support 64bit platforms

Short description:

The current PostgreSQL implementation has some portability issues in
supporting 64bit platforms: pointer calculations using long are not
portable, for example on the Windows x64 platform. We propose to use
intptr_t instead of long, which appears in C99.

Details: intptr_t is defined in stdint.h. The configure script already
checks for HAVE_STDINT_H but never uses it. This needs to be enabled.

Please note that Windows/VC++ defines intptr_t in crtdefs.h.

Included is a conceptual patch to use intptr_t. Comments are welcome.

Some notes for the patches:

access/common/heaptuple.c:
Casting using (long) is removed. It is no more necessary if we
introduce intptr_t.

include/c.h:
Many alignment macros that use long are rewritten to use intptr_t.
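
For instance, a TYPEALIGN-style macro changes along these lines (a sketch of
the idea, not the exact patch text):

/* before: breaks on LLP64 platforms, where long is 32 bits but pointers are 64 */
#define TYPEALIGN(ALIGNVAL,LEN)  \
	(((long) (LEN) + ((ALIGNVAL) - 1)) & ~((long) ((ALIGNVAL) - 1)))

/* after: intptr_t is guaranteed wide enough to hold a pointer */
#define TYPEALIGN(ALIGNVAL,LEN)  \
	(((intptr_t) (LEN) + ((ALIGNVAL) - 1)) & ~((intptr_t) ((ALIGNVAL) - 1)))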

The patches are against PostgreSQL 8.4beta2. The regression tests
passed. Windows x64 is OK even with shared_buffers = 3000MB.

Tested platforms are as follows:

Windows Server 2008 SP1 x64+ Visual Studio 2005 
RHEL 4  x86_64 + gcc 3.4.6
FreeBSD 7.1 i386   + gcc 4.2.1

TODO:
Some problems may occur on older platforms which do not have
stdint.h. In that case we need to add something like the following to
include/port/*.h.

/* LP64, IPL64, ILP32, LP32 */
typedef  long intptr_t;
typedef unsigned long uintptr_t;

/* LLP64 */
typedef  long long intptr_t;
typedef unsigned long long uintptr_t; 


Thanks,

-- 
Tsutomu Yamada // tsut...@sraoss.co.jp
SRA OSS, Inc. Japan

diff -cwbr postgresql-8.4beta2-orig/src/backend/access/common/heaptuple.c 
postgresql-8.4beta2-winx64/src/backend/access/common/heaptuple.c
*** postgresql-8.4beta2-orig/src/backend/access/common/heaptuple.c  
2009-03-30 13:08:43.0 +0900
--- postgresql-8.4beta2-winx64/src/backend/access/common/heaptuple.c
2009-06-19 16:26:01.0 +0900
***************
*** 192,198 ****
  		if (att[i]->attbyval)
  		{
  			/* pass-by-value */
! 			data = (char *) att_align_nominal((long) data,
  											  att[i]->attalign);
  			store_att_byval(data, values[i], att[i]->attlen);
  			data_length = att[i]->attlen;
  		}
--- 192,198 ----
  		if (att[i]->attbyval)
  		{
  			/* pass-by-value */
! 			data = (char *) att_align_nominal(data, att[i]->attalign);
  			store_att_byval(data, values[i], att[i]->attlen);
  			data_length = att[i]->attlen;
  		}
***************
*** 226,232 ****
  		else
  		{
  			/* full 4-byte header varlena */
! 			data = (char *) att_align_nominal((long) data,
  											  att[i]->attalign);
  			data_length = VARSIZE(val);
  			memcpy(data, val, data_length);
--- 226,232 ----
  		else
  		{
  			/* full 4-byte header varlena */
! 			data = (char *) att_align_nominal(data,
  											  att[i]->attalign);
  			data_length = VARSIZE(val);
  			memcpy(data, val, data_length);
***************
*** 243,249 ****
  		else
  		{
  			/* fixed-length pass-by-reference */
! 			data = (char *) att_align_nominal((long) data,
  											  att[i]->attalign);
  			Assert(att[i]->attlen > 0);
  			data_length = att[i]->attlen;
  			memcpy(data, DatumGetPointer(values[i]), data_length);
--- 243,249 ----
  		else
  		{
  			/* fixed-length pass-by-reference */
! 			data = (char *) att_align_nominal(data, att[i]->attalign);
  			Assert(att[i]->attlen > 0);
  			data_length = att[i]->attlen;
  			memcpy(data, DatumGetPointer(values[i]), data_length);
diff -cwbr postgresql-8.4beta2-orig/src/backend/access/hash/hashfunc.c 
postgresql-8.4beta2-winx64/src/backend/access/hash/hashfunc.c
*** postgresql-8.4beta2-orig/src/backend/access/hash/hashfunc.c 2009-02-10 
06:18:28.0 +0900
--- postgresql-8.4beta2-winx64/src/backend/access/hash/hashfunc.c   
2009-06-18 22:37:46.0 +0900
***************
*** 319,325 ****
  	a = b = c = 0x9e3779b9 + len + 3923095;

  	/* If the source pointer is word-aligned, we use word-wide fetches */
! 	if (((long) k & UINT32_ALIGN_MASK) == 0)
  	{
  		/* Code path for aligned source data */
  		register const uint32 *ka = (const uint32 *) k;
--- 319,325 ----
  	a

Re: [HACKERS] [PATCH] backend: compare word-at-a-time in bcTruelen

2009-06-26 Thread Stephen Frost
* Dimitri Fontaine (dfonta...@hi-media.com) wrote:
 On 26 June 2009 at 05:20, Jeremy Kerr wrote:
 Unfortunately, the cases with lots of padding spaces are probably
 much less probable than the cases with fewer.  It would be unpleasant
 for example if this patch resulted in a severe performance
 degradation for a canonical example of char(n) being used properly,
 such as char(2) for US state abbreviations.

 Yep, makes sense. The other consideration is stock-ticker symbols, I
 assume they may also be stored in CHAR(small n) columns.

 Could this optimisation only kicks in when n is big enough?
 I'm don't know if this part of the code knows the typmod...

Is it just the size that matters, or is it when there are few spaces at
the end?  We do know the overall length, but I didn't see a definitive
answer on whether, if it's larger than X words, doing the by-word
comparison is a win regardless of how many actual spaces are at the end
(apologies to Jeremy if it's in his more detailed report, I haven't had
a chance to look yet).

Thanks,

Stephen




Re: [HACKERS] [PATCH] backend: compare word-at-a-time in bcTruelen

2009-06-26 Thread Jeremy Kerr
Stephen,

 Is it just the size that matters, or is it when there are few spaces
 at the end?

It's the number of spaces at the end. If we knew this number, then we 
wouldn't have to do any comparisons at all :)

Cheers,


Jeremy



Re: [HACKERS] [PATCH] backend: compare word-at-a-time in bcTruelen

2009-06-26 Thread Stephen Frost
* Jeremy Kerr (j...@ozlabs.org) wrote:
  Is it just the size that matters, or is it when there are few spaces
  at the end?
 
 It's the number of spaces at the end. If we knew this number, then we 
 wouldn't have to do any comparisons at all :)

I meant in terms of affecting the performance of this function...  We
know the total length of the string, including spaces, coming into the
function.  If the updated function is always faster when the overall
string is at least, say, 16 characters long, then Dimitri's suggestion
to just check the overall length coming in and decide which approach to
use might make sense.  If the new function is always slower, regardless
of overall string length, when there's only 1 extra space at the end,
then chances are it's not worth it.

Thanks,

Stephen




Re: [HACKERS] [PATCH] backend: compare word-at-a-time in bcTruelen

2009-06-26 Thread Jeremy Kerr
Stephen,

 If the updated function is always faster when the overall string is at
 least, say, 16 characters long,

But that's not the case - the cost of the function (and the speedup from 
the previous version) depends on the number of spaces there are at 
the end.

For the new function to be faster, we need to know that there are more 
than 6 (on average, depending on alignment) trailing spaces. The number 
of non-space characters in the string won't affect performance (other 
than that there is probably a correlation between total string length 
and number of spaces, but we can't be certain of that).

 If the new function is always slower, regardless of overall string
 length, when there's only 1 extra space at the end

Yes, that is correct.

Cheers,


Jeremy




[HACKERS] gettext version problem exposed by buildfarm failures on canary

2009-06-26 Thread Tom Lane
I see buildfarm member canary (NetBSD 1.6) was recently resurrected,
but is failing to build HEAD, because msgfmt fails like this:

po/es.po:8039: keyword msgid_plural unknown
po/es.po:8036: missing `msgstr' section
po/es.po:8039: parse error

This indicates (1) canary needs a newer version of gettext, and
(2) our configure test for a sufficiently new gettext installation
is not in fact correct, because it didn't complain.

I asked about the configure issue some time ago, and was told that
the existing test for bind_textdomain_codeset() in libintl was
sufficient because that was introduced after ngettext().  It could
be that the library's history is different on NetBSD, or perhaps
canary has a recent libintl.so and a not so recent msgfmt.

I doubt it's worth trying to fix this before 8.4 release, but it
should get looked into eventually.

regards, tom lane



Re: [HACKERS] [PATCH] backend: compare word-at-a-time in bcTruelen

2009-06-26 Thread Tom Lane
Jeremy Kerr j...@ozlabs.org writes:
 Stephen,
 If the updated function is always faster when the overall string is at
 least, say, 16 characters long,

 But that's not the case - the cost of the function (and the speedup from 
 the previous version) depends on the number of spaces that there are at 
 the end.

Right, but there are certainly not more spaces than there are string
characters ;-)

I think Dimitri's idea is eminently worth trying.  In a string of less
than, say, 16 bytes, the prospects of being able to win anything get
much smaller compared to the prospects of wasting the extra loop
overhead.  There is also a DBA psychology angle to it.  If you've got
CHAR(n) for very small n, it's likely that the type is being used in the
canonical fashion and there won't be many trailing blanks.  The case
where we can hope to win is where we have CHAR(255) or some other
plucked-from-the-air limit.
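
For concreteness, here is a sketch of what bcTruelen could look like with such
a cutoff (the threshold constant, the helper's shape, and the word loop are
illustrative only; this is not Jeremy's patch):

#include <stdint.h>
#include <string.h>

#define BCTRUELEN_THRESHOLD 16			/* hypothetical cutoff */

/* Sketch: length of s (len bytes) with trailing spaces stripped. */
static int
bcTruelen_sketch(const char *s, int len)
{
	int			i = len - 1;

	if (len >= BCTRUELEN_THRESHOLD)
	{
		/* Step back byte-wise until a word-aligned read becomes possible. */
		while (i >= 0 && ((uintptr_t) (s + i + 1) % sizeof(long)) != 0)
		{
			if (s[i] != ' ')
				return i + 1;
			i--;
		}
		/* Compare one word at a time against all-spaces. */
		while (i >= (int) sizeof(long) - 1)
		{
			unsigned long word;

			memcpy(&word, s + i + 1 - sizeof(long), sizeof(long));
			/* the constant truncates to 0x20202020 where long is 32 bits */
			if (word != (unsigned long) 0x2020202020202020ULL)
				break;
			i -= (int) sizeof(long);
		}
	}

	/* Byte-at-a-time for the remainder (and for short strings). */
	for (; i >= 0; i--)
	{
		if (s[i] != ' ')
			break;
	}
	return i + 1;
}

The point of the cutoff is exactly the one above: for small n the alignment
and setup work can only lose, so the old byte loop remains the short-string
path.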

regards, tom lane



Re: [HACKERS] query cancel issues in contrib/dblink

2009-06-26 Thread Merlin Moncure
On Thu, Jun 25, 2009 at 10:41 PM, Itagaki
Takahiroitagaki.takah...@oss.ntt.co.jp wrote:
 Hi,

 contrib/dblink seems to have no handling for query cancels.
 It causes the following issues:

 (1) Users need to wait for completion of the remote query.
     Requests for query cancel won't be delivered to remote servers.

 (2) PGresult objects will be leaked. The result is not released
     when the query is cancelled; it is released only when the dblink
     function is called max_calls times.

 They are long-standing issues (not only in 8.4),
 but I hope we will fix them to make dblink more robust.

 For (1), asynchronous libpq functions should be used instead of the blocking
 ones, waiting for the remote query in a loop with CHECK_FOR_INTERRUPTS().

How would you structure this loop exactly?
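
One possible shape for it, as a minimal sketch using only standard libpq
calls (the function name is made up, and error handling and PQcancel()
forwarding are omitted):

/*
 * Sketch only: wait for a remote query without blocking out local cancel
 * requests.  Needs libpq-fe.h, miscadmin.h (for CHECK_FOR_INTERRUPTS) and
 * sys/select.h.
 */
static PGresult *
dblink_get_result_interruptible(PGconn *conn, const char *sql)
{
	PGresult   *res = NULL;

	if (!PQsendQuery(conn, sql))
		return NULL;

	while (PQisBusy(conn))
	{
		fd_set		mask;
		struct timeval tv = {1, 0};	/* wake up at least once a second */

		CHECK_FOR_INTERRUPTS();		/* lets a local cancel escape the wait */

		FD_ZERO(&mask);
		FD_SET(PQsocket(conn), &mask);
		if (select(PQsocket(conn) + 1, &mask, NULL, NULL, &tv) > 0 &&
			!PQconsumeInput(conn))
			break;					/* connection trouble */
	}

	/* Drain all results; keep the last non-NULL one. */
	for (;;)
	{
		PGresult   *r = PQgetResult(conn);

		if (r == NULL)
			break;
		PQclear(res);
		res = r;
	}
	return res;
}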

merlin



Re: [HACKERS] Proposal: More portable way to support 64bit platforms

2009-06-26 Thread Peter Eisentraut
On Friday 26 June 2009 12:07:24 Tsutomu Yamada wrote:
 Proposal: More portable way to support 64bit platforms

 Short description:

 The current PostgreSQL implementation has some portability issues in
 supporting 64bit platforms: pointer calculations using long are not
 portable, for example on the Windows x64 platform. We propose to use
 intptr_t instead of long, which appears in C99.

This makes sense.  You can also review the archives for previous iterations of 
this discussion (search for intptr_t).

You might want to add your patch to the next commit fest.




Re: [HACKERS] gettext version problem exposed by buildfarm failures on canary

2009-06-26 Thread Andrew Dunstan
On Fri, June 26, 2009 9:34 am, Tom Lane wrote:
 I see buildfarm member canary (NetBSD 1.6) was recently resurrected,
 but is failing to build HEAD, because msgfmt fails like this:

 po/es.po:8039: keyword msgid_plural unknown
 po/es.po:8036: missing `msgstr' section
 po/es.po:8039: parse error

 This indicates (1) canary needs a newer version of gettext, and
 (2) our configure test for a sufficiently new gettext installation
 is not in fact correct, because it didn't complain.

 I asked about the configure issue some time ago, and was told that
 the existing test for bind_textdomain_codeset() in libintl was
 sufficient because that was introduced after ngettext().  It could
 be that the library's history is different on NetBSD, or perhaps
 canary has a recent libintl.so and a not so recent msgfmt.

 I doubt it's worth trying to fix this before 8.4 release, but it
 should get looked into eventually.



yes ... but presumably we don't want canary upgraded until we fix this,
unless we get some other test platform for it.

cheers

andrew




Re: [HACKERS] gettext version problem exposed by buildfarm failures on canary

2009-06-26 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On Fri, June 26, 2009 9:34 am, Tom Lane wrote:
 I doubt it's worth trying to fix this before 8.4 release, but it
 should get looked into eventually.

 yes ... but presumably we don't want canary upgraded until we fix this,
 unless we get some other test platform for it.

What I'd suggest Kris do is temporarily disable the NLS option on that
build, so we can get at least some useful testing from the machine.
After a proposed configure fix has been installed, re-enable NLS, verify
configure fails as expected, *then* upgrade msgfmt.

regards, tom lane



Re: [HACKERS] [PATCH] backend: compare word-at-a-time in bcTruelen

2009-06-26 Thread Dimitri Fontaine

On 26 June 2009 at 14:47, Jeremy Kerr wrote:

For the new function to be faster, we need to know that there are more
than 6 (on average, depending on alignment) trailing spaces.


It's becoming somewhat tricky, but maybe the test to do for the
optimisation to get used is n >= threshold && str[n-6] == 0x20, à la
Boyer/Moore?


I call it tricky because you could have a space here which isn't  
followed by spaces, but still, if it's not a space here, you're saying  
we should not even try the optimisation.
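
In code, the proposed gate would be something like this (the threshold name
is hypothetical):

if (n >= BCTRUELEN_THRESHOLD && str[n - 6] == ' ')
{
	/* ... worth attempting the word-at-a-time scan ... */
}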


--
dim


Re: [HACKERS] [PATCH] backend: compare word-at-a-time in bcTruelen

2009-06-26 Thread tomas

On Fri, Jun 26, 2009 at 05:03:11PM +0200, Dimitri Fontaine wrote:
 On 26 June 2009 at 14:47, Jeremy Kerr wrote:
 For the new function to be faster, we need to know that there are more
 than 6 (on average, depending on alignment) trailing spaces.

 It's becoming somewhat tricky, but maybe the test to do for the 
 optimisation to get used is n >= threshold && str[n-6] == 0x20, à la 
 Boyer/Moore?

That's cute. What about comparing the last aligned word which completely
fits in the buffer? Something along the lines of (assuming four-byte
words)

   * (int *) (4 * (((int) &buf[0]) / 4))

(now that's an ugly one, but you know what I mean?)

Regards
- -- tomás



Re: [HACKERS] [PATCH] backend: compare word-at-a-time in bcTruelen

2009-06-26 Thread Tom Lane
to...@tuxteam.de writes:
 On Fri, Jun 26, 2009 at 05:03:11PM +0200, Dimitri Fontaine wrote:
 It's becoming somewhat tricky, but maybe the test to do for the 
 optimisation to get used is n >= threshold && str[n-6] == 0x20, à la 
 Boyer/Moore?

 That's cute. What about comparing the last aligned word which completely
 fits in the buffer? Something along the lines of (assuming four-byte
 words)
 * (int *) (4 * (((int) &buf[0]) / 4))

We're trying to avoid adding cycles to the optimization-is-useless case.
The more expensive this test gets, the slower the unoptimizable case
becomes.

regards, tom lane



Re: [HACKERS] [PATCH] backend: compare word-at-a-time in bcTruelen

2009-06-26 Thread Andrew Dunstan
On Fri, June 26, 2009 11:39 am, Tom Lane wrote:
 to...@tuxteam.de writes:
 On Fri, Jun 26, 2009 at 05:03:11PM +0200, Dimitri Fontaine wrote:
 It's becoming somewhat tricky, but maybe the test to do for the
 optimisation to get used is n >= threshold && str[n-6] == 0x20, à la
 Boyer/Moore?

 That's cute. What about comparing the last aligned word which completely
 fits in the buffer? Something along the lines of (assuming four-byte
 words)
 * (int *) (4 * (((int) &buf[0]) / 4))

 We're trying to avoid adding cycles to the optimization-is-useless case.
 The more expensive this test gets, the slower the unoptimizable case
 becomes.


Yeah. Like you, I like the idea of a switch based on string length. I
would suggest a cutoff of something like 36 (length of the string
representation of a UUID). But maybe that will miss lots of optimisable
cases like address fields. I guess those people should really be using
varchar(n) anyway.

cheers

andrew





Re: [HACKERS] gettext version problem exposed by buildfarm failures on canary

2009-06-26 Thread Kris Jurka

Tom Lane wrote:

I see buildfarm member canary (NetBSD 1.6) was recently resurrected,
but is failing to build HEAD, because msgfmt fails like this:

po/es.po:8039: keyword msgid_plural unknown
po/es.po:8036: missing `msgstr' section
po/es.po:8039: parse error

This indicates (1) canary needs a newer version of gettext, and
(2) our configure test for a sufficiently new gettext installation
is not in fact correct, because it didn't complain.


NetBSD has an odd mix of a BSD libintl and GNU msgfmt, so the library 
provides both ngettext and bind_textdomain_codeset, but that has no 
implications about what msgfmt supports.


pgf...@netbsd:~$ msgfmt --version
msgfmt (GNU gettext) 0.10.35

Also, the man page for libintl says, "bind_textdomain_codeset() does not 
work at this moment (always fail)."


So perhaps this platform is just a lost cause.



Re: [HACKERS] gettext version problem exposed by buildfarm failures on canary

2009-06-26 Thread Kris Jurka

Tom Lane wrote:


What I'd suggest Kris do is temporarily disable the NLS option on that
build, so we can get at least some useful testing from the machine.
After a proposed configure fix has been installed, re-enable NLS, verify
configure fails as expected, *then* upgrade msgfmt.



I disabled NLS, got a clean test run, and have now re-enabled NLS so 
that people can test things if they care.


Kris Jurka



[HACKERS] Join optimization for inheritance tables

2009-06-26 Thread Nedyalko Borisov

Hi all,

We are working with Aster for the summer and we would like to bounce around 
some ideas we have for possible PostgreSQL extensions. In order to describe 
our ideas we will use the following example:

create table msg(
  msg_id int,
  msg text );

create table receiver(
  msg_id int,
  user_id int,
  ts timestamp );


create table msg_100(
  check ( 1 <= msg_id and msg_id < 100 )
  ) inherits (msg);

create table msg_200(
  check ( 100 <= msg_id and msg_id < 200 )
  ) inherits (msg);

create table msg_300(
  check ( 200 <= msg_id and msg_id < 300 )
  ) inherits (msg);


create table receiver_100(
  check ( 1 <= msg_id and msg_id < 100 )
  ) inherits (receiver);

create table receiver_200(
  check ( 100 <= msg_id and msg_id < 200 )
  ) inherits (receiver);

create table receiver_300(
  check ( 200 <= msg_id and msg_id < 300 )
  ) inherits (receiver);


When we issue queries on one of the parent tables, like,

   SELECT * FROM msg WHERE msg_id BETWEEN 50 AND 70;

PostgreSQL is smart enough to filter out child tables with check constraints 
that are refuted by the filter conditions. In this example, the optimizer will 
pick a plan that only considers the parent table 'msg' and one of the child 
tables 'msg_100':

   Result
     ->  Append
           ->  Seq Scan on msg
                 Filter: ((msg_id >= 50) AND (msg_id <= 70))
           ->  Seq Scan on msg_100 msg
                 Filter: ((msg_id >= 50) AND (msg_id <= 70))

Plan costs are removed for simplicity of the presentation.

Now, if we issue a join query between the two parent tables, like,

   SELECT * FROM msg m JOIN receiver r ON m.msg_id = r.msg_id;

the execution plan will be:

   Merge Join
     Merge Cond: (m.msg_id = r.msg_id)
     ->  Sort
           Sort Key: m.msg_id
           ->  Append
                 ->  Seq Scan on msg m
                 ->  Seq Scan on msg_100 m
                 ->  Seq Scan on msg_200 m
                 ->  Seq Scan on msg_300 m
     ->  Sort
           Sort Key: r.msg_id
           ->  Append
                 ->  Seq Scan on receiver r
                 ->  Seq Scan on receiver_100 r
                 ->  Seq Scan on receiver_200 r
                 ->  Seq Scan on receiver_300 r


During the planning phase, the optimizer treats an entire hierarchy as a single entity. 
Hence, it first considers the most efficient way to create the append paths for the two 
hierarchies, and then the best way to join them. However, there are some optimizations 
that are possible here, similar to the table filtering described above. In particular, 
instead of joining the two appends, we could push down the join to the child 
relations - that is, create pairwise joins between the children and then append the join 
results together.

Based on the check conditions of the children and the join predicate, it is 
possible to filter out joins that cannot produce any results. For example, 
joining 'msg_100' with 'receiver_300' is redundant since the check constraints 
of these two tables do not overlap. Tuples in 'msg_100' have 'msg_id' between 1 
and 100, whereas tuples in 'receiver_300' have 'msg_id' between 200 and 300. 
Therefore, no tuples can be produced by this join.
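
Spelled out by hand, the surviving child joins correspond to a rewrite along 
these lines (a sketch; the mixed parent/child pairs, which cannot be pruned, 
are omitted for brevity):

SELECT * FROM ONLY msg m JOIN ONLY receiver r ON m.msg_id = r.msg_id
UNION ALL
SELECT * FROM msg_100 m JOIN receiver_100 r ON m.msg_id = r.msg_id
UNION ALL
SELECT * FROM msg_200 m JOIN receiver_200 r ON m.msg_id = r.msg_id
UNION ALL
SELECT * FROM msg_300 m JOIN receiver_300 r ON m.msg_id = r.msg_id;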

A plan with such optimizations could be:

 Result
   ->  Append
         ->  Hash Join
               Hash Cond: (msg.msg_id = receiver.msg_id)
               ->  Seq Scan on msg msg
               ->  Hash
                     ->  Seq Scan on receiver receiver
         ->  Hash Join
               Hash Cond: (msg.msg_id = receiver.msg_id)
               ->  Seq Scan on msg msg
               ->  Hash
                     ->  Seq Scan on receiver_100 receiver
         ->  Hash Join
               Hash Cond: (msg.msg_id = receiver.msg_id)
               ->  Seq Scan on msg msg
               ->  Hash
                     ->  Seq Scan on receiver_200 receiver
         ->  Hash Join
               Hash Cond: (msg.msg_id = receiver.msg_id)
               ->  Seq Scan on msg msg
               ->  Hash
                     ->  Seq Scan on receiver_300 receiver
         ->  Hash Join
               Hash Cond: (msg.msg_id = receiver.msg_id)
               ->  Seq Scan on msg_100 msg
               ->  Hash
                     ->  Seq Scan on receiver receiver
         ->  Hash Join
               Hash Cond: (msg.msg_id = receiver.msg_id)
               ->  Seq Scan on msg_100 msg
               ->  Hash
                     ->  Seq Scan on receiver_100 receiver
         ->  Hash Join
               Hash Cond: (msg.msg_id = receiver.msg_id)
               ->  Seq Scan on msg_200 msg
               ->  Hash
                     ->  Seq Scan on receiver receiver
         ->  Hash Join
               Hash Cond: (msg.msg_id = receiver.msg_id)
               ->  Seq Scan on msg_200 msg
               ->  Hash
                     ->  Seq Scan on receiver_200 receiver
         ->  Hash Join
               Hash Cond: (msg.msg_id = receiver.msg_id)
               ->  Seq Scan on msg_300 msg
               ->  Hash
                     ->  Seq Scan on 

Re: [HACKERS] 8.4 open item: copy performance regression?

2009-06-26 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: 
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 The checkpoint_segments difference seems dramatic enough to be real.  I wonder
 if the test is short enough that it never got around to re-using
 any of them, so it was doing extra writes for the initial creation
 during the test?
 
 That's exactly what I was about to suggest.  Are you starting each
 run from a fresh initdb?  If so, try running the load long enough
 that the number of WAL files stabilizes (should happen at 2x
 checkpoint_segments) and then start the test measurement.
 
default conf (xlogs not populated)
real3m49.604s
real3m47.225s
real3m45.831s

default conf (xlogs populated)
real3m45.603s
real3m45.284s
real3m45.906s

default conf + checkpoint_segments = 100 (xlogs not populated)
real4m27.629s
real4m24.496s
real4m22.832s

default conf + checkpoint_segments = 100 (xlogs populated)
real3m52.746s
real3m52.619s
real3m50.418s
 
I used ten times the number of rows, to get more meaningful results. 
To get the populated times, I just dropped the target table and
created it again; otherwise identical runs.  Clearly, pre-populating
the xlog files reduces run time, especially for a large number of xlog
files; however, I still got better performance with a smaller set of
xlog files.
 
Regarding the fact that even with the xlog files pre-populated, the
smaller set of xlog files is faster: I'm only guessing, but I suspect
the battery backed RAID controller is what's defeating conventional
wisdom here.  By writing to the same, relatively small, set of xlog
files repeatedly, some of the actual disk writes probably evaporate in
the BBU cache.  More frequent checkpoints from the smaller number of
xlog files might also have caused data to start streaming to the disk
a little sooner, minimizing write gluts later.
 
I've often seen similar benefits from the BBU cache, which cause some of
the frequently-given advice here to have no discernible effect or be
counter-productive in our environment.  (I know that some doubted that
my aggressive background writer settings didn't increase disk writes,
but I couldn't even measure a difference there in the writes from OS
cache to the controller cache, much less anything which indicated it
actually increased physical disk writes.)
 
By the way, the number of xlog files seemed to always go to two above
2x checkpoint_segments.
 
-Kevin



Re: [HACKERS] 8.4 open item: copy performance regression?

2009-06-26 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Regarding the fact that even with the xlog files pre-populated, the
 smaller set of xlog files is faster: I'm only guessing, but I suspect
 the battery backed RAID controller is what's defeating conventional
 wisdom here.  By writing to the same, relatively small, set of xlog
 files repeatedly, some of the actual disk writes probably evaporate in
 the BBU cache.

Yeah, sounds plausible.  How big is your BBU cache?

regards, tom lane



Re: [HACKERS] 8.4 open item: copy performance regression?

2009-06-26 Thread Andrew Dunstan
On Fri, June 26, 2009 4:13 pm, Kevin Grittner wrote:
 By the way, the number of xlog files seemed to always go to two above
 2x checkpoint_segments.


The docs say:

There will always be at least one WAL segment file, and will normally not
be more than (2 + checkpoint_completion_target) * checkpoint_segments + 1
files.
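
With checkpoint_segments = 100 and the 8.4 default
checkpoint_completion_target of 0.5, that bound works out to
(2 + 0.5) * 100 + 1 = 251 files, so the 2 * 100 + 2 = 202 files Kevin
observed is comfortably within it.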


cheers

andrew




Re: [HACKERS] 8.4 open item: copy performance regression?

2009-06-26 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: 
 How big is your BBU cache?
 
On this machine, I guess it is 512MB.  (Possibly 1GB, but I'm having
trouble finding the right incantation to check it at the moment, so
I'm going by what the hardware tech remembers.)
 
-Kevin



Re: [HACKERS] [PATCH] [v8.5] Security checks on largeobjects

2009-06-26 Thread Bernd Helmle



--On 26 June 2009 13:08:37 +0900 KaiGai Kohei kai...@ak.jp.nec.com wrote:


 * Is the named large object (including a fully qualified one) worthwhile?
   It would enable us to specify a large object with a human-readable
   identifier string.


I don't understand the notion of this. Does this mean you can create a LO 
with an identifier string, generated from (e.g.) your application?


--
Thanks

Bernd



Re: [HACKERS] Join optimization for inheritance tables

2009-06-26 Thread Tom Lane
Nedyalko Borisov nedya...@asterdata.com writes:
 In summary, we are making two suggestions:
 1. Extend the optimizer to consider joins between child tables when 
 hierarchies are joined together.

We already handle this for the case where the join is nestloop with
inner index scan, and I'm not convinced that there's any real gain to be
had for other join types.

 2. Add the Empty Check Constraint, which would enforce that a particular 
 table is to remain empty.

The trouble with that is that a constraint that doesn't propagate to its
child tables is a weird beast that I'd just as soon not invent.

We are currently thinking about inventing an explicit notion of
partitioned tables.  If we had that, it would be reasonable to have
a special kind of parent table for a partitioned set and refuse to
allow any data in that relation.  But I'm not excited about contorting
the general constraint mechanism in the way that would be necessary to
express this as a constraint.

regards, tom lane
