Re: [HACKERS] Updated documentation for new sequence binding

2005-10-03 Thread Michael Paesold

Bruce Momjian wrote:


pgman wrote:

I have merged Tom's description of the new sequence binding with text I
was working on.  I modified it to follow the existing "we used to do X,
now we do Y" pattern in the surrounding entries:

http://candle.pha.pa.us/tmp/pgsql/release.html#RELEASE-8-1


Sorry, this is a better URL:

http://candle.pha.pa.us/main/writings/pgsql/sgml/release.html#RELEASE-8-1


Btw. I think the header "Add proper sequence function dependencies for
DEFAULT" is in the wrong font, i.e. it's all monospace.


Best Regards
Michael Paesold 



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


[HACKERS] RPMs for RedHat ES3.0

2005-10-03 Thread Philip Yarra
Devrim, I had some problems installing on RedHat ES3.0 with the RPMs. This 
issue turned out to be that I needed the compat libs to provide the old libs 
before I could install the new libs.

I found a link to  
http://developer.PostgreSQL.org/~devrim/compat-postgresql-libs-3-3PGDG.i686.rpm 
in an archived post from -hackers. It wasn't all that obvious what was 
required though, so is it possible that this RPM could actually be put with 
the downloads for the main RPMs? I'm not sure anyone's installation would 
succeed without that RPM on RedHat ES3.0. Please let me know if I'm missing 
something.

As an aside: is it possible to actually just include the contents of 
compat-libs in postgresql-libs, since they actually seem to be required? It'd 
save me some hassles for future installations. Probably other folks too.

Regards, Philip.

Boring details:
1) removed all previous traces of postgresql:
[EMAIL PROTECTED] root]# rpm -e `rpm -qa | grep postgres` \
perl-DBD-Pg-1.21-2 mod_auth_pgsql-2.0.1-4.ent php-pgsql-4.3.2-19.ent

2) Grabbed 8.0.3 binaries from:
http://www.postgresql.org/ftp/binary/v8.0.3/linux/rpms/redhat/rhel-es-3.0/

3) Try to install:
[EMAIL PROTECTED] 8.x]# rpm -i postgresql-8.0.3-1PGDG.i686.rpm
error: Failed dependencies:
libpq.so.4 is needed by postgresql-8.0.3-1PGDG
[EMAIL PROTECTED] 8.x]# rpm -i postgresql-libs-8.0.3-1PGDG.i686.rpm
error: Failed dependencies:
libecpg.so.4 is needed by postgresql-libs-8.0.3-1PGDG
libpgtypes.so.1 is needed by postgresql-libs-8.0.3-1PGDG
libpq.so.3 is needed by postgresql-libs-8.0.3-1PGDG
Suggested resolutions:
rh-postgresql-libs-7.3.8-2.i386.rpm

4) Install 
http://developer.postgresql.org/~devrim/compat-postgresql-libs-3-3PGDG.i686.rpm 
, 
repeat step 3 and suddenly life gets a whole lot better :-)



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[HACKERS] Syntax of transaction identifiers

2005-10-03 Thread Peter Eisentraut
Why are the transaction identifiers used in the 2PC commands lexical 
strings rather than identifiers?  Shouldn't database objects be 
identifiers and strings be purely data objects?  Or is there some 
standard or precedent on that?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] Build Farm: thrush

2005-10-03 Thread Gaetano Mendola
Hi all,
I'm the administrator of that machine and PLCheck is failing.
Is there anything I can do to fix it ?


Regards
Gaetano Mendola

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Last call for back-branch fixes

2005-10-03 Thread Greg Sabino Mullane

Index: fe-exec.c
===
RCS file: /projects/cvsroot/pgsql/src/interfaces/libpq/fe-exec.c,v
retrieving revision 1.175
diff -c -r1.175 fe-exec.c
*** fe-exec.c   24 Sep 2005 17:53:28 -  1.175
--- fe-exec.c   3 Oct 2005 12:49:08 -
***
*** 2429,2435 
{
if (*vp < 0x20 || *vp > 0x7e)
{
!   (void) sprintf((char *) rp, "\\\\%03o", *vp);
rp += 5;
}
else if (*vp == '\'')
--- 2429,2435 
{
if (*vp < 0x20 || *vp > 0x7e)
{
!   (void) snprintf((char *) rp, 6, "\\\\%03o", *vp);
rp += 5;
}
else if (*vp == '\'')
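For context, here is a self-contained sketch of the bounded escaping the patch switches to. `escape_byte` and the buffer handling are illustrative stand-ins, not libpq's actual structure:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the patched code path: a non-printable
 * byte is emitted as a double backslash plus three octal digits
 * (5 output characters), so snprintf is given room for 5 chars + NUL.
 * Returns the number of characters written (0 if byte is printable). */
static int escape_byte(unsigned char vp, char *rp)
{
    if (vp < 0x20 || vp > 0x7e)
    {
        (void) snprintf(rp, 6, "\\\\%03o", vp);
        return 5;
    }
    return 0;
}
```

The output is identical to the sprintf version; the bound only guards against a miscounted buffer, which is why the change is questioned later in the thread as not strictly needed.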



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] PG Killed by OOM Condition

2005-10-03 Thread John Hansen
Good people,

Just had a thought!

Might it be worth while protecting the postmaster from an OOM Kill on
Linux by setting /proc/{pid}/oom_adj to -17 ?
(Described vaguely in mm/oom_kill.c)

Kind Regards,

John Hansen


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] PG Killed by OOM Condition

2005-10-03 Thread Martijn van Oosterhout
On Mon, Oct 03, 2005 at 11:03:06PM +1000, John Hansen wrote:
 Might it be worth while protecting the postmaster from an OOM Kill on
 Linux by setting /proc/{pid}/oom_adj to -17 ?
 (Described vaguely in mm/oom_kill.c)

Has it actually happened to you? PostgreSQL is pretty good about its
memory usage. Besides, it seems to me it should be a system
administrator decision.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.




Re: [HACKERS] Updated documentation for new sequence binding

2005-10-03 Thread Bruce Momjian

Thanks, fixed.

---

Michael Paesold wrote:
 Bruce Momjian wrote:
 
  pgman wrote:
  I have merged Tom's description of the new sequence binding with text I
  was working on.  I modified it to follow the existing "we used to do X,
  now we do Y" pattern in the surrounding entries:
 
  http://candle.pha.pa.us/tmp/pgsql/release.html#RELEASE-8-1
 
  Sorry, this is a better URL:
 
  http://candle.pha.pa.us/main/writings/pgsql/sgml/release.html#RELEASE-8-1
 
 Btw. I think the header "Add proper sequence function dependencies for
 DEFAULT" is in the wrong font, i.e. it's all monospace.
 
 Best Regards
 Michael Paesold 
 
 
 ---(end of broadcast)---
 TIP 4: Have you searched our list archives?
 
http://archives.postgresql.org
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Syntax of transaction identifiers

2005-10-03 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 Why are the transaction identifiers used in the 2PC commands lexical 
 strings rather than identifiers?  Shouldn't database objects be 
 identifiers and strings be purely data objects?  Or is there some 
 standard or precedent on that?

I proposed the same awhile back, but it was shot down --- there is
a standard out there somewhere, and it requires strings of 150 or so
characters for GIDs, so identifiers don't work.  See the archives.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Build Farm: thrush

2005-10-03 Thread Tom Lane
Gaetano Mendola [EMAIL PROTECTED] writes:
 I'm the administrator of that machine and PLCheck is failing.
 Is there anything I can do to fix it ?

What version of Python have you got on that thing?  It seems to be
emitting still another spelling of the encoding error message :-(

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Last call for back-branch fixes

2005-10-03 Thread Tom Lane
Greg Sabino Mullane [EMAIL PROTECTED] writes:
 ! (void) sprintf((char *) rp, "\\\\%03o", *vp);
 
 ! (void) snprintf((char *) rp, 6, "\\\\%03o", *vp);

Hardly looks like a critical fix, or even a needed one.  Why bother?

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] PG Killed by OOM Condition

2005-10-03 Thread Tom Lane
John Hansen [EMAIL PROTECTED] writes:
 Might it be worth while protecting the postmaster from an OOM Kill on
 Linux by setting /proc/{pid}/oom_adj to -17 ?
 (Described vaguely in mm/oom_kill.c)

(a) wouldn't that require root privilege?  (b) how would we determine
whether we are on a system to which this applies?  (c) is it actually
documented in a way that makes you think it'll be a permanently
supported feature (ie, somewhere outside the source code)?

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] External Sort timing debug statements

2005-10-03 Thread Jonah H. Harris
I'm not averse to it. I think it's a good option and I support trace_sort
(it really is more of a trace).

On 10/3/05, Tom Lane [EMAIL PROTECTED] wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  The following patch implements a fairly light set of timing statements
  aimed at understanding external sort performance. There is no attempt to
  alter the algorithms.

 What do people think about putting something like this into 8.1?
 Strictly speaking it's a new feature, but the patch seems pretty
 noninvasive, and we'd be much more likely to get data points if the
 code exists in the mainline release than if people have to patch
 their copies.

  postgres=# set debug_sort=true;

 I'm a bit inclined to call it trace_sort instead, and to document it
 under Developer Options.  Comments?

 regards, tom lane

-- 
Respectfully,

Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
http://www.enterprisedb.com/


Re: [HACKERS] External Sort timing debug statements

2005-10-03 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 The following patch implements a fairly light set of timing statements
 aimed at understanding external sort performance. There is no attempt to
 alter the algorithms.

What do people think about putting something like this into 8.1?
Strictly speaking it's a new feature, but the patch seems pretty
noninvasive, and we'd be much more likely to get data points if the
code exists in the mainline release than if people have to patch
their copies.

 postgres=# set debug_sort=true;

I'm a bit inclined to call it trace_sort instead, and to document it
under Developer Options.  Comments?

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] [HACKERS] External Sort timing debug statements

2005-10-03 Thread Dave Page
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Tom Lane
 Sent: 03 October 2005 15:37
 To: Simon Riggs
 Cc: pgsql-hackers@postgresql.org; pgsql-patches@postgresql.org
 Subject: Re: [PATCHES] [HACKERS] External Sort timing debug 
 statements 
 
 Simon Riggs [EMAIL PROTECTED] writes:
  The following patch implements a fairly light set of timing 
 statements
  aimed at understanding external sort performance. There is 
 no attempt to
  alter the algorithms.
 
 What do people think about putting something like this into 8.1?
 Strictly speaking it's a new feature, but the patch seems pretty
 noninvasive, and we'd be much more likely to get data points if the
 code exists in the mainline release than if people have to patch
 their copies.

Agreed.

  postgres=# set debug_sort=true;
 
 I'm a bit inclined to call it trace_sort instead, and to document it
 under Developer Options.  Comments?

Sounds reasonable to me.

Regards, Dave.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Build Farm: thrush

2005-10-03 Thread Gaetano Mendola
Tom Lane wrote:
 Gaetano Mendola [EMAIL PROTECTED] writes:
 I'm the administrator of that machine and PLCheck is failing.
 Is there anything I can do to fix it ?
 
 What version of Python have you got on that thing?  It seems to be
 emitting still another spelling of the encoding error message :-(

$ python -V
Python 2.2.3



Regards
Gaetano Mendola






---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [pgadmin-hackers] pgAdmin guru hints

2005-10-03 Thread Dave Page



Thanks - fix committed to svn.

Regards, Dave

  
  
  From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of
  Bastiaan Wakkie
  Sent: 03 October 2005 17:04
  To: Andreas Pflug
  Cc: PostgreSQL-development; pgadmin-hackers; pgsql-admin@postgresql.org
  Subject: Re: [pgadmin-hackers] pgAdmin guru hints

  I noticed a typo in hints/conn-hba.html. The second internal
  IP address missed a '.'

   You're invited to make your comments on the hints: are these correct,
   is there something missing or misleading?

  Cheers,

  -- 
  ^(B(astia{2}n)?)(\s)?(W(ak{2}ie)?)$



Re: [HACKERS] RPMs for RedHat ES3.0

2005-10-03 Thread Devrim GUNDUZ


Hi,

On Mon, 3 Oct 2005, Philip Yarra wrote:


Devrim, I had some problems installing on RedHat ES3.0 with the RPMs. This
issue turned out to be that I needed the compat libs to provide the old libs
before I could install the new libs.

I found a link to
http://developer.PostgreSQL.org/~devrim/compat-postgresql-libs-3-3PGDG.i686.rpm
in an archived post from -hackers. It wasn't all that obvious what was
required though, so is it possible that this RPM could actually be put with
the downloads for the main RPMs? I'm not sure anyone's installation would
succeed without that RPM on RedHat ES3.0. Please let me know if I'm missing
something.


We haven't discussed it before, but I think we can't put the RPM among 
PGDG RPMs. The main reason is that this is not a piece of software that's 
included within that release of PostgreSQL. Also, this compat RPM is not 
needed if you don't have software that needs it.



As an aside: is it possible to actually just include the contents of
compat-libs in postgresql-libs, since they actually seem to be required? It'd
save me some hassles for future installations. Probably other folks too.


As I wrote above, it does not seem possible.

Also remember that this will probably be 'solved' in RHEL 5 (and FC 5), so 
merging compat with the main libs package is not a good idea.


Regards,
--
Devrim GUNDUZ
Kivi Bilişim Teknolojileri - http://www.kivi.com.tr
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
  http://www.gunduz.org
---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] PG Killed by OOM Condition

2005-10-03 Thread John Hansen
Martijn van Oosterhout Wrote:
 
 Has it actually happened to you? PostgreSQL is pretty good 
 about its memory usage. Besides, it seems to me it should be 
 a system administrator decision.

No, Just came across this by chance, and thought it might be a good
idea.
Perhaps as a postgresql.conf setting.

... John


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] PG Killed by OOM Condition

2005-10-03 Thread John Hansen
Tom Lane Wrote:

 (a) wouldn't that require root privilege?  (b) how would we 
 determine whether we are on a system to which this applies?  
 (c) is it actually documented in a way that makes you think 
 it'll be a permanently supported feature (ie, somewhere 
 outside the source code)?

(a) No, /proc/{pid}/* is owned by the process
(b) /proc/{pid}/oom_adj exists ?
(c) No, from the source: (not docbooked, we don't want this one
cluttering up the manual)

... John

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] [HACKERS] External Sort timing debug statements

2005-10-03 Thread David Fetter
On Mon, Oct 03, 2005 at 10:36:57AM -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  The following patch implements a fairly light set of timing
  statements aimed at understanding external sort performance. There
  is no attempt to alter the algorithms.
 
 What do people think about putting something like this into 8.1?
 Strictly speaking it's a new feature, but the patch seems pretty
 noninvasive, and we'd be much more likely to get data points if the
 code exists in the mainline release than if people have to patch
 their copies.

This would be good :)

  postgres=# set debug_sort=true;
 
 I'm a bit inclined to call it trace_sort instead, and to document it
 under Developer Options.  Comments?

+1

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 510 893 6100   mobile: +1 415 235 3778

Remember to vote!

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] PG Killed by OOM Condition

2005-10-03 Thread Jeff Davis
Martijn van Oosterhout wrote:
 On Mon, Oct 03, 2005 at 11:03:06PM +1000, John Hansen wrote:
 
Might it be worth while protecting the postmaster from an OOM Kill on
Linux by setting /proc/{pid}/oom_adj to -17 ?
(Described vaguely in mm/oom_kill.c)
 
 
 Has it actually happened to you? PostgreSQL is pretty good about its
 memory usage. Besides, it seems to me it should be a system
 administrator decision.
 

It's happened to me...

Usually it's when there's some other runaway process, and the kernel
decides to kill PostgreSQL because it can't tell the difference.

I really don't like that feature in linux. Nobody has been able to
explain to me why linux is the only OS with an OOM Killer. If someone
here has more information, I'd like to know.

When using Linux I always set vm.overcommit_memory=2.

Regards,
Jeff Davis

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-03 Thread Josh Berkus
Michael,

 Realistically, you can't do better than about 25MB/s on a
  single-threaded I/O on current Linux machines,

 What on earth gives you that idea? Did you drop a zero?

Nope, LOTS of testing, at OSDL, GreenPlum and Sun.   For comparison, A 
Big-Name Proprietary Database doesn't get much more than that either.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] effective SELECT from child tables

2005-10-03 Thread Hannu Krosing
On P, 2005-10-02 at 22:49 -0400, Greg Stark wrote:

 No, I think redefining NULL is a non-starter. This whole thing only has legs
 if Postgres can distinguish between a column that has never been set and a
 column that has NULL.
 
 Actually the only case where I don't see bizarre semantic consequences is the
 case of a newly created column. It would be nice to be able to do:
 
   ALTER TABLE ADD foo integer DEFAULT 1

Probably a different syntax would be better here, perhaps

ALTER TABLE ADD foo integer AS 1 WHEN MISSING;

or somesuch.

 And there's no question of what the semantics of this are. No question of
 columns magically acquiring a value where they were NULL before.
 
 The *main* reason I would be happy about this is that it would let me add NOT
 NULL columns efficiently. I would expect NOT NULL to be allowed whenever a
 default is provided since there's obviously no way it can cause a problem.

 (I don't follow the NULL is evil all the time philosophy but I do try to set
 columns NOT NULL wherever I can. It always annoys me that when adding a new
 column I have to choose between a massive disk intensive batch job or
 compromising the design.)
 
 
 On the other hand if you do
 
  ALTER TABLE ADD foo integer
 
 and then later do
 
  ALTER TABLE ALTER foo SET DEFAULT 1
 
 then there is a window where all those foos are NULL and then they magically
 become 1? That doesn't seem tenable.

Not if you require these two to happen in one transaction to affect
added/missing values. 

-- 
Hannu Krosing [EMAIL PROTECTED]


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] External Sort timing debug statements

2005-10-03 Thread Simon Riggs
On Mon, 2005-10-03 at 10:36 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  The following patch implements a fairly light set of timing statements
  aimed at understanding external sort performance. There is no attempt to
  alter the algorithms.
 
 What do people think about putting something like this into 8.1?
 Strictly speaking it's a new feature, but the patch seems pretty
 noninvasive, and we'd be much more likely to get data points if the
 code exists in the mainline release than if people have to patch
 their copies.
 
  postgres=# set debug_sort=true;
 
 I'm a bit inclined to call it trace_sort instead, and to document it
 under Developer Options.  Comments?

Sounds good to me.

Any other requests for info in the trace output? Shout now.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] Tuning current tuplesort external sort code for 8.2

2005-10-03 Thread Simon Riggs

Based upon profiling of the initial stage of external sorting, it seems
that this stage is overall CPU bound, with hotspots in comparetup_*
accounting for around 50% of CPU time; lets just call that too much,
since your exact experience may vary.

Previously, I'd looked through all of the code with respect to the basic
algorithms. These are good, but it seemed very likely that there would
be opportunities to improve on the way things are, so I kept looking.

AFAICS the following opportunities exist, without changing any of the
theoretical algorithms or the flexibility of definable datatypes:

1. tuplesort_heap_siftup and tuplesort_heap_insert make no attempt to
cache the values of keys that have been obtained from *_getattr macros.
The two routines navigate a tournament sort heap, so that on average 50%
of comparisons use at least one immediately preceding tuple and key
values from that could be cached ready for the next call. Caching would
reduce number of *_getattr calls from 2N to N+1, where N is likely to go
up on average linearly with work_mem. This would reduce the cost of
comparetup_ significantly. The inlined calls to myFunctionCall2 could
also take advantage of that caching to reduce pre-call setup by at least
50% also. (Only the first sort key attr would be cached, since the vast
majority of times only the first sort key will be checked. The sort does
use index number as first sort key at this time, but since this is run
number, that isn't granular enough to reduce comparisons sufficiently).
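As a rough illustration of idea (1), here is a toy min-heap insert over integer keys in which the incoming tuple's key is extracted once and reused across every comparison on the sift-up path. get_key() stands in for the *_getattr call, and all names are illustrative rather than tuplesort's actual code:

```c
#include <assert.h>

static int n_getattr = 0;           /* counts simulated *_getattr calls */

static int get_key(int tup)         /* stand-in for fetching a sort key */
{
    n_getattr++;
    return tup;
}

/* Insert tup into a min-heap of n elements, caching its key: one
 * fetch for the new tuple plus one per parent visited (N+1 total),
 * instead of two fetches per comparison (2N). */
static void heap_insert_cached(int *heap, int n, int tup)
{
    int tupkey = get_key(tup);      /* fetched once, reused below */
    int i = n;

    while (i > 0)
    {
        int parent = (i - 1) / 2;
        if (get_key(heap[parent]) <= tupkey)
            break;
        heap[i] = heap[parent];     /* move parent down */
        i = parent;
    }
    heap[i] = tup;
}
```

Inserting 0 into the heap {1, 5, 7} walks two parents and costs three key fetches rather than four, matching the 2N-to-N+1 reduction described above.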

All of the remaining ideas relate to NULL handling.

2. In comparetup_ the second attr value is always fetched, even when the
first attr is null. When the first attr is null the value of the second
need never be checked, just whether the second attr is null or not, so
the full cost of the *_getattr need not actually be paid at all. The
relevance of this is not reduced as a result of the caching suggested in
(1).

3. In a great many cases, sorts will be performed on non-nullable attrs,
e.g. PK indexes, many FK indexes, sort-merge joins based upon a FK that
is a subset of the PK (a typical one-many relationship) and groupings
also. In the majority of cases, these attrs are at the start of a tuple.
The *_getattr macros are particularly poor at handling NULLs. When
*_getattr sees *any* NULL is present for a tuple it checks the
nullability of all attrs up to the current attrnum before returning
using the cached offsets. The macro could be altered so that if the
current attrnum < firstNullableAttrnum (which we can set once for the
high level tupleDesc, rather than once per tuple) then we use the cached
offset, whether or not other nulls exist within the tuple. If not, then
we can start testing for nullability from the firstNullableAttrnum.
Currently, if we are *slow* according to nocachegetattr, i.e. there was
a prior NULL value, then we forget which one that was and go and
recheck them all from the start again. When slow, we could start
calculating the offset using the cached value of firstNull and then
working up from there. (All of that relates to the macros in general,
though they aren't actually used anymore apart from in various catalog
fetches and COPY TO)

Also, there is an opportunity to modify the run building with respect to
NULL values. Knuth's algorithm doesn't take into account 3VL at all, so
he might have wanted to do the following, if he could:

4. In an external sort we do a k passes through the total sort file.
During run building, the first comparison will reveal that a value has a
leading NULL sort key attr. NULLs always sort higher/lower than all
other values, yet are equal to each other. Once we know a tuple has a
leading NULL sort key attr, we could divert it to a NULL-holding file to
avoid involving it in many pointless comparisons and read/write I/Os.
Once all tuples have been emitted, the NULL-holding file can itself be
sorted recursively (starting at the second key etc.). At the end, the
NULL-holding file can be either read first or last, according to where
NULLs are placed (hi/lo/asc/desc). That technique might be of use when
we are trying to stay just inside memory, or when we have so many sort
passes that saving time on the NULLs could be a real saving; this would
likely be too much code for too little benefit. (This may be orthogonal
to the idea of using very large numbers of virtual tapes).
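A minimal in-memory sketch of idea (4), diverting leading-NULL keys before the comparison sort rather than into a holding file; the struct, the NULLS LAST placement, and the helper names are illustrative assumptions, not the proposed implementation:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* A tuple reduced to its leading sort key, for illustration. */
typedef struct { bool isnull; int key; } Tup;

static int cmp_key(const void *a, const void *b)
{
    const Tup *x = a, *y = b;
    return (x->key > y->key) - (x->key < y->key);
}

/* Copy src into dst with non-NULL keys sorted first and NULL-keyed
 * tuples appended afterwards (NULLS LAST): the diverted NULLs never
 * participate in a key comparison. */
static void sort_nulls_last(const Tup *src, int n, Tup *dst)
{
    Tup *nulls = malloc(n * sizeof(Tup));
    int nv = 0, nn = 0;

    for (int i = 0; i < n; i++)
    {
        if (src[i].isnull)
            nulls[nn++] = src[i];   /* diverted, as to a holding file */
        else
            dst[nv++] = src[i];
    }
    qsort(dst, nv, sizeof(Tup), cmp_key);
    memcpy(dst + nv, nulls, nn * sizeof(Tup));
    free(nulls);
}
```

In the real proposal the side buffer would be a tape/file, itself sorted recursively on the second key onwards, and read first or last depending on NULL placement.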

Other possibilities exist, most notable of which is the > 6 tape merge
already mentioned by Tom. None of the above conflicts with that change.

Assuming these sound good to all, I'll be starting to write up these
ideas in a couple of weeks, though comments on these specific code
suggestions are welcome now.

It may be possible to improve upon the basic theoretical algorithms, but
I'm not looking to try that right now. We've got enough ideas here to
make some good progress over next few months.

Best Regards, Simon Riggs


---(end of broadcast)---

Re: [HACKERS] effective SELECT from child tables

2005-10-03 Thread Hannu Krosing
On P, 2005-10-02 at 23:00 -0400, Tom Lane wrote:
 Greg Stark [EMAIL PROTECTED] writes:
  It would be nice to be able to do:
ALTER TABLE ADD foo integer DEFAULT 1
  And there's no question of what the semantics of this are.
 
 Sure, but you can only optimize this if the default expression is
 immutable...
 
  On the other hand if you do
   ALTER TABLE ADD foo integer
  and then later do
   ALTER TABLE ALTER foo SET DEFAULT 1
  then there is a window where all those foos are NULL and then they magically
  become 1? That doesn't seem tenable.
 
 It'd also be contrary to the SQL spec, AFAICS.
 
 Here's another interesting case to think about:
 
   ALTER TABLE ADD foo integer DEFAULT 1
   ...
   ALTER TABLE ALTER foo SET DEFAULT 2

 You'll have to pay the table-traversal cost on one step or the other.

The second, ALTER ... SET DEFAULT, would only set default for newly
inserted columns, not the ones which are missing due to tuples being
created before the column existed.

But completely different syntax may be more clear.

ALTER TABLE ADD foo integer WITH DEFAULT 1;

Or whatever 

-- 
Hannu Krosing [EMAIL PROTECTED]


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-03 Thread Josh Berkus
Tom,

 Raising work_mem to a gig should result in about five runs, needing only
 one pass, which is really going to be as good as it gets.  If you could
 not see any difference then I see little hope for the idea that reducing
 the number of merge passes will help.

Right.  It *should have*, but didn't seem to.  Example of a simple sort 
test of 100 million random-number records:

  1M                                                3294 seconds
  16M                                               1107 seconds
  256M                                              1209 seconds
  512M                                              1174 seconds
  512M with 'not null' for column that is indexed   1168 seconds

 Umm ... you were raising maintenance_work_mem, I trust, not work_mem?

Yes.


 We really need to get some hard data about what's going on here.  The
 sort code doesn't report any internal statistics at the moment, but it
 would not be hard to whack together a patch that reports useful info
 in the form of NOTICE messages or some such.

Yeah, I'll do this as soon as the patch is finished.   Always useful to 
gear up the old TPC-H.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-03 Thread Jeffrey W. Baker
On Mon, 2005-10-03 at 13:34 -0700, Josh Berkus wrote:
 Michael,
 
  Realistically, you can't do better than about 25MB/s on a
   single-threaded I/O on current Linux machines,
 
  What on earth gives you that idea? Did you drop a zero?
 
 Nope, LOTS of testing, at OSDL, GreenPlum and Sun.   For comparison, A 
 Big-Name Proprietary Database doesn't get much more than that either.

I find this claim very suspicious.  I get single-threaded reads in
excess of 1GB/sec with XFS and > 250MB/sec with ext3.  

-jwb

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-03 Thread Josh Berkus
Jeff,

  Nope, LOTS of testing, at OSDL, GreenPlum and Sun.   For comparison, A
  Big-Name Proprietary Database doesn't get much more than that either.

 I find this claim very suspicious.  I get single-threaded reads in
 excess of 1GB/sec with XFS and > 250MB/sec with ext3.

Database reads?  Or raw FS reads?  It's not the same thing.

Also, we're talking *write speed* here, not read speed.

I also find *your* claim suspicious, since there's no way XFS is 300% faster 
than ext3 for the *general* case.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-03 Thread Ron Peacetree
Jeff, are those _burst_ rates from HD buffer or _sustained_ rates from
actual HD media?  Rates from IO subsystem buffer or cache are
usually considerably higher than Average Sustained Transfer Rate.

Also, are you measuring _raw_ HD IO (bits straight off the platters, no
FS or other overhead) or _cooked_ HD IO (actual FS or pg IO)?

BTW, it would seem Useful to measure all of raw HD IO, FS HD IO,
and pg HD IO as this would give us an idea of just how much overhead
each layer is imposing on the process.

We may be able to get better IO than we currently are for things like
sorts by the simple expedient of making sure we read enough data per
seek.

For instance, a HD with a 12ms average access time and a ASTR of
50MBps should always read _at least_ 600KB/access or it is impossible
for it to achieve its rated ASTR.

This number will vary according to the average access time and the
ASTR of your physical IO subsystem, but the concept is valid for _any_
physical IO subsystem.
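The rule of thumb above can be written down directly; the 12ms/50MBps figures are the example's assumptions, not measurements:

```c
#include <assert.h>
#include <math.h>

/* Minimum bytes a drive must transfer per access so that seek time
 * alone does not cap throughput below its rated sustained rate:
 * bytes >= seek_time * ASTR. */
static double min_read_per_access(double seek_s, double astr_bytes_per_s)
{
    return seek_s * astr_bytes_per_s;
}

/* 0.012 s average access at 50e6 bytes/s gives 600,000 bytes
 * (600 KB) per access, matching the figure above. */
```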
 

-Original Message-
From: Jeffrey W. Baker [EMAIL PROTECTED]
Sent: Oct 3, 2005 4:42 PM
To: josh@agliodbs.com
Cc: 
Subject: Re: [HACKERS] [PERFORM] A Better External Sort?

On Mon, 2005-10-03 at 13:34 -0700, Josh Berkus wrote:
 Michael,
 
  Realistically, you can't do better than about 25MB/s on a
   single-threaded I/O on current Linux machines,
 
  What on earth gives you that idea? Did you drop a zero?
 
 Nope, LOTS of testing, at OSDL, GreenPlum and Sun.   For comparison, A 
 Big-Name Proprietary Database doesn't get much more than that either.

I find this claim very suspicious.  I get single-threaded reads in
excess of 1GB/sec with XFS and > 250MB/sec with ext3.

-jwb





Re: [HACKERS] PG Killed by OOM Condition

2005-10-03 Thread Alvaro Herrera
On Mon, Oct 03, 2005 at 01:25:00PM -0700, Jeff Davis wrote:
 Martijn van Oosterhout wrote:
  On Mon, Oct 03, 2005 at 11:03:06PM +1000, John Hansen wrote:
  
 Might it be worth while protecting the postmaster from an OOM Kill on
 Linux by setting /proc/{pid}/oom_adj to -17 ?
 (Described vaguely in mm/oom_kill.c)
  
  Has it actually happened to you? PostgreSQL is pretty good about its
  memory usage. Besides, seems to me it should be a system administrator
  decision.

Maybe what we could do is put a line to change the setting in the
contrib/start-script/linux script, and perhaps lobby the packagers of
Linux distributions to do the same.

ISTM it's trivial to test whether the file exists, and useful to
activate the feature if available.

-- 
Alvaro Herrera    http://www.PlanetPostgreSQL.org
In fact, the basic problem with Perl 5's subroutines is that they're not
crufty enough, so the cruft leaks out into user-defined code instead, by
the Conservation of Cruft Principle.  (Larry Wall, Apocalypse 6)



Re: [HACKERS] Tuning current tuplesort external sort code for 8.2

2005-10-03 Thread Martijn van Oosterhout
On Mon, Oct 03, 2005 at 09:35:30PM +0100, Simon Riggs wrote:
 Based upon profiling of the initial stage of external sorting, it seems
 that this stage is overall CPU bound, with hotspots in comparetup_*
 accounting for around 50% of CPU time; let's just call that too much,
 since your exact experience may vary.

Indeed, however as I pointed out, if you arrange for
inlineApplySortFunction() actually be inlined, you can cut costs,
especially in the index creation case.

snip
 values from that could be cached ready for the next call. Caching would
 reduce number of *_getattr calls from 2N to N+1, where N is likely to go

My profiling indicates that the second getattr is half the cost of the
first, gcc optimisation at work. Note that setting CFLAGS=-pg for
configure disables optimisation, I missed that the first time.
Of course, every call saved is time saved.

 2. In comparetup_ the second attr value is always fetched, even when the
 first attr is null. When the first attr is null the value of the second
 need never be checked, just whether the second attr is null or not, so
 the full cost of the *_getattr need not actually be paid at all. The
 relevance of this is not reduced as a result of the caching suggested in
 (1).

Actually, "attribute is null" is the cheap case because you only need to
check the bitmap. But you could optimise stuff by expanding the
*_getattr calls and optimising directly. Possible problem with caching:
if you're called by the system qsort, can you assume anything about the
order of the comparisons?

Please note: if inlineApplySortFunction() is actually inlined (it isn't
by default), gcc does get very smart about this and sometimes optimises
out the Datum fetches depending on the isNull flags. So we need to
check we're actually making an improvement over the compiler.

snip

 is a subset of the PK (a typical one-many relationship) and groupings
 also. In the majority of cases, these attrs are at the start of a tuple.
 The *_getattr macros are particularly poor at handling NULLs. When
 *_getattr sees *any* NULL is present for a tuple it checks the
 nullability of all attrs up to the current attrnum before returning
 using the cached offsets. The macro could be altered so that if the
 current attrnum < firstNullableAttrnum (which we can set once for the

Maybe easier, in the macro use: bitmap & ((1<<attnum)-1) to quickly
check that no nulls precede the value we're looking for and hence we
can use the fast path anyway. Along the lines of:

#define index_getattr(tup, attnum, tupleDesc, isnull) \
( \
AssertMacro(PointerIsValid(isnull) && (attnum) > 0), \
*(isnull) = false, \
!IndexTupleHasNulls(tup) || ((attnum) < 32 && (NullBitmap(tup) & \
((1<<attnum)-1)) == 0) ? \
( \
(tupleDesc)->attrs[(attnum)-1]->attcacheoff >= 0 ? \
 

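For what it's worth, the preceding-nulls test that the macro relies on can be exercised in isolation. `no_preceding_nulls` is a hypothetical stand-in, not PostgreSQL's actual code, and assumes a 1-based attnum with bit (i-1) of the null bitmap set when attribute i is null:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fast-path test: a zero result over the low (attnum-1) bits means no
 * null precedes the attribute, so cached offsets remain valid and the
 * cheap fetch path can be used. */
static bool no_preceding_nulls(uint32_t nullbitmap, int attnum)
{
    return attnum < 32 &&
           (nullbitmap & ((UINT32_C(1) << (attnum - 1)) - 1)) == 0;
}

int main(void)
{
    assert(no_preceding_nulls(0x00, 5));   /* no nulls at all */
    assert(!no_preceding_nulls(0x01, 5));  /* attribute 1 is null */
    assert(no_preceding_nulls(0x10, 5));   /* attr 5 itself is null,
                                              but none before it */
    return 0;
}
```
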
Nice ideas though, a seperate run just for NULL keys is interesting. If
you only have one sort key it becomes a whole tape which doesn't need
to be sorted anymore, just emit it at the beginning or end. Could be
helpful.

Mind you, if you start creating seperate routines for different cases
you can go a long way. Elsewhere on this list I created a special case
for single-key integer index columns and got an 8% speed increase. Not
exactly a viable solution though.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpWTkP5cRja6.pgp
Description: PGP signature


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-03 Thread Luke Lonergan
Jeff, Josh,

On 10/3/05 2:16 PM, Josh Berkus josh@agliodbs.com wrote:

 Jeff,
 
 Nope, LOTS of testing, at OSDL, GreenPlum and Sun.   For comparison, A
 Big-Name Proprietary Database doesn't get much more than that either.
 
 I find this claim very suspicious.  I get single-threaded reads in
  excess of 1GB/sec with XFS and > 250MB/sec with ext3.
 
 Database reads?  Or raw FS reads?  It's not the same thing.
 
 Also, we're talking *write speed* here, not read speed.

I think you are both talking past each other here.  I'll state what I
*think* each of you are saying:

Josh: single threaded DB writes are limited to 25MB/s

My opinion: Not if they're done better than they are now in PostgreSQL.
PostgreSQL COPY is still CPU limited at 12MB/s on a super fast Opteron.  The
combination of WAL and head writes while this is the case is about 50MB/s,
which is far from the limit of the filesystems we test on that routinely
perform at 250MB/s on ext2 writing in sequential 8k blocks.

There is no reason that we couldn't do triple the current COPY speed by
reducing the CPU overhead in parsing and attribute conversion.  We've talked
this to death, and implemented much of the code to fix it, but there's much
more to do.

Jeff: Plenty of FS bandwidth to be had on Linux, observed 250MB/s on ext3
and 1,000MB/s on XFS.

Wow - can you provide a link or the results from the XFS test?  Is this 8k
blocksize sequential I/O?  How many spindles and what controller are you
using?  Inquiring minds want to know...

- Luke 





Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-03 Thread Jeffrey W. Baker
On Mon, 2005-10-03 at 14:16 -0700, Josh Berkus wrote:
 Jeff,
 
   Nope, LOTS of testing, at OSDL, GreenPlum and Sun.   For comparison, A
   Big-Name Proprietary Database doesn't get much more than that either.
 
  I find this claim very suspicious.  I get single-threaded reads in
  excess of 1GB/sec with XFS and > 250MB/sec with ext3.
 
 Database reads?  Or raw FS reads?  It's not the same thing.

Just reading files off the filesystem.  These are input rates I get with
a specialized sort implementation.  1GB/sec is not even especially
wonderful; I can get that on two controllers with a 24-disk stripe set.

I guess database reads are different, but I remain unconvinced that they
are *fundamentally* different.  After all, a tab-delimited file (my sort
workload) is a kind of database.

 Also, we're talking *write speed* here, not read speed.

Ok, I did not realize.  Still you should see 250-300MB/sec
single-threaded sequential output on ext3, assuming the storage can
provide that rate.

 I also find *your* claim suspicious, since there's no way XFS is 300% faster 
 than ext3 for the *general* case.

On a single disk you wouldn't notice, but XFS scales much better when
you throw disks at it.  I get a 50MB/sec boost from the 24th disk,
whereas ext3 stops scaling after 16 disks.  For writes both XFS and ext3
top out around 8 disks, but in this case XFS tops out at 500MB/sec while
ext3 can't break 350MB/sec.

I'm hopeful that in the future the work being done at ClusterFS will
make ext3 on-par with XFS.

-jwb



Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-03 Thread Hannu Krosing
On E, 2005-10-03 at 14:16 -0700, Josh Berkus wrote:
 Jeff,
 
   Nope, LOTS of testing, at OSDL, GreenPlum and Sun.   For comparison, A
   Big-Name Proprietary Database doesn't get much more than that either.
 
  I find this claim very suspicious.  I get single-threaded reads in
  excess of 1GB/sec with XFS and  250MB/sec with ext3.
 
 Database reads?  Or raw FS reads?  It's not the same thing.

Just FYI, I ran a count(*) on a 15.6GB table on a lightly loaded db and
it ran in 163 sec (dual Opteron 2.6GHz, 6GB RAM, 6 x 74GB 15k disks in
RAID10, reiserfs). A little less than 100MB/sec.

After this I ran count(*) over a 2.4GB file from another tablespace on
another device (4x142GB 10k disks in RAID10) and it ran in 22.5 sec on
the first run and 12.5 on the second.

db=# show shared_buffers ;
 shared_buffers
----------------
 196608
(1 row)

db=# select version();
  version
---------------------------------------------------------------------------
 PostgreSQL 8.0.3 on x86_64-pc-linux-gnu, compiled by GCC cc (GCC) 3.3.6
(Debian 1:3.3.6-7)
(1 row)


-- 
Hannu Krosing [EMAIL PROTECTED]




Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-03 Thread Simon Riggs
On Sun, 2005-10-02 at 21:38 +0200, Martijn van Oosterhout wrote:
 Ok, I tried two optimisations:
 

 2. By specifying: -Winline -finline-limit-1500 (only on tuplesort.c).
 This causes inlineApplySortFunction() to be inlined, like the code
 obviously expects it to be.
 
 default build (baseline)235 seconds
 -finline only   217 seconds (7% better)
 comparetup_index_fastbyval4 only221 seconds (6% better)
 comparetup_index_fastbyval4 and -finline203 seconds (13.5% better)
 
 This is indexing the integer sequence column on a 2.7 million row
 table. The times are as given by gprof and so exclude system call time.
 
 Basically, I recommend adding -Winline -finline-limit-1500 to the
 default build while we discuss other options.

I add -Winline but get no warnings. Why would I use -finline-limit-1500?

I'm interested, but uncertain as to what difference this makes. Surely
using -O3 works fine?

Best Regards, Simon Riggs




Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-03 Thread Luke Lonergan
Hannu,

On 10/3/05 2:43 PM, Hannu Krosing [EMAIL PROTECTED] wrote:

 Just FYI, I run a count(*) on a 15.6GB table on a lightly loaded db and
 it run in 163 sec. (Dual opteron 2.6GHz, 6GB RAM, 6 x 74GB 15k  disks in
 RAID10, reiserfs). A little less than 100MB sec.

This confirms our findings - sequential scan is CPU limited at about 120MB/s
per single threaded executor.  This is too slow for fast file systems like
we're discussing here.

Bizgres MPP gets 250MB/s by running multiple scanners, but we still chew up
unnecessary amounts of CPU.
  
 After this I ran count(*) over a 2.4GB file from another tablespace on
 another device (4x142GB 10k disks in RAID10) and it run 22.5 sec on
 first run and 12.5 on second.

You're getting caching effects here.

- Luke





Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-03 Thread Josh Berkus
Michael,

 Nope, LOTS of testing, at OSDL, GreenPlum and Sun.   For comparison, A
 Big-Name Proprietary Database doesn't get much more than that either.

 You seem to be talking about database IO, which isn't what you said.

Right, well, it was what I meant.   I failed to specify, that's all.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco



Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-03 Thread Josh Berkus
Jeffrey,

 I guess database reads are different, but I remain unconvinced that they
 are *fundamentally* different.  After all, a tab-delimited file (my sort
 workload) is a kind of database.

Unfortunately, they are ... because of CPU overheads.   I'm basing what's 
reasonable for data writes on the rates which other high-end DBs can 
make.   From that, 25mb/s or even 40mb/s for sorts should be achievable 
but doing 120mb/s would require some kind of breakthrough.

 On a single disk you wouldn't notice, but XFS scales much better when
 you throw disks at it.  I get a 50MB/sec boost from the 24th disk,
 whereas ext3 stops scaling after 16 disks.  For writes both XFS and ext3
 top out around 8 disks, but in this case XFS tops out at 500MB/sec while
 ext3 can't break 350MB/sec.

That would explain it.  I seldom get more than 6 disks (and 2 channels) to 
test with.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco



Re: [HACKERS] PG Killed by OOM Condition

2005-10-03 Thread mark
On Mon, Oct 03, 2005 at 01:25:00PM -0700, Jeff Davis wrote:
 It's happened to me...
 Usually it's when there's some other runaway process, and the kernel
 decides to kill PostgreSQL because it can't tell the difference.
 I really don't like that feature in linux. Nobody has been able to
 explain to me why linux is the only OS with an OOM Killer. If someone
 here has more information, I'd like to know.
 When using Linux I always set vm.overcommit_memory=2.

I don't think it's the only one. Perhaps the only one with a default
setting of on?

I believe Solaris can be configured to over-commit memory.

The problem really comes down to the definition of fork(). UNIX fork()
requires that the system splits a process into two separate copies.
For an application that is currently using 500 Mbytes of virtual
memory, this would require that the system accept that each process,
may use its own complete copy of this 500 Mbytes, for a total of 1
Gbyte in active use. fork() a few more times, and we hit 2 Gbytes, 4
Gbytes - whatever. Even if only for an instant, and even if the pages
are copy-on-write, the system has to consider the possibility that
each application may choose to modify all pages, resulting in complete
copies.

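The accounting problem described above is easy to demonstrate: after fork(), parent and child share pages copy-on-write, but either side may write to all of them, so a strict-commit kernel must be prepared to back two full copies. This sketch uses 1MB in place of the 500MB example (all names hypothetical):

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork after touching a buffer; the child's write forces page
 * duplication, and the parent's copy stays untouched.  Returns 0 on
 * success, -1 on any failure. */
static int cow_demo(void)
{
    size_t len = 1u << 20;              /* 1MB stand-in for 500MB */
    char *buf = malloc(len);
    if (buf == NULL)
        return -1;
    memset(buf, 'A', len);

    pid_t pid = fork();
    if (pid == 0)
    {
        memset(buf, 'B', len);          /* copy-on-write kicks in here */
        _exit(buf[0] == 'B' ? 0 : 1);
    }

    int status;
    waitpid(pid, &status, 0);
    int ok = WIFEXITED(status) && WEXITSTATUS(status) == 0 && buf[0] == 'A';
    free(buf);
    return ok ? 0 : -1;
}

int main(void)
{
    assert(cow_demo() == 0);
    printf("child wrote its own copy; parent's pages unchanged\n");
    return 0;
}
```
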
Solaris defaults to not over committing. This means that such an
application, as defined above, would fail at one of the invocations
of fork(). Even though the memory isn't being used - Solaris, by default,
isn't willing to 'over commit' to having the memory allocated as a result
of fork(). Some very large applications don't work under Solaris as a
result, unless this setting is disabled.

Linux takes the opposite extreme. It assumes that copy-on-write will
get us through. The fork() would be allowed - but if available virtual
memory actually does become low, it's faced with a hard decision. It
either fails an application of its choice in a controlled OOM
manner, by trying to guess which application is misbehaving, and
deciding to kill that one - or it waits until memory really is gone,
at which point MANY applications may start to fail, as their page
fault fails to allocate a page, and the process dies a horrible death.

It's not an easy decision. Linux isn't wrong. Solaris isn't wrong.
Most people never hit these problems, and the people that do, are
just as likely to hit one problem, or the other. The grass is always
greener on the side of the fence that isn't hurting me right now,
and all that.

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/




[HACKERS] COPY FROM with CSV header

2005-10-03 Thread Jim C. Nasby
Instead of ignoring the first line of a COPY FROM ... WITH CSV HEADER,
what about allowing the first line to be used as a list of field names?
This means you wouldn't have to include field order in the COPY command
if the names matched field names in the table.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [HACKERS] Build Farm: thrush

2005-10-03 Thread Michael Fuhr
On Mon, Oct 03, 2005 at 05:19:43PM +0200, Gaetano Mendola wrote:
 Tom Lane wrote:
  Gaetano Mendola [EMAIL PROTECTED] writes:
  I'm the administrator of that machine and PLCheck is failing.
  Is there anything I can do to fix it ?
  
  What version of Python have you got on that thing?  It seems to be
  emitting still another spelling of the encoding error message :-(
 
 $ python -V
 Python 2.2.3

The attached variant file (plpython_error_2.out) should allow this
Python version's wording of the error message.

-- 
Michael Fuhr
-- test error handling, i forgot to restore Warn_restart in
-- the trigger handler once. the errors and subsequent core dump were
-- interesting.
SELECT invalid_type_uncaught('rick');
WARNING:  plpython: in function invalid_type_uncaught:
DETAIL:  plpy.SPIError: Unknown error in PLy_spi_prepare
ERROR:  type "test" does not exist
SELECT invalid_type_caught('rick');
WARNING:  plpython: in function invalid_type_caught:
DETAIL:  plpy.SPIError: Unknown error in PLy_spi_prepare
ERROR:  type "test" does not exist
SELECT invalid_type_reraised('rick');
WARNING:  plpython: in function invalid_type_reraised:
DETAIL:  plpy.SPIError: Unknown error in PLy_spi_prepare
ERROR:  type "test" does not exist
SELECT valid_type('rick');
 valid_type 
------------
 
(1 row)

--
-- Test Unicode error handling.
--
SELECT unicode_return_error();
ERROR:  plpython: function unicode_return_error could not create return value
DETAIL:  exceptions.UnicodeError: ASCII encoding error: ordinal not in 
range(128)
INSERT INTO unicode_test (testvalue) VALUES ('test');
ERROR:  plpython: function unicode_trigger_error could not modify tuple
DETAIL:  exceptions.UnicodeError: ASCII encoding error: ordinal not in 
range(128)
SELECT unicode_plan_error1();
WARNING:  plpython: in function unicode_plan_error1:
DETAIL:  plpy.Error: Unknown error in PLy_spi_execute_plan
ERROR:  plpython: function unicode_plan_error1 could not execute plan
DETAIL:  exceptions.UnicodeError: ASCII encoding error: ordinal not in 
range(128)
SELECT unicode_plan_error2();
ERROR:  plpython: function unicode_plan_error2 could not execute plan
DETAIL:  exceptions.UnicodeError: ASCII encoding error: ordinal not in 
range(128)



Re: [HACKERS] [PATCHES] External Sort timing debug statements

2005-10-03 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 The following patch implements a fairly light set of timing statements
 aimed at understanding external sort performance. There is no attempt to
 alter the algorithms.

 Minor update of patch, use this version please.

Applied with revisions: I made it use the VacRUsage code so that we
could see both CPU and elapsed time, and moved the report points around
a bit.  The output with trace_sort enabled looks like this:

NOTICE:  begin tuple sort: nkeys = 1, workMem = 1024, randomAccess = t
NOTICE:  switching to external sort: CPU 0.05s/0.10u sec elapsed 0.15 sec
NOTICE:  finished writing run 1: CPU 0.14s/0.83u sec elapsed 0.99 sec
NOTICE:  finished writing run 2: CPU 0.25s/1.67u sec elapsed 1.94 sec
NOTICE:  finished writing run 3: CPU 0.37s/2.51u sec elapsed 2.90 sec
NOTICE:  finished writing run 4: CPU 0.48s/3.36u sec elapsed 3.86 sec
...
NOTICE:  finished writing run 45: CPU 5.06s/38.26u sec elapsed 43.55 sec
NOTICE:  performsort starting: CPU 5.10s/38.62u sec elapsed 43.95 sec
NOTICE:  finished writing run 46: CPU 5.11s/38.84u sec elapsed 44.18 sec
NOTICE:  finished writing final run 47: CPU 5.11s/38.88u sec elapsed 44.22 sec
NOTICE:  finished merge step: CPU 5.12s/39.02u sec elapsed 44.37 sec
NOTICE:  finished merge step: CPU 5.13s/39.16u sec elapsed 44.53 sec
...
NOTICE:  finished merge step: CPU 6.57s/67.78u sec elapsed 74.83 sec
NOTICE:  performsort done: CPU 6.57s/67.78u sec elapsed 74.84 sec
NOTICE:  sort ended: CPU 10.80s/74.73u sec elapsed 86.21 sec

regards, tom lane



Re: [HACKERS] [PATCHES] External Sort timing debug statements

2005-10-03 Thread Simon Riggs
On Mon, 2005-10-03 at 18:59 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  The following patch implements a fairly light set of timing statements
  aimed at understanding external sort performance. There is no attempt to
  alter the algorithms.
 
  Minor update of patch, use this version please.
 
 Applied with revisions: I made it use the VacRUsage code so that we
 could see both CPU and elapsed time, and moved the report points around
 a bit.  The output with trace_sort enabled looks like this:
 
 NOTICE:  begin tuple sort: nkeys = 1, workMem = 1024, randomAccess = t
 NOTICE:  switching to external sort: CPU 0.05s/0.10u sec elapsed 0.15 sec
 NOTICE:  finished writing run 1: CPU 0.14s/0.83u sec elapsed 0.99 sec
 NOTICE:  finished writing run 2: CPU 0.25s/1.67u sec elapsed 1.94 sec
 NOTICE:  finished writing run 3: CPU 0.37s/2.51u sec elapsed 2.90 sec
 NOTICE:  finished writing run 4: CPU 0.48s/3.36u sec elapsed 3.86 sec
 ...
 NOTICE:  finished writing run 45: CPU 5.06s/38.26u sec elapsed 43.55 sec
 NOTICE:  performsort starting: CPU 5.10s/38.62u sec elapsed 43.95 sec
 NOTICE:  finished writing run 46: CPU 5.11s/38.84u sec elapsed 44.18 sec
 NOTICE:  finished writing final run 47: CPU 5.11s/38.88u sec elapsed 44.22 sec
 NOTICE:  finished merge step: CPU 5.12s/39.02u sec elapsed 44.37 sec
 NOTICE:  finished merge step: CPU 5.13s/39.16u sec elapsed 44.53 sec
 ...
 NOTICE:  finished merge step: CPU 6.57s/67.78u sec elapsed 74.83 sec
 NOTICE:  performsort done: CPU 6.57s/67.78u sec elapsed 74.84 sec
 NOTICE:  sort ended: CPU 10.80s/74.73u sec elapsed 86.21 sec

Many thanks,

Best Regards, Simon Riggs




Re: [HACKERS] Tuning current tuplesort external sort code for 8.2

2005-10-03 Thread Simon Riggs
On Mon, 2005-10-03 at 23:25 +0200, Martijn van Oosterhout wrote:
 Possible problem with caching:
 if you're called by the system qsort, can you assume anything about the
 order of the comparisons?

That applies only to the non-external sort case, which I'm not trying to
improve with these suggestions. (No, you can't assume that, it was
a heapsort only suggestion).

 Please note: if inlineApplySortFunction() is actually inlined (it isn't
 by default)

Can you explain your last post some more? That's not what I get.

  is a subset of the PK (a typical one-many relationship) and groupings
  also. In the majority of cases, these attrs are at the start of a tuple.
  The *_getattr macros are particularly poor at handling NULLs. When
  *_getattr sees *any* NULL is present for a tuple it checks the
  nullability of all attrs up to the current attrnum before returning
  using the cached offsets. The macro could be altered so that if the
  current attrnum < firstNullableAttrnum (which we can set once for the
 
 Maybe easier, in the macro use: bitmap & ((1<<attnum)-1) to quickly
 check that no nulls precede the value we're looking for and hence we
 can use the fast path anyway. Along the lines of:

You may be right, the exact code that brings the right benefit is
somewhat trickier than spotting the opportunity.

 Mind you, if you start creating seperate routines for different cases
 you can go a long way. Elsewhere on this list I created a special case
 for single-key integer index columns and got an 8% speed increase. Not
 exactly a viable solution though.

But an interesting one. Once we've done everything else, that use case
is close to the top of my list, if the performance gain was still as
useful, all other things considered. 

Best Regards, Simon Riggs




Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-03 Thread Ron Peacetree
Let's pretend we get a 24HD HW RAID solution like that J Baker
says he has access to and set it up as a RAID 10.  Assuming
it uses two 64b 133MHz PCI-X busses and has the fastest HDs
available on it,  Jeff says he can hit ~1GBps of XFS FS IO rate
with that set up (12*83.3MBps= 1GBps).

Josh says that pg can't do more than 25MBps of DB level IO
regardless of how fast the physical IO subsystem is because at
25MBps, pg is CPU bound.  

Just how bad is this CPU bound condition?  How powerful a CPU is
needed to attain a DB IO rate of 25MBps?
 
If we replace said CPU with one 2x, 10x, etc faster than that, do we
see any performance increase?

If a modest CPU can drive a DB IO rate of 25MBps, but that rate
does not go up regardless of how much extra CPU we throw at
it...

Ron 

-Original Message-
From: Josh Berkus josh@agliodbs.com
Sent: Oct 3, 2005 6:03 PM
To: Jeffrey W. Baker [EMAIL PROTECTED]
Cc: 
Subject: Re: [HACKERS] [PERFORM] A Better External Sort?

Jeffrey,

 I guess database reads are different, but I remain unconvinced that they
 are *fundamentally* different.  After all, a tab-delimited file (my sort
 workload) is a kind of database.

Unfortunately, they are ... because of CPU overheads.   I'm basing what's 
reasonable for data writes on the rates which other high-end DBs can 
make.   From that, 25mb/s or even 40mb/s for sorts should be achievable 
but doing 120mb/s would require some kind of breakthrough.

 On a single disk you wouldn't notice, but XFS scales much better when
 you throw disks at it.  I get a 50MB/sec boost from the 24th disk,
 whereas ext3 stops scaling after 16 disks.  For writes both XFS and ext3
 top out around 8 disks, but in this case XFS tops out at 500MB/sec while
 ext3 can't break 350MB/sec.

That would explain it.  I seldom get more than 6 disks (and 2 channels) to 
test with.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco





Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-03 Thread Gregory Maxwell
On 10/3/05, Ron Peacetree [EMAIL PROTECTED] wrote:
[snip]
 Just how bad is this CPU bound condition?  How powerful a CPU is
 needed to attain a DB IO rate of 25MBps?

 If we replace said CPU with one 2x, 10x, etc faster than that, do we
 see any performance increase?

 If a modest CPU can drive a DB IO rate of 25MBps, but that rate
 does not go up regardless of how much extra CPU we throw at
 it...

Single threaded was mentioned.
Plus even if it's purely cpu bound, it's seldom as trivial as throwing
CPU at it, consider the locking in both the application, in the
filesystem, and elsewhere in the kernel.



Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-03 Thread Ron Peacetree
OK, change "performance" to "single thread performance" and we
still have a valid starting point for a discussion.

Ron


-Original Message-
From: Gregory Maxwell [EMAIL PROTECTED]
Sent: Oct 3, 2005 8:19 PM
To: Ron Peacetree [EMAIL PROTECTED]
Subject: Re: [HACKERS] [PERFORM] A Better External Sort?

On 10/3/05, Ron Peacetree [EMAIL PROTECTED] wrote:
[snip]
 Just how bad is this CPU bound condition?  How powerful a CPU is
 needed to attain a DB IO rate of 25MBps?

 If we replace said CPU with one 2x, 10x, etc faster than that, do we
 see any performance increase?

 If a modest CPU can drive a DB IO rate of 25MBps, but that rate
 does not go up regardless of how much extra CPU we throw at
 it...

Single threaded was mentioned.
Plus even if it's purely cpu bound, it's seldom as trivial as throwing
CPU at it, consider the locking in both the application, in the
filesystem, and elsewhere in the kernel.




Re: [HACKERS] RPMs for RedHat ES3.0

2005-10-03 Thread Philip Yarra
Hi Devrim, thanks for the quick response.

 We haven't discussed it before, but I think we can't put the RPM among
 PGDG RPMs. The main reason is that this is not a piece of software that's
 included within that release of PostgreSQL. Also, this compat RPM is not
 needed if you don't have software that needs that RPM.

Have a look at my original post: I was unable to install any of the PGDG RPMs 
as they have a dependency on the contents of the compat RPM. E.g.:

[EMAIL PROTECTED] 8.x]# rpm -i postgresql-libs-8.0.3-1PGDG.i686.rpm
error: Failed dependencies:
libecpg.so.4 is needed by postgresql-libs-8.0.3-1PGDG
libpgtypes.so.1 is needed by postgresql-libs-8.0.3-1PGDG
libpq.so.3 is needed by postgresql-libs-8.0.3-1PGDG

Once I installed the compat RPM, I could then install 
postgresql-libs-8.0.3-1PGDG. So the software that needs the compat RPM is the 
PGDG RPMs. 

Please tell me if I have missed something here, I'm not claiming any great 
insight, just what rpm seems to be telling me, and what worked to get me 
through the installation.

 Also remember that this will be probably 'solved' in RHEL 5 (and FC 5), so
 merging compat with main libs package is not a good idea.

Our clients are still on ES3.0, then they *might* migrate to 4.0 one day. 
RHEL5 is a long way off for us, I assure you :-) so I'm happy to lend any 
help I can to resolve this issue for ES3.0.

I don't quite understand how the compat libs can be counted as not part of 
this release if the release packages require it. Is it maybe just an issue 
with rpm creating automatic dependencies for those .so files? (sorry, this 
is me at my vaguest, rpm is something of an unknown to me, but I suspect I'm 
about to learn).

Regards, Philip.


-
Utiba Pty Ltd 
This message has been scanned for viruses and
dangerous content by Utiba mail server and is 
believed to be clean.




Re: [HACKERS] [COMMITTERS] pgsql: Fix procedure for updating nextval() defaults

2005-10-03 Thread Christopher Kings-Lynne
Didn't Alvaro write some script that we were going to use to help people 
apply SQL changes against their databases?


Chris

Tom Lane wrote:

Log Message:
---
Fix procedure for updating nextval() defaults so that it actually works.
Update release dates for pending back-branch releases.

Modified Files:
--
pgsql/doc/src/sgml:
release.sgml (r1.386 -> r1.387)

(http://developer.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/release.sgml.diff?r1=1.386&r2=1.387)




---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Last call for back-branch fixes

2005-10-03 Thread Greg Sabino Mullane

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


 ! (void) sprintf((char *) rp, "%03o", *vp);
 ! (void) snprintf((char *) rp, 6, "%03o", *vp);
  
 Hardly looks like a critical fix, or even a needed one.  Why bother?

No, not critical. Just something that tripped me up a few months ago when
writing some code based on this section, and I finally remembered it when
I saw your email. It's more of a nice-to-have, as the surrounding code
should prevent the "%03o" from producing more than 3 characters.
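For what it's worth, that 3-character bound is easy to sanity-check. A throwaway sketch (Python, purely illustrative, not the code under discussion): for a single unsigned byte, "%03o" tops out at 3 characters, so the 6-byte bound in the snprintf call is belt-and-braces rather than strictly required.

```python
# Illustrative check (not PostgreSQL code): "%03o" on a single
# unsigned byte value (0..255) never exceeds 3 characters, which is
# why the surrounding code keeps the old sprintf from overflowing.
def octal_width(value: int) -> int:
    """Length of the octal rendering produced by "%03o"."""
    return len("%03o" % value)

# Every byte value fits in 3 characters...
assert max(octal_width(v) for v in range(256)) == 3
# ...but anything wider than a byte would not, hence snprintf's bound.
assert octal_width(512) == 4
```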

- --
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200510032127
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iD8DBQFDQdq9vJuQZxSWSsgRAgeRAKCujcqJFtPf+LSlQ17BBCbGmfelmACgr30s
5OFfHfL3evrwV/FQI/XRMJI=
=ZTcO
-END PGP SIGNATURE-



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] New Point Releases Available

2005-10-03 Thread Marc G. Fournier


Just bundled up 7.3.11, 7.4.9 and 8.0.4 ... please look them over and make 
sure they look okay ... will announce late tomorrow to -announce ...



Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664



Re: [HACKERS] RPMs for RedHat ES3.0

2005-10-03 Thread Tom Lane
Philip Yarra [EMAIL PROTECTED] writes:
 Have a look at my original post: I was unable to install any of the PGDG RPMs
 as they have a dependency on the contents of the compat RPM. E.g.:

 [EMAIL PROTECTED] 8.x]# rpm -i postgresql-libs-8.0.3-1PGDG.i686.rpm
 error: Failed dependencies:
 libecpg.so.4 is needed by postgresql-libs-8.0.3-1PGDG
 libpgtypes.so.1 is needed by postgresql-libs-8.0.3-1PGDG
 libpq.so.3 is needed by postgresql-libs-8.0.3-1PGDG

I think there's no question that this represents an RPM-building error.
We haven't entirely figured out what's causing it though.

regards, tom lane



Re: [HACKERS] [COMMITTERS] pgsql: Fix procedure for updating nextval()

2005-10-03 Thread Bruce Momjian
Christopher Kings-Lynne wrote:
 Didn't Alvaro write some script that we were going to use to help people 
 apply SQL changes against their databases?

I already wrote something up and it was in the release notes.  Tom just
updated it for the new DEFAULT format after the patch.

---

 
 Chris
 
 Tom Lane wrote:
  Log Message:
  ---
  Fix procedure for updating nextval() defaults so that it actually works.
  Update release dates for pending back-branch releases.
  
  Modified Files:
  --
  pgsql/doc/src/sgml:
  release.sgml (r1.386 -> r1.387)
  
  (http://developer.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/release.sgml.diff?r1=1.386&r2=1.387)
  

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [ADMIN] Vacuum Full Analyze Stalled

2005-10-03 Thread Tom Lane
[ I just noticed that this thread is happening on pgsql-admin, which is
  completely inappropriate for discussing bugs in a beta version.
  Please redirect followups to pgsql-hackers. ]

I wrote:
 ... The hypothesis I'm thinking about is that VACUUM is trying to do
 LockBufferForCleanup() and for some reason it never finishes.

I set up a simple-minded reproduction of Kevin's situation: I did

create domain dint as int check (value > 0);
create table manyd (f1 dint, f2 dint, f3 dint,
f4 dint, f5 dint, f6 dint, f7 dint, f8 dint, f9 dint, f10 dint);

and then ran ten concurrent clients doing this continuously:

insert into manyd values(1,2,3,4,5,6,7,8,9,10);

which should be enough to force a lot of indexscans on
pg_constraint_contypid_index.  I added an additional client doing

create domain d1 as int check (value > 0);
drop domain d1;

to ensure that there were dead rows needing vacuuming in pg_constraint.
(BTW, Tatsuo's new version of pg_bench lets me do all this without
writing a line of code...)

Finally, I added some debug printouts to LockBufferForCleanup so I
could see if it was being executed or not.

Then I tried both manual and autovacuum-driven vacuums of pg_constraint.
I was able to see from the debug printouts that LockBufferForCleanup was
sometimes forced to wait in both cases.  But it never got stuck.

This eliminates one thing I was worrying about, which was the
possibility that the LockBufferForCleanup waiting path was completely
broken inside autovacuum for some reason.  But it doesn't get us a whole
lot closer to a solution.

At this point I think we need more info from Kevin and Jeff before we
can go further.  There must be some additional special feature of their
application that makes the problem appear, but what?

A stack trace of the stuck process would definitely help...

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] RPMs for RedHat ES3.0

2005-10-03 Thread Philip Yarra
Hi Tom,

On Tue, 4 Oct 2005 11:36 am, Tom Lane wrote:
 I think there's no question that this represents an RPM-building error.
 We haven't entirely figured out what's causing it though.

Right, gotcha. I think this may be the issue:
[EMAIL PROTECTED] 8.x]# rpm -ql postgresql-libs-8.0.3-1PGDG
[snip]
/usr/lib/libecpg_compat.so.2
[snip]

[EMAIL PROTECTED] 8.x]# ldd /usr/lib/libecpg_compat.so.2
libecpg.so.4 => /usr/lib/libecpg.so.4 (0x007ba000)
libpgtypes.so.1 => /usr/lib/libpgtypes.so.1 (0x00e0e000)
libpq.so.3 => /usr/lib/libpq.so.3 (0x00efa000)

[EMAIL PROTECTED] 8.x]# rpm -q --whatprovides /usr/lib/libpq.so.3
compat-postgresql-libs-3-3PGDG

Including /usr/lib/libecpg_compat.so.2 in postgresql-libs-8.0.3-1PGDG causes 
an automatic dependency 
(http://www.rpm.org/max-rpm/s1-rpm-depend-auto-depend.html) on those .so 
files, which are only provided by the compat RPM. 
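If it helps to see the mechanism, rpm's automatic find-requires/find-provides behaviour can be modelled roughly like this (a toy Python sketch; the package contents below are shaped like the transcripts in this thread, but the new-library sonames are assumptions for illustration, and rpm's real implementation scans the ELF DT_NEEDED entries rather than a dict):

```python
# Toy model of rpm's automatic dependency generation (illustrative
# only; rpm derives Requires: from the DT_NEEDED entries of shipped
# libraries and Provides: from the sonames a package ships).
packages = {
    "postgresql-libs-8.0.3-1PGDG": {
        # assumed sonames for the 8.0 libraries, plus the compat lib:
        "provides": {"libpq.so.4", "libecpg.so.5", "libecpg_compat.so.2"},
        # DT_NEEDED entries of libecpg_compat.so.2 (from the ldd output):
        "needs": {"libecpg.so.4", "libpgtypes.so.1", "libpq.so.3"},
    },
    "compat-postgresql-libs-3-3PGDG": {
        "provides": {"libpq.so.3", "libecpg.so.4", "libpgtypes.so.1"},
        "needs": set(),
    },
}

def unresolved(installed):
    """Sonames required by the installed packages but provided by none."""
    provided = set().union(*(packages[p]["provides"] for p in installed))
    needed = set().union(*(packages[p]["needs"] for p in installed))
    return needed - provided

# Installing the new libs alone fails, just like the rpm -i transcript:
assert unresolved({"postgresql-libs-8.0.3-1PGDG"}) == {
    "libecpg.so.4", "libpgtypes.so.1", "libpq.so.3"}
# Adding the compat package satisfies everything:
assert unresolved({"postgresql-libs-8.0.3-1PGDG",
                   "compat-postgresql-libs-3-3PGDG"}) == set()
```

So the fix is either to stop shipping libecpg_compat.so.2 in the main libs package, or to filter those auto-generated requires out at build time.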

Regards, Philip.



Re: [HACKERS] Build Farm: thrush

2005-10-03 Thread Tom Lane
Michael Fuhr [EMAIL PROTECTED] writes:
 On Mon, Oct 03, 2005 at 05:19:43PM +0200, Gaetano Mendola wrote:
 What version of Python have you got on that thing?  It seems to be
 emitting still another spelling of the encoding error message :-(
 
 $ python -V
 Python 2.2.3

 The attached variant file (plpython_error_2.out) should allow this
 Python version's wording of the error message.

Applied, thanks.

regards, tom lane



Re: [HACKERS] effective SELECT from child tables

2005-10-03 Thread Greg Stark

Hannu Krosing [EMAIL PROTECTED] writes:

 Probably a different syntax would be better here, perhaps
 
 ALTER TABLE ADD foo integer AS 1 WHEN MISSING;

 or somesuch.

Uhm, if you're adding the column they're *all* missing. That's the whole
point. If you start inventing a new user-visible concept "missing" and try to
distinguish it from NULL, you're going to have a hell of a time defining the
semantics.

The goal has to be to provide the *exact* same user-visible semantics as
actually setting the default. That means setting all the existing rows if
you're adding a new column.

It also unfortunately means tackling the much trickier gotcha that Tom raised
about what happens if you want to later change the default.

-- 
greg




Re: [HACKERS] effective SELECT from child tables

2005-10-03 Thread Greg Stark

Hannu Krosing [EMAIL PROTECTED] writes:

 On P, 2005-10-02 at 23:00 -0400, Tom Lane wrote:
  
  Here's another interesting case to think about:
  
  ALTER TABLE ADD foo integer DEFAULT 1
  ...
  ALTER TABLE ALTER foo SET DEFAULT 2
 
  You'll have to pay the table-traversal cost on one step or the other.
 
 The second, ALTER ... SET DEFAULT, would only set default for newly
 inserted columns, not the ones which are missing due to tuples being
 created before the column existed.

Hm. So you're saying there are only ever exactly two types of defaults: the
"initial" default that applies to all tuples that were created before the
column was added, and the "current" default that only ever applies to newly
created tuples.

That does seem to cleanly close this hole.

-- 
greg


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] New Point Releases Available

2005-10-03 Thread Tom Lane
Marc G. Fournier [EMAIL PROTECTED] writes:
 Just bundled up 7.3.11, 7.4.9 and 8.0.4 ... please look them over and make 
 sure they look okay ... will announce late tomorrow to -announce ...

Tarballs match what I have here ...

regards, tom lane



Re: [HACKERS] [PATCHES] External Sort timing debug statements

2005-10-03 Thread Greg Stark
Tom Lane [EMAIL PROTECTED] writes:

 Applied with revisions: I made it use the VacRUsage code so that we
 could see both CPU and elapsed time, and moved the report points around
 a bit.  The output with trace_sort enabled looks like this:
 
 NOTICE:  begin tuple sort: nkeys = 1, workMem = 1024, randomAccess = t
 NOTICE:  switching to external sort: CPU 0.05s/0.10u sec elapsed 0.15 sec
 NOTICE:  finished writing run 1: CPU 0.14s/0.83u sec elapsed 0.99 sec
 NOTICE:  finished writing run 2: CPU 0.25s/1.67u sec elapsed 1.94 sec
 NOTICE:  finished writing run 3: CPU 0.37s/2.51u sec elapsed 2.90 sec
 NOTICE:  finished writing run 4: CPU 0.48s/3.36u sec elapsed 3.86 sec

I'm not actually involved in this so maybe I'm completely off base here. But
wouldn't you want to know how many tuples are being sorted and how much data
is being written in these runs in order to be able to actually make sense of
these timing measurements?

Otherwise you would never know whether the results people will send you are
really comparable. And even if you're testing things in a well controlled
environments you'll never know if you missed something and you're comparing
two incomparable results.

-- 
greg




Re: [HACKERS] [PATCHES] External Sort timing debug statements

2005-10-03 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 I'm not actually involved in this so maybe I'm completely off base here. But
 wouldn't you want to know how many tuples are being sorted and how much data
 is being written in these runs in order to be able to actually make sense of
 these timing measurements?

In all plausible test cases you should be able to know the total number
of tuples being sorted independently.  Counting the tuples processed in
each run would require extra per-tuple overhead, which I'd rather avoid
until proven necessary.

The total-data-volume aspect may or may not be interesting, not sure
yet.  Let's see what we can learn from the present low-impact patch.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] effective SELECT from child tables

2005-10-03 Thread Rod Taylor
On Mon, 2005-10-03 at 23:24 -0400, Greg Stark wrote:
 Hannu Krosing [EMAIL PROTECTED] writes:
 
  On P, 2005-10-02 at 23:00 -0400, Tom Lane wrote:
   
   Here's another interesting case to think about:
   
 ALTER TABLE ADD foo integer DEFAULT 1
 ...
 ALTER TABLE ALTER foo SET DEFAULT 2
  
   You'll have to pay the table-traversal cost on one step or the other.
  
  The second, ALTER ... SET DEFAULT, would only set default for newly
  inserted columns, not the ones which are missing due to tuples being
  created before the column existed.
 
 Hm. So you're saying there are only ever exactly two types of defaults. The
 initial default that applies to all tuples that were created before the
 column was added. And the current default that only ever applies to newly
 created tuples.
 
 That does seem to cleanly close this hole.

I don't think so.

ALTER TABLE tab ADD foo integer DEFAULT 1;
INSERT INTO tab DEFAULT VALUES;

ALTER TABLE tab ALTER foo SET DEFAULT 2
INSERT INTO tab DEFAULT VALUES;

ALTER TABLE tab ALTER foo SET DEFAULT 3
INSERT INTO tab DEFAULT VALUES;

SELECT foo FROM tab;
-- 




Re: [HACKERS] effective SELECT from child tables

2005-10-03 Thread Rod Taylor
On Mon, 2005-10-03 at 23:51 -0400, Rod Taylor wrote:
 On Mon, 2005-10-03 at 23:24 -0400, Greg Stark wrote:
  Hannu Krosing [EMAIL PROTECTED] writes:
  
   On P, 2005-10-02 at 23:00 -0400, Tom Lane wrote:

Here's another interesting case to think about:

ALTER TABLE ADD foo integer DEFAULT 1
...
ALTER TABLE ALTER foo SET DEFAULT 2
   
You'll have to pay the table-traversal cost on one step or the other.
   
   The second, ALTER ... SET DEFAULT, would only set default for newly
   inserted columns, not the ones which are missing due to tuples being
   created before the column existed.
  
  Hm. So you're saying there are only ever exactly two types of defaults. The
  initial default that applies to all tuples that were created before the
  column was added. And the current default that only ever applies to newly
  created tuples.
  
  That does seem to cleanly close this hole.
 
 I don't think so.

Ignore me. The thread seems to be about allowing fast addition of
columns, not decreasing storage space.

For some reason I was thinking of a bitmap like the NULL bitmap for
compressing out all default values.


-- 




Re: [HACKERS] effective SELECT from child tables

2005-10-03 Thread Greg Stark
Rod Taylor [EMAIL PROTECTED] writes:

  Hm. So you're saying there are only ever exactly two types of defaults. The
  initial default that applies to all tuples that were created before the
  column was added. And the current default that only ever applies to newly
  created tuples.
  
  That does seem to cleanly close this hole.
 
 I don't think so.
 
 ALTER TABLE tab ADD foo integer DEFAULT 1;
 INSERT INTO tab DEFAULT VALUES;

This inserts a physical 1 in the record (the current default).

 ALTER TABLE tab ALTER foo SET DEFAULT 2
 INSERT INTO tab DEFAULT VALUES;

This inserts a physical 2 in the record.

 ALTER TABLE tab ALTER foo SET DEFAULT 3
 INSERT INTO tab DEFAULT VALUES;

This inserts a physical 3 in the record.

 SELECT foo FROM tab;

This checks for any old records that predate the column and uses the initial
default of 1 for those records. The three records above all postdate the
column addition, so they have values present, namely 1, 2, and 3.
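The scheme can be sketched as a toy model (Python, invented names, nothing like the real tuple format): the "initial" default is frozen at ADD COLUMN time and consulted only for rows physically too short to contain the attribute, while the "current" default is applied at insert time.

```python
# Toy model of the "initial default vs. current default" scheme
# discussed above (illustrative only, not PostgreSQL's tuple layout).
class Column:
    def __init__(self, initial_default):
        self.initial_default = initial_default   # frozen at ADD COLUMN
        self.current_default = initial_default   # changed by SET DEFAULT

table = []                       # tuples are plain lists of values
table.append([])                 # a row created *before* the column existed

foo = Column(initial_default=1)  # ALTER TABLE tab ADD foo integer DEFAULT 1

def insert_default():
    table.append([foo.current_default])

insert_default()                 # inserts a physical 1
foo.current_default = 2          # ALTER tab ALTER foo SET DEFAULT 2
insert_default()                 # inserts a physical 2
foo.current_default = 3          # ALTER tab ALTER foo SET DEFAULT 3
insert_default()                 # inserts a physical 3

def read_foo(row):
    # Only rows physically too short to contain the attribute fall
    # back to the frozen initial default.
    return row[0] if row else foo.initial_default

assert [read_foo(r) for r in table] == [1, 1, 2, 3]
```

Note that later SET DEFAULT commands never touch the frozen initial default, which is why the hole Tom raised stays closed.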

-- 
greg




Re: [HACKERS] Tuning current tuplesort external sort code for 8.2

2005-10-03 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 AFAICS the following opportunities exist, without changing any of the
 theoretical algorithms or the flexibility of definable datatypes:

 1. tuplesort_heap_siftup and tuplesort_heap_insert make no attempt to
 cache the values of keys that have been obtained from *_getattr macros.
 The two routines navigate a tournament sort heap, so that on average 50%
 of comparisons use at least one immediately preceding tuple and key
 values from that could be cached ready for the next call.

Hmm ... this seems interesting, but you also have to look at the
potential downside: what is the cost of doing the caching?
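To make the trade-off concrete, here's a rough sketch (Python, illustrative only; tuplesort.c obviously looks nothing like this) comparing key-extraction counts with and without caching:

```python
import functools

# Rough sketch of the key-caching trade-off (illustrative only).
# Without caching, every comparison pays the *_getattr-style
# extraction cost twice; with caching, the key is extracted exactly
# once per tuple up front.
extract_calls = 0

def getattr_key(tup):
    """Stands in for the *_getattr macro; counts how often keys are fetched."""
    global extract_calls
    extract_calls += 1
    return tup[2]

tuples = [(i, "payload", (97 * i) % 31) for i in range(100)]

def cmp_uncached(a, b):
    ka, kb = getattr_key(a), getattr_key(b)
    return (ka > kb) - (ka < kb)

# Uncached: the comparator re-extracts both keys on every comparison.
extract_calls = 0
sorted(tuples, key=functools.cmp_to_key(cmp_uncached))
uncached_cost = extract_calls

# Cached: extract each key once and compare the cached copies
# (what sorted(key=...) effectively does under the hood).
extract_calls = 0
sorted(tuples, key=getattr_key)
cached_cost = extract_calls

assert cached_cost == len(tuples)   # one extraction per tuple
assert uncached_cost > cached_cost  # comparisons dominate otherwise
```

The caching cost Tom asks about corresponds to the up-front "decorate" pass and the memory to hold the extracted keys, which may or may not pay for itself depending on how expensive key extraction is relative to comparison.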

 All of the remaining ideas relate to NULL handling.

I can't get excited about this.  Most sort scenarios involve few if any
nulls.

One thought that comes to mind is that the current system structure
encourages sorting on keys that are at the end of their tuples.
For instance, SELECT foo, bar FROM ... ORDER BY baz will sort by
the third column of the generated tuples, which is certainly the least
efficient to access.  It'd be interesting to look into generating the
working tuples with baz as the first column.  I fear this is nontrivial,
because there are a lot of places that expect resjunk columns to come
last, but we should study it.  (Note: this will do nada for Josh's
original complaint about index build time, since btree index sorts will
by definition use all the tuple's columns, but it seems like a
significant issue for ordinary sorts.)

regards, tom lane



[HACKERS] Vacuum and Transactions

2005-10-03 Thread Rod Taylor
As I understand it, vacuum operates outside of the regular transaction, and
if you stop it (SIGTERM, or pg_cancel_backend()) some of the work it
accomplished will be kept when it rolls back.

For large structures with a ton of dead entries (of which I seem to have a
case), running vacuum takes long enough that high-churn structures begin
to experience difficulties.

Is it reasonable to cancel and restart the vacuum process periodically
(say every 12 hours) until it manages to complete the work? It takes
about 2 hours to do the table scan, and should get in about 10 hours of
index work each round.

The "vacuum ignores vacuum" transaction concept looks handy right now.

-- 

