Re: [HACKERS] Time to run initdb is mostly figure-out-the-timezone work

2009-12-19 Thread Alex Hunsaker
On Fri, Dec 18, 2009 at 10:57, Tom Lane wrote:
> Obviously there's something there for the kernel guys to fix, but
> even with a non-borked kernel it's an expensive thing to do.

Any thoughts on back-patching this? While it's not a bug per se, it
seems reasonably low-risk.  I for one would love a 2-4x initdb speedup
in the back branches :)  Granted, now I know I can just set TZ...

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Time to run initdb is mostly figure-out-the-timezone work

2009-12-18 Thread Tom Lane
Alvaro Herrera writes:
> I notice that most of the difference is system time ... I imagine we do
> a lot of syscalls to guess the timezone.

Yeah, it seems to be mostly the cost of searching the timezone directory
tree and reading all those small files.  I was led to notice this
because Red Hat's latest devel kernels seem to have a bit of a
performance regression in this area:
https://bugzilla.redhat.com/show_bug.cgi?id=548403
Obviously there's something there for the kernel guys to fix, but
even with a non-borked kernel it's an expensive thing to do.

regards, tom lane



Re: [HACKERS] Time to run initdb is mostly figure-out-the-timezone work

2009-12-18 Thread Tom Lane
Joshua Tolley writes:
> On Fri, Dec 18, 2009 at 06:20:39PM +0100, Guillaume Lelarge wrote:
>> On 18/12/2009 at 18:07, Tom Lane wrote:
>>> On current Fedora 11, there is a huge difference in initdb time if you
>>> have TZ set versus if you don't: I get about 18 seconds versus less than
>>> four.

>> I have the exact same issue:

> For whatever it's worth, I get it too, on Ubuntu 9.04... ~4s without TZ vs.
> ~1.8s with TZ.

BTW, I just realized that it makes a difference that I customarily use
the configure option --with-system-tzdata=/usr/share/zoneinfo on that
machine.  I do it mainly because it saves a few seconds during "make
install", but also because Red Hat's PG packages use that option so I
want to test it regularly.  The impact of this is that the TZ search
also has to scan through a bunch of leap-second-aware timezone files,
which are not present in a default PG build's timezone tree.  So that
probably explains why I see a 4x slowdown while you get more like 2x.
Still, it seems worth doing something about, if it's as easy as a
one-line addition.
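
For reference, the slowdown factors can be recomputed from the timings
quoted in this thread.  This is only arithmetic on the reported numbers;
the dictionary layout and the helper name slowdown are made up for
illustration, and Joshua Tolley's figures are the approximate ones he gave.

```python
# (seconds without TZ, seconds with TZ), as reported in this thread
timings = {
    "Tom Lane, Fedora 11": (17.953, 3.767),
    "Guillaume Lelarge, Ubuntu 9.10": (7.972, 1.828),
    "Joshua Tolley, Ubuntu 9.04 (approx.)": (4.0, 1.8),
}


def slowdown(without_tz, with_tz):
    """How many times longer initdb took when TZ was unset."""
    return without_tz / with_tz


ratios = {who: slowdown(*t) for who, t in timings.items()}

if __name__ == "__main__":
    for who, ratio in ratios.items():
        print(f"{who}: {ratio:.1f}x")
```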

regards, tom lane



Re: [HACKERS] Time to run initdb is mostly figure-out-the-timezone work

2009-12-18 Thread Alvaro Herrera
Tom Lane wrote:
> On current Fedora 11, there is a huge difference in initdb time if you
> have TZ set versus if you don't: I get about 18 seconds versus less than
> four.

Wow, I can reproduce this (11-12 secs when no TZ versus 5 when TZ is
defined).  I'd never noticed because I normally have TZ set; but yes I
agree that this is worthwhile.

I notice that most of the difference is system time ... I imagine we do
a lot of syscalls to guess the timezone.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Time to run initdb is mostly figure-out-the-timezone work

2009-12-18 Thread Joshua Tolley
On Fri, Dec 18, 2009 at 06:20:39PM +0100, Guillaume Lelarge wrote:
> On 18/12/2009 at 18:07, Tom Lane wrote:
> > On current Fedora 11, there is a huge difference in initdb time if you
> > have TZ set versus if you don't: I get about 18 seconds versus less than
> > four.
> I have the exact same issue:

For whatever it's worth, I get it too, on Ubuntu 9.04... ~4s without TZ vs.
~1.8s with TZ.

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com




Re: [HACKERS] Time to run initdb is mostly figure-out-the-timezone work

2009-12-18 Thread Guillaume Lelarge
On 18/12/2009 at 18:07, Tom Lane wrote:
> On current Fedora 11, there is a huge difference in initdb time if you
> have TZ set versus if you don't: I get about 18 seconds versus less than
> four.
> 
> $ time initdb
> ... blah blah blah ...
> 
> real    0m17.953s
> user    0m6.490s
> sys     0m10.935s
> $ rm -rf $PGDATA
> $ export TZ=GMT
> $ time initdb
> ... blah blah blah ...
> 
> real    0m3.767s
> user    0m2.997s
> sys     0m0.784s
> $ 
> 
> The reason for this is that initdb launches the postmaster many times
> (at least 14) and each one of those launches results in a search of
> every file in the timezone database, if we don't have a TZ value to
> let us identify the timezone immediately.
> 
> Now this hardly matters to end users who seldom do initdb, but from a
> developer's perspective it would be awfully nice if initdb took less
> time.  If other people can reproduce similar behavior, I think it
> would be worth the trouble to have initdb forcibly set the TZ or PGTZ
> variable while it runs.

I have the exact same issue:

guilla...@laptop:~$ time initdb
The files belonging to this database system will be owned by user "guillaume".
[...]
real    0m7.972s
user    0m3.588s
sys     0m3.444s
guilla...@laptop:~$ export TZ=GMT
guilla...@laptop:~$ rm -rf t1
guilla...@laptop:~$ time initdb
[...]
real    0m1.828s
user    0m1.436s
sys     0m0.368s


This is on Ubuntu 9.10.

Quite impressive. I think I'll add an alias (alias initdb="TZ=GMT initdb").


-- 
Guillaume.
 http://www.postgresqlfr.org
 http://dalibo.com



[HACKERS] Time to run initdb is mostly figure-out-the-timezone work

2009-12-18 Thread Tom Lane
On current Fedora 11, there is a huge difference in initdb time if you
have TZ set versus if you don't: I get about 18 seconds versus less than
four.

$ time initdb
... blah blah blah ...

real    0m17.953s
user    0m6.490s
sys     0m10.935s
$ rm -rf $PGDATA
$ export TZ=GMT
$ time initdb
... blah blah blah ...

real    0m3.767s
user    0m2.997s
sys     0m0.784s
$ 

The reason for this is that initdb launches the postmaster many times
(at least 14) and each one of those launches results in a search of
every file in the timezone database, if we don't have a TZ value to
let us identify the timezone immediately.
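
To give a feel for that cost, here is a rough sketch of one such pass:
walk a directory tree and read every file in it.  It is not PostgreSQL's
actual search code, just an illustration of the I/O pattern; the function
name scan_tz_tree and the tiny stand-in tree built by demo() are
hypothetical.  Pointing scan_tz_tree at a real timezone directory and
timing it would approximate the work repeated on every postmaster launch.

```python
import os
import tempfile


def scan_tz_tree(root):
    """Walk `root` and read every file, as a brute-force search would.

    Returns (number_of_files_read, total_bytes_read).
    """
    n_files = 0
    n_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    n_bytes += len(fh.read())
            except OSError:
                continue  # unreadable entry or dangling symlink: skip it
            n_files += 1
    return n_files, n_bytes


def demo():
    """Build a two-file stand-in tree so the sketch runs anywhere."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "Europe"))
        samples = (("GMT", b"abc"), (os.path.join("Europe", "Paris"), b"defgh"))
        for rel, data in samples:
            with open(os.path.join(root, rel), "wb") as fh:
                fh.write(data)
        return scan_tz_tree(root)


if __name__ == "__main__":
    print(demo())
```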

Now this hardly matters to end users who seldom do initdb, but from a
developer's perspective it would be awfully nice if initdb took less
time.  If other people can reproduce similar behavior, I think it
would be worth the trouble to have initdb forcibly set the TZ or PGTZ
variable while it runs.  AFAIK it does not matter what timezone
environment postgres sees during initdb; we don't put that into the
config file.  It'd be about a one-line addition ...
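
Until something like that is in initdb itself, the same effect can be
had from outside by forcing TZ in the child's environment.  The sketch
below is a wrapper around an arbitrary command, not the proposed initdb
patch; the helper name run_with_tz is made up, and the child used here is
a stand-in that only reports the TZ value it sees.

```python
import os
import subprocess
import sys


def run_with_tz(cmd, tz="GMT"):
    """Run `cmd` with TZ forced to `tz` in the child's environment only."""
    env = dict(os.environ)  # copy, so the caller's environment is untouched
    env["TZ"] = tz
    return subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)


if __name__ == "__main__":
    # Stand-in child that prints the TZ it was given; substitute an
    # initdb command line to try this against a real installation.
    child = [sys.executable, "-c", "import os; print(os.environ['TZ'])"]
    print(run_with_tz(child).stdout.strip())
```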

Comments?

regards, tom lane
