Re: [HACKERS] [COMMITTERS] pgsql: Link postgres from all object files at once, to avoid the

2008-02-25 Thread Tom Lane
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> Hm, just noticed another little annoyance: CVS is going to complain about
>> the objfiles.txt files unless we add a .cvsignore entry to every last
>> subdirectory of the backend.

> Complain how?

Try a "cvs diff" or "cvs update" while these files are present.

> Why should it complain more or less than about the SUBSYS.o 
> files?

It has a hard-wired rule not to complain about files named *.o.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [COMMITTERS] pgsql: Link postgres from all object files at once, to avoid the

2008-02-25 Thread Peter Eisentraut
Tom Lane wrote:
> Hm, just noticed another little annoyance: CVS is going to complain about
> the objfiles.txt files unless we add a .cvsignore entry to every last
> subdirectory of the backend.

Complain how?  Why should it complain more or less than about the SUBSYS.o 
files?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] 8.3 / 8.2.6 restore comparison

2008-02-25 Thread Joshua D. Drake
On Mon, 25 Feb 2008 22:29:32 -0800
Jeff Davis <[EMAIL PROTECTED]> wrote:

> For me it would still be very helpful. If that 100GB table has several
> indexes, particularly on localized text, that can take a lot of
> processor time to rebuild (even for a substantially smaller dataset,
> like in the "7 hour restore" thread). It seems like a no-brainer to be
> able to utilize all available cores.

Oh, I agree that we should be using all cores. I would argue that we
should have been doing that for years now, but more important to me is
that pg_restore, even single-threaded, is slow.

> 
> I think we should consider all of these pg_restore improvements,
> because they're merely simplifying the DBA's job. Currently, to get
> these benefits, I have to organize and parallelize the restore
> manually.

Definitely.

> 
> Actually, the tests you're running are helping me as much as any
> pg_restore changes might anyway. I don't mind a small amount of extra
> work to dump/restore, but other users might get a bad impression of
> PostgreSQL if they don't know how to make it perform to their
> expectations.

Certainly, but having to hand-roll this is bad. It presents us in a
decidedly hackish light.

Sincerely,

Joshua D. Drake

> 
> Regards,
>   Jeff Davis
> 


-- 
The PostgreSQL Company since 1997: http://www.commandprompt.com/ 
PostgreSQL Community Conference: http://www.postgresqlconference.org/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL SPI Liaison | SPI Director |  PostgreSQL political pundit





Re: [HACKERS] Batch update of indexes on data loading

2008-02-25 Thread ITAGAKI Takahiro
Simon Riggs <[EMAIL PROTECTED]> wrote:

> One of the reasons why I hadn't wanted to pursue earlier ideas to use
> LOCK was that applying a lock will prevent running in parallel, which
> ultimately may prevent further performance gains.
> 
> Is there a way of doing this that will allow multiple concurrent COPYs?

I think this has the same difficulty as parallel queries: it requires
tighter communication among the COPY workers, whether we use a
multi-process or a multi-thread model.

We have independent concurrent COPYs now; COPYs are not aware of each
other because there is no intermediate state during COPY. However, COPY
will have "phases" if we use bulkbuild. Therefore, we will need to join
the COPY threads and pass their working memory between them.

Here is a possible multi-threaded workload:

  A. For each row:
  1. Parsing new coming data
  2. Add the row into the heap.
  3. Spool index entries to each index spooler.
  B. Wait for all threads.
  C. Merge spools and corresponding existing indexes into new ones.

Phase A can run concurrently, just as COPY does now; A1 and A2 are
independent jobs. We could have a shared spooler or per-thread spoolers.
Phase B is needed so that the indexes are built all at once; otherwise
the work is done twice.
Phase C can run concurrently, one thread per index. Each thread is
responsible for building one index: it merges the existing index with
the single shared spool, or with multiple spools if we use per-thread
spoolers.

One of the issues is how to pass or share spoolers between COPY threads.
Another is how to make it transaction-safe: if one of the threads fails
to build its index, all threads should roll back.
I'm not sure how to do these...
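The phased design above can be sketched in miniature with plain Python threads standing in for COPY backends. Everything here (the spool lists, the merged index, the barrier) is illustrative only, not PostgreSQL internals:

```python
import threading

# Toy model of the phased bulk-load: Phase A spools keys per worker,
# Phase B is a barrier, Phase C merges all spools into one index.
NUM_WORKERS = 3
ROWS = [("a", 1), ("b", 2), ("c", 3), ("d", 4), ("e", 5), ("f", 6)]

heap = []                                  # stand-in for the table heap
spools = [[] for _ in range(NUM_WORKERS)]  # per-thread index spools
heap_lock = threading.Lock()
barrier = threading.Barrier(NUM_WORKERS)   # Phase B: wait for all threads
merged_index = []                          # the rebuilt index

def copy_worker(wid):
    # Phase A: "parse" rows, add them to the heap, spool index entries
    for row in ROWS[wid::NUM_WORKERS]:
        with heap_lock:
            heap.append(row)               # A2: insert into heap
        spools[wid].append(row[0])         # A3: spool the index key
    barrier.wait()                         # Phase B: join point
    if wid == 0:
        # Phase C: one builder thread per index (a single index here)
        # merges every spool with the existing (empty) index.
        merged_index.extend(sorted(k for s in spools for k in s))

threads = [threading.Thread(target=copy_worker, args=(i,)) for i in range(NUM_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(merged_index)
```

The hard parts the mail identifies, sharing spoolers across backends and rolling back every worker if one index build fails, have no analogue in this toy; that is exactly why they remain the open questions.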

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center





Re: [HACKERS] 8.3 / 8.2.6 restore comparison

2008-02-25 Thread Jeff Davis

On Mon, 2008-02-25 at 21:18 -0800, Joshua D. Drake wrote:
> As simple as this solution is, it is neither elegant nor smart.
> Using this method, if you have a 100GB table (which is very common)
> you are still bound in a bad way by a single connection, and you are
> holding up everyone else.

In your case I can see your point. 

For me it would still be very helpful. If that 100GB table has several
indexes, particularly on localized text, that can take a lot of
processor time to rebuild (even for a substantially smaller dataset,
like in the "7 hour restore" thread). It seems like a no-brainer to be
able to utilize all available cores.

I think one big improvement is to break it into steps as Simon suggests
here:
http://archives.postgresql.org/pgsql-hackers/2008-02/msg00205.php

and my idea to further break it down:
http://archives.postgresql.org/pgsql-hackers/2008-02/msg00699.php

I think we should consider all of these pg_restore improvements, because
they're merely simplifying the DBA's job. Currently, to get these
benefits, I have to organize and parallelize the restore manually.
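The manual procedure can be sketched with pg_restore's real -l/-L list mechanism. The TOC contents below are a fabricated stand-in for the output of `pg_restore -l dump.fc`, and the commented commands assume a database `mydb` and archive `dump.fc`:

```shell
# Stand-in TOC; in practice generate it with: pg_restore -l dump.fc > toc.list
cat > toc.list <<'EOF'
1; 1259 16384 TABLE public t1 owner
2; 0 16384 TABLE DATA public t1 owner
3; 1259 16390 TABLE public t2 owner
4; 0 16390 TABLE DATA public t2 owner
5; 1259 16395 INDEX public t1_idx owner
EOF

# Split the TOC into schema, data, and index phases.
grep -Ev 'TABLE DATA|INDEX' toc.list > schema.list
grep 'TABLE DATA' toc.list > data.list
grep 'INDEX' toc.list > post.list

# Then (hypothetically): schema first, data loads in parallel, indexes last.
#   pg_restore -L schema.list -d mydb dump.fc
#   split -l 1 data.list part. ; for f in part.*; do
#       pg_restore -L "$f" -d mydb dump.fc &
#   done; wait
#   pg_restore -L post.list -d mydb dump.fc
```

This is the organize-and-parallelize-by-hand work the paragraph above describes; a built-in parallel pg_restore would do the same phase split internally.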

Actually, the tests you're running are helping me as much as any
pg_restore changes might anyway. I don't mind a small amount of extra
work to dump/restore, but other users might get a bad impression of
PostgreSQL if they don't know how to make it perform to their
expectations.

Regards,
Jeff Davis




Re: [HACKERS] pg_dump additional options for performance

2008-02-25 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> ... So it would be good if we could dump objects in 3 groups
> 1. all commands required to re-create table
> 2. data
> 3. all commands required to complete table after data load

[ much subsequent discussion snipped ]

BTW, what exactly was the use-case for this?  The recent discussions
about parallelizing pg_restore make it clear that the all-in-one
dump file format still has lots to recommend it.  So I'm just wondering
what the actual advantage of splitting the dump into multiple files
will be.  It clearly makes life more complicated; what are we buying?

regards, tom lane



Re: [HACKERS] 8.3 / 8.2.6 restore comparison

2008-02-25 Thread Joshua D. Drake
On Mon, 25 Feb 2008 14:11:16 -0800
"Joshua D. Drake" <[EMAIL PROTECTED]> wrote:

O.k.: with 24 connections, 3.5 hours. Testing with 12 now; the
6-connection run never finished due to a bug.

Observations:

As simple as this solution is, it is neither elegant nor smart.
Using this method, if you have a 100GB table (which is very common)
you are still bound in a bad way by a single connection, and you are
holding up everyone else.

Sincerely,

Joshua D. Drake

-- 
The PostgreSQL Company since 1997: http://www.commandprompt.com/ 
PostgreSQL Community Conference: http://www.postgresqlconference.org/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL SPI Liaison | SPI Director |  PostgreSQL political pundit





Re: [HACKERS] [COMMITTERS] pgsql: Link postgres from all object files at once, to avoid the

2008-02-25 Thread Tom Lane
"D'Arcy J.M. Cain" <[EMAIL PROTECTED]> writes:
> On Mon, 25 Feb 2008 21:39:27 -0500
> Tom Lane <[EMAIL PROTECTED]> wrote:
>> Hm, just noticed another little annoyance: CVS is going to complain about
>> the objfiles.txt files unless we add a .cvsignore entry to every last
>> subdirectory of the backend.  That seems like a lot of maintenance
>> tedium.  I wonder if there's another way, such as using a file name that
>> CVS is already programmed to ignore.

> Why not just add it to CVSROOT/cvsignore?

Shows you how little I know about CVS repository administration ;-)
Yeah, that sounds like a simple fix.
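For reference, the repository-wide fix is a one-line addition to the administrative cvsignore file (a sketch assuming commit access to CVSROOT):

```
cvs checkout CVSROOT
echo "objfiles.txt" >> CVSROOT/cvsignore
cvs commit -m "Ignore generated objfiles.txt everywhere" CVSROOT/cvsignore
```

Patterns listed in CVSROOT/cvsignore apply to every directory in the repository, which is what avoids touching each backend subdirectory individually.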

regards, tom lane



[HACKERS] Reference by in \d out

2008-02-25 Thread kenneth d'souza

Hi, 
 
 
Referring to this request http://momjian.us/mhonarc/patches_hold/msg00022.html
I have created a patch. The output doesn't exactly match what is stated
here http://momjian.us/mhonarc/patches_hold/msg00023.html.

However, it does show the required details in a similar format. I had
posted this on -patches but somehow it is not appearing as a thread, and
hence I have posted it on -hackers.
 
Comments?
 
 
osdb_pgarch=# \d htest
          Table "public.htest"
    Column    |     Type      | Modifiers
--------------+---------------+-----------
 new_id       | integer       | not null
 test_name    | character(20) |
 test_cust_id | integer       |
Indexes:
    "htest_pkey" PRIMARY KEY, btree (new_id)
Foreign-key constraints:
    "htest_test_cust_id_fkey" FOREIGN KEY (test_cust_id) REFERENCES customers(customer_id)
Referenced by:
    "htest_child_ctest_cust_id_fkey" IN public.htest_child(ctest_cust_id) REFERENCES htest(new_id)
    "htest_child1_ctest_cust_id_fkey" IN public.htest_child1(ctest_cust_id) REFERENCES htest(new_id)
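The catalog query behind the new footer can also be written standalone. This is a hedged restatement using explicit JOINs; the pg_catalog table and column names are real, and the target table 'htest' is the one from the example above:

```sql
-- List foreign-key constraints that reference table "htest".
SELECT con.conname,
       ns.nspname   AS referencing_schema,
       rel.relname  AS referencing_table,
       pg_catalog.pg_get_constraintdef(con.oid, true) AS definition
FROM pg_catalog.pg_constraint con
JOIN pg_catalog.pg_class rel ON rel.oid = con.conrelid
JOIN pg_catalog.pg_namespace ns ON ns.oid = rel.relnamespace
WHERE con.contype = 'f'
  AND con.confrelid = 'public.htest'::regclass;
```

The contype = 'f' filter is technically redundant (only foreign-key rows have a nonzero confrelid), but it makes the intent explicit.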
 
 
 
 
 
diff describe.c_orig describe.c
1109c1109,1110
< 			   *result6 = NULL;
---
> 			   *result6 = NULL,
> 			   *result7 = NULL;
1114a1116
> 				refof_count = 0,
1247,1248c1249,1265
< 	footers = pg_malloc_zero((index_count + check_count + rule_count + trigger_count + foreignkey_count + inherits_count + 7 + 1)
< 							 * sizeof(*footers));
---
> 	/* referenced-by count */
> 	printfPQExpBuffer(&buf,
> 					  "SELECT c.conname, n.nspname, p2.relname, pg_catalog.pg_get_constraintdef(c.oid, true)\n"
> 					  "FROM pg_catalog.pg_class p, pg_catalog.pg_constraint c, pg_catalog.pg_class p2, pg_catalog.pg_namespace n\n"
> 					  "WHERE p.oid = '%s' AND c.confrelid = '%s'\n"
> 					  "AND c.conrelid = p2.oid AND n.oid = p2.relnamespace", oid, oid);
> 
> 	result7 = PSQLexec(buf.data, false);
> 	if (!result7)
> 		goto error_return;
> 	else
> 		refof_count = PQntuples(result7);
> 
> 	footers = pg_malloc_zero((index_count + check_count + rule_count + trigger_count + foreignkey_count + inherits_count + refof_count + 7 + 1) * sizeof(*footers));
1483a1501,1526
> 	/* print referenced-by details */
> 	if (refof_count > 0)
> 	{
> 		printfPQExpBuffer(&buf, _("Referenced by:"));
> 		footers[count_footers++] = pg_strdup(buf.data);
> 		for (i = 0; i < refof_count; i++)
> 		{
> 			const char *refbydef;
> 			const char *usingpos;
> 
> 			printfPQExpBuffer(&buf, _("  \"%s\" IN %s.%s"),
> 							  PQgetvalue(result7, i, 0),
> 							  PQgetvalue(result7, i, 1),
> 							  PQgetvalue(result7, i, 2));
> 
> 			/* Everything after "FOREIGN KEY " is echoed verbatim */
> 			refbydef = PQgetvalue(result7, i, 3);
> 			usingpos = strstr(refbydef, "FOREIGN KEY ");
> 			if (usingpos)
> 				refbydef = usingpos + 12;
> 			appendPQExpBuffer(&buf, "%s", refbydef);
> 
> 			footers[count_footers++] = pg_strdup(buf.data);
> 		}
> 	}
 
 
_
Tried the new MSN Messenger? It’s cool! Download now.
http://messenger.msn.com/Download/Default.aspx?mkt=en-in

[HACKERS] Re: [COMMITTERS] pgsql: Link postgres from all object files at once, to avoid the

2008-02-25 Thread D'Arcy J.M. Cain
On Mon, 25 Feb 2008 21:39:27 -0500
Tom Lane <[EMAIL PROTECTED]> wrote:
> Hm, just noticed another little annoyance: CVS is going to complain about
> the objfiles.txt files unless we add a .cvsignore entry to every last
> subdirectory of the backend.  That seems like a lot of maintenance
> tedium.  I wonder if there's another way, such as using a file name that
> CVS is already programmed to ignore.

Why not just add it to CVSROOT/cvsignore?

-- 
D'Arcy J.M. Cain <[EMAIL PROTECTED]> |  Democracy is three wolves
http://www.druid.net/darcy/|  and a sheep voting on
+1 416 425 1212 (DoD#0082)(eNTP)   |  what's for dinner.



Re: [HACKERS] libpq.rc make rule

2008-02-25 Thread Dave Page
On Mon, Feb 25, 2008 at 11:48 PM, Peter Eisentraut <[EMAIL PROTECTED]> wrote:
> Dave Page wrote:
>  > Yes, because newer builds may be linked against updated runtime
>  > versions. We need to be sure the installer will upgrade the file so it
>  > definitely matches any runtimes (or other dependencies) that we're
>  > also installing/upgrading.
>
>  If it is so very important to update this file for every build, why are we
>  shipping it in the distribution tarball, which is done under the assumption
>  that it never has to be updated?  Something doesn't fit here.

That I can't answer.

>  Also, does this theory apply to all shared libraries?  What about the ecpg
>  libraries?

All user-facing binaries are affected, both executables and libraries.
IIRC, we don't bother with the contrib DLLs or the conversion libraries
etc., as they are far less likely to cause problems. I can't help
thinking there's something else I'm forgetting as well...


-- 
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com
The Oracle-compatible database company



Re: [HACKERS] libpq.rc make rule

2008-02-25 Thread Peter Eisentraut
Dave Page wrote:
> Yes, because newer builds may be linked against updated runtime
> versions. We need to be sure the installer will upgrade the file so it
> definitely matches any runtimes (or other dependencies) that we're
> also installing/upgrading.

If it is so very important to update this file for every build, why are we 
shipping it in the distribution tarball, which is done under the assumption 
that it never has to be updated?  Something doesn't fit here.

Also, does this theory apply to all shared libraries?  What about the ecpg 
libraries?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] idea: simple variadic functions in SQL and PL/pgSQL

2008-02-25 Thread Andrew Dunstan



Pavel Stehule wrote:

Hello,

I found an easy implementation of variadic functions. It's based on
adapting FuncnameGetCandidates: when a variadic function is found, the
right number of trailing arguments (the difference between pronargs and
nargs) is generated. A variadic function can be marked via a flag or via
some pseudotype. A flag is better, as it allows variadic arguments of
any type. In static languages (like SQL or PL/pgSQL) the variadic
arguments can be accessed via an array (variadic arguments can only be
non-array). This isn't a problem in the C language, where the arguments
are available directly.
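To make the idea concrete, here is one possible surface syntax (a hypothetical sketch only, not necessarily what the patch implements), where trailing arguments are collected into an array:

```sql
-- Hypothetical declaration: callers may pass any number of text
-- arguments after "sep"; the function body sees them as text[].
CREATE FUNCTION concat_with(sep text, VARIADIC rest text[])
RETURNS text AS $$
    SELECT array_to_string($2, $1);
$$ LANGUAGE sql;

-- concat_with('-', 'a', 'b', 'c') would then yield 'a-b-c'
```

The open design questions below (what element type is inferred, how the array is typed in PL/pgSQL) apply directly to a declaration like this.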
  


There are a whole slew of questions around this, ISTM.

For example: What will be the type inferred for the array of variadic 
args in plpgsql?


If we are going to do this I think we need some discussion on design 
before we rush into it.


cheers

andrew






Re: [HACKERS] libpq.rc make rule

2008-02-25 Thread Dave Page
On Mon, Feb 25, 2008 at 10:03 PM, Peter Eisentraut <[EMAIL PROTECTED]> wrote:
> Dave Page wrote:
>  > It's used on Windows to ensure that installers can do the right thing
>  > when replacing a copy of libpq.dll. The daily build number was the
>  > most maintenance-free way of getting a fourth value for the version
>  > resource.
>
>  Isn't that what the shared library version numbers are for?  Do you need 
> finer
>  resolution than that?

Yes, because newer builds may be linked against updated runtime
versions. We need to be sure the installer will upgrade the file so it
definitely matches any runtimes (or other dependencies) that we're
also installing/upgrading.


-- 
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com
The Oracle-compatible database company



Re: [HACKERS] 8.3 / 8.2.6 restore comparison

2008-02-25 Thread Joshua D. Drake

On Mon, 25 Feb 2008 14:05:58 -0800
Jeff Davis <[EMAIL PROTECTED]> wrote:

> > Yep :) but as a note:
> > 
> > I am currently testing on the data set that is giving us all these
> > issues. Previously we were pushing ~ 22G an hour over a single
> > thread. I am currently pushing ~ 28G every 16 minutes over 6
> > threads.
> > 
> > With 30-40% IO wait.
> 
> That raises the question: what about 12 threads?

That is exactly what Alvaro said :P, let me finish this one first.

Joshua D. Drake



-- 
The PostgreSQL Company since 1997: http://www.commandprompt.com/ 
PostgreSQL Community Conference: http://www.postgresqlconference.org/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL SPI Liaison | SPI Director |  PostgreSQL political pundit




Re: [HACKERS] 8.3 / 8.2.6 restore comparison

2008-02-25 Thread Jeff Davis
On Mon, 2008-02-25 at 12:28 -0800, Joshua D. Drake wrote:
> 
> On Mon, 25 Feb 2008 12:17:10 -0800
> Jeff Davis <[EMAIL PROTECTED]> wrote:
> 
> 
> > > I would personally rather keep it simple, hard core, and data
> > > shoving as possible without any issue with scheduling etc..
> > > 
> > 
> > Just a thought. After it's actually implemented it won't be hard to
> > see if it's a win.
> 
> Yep :) but as a note:
> 
> I am currently testing on the data set that is giving us all these
> issues. Previously we were pushing ~ 22G an hour over a single thread.
> I am currently pushing ~ 28G every 16 minutes over 6 threads.
> 
> With 30-40% IO wait.

That raises the question: what about 12 threads?

Regards,
Jeff Davis




Re: [HACKERS] libpq.rc make rule

2008-02-25 Thread Peter Eisentraut
Dave Page wrote:
> It's used on Windows to ensure that installers can do the right thing
> when replacing a copy of libpq.dll. The daily build number was the
> most maintenance-free way of getting a fourth value for the version
> resource.

Isn't that what the shared library version numbers are for?  Do you need finer 
resolution than that?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] libpq.rc make rule

2008-02-25 Thread Dave Page
On Mon, Feb 25, 2008 at 7:21 PM, Peter Eisentraut <[EMAIL PROTECTED]> wrote:
> The libpq.rc make rule says:
>
>  # depend on Makefile.global to force rebuild on re-run of configure
>  $(srcdir)/libpq.rc: libpq.rc.in $(top_builddir)/src/Makefile.global
> sed -e 's/\(VERSION.*\),0 *$$/\1,'`date '+%y%j' | sed 's/^0*//'`'/' < $< > $@
>
>  However, libpq.rc is also included in the distribution, so whenever a
>  distribution is built, the distributed file would be overwritten, which is
>  not a nice thing to do.
>
>  Could someone explain what the requirement behind this is?  The '+%y%j'
>  changes every day.  Why is libpq the only subsystem that needs a daily
>  version number?

It's used on Windows to ensure that installers can do the right thing
when replacing a copy of libpq.dll. The daily build number was the
most maintenance-free way of getting a fourth value for the version
resource.
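Concretely, the rule's fourth version component is just the two-digit year concatenated with the day of year, with leading zeros stripped so it reads as a plain decimal integer (for example, 2008-02-25 yields 8056):

```shell
# Same expression the make rule embeds in libpq.rc:
# two-digit year + day-of-year, leading zeros removed.
build=$(date '+%y%j' | sed 's/^0*//')
echo "$build"
```

Because the value changes daily, any rebuild produces a strictly newer fourth field, which is what lets the Windows installer decide to replace the DLL.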

-- 
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com
The Oracle-compatible database company



Re: [HACKERS] 8.3 / 8.2.6 restore comparison

2008-02-25 Thread Joshua D. Drake

On Mon, 25 Feb 2008 12:17:10 -0800
Jeff Davis <[EMAIL PROTECTED]> wrote:


> > I would personally rather keep it simple, hard core, and data
> > shoving as possible without any issue with scheduling etc..
> > 
> 
> Just a thought. After it's actually implemented it won't be hard to
> see if it's a win.

Yep :) but as a note:

I am currently testing on the data set that is giving us all these
issues. Previously we were pushing ~ 22G an hour over a single thread.
I am currently pushing ~ 28G every 16 minutes over 6 threads.

With 30-40% IO wait.

Sincerely,

Joshua D. Drake


-- 
The PostgreSQL Company since 1997: http://www.commandprompt.com/ 
PostgreSQL Community Conference: http://www.postgresqlconference.org/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL SPI Liaison | SPI Director |  PostgreSQL political pundit




Re: [HACKERS] Strange behavior with leap dates and centuries BC

2008-02-25 Thread Bernd Helmle
--On Monday, February 25, 2008 14:04:18 -0500, Tom Lane <[EMAIL PROTECTED]> 
wrote:



The other issue is whether to throw error for year zero, rather than
silently interpreting it as 1 BC.  I can't recall whether that behavior
was intentional at the time, but given our current rather strict
interpretation of date validity checking, it hardly seems like a good
idea now.  What I suggest is that we throw error in 8.4 and beyond,
but not back-patch that change, so as to avoid introducing a behavioral
change in minor releases.


That sounds reasonable. I'm still trying to find out how such a date
managed to get into the database, since it does not seem to be intended
client behavior. Maybe it's an erroneous to_date() format.


--
 Thanks

   Bernd



Re: [HACKERS] 8.3 / 8.2.6 restore comparison

2008-02-25 Thread Jeff Davis
On Mon, 2008-02-25 at 12:05 -0800, Joshua D. Drake wrote:
> 
> On Mon, 25 Feb 2008 11:36:56 -0800
> Jeff Davis <[EMAIL PROTECTED]> wrote:
> 
> > 
> > If there is any significant I/O latency for a single backend, it seems
> > like a context switch could be a win for processor utilization. It
> > might not be a win overall, but at least potentially a win.
> 
> Do we want a 20% potential win or an 80% potential win?
> 
> I would personally rather keep it simple, hard core, and data shoving
> as possible without any issue with scheduling etc..
> 

Just a thought. After it's actually implemented it won't be hard to see
if it's a win.

Regards,
Jeff Davis




Re: [HACKERS] [PATCHES] Avahi support for Postgresql

2008-02-25 Thread Tom Lane
Mathias Hasselmann <[EMAIL PROTECTED]> writes:
> Just to be sure we talk about the same topic: I assume the prohibition
> you talk about is something like "no use of threads in Postmaster"?

Correct.

> If that's the case: Are there some docs, mails, ... explaining the
> rationale behind this restriction? I could imagine your do not want
> random locking in the postmaster code?

Portability, irreproducible misbehavior, etc.  Some trawling in the
pgsql-hackers archives should turn up previous discussions.  For a
recent demonstration that wanting to avoid threads is not just idle
paranoia on our part, see
http://archives.postgresql.org/pgsql-patches/2007-09/msg00194.php

regards, tom lane



Re: [HACKERS] 8.3 / 8.2.6 restore comparison

2008-02-25 Thread Joshua D. Drake

On Mon, 25 Feb 2008 11:36:56 -0800
Jeff Davis <[EMAIL PROTECTED]> wrote:

> 
> If there is any significant I/O latency for a single backend, it seems
> like a context switch could be a win for processor utilization. It
> might not be a win overall, but at least potentially a win.

Do we want a 20% potential win or an 80% potential win?

I would personally rather keep it simple, hard core, and data shoving
as possible without any issue with scheduling etc..

Sincerely,

Joshua D. Drake


-- 
The PostgreSQL Company since 1997: http://www.commandprompt.com/ 
PostgreSQL Community Conference: http://www.postgresqlconference.org/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL SPI Liaison | SPI Director |  PostgreSQL political pundit




Re: [HACKERS] Tuning 8.3

2008-02-25 Thread Christopher Browne
"Get thee to a connection pooler ASAP."

We've got systems where we establish ~1K connections, but that's on
UNIX, where the handling of large systems is *WAY* more mature than
Windows.

Any time those kinds of quantities of connections appear necessary, it
seems highly preferable to be using connection pooling so as to try to
reduce the number of actual connections and to increase the
per-connection usage.
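As a concrete illustration, here is a minimal configuration for one such pooler (PgBouncer; the parameter names are from its documentation, the values are made up) that multiplexes up to a thousand client connections onto a couple of dozen real backends:

```ini
; pgbouncer.ini -- illustrative values only
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction   ; reuse server connections between transactions
max_client_conn = 1000    ; many cheap client slots...
default_pool_size = 24    ; ...served by few real backend connections
```

Clients connect to port 6432 instead of the server directly; the pooler keeps the number of actual PostgreSQL backends small regardless of how many clients attach.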

-- 
http://linuxfinances.info/info/linuxdistributions.html
"The definition of insanity is doing the same thing over and over and
expecting different results."  -- assortedly attributed to Albert
Einstein, Benjamin Franklin, Rita Mae Brown, and Rudyard Kipling



Re: [HACKERS] [PATCHES] Avahi support for Postgresql

2008-02-25 Thread Mathias Hasselmann

On Monday, 2008-02-25 at 14:32 -0300, Alvaro Herrera wrote:
> Peter Eisentraut wrote:
> > On Monday, February 25, 2008, Alvaro Herrera wrote:
> > > Hmm, a quick look at the third patch reveals that it is using the
> > > "threaded" Avahi client.  That's a showstopper.
> > 
> > Could you elaborate why that is?
> 
> Because it creates a new thread under the Postmaster to handle Avahi
> events, if I'm reading the Avahi docs right.  This is verboten.

Just to be sure we are talking about the same topic: I assume the
prohibition you are talking about is something like "no use of threads
in the postmaster"?

If that's the case: are there docs, mails, ... explaining the rationale
behind this restriction? I imagine you do not want random locking in the
postmaster code?

See, interaction points with the main thread are very small:

1) Lock-free creation of the threaded Avahi client in PostmasterMain()
2) Locked shutdown of the Avahi client in ExitAvahiClient(), which only
   is called from ExitPostmaster().

So IMHO using the threaded poll API has a much smaller impact on the
behavior of the postmaster process than any attempt to integrate Avahi
with the postmaster's main loop.

> We have an event loop in the postmaster -- see ServerLoop.  Is there a
> reason the Avahi events could not be hooked in there?

Currently there are four ___well tested___ implementations of Avahi's
poll API: AvahiSimplePoll, which really just works for simple command
line tools and demonstration purposes. The single threaded APIs that
integrate with the main loops of glib and Qt, and the threaded poll API.

Avahi's requirements for a poll API aren't exactly trivial: you don't
only have to care about file descriptors, you also have to implement
some kind of timeout scheduling. So rather than reinventing the wheel
and delivering an untested custom poll API, I've chosen the threaded
poll API: it's the only well-tested poll API that fits into PostgreSQL,
and its interaction points with the postmaster process are minimal.

From looking at ServerLoop() I do not see any facilities for
registering timeout callbacks; select timeouts are static. So
implementing Avahi's poll API in ServerLoop() would require some radical
code changes. I don't believe such changes would be justified unless
other portions of the postmaster also need timeout callbacks.

Ciao,
Mathias
-- 
Mathias Hasselmann <[EMAIL PROTECTED]>
http://www.openismus.com/ - We can get it done.




Re: [HACKERS] 8.3 / 8.2.6 restore comparison

2008-02-25 Thread Jeff Davis
On Sun, 2008-02-24 at 09:47 -0800, Joshua D. Drake wrote:
> A less hacker and more DBA bottleneck will be to limit the number of 
> backends being created for restore. We don't really want to have more 
> than one backend per CPU, otherwise we just start switching.

Are you sure that it would always be a loss?

If there is any significant I/O latency for a single backend, it seems
like a context switch could be a win for processor utilization. It might
not be a win overall, but at least potentially a win.

Regards,
Jeff Davis




Re: [HACKERS] [PATCHES] Avahi support for Postgresql

2008-02-25 Thread Tom Lane
Mathias Hasselmann <[EMAIL PROTECTED]> writes:
> On Monday, 2008-02-25 at 15:05 +0100, Peter Eisentraut wrote:
>> Is there a reason we couldn't use the Bonjour compatibility layer offered by
>> Avahi to keep the code differences to a minimum?

> 1) Avahi's compatibility layer doesn't implement the now-deprecated
> function DNSServiceRegistrationCreate().

Note that Apple themselves have been deprecating
DNSServiceRegistrationCreate for some time:

postmaster.c: In function 'PostmasterMain':
postmaster.c:856: warning: 'DNSServiceRegistrationCreate' is deprecated 
(declared at /usr/include/DNSServiceDiscovery/DNSServiceDiscovery.h:139)

It's a fairly good bet that the function will disappear entirely from OS
X at some point, so we're going to have to change this code soon anyway.

What I'd like to know is whether the Avahi API that this patch is using
is compatible with whatever Apple is pushing as the not-deprecated API.
It would be annoying to tell Mac users that they have to install Avahi
to get at functionality that their platform provides natively.

regards, tom lane



[HACKERS] libpq.rc make rule

2008-02-25 Thread Peter Eisentraut
The libpq.rc make rule says:

# depend on Makefile.global to force rebuild on re-run of configure
$(srcdir)/libpq.rc: libpq.rc.in $(top_builddir)/src/Makefile.global
sed -e 's/\(VERSION.*\),0 *$$/\1,'`date '+%y%j' | sed 's/^0*//'`'/' < $< > $@

However, libpq.rc is also included in the distribution, so whenever a
distribution is built, the distributed file would be overwritten, which is
not a nice thing to do.

Could someone explain what the requirement behind this is?  The '+%y%j'
changes every day.  Why is libpq the only subsystem that needs a daily
version number?
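For illustration, the daily stamp that this rule substitutes into the fourth
component of the resource VERSION can be sketched in Python (a hedged
approximation of the `date '+%y%j' | sed 's/^0*//'` pipeline, not part of the
build system itself):

```python
import datetime

def rc_build_number(d: datetime.date) -> int:
    # Two-digit year followed by zero-padded day-of-year, with leading
    # zeros stripped -- mirroring `date '+%y%j' | sed 's/^0*//'`.
    s = d.strftime('%y%j')
    return int(s.lstrip('0') or '0')
```

So a build on 2008-02-25 stamps the file with 8056; the stamp changes every
day, which is why the distributed libpq.rc keeps getting overwritten.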

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [PATCHES] Avahi support for Postgresql

2008-02-25 Thread Mathias Hasselmann

On Monday, 25 February 2008 at 15:05 +0100, Peter Eisentraut wrote:
> On Tuesday, 27 November 2007, Mathias Hasselmann wrote:
> > Postmaster already has code to announce its services via DNS-SD
> > (ZeroConf) by using Apple's Bonjour API. This series of patches
> > implements that capability on top of the Avahi library[1] which
> > is free software, available for a wider variety of platforms.
> 
> Is there a reason we couldn't use the Bonjour compatibility layer offered by 
> Avahi to keep the code differences to a minimum?

1) The Avahi's compatibility layer doesn't implement the now deprecated
function DNSServiceRegistrationCreate().

2) Unless DNSServiceRegistrationCreate() installs a lot of black magic,
the Bonjour code in postmaster.c has very poor error handling: There
seem to be no attempts made to handle name collisions (unless Bonjour
does this automatically, of course).

Ciao,
Mathias
-- 
Mathias Hasselmann <[EMAIL PROTECTED]>
http://www.openismus.com/ - We can get it done.


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Strange behavior with leap dates and centuries BC

2008-02-25 Thread Tom Lane
Bernd Helmle <[EMAIL PROTECTED]> writes:
> I stepped through the code in datetime.c and it seems the culprit here is 
> DecodeDate(). It gets the date string from DecodeDateTime(), but without 
> the 'BC' century notation. However, it then performs the following check

Yeah, I had just come to the same conclusion.  It is premature for
DecodeDate to be trying to check this, because AFAICT there is no input
syntax in which it will be given both the date fields and the AD/BC
field.  There is already checking code at the bottom of DecodeDateTime,
so it looks to me like DecodeDate's checks are simply redundant in that
code path.  They aren't redundant in the calls from DecodeTimeOnly,
however.  I think we should move the date range checks and BC adjustment
into a separate function ValidateDate() that is called from the bottom
of the outer loops in DecodeDateTime/DecodeTimeOnly.

The other issue is whether to throw error for year zero, rather than
silently interpreting it as 1 BC.  I can't recall whether that behavior
was intentional at the time, but given our current rather strict
interpretation of date validity checking, it hardly seems like a good
idea now.  What I suggest is that we throw error in 8.4 and beyond,
but not back-patch that change, so as to avoid introducing a behavioral
change in minor releases.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Questions about indexes with text_pattern_ops

2008-02-25 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> It may be more right in an abstract ideal world -- the reality is that text
> collation is annoyingly complex. But this may be a case where we can get away
> with just eliding this hassle.

If anyone actually complains about it, I think we can point to the SQL
spec, which unambiguously says that a multicolumn sort key is considered
one column at a time.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] One more option for pg_dump...

2008-02-25 Thread Leonardo Cezar
On Mon, Feb 25, 2008 at 12:33 PM, David BOURIAUD
<[EMAIL PROTECTED]> wrote:
> On Monday, 25 February 2008, Leonardo Cezar wrote:
>
>  Hi Leonardo,
>  Thanks for your quick answer, I didn't know it was a TODO item, and that
>  some people were working on it... Keep going, then, because I'm really
>  waiting for these features!

As I said before, I'm writing a *proposal* (prototype) patch, which I
should publish here in the next few days. Since it would be my first
pg_dump patch, I'd like to discuss, among other things, what the best
syntax would be.

-Leo
-- 
Leonardo Cezar et all
http://www.dextra.com.br/postgres
http://www.postgresql.org.br



Re: [HACKERS] build environment: a different makefile

2008-02-25 Thread Peter Eisentraut
On Wednesday, 6 February 2008, Paul van den Bogaard wrote:
> I was hoping someone in the community already has a makefile that
> "just" creates object files from C-sources directly that I can use to
> try out the effect of in-lining to the performance of postgres.

This is now the default in 8.4devel.  Let us know what you find out.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Strange behavior with leap dates and centuries BC

2008-02-25 Thread Bernd Helmle
--On Monday, February 25, 2008 12:00:05 -0500 Tom Lane <[EMAIL PROTECTED]> 
wrote:



regression=# select '0001-02-28 BC'::date + 1;
   ?column?
---
 0001-02-29 BC
(1 row)

regression=# select '0002-02-28 BC'::date + 1;
   ?column?
---
 0002-03-01 BC
(1 row)


I stepped through the code in datetime.c and it seems the culprit here is 
DecodeDate(). It gets the date string from DecodeDateTime(), but without 
the 'BC' century notation. However, it then performs the following check:


/* there is no year zero in AD/BC notation; i.e. "1 BC" == year 0 */
if (bc)
{
    if (tm->tm_year > 0)
        tm->tm_year = -(tm->tm_year - 1);
    else
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_DATETIME_FORMAT),
                 errmsg("inconsistent use of year %04d and \"BC\"",
                        tm->tm_year)));
}

bc never becomes true during parsing, and the final check for the leap date 
fails:


/* We don't want to hint about DateStyle for Feb 29 */
if (tm->tm_mday > day_tab[isleap(tm->tm_year)][tm->tm_mon - 1])
{
    return DTERR_FIELD_OVERFLOW;
}

Maybe that helps a little bit.
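The failure mode can be sketched in Python (an illustrative model only, not
the actual C code; `to_astronomical` stands in for the BC adjustment quoted
above):

```python
def isleap(y: int) -> bool:
    # Proleptic Gregorian leap rule over astronomical year numbering,
    # where 1 BC is year 0, 2 BC is year -1, and so on.
    return y % 4 == 0 and (y % 100 != 0 or y % 400 == 0)

def to_astronomical(year: int, bc: bool) -> int:
    # The adjustment from the quoted block: "1 BC" becomes year 0.
    return -(year - 1) if bc else year

# If the bc flag is never set, '0001-02-29 BC' is range-checked as
# year 1 AD, which is not a leap year, so Feb 29 overflows day_tab
# and DecodeDate reports DTERR_FIELD_OVERFLOW.
```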

--
 Thanks

   Bernd



Re: [HACKERS] Questions about indexes with text_pattern_ops

2008-02-25 Thread Gregory Stark
"Tom Lane" <[EMAIL PROTECTED]> writes:

> Gregory Stark <[EMAIL PROTECTED]> writes:
>> "Tom Lane" <[EMAIL PROTECTED]> writes:
>>> How so?  If you think this change is a bad idea you'd better speak up
>>> PDQ.
>
>> Well I think it's fine for 'foo ' != 'foo' even if they sort similarly. 
>
>> But I'm not sure it makes sense for <'foo ','a'> to sort after <'foo','b'> if
>> the locale says that 'foo ' should compare "equal" to 'foo' and 'a' before
>> 'b'.
>
> I don't think we can concern ourselves with that; it would require
> allowing different columns of an index to interact, which would be
> impossibly messy.  What's more, it'd destroy the property that a btree
> index is sorted by its leading column(s) as well as by all its columns.

Well, I was thinking we might have to separate the equal operators from the
btree opclass. Equals would be a stricter property than "sorts as same".

It would be mighty strange to have values which compare unequal but are
neither < nor > though. Or which compare equal but also compare < or >.

It might be a little less surprising if we invent a new operator === for
"actually the same" and have == report whether two objects sort as equals. But
I'm not sure our experience with Turkish doesn't show that that will still
surprise people.

It may be more right in an abstract ideal world -- the reality is that text
collation is annoyingly complex. But this may be a case where we can get away
with just eliding this hassle.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



Re: [HACKERS] Questions about indexes with text_pattern_ops

2008-02-25 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> "Tom Lane" <[EMAIL PROTECTED]> writes:
>> How so?  If you think this change is a bad idea you'd better speak up
>> PDQ.

> Well I think it's fine for 'foo ' != 'foo' even if they sort similarly. 

> But I'm not sure it makes sense for <'foo ','a'> to sort after <'foo','b'> if
> the locale says that 'foo ' should compare "equal" to 'foo' and 'a' before
> 'b'.

I don't think we can concern ourselves with that; it would require
allowing different columns of an index to interact, which would be
impossibly messy.  What's more, it'd destroy the property that a btree
index is sorted by its leading column(s) as well as by all its columns.

> Perhaps we should always generate those inequalities even if there's no index
> that can use them.

Hmmm ... we intentionally don't do that, but the constraint exclusion
code might be a sufficient reason to reconsider.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] pgAgent job throttling

2008-02-25 Thread Roberts, Jon
I posted earlier about how to tune my server and I think the real
problem is how many connections pgAgent creates for my job needs.  

I basically need to run hundreds of jobs daily all to be executed at
4:00 AM.  To keep the jobs from killing the other systems, I am
throttling this with a queue table.  

With pgAgent, it creates 2 connections (one to the maintenance db and
one to the target db) and then my queue throttling makes a third
connection every 10 seconds checking the job queue to see if there is an
available queue to execute.

A better solution would be to incorporate job throttling in pgAgent.
Currently, pgAgent will spawn as many jobs as required and it creates a
minimum of two database connections per job.  I think a solution would
be for pgAgent to not create the connection and execute my job steps
unless the current number of jobs running is less than a result from a
function.  

Sort of like this:

select count(*) into v_count from queue where status = 'Processing';

while v_count >= fn_get_max_jobs() loop
    perform pg_sleep(fn_get_sleep_time());
    select count(*) into v_count from queue where status = 'Processing';
end loop;


I'm doing this now but inside a function being executed by pgAgent.
This means I have two connections open for each job.  Plus, I use a
function that uses a dblink to lock the queue table and then update the
status so that is a third connection that lasts just for a millisecond.


So if 200 jobs are queued to run at 4:00 AM, then I have 400 connections
open and then it will spike a little bit as each queued job checks to
see if it can run.  

Do you guys think it is a good idea to add job throttling to pgAgent to
limit the number of connections?  Setting the value to -1 could be the
default value which would allow an unlimited number of jobs to run at a
time (like it is now) but a value greater than -1 would be the max
number of jobs that can run concurrently.
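In pseudocode terms, the proposed limit amounts to something like the
following sketch (purely illustrative Python; pgAgent itself would implement
this in C++, and `count_processing`/`sleep` stand in for the queue query and
pg_sleep(fn_get_sleep_time()) calls from the example above):

```python
def wait_for_slot(count_processing, max_jobs, sleep):
    # Block until fewer than max_jobs jobs are in the 'Processing' state.
    # max_jobs == -1 means unlimited, matching the proposed default and
    # pgAgent's current behavior.
    if max_jobs == -1:
        return
    while count_processing() >= max_jobs:
        sleep()
```

Done inside pgAgent itself, this check would run before the two per-job
connections are opened rather than after, which is the point of the proposal.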


Jon



Re: [HACKERS] [PATCHES] Avahi support for Postgresql

2008-02-25 Thread Alvaro Herrera
Peter Eisentraut wrote:
> Am Montag, 25. Februar 2008 schrieb Alvaro Herrera:
> > Hmm, a quick look at the third patch reveals that it is using the
> > "threaded" Avahi client.  That's a showstopper.
> 
> Could you elaborate why that is?

Because it creates a new thread under the Postmaster to handle Avahi
events, if I'm reading the Avahi docs right.  This is verboten.

We have an event loop in the postmaster -- see ServerLoop.  Is there a
reason the Avahi events could not be hooked in there?

-- 
Alvaro Herrera                         http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Questions about indexes with text_pattern_ops

2008-02-25 Thread Gregory Stark
"Tom Lane" <[EMAIL PROTECTED]> writes:

> Gregory Stark <[EMAIL PROTECTED]> writes:
>> "Tom Lane" <[EMAIL PROTECTED]> writes:
>>> I'm intending to get rid of ~=~ and ~<>~ for 8.4; there's no longer any
>>> reason why those slots in the pattern_ops classes can't be filled by the
>>> plain = and <> operators.  (There *was* a reason when they were first
>>> invented --- but now that texteq will only return true for exact bitwise
>>> match, I think it's OK to assume these are equivalent.)
>
>> The only question is whether we'll keep that forever. I thought it was a good
>> idea at the time but I'm starting to wonder about the implications for
>> multi-key indexes.
>
> How so?  If you think this change is a bad idea you'd better speak up
> PDQ.

Well I think it's fine for 'foo ' != 'foo' even if they sort similarly. 

But I'm not sure it makes sense for <'foo ','a'> to sort after <'foo','b'> if
the locale says that 'foo ' should compare "equal" to 'foo' and 'a' before
'b'.

I'm starting to think "equality" and "sorts interchangeably" are not the same
operator at all. On the other hand it may not be worth the complexity that
might bring.

>> I was thinking that the inequalities that the LIKE index scan introduces 
>> would
>> imply the inequality. I take it we generate those inequalities too late in 
>> the
>> planning process to use them for other planning? 
>
> Hmm, good point [ experiments... ]  Yeah, it seems we don't reconsider
> partial indexes after those clauses have been generated.  Not sure how
> expensive it'd be to change that.

Perhaps we should always generate those inequalities even if there's no index
that can use them. Then calculate the regexp selectivity based only on the
regexp after the prefix.

That might also help constraint exclusion. Maybe merge joins?
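The derived range for a prefix pattern looks roughly like this (a simplified
sketch; the real planner code also has to construct a safely "greater" string
under the index's collation, which is the hard part):

```python
def like_prefix_range(prefix: str):
    # b LIKE 'prefix%' implies prefix <= b < upper under bytewise
    # (C locale / text_pattern_ops) ordering. Bumping the last code
    # point is a simplification of what the planner actually does.
    if not prefix:
        return None
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return prefix, upper
```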

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!



Re: [HACKERS] Strange behavior with leap dates and centuries BC

2008-02-25 Thread Tom Lane
Bernd Helmle <[EMAIL PROTECTED]> writes:
> CREATE TABLE foo(datum date);
> INSERT INTO foo VALUES('0000-02-29');

Since there is no year zero according to Gregorian reckoning, this
should have been rejected to start with.

> INSERT INTO foo VALUES('0001-02-29 BC');
> ERROR:  date/time field value out of range: "0001-02-29 BC"

Yeah, something broken there too.  It does know (correctly) that 1BC
is a leap year:

regression=# select '0001-02-28 BC'::date + 1;
   ?column?
---
 0001-02-29 BC
(1 row)

regression=# select '0002-02-28 BC'::date + 1;
   ?column?
---
 0002-03-01 BC
(1 row)

So I'd say there are two separate bugs in datetime input processing
exposed here.

> Huh? It seems the calculation for leap dates with negative year values is 
> broken. This example was taken from a current HEAD checkout today; the 
> version where I first saw this behavior was 8.2.4.

I see the same behaviors in 7.4.x, so it's a longstanding problem...

regards, tom lane



Re: [HACKERS] Tuning 8.3

2008-02-25 Thread Roberts, Jon
> > I need to run about 1000 PostgreSQL connections on a server that I
> > can
> > use about 4 GB of the total 16 GB of total RAM.  It seems that each
> > session creates a process that uses about 15 MB of RAM just for
> > connecting so I'm running out of RAM rather quickly.
> 
> I think you're being bitten by a different problem than it appears.
> Windows
> has a fixed size per-session shared memory pool which runs out rather
> quickly.
> You can raise that parameter though. (The 125 mentioned there is
> raised to
> about 300 with Pg 8.3.)
> 
> See:
> 
> http://www.postgresql.org/docs/faqs.FAQ_windows.html#4.4
> 
> 

Thanks for the tip and I'll be moving this to the performance forum.
Although, with 8.3, it seems that the FAQ is out of date?



Jon



Re: [HACKERS] Questions about indexes with text_pattern_ops

2008-02-25 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> "Tom Lane" <[EMAIL PROTECTED]> writes:
>> I'm intending to get rid of ~=~ and ~<>~ for 8.4; there's no longer any
>> reason why those slots in the pattern_ops classes can't be filled by the
>> plain = and <> operators.  (There *was* a reason when they were first
>> invented --- but now that texteq will only return true for exact bitwise
>> match, I think it's OK to assume these are equivalent.)

> The only question is whether we'll keep that forever. I thought it was a good
> idea at the time but I'm starting to wonder about the implications for
> multi-key indexes.

How so?  If you think this change is a bad idea you'd better speak up
PDQ.

>> In the meantime, though, I think the only way that Kaare's query can use
>> that index is if he writes
>> WHERE b LIKE 'whatever' AND b <> '';
>> (with whatever spelling of <> the index predicate has).  There is not
>> anything in the predicate proving machinery that knows enough about LIKE
>> to be able to show that "b LIKE 'whatever'" implies "b <> ''".

> I was thinking that the inequalities that the LIKE index scan introduces would
> imply the inequality. I take it we generate those inequalities too late in the
> planning process to use them for other planning? 

Hmm, good point [ experiments... ]  Yeah, it seems we don't reconsider
partial indexes after those clauses have been generated.  Not sure how
expensive it'd be to change that.

regards, tom lane



Re: [HACKERS] Tuning 8.3

2008-02-25 Thread Gregory Stark
"Roberts, Jon" <[EMAIL PROTECTED]> writes:

> I need to run about 1000 PostgreSQL connections on a server that I can
> use about 4 GB of the total 16 GB of total RAM.  It seems that each
> session creates a process that uses about 15 MB of RAM just for
> connecting so I'm running out of RAM rather quickly.

I think you're being bitten by a different problem than it appears. Windows
has a fixed size per-session shared memory pool which runs out rather quickly.
You can raise that parameter though. (The 125 mentioned there is raised to
about 300 with Pg 8.3.)

See:

http://www.postgresql.org/docs/faqs.FAQ_windows.html#4.4


> Any tips for reducing the memory footprint per session?  There is
> pgBouncer but is there anything I can do in the configuration before I
> go with a connection pooler?

I think at 1,000 you're probably into the domain where pgbouncer (or others
like it) is a good idea. Or you could pool or batch at a higher level and have
fewer sessions active at all. You don't win any performance by trying to do
more things simultaneously if they're just competing for cpu timeslices or i/o
bandwidth.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



Re: [HACKERS] Questions about indexes with text_pattern_ops

2008-02-25 Thread Gregory Stark
"Tom Lane" <[EMAIL PROTECTED]> writes:

> Gregory Stark <[EMAIL PROTECTED]> writes:
>> Hm, for a simple = or <> I think it doesn't matter which operator class you
>> use. For < or > it would produce different answers. Postgres isn't clever 
>> enough
>> to notice that this is equivalent though so I think you would have to do
>> something like (untested):
>
>> CREATE INDEX new_index ON a (b text_pattern_ops) WHERE b ~<>~ '';
>
>> That uses the same operator that the LIKE clause will use for the index 
>> range.
>
> I'm intending to get rid of ~=~ and ~<>~ for 8.4; there's no longer any
> reason why those slots in the pattern_ops classes can't be filled by the
> plain = and <> operators.  (There *was* a reason when they were first
> invented --- but now that texteq will only return true for exact bitwise
> match, I think it's OK to assume these are equivalent.)

The only question is whether we'll keep that forever. I thought it was a good
idea at the time but I'm starting to wonder about the implications for
multi-key indexes.

> In the meantime, though, I think the only way that Kaare's query can use
> that index is if he writes
>   WHERE b LIKE 'whatever' AND b <> '';
> (with whatever spelling of <> the index predicate has).  There is not
> anything in the predicate proving machinery that knows enough about LIKE
> to be able to show that "b LIKE 'whatever'" implies "b <> ''".

I was thinking that the inequalities that the LIKE index scan introduces would
imply the inequality. I take it we generate those inequalities too late in the
planning process to use them for other planning? 


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!



Re: [HACKERS] Tuning 8.3

2008-02-25 Thread Andrew Dunstan



Roberts, Jon wrote:


I need to run about 1000 PostgreSQL connections on a server that I can 
use about 4 GB of the total 16 GB of total RAM. It seems that each 
session creates a process that uses about 15 MB of RAM just for 
connecting so I’m running out of RAM rather quickly.


I have these non-default settings:

shared_buffers = 30MB

max_connections = 1000

I tried decreasing the work_mem but the db wouldn’t start then.

I’m running version 8.3 on Windows 2003 Server.

Any tips for reducing the memory footprint per session? There is 
pgBouncer but is there anything I can do in the configuration before I 
go with a connection pooler?





Please ask usage questions on the appropriate list (in this case one of: 
pgsql-general, pgsql-performance or pgsql-admin). pgsql-hackers is for 
discussion of development of features, not for usage issues.


cheers

andrew



Re: [HACKERS] [PATCHES] Avahi support for Postgresql

2008-02-25 Thread Peter Eisentraut
On Monday, 25 February 2008, Alvaro Herrera wrote:
> Hmm, a quick look at the third patch reveals that it is using the
> "threaded" Avahi client.  That's a showstopper.

Could you elaborate why that is?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



[HACKERS] Tuning 8.3

2008-02-25 Thread Roberts, Jon
I need to run about 1000 PostgreSQL connections on a server that I can
use about 4 GB of the total 16 GB of total RAM.  It seems that each
session creates a process that uses about 15 MB of RAM just for
connecting so I'm running out of RAM rather quickly.

 

I have these non-default settings:

shared_buffers = 30MB

max_connections = 1000

 

I tried decreasing the work_mem but the db wouldn't start then.

 

I'm running version 8.3 on Windows 2003 Server.

 

Any tips for reducing the memory footprint per session?  There is
pgBouncer but is there anything I can do in the configuration before I
go with a connection pooler?
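Rough arithmetic behind the problem, using the figures quoted here (the
~15 MB per backend is as observed in Task Manager and may partly double-count
shared memory, but the conclusion holds):

```python
per_session_mb = 15      # observed per-backend footprint
max_connections = 1000
budget_gb = 4            # RAM available to Postgres on this box

needed_gb = per_session_mb * max_connections / 1024
# ~14.6 GB of worst-case backend memory against a 4 GB budget, which is
# why a pooler (or far fewer connections) is needed regardless of tuning.
```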

 

 

Jon



Re: [HACKERS] Questions about indexes with text_pattern_ops

2008-02-25 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> Hm, for a simple = or <> I think it doesn't matter which operator class you
> use. For < or > it would produce different answers. Postgres isn't clever 
> enough
> to notice that this is equivalent though so I think you would have to do
> something like (untested):

> CREATE INDEX new_index ON a (b text_pattern_ops) WHERE b ~<>~ '';

> That uses the same operator that the LIKE clause will use for the index range.

I'm intending to get rid of ~=~ and ~<>~ for 8.4; there's no longer any
reason why those slots in the pattern_ops classes can't be filled by the
plain = and <> operators.  (There *was* a reason when they were first
invented --- but now that texteq will only return true for exact bitwise
match, I think it's OK to assume these are equivalent.)

In the meantime, though, I think the only way that Kaare's query can use
that index is if he writes
WHERE b LIKE 'whatever' AND b <> '';
(with whatever spelling of <> the index predicate has).  There is not
anything in the predicate proving machinery that knows enough about LIKE
to be able to show that "b LIKE 'whatever'" implies "b <> ''".
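The missing implication is simple in principle, as this hedged Python sketch
shows (illustrative only, ignoring ESCAPE clauses; not the planner's code):

```python
def like_implies_nonempty(pattern: str) -> bool:
    # A LIKE pattern implies b <> '' whenever it must consume at least
    # one character: only '%' wildcards can match the empty string, so
    # any literal character or '_' in the pattern forces b to be nonempty.
    return any(c != '%' for c in pattern)
```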

regards, tom lane



Re: [HACKERS] One more option for pg_dump...

2008-02-25 Thread David BOURIAUD
On Monday, 25 February 2008, Leonardo Cezar wrote:

Hi Leonardo,
Thanks for your quick answer, I didn't know it was a TODO item, and that 
some people were working on it... Keep going, then, because I'm really waiting 
for these features!

> On Mon, Feb 25, 2008 at 10:48 AM, David BOURIAUD
>
> <[EMAIL PROTECTED]> wrote:
> >  Could there be an option to pg_dump (let's say --function [func_name])
> > to be able to dump the complete source code of a function in a separate
> > file, or on the terminal ?
>
> It's a TODO item. Just not to functions and so others (operators,
> casts,...) objects as well.
>
> I'm coding a fully functional prototype that solves these features.
> Just now I'm going think in a way to dump overloaded functions that
> seems me one more complicated issue.
>
> >  Do you think it could be a good thing ?
>
> Yep! Hence it's a todo item :-)
>
> >  Are there workarounds to get the same behavior that I'm not aware
> > of?
>
> Alot of lines sed+awk+pg_dump scripts ..

Nay, I use vim with two buffers, search for the code I want and copy-paste, 
but I reckon that it's not very user-friendly!
Thanks again, I'll try to wait for the improvements in the next version of 
Postgres!
>
> -Leo






[HACKERS] Strange behavior with leap dates and centuries BC

2008-02-25 Thread Bernd Helmle
I saw this strange behavior while investigating a customer problem: they 
managed to get dumps which aren't restorable anymore:


CREATE TABLE foo(datum date);

INSERT INTO foo VALUES('0000-02-29');
INSERT 0 1

SELECT * FROM foo;
datum
---
0001-02-29 BC
(1 row)

COPY foo TO STDOUT;
0001-02-29 BC

INSERT INTO foo VALUES('0001-02-29 BC');
ERROR:  date/time field value out of range: "0001-02-29 BC"


Huh? It seems the calculation for leap dates with negative year values is 
broken. This example was taken from a current HEAD checkout today; the 
version where I first saw this behavior was 8.2.4.




--
 Thanks

   Bernd



Re: [HACKERS] Questions about indexes with text_pattern_ops

2008-02-25 Thread Gregory Stark
"Kaare Rasmussen" <[EMAIL PROTECTED]> writes:

> Hi 
>
> The database is initialized with utf8, so in order for LIKE to use the index 
> on
> a text field, I used text_pattern_ops when I created it. So far so good. 
>
> It's in the documentation, but there's no explanation of why this index will
> only work for LIKE searches. How come that I have to have two different 
> indexes
> if I want to give Postgres the ability to choose index scan over seq scan on
> LIKE and non-LIKE searches? 

Because in non-C locales (which you're almost certainly using if you're using
UTF8) the ordering which the normal text operations use can be quite complex.
Just as an example most locales have spaces being entirely insignificant. So
no range can reliably match a prefix LIKE pattern. The text_pattern_ops use
simple character-by-character ordering which are useful for LIKE but not for
regular < and > comparisons. They're just two different orderings.

> Also, when I tried to create the index as a partial one (avoiding the 95%
> entries with empty strings), Postgresql chooses to use seq scan. This sounds
> counter intuitive to me. 
>
> CREATE INDEX new_index ON a (b text_pattern_ops) WHERE b <> '';
> This is 8.2.6.

Hm, for a simple = or <> I think it doesn't matter which operator class you
use. For < or > it would produce different answers. Postgres isn't clever enough
to notice that this is equivalent though so I think you would have to do
something like (untested):

CREATE INDEX new_index ON a (b text_pattern_ops) WHERE b ~<>~ '';

That uses the same operator that the LIKE clause will use for the index range.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning



Re: [HACKERS] One more option for pg_dump...

2008-02-25 Thread Leonardo Cezar
On Mon, Feb 25, 2008 at 10:48 AM, David BOURIAUD
<[EMAIL PROTECTED]> wrote:
>  Could there be an option to pg_dump (let's say --function [func_name]) to be
>  able to dump the complete source code of a function in a separate file, or
>  on the terminal ?

It's a TODO item, and not just for functions: other objects (operators,
casts, ...) as well.

I'm coding a fully functional prototype that implements these features.
Right now I'm thinking about how to dump overloaded functions, which
seems to me the more complicated issue.

>  Do you think it could be a good thing ?

Yep! Hence it's a todo item :-)

>  Are there workarounds to get the same behavior that I'm not aware of?

A lot of long sed+awk+pg_dump scripts...

-Leo
-- 
Leonardo Cezar et all
http://www.dextra.com.br/postgres
http://www.postgresql.org.br



Re: [HACKERS] [PATCHES] Avahi support for Postgresql

2008-02-25 Thread Alvaro Herrera
Mathias Hasselmann wrote:

> The patches were in my initial mail, but now I've also uploaded them to
> my personal site for convenience:
> 
>   http://taschenorakel.de/files/pgsql-avahi-support/

Hmm, a quick look at the third patch reveals that it is using the
"threaded" Avahi client.  That's a showstopper.  Please consider using
some other approach -- writing our own handlers for AvahiPoll would seem
apropos.

-- 
Alvaro Herrera                         http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] [PATCHES] Avahi support for Postgresql

2008-02-25 Thread Peter Eisentraut
On Tuesday, 27 November 2007, Mathias Hasselmann wrote:
> Postmaster already has code to announce its services via DNS-SD
> (ZeroConf) by using Apple's Bonjour API. This series of patches
> implements that capability on top of the Avahi library[1] which
> is free software, available for a wider variety of platforms.

Note to hackers:  This set of patches renames the GUC parameter bonjour_name 
to zeroconf_name to make the more general meaning clear.  We have generally 
not hesitated to rename GUC parameters between major releases, but if this 
bothers you, please speak up.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



[HACKERS] One more option for pg_dump...

2008-02-25 Thread David BOURIAUD
Hi all,
On the 6th of February there was a thread about adding new options to 
pg_dump, but it is now too late for me to add comments to that thread, 
so I'm starting a new one here.
I haven't found any option to dump a single user-defined function stored in 
a database, short of doing a pg_dump -D -s database; but that way one gets 
the definitions of the tables, the permissions, the triggers, and so on, so 
when you have a big schema it is not very user-friendly to do a full dump 
just to change one or two lines of code in a function.
Could there be an option to pg_dump (say, --function [func_name]) to 
dump the complete source code of a function to a separate file, or to the 
terminal?
I've found that in psql one can type \df+ func_name to get a lot of 
information about the named function, but it is not very readable.
Do you think it could be a good thing?
Are there workarounds with the same behavior that I'm not aware of?
Thanks for your ideas about this.
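[One workaround that comes close, sketched with hypothetical database and function names: dump the schema once in custom format, then use pg_restore's -P switch, which emits only the named function.]

```shell
# "mydb" and the function signature below are placeholders.
pg_dump -Fc -s mydb -f mydb.dump
# Without -d, pg_restore prints SQL to stdout; -P restricts the output
# to the named function (argument types must be spelled out exactly).
pg_restore -P 'my_func(integer, text)' mydb.dump
```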




Re: [HACKERS] [PATCHES] Avahi support for Postgresql

2008-02-25 Thread Peter Eisentraut
On Tuesday, 27 November 2007, Mathias Hasselmann wrote:
> Postmaster already has code to announce its services via DNS-SD
> (ZeroConf) by using Apple's Bonjour API. This series of patches
> implements that capability on top of the Avahi library[1] which
> is free software, available for a wider variety of platforms.

Is there a reason we couldn't use the Bonjour compatibility layer offered by 
Avahi to keep the code differences to a minimum?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Smaller db in 8.3 (was: Re: [HACKERS] insert ... delete ... returning ... ?)

2008-02-25 Thread Mark Mielke

Tom Lane wrote:
> Mark Mielke <[EMAIL PROTECTED]> writes:
>> I'm noticing a massive reduction in the on-disk storage required for my
>> database that I believe is primarily attributable to Tom's reduced
>> overhead for short strings.
>
> Twasn't my work; Greg Stark gets most of the credit for that one, and
> you might be seeing some benefit from Heikki's work to cut the tuple
> header size too.

Oops. You are right. Thanks Greg and Heikki! Whatever you did works 
great! :-)


Cheers,
mark

--
Mark Mielke <[EMAIL PROTECTED]>



[HACKERS] Questions about indexes with text_pattern_ops

2008-02-25 Thread Kaare Rasmussen
Hi 

The database is initialized with utf8, so in order for LIKE to use the index 
on a text field, I used text_pattern_ops when I created it. So far so good. 

It's in the documentation, but there's no explanation of why this index will 
only work for LIKE searches. How come I have to have two different indexes 
if I want to give PostgreSQL the ability to choose an index scan over a seq 
scan for both LIKE and non-LIKE searches? 

Is it a performance issue? 

Also, when I tried to create the index as a partial one (avoiding the 95% of 
entries with empty strings), PostgreSQL chooses a seq scan. This seems 
counter-intuitive to me. 


CREATE INDEX new_index ON a (b text_pattern_ops) WHERE b <> '';

This is 8.2.6.
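[For reference, the two-index arrangement implied above, using the table and column from the example: in a non-C locale the default btree operator class compares strings by the locale's collation rules, which a LIKE prefix scan cannot use, while text_pattern_ops compares character-by-character.]

```sql
-- Default btree: supports <, <=, =, >=, > and ORDER BY under the
-- database locale, but not LIKE in a non-C locale.
CREATE INDEX a_b_idx ON a (b);

-- text_pattern_ops: byte-wise comparison, so the planner can turn
-- b LIKE 'foo%' into an index range scan.
CREATE INDEX a_b_like_idx ON a (b text_pattern_ops);
```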



Re: [HACKERS] dblink doesn't honor interrupts while waiting a result

2008-02-25 Thread Florian G. Pflug

Marko Kreen wrote:
> On 2/25/08, Florian G. Pflug <[EMAIL PROTECTED]> wrote:
>> I'm not sure what a proper fix for this could look like, since the
>> blocking actually happens inside libpq - but this certainly makes
>> working with dblink painful...
>
> Proper fix would be to use async libpq API, then loop on poll(2)
> with small timeout.  You can look at pl/proxy for example code.

Ah, cool, I'll check out pl/proxy.

regards, Florian Pflug



Re: [HACKERS] dblink doesn't honor interrupts while waiting a result

2008-02-25 Thread Marko Kreen
On 2/25/08, Florian G. Pflug <[EMAIL PROTECTED]> wrote:
>  I'm not sure what a proper fix for this could look like, since the
>  blocking actually happens inside libpq - but this certainly makes
>  working with dblink painful...

Proper fix would be to use async libpq API, then loop on poll(2)
with small timeout.  You can look at pl/proxy for example code.

-- 
marko



Re: [HACKERS] OSSP can be used in the windows environment now!

2008-02-25 Thread Hiroshi Saito

Hi.

Please check it.
The build succeeds for it in my environment.
Thanks!

Regards,
Hiroshi Saito

- Original Message - 
From: "Hiroshi Saito" <[EMAIL PROTECTED]>

> Hi.
>
> From: "Magnus Hagander" <[EMAIL PROTECTED]>
>
>>> we can include it in the next release. :-)
>>> Thanks!
>>
>> Good news. Can you provide a patch for the msvc build system to build
>> with it? We can't really ship it in the next release if we can't build
>> with it ;-)
>
> Ah, OK. I'll try it first, and I need to check it in a clean environment.
> It will probably be Monday, since I'm very busy.
> Thanks!
>
> Regards,
> Hiroshi Saito


msvc_uuid_patch
Description: Binary data



Re: [HACKERS] pg_dump additional options for performance

2008-02-25 Thread Tom Dunstan
On Sun, Feb 24, 2008 at 6:52 PM, Jochem van Dieten <[EMAIL PROTECTED]> wrote:
>  Or we could have a switch that specifies a directory and have pg_dump
>  split the dump not just in pre-schema, data and post-schema, but also
>  split the data in a file for each table. That would greatly facilitate
>  a parallel restore of the data through multiple connections.



I'll admit to thinking something similar while reading this thread,
mostly because having to specify multiple filenames just to do a dump
and then do them all on the way back in seemed horrible. My idea was
to stick the multiple streams into a structured container file rather
than a directory though - a zip file a la JAR/ODF leapt to mind. That
has the nice property of being a single dump file with optional built
in compression that could store all the data as separate streams and
would allow a smart restore program to do as much in parallel as makes
sense. Mucking around with directories or three different filenames or
whatever is a pain. I'll bet most users want to say "pg_dump
--dump-file=foo.zip foo", back up foo.zip as appropriate, and when
restoring saying "pg_restore --dump-file=foo.zip -j 4" or whatever and
having pg_restore do the rest. The other nice thing about using a zip
file as a container is that you can inspect it with standard tools if
you need to.

Another thought is that doing things this way would allow us to add
extra metadata to the dump in later versions without giving the user
yet another command line switch for an extra file. Or even, thinking a
bit more outside the box, allow us to store data in binary format if
that's what the user wants at some point (thinking of the output from
binary I/O rather than the on-disk representation, obviously). Exposing all
the internals of this stuff via n command line args is pretty
constraining - it would be nice if pg_dump just produced the most
efficient dump, and if we decide at a later date that that means doing
things a bit differently, then we bump the dump file version and just
do it.

Just a thought...

Cheers

Tom
