Re: [HACKERS] sp-gist porting to postgreSQL

2004-11-09 Thread Ramy M. Hassan
Oleg,

Thanks for your prompt reply.
Actually, I was able to create a new access method for testing and add an
operator class for the type "integer" using the new access method. Then I
created a table with two integer fields, one indexed using the new access
method and the other using a btree index, and everything is OK so far. Even
using the EXPLAIN statement on queries shows that the indexes are used
correctly, as they should be.
I am using postgresql version 8.0.0beta3 from CVS.
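A minimal sketch of the setup described above (the access method name
"xtree" and the operator class entries are hypothetical; a real operator
class would also need the support functions the AM requires):

```sql
-- Hypothetical: assumes an access method "xtree" has already been
-- registered in pg_am. Names here are illustrative only.
CREATE OPERATOR CLASS int_xtree_ops
    DEFAULT FOR TYPE integer USING xtree AS
    OPERATOR 1 <,
    OPERATOR 3 =,
    OPERATOR 5 >;

CREATE TABLE t (a integer, b integer);
CREATE INDEX t_a_xtree ON t USING xtree (a);
CREATE INDEX t_b_btree ON t USING btree (b);

-- Verify that the planner picks the expected index:
EXPLAIN SELECT * FROM t WHERE a = 42;
```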

Thanks
Ramy



-----Original Message-----
From: Oleg Bartunov [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, November 10, 2004 12:35 AM
To: Ramy M. Hassan; Pgsql Hackers
Cc: Teodor Sigaev; Walid G. Aref
Subject: Re: sp-gist porting to postgreSQL

Ramy,

glad to hear from you !
AFAIK, PostgreSQL doesn't support several index implementations for the
same type. I think this is a problem of the optimizer; probably other
hackers know better. I am forwarding your message to the -hackers mailing
list, which is the relevant place for GiST discussion.

regards,
Oleg


On Tue, 9 Nov 2004, Ramy M. Hassan wrote:

> Dear Oleg and Teodor,
> Thanks for offering help.
> I have a design question for now.
> Currently in the PostgreSQL GiST implementation, I noticed that the way to
> have a GiST-based index is to define an operator class for a certain type
> using the GiST index. There is no new index type defined from the point of
> view of PostgreSQL (nothing is added to pg_am). This means that for a
> certain type there could be only one GiST-based index: there is no way, on
> the same server, to use GiST to implement both an xtree index and a ytree
> index for the same type, even if they index different fields in different
> relations. Is that correct?
> What about doing it the other way (I am talking about SP-GiST now), by
> providing the extension writer with an API that can be used to instantiate
> a standalone SP-GiST-based index (for example, a trie index) that has a
> record in the pg_am relation? In my view this would give more flexibility,
> and it would not require the extension writer to learn the PostgreSQL API
> (maybe one day SP-GiST will be ported to another database engine); he would
> just need to learn the SP-GiST API, which will probably be less to study
> (and this is what GiST and SP-GiST are all about, if I understand
> correctly).
> Please let me know your opinions on this.
>
> Thanks
>
> Ramy
>

Regards,
Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] sp-gist porting to postgreSQL

2004-11-09 Thread Oleg Bartunov
Ramy,
glad to hear from you !
AFAIK, PostgreSQL doesn't support several index implementations for the
same type. I think this is a problem of the optimizer; probably other
hackers know better. I am forwarding your message to the -hackers mailing
list, which is the relevant place for GiST discussion.
regards,
Oleg
On Tue, 9 Nov 2004, Ramy M. Hassan wrote:
Dear Oleg and Teodor,
Thanks for offering help.
I have a design question for now.
Currently in the PostgreSQL GiST implementation, I noticed that the way to
have a GiST-based index is to define an operator class for a certain type
using the GiST index. There is no new index type defined from the point of
view of PostgreSQL (nothing is added to pg_am). This means that for a
certain type there could be only one GiST-based index: there is no way, on
the same server, to use GiST to implement both an xtree index and a ytree
index for the same type, even if they index different fields in different
relations. Is that correct?
What about doing it the other way (I am talking about SP-GiST now), by
providing the extension writer with an API that can be used to instantiate
a standalone SP-GiST-based index (for example, a trie index) that has a
record in the pg_am relation? In my view this would give more flexibility,
and it would not require the extension writer to learn the PostgreSQL API
(maybe one day SP-GiST will be ported to another database engine); he would
just need to learn the SP-GiST API, which will probably be less to study
(and this is what GiST and SP-GiST are all about, if I understand
correctly).
Please let me know your opinions on this.

Thanks
Ramy
Regards,
Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


Re: [HACKERS] A modest proposal: get rid of GUC's USERLIMIT variable

2004-11-09 Thread Bruce Momjian
Greg Stark wrote:
> 
> Tom Lane <[EMAIL PROTECTED]> writes:
> 
> > I'd like to propose that we get rid of GUC's USERLIMIT category and
> > convert all the variables in it to plain SUSET.  In my mind, USERLIMIT
> > is a failed experiment: it's way too complicated, and it still doesn't
> > do quite what it was intended to do, because there are times when it
> > can't check whether you're a superuser.
> > 
> > The only variables that are in the category are log-verbosity-related:
> 
> Would that mean I wouldn't be able to change the logging level on the fly at
> all?
> 
> That would disappoint at least one user, myself. I've found the best debugging
> compromise is to leave log_statement off in general but have a magic parameter
> I can pass to the application that will set log_statement = true for a single
> transaction.
> 
> That way I can look at what queries transpired in my session without having to
> dig through hundreds of other queries from other sessions. And have the
> complete logs for the session I'm debugging without the performance impact in
> the normal case.

Yes, this would not be possible for non-superusers under the new
proposal.  You could set the setting for non-superusers per-user but
not per-session.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] A modest proposal: get rid of GUC's USERLIMIT variable category

2004-11-09 Thread Greg Stark

Tom Lane <[EMAIL PROTECTED]> writes:

> I'd like to propose that we get rid of GUC's USERLIMIT category and
> convert all the variables in it to plain SUSET.  In my mind, USERLIMIT
> is a failed experiment: it's way too complicated, and it still doesn't
> do quite what it was intended to do, because there are times when it
> can't check whether you're a superuser.
> 
> The only variables that are in the category are log-verbosity-related:

Would that mean I wouldn't be able to change the logging level on the fly at
all?

That would disappoint at least one user, myself. I've found the best debugging
compromise is to leave log_statement off in general but have a magic parameter
I can pass to the application that will set log_statement = true for a single
transaction.

That way I can look at what queries transpired in my session without having to
dig through hundreds of other queries from other sessions. And have the
complete logs for the session I'm debugging without the performance impact in
the normal case.
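The per-transaction trick described above can be sketched as follows
(a sketch based on the description; SET LOCAL scopes the change to the
current transaction, and on 8.0 the value would be 'all' rather than a
boolean true):

```sql
-- Turn statement logging on for a single transaction only.
BEGIN;
SET LOCAL log_statement = true;  -- reverts automatically at COMMIT/ROLLBACK
-- ... run the queries to be captured in the server log ...
COMMIT;
```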

-- 
greg




Re: [HACKERS] [Pgsphere-dev] GIST index concurrency concern

2004-11-09 Thread Oleg Bartunov
Daniel,
concurrency is a big issue with the current implementation of GiST,
but it shouldn't bite you for READ operations!
The -hackers mailing list is the relevant place for GiST
discussions. It's a pity that we have several times planned to work on GiST
concurrency and recovery, but never got the chance :)
I see Neil has become interested in GiST concurrency, though.

Oleg
On Tue, 9 Nov 2004, Daniel Ceregatti wrote:
Hi,
It's recently come to my attention that GiST indices suffer from
concurrency issues. I have already developed a dating site using GiST
indexes: for user attributes (via the intarray contrib module) and for
Earth distance/radius calculations (via pg_sphere).
I'm wondering if I haven't shot myself in the foot here. So far, I
understand that a GiST index will be locked by a backend for any DML.
Basically I'm concerned that my database will not scale in the manner
I was hoping for, because the sites that access the database are to be
used by many concurrent users, doing some DML.
I expect my site to sustain something around 1000-3000 new user
acquisitions per day, each of which will result in an insert into 3
GiST indices. Additionally, there will be people updating their
attributes and locations as well, but this will probably only
account for a small fraction of the DML. We don't allow people to delete
stuff.
My concern now is this concurrency issue. My question is: is there
anyone out there using a GiST index on a database where there's a lot of
DML? Should I be concerned with this issue at all?
If so, what can be done to minimize the impact of heavy DML on a GiST
index? I've pondered rolling all DML into queues via triggers and then
de-queueing them in one transaction every so often, say every 15 minutes,
via cron. Any other suggestions?
I'm posting to this list because I understand that both Oleg and Teodor
read it, and I found no other relevant list. If I've misposted, please
accept my apology and direct me to the appropriate list.
Thanks,
Daniel

Regards,
Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


[HACKERS] GiST: range of penalty method?

2004-11-09 Thread Neil Conway
Does anyone know what the expected range of the "penalty" GiST method
is? (i.e. Is the legal range documented anywhere? Failing that, what
does existing GiST-based code expect?)

While rewriting gistchoose() in gist.c to be less obfuscated, it
occurred to me that (a) I don't think the existing code will work as
intended if a negative penalty is returned (b) it would be good to
define a "minimum legal penalty". Once the minimum penalty has been
returned for a particular location, GiST can know that all other
locations where it might insert the node will have a penalty greater
than or equal to that value, so gistchoose() can bail-out earlier.

Therefore, I'd like to make 0.0 the minimum legal penalty, and
require all GiST "penalty" methods to return values >= 0. I think
requiring that the penalty not be NaN would also be a good idea. Comments?

Note that making this change to CVS tip results in a regression failure
in contrib/btree_gist. The regression.diffs are attached. The regression
tests for rtree_gist, tsearch, tsearch2 and pg_trgm succeed after making
the modification.

-Neil

*** ./expected/bit.out	2004-11-09 11:20:51 +11:00
--- ./results/bit.out	2004-11-10 15:44:52 +11:00
***************
*** 33,38 ****
--- 33,39 ----
  (1 row)
  
  CREATE INDEX bitidx ON bittmp USING GIST ( a );
+ ERROR:  illegal penalty from GiST penalty method: -116307439489975337053247905800904507392.00
  SET enable_seqscan=off;
  SELECT count(*) FROM bittmp WHERE a <   '01101100010001011101100011100';
   count 

======================================================================

*** ./expected/varbit.out	2004-11-09 11:20:51 +11:00
--- ./results/varbit.out	2004-11-10 15:44:52 +11:00
***************
*** 33,38 ****
--- 33,39 ----
  (1 row)
  
  CREATE INDEX varbitidx ON varbittmp USING GIST ( a );
+ ERROR:  illegal penalty from GiST penalty method: -92381335565846851340979378715858305024.00
  SET enable_seqscan=off;
  SELECT count(*) FROM varbittmp WHERE a <   '1110100111010'::varbit;
   count 

======================================================================




[HACKERS] CREATE or REPLACE function pg_catalog.*

2004-11-09 Thread John Hansen
Hi,

When doing CREATE OR REPLACE FUNCTION on a builtin function, it seems to
have no effect if it's in the 'C' language. SQL functions seem to work,
but as neilc pointed out, that may be because the SQL function is being
inlined.

The builtin function is still called, not the user-defined function, for
'C' language functions.
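The kind of replacement being attempted might look like the following
sketch (the function being replaced, the object file path, and the link
symbol are all hypothetical, purely to illustrate the report above):

```sql
-- Hypothetical attempt to override a built-in with a C-language function;
-- per the report above, the built-in keeps being called anyway.
CREATE OR REPLACE FUNCTION pg_catalog.lower(text) RETURNS text
    AS '/path/to/mymodule', 'my_lower'  -- path and symbol are illustrative
    LANGUAGE C STRICT;
```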

... John





Re: [HACKERS] CVS should die (was: Possible make_oidjoins_check ...)

2004-11-09 Thread Steve Crawford
> This doesn't really answer the question of what tool Postgres might
> change to, but it seems that Subversion is a good tool one should
> consider. And by golly, CVS is bad. Just consider the cons (having
> to forbid renames in all but the most necessary cases); it just
> invites cruft into any project.

Interesting reading:
http://better-scm.berlios.de/comparison/comparison.html
http://zooko.com/revision_control_quick_ref.html

Cheers,
Steve




[HACKERS] A modest proposal: get rid of GUC's USERLIMIT variable category

2004-11-09 Thread Tom Lane
I'd like to propose that we get rid of GUC's USERLIMIT category and
convert all the variables in it to plain SUSET.  In my mind, USERLIMIT
is a failed experiment: it's way too complicated, and it still doesn't
do quite what it was intended to do, because there are times when it
can't check whether you're a superuser.

The only variables that are in the category are log-verbosity-related:

regression=# select name from pg_settings where context = 'userlimit';
            name
----------------------------
 log_duration
 log_executor_stats
 log_min_duration_statement
 log_min_error_statement
 log_min_messages
 log_parser_stats
 log_planner_stats
 log_statement
 log_statement_stats
(9 rows)

What the USERLIMIT code tries to do is allow non-superusers to
"increase" but not "decrease" the logging verbosity for their sessions
only.  (For instance, a non-superuser could turn log_duration on, but
can't turn it off if the DBA has turned it on.)  However, the usefulness
of this capability is really pretty debatable.  A non-superuser
presumably doesn't have access to the postmaster log file anyhow, so why
does he need to be able to turn up the logging?  You could even argue
that being able to flood the logs with stuff the DBA doesn't want is a
mild form of DOS attack.

If we do get rid of USERLIMIT, another benefit accrues: we can assume
that ALTER USER and ALTER DATABASE settings were fully checked when they
were installed, and thereby accept them at session start without any
extra permissions check.  This would mean that, for example, a superuser
could use ALTER USER to set these variables on a per-user basis for
non-superusers, and it would actually work.  Right now the value is
rechecked as if the setting were being issued by the non-superuser,
and so it may fail.
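For example, under the proposal a superuser could do something like this
(the user name is hypothetical; values shown use the 8.0 forms of these
settings):

```sql
-- With USERLIMIT gone, these per-user settings would be accepted at
-- session start without being re-checked against the user's privileges.
ALTER USER appuser SET log_min_duration_statement = 1000;  -- milliseconds
ALTER USER appuser SET log_statement = 'all';
```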

For more discussion see this thread in pgsql-bugs:
http://archives.postgresql.org/pgsql-bugs/2004-11/msg00101.php

regards, tom lane



Re: [HACKERS] Call for objections: simplify stable functions during estimation

2004-11-09 Thread Tom Lane
Robert Treat <[EMAIL PROTECTED]> writes:
> On Tuesday 09 November 2004 11:28, Tom Lane wrote:
>> (One of the potential objections went away when
>> we started enforcing that stable functions don't have side-effects.)

> Since we know people will be calling volatile functions inside stable 
> functions (see thread from last week if you need a refresher as to why) is 
> there any serious negative side-effect in those cases?

If you are making an end-run around the rule, IMHO it's up to you to make
sure that the behavior of the function is sane.  In practice, the
planner is only going to be estimating values for functions that appear
in WHERE/GROUP BY/HAVING clauses, and anyone who puts a function with
real side-effects in such places is in deep trouble anyway.

The real bottom line here is that it's better to take the current value
of the function as the planner estimate than to fall back to completely
default selectivity estimates.  You can doubtless invent scenarios where
this is wrong, but they are far outweighed by cases where it is right.

regards, tom lane



Re: [HACKERS] Call for objections: simplify stable functions during estimation

2004-11-09 Thread Robert Treat
On Tuesday 09 November 2004 11:28, Tom Lane wrote:
> Awhile back, there was some discussion about pre-folding now() and
> related functions when the planner is trying to estimate selectivities.
> This would allow reasonable plans to be made for cases like
>  WHERE moddate >= current_date - 10;
> without having to indulge in any crude hacks with mislabeled wrapper
> functions, such as you can find all too often in the archives :-(
>
> I was a bit hesitant about it at the time because I wasn't sure of all
> the implications; but I've looked the idea over again, and as far as I
> can see it's reasonable to pre-fold *all* stable functions when deriving
> statistical estimates.  (One of the potential objections went away when
> we started enforcing that stable functions don't have side-effects.)
>

Since we know people will be calling volatile functions inside stable 
functions (see thread from last week if you need a refresher as to why) is 
there any serious negative side-effect in those cases?

-- 
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL



Re: [HACKERS] [GENERAL] server auto-restarts and ipcs

2004-11-09 Thread Tom Lane
"Ed L." <[EMAIL PROTECTED]> writes:
> A power failure led to failed postmaster restart using 7.4.6 (see output
> below).  The short-term fix is usually to delete the pid file and restart.
> I often wonder why ipcs never seems to show the shared memory 
> block in question?

> 2004-11-08 17:17:22.398 [18038] FATAL:  pre-existing shared memory block (key 
> 9746001, ID 658210829) is still in use

I did a bit of experimentation and found that the Linux kernel does seem
to reproducibly assign similar shmem IDs from one boot cycle to the
next.  Here's a smoking-gun case:

$ sudo ipcs -m

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x0052e2c1 65536      postgres   600        10436608   1
0x00000000 131073     gdm        600        393216     2          dest
0x00530201 163842     tgl        600        10395648   2

[ reboot ]

$ sudo ipcs -m

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x0052e2c1 65536      postgres   600        10436608   1
0x00530201 98305      tgl        600        10395648   2
0x00000000 163842     gdm        600        393216     2          dest

The "tgl" entry is a manually-started postmaster, which in the second
boot cycle I was able to start before gdm came up.  Notice that gdm has
been handed out a shmid that belonged to a different userID in the
previous boot cycle.

What this says is that given a little bit of variability in the boot
cycle, it is fairly likely for the postmaster.pid file to contain a
shared memory ID that has already been assigned to another daemon in
the current boot cycle.  The way that PGSharedMemoryIsInUse() is coded,
this will result in a failure as exhibited by Ed, because shmctl() will
return EACCES and we interpret that as a conflicting shmem segment.
(The reason this is considered dangerous is it suggests that there might
be backends still alive from a crashed previous postmaster; we dare not
start new backends that are not in sync with the old ones.)

After thinking about this awhile, I believe that it is safe to consider
EACCES as a don't-care situation.  EACCES could only happen if the shmem
ID belongs to a different userid, which implies that it is not a
postgres shared memory segment.  Even if you are running postmasters
under multiple userids, this can be ignored, because all that we care
about is whether the shared memory segment could indicate the presence
of backends running in the current $PGDATA directory.  With the file
permissions that we use, it is not possible for a shared memory segment
to belong to a userid different from the one that owns the data
directory, and so any postmaster having a different userid must be
managing a different data directory.

So we could reduce our exposure to failure-to-start conditions by
allowing the EACCES case in PGSharedMemoryIsInUse.  Does anyone see
a flaw in this reasoning?

This isn't a complete solution, because if you are running multiple
postmasters under the *same* userid, they could still get confused.
We could probably fix that by marking each shmem seg to indicate which
data directory it goes with (eg, store the directory's inode number in
the seg header).  If we see an apparently live shmem segment of our own
userid, we could attach to it and check the header to determine whether
it's really a conflict or not.  There might be some portability issues
here though; didn't we find out that Windows doesn't really have inode
numbers?

regards, tom lane



Re: [HACKERS] Call for objections: simplify stable functions during

2004-11-09 Thread Marc G. Fournier
On Tue, 9 Nov 2004, Tom Lane wrote:
Awhile back, there was some discussion about pre-folding now() and
related functions when the planner is trying to estimate selectivities.
This would allow reasonable plans to be made for cases like
WHERE moddate >= current_date - 10;
without having to indulge in any crude hacks with mislabeled wrapper
functions, such as you can find all too often in the archives :-(
I was a bit hesitant about it at the time because I wasn't sure of all
the implications; but I've looked the idea over again, and as far as I
can see it's reasonable to pre-fold *all* stable functions when deriving
statistical estimates.  (One of the potential objections went away when
we started enforcing that stable functions don't have side-effects.)
The infrastructure for this is already there, because of Oliver Jowett's
previous work to teach eval_const_expressions() whether it's folding
the expression "for real" or just for estimation; it's basically a one
line change to treat stable functions differently in the two cases.
I know it's a bit late in the cycle, but I'd like to go ahead and make
this change for 8.0.  Objections?

From a performance tuning standpoint, I can't argue against it ... go for
it ...

Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664


Re: [HACKERS] Call for objections: simplify stable functions

2004-11-09 Thread Simon Riggs
On Tue, 2004-11-09 at 16:28, Tom Lane wrote:
> Awhile back, there was some discussion about pre-folding now() and
> related functions when the planner is trying to estimate selectivities.
> This would allow reasonable plans to be made for cases like
>   WHERE moddate >= current_date - 10;
> without having to indulge in any crude hacks with mislabeled wrapper
> functions, such as you can find all too often in the archives :-(
> 
> I was a bit hesitant about it at the time because I wasn't sure of all
> the implications; but I've looked the idea over again, and as far as I
> can see it's reasonable to pre-fold *all* stable functions when deriving
> statistical estimates.  (One of the potential objections went away when
> we started enforcing that stable functions don't have side-effects.)
> 
> The infrastructure for this is already there, because of Oliver Jowett's
> previous work to teach eval_const_expressions() whether it's folding
> the expression "for real" or just for estimation; it's basically a one
> line change to treat stable functions differently in the two cases.
> 
> I know it's a bit late in the cycle, but I'd like to go ahead and make
> this change for 8.0.  Objections?

None. IMHO poor performance is a valid bug that must be addressed in the
beta cycle. We've chased out most of the functional deficiencies; now
it's time to concentrate on the performance ones.

-- 
Best Regards, Simon Riggs




Re: [HACKERS] debugging PostgreSQL

2004-11-09 Thread Alvaro Herrera
On Tue, Nov 09, 2004 at 09:16:34AM +0100, [EMAIL PROTECTED] wrote:

> Could someone be so kind to give me some pointers about how
> to debug pg. I would like to know which debugger you use under windows and
> linux

GDB.  The mechanism is simple: start a connection, and in a terminal
window get the backend's PID with ps(1), then call gdb -p <pid>.

I assume this would also work with other debuggers or GDB frontends, but
I haven't tried.

-- 
Alvaro Herrera ()
"El número de instalaciones de UNIX se ha elevado a 10,
y se espera que este número aumente" (UPM, 1972)




Re: [HACKERS] Reorganization of the translation files

2004-11-09 Thread Tom Lane
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> I was thinking about organizing the translation files in a more efficient 
> manner. (not for 8.0, obviously)
> [snip]
> And it would easily solve issues like translatable strings 
> appearing in the pgport library, which has no makefile structure to support 
> translations (which would be a waste for like 5 strings), and no run-time 
> support either.

Sounds like a win to me on that grounds alone; but probably the
translators need to have the biggest say here, since they'll be affected
the most.

One question is whether the gettext code is markedly less efficient when
accessing a large .mo file than a small one; if so, the consolidation
would probably hurt client performance.

regards, tom lane



[HACKERS] Reorganization of the translation files

2004-11-09 Thread Peter Eisentraut
I was thinking about organizing the translation files in a more efficient 
manner. (not for 8.0, obviously)

Right now we have one PO file for each combination of program (or library) and 
language.  The idea was that one only has to install the translation files 
for the PostgreSQL pieces that one actually uses on a particular system, and 
this is basically how it has worked out.

Nevertheless, I think it may make sense to provide only a single PO file for 
each language, covering the entire PostgreSQL source tree.  Because if you 
think about it, the above space-saving scheme actually wastes space.  If you 
install a PostgreSQL server package today, you install 14 translations, 93% 
of which you don't need.  On the other hand, if we only provided one PO file 
per language and encouraged packagers to create a separate package for each 
language (say, postgresql-i18n-de), then a client-only installation wastes 
more like 60%, and a server installation (which usually includes the clients) 
wastes nothing.  Moreover, English speakers have the option to install no 
translations at all.

Now, the above can be accomplished by realigning the packaging without 
changing the PO files, but if we were to do that, then there is little reason 
to keep separate files.  Moreover, having one big file would have several 
other advantages:  It would save space because many strings are duplicated in 
several PO files.  It would make life easier on those installing the 
translations.  I suspect it would also make life easier for translators.  And 
it would be easier to release translation updates or new translations between 
code releases.  And it would easily solve issues like translatable strings 
appearing in the pgport library, which has no makefile structure to support 
translations (which would be a waste for like 5 strings), and no run-time 
support either.

Comments?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



[HACKERS] Call for objections: simplify stable functions during estimation

2004-11-09 Thread Tom Lane
Awhile back, there was some discussion about pre-folding now() and
related functions when the planner is trying to estimate selectivities.
This would allow reasonable plans to be made for cases like
WHERE moddate >= current_date - 10;
without having to indulge in any crude hacks with mislabeled wrapper
functions, such as you can find all too often in the archives :-(
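The "crude hack" alluded to is typically a wrapper function deliberately
mislabeled as immutable so the planner will fold it at plan time. A sketch
(the function name is hypothetical, and this is exactly the pattern the
proposal would make unnecessary, not a recommendation):

```sql
-- Mislabeled wrapper: lying to the planner so current_date gets folded.
CREATE FUNCTION today_immutable() RETURNS date
    AS 'SELECT current_date'
    LANGUAGE SQL IMMUTABLE;  -- mislabeled: current_date is only stable

-- With the proposed change, the plain form would get a good estimate:
--   WHERE moddate >= current_date - 10
```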

I was a bit hesitant about it at the time because I wasn't sure of all
the implications; but I've looked the idea over again, and as far as I
can see it's reasonable to pre-fold *all* stable functions when deriving
statistical estimates.  (One of the potential objections went away when
we started enforcing that stable functions don't have side-effects.)

The infrastructure for this is already there, because of Oliver Jowett's
previous work to teach eval_const_expressions() whether it's folding
the expression "for real" or just for estimation; it's basically a one
line change to treat stable functions differently in the two cases.

I know it's a bit late in the cycle, but I'd like to go ahead and make
this change for 8.0.  Objections?

regards, tom lane



Re: [HACKERS] Increasing the length of

2004-11-09 Thread Simon Riggs
On Mon, 2004-11-08 at 22:32, Tom Lane wrote:
> Another relevant question is why you are expecting to get this
> information through pgstats and not by looking in the postmaster log.

This is only available if you log all queries, which isn't normally done
while you are in production. When you hit a long running query, you do
wish you had that enabled, and if it was you could look there. 

It would be best to leave the postmaster logging turned off, then allow
dynamic inspection of the query iff you have a rogue query.

This is an important admin consideration for data warehousing.

> I don't know about you, but I don't have any tools that are designed to
> cope nicely with looking at tables that have columns that might be many
> K wide.  Looking in the log seems a much nicer way of examining the full
> text of extremely long queries.  So I think it's actually a good thing
> that pgstats truncates the queries at some reasonable width.

You're right...if the query was in the log, thats where I'd look.

In general, I'm not bothered whether I can see the whole query or not.
But it would be good to have a mode of operation that allows the whole
query to be seen via pg_stat_activity, when required.
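For reference, the dynamic inspection mentioned above is a query like the
following (column names as of 7.4/8.0; stats_command_string must be enabled
for current_query to be populated):

```sql
-- Inspect currently-running statements; current_query is truncated to
-- the pgstats buffer size under discussion in this thread.
SELECT procpid, usename, current_query
FROM pg_stat_activity
WHERE current_query <> '<IDLE>';
```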

Could this allow some dynamic behaviour? I.e. set it low, as Tom
suggests, most of the time, then set it higher, as Greg suggests,
on demand, for a short period only? [Not sure, but I think the query text
may only be sent once at the start of the query, then never again... so
there may be a hole in this thinking.]

-- 
Best Regards, Simon Riggs


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Increasing the length of pg_stat_activity.current_query...

2004-11-09 Thread Andreas Pflug
Josh Berkus wrote:
> Tom,
>
>> Another relevant question is why you are expecting to get this
>> information through pgstats and not by looking in the postmaster log.
>>
>> I don't know about you, but I don't have any tools that are designed to
>> cope nicely with looking at tables that have columns that might be many
>> K wide.  Looking in the log seems a much nicer way of examining the full
>> text of extremely long queries.  So I think it's actually a good thing
>> that pgstats truncates the queries at some reasonable width.

Because pg_stat_activity can be queried dynamically, and the log can't. 
I've been planning to post a lengthy mail after 8.0 release, but it 
seems a good idea to do it now.

When comparing pgsql to MSSQL in practice, I encounter a similar problem 
as Josh. I got a server hammered by countless queries, some of them not 
too well constructed and thus soaking CPU from all users. On MSSQL, I'd 
be using the Profiler, which lets me tap one or more connections, and 
log whatever I think is important to trace down the problem. This lets 
me filter out those uninteresting 99.9 % of queries which would make my 
log unreadable. Additionally, some performance measures are recorded for 
each query, enabling me to spot the bad guys, analyze and improve them.

On pgsql, all logging goes unstructured into one file; I can't even 
start and stop a new log on demand during my observation period (somebody 
refused to implement a manual log rotation function, "nobody needs 
that"...). On a server accessed by 100 users, with several dozen 
queries fired every second, it's hard work to locate the offending query.

It appears to me that simply increasing the max query length won't solve 
the problem (and 16k would not be enough). What I'd like to see is the 
possibility to tap one or more backends (this is superuser only, of 
course), and put them in a logging mode, which will record the complete 
query including performance counters to some process in a lossless way. 
When I say tapping I mean that the backend configuration switch is *not* 
set by the very same backend, but from a different superuser backend.
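A purely hypothetical shape for such a tapping interface (no function like this exists in PostgreSQL; this is illustration only, not a proposal of an actual API):

```sql
-- Hypothetical, not an existing API: a superuser taps backend 21735,
-- routing its complete query text plus performance counters to a
-- separate sink, losslessly. pg_backend_tap is an invented name.
SELECT pg_backend_tap(21735, 'on');
```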

Regards,
Andreas


Re: [HACKERS] ExclusiveLock

2004-11-09 Thread Simon Riggs
On Mon, 2004-11-08 at 21:37, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > Recent runs of DBT-2 show very occasional ExclusiveLock (s) being held
> > by transactions, sometimes waiting to be granted.
> 
> I think you are right that these reflect heap or btree-index extension
> operations.  Those do not actually take locks on the *table* however,
> but locks on a single page within it (which are completely orthogonal to
> table locks and don't conflict).  The pg_locks output leaves something
> to be desired, because you can't tell the difference between table and
> page locks.

Good. Thought it was worth discussion...

> It's odd that your example does not appear to show someone else holding
> a conflicting lock.

There is...I didn't copy the whole lock table output...here it is...

    relname    |  pid  |       mode       | granted 
---------------+-------+------------------+---------
 new_order     | 21735 | AccessShareLock  | t
 new_order     | 21735 | RowExclusiveLock | t
 orders        | 21715 | AccessShareLock  | t
 orders        | 21715 | RowExclusiveLock | t
 pg_class      | 23254 | AccessShareLock  | t
 order_line    | 21715 | AccessShareLock  | t
 order_line    | 21715 | RowExclusiveLock | t
 order_line    | 21735 | ExclusiveLock    | f
 new_order     | 21715 | AccessShareLock  | t
 new_order     | 21715 | RowExclusiveLock | t
 customer      | 21715 | AccessShareLock  | t
 pk_order_line | 21735 | AccessShareLock  | t
 pk_order_line | 21735 | RowExclusiveLock | t
 item          | 21715 | AccessShareLock  | t
 orders        | 21735 | AccessShareLock  | t
 orders        | 21735 | RowExclusiveLock | t
 order_line    | 21735 | AccessShareLock  | t
 order_line    | 21735 | RowExclusiveLock | t
 stock         | 21715 | AccessShareLock  | t
 stock         | 21715 | RowExclusiveLock | t
 order_line    | 21715 | ExclusiveLock    | t
 pk_order_line | 21715 | RowExclusiveLock | t
 pg_locks      | 23254 | AccessShareLock  | t
 district      | 21715 | AccessShareLock  | t
 district      | 21715 | RowShareLock     | t
 district      | 21715 | RowExclusiveLock | t
 warehouse     | 21715 | AccessShareLock  | t
 customer      | 21735 | AccessShareLock  | t
 customer      | 21735 | RowExclusiveLock | t
(29 rows)


Pids 21715 and 21735 are conflicting.

There's also another example where the lock table output is > 1400 rows,
with two lock requests pending.
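With a lock list that large, a self-join on pg_locks is a quick way to pull out just the waiters and the holders they conflict with (a sketch; pg_locks columns as of 8.0):

```sql
-- Sketch: for each ungranted request, show the granted locks held by
-- other backends on the same relation.
SELECT c.relname,
       w.pid  AS waiting_pid, w.mode AS wanted_mode,
       h.pid  AS holding_pid, h.mode AS held_mode
FROM   pg_locks w
JOIN   pg_locks h ON h.relation = w.relation
                 AND h.granted
                 AND h.pid <> w.pid
JOIN   pg_class c ON c.oid = w.relation
WHERE  NOT w.granted;
```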

The oprofile for this run looks like this (though unlike the lock list,
it is of course not a snapshot at a point in time):

CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples  %        app name               symbol name
170746   42.7220  vmlinux-2.6.8.1-osdl2  ia64_pal_call_static
18934     4.7374  libc-2.3.2.so          (no symbols)
10691     2.6750  postgres               FunctionCall2
9814      2.4555  postgres               hash_seq_search
8654      2.1653  postgres               SearchCatCache
7389      1.8488  postgres               AllocSetAlloc
6122      1.5318  postgres               hash_search
5707      1.4279  postgres               OpernameGetCandidates
4901      1.2263  postgres               StrategyDirtyBufferList
4627      1.1577  postgres               XLogInsert
4424      1.1069  postgres               pglz_decompress
4371      1.0937  vmlinux-2.6.8.1-osdl2  __copy_user
3796      0.9498  vmlinux-2.6.8.1-osdl2  finish_task_switch
3483      0.8715  postgres               LWLockAcquire
3458      0.8652  postgres               eqjoinsel
3001      0.7509  vmlinux-2.6.8.1-osdl2  get_exec_dcookie
2824      0.7066  postgres               AtEOXact_CatCache
2745      0.6868  postgres               _bt_compare
2730      0.6831  postgres               nocachegetattr
2715      0.6793  postgres               SearchCatCacheList
2659      0.6653  postgres               MemoryContextAllocZeroAligned
2604      0.6515  postgres               yyparse
2553      0.6388  postgres               eqsel
2127      0.5322  postgres               deconstruct_array
1921      0.4806  postgres               hash_any
1919      0.4801  postgres               int4eq
1855      0.4641  postgres               LWLockRelease
1839      0.4601  postgres               StrategyBufferLookup
1777      0.4446  postgres               GetSnapshotData
1729      0.4326  postgres               heap_getsysattr
1595      0.3991  postgres               DLMoveToFront
1586      0.3968  postgres               MemoryContextAlloc
1485      0.3716  vmlinux-2.6.8.1-osdl2  try_atomic_semop
1455      0.3641  postgres               anonymous symbol from section .plt
1409      0.3525  postgres               lappend
1352      0.3383  postgres               heap_release_fetch
1270      0.3178  postgres               PinBuffer
1141      0.2855  postgres               DirectFunctionCall1
1132      0.2832  postgres               base_yylex
982       0.2457  postgres               

[HACKERS] debugging PostgreSQL

2004-11-09 Thread gevik
Dear Folks,

Could someone be so kind as to give me some pointers on how
to debug pg? I would like to know which debugger you use under Windows
and Linux.

Regards,
Gevik

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings