Re: [HACKERS] Overhauling GUCS

2008-06-05 Thread Heikki Linnakangas

David E. Wheeler wrote:

On Jun 5, 2008, at 14:47, Greg Smith wrote:

This is why there's the emphasis on preserving comments as they pass 
into the GUC structure and back to an output file.  This is one of the 
implementation details I haven't fully made up my mind on:  how to 
clearly label user comments in the postgresql.conf to distinguish them 
from verbose ones added to the file.  I have no intention of letting 
manual user edits go away; what I'm trying to do here (and this part 
is much more me than Josh) is make them more uniform such that they 
can co-exist with machine edits without either stomping on the other.  
Right now doing that is difficult, because it's impossible to tell the 
default comments from the ones the users added and the current comment 
structure bleeds onto the same lines as the settings.


How about a simple rule, such as that machine-generated comments start 
with "##", while user comments start with just "#"? I think that I've 
seen such a rule used before. At any rate, I think that, unless you have 
some sort of line marker for machine-generated comments, there will be 
no way to tell them apart from user comments.


What comments do we consider machine-generated? Just the ones used to 
comment out settings, like


#shared_buffers = 32MB

or something else?

If the automatic tool leaves all other kinds of comments alone, I think 
we're fine. In fact, it wouldn't necessarily need to modify those 
comments either; it could simply add a new setting line below:


#shared_buffers = 32MB
shared_buffers = 1024MB

For extra safety, it could comment out old settings, perhaps with 
something like this:


#shared_buffers = 32MB
#shared_buffers = 1024MB  # commented out by wizard on 2008-06-05
shared_buffers = 2048MB

This would preserve a full change history in the file. It would become 
quite messy after a lot of changes, of course, but a user can trim the 
history by hand if he wants to.


Or perhaps we should explicitly mark the settings the tool has 
generated, and comment out:


#shared_buffers = 32MB   # commented out by wizard on 2008-06-05
shared_buffers = 1024MB  # automatically set by wizard on 2008-06-05

That way the tool could safely replace automatically set settings 
without touching manually set ones, while leaving a clear trace of 
what happened.
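The rewrite step in the examples above could be sketched roughly like this (the exact comment wording and the "wizard" name are placeholders, not a settled design): comment out the old line, then append a new machine-set line carrying a trace of when it was written.

```c
#include <stdio.h>

/* Sketch only: turn an existing setting line into a commented-out
 * copy plus a new, traceable machine-set line. */
static void rewrite_setting(const char *old_line, const char *name,
                            const char *new_value, const char *date,
                            char *out, size_t outlen)
{
    snprintf(out, outlen,
             "#%s  # commented out by wizard on %s\n"
             "%s = %s  # automatically set by wizard on %s\n",
             old_line, date, name, new_value, date);
}
```

Running it on "shared_buffers = 32MB" with a new value of 1024MB would produce the two-line history shown above.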


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] orafce does NOT build with Sun Studio compiler

2008-06-05 Thread Pavel Stehule
Hello

2008/6/5 Mayuresh Nirhali <[EMAIL PROTECTED]>:
> Hello hackers,
>
> During the Oracle migration tutorial by peter at PGCon, I took an action
> item for myself to try orafce on Solaris/OpenSolaris.
> As pg binaries are bundled with Solaris now (using Sun Studio compiler), I
> decided to try out building orafce against the same bundled binaries (with
> USE_PGXS=1).
>
> I see following build error,
> /opt/SUNWspro/SS11/bin/cc -xc99=none -xCC -KPIC -I.
> -I/usr/include/pgsql/server -I/usr/include/pgsql/internal -I/usr/sfw/include
> -I/usr/include/kerberosv5  -c -o pipe.o pipe.c
> "pipe.c", line 149: null dimension: data
> cc: acomp failed for pipe.c
> gmake[1]: *** [pipe.o] Error 2
> gmake[1]: Leaving directory `/builds2/postgres/orafce/orafce'
> *** Error code 2
> make: Fatal error: Command failed for target `orafce/config.status'
> Current working directory /builds2/postgres/orafce
>
> Sun Studio does not like array declarations with a null dimension.
> So, in pipe.c we have:
>
> typedef struct
> {
>   LWLockId shmem_lock;
>   pipe *pipes;
>   alert_event *events;
>   alert_lock *locks;
>   size_t size;
>   unsigned int sid;
>   char data[];   /* line 149 */
> } sh_memory;
>
> A quick look tells me that this should not be hard to fix, but I have not
> prepared any patch as I don't understand the code very well.
> Is it possible to fix this soon? This would improve portability and
> help people use orafce with the existing pg binaries on Solaris.
>
> Thanks
> Mayuresh
>
>

I'll fix it soon

Pavel
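
One portable approach, sketched below as an assumption rather than the patch Pavel will actually commit, is the classic pre-C99 workaround: replace the flexible array member with a one-element array and size allocations with offsetof(), which Sun Studio accepts even with -xc99=none. The typedefs here are placeholder stand-ins for the real orafce types.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Placeholder stand-ins for the real orafce types (illustration only). */
typedef int LWLockId;
typedef struct pipe pipe;
typedef struct alert_event alert_event;
typedef struct alert_lock alert_lock;

typedef struct
{
    LWLockId shmem_lock;
    pipe *pipes;
    alert_event *events;
    alert_lock *locks;
    size_t size;
    unsigned int sid;
    char data[1];               /* was: char data[];  (line 149) */
} sh_memory;

/* Allocate the header plus 'payload' trailing bytes; offsetof()
 * avoids double-counting the dummy one-element array. */
static sh_memory *sh_memory_create(size_t payload)
{
    sh_memory *mem = malloc(offsetof(sh_memory, data) + payload);
    if (mem != NULL)
    {
        memset(mem, 0, offsetof(sh_memory, data) + payload);
        mem->size = payload;
    }
    return mem;
}
```

The cost is giving up the C99 guarantee that accesses past the declared size are well-defined, which is why offsetof-based sizing (rather than sizeof) matters here.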



Re: [HACKERS] Overhauling GUCS

2008-06-05 Thread David E. Wheeler

On Jun 5, 2008, at 17:53, Greg Smith wrote:

I was already considering keeping user comments as # while making  
all system-inserted ones #! ; many people are already used to #!  
having a special system-related meaning from its use in UNIX shell  
scripting which makes it easier to remember.


Oooh, yeah. I hadn't even thought of that! I was just looking at  
characters on my keyboard and typing them in to see which ones I  
thought were most distinctive. This may be part of the reason I  
thought that #! was distinctive. :-)


Best,

David



Re: [HACKERS] Overhauling GUCS

2008-06-05 Thread Greg Smith

On Thu, 5 Jun 2008, Alvaro Herrera wrote:


FWIW smb.conf uses ; for one purpose and # for the other.


They're actually combining the way UNIX files use # with how Windows INI 
files use ; in a config file context, which I personally find a little 
weird.


I was already considering keeping user comments as # while making all 
system-inserted ones #! ; many people are already used to #! having a 
special system-related meaning from its use in UNIX shell scripting which 
makes it easier to remember.
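
A parser for that marker rule could be as simple as the following sketch (the rule itself is still a proposal, not settled behavior): "#!" introduces a machine-generated comment, a plain "#" a user comment, and anything else is a potential setting line.

```c
#include <string.h>

/* Classify a postgresql.conf line under the proposed "#" / "#!"
 * marker convention (a sketch of the idea under discussion). */
typedef enum
{
    LINE_SETTING,            /* possible name = value line */
    LINE_USER_COMMENT,       /* starts with "#"            */
    LINE_MACHINE_COMMENT     /* starts with "#!"           */
} line_kind;

static line_kind classify_line(const char *line)
{
    while (*line == ' ' || *line == '\t')
        line++;                              /* skip leading whitespace */
    if (strncmp(line, "#!", 2) == 0)
        return LINE_MACHINE_COMMENT;
    if (line[0] == '#')
        return LINE_USER_COMMENT;
    return LINE_SETTING;
}
```

With a distinction that cheap to make, a regeneration tool can rewrite its own comments freely while passing user comments through untouched.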


I think the next step to this whole plan is to generate a next-gen 
postgresql.conf mock-up showing what each of the outputs from the 
pg_generate_conf tool might look like to get feedback on that; it will 
make what is planned here a bit easier to understand as well.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] Overhauling GUCS

2008-06-05 Thread Alvaro Herrera
David E. Wheeler wrote:

> How about a simple rule, such as that machine-generated comments start  
> with "##", while user comments start with just "#"? I think that I've  
> seen such a rule used before. At any rate, I think that, unless you have 
> some sort of line marker for machine-generated comments, there will be no 
> way to tell them apart from user comments.

FWIW smb.conf uses ; for one purpose and # for the other.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Overhauling GUCS

2008-06-05 Thread David E. Wheeler

On Jun 5, 2008, at 14:47, Greg Smith wrote:

This is why there's the emphasis on preserving comments as they pass  
into the GUC structure and back to an output file.  This is one of  
the implementation details I haven't fully made up my mind on:  how  
to clearly label user comments in the postgresql.conf to distinguish  
them from verbose ones added to the file.  I have no intention of  
letting manual user edits go away; what I'm trying to do here (and  
this part is much more me than Josh) is make them more uniform such  
that they can co-exist with machine edits without either stomping on  
the other.  Right now doing that is difficult, because it's  
impossible to tell the default comments from the ones the users  
added and the current comment structure bleeds onto the same lines  
as the settings.


How about a simple rule, such as that machine-generated comments start  
with "##", while user comments start with just "#"? I think that I've  
seen such a rule used before. At any rate, I think that, unless you  
have some sort of line marker for machine-generated comments, there  
will be no way to tell them apart from user comments.


Other possibilities for machine-comments:

## Machine comment
### Machine comment
#! Machine comment
#@ Machine comment
#$ Machine comment
#^ Machine comment
# Machine comment

I actually kinda like "#!". It's distinctive and unlikely to appear in  
a user comment. Anyway, just food for thought.


Best,

David



Re: [HACKERS] "ERROR: operator is not unique" with Custom Data Type

2008-06-05 Thread David E. Wheeler

On Jun 5, 2008, at 14:07, Martijn van Oosterhout wrote:

I'm sure I'm missing something simple here. How do I make it  
assignment?


# \h create cast
Command: CREATE CAST
Description: define a new cast
Syntax:

CREATE CAST (sourcetype AS targettype)
   WITHOUT FUNCTION
   [ AS ASSIGNMENT | AS IMPLICIT ]


I need to read up on the CAST documentation. Thanks.


Huh. That's what citext has, too:

CREATE CAST (citext AS text) WITHOUT FUNCTION AS IMPLICIT;
CREATE CAST (text AS citext) WITHOUT FUNCTION AS IMPLICIT;


And citext probably doesn't work with 8.3? The casting rules wrt text
have changed...


Yes, that is correct. It builds, but the SQL doesn't all run properly.  
I'll be wading through all those failures once I get the basics worked  
out with v2.


Thanks,

David




Re: [HACKERS] About dependency reports in DROP RESTRICT

2008-06-05 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> On the other hand the fact that we don't actually provide an
> exhaustive set of data for that purpose and a) nobody's complained and
> b) it's for basically the same reason that you're suggesting this
> change, ie, that it isn't convenient and isn't important enough to go
> out of our way to build just for that purpose could mean it's a
> reasonable compromise. Are you just worried about the memory and cpu
> cycles or is it actually a lot of code?

Well, the problem is that it uglifies the code quite a lot.  The patch
as I've got it now adds a "flags" field to ObjectAddress, which is
unused dead space for about half of the uses of ObjectAddress; to keep
the old behavior we'd need to either add three more half-used fields,
or persuade ObjectAddresses to manage two parallel arrays, neither of
which seems very nice.  I'll do it if people want it, but I thought
first I should ask if anyone really cares.

> Incidentally, if it happens to be straightforward (I suspect not :( ) in the
> above example it would be nice to compress out the internal dependencies and
> show just the "view b depends on function a(text)" which would actually make
> sense to a DBA. The intermediate rules going via internal objects (rules)
> they've never heard of make it a lot harder to read.

Actually, I think the patch as I've got it now will behave that way
(though it's not done enough to test yet ...)

>> BTW, it would now be possible to do something like what the shdepend
>> code does, and stuff all these reports into the DETAIL field of a
>> single message, instead of emitting them as separate notices.
>> Any feelings pro or con about that?

> Seems fine either way -- I wonder if one way is more convenient for pgadmin or
> applications? I suspect if so it would be the DETAIL field?

The arguments are all about the same as they were for shdepend messages,
I think.  The case to think about is where there are LOTS of
dependencies.  Do you want lots of separate NOTICE messages, or a large,
perhaps truncated DETAIL field?  I don't recall for sure, but I think
we made the shdepend code act the way it does because we thought that
was better --- certainly it would've been easy to make it just spit
individual NOTICES like the older pg_depend code does.

regards, tom lane



Re: [HACKERS] About dependency reports in DROP RESTRICT

2008-06-05 Thread Gregory Stark

"Tom Lane" <[EMAIL PROTECTED]> writes:

> Currently, if you do DROP something RESTRICT where there are multiple
> levels of dependencies on the "something", you get reports that might
> look about like this:
>
> NOTICE:  x depends on something
...
> So what I'd like to do about it is just use the CASCADE style all the
> time.  Thoughts?

Well personally I always react to the notices by adding the CASCADE token but
that's because I'm just testing stuff. If I was working with a real database I
would probably be quite likely to be looking for the minimal fix to break the
dependency chain.

So for example in a situation like this:

postgres=# create function a(text) returns text as 'select $1' language sql;

CREATE FUNCTION
postgres=# select a('foo');
  a  
-
 foo
(1 row)

postgres=# create view b as select a('foo');
CREATE VIEW

postgres=# create view c as select * from b;
CREATE VIEW

postgres=# drop function a(text);
NOTICE:  0: rule _RETURN on view b depends on function a(text)
NOTICE:  0: view b depends on rule _RETURN on view b
NOTICE:  0: rule _RETURN on view c depends on view b
NOTICE:  0: view c depends on rule _RETURN on view c
ERROR:  2BP01: cannot drop function a(text) because other objects depend on it

postgres=# create or replace view b  as select 'foo'::text as a;
CREATE VIEW

postgres=# drop function a(text);
DROP FUNCTION

postgres=# select * from c;
  a  
-
 foo
(1 row)

It seems like it's quite relevant to provide the dependency chain to help the
DBA find the point in the chain where he wants to intervene.

On the other hand the fact that we don't actually provide an exhaustive set of
data for that purpose and a) nobody's complained and b) it's for basically the
same reason that you're suggesting this change, ie, that it isn't convenient
and isn't important enough to go out of our way to build just for that purpose
could mean it's a reasonable compromise. Are you just worried about the memory
and cpu cycles or is it actually a lot of code?

Incidentally, if it happens to be straightforward (I suspect not :( ) in the
above example it would be nice to compress out the internal dependencies and
show just the "view b depends on function a(text)" which would actually make
sense to a DBA. The intermediate rules going via internal objects (rules)
they've never heard of make it a lot harder to read.

> BTW, it would now be possible to do something like what the shdepend
> code does, and stuff all these reports into the DETAIL field of a
> single message, instead of emitting them as separate notices.
> Any feelings pro or con about that?

Seems fine either way -- I wonder if one way is more convenient for pgadmin or
applications? I suspect if so it would be the DETAIL field?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning



Re: [HACKERS] About dependency reports in DROP RESTRICT

2008-06-05 Thread Alvaro Herrera
Tom Lane wrote:

> So what I'd like to do about it is just use the CASCADE style all the
> time.  Thoughts?

It is a loss of functionality, but I very much doubt anyone is depending
on it -- it's way too elaborate.  +1 on doing the simpler report if it's
too expensive to build the full report.

> BTW, it would now be possible to do something like what the shdepend
> code does, and stuff all these reports into the DETAIL field of a
> single message, instead of emitting them as separate notices.
> Any feelings pro or con about that?

I think it makes more sense to do it that way (considering that they're
really part of the single error message, not independent reports), but
there's the problem that the error report gets too long.  So we would
have to send a truncated report to the client and the full report to the
log only.  Would people be upset at that?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



[HACKERS] About dependency reports in DROP RESTRICT

2008-06-05 Thread Tom Lane
Currently, if you do DROP something RESTRICT where there are multiple
levels of dependencies on the "something", you get reports that might
look about like this:

NOTICE:  x depends on something
NOTICE:  y depends on x
NOTICE:  z depends on y

that is, you can trace the chain of reasoning for each deletion.
However, we don't do that in CASCADE mode; you'll just see

NOTICE:  drop cascades to x
NOTICE:  drop cascades to y
NOTICE:  drop cascades to z

I'm working on revising the DROP dependency logic as sketched here:
http://archives.postgresql.org/pgsql-hackers/2008-05/msg00301.php
and I'm realizing that it's going to be quite expensive to maintain the
old NOTICE style for RESTRICT, because we aren't emitting the notices
on-the-fly anymore, but only after we've finished recursing to find all
the objects to delete; we'd have to save about twice as much state to
remember which object was the immediate predecessor of each victim.
And the old behavior was always a bit indeterminate anyway because there
could be multiple dependency paths, and which one got reported as the
deletion cause would be happenstance.

So what I'd like to do about it is just use the CASCADE style all the
time.  Thoughts?
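
As a toy sketch (not the actual pg_depend code) of what "CASCADE style all the time" means mechanically: first recurse to collect every object that depends, directly or indirectly, on the target, then report the result as one flat list rather than tracking each victim's immediate predecessor.

```c
/* Toy dependency graph: deps[i][j] != 0 means object j depends on
 * object i.  collect_dependents() gathers all transitive dependents
 * of 'target' into 'found' as a single flat list. */
#define MAXOBJ 8

static int deps[MAXOBJ][MAXOBJ];

static int collect_dependents(int target, int *found, int nfound)
{
    for (int j = 0; j < MAXOBJ; j++)
    {
        if (deps[target][j])
        {
            int seen = 0;
            for (int k = 0; k < nfound; k++)
                if (found[k] == j)
                    seen = 1;
            if (!seen)
            {
                found[nfound++] = j;    /* flat CASCADE-style entry */
                nfound = collect_dependents(j, found, nfound);
            }
        }
    }
    return nfound;
}
```

Reporting "drop cascades to ..." for each entry of 'found' needs no extra per-object predecessor state, which is exactly the savings described above; the duplicate check also makes multiple dependency paths harmless.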

BTW, it would now be possible to do something like what the shdepend
code does, and stuff all these reports into the DETAIL field of a
single message, instead of emitting them as separate notices.
Any feelings pro or con about that?

regards, tom lane



Re: [HACKERS] Overhauling GUCS

2008-06-05 Thread Greg Smith

On Thu, 5 Jun 2008, Aidan Van Dyk wrote:

People like me don't want postgresql.conf to be *only* a 
machine-generated file which I am not allowed to edit anymore, because 
the next DBA doing a "SET PERSISTENT" type of command is going to cause 
postgres to write out something else, overwriting my carefully 
documented reason for some particular setting.


This is why there's the emphasis on preserving comments as they pass into 
the GUC structure and back to an output file.  This is one of the 
implementation details I haven't fully made up my mind on:  how to clearly 
label user comments in the postgresql.conf to distinguish them from 
verbose ones added to the file.  I have no intention of letting manual 
user edits go away; what I'm trying to do here (and this part is much more 
me than Josh) is make them more uniform such that they can co-exist with 
machine edits without either stomping on the other.  Right now doing that 
is difficult, because it's impossible to tell the default comments from 
the ones the users added and the current comment structure bleeds onto the 
same lines as the settings.



But the big issue I have (not that it really matters, because I'm not
one of the ones working on it, so please don't take this as me telling
anyone what they can or can't do) is that that goal doesn't solve any of
the listed problems stated in the proposal:
  1.  Most people have no idea how to set these.


Making it much easier to build recommendation tools is how this helps 
here.



  2. The current postgresql.conf file is a huge mess of 194 options,
 the vast majority of which most users will never touch.


The proposed pg_generate_conf tool includes options to spit out a basic 
configuration file instead of the complete one.



  3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf,
 and settings.sgml), which are only synched with each other manually.


The proposal throws away having a separate postgresql.conf file, so that 
reduces it from 3 places to 2.  That's moving in the right direction.



  4. We don't seem to be getting any closer to autotuning.


If you try to build a tuning tool, these areas end up being the 
unnecessarily hard parts.


Thanks for the comments on the proposal.  I'm only bothering to respond to 
messages like yours now, and am deleting all of the continuing attempts to 
divert the discussion over to parameter-tuning details or to expand the 
scope here.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] Overhauling GUCS

2008-06-05 Thread Aidan Van Dyk
* Greg Smith <[EMAIL PROTECTED]> [080605 15:17]:
> On Thu, 5 Jun 2008, Alvaro Herrera wrote:
> 
> >I must say that I am confused by this thread.  What's the discussed GUC
> >overhaul?
> 
> http://wiki.postgresql.org/wiki/GUCS_Overhaul
> 
> I drop that URL in every other message in hopes that people might start 
> commenting on it directly if they see it enough; the fact that you're 
> confused says I may need to keep that up :(

I've read it.  A couple times now.  And I like lots of it.

> >(1) Add a lot more comments to each setting
> >(2) Add documentation links to each setting
> >(3) Move more frequently used settings to the top of the file
> >(4) Ship different sample config files
> >(5) Create an expert system to suggest tuning
> >(6) Other random ideas (XML, settings in database, others?)
 
> (3) (4) (5) and (6) were off-topic diversions.

But, right from the above mentioned page:

*Goals*

  By shifting from a model where postgresql.conf is document-formatted
  and hand-edited to one where it's machine generated, it becomes vastly
  easier to write simple utilities to manage these settings. Right now,
  the big "obstacle" to things like SET PERSISTENT is "how do we
  preserve the hand-edited comments in the file" -- and the answer is we
  don't.

This little goal leads to:

  * By having a generated postgresql.conf and an easy way to generate
    it, writing autoconfiguration scripts (as well as shortcuts like SET
    PERSISTENT) becomes vastly easier.

And later:

  There needs to be a way to modify the underlying settings and save
  that into a new machine-generated postgresql.conf file. Is
  implementing SET PERSISTENT sufficient for that? 

I think that these parts, seemingly "snuck into" the GUC overhaul
proposal, are what people like me are wary of.  People like me don't want
postgresql.conf to be *only* a machine-generated file which I am not
allowed to edit anymore, because the next DBA doing a "SET PERSISTENT"
type of command is going to cause postgres to write out something else,
overwriting my carefully documented reason for some particular setting.

But the big issue I have (not that it really matters, because I'm not
one of the ones working on it, so please don't take this as me telling
anyone what they can or can't do) is that that goal doesn't solve any of
the listed problems stated in the proposal:

   1.  Most people have no idea how to set these.

   2. The current postgresql.conf file is a huge mess of 194 options,
  the vast majority of which most users will never touch.  

   3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf,
  and settings.sgml), which are only synched with each other manually.  

   4. We don't seem to be getting any closer to autotuning. 


-- 
Aidan Van Dyk                                         Create like a god,
[EMAIL PROTECTED]                                     command like a king,
http://www.highrise.ca/                               work like a slave.




Re: [HACKERS] "ERROR: operator is not unique" with Custom Data Type

2008-06-05 Thread Martijn van Oosterhout
On Thu, Jun 05, 2008 at 11:37:28AM -0700, David E. Wheeler wrote:
> >Whichever way
> >you want it, make that direction implicit and the other direction
> >assignment.
> 
> I'm sure I'm missing something simple here. How do I make it assignment?

# \h create cast
Command: CREATE CAST
Description: define a new cast
Syntax:

CREATE CAST (sourcetype AS targettype)
WITHOUT FUNCTION
[ AS ASSIGNMENT | AS IMPLICIT ]

> Huh. That's what citext has, too:
> 
> CREATE CAST (citext AS text) WITHOUT FUNCTION AS IMPLICIT;
> CREATE CAST (text AS citext) WITHOUT FUNCTION AS IMPLICIT;

And citext probably doesn't work with 8.3? The casting rules wrt text
have changed...

Have a nice day,
-- 
Martijn van Oosterhout   <[EMAIL PROTECTED]>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while 
> boarding. Thank you for flying nlogn airlines.




Re: [HACKERS] Overhauling GUCS

2008-06-05 Thread Ron Mayer

Steve Atkins wrote:

... cross-platform (Windows, Linux, Solaris, OS X as a bare
minimum) 


I wonder how cross-platform the tuning algorithm itself is.

I could also imagine that decisions like "do I let the OS page
cache, or postgres's buffer cache get most of the memory" are
extremely OS dependent.

Also, the configuration tool would be even more useful if it could
offer extra platform-specific advice like "hey, I see you're using
this filesystem. You should be using this journaling option for WAL
and this other one for data", or "on your system you should be using
this fsync method", or "use relatime or noatime when mounting your
disk".




Re: [HACKERS] Overhauling GUCS

2008-06-05 Thread Ron Mayer

Tom Lane wrote:

How far could we get with the answers to just three questions:

* How many concurrent queries do you expect to have?
* How much RAM space are you willing to let Postgres use?
* How much "overhead" disk space are you willing to let Postgres use?


+1 to this approach - these are the kinds of questions that
make sense to me when first setting up a new installation.
They sound useful for both large servers and tiny (salesguy
laptop for demos) installations.

If those aren't enough questions, what else must we ask? 


* Perhaps something to guess FSM settings?  I think FSM is the
  tunable I most often get wrong, with more painful
  consequences (bloat) than other tunables.
  My approach is to have cron run database-wide vacuums
  even on systems with autovacuum, just to see the log
  messages about FSM.

* Something to tune vacuum delay?  Perhaps:
  How much I/O bandwidth can be dedicated to Postgres
  background activities?



Re: [HACKERS] Overhauling GUCS

2008-06-05 Thread Robert Lor

Tom Lane wrote:

If those aren't enough questions, what else must we ask?  Or maybe they
aren't the right questions at all --- maybe we should ask "is this a
dedicated machine or not" and try to extrapolate everything else from
what we (hopefully) can find out about the hardware.
  
I think probably a combination of high- and low-level questions, making 
the low-level (more specific) questions optional, since some users may 
not be able to provide low-level info.  Here's a rough idea of how I 
envision this tool working.


$ pg_config_wizard

Is this machine dedicated to Postgres? (y/n) n   (now tool 
auto-discovers available HW resources)
Your system has 32GB memory. What percentage do you want to allocate to 
Postgres? (1=50%, 2=33%, 3=25%, etc) 1

What type of workload? (OLTP, DSS, etc)
...

At the end, the tool runs for a while checking to see if certain 
thresholds are reached to determine which parameters need to be 
increased.  The tool would change the parameters causing the 
bottlenecks, rerun Postgres/workload, and iterate a few times until the 
results are satisfactory. Write the recommended settings to 
postgresql.conf.recommend and let the user update postgresql.conf 
himself or whatever.
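
The memory question above could be reduced to something like the following sketch; the answer-to-fraction mapping and the quarter-for-shared_buffers rule are illustrative assumptions, not tuning recommendations.

```c
/* Toy sketch of the wizard's memory question: map the user's menu
 * choice (1=50%, 2=33%, 3=25% of RAM for Postgres) to a suggested
 * shared_buffers size.  The 1/4 rule here is an assumption. */
static unsigned long long suggest_shared_buffers(unsigned long long total_ram,
                                                 int choice)
{
    static const unsigned long long divisor[] = {1, 2, 3, 4};
    unsigned long long pg_mem = total_ram / divisor[choice];
    return pg_mem / 4;      /* hand a quarter of the grant to shared_buffers */
}
```

On the 32GB example above, answering 1 (50%) would earmark 16GB for Postgres and suggest 4GB of shared_buffers under these assumptions.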


I update my postgresql.conf, rerun the app, and see a 100% increase in 
throughput :-)


-Robert




Re: [HACKERS] Overhauling GUCS

2008-06-05 Thread Greg Smith

On Thu, 5 Jun 2008, Alvaro Herrera wrote:


I must say that I am confused by this thread.  What's the discussed GUC
overhaul?


http://wiki.postgresql.org/wiki/GUCS_Overhaul

I drop that URL in every other message in hopes that people might start 
commenting on it directly if they see it enough; the fact that you're 
confused says I may need to keep that up :(



(1) Add a lot more comments to each setting
(2) Add documentation links to each setting
(3) Move more frequently used settings to the top of the file
(4) Ship different sample config files
(5) Create an expert system to suggest tuning
(6) Other random ideas (XML, settings in database, others?)

To me, there are two ideas that are doable right now, which are (2) and
(4).  (1) seems to be a step backwards in pg_hba.conf experience, and we
would have to maintain duplicate documentation.  (3) seems messy.  (5)
is a lot of work; do we have volunteers?  As for (6), the two examples I
give can be easily dismissed.
(2) and (4) do not seem necessary to get the config API built.


(1) is in that proposal but is strictly optional as something to put in 
the configuration file itself.  The idea behind (2) is to enable tool 
authors to have an easier way to suggest where to head for more 
information.  I'd like for it to be trivial for a tool to say "Suggested 
value for <setting> is <value>; see 
http://www.postgresql.org/docs/8.3/interactive/runtime-config-resource.html 
for more information".  I know what most of the settings I tinker with do, 
but even I'd like it to be easier to find the right spot in the manual; 
for newbies it's vital.  You are correct that (2) isn't strictly necessary 
here, but it's valuable and will be easier to wrap into this than to bolt 
on later.


(3) (4) (5) and (6) were off-topic diversions.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] "ERROR: operator is not unique" with Custom Data Type

2008-06-05 Thread David E. Wheeler

On Jun 5, 2008, at 11:51, Tom Lane wrote:


I was thinking that the ::text should be cast to ::lctext, as that's
how `'a'::lctext = 'a'` works, but I keep going back and forth in my
mind. Maybe 'a'::lctext should not equal 'A'::text.


It seems to me that lctext is sort of like a more-constrained version
of text (like a domain),


Yes, exactly.


which suggests that the lctext -> text
direction can be implicit but the other direction should not be.


Ah, okay. That's a good way of putting it. So I should just eliminate  
the implicit text -> lctext cast, then? That will solve the problem?



Moreover, if you don't have lctext -> text be implicit then you
will find that none of the non-comparison text functions work on
lctext except with a cast; which is not the place you want to be.


No, quite right.


I concur with Martijn that having both directions implicit is a
Bad Idea.

BTW, I would encourage you to think of this project as citext  
version 2,

rather than inventing a new name for the datatype.  All you'll
accomplish with that is make it hard for users of citext to  
transition.


Fair enough. It was a working title, anyway.

Best,

David




Re: [HACKERS] "ERROR: operator is not unique" with Custom Data Type

2008-06-05 Thread Tom Lane
"David E. Wheeler" <[EMAIL PROTECTED]> writes:
> On Jun 5, 2008, at 11:28, Martijn van Oosterhout wrote:
>> What would you want postgresql to choose in this case.

> I was thinking that the ::text should be cast to ::lctext, as that's  
> how `'a'::lctext = 'a'` works, but I keep going back and forth in my  
> mind. Maybe 'a'::lctext should not equal 'A'::text.

It seems to me that lctext is sort of like a more-constrained version
of text (like a domain), which suggests that the lctext -> text
direction can be implicit but the other direction should not be.

Moreover, if you don't have lctext -> text be implicit then you
will find that none of the non-comparison text functions work on
lctext except with a cast; which is not the place you want to be.

I concur with Martijn that having both directions implicit is a
Bad Idea.

BTW, I would encourage you to think of this project as citext version 2,
rather than inventing a new name for the datatype.  All you'll
accomplish with that is make it hard for users of citext to transition.

regards, tom lane



Re: [HACKERS] Overhauling GUCS

2008-06-05 Thread Robert Lor

Tom Lane wrote:

This is even assuming that the tool needs to edit the file itself,
rather than just give advice.  The advice is the hard part, folks;
could we stop obsessing about trivia?
+1. IMHO, the tool doesn't need to worry about generating a prettified 
version of postgresql.conf. It should just inform the user about the 
appropriate settings or create a postgresql.conf.recommend and have the 
user update postgresql.conf himself.


-Robert



Re: [HACKERS] "ERROR: operator is not unique" with Custom Data Type

2008-06-05 Thread David E. Wheeler

On Jun 5, 2008, at 11:28, Martijn van Oosterhout wrote:


On Thu, Jun 05, 2008 at 11:18:26AM -0700, David E. Wheeler wrote:

I'm working on a custom data type based on TEXT that does case-
insensitive, locale-aware comparisons, essentially by calling LOWER()
to compare values.


What makes this different from the citext project?


citext is not locale-aware; please see Tom's comments in the "Case- 
Insensitve Text Comparison" thread.



However, thanks to the implicit cast PostgreSQL finds more than one
candidate operator when I compare properly casted values:

try=# select 'a'::lctext =  'a'::text;
ERROR:  operator is not unique: lctext = text
LINE 1: select 'a'::lctext =  'a'::text;
  ^


What would you want postgresql to choose in this case.


I was thinking that the ::text should be cast to ::lctext, as that's  
how `'a'::lctext = 'a'` works, but I keep going back and forth in my  
mind. Maybe 'a'::lctext should not equal 'A'::text.



Whichever way
you want it, make that direction implicit and the other direction
assignment.


I'm sure I'm missing something simple here. How do I make it assignment?


Having A->B and B->A both as implicit just leads to
ambiguity.


Huh. That's what citext has, too:

CREATE CAST (citext AS text) WITHOUT FUNCTION AS IMPLICIT;
CREATE CAST (text AS citext) WITHOUT FUNCTION AS IMPLICIT;
CREATE CAST (citext AS varchar) WITHOUT FUNCTION AS IMPLICIT;
CREATE CAST (varchar AS citext) WITHOUT FUNCTION AS IMPLICIT;
CREATE CAST (citext AS bpchar) WITHOUT FUNCTION AS IMPLICIT;
CREATE CAST (bpchar AS citext) WITHOUT FUNCTION AS IMPLICIT;

But I agree that there is confusion for PostgreSQL here.

Thanks,

David




Re: [HACKERS] "ERROR: operator is not unique" with Custom Data Type

2008-06-05 Thread Martijn van Oosterhout
On Thu, Jun 05, 2008 at 11:18:26AM -0700, David E. Wheeler wrote:
> I'm working on a custom data type based on TEXT that does case- 
> insensitive, locale-aware comparisons, essentially by calling LOWER()  
> to compare values.

What makes this different from the citext project?

> However, thanks to the implicit cast PostgreSQL finds more than one  
> candidate operator when I compare properly casted values:
> 
> try=# select 'a'::lctext =  'a'::text;
> ERROR:  operator is not unique: lctext = text
> LINE 1: select 'a'::lctext =  'a'::text;
>^

What would you want postgresql to choose in this case. Whichever way
you want it, make that direction implicit and the other direction
assignment. Having A->B and B->A both as implicit just leads to
ambiguity.

Have a nice day,
-- 
Martijn van Oosterhout   <[EMAIL PROTECTED]>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while 
> boarding. Thank you for flying nlogn airlines.




[HACKERS] "ERROR: operator is not unique" with Custom Data Type

2008-06-05 Thread David E. Wheeler

Howdy,

I'm working on a custom data type based on TEXT that does case- 
insensitive, locale-aware comparisons, essentially by calling LOWER()  
to compare values. I'll have more to ask about this later, when I want  
to get feedback on the implementation. But right now I'm just writing  
tests and trying to get it all to work the way I think it should.


So I've implemented operators and an operator class for the new type,  
and they work great. I've also added implicit casts between the other  
string data types:


CREATE CAST (lctext AS text)WITHOUT FUNCTION AS IMPLICIT;
CREATE CAST (text AS lctext)WITHOUT FUNCTION AS IMPLICIT;
CREATE CAST (lctext AS varchar) WITHOUT FUNCTION AS IMPLICIT;
CREATE CAST (varchar AS lctext) WITHOUT FUNCTION AS IMPLICIT;
CREATE CAST (lctext AS bpchar)  WITHOUT FUNCTION AS IMPLICIT;
CREATE CAST (bpchar AS lctext)  WITHOUT FUNCTION AS IMPLICIT;

However, thanks to the implicit cast PostgreSQL finds more than one  
candidate operator when I compare properly casted values:


try=# select 'a'::lctext =  'a'::text;
ERROR:  operator is not unique: lctext = text
LINE 1: select 'a'::lctext =  'a'::text;
   ^
HINT:  Could not choose a best candidate operator. You might need to  
add explicit type casts.


So is there a way to resolve this? Would I need to add explicit  
operators between lctext and text (and more, between text and lctext),  
assuming that PostgreSQL would find those to be the best candidate  
operators?


I'm kind of hoping that there's a simpler answer, because otherwise  
I'd have to create operators and classes for all of:


  ( lctext,  lctext  )
  ( lctext,text  )
  (   text,  lctext  )
  ( lctext,  lctext  )
  ( lctext,  varchar )
  ( varchar, lctext  )
  ( lctext,  bpchar  )
  ( bpchar,  lctext  )

And then I supposed that I'd have to do the same not only for the  
comparison operators in the operator class, but also any other binary  
operators (concatenation, regular expression, LIKE, etc.). This sounds  
like somewhat of a PITA, though I'd of course just do the cut-and-paste 
work to make it so if that was what's required. But is it? Is  
there no simpler way to do it?
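
To make the question concrete, here is the kind of boilerplate I mean for 
just one of those pairings - a hypothetical cross-type equality operator 
(the lower()-based semantics are from my work in progress, and the function 
name is made up):

```sql
-- Hypothetical: explicit equality for (lctext, text), one of the eight
-- pairings listed above. Every comparison operator would need the same
-- treatment for each pairing.
CREATE FUNCTION lctext_eq_text(lctext, text) RETURNS boolean AS
    'SELECT lower($1::text) = lower($2)'
    LANGUAGE SQL IMMUTABLE STRICT;

CREATE OPERATOR = (
    LEFTARG    = lctext,
    RIGHTARG   = text,
    PROCEDURE  = lctext_eq_text,
    COMMUTATOR = =
);
```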


Many thanks,

David



Re: [HACKERS] Overhauling GUCS

2008-06-05 Thread Robert Lor

Steve Atkins wrote:


I'd be interested in putting together a framework+GUI client to do this
cleanly in a cross-platform (Windows, Linux, Solaris, OS X as a bare
minimum) sort of way, if no-one else already has such a thing.

This is a great idea, and I was thinking along the same line. The 
framework can provide generic interfaces for the GUI/Web client, and 
leave it up to the OS to provide the needed data.


-Robert




Re: [HACKERS] Overhauling GUCS

2008-06-05 Thread Alvaro Herrera
Greg Smith wrote:
> On Thu, 5 Jun 2008, Magnus Hagander wrote:
>
>> We really need a "proper API" for it, and the stuff in pgAdmin isn't  
>> even enough to base one on.
>
> I would be curious to hear your opinion on whether the GUC overhaul  
> discussed in this thread is a useful precursor to building such a proper  
> API.

I must say that I am confused by this thread.  What's the discussed GUC
overhaul?  Things that I vaguely recall being proposed are (in the order
they came to mind):

(1) Add a lot more comments to each setting
(2) Add documentation links to each setting
(3) Move more frequently used settings to the top of the file
(4) Ship different sample config files
(5) Create an expert system to suggest tuning
(6) Other random ideas (XML, settings in database, others?)

To me, there are two ideas that are doable right now, which are (2) and
(4).  (1) seems to be a step backwards in pg_hba.conf experience, and we
would have to maintain duplicate documentation.  (3) seems messy.  (5)
is a lot of work; do we have volunteers?  As for (6), the two examples I
give can be easily dismissed.

(2) and (4) do not seem necessary to get the config API built.

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Overhauling GUCS

2008-06-05 Thread Andreas Pflug

Greg Smith wrote:

On Thu, 5 Jun 2008, Magnus Hagander wrote:

We really need a "proper API" for it, and the stuff in pgAdmin isn't 
even enough to base one on.


I would be curious to hear your opinion on whether the GUC overhaul 
discussed in this thread is a useful precursor to building such a 
proper API.


Since I'm the guy who initially wrote that config file editing stuff, I 
feel somehow addressed.

The answer is a clear ABSOLUTELY.

- The current implementation is able to edit the file directly or 
through pgsql functions; any format change will affect that function 
immediately.
- If documentation is enhanced by adding more comments in the 
postgresql.conf file, this won't help the editor because it can't rely 
on it to present help and hints to the user. It needs the comments/help 
in pg_settings or alike.



Regards,
Andreas




Re: [HACKERS] Overhauling GUCS

2008-06-05 Thread Greg Smith

On Thu, 5 Jun 2008, Magnus Hagander wrote:

We really need a "proper API" for it, and the stuff in pgAdmin isn't 
even enough to base one on.


I would be curious to hear your opinion on whether the GUC overhaul 
discussed in this thread is a useful precursor to building such a proper 
API.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] Overhauling GUCS

2008-06-05 Thread Magnus Hagander
Heikki Linnakangas wrote:
> Alvaro Herrera wrote:
>> Greg Smith wrote:
>>
>>> Where Josh got hung up, where I got hung up, where Lance Campbell
>>> stopped at with his Dummies tool, and what some unknown number of
>>> other people have been thwarted by, is that taking that knowledge and
>>> turning it into a tool useful to users is surprisingly difficult. 
>>> The reason for that is  the current postgresql.conf file and how it
>>> maps internally to GUC  information isn't particularly well suited to
>>> automated generation,  analysis, or updates.
>>
>> What I think this says is that we should be pushing Magnus more to
>> continue work on the configuration API thing he was designing.
> 
> There's some magic in pgAdmin to parse and write the file. You could
> take a look at that.

Not really. You don't want to go there. We really need a "proper API"
for it, and the stuff in pgAdmin isn't even enough to base one on.

//Magnus




Re: [HACKERS] Overhauling GUCS

2008-06-05 Thread Alvaro Herrera
Heikki Linnakangas wrote:
> Alvaro Herrera wrote:

>> What I think this says is that we should be pushing Magnus more to
>> continue work on the configuration API thing he was designing.
>
> There's some magic in pgAdmin to parse and write the file. You could  
> take a look at that.

That's what they want to get rid of.  It's a problem that every GUI
writer needs to reimplement that functionality; ISTM this is something
worth having in the backend.

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] Overhauling GUCS

2008-06-05 Thread Heikki Linnakangas

Alvaro Herrera wrote:

Greg Smith wrote:

Where Josh got hung up, where I got hung up, where Lance Campbell stopped 
at with his Dummies tool, and what some unknown number of other people  
have been thwarted by, is that taking that knowledge and turning it into a 
tool useful to users is surprisingly difficult.  The reason for that is  
the current postgresql.conf file and how it maps internally to GUC  
information isn't particularly well suited to automated generation,  
analysis, or updates.


What I think this says is that we should be pushing Magnus more to
continue work on the configuration API thing he was designing.


There's some magic in pgAdmin to parse and write the file. You could 
take a look at that.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-05 Thread Jeff Davis
On Wed, 2008-06-04 at 06:35 +0200, Martijn van Oosterhout wrote:
> Check the archives for details on how it works precisely, but it's far
> nicer than merely adding an typmod, since that would cause you to throw
> errors at runtime if there's a problem.

Ok, that makes sense. I agree that any type mismatches should cause a
compile-time error. Thanks for the explanation.

Regards,
Jeff Davis




Re: [HACKERS] Overhauling GUCS

2008-06-05 Thread Alvaro Herrera
Greg Smith wrote:

> Where Josh got hung up, where I got hung up, where Lance Campbell stopped 
> at with his Dummies tool, and what some unknown number of other people  
> have been thwarted by, is that taking that knowledge and turning it into a 
> tool useful to users is surprisingly difficult.  The reason for that is  
> the current postgresql.conf file and how it maps internally to GUC  
> information isn't particularly well suited to automated generation,  
> analysis, or updates.

What I think this says is that we should be pushing Magnus more to
continue work on the configuration API thing he was designing.

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] [GENERAL] Fragments in tsearch2 headline

2008-06-05 Thread Teodor Sigaev
A couple of caveats: 


1. ts_headline testing was done with current cvs head where as
headline_with_fragments was done with postgres 8.3.1.
2. For headline_with_fragments, TSVector for the document was obtained
by joining with another table.
Are these differences understandable?


That is a possible situation, because ts_headline has several criteria for the 
'best' cover - length, number of words from the query, good words at the 
beginning and the end of the headline - while your fragment algorithm only 
considers the total number of words across all covers. It's not very good, but 
it's acceptable, I think. Headline (and ranking too) has no formal rules to 
define what is good or bad; it's just people's opinions.


Next possible reason: the original algorithm looked at all covers, trying to 
find the best one, while your algorithm just tries to find the shortest covers 
to fill a headline.


But honoring ShortWord is very desirable - it's not very comfortable for the 
user if one option produces a non-obvious side effect on another.



If you think these caveats are the reasons or there is something I am
missing, then I can repeat the entire experiments with exactly the same
conditions. 


An interesting test for me would be comparing hlCover with Cover in your patch, 
i.e. develop a version of the patch which uses hlCover instead of Cover and 
compare the old patch with the new one.

--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] Overhauling GUCS

2008-06-05 Thread Tom Lane
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes:
> A configuration wizard would be nice, but it would be a good start to 
> add a section to the manual on how to do the basic tuning.

+1.  If we can't write an explanation of what to do, we certainly aren't
going to be able to implement it in a wizard.  Agreeing on what to put
in the manual would also go a long way towards providing an
implementation spec for the wizard ...

regards, tom lane



Re: [HACKERS] phrase search

2008-06-05 Thread Teodor Sigaev

I can add index support and support for arbitrary distance between
lexeme. 
It appears to me that supporting arbitrary boolean expression will be

complicated. Can we pull out something from TSQuery?


I don't much like the idea of having a separate interface for phrase search. Your 
patch could be a module, used by people who really want phrase search.


Introducing a new operator in tsquery would allow reuse of the existing tsquery 
infrastructure, such as concatenation (&&, ||, !!), the rewrite subsystem, etc. 
New operations/types designed specially for phrase search would require doing 
all of that work over again.


--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] Overhauling GUCS

2008-06-05 Thread Martijn van Oosterhout
On Thu, Jun 05, 2008 at 01:23:53AM +0200, Rainer Bauer wrote:
> I think it would be an enormous help for beginners if they had a "simple"
> tuning tool which would tell them which values were altered (and possibly
> why) from Postgres' default settings based on some basic information.
> Like:
> - machine hardware (disk layout, OS, installed memory, etc.)
> - application usage (no. of clients, read/write activity, etc)

It would be possible to make a program that worked like:
# pg_autotune
Detected 4GB memory, 2.3GB free
Measured random_page_cost=2.3
Select expected usage: (d)edicated, (n)ormal, (t)iny
> t
Shared_buffers 64MB
Configuration stored to /etc/postgresql/8.3/postgresql.conf
#

This would probably solve most issues.

Have a nice day,
-- 
Martijn van Oosterhout   <[EMAIL PROTECTED]>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while 
> boarding. Thank you for flying nlogn airlines.




[HACKERS] ExecuteTruncate quirk: expects a unique list of relations

2008-06-05 Thread Nikhils
Hi,

Consider this simple case:

postgres=# TRUNCATE foo, foo;
ERROR:  cannot TRUNCATE "foo" because it is being used by active queries in
this session

The above occurs because the ExecuteTruncate() function invokes
truncate_check_rel() in a loop. Since the same table name appears twice, the
rd_refcnt for table "foo" is bumped up to 2, causing the above failure.

We might want to add a step to ExecuteTruncate(), or whatever calls it, to
make the list unique.

Regards,
Nikhils
-- 
EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] pg_dump restore time and Foreign Keys

2008-06-05 Thread Heikki Linnakangas

Simon Riggs wrote:

On Thu, 2008-06-05 at 16:01 +0300, Heikki Linnakangas wrote:
Well, one idea would be to allow adding multiple foreign keys in one 
command, and checking them all at once with one SQL query instead of one 
per foreign key. Right now we need one seq scan over the table per 
foreign key, by checking all references at once we would only need one 
seq scan to check them all.


No need. Just parallelise the restore with concurrent psql. Which would
speed up the index creation also.


True, you could do that.


 Does Greg have plans for further work?


I believe he's busy with other stuff at the moment.

Thinking about this idea a bit more, instead of loading the whole target 
table into memory, it would probably make more sense to keep a hash 
table as just a cache of the most recent keys that have been referenced.


If you can think of a way of improving hash joins generally, then it
will work for this specific case also.


Individual RI checks performed on inserts/COPY don't do a hash join. The 
bulk check done by ALTER TABLE ADD FOREIGN KEY does, but that's a 
different issue.


This hash table would be a specific trick to speed up RI checks. If 
you're anyway I/O bound, it wouldn't help, and you'd already be better 
off creating the foreign key first and loading the data after that.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] pg_dump restore time and Foreign Keys

2008-06-05 Thread Simon Riggs

On Thu, 2008-06-05 at 16:01 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Thu, 2008-06-05 at 10:19 +0300, Heikki Linnakangas wrote:
> >> Simon Riggs wrote:
> >>> I'm guessing that the WITHOUT CHECK option would not be acceptable as an
> >>> unprotected trap for our lazy and wicked users. :-)
> >> Yes, that sounds scary.
> >>
> >> Instead, I'd suggest finding ways to speed up the ALTER TABLE ADD 
> >> FOREIGN KEY. 
> > 
> > I managed a suggestion for improving it for integers only, but if
> > anybody has any other ideas, I'm all ears. 
> 
> Well, one idea would be to allow adding multiple foreign keys in one 
> command, and checking them all at once with one SQL query instead of one 
> per foreign key. Right now we need one seq scan over the table per 
> foreign key, by checking all references at once we would only need one 
> seq scan to check them all.

No need. Just parallelise the restore with concurrent psql. Which would
speed up the index creation also. Does Greg have plans for further work?

> >> Or speeding up COPY into a table with foreign keys already 
> >> defined. For example, you might want to build an in-memory hash table of 
> >> the keys in the target table, instead of issuing a query on each INSERT, 
> >> if the target table isn't huge.
> > 
> > No, that's not the problem, but I agree that is a problem also.
> 
> It is related, because if we can make COPY into a table with foreign 
> keys fast enough, we could rearrange dumps so that foreign keys are 
> created before loading data. That would save the seqscan over the table 
> altogether.

True.

> Thinking about this idea a bit more, instead of loading the whole target 
> table into memory, it would probably make more sense to keep a hash 
> table as just a cache of the most recent keys that have been referenced.

If you can think of a way of improving hash joins generally, then it
will work for this specific case also.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] pg_dump restore time and Foreign Keys

2008-06-05 Thread Heikki Linnakangas

Simon Riggs wrote:

On Thu, 2008-06-05 at 10:19 +0300, Heikki Linnakangas wrote:

Simon Riggs wrote:

I'm guessing that the WITHOUT CHECK option would not be acceptable as an
unprotected trap for our lazy and wicked users. :-)

Yes, that sounds scary.

Instead, I'd suggest finding ways to speed up the ALTER TABLE ADD 
FOREIGN KEY. 


I managed a suggestion for improving it for integers only, but if
anybody has any other ideas, I'm all ears. 


Well, one idea would be to allow adding multiple foreign keys in one 
command, and checking them all at once with one SQL query instead of one 
per foreign key. Right now we need one seq scan over the table per 
foreign key, by checking all references at once we would only need one 
seq scan to check them all.
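
As a sketch of what checking them all at once could look like at the SQL 
level (the orders/customers/products tables and their columns are 
hypothetical placeholders):

```sql
-- Hypothetical: validate two foreign keys on "orders" in a single pass
-- over the table, rather than one sequential scan per constraint.
SELECT 1
FROM orders o
LEFT JOIN customers c ON c.id = o.customer_id
LEFT JOIN products  p ON p.id = o.product_id
WHERE (o.customer_id IS NOT NULL AND c.id IS NULL)
   OR (o.product_id  IS NOT NULL AND p.id IS NULL)
LIMIT 1;  -- any row returned means some reference is dangling
```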


Or speeding up COPY into a table with foreign keys already 
defined. For example, you might want to build an in-memory hash table of 
the keys in the target table, instead of issuing a query on each INSERT, 
if the target table isn't huge.


No, that's not the problem, but I agree that is a problem also.


It is related, because if we can make COPY into a table with foreign 
keys fast enough, we could rearrange dumps so that foreign keys are 
created before loading data. That would save the seqscan over the table 
altogether.


Thinking about this idea a bit more, instead of loading the whole target 
table into memory, it would probably make more sense to keep a hash 
table as just a cache of the most recent keys that have been referenced.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] pg_dump restore time and Foreign Keys

2008-06-05 Thread Simon Riggs

On Thu, 2008-06-05 at 07:57 -0400, Andrew Dunstan wrote:
> 
> Simon Riggs wrote:
> > pg_dump restore times can be high when they include many ALTER TABLE ADD
> > FOREIGN KEY statements, since each statement checks the data to see if
> > it is fully valid in all cases.
> >
> > I've been asked "why we run that at all?", since if we dumped the tables
> > together, we already know they match.
> >
> > If we had a way of pg_dump passing on the information that the test
> > already passes, we would be able to skip the checks.
> >
> > Proposal:
> >
> > * Introduce a new mode for ALTER TABLE ADD FOREIGN KEY [WITHOUT CHECK];
> > When we run WITHOUT CHECK, iff both the source and target table are
> > newly created in this transaction, then we skip the check. If the check
> > is skipped we mark the constraint as being unchecked, so we can tell
> > later if this has been used.
> >
> > * Have pg_dump write the new syntax into its dumps, when both the source
> > and target table are dumped in same run
> >
> > I'm guessing that the WITHOUT CHECK option would not be acceptable as an
> > unprotected trap for our lazy and wicked users. :-)
> >   
> 
> This whole proposal would be a major footgun which would definitely be 
> abused, IMNSHO.

OK, understood. Two negatives is enough to sink it.

> I think Heikki's idea of speeding up the check using a hash table of the 
> foreign keys possibly has merit.

The query is sent through SPI, so if there was a way to speed this up,
we would already be using it implicitly. If we find a way to speed up
joins it will improve the FK check also.

The typical join plan for the check query is already a hash join,
assuming the target table is small enough. If not, it's a huge sort/merge
join. So in a way, we already follow the suggestion.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] pg_dump restore time and Foreign Keys

2008-06-05 Thread Heikki Linnakangas

Simon Riggs wrote:

Are you saying you don't like the rest of the proposal, or just don't
like the idea of having that added as an unprotected option, but find
the proposal acceptable? 


I don't like the idea of having an unprotected option. If we were going 
to have one, I wouldn't bother with the extra checks you proposed; it's 
going to be unsafe anyway.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] pg_dump restore time and Foreign Keys

2008-06-05 Thread Andrew Dunstan



Simon Riggs wrote:

pg_dump restore times can be high when they include many ALTER TABLE ADD
FOREIGN KEY statements, since each statement checks the data to see if
it is fully valid in all cases.

I've been asked "why we run that at all?", since if we dumped the tables
together, we already know they match.

If we had a way of pg_dump passing on the information that the test
already passes, we would be able to skip the checks.

Proposal:

* Introduce a new mode for ALTER TABLE ADD FOREIGN KEY [WITHOUT CHECK];
When we run WITHOUT CHECK, iff both the source and target table are
newly created in this transaction, then we skip the check. If the check
is skipped we mark the constraint as being unchecked, so we can tell
later if this has been used.

* Have pg_dump write the new syntax into its dumps, when both the source
and target table are dumped in same run

I'm guessing that the WITHOUT CHECK option would not be acceptable as an
unprotected trap for our lazy and wicked users. :-)
  


This whole proposal would be a major footgun which would definitely be 
abused, IMNSHO.


I think Heikki's idea of speeding up the check using a hash table of the 
foreign keys possibly has merit.


cheers

andrew



Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-06-05 Thread Gregory Stark
"Jeff Davis" <[EMAIL PROTECTED]> writes:

> On Wed, 2008-06-04 at 14:17 +0300, Heikki Linnakangas wrote:
>> > Would that also cover possible differences in page size, 32bit OS vs.
>> > 64bit OS, different timestamp flavour, etc. issues ? AFAIR, all these
>> > things can have an influence on how the data is written and possibly
>> > make the WAL incompatible with other postgres instances, even if the
>> > exact same version...
>> 
>> These are already covered by the information in pg_control.
>
> Another thing that can change between systems is the collation behavior,
> which can corrupt indexes (and other bad things).

Well, yes and no. It's entirely possible, for example, for a minor release of
an OS to tweak the collation rules for a collation without changing the name.
For the sake of argument they might just be fixing a bug in the collation
rules. From the point of view of the OS that's a minor bug fix that they might
not foresee causing data corruption problems.

Pegging pg_control to a particular release of the OS would be pretty terrible
though. I don't really see an out for this. But it's another roadblock to
consider akin to "not-really-immutable index expressions" for any proposal
which depends on re-finding index pointers :(

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!



Re: [HACKERS] pg_dump restore time and Foreign Keys

2008-06-05 Thread Richard Huxton

Simon Riggs wrote:


If we had a way of pg_dump passing on the information that the test
already passes, we would be able to skip the checks.

Proposal:

* Introduce a new mode for ALTER TABLE ADD FOREIGN KEY [WITHOUT CHECK];



* Have pg_dump write the new syntax into its dumps, when both the source
and target table are dumped in the same run


I've been known to manually tweak dumps before now. I can see myself
forgetting this.

What about pg_dump writing out a row-count and an MD5 of the rows in the 
COPY (just a textual calculation)? Iff the restore checksum matches the 
dump checksum for both tables, then the foreign-key checks can be skipped.


If the restore checksum doesn't match the dump then it can issue a 
warning, but continue and run the full fkey check.
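To make the idea concrete, the dump/restore checksum could be computed
roughly like this (a hypothetical Python sketch for illustration, not
pg_dump code; the function name and data layout are assumptions): count
the rows and run MD5 over the textual COPY data at dump time, redo the
same calculation at restore time, and skip the foreign-key check only
when both sides agree.

```python
import hashlib

def copy_checksum(rows):
    """Row count plus MD5 over the textual COPY representation.

    `rows` is an iterable of the text lines as they appear in the
    COPY block of the dump (a purely textual calculation, as proposed).
    """
    h = hashlib.md5()
    count = 0
    for line in rows:
        h.update(line.encode("utf-8"))
        h.update(b"\n")
        count += 1
    return count, h.hexdigest()

# At dump time, over the lines written into the dump:
dumped = copy_checksum(["1\talice", "2\tbob"])

# At restore time, over the rows actually loaded:
restored = copy_checksum(["1\talice", "2\tbob"])

# Skip the foreign-key check only if both tables match;
# otherwise warn and fall back to the full check.
skip_fkey_check = (dumped == restored)
```

The comparison is deliberately all-or-nothing: any mismatch in either
row count or checksum falls back to the normal, safe validation path.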


--
  Richard Huxton
  Archonet Ltd



Re: [HACKERS] Overhauling GUCS

2008-06-05 Thread Greg Smith

On Thu, 5 Jun 2008, Heikki Linnakangas wrote:

A configuration wizard would be nice, but it would be a good start to add a 
section to the manual on how to do the basic tuning. AFAICS we don't have 
one. Clear instructions on how to set the few most important settings like 
shared_buffers and checkpoint_timeout/segments would probably be enough, with 
a link to the main configuration section that explains the rest of the 
settings.


It hasn't gelled yet but I'm working on that.  Most of the text needed is 
now linked to at http://wiki.postgresql.org/wiki/Performance_Optimization


I already talked with Chris Browne about merging his document (the first 
in that list) with useful pieces from some of mine into one more 
comprehensive document on the wiki, covering everything you mention here. 
If we took a snapshot of that when it's done and dumped it into the 
manual, I don't think that would be a problem to wrap up before 8.4 is 
done.  I'd like to include a link to the above performance page in that 
section of the manual as well, both so that people are more likely to find 
fresh content and to give them pointers toward more resources than the 
manual can possibly cover.


If people don't read the manual, we can add a link to it from 
postgresql.conf.sample, add a screen to the Windows installer suggesting 
to read it, or even open postgresql.conf in Notepad.


They don't.  Putting pointers toward a relatively simple performance 
tuning document a bit more in people's faces might help lower some of the 
criticism the project takes over providing low defaults for so many 
things.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] pg_dump restore time and Foreign Keys

2008-06-05 Thread Simon Riggs

On Thu, 2008-06-05 at 10:19 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > I'm guessing that the WITHOUT CHECK option would not be acceptable as an
> > unprotected trap for our lazy and wicked users. :-)
> 
> Yes, that sounds scary.
> 
> Instead, I'd suggest finding ways to speed up the ALTER TABLE ADD 
> FOREIGN KEY. 

I managed to come up with a suggestion for improving it, for integers
only; if anybody has any other ideas, I'm all ears. 

> Or speeding up COPY into a table with foreign keys already 
> defined. For example, you might want to build an in-memory hash table of 
> the keys in the target table, instead of issuing a query on each INSERT, 
> if the target table isn't huge.

No, that's not the problem, but I agree that is a problem also.

> Nothing beats the speed of simply not checking the constraint, of 
> course, but I'd hate to lose the protection it gives.

Are you saying you don't like the rest of the proposal, or just don't
like the idea of having that added as an unprotected option, but find
the proposal acceptable? 

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] pg_dump restore time and Foreign Keys

2008-06-05 Thread Heikki Linnakangas

Simon Riggs wrote:

I'm guessing that the WITHOUT CHECK option would not be acceptable as an
unprotected trap for our lazy and wicked users. :-)


Yes, that sounds scary.

Instead, I'd suggest finding ways to speed up the ALTER TABLE ADD 
FOREIGN KEY. Or speeding up COPY into a table with foreign keys already 
defined. For example, you might want to build an in-memory hash table of 
the keys in the target table, instead of issuing a query on each INSERT, 
if the target table isn't huge.
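The hash-table approach might look roughly like this (a Python sketch of
the idea only, not backend code; the function and names are illustrative
assumptions): read the referenced keys once into a set, then validate
each incoming row against it instead of issuing a lookup query per row.

```python
def check_foreign_keys(referenced_keys, incoming_rows, key_col):
    """Validate incoming rows against an in-memory set of referenced keys.

    referenced_keys: key values from the target (PK) table, read once
    up front instead of being queried for every inserted row.
    """
    key_set = set(referenced_keys)  # one pass over the target table
    violations = []
    for row in incoming_rows:
        k = row[key_col]
        if k is not None and k not in key_set:  # NULL FK values pass
            violations.append(row)
    return violations

pk_values = [1, 2, 3]
rows = [{"id": 10, "ref": 1},
        {"id": 11, "ref": 4},      # references a missing key
        {"id": 12, "ref": None}]
bad = check_foreign_keys(pk_values, rows, "ref")
```

Each per-row check is then an O(1) set lookup, which is why this only
pays off when the target table's keys fit comfortably in memory.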


Nothing beats the speed of simply not checking the constraint, of 
course, but I'd hate to lose the protection it gives.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
