Re: [HACKERS] autonomous transactions
On Jan 25, 2008, at 7:27 AM, Decibel! wrote:
> On Wed, Jan 23, 2008 at 05:50:02PM -0500, Tom Lane wrote:
>> Simon Riggs <[EMAIL PROTECTED]> writes:
>>>> From looking at how Oracle does them, autonomous transactions are
>>>> completely independent of the transaction that originates them -- they
>>>> take a new database snapshot. This means that uncommitted changes in the
>>>> originating transaction are not visible to the autonomous transaction.
>>>
>>> Oh! Recursion depth would need to be tested for as well. Nasty.
>>
>> Seems like the cloning-a-session idea would be a possible implementation
>> path for these too.
>
> Oracle has a feature where you can effectively save a session and return
> to it. For example, if filling out a multi-page web form, you could save
> state in the database between those calls. I'm assuming that they use
> that capability for their autonomous transactions; save the current
> session to the stack, clone it, run the autonomous transaction, then
> restore the saved one.

If you want to use it for web forms you cannot just put it on the stack -- you have to put it in shared memory, because you don't know whether you will ever get the same database connection back from the pool.

Personally, I like Marko's idea. If a snapshot were identified by a key it would be perfect. We could present the saved snapshots as a nice superuser-readable system view (similar to what we do for 2PC). The only thing I would add is some sort of timeout (configurable) for those snapshots; otherwise we will get countless VACUUM-related reports.

This sounds like a very cool feature -- definitely useful.

many thanks,

hans

--
Cybertec Schönig & Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at
Re: [HACKERS] Transition functions for SUM(::int2), SUM(::int4), SUM(::int8)
Caleb Welton <[EMAIL PROTECTED]> writes:
> Is there any reason that int2_sum, int4_sum, and int8_sum are not marked as
> being strict?

They wouldn't work otherwise, because the transition datatypes aren't the same as the inputs.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
[HACKERS] Transition functions for SUM(::int2), SUM(::int4), SUM(::int8)
Is there any reason that int2_sum, int4_sum, and int8_sum are not marked as being strict? All the other transition functions for sum, and every other built-in aggregate function, are marked as strict, as demonstrated with:

select x.proname, t.proname, t.proisstrict
from ((pg_aggregate a left join pg_proc x on (a.aggfnoid = x.oid))
  left join pg_proc t on (a.aggtransfn = t.oid))
where not t.proisstrict;

 proname | proname  | proisstrict
---------+----------+-------------
 sum     | int2_sum | f
 sum     | int4_sum | f
 sum     | int8_sum | f

select x.proname, t.proname, t.proisstrict
from ((pg_aggregate a left join pg_proc x on (a.aggfnoid = x.oid))
  left join pg_proc t on (a.aggtransfn = t.oid))
where x.proname = 'sum';

 proname | proname     | proisstrict
---------+-------------+-------------
 sum     | int8_sum    | f
 sum     | int4_sum    | f
 sum     | int2_sum    | f
 sum     | float4pl    | t
 sum     | float8pl    | t
 sum     | cash_pl     | t
 sum     | interval_pl | t
 sum     | numeric_add | t
(8 rows)

Thanks,
  Caleb
Re: [HACKERS] RFC: array_agg() per SQL:200n
Yet another inverse function I wrote earlier, though it applies only to one-dimensional arrays.

typedef struct _enuminfo
{
    ArrayType  *data;
    char       *ptr;
    int16       typlen;
    bool        typbyval;
    char        typalign;
} EnumInfo;

Datum
array_enum(PG_FUNCTION_ARGS)
{
    FuncCallContext *funcctx;
    MemoryContext    oldcontext;
    ArrayType       *input;
    EnumInfo        *info;
    Datum            result;

    if (SRF_IS_FIRSTCALL())
    {
        funcctx = SRF_FIRSTCALL_INIT();
        oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);

        input = PG_GETARG_ARRAYTYPE_P(0);
        if (ARR_NDIM(input) != 1)
            elog(ERROR, "array_enum() accepts only one-dimensional arrays");

        funcctx->max_calls = ArrayGetNItems(ARR_NDIM(input), ARR_DIMS(input));

        info = (EnumInfo *) palloc0(sizeof(EnumInfo));
        info->data = (ArrayType *) PG_DETOAST_DATUM_COPY(input);
        info->ptr = ARR_DATA_PTR(info->data);
        get_typlenbyvalalign(info->data->elemtype,
                             &(info->typlen),
                             &(info->typbyval),
                             &(info->typalign));
        funcctx->user_fctx = info;

        MemoryContextSwitchTo(oldcontext);
    }

    funcctx = SRF_PERCALL_SETUP();
    info = funcctx->user_fctx;

    if (funcctx->call_cntr < funcctx->max_calls)
    {
        /* Get source element */
        result = fetch_att(info->ptr, info->typbyval, info->typlen);

        info->ptr = att_addlength(info->ptr, info->typlen,
                                  PointerGetDatum(info->ptr));
        info->ptr = (char *) att_align(info->ptr, info->typalign);

        SRF_RETURN_NEXT(funcctx, result);
    }
    else
        SRF_RETURN_DONE(funcctx);
}

Hitoshi Harada

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Joe Conway
> Sent: Tuesday, January 29, 2008 11:00 AM
> To: Jeff Davis
> Cc: Neil Conway; pgsql-hackers
> Subject: Re: [HACKERS] RFC: array_agg() per SQL:200n
>
> Jeff Davis wrote:
> > On Sun, 2008-01-27 at 22:11 -0800, Neil Conway wrote:
> >> p. 564 discusses the required behavior. The result of array_agg() is an
> >> array with one element per input value, sorted according to the optional
> >> ORDER BY clause. NULL input values are included in the array, and the
> >> result for an empty group is NULL, not an empty array.
> >> Note that per
> >> page 66, I'd expect array values in the input to array_agg() not to be
> >> flattened.
> >
> > Should there be an inverse operator (a SRF, in this case) that returns a
> > set from an array?
>
> Yes -- see UNNEST
>
> Joe
[HACKERS] Bogus cleanup code in GSSAPI/SSPI patch
Hi Magnus,

Would you take a look at the patch I just committed in fe-connect.c? I found out today that PQreset() wasn't working on a GSSAPI connection, because closePGconn hadn't been patched to clear out the GSSAPI state (resulting in "duplicate GSS authentication request" failure). I think I fixed that, but it wouldn't be bad for someone who knows that code better to double-check. Also, I can't test the ENABLE_SSPI case, so that needs to be looked at.

The test case I've been using is to "select 1;" in psql, then kill -9 the backend from another window, then "select 1;" again. psql should recover and reconnect successfully.

regards, tom lane
Re: [HACKERS] RFC: array_agg() per SQL:200n
Jeff Davis wrote:
> On Sun, 2008-01-27 at 22:11 -0800, Neil Conway wrote:
>> p. 564 discusses the required behavior. The result of array_agg() is an
>> array with one element per input value, sorted according to the optional
>> ORDER BY clause. NULL input values are included in the array, and the
>> result for an empty group is NULL, not an empty array. Note that per
>> page 66, I'd expect array values in the input to array_agg() not to be
>> flattened.
>
> Should there be an inverse operator (a SRF, in this case) that returns a
> set from an array?

Yes -- see UNNEST

Joe
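The spec semantics Neil describes (NULLs kept in the result, empty group yielding NULL rather than an empty array) can be modeled outside SQL. This is an illustrative Python sketch, not PostgreSQL code, showing why UNNEST is only a one-sided inverse of array_agg(): an empty group does not round-trip.

```python
def array_agg(values):
    """SQL:200n array_agg: one element per input, NULLs (None) included;
    an empty group yields NULL (None), not an empty array."""
    vals = list(values)
    return vals if vals else None

def unnest(arr):
    """Inverse SRF: expand an array into a set of rows; a NULL input
    produces no rows."""
    return [] if arr is None else list(arr)

print(array_agg([1, None, 3]))   # [1, None, 3] -- NULL inputs are kept
print(array_agg([]))             # None -- NULL, not []
print(unnest(array_agg([])))     # [] -- the empty group does not round-trip
```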
Re: [HACKERS] find_typedef alternative that works on mainstream systems
Alvaro Herrera wrote:
> Alvaro Herrera wrote:
>
>> objdump -W $object_file | \
>>   awk '/DW_TAG_/ { grab=0 } /DW_TAG_typedef/ { grab=1 } /DW_AT_name/ { if (grab) { print $0 } }' | \
>>   sed -e 's/^.*: \([^ ]*\)/\1/' | \
>>   sort | \
>>   uniq
>
> I oversimplified the awk line, causing some garbage to appear at the end
> :-(  The full awk line I am using is
>
> awk '
>   /^Contents of / { if (read) exit }
>   /^The section / { if (read) exit }
>   /^The section .debug_info contains:/ { read=1 }
>   /DW_TAG_/ { grab=0 }
>   /DW_TAG_typedef/ { grab=1 }
>   /DW_AT_name/ { if (read && grab) { print $0 } }
> '

objdump errors out on the -W option here (returns -1), so put some conditional logic in tools/find_typedef and we should be fine.

--
  Bruce Momjian  <[EMAIL PROTECTED]>        http://momjian.us
  EnterpriseDB                             http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
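For illustration, Alvaro's awk script is a small state machine: only look inside the .debug_info section, arm on DW_TAG_typedef, disarm on any other tag, and collect the next DW_AT_name. A hypothetical Python equivalent (not part of tools/find_typedef; the sample input is made up) shows the same logic:

```python
def scan_typedefs(objdump_lines):
    """Collect typedef names from `objdump -W` output: only within the
    .debug_info section, and only DW_AT_name entries that follow a
    DW_TAG_typedef with no other DW_TAG_ in between."""
    read = grab = False
    names = []
    for line in objdump_lines:
        if line.startswith(("Contents of ", "The section ")):
            if read:
                break                 # left .debug_info; stop scanning
        if line.startswith("The section .debug_info contains:"):
            read = True
        if "DW_TAG_" in line:
            grab = False              # any tag disarms...
        if "DW_TAG_typedef" in line:
            grab = True               # ...except a typedef tag
        if "DW_AT_name" in line and read and grab:
            names.append(line.rsplit(": ", 1)[-1].strip())
    return sorted(set(names))

# Made-up objdump-like output for demonstration:
sample = [
    "The section .debug_info contains:",
    " DW_TAG_typedef",
    "   DW_AT_name : Oid",
    " DW_TAG_base_type",
    "   DW_AT_name : int",
    "The section .debug_str contains:",
]
print(scan_typedefs(sample))   # ['Oid']
```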
Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable
Jeff Davis wrote:
> On Mon, 2008-01-28 at 23:13 +0000, Heikki Linnakangas wrote:
>> "clusteredness" didn't get screwed up by a table that looks like this:
>> "5 6 7 8 9 1 2 3 4"
>
> ...test table with a similar
> distribution to your example, and it shows a correlation of about -0.5,
> but it should be as good as something near -1 or +1.
>
> I am not a statistics expert, but it seems like a better measurement
> would be: "what is the chance that, if the tuples are close together in
> index order, the corresponding heap tuples are close together?".

The same applies to data clustered by zip code. All rows for any state, city, county, or school zone are "close together" on the same pages, yet postgres's stats think they're totally unclustered.

> The answer to that question in your example is "very likely", so there
> would be no problem.
> Is there a reason we don't do this?

I've been tempted to do things like

  update pg_statistic set stanumbers3='{1.0}' where starelid=2617 and staattnum=7;

after every analyze when I have data like this from tables clustered by zip. It seems like it'd help more plans than it hurts, but I haven't been brave enough to try it in production.
Re: [HACKERS] Truncate Triggers
Decibel! wrote:
> On Fri, Jan 25, 2008 at 11:40:19AM +0000, Simon Riggs wrote:
>> (for 8.4 ...)
>> I'd like to introduce triggers that fire when we issue a truncate:
>
> Rather than focusing exclusively on TRUNCATE, how about "triggers" that
> fire whenever any kind of DDL operation is performed? (Ok, truncate is
> more DML than DDL, but still).

I don't think it makes sense in general. For example, would we fire triggers on CLUSTER? Or on ALTER TABLE / SET STATISTICS? TRUNCATE seems a special case that needs it.

--
Alvaro Herrera                        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Strange locking choices in pg_shdepend.c
Tom Lane wrote:
> Decibel! <[EMAIL PROTECTED]> writes:
>> Would it be worthwhile to allow for logging when a lock gets upgraded?
>> That would make it easier to protect against deadlocks...
>
> There is some debug code for that in the backend, but my experience
> is that it's too noisy to have on by default.

Seems a good idea to separate the upgrade bits from the other stuff and enable it on --enable-cassert or something similar. TODO for 8.4?

--
Alvaro Herrera                        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable
On Mon, 2008-01-28 at 23:13 +0000, Heikki Linnakangas wrote:
> Tables that are seq scanned are typically very small, like a summary
> table with just a few rows, or huge tables in a data warehousing
> system. Between the extremes, I don't think the threshold actually has
> a very big impact.

And if you have a partitioned table with partitions inconveniently sized? You'd need to *reduce* shared_buffers specifically to get synch scans and BAS to kick in, or increase the partition size. Both of these reduce the impact of the benefits we've added.

I don't think the argument that "a table smaller than shared buffers is therefore already in shared buffers" holds true in all cases. I/O does matter.

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com
Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable
On Mon, 2008-01-28 at 23:13 +0000, Heikki Linnakangas wrote:
> It's a good point that we don't want pg_dump to screw up the cluster
> order, but that's the only use case I've seen this far for disabling
> sync scans. Even that wouldn't matter much if our estimate for
> "clusteredness" didn't get screwed up by a table that looks like this:
> "5 6 7 8 9 1 2 3 4"

It doesn't seem like there is any reason for the estimate to get confused, but it apparently does. I loaded a test table with a similar distribution to your example, and it shows a correlation of about -0.5, but it should be as good as something near -1 or +1.

I am not a statistics expert, but it seems like a better measurement would be: "what is the chance that, if the tuples are close together in index order, the corresponding heap tuples are close together?". The answer to that question in your example is "very likely", so there would be no problem. Is there a reason we don't do this?

Regards,
  Jeff Davis
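Jeff's -0.5 observation is easy to reproduce: the Pearson correlation between heap position and value for exactly that table shape works out to -0.5, even though the data is nearly perfectly ordered within each of its two runs. A quick illustrative check in Python:

```python
def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Heikki's example: values in heap order "5 6 7 8 9 1 2 3 4"
values = [5, 6, 7, 8, 9, 1, 2, 3, 4]
positions = list(range(1, len(values) + 1))

print(pearson(positions, values))   # -0.5
```

So a linear-correlation statistic genuinely cannot distinguish this "two ordered runs" layout from a half-scrambled table, which is exactly the weakness being discussed.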
Re: [HACKERS] [PATCHES] Better default_statistics_target
On Dec 6, 2007 6:28 PM, Decibel! <[EMAIL PROTECTED]> wrote:
> FWIW, I've never seen anything but a performance increase or no change
> when going from 10 to 100. In most cases there's a noticeable
> improvement since it's common to have over 100k rows in a table, and
> there's just no way to capture any kind of a real picture of that with
> only 10 buckets.

I'd be more inclined to try to do something that was at least somewhat data-aware. The "interesting theory" that I'd like to verify if I had a chance would be to run through a by-column tuning using a set of heuristics. My "first order approximation" would be:

- If a column defines a unique key, then we know there will be no clustering of values, so no need to increase the count...
- If a column contains a datestamp, then the distribution of values is likely to be temporal, so no need to increase the count...
- If a column has a highly constricted set of values (e.g. boolean), then we might *decrease* the count.
- We might run a query across the table, looking at frequencies of values, and if it finds a lot of repeated values, we'd increase the count.

That's a bit "hand-wavy," but it could lead to both increases and decreases in the histogram sizes. Given that, we can expect that the overall stats sizes won't need to grow *enormously*, because we can hope for cases of shrinkage.

--
http://linuxfinances.info/info/linuxdistributions.html
"The definition of insanity is doing the same thing over and over and expecting different results."
-- assortedly attributed to Albert Einstein, Benjamin Franklin, Rita Mae Brown, and Rudyard Kipling
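The heuristics above could be sketched as a single per-column decision function. Everything here is invented for illustration -- the function name, the thresholds, and the inputs are assumptions, not a proposal for actual ANALYZE code:

```python
def suggest_stats_target(is_unique, is_timestamp, n_distinct_ratio, default=10):
    """Hypothetical per-column heuristic for a statistics target.
    n_distinct_ratio: fraction of rows holding distinct values (1.0 = all unique).
    Thresholds are placeholders, chosen only to show the shape of the idea."""
    if is_unique:
        return default                  # unique key: no value clustering to model
    if is_timestamp:
        return default                  # datestamp: distribution likely temporal
    if n_distinct_ratio < 0.001:
        return max(2, default // 2)     # near-boolean column: shrink the histogram
    if n_distinct_ratio < 0.5:
        return default * 10             # many repeated values: grow the histogram
    return default

print(suggest_stats_target(True, False, 1.0))       # 10 (unique key, leave alone)
print(suggest_stats_target(False, False, 0.0001))   # 5  (near-boolean, shrink)
print(suggest_stats_target(False, False, 0.1))      # 100 (repeated values, grow)
```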
Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable
Simon Riggs wrote:
> On Mon, 2008-01-28 at 16:21 -0500, Tom Lane wrote:
>> Simon Riggs <[EMAIL PROTECTED]> writes:
>>> Rather than having a boolean GUC, we should have a number and make the
>>> parameter "synchronised_scan_threshold".
>>
>> This would open up a can of worms I'd prefer not to touch, having to do
>> with whether the buffer-access-strategy behavior should track that or
>> not. As the note in heapam.c says,
>>
>>  * If the table is large relative to NBuffers, use a bulk-read access
>>  * strategy and enable synchronized scanning (see syncscan.c). Although
>>  * the thresholds for these features could be different, we make them the
>>  * same so that there are only two behaviors to tune rather than four.
>>
>> It's a bit late in the cycle to be revisiting that choice. Now we do
>> already have three behaviors to worry about (BAS on and syncscan off),
>> but throwing in a randomly settable knob will take it back to four,
>> and we have no idea how that fourth case will behave. The other tack we
>> could take (having the one GUC variable control both thresholds) is
>> not good since it will result in pg_dump trashing the buffer cache.
>
> OK, good points.
>
> I'm still concerned that the thresholds get higher as we increase
> shared_buffers. We may be removing performance features as fast as we
> gain performance when we set shared_buffers higher.
>
> Might we agree that the threshold should be fixed at 8MB, rather than
> varying upwards as we try to tune?

Synchronized scans, and the bulk-read strategy, don't help if the table fits in cache. If it fits in shared buffers, you're better off keeping it there than swapping pages between the OS cache and shared buffers, or spending any effort synchronizing scans. That's why we agreed back then that the threshold should be X% of shared_buffers.

It's a good point that we don't want pg_dump to screw up the cluster order, but that's the only use case I've seen thus far for disabling sync scans. Even that wouldn't matter much if our estimate for "clusteredness" didn't get screwed up by a table that looks like this: "5 6 7 8 9 1 2 3 4"

Now, maybe there are more use cases where you'd want to tune the threshold, but I'd like to see some before we add more knobs.

To benefit from a lower threshold, you'd need a table large enough that its cache footprint matters, but still smaller than 25% of shared_buffers, with seq scans on it. In that scenario, you might benefit from a lower threshold, because it would leave some shared_buffers free for other use. Even that is quite hand-wavy; the buffer cache LRU algorithm handles that kind of scenario reasonably well already.

To benefit from a larger threshold, you'd need a table larger than 25% of shared_buffers, but still smaller than shared_buffers, that you seq scan often enough that you want to keep it in shared buffers. If you're frequently seq scanning a table of that size, you're most likely suffering from a bad plan. Even then, the performance difference shouldn't be that great; the table surely fits in OS cache anyway, with typical shared_buffers settings.

Tables that are seq scanned are typically very small, like a summary table with just a few rows, or huge tables in a data warehousing system. Between the extremes, I don't think the threshold actually has a very big impact.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] CLUSTER and synchronized scans and pg_dump et al
On Sun, 2008-01-27 at 13:37 -0500, Tom Lane wrote:
> Also, does anyone object to making pg_dump just disable it
> unconditionally? Greg's original gripe only mentioned the case of
> clustered tables, but it'd be kind of a pain to make pg_dump turn it
> on and off again for different tables. And I could see people
> complaining about pg_dump failing to preserve row order even in
> unclustered tables.

If you are running pg_dump, that increases the likelihood that multiple sequential scans will be reading the same large table at the same time. Sync scans prevent that additional scan from bringing your active database to a halt during your dump (due to horrible seeking and poor cache efficiency). I think that pg_dump is a good use case for synchronized scans.

Assuming it doesn't hold up 8.3, I think it's worthwhile to consider only disabling it for tables that have been clustered. That being said, this isn't a strong objection. Having a painless 8.3 release is the top priority, of course.

Regards,
  Jeff Davis
Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable
On Mon, 2008-01-28 at 16:21 -0500, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
>> Rather than having a boolean GUC, we should have a number and make the
>> parameter "synchronised_scan_threshold".
>
> This would open up a can of worms I'd prefer not to touch, having to do
> with whether the buffer-access-strategy behavior should track that or
> not. As the note in heapam.c says,
>
>  * If the table is large relative to NBuffers, use a bulk-read access
>  * strategy and enable synchronized scanning (see syncscan.c). Although
>  * the thresholds for these features could be different, we make them the
>  * same so that there are only two behaviors to tune rather than four.
>
> It's a bit late in the cycle to be revisiting that choice. Now we do
> already have three behaviors to worry about (BAS on and syncscan off),
> but throwing in a randomly settable knob will take it back to four,
> and we have no idea how that fourth case will behave. The other tack we
> could take (having the one GUC variable control both thresholds) is
> not good since it will result in pg_dump trashing the buffer cache.

OK, good points.

I'm still concerned that the thresholds get higher as we increase shared_buffers. We may be removing performance features as fast as we gain performance when we set shared_buffers higher.

Might we agree that the threshold should be fixed at 8MB, rather than varying upwards as we try to tune? The objective of having a tuning hook would have been simply to normalise the effects of increasing shared_buffers anyway, so fixing it would solve the problem I see. (8MB is the default, based upon 25% of 32MB.)

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com
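Simon's concern is plain arithmetic: with the threshold tied to 25% of shared_buffers, every increase in shared_buffers raises the bar for sync scans in lockstep. An illustrative calculation (sizes in MB, not actual backend code):

```python
def syncscan_threshold_mb(shared_buffers_mb, fraction=0.25):
    """Table size above which sync scans / bulk-read strategy kick in,
    when the threshold tracks a fixed fraction of shared_buffers."""
    return shared_buffers_mb * fraction

# The 8MB figure Simon cites is 25% of the old 32MB shared_buffers default:
print(syncscan_threshold_mb(32))     # 8.0
# ...but tune shared_buffers up to 2GB and a 500MB table no longer qualifies:
print(syncscan_threshold_mb(2048))   # 512.0
```

Hence the trade-off in the thread: a fixed 8MB threshold keeps the features active as shared_buffers grows, at the cost of enabling them for tables that would comfortably fit in cache.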
Re: [HACKERS] CLUSTER and synchronized scans and pg_dump et al
On Sun, 2008-01-27 at 15:02 +0000, Gregory Stark wrote:
> It occurred to me the other day that synchronized scans could play havoc with
> clustered tables. When you dump and reload a table, even if it was recently
> clustered, if any other sequential scans are happening in the system at the
> time you dump it, the dump could shuffle the records out of order.
>
> Now the records would still be effectively ordered for most purposes, but our
> statistics can't detect that. Since the correlation would be poor, the restored
> database would have markedly different statistics, showing virtually no
> correlation on the clustered column.
>
> Perhaps we should have some form of escape hatch for pg_dump to request real
> physical order when dumping clustered tables.

Thank you for bringing this up, it's an interesting point.

Keep in mind that this only matters if you are actually running pg_dump concurrently with another scan, because a scan will reset the starting point after completing.

Regards,
  Jeff Davis
Re: [HACKERS] CLUSTER and synchronized scans and pg_dump et al
On Sun, 2008-01-27 at 12:45 -0500, Tom Lane wrote:
> Maybe a GUC variable to enable/disable syncscan?

The first iterations of the patch included a GUC. I don't have any objection to re-introducing a GUC to enable/disable it. However, I would suggest that it default to "on", because:

1. There aren't many cases where you'd want it to be off, and this particular case with pg_dump is the best one that I've heard of (thanks Greg). We want people who install 8.3 to see a boost without lots of tuning, if possible.
2. It only turns on for tables over 25% of shared buffers anyway.

Introducing GUCs reintroduces the same questions that were discussed before:

1. Should the 25% figure be tunable as well?
2. Remember that the 25% figure is also tied to Simon and Heikki's buffer recycling patch (buffer ring patch). Should they be configurable independently? Should they be tied together, but configurable?

The simplest solution, in my opinion, is something like:

  large_scan_threshold = 0.25  # set to -1 to disable

where a scan of any table larger than (large_scan_threshold * shared_buffers) employs both synchronized scans and buffer recycling. We may implement other large scan strategies in the future.

Regards,
  Jeff Davis
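Jeff's proposed knob amounts to one comparison. This is a sketch of the proposed semantics only (the parameter name comes from his mail, -1 meaning "disabled"), not actual backend code:

```python
def use_large_scan_strategy(table_pages, shared_buffers_pages,
                            large_scan_threshold=0.25):
    """Proposed semantics: a single GUC gates both synchronized scans
    and buffer recycling; a negative value disables them entirely."""
    if large_scan_threshold < 0:
        return False
    return table_pages > large_scan_threshold * shared_buffers_pages

print(use_large_scan_strategy(1000, 16384))        # False (under 25% of buffers)
print(use_large_scan_strategy(8192, 16384))        # True  (over 25%)
print(use_large_scan_strategy(8192, 16384, -1))    # False (feature disabled)
```

The design choice debated in the thread is whether the two behaviors should share this one threshold (two tunable behaviors) or get independent GUCs (four combinations to test).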
Re: [HACKERS] RFC: array_agg() per SQL:200n
On Sun, 2008-01-27 at 22:11 -0800, Neil Conway wrote:
> p. 564 discusses the required behavior. The result of array_agg() is an
> array with one element per input value, sorted according to the optional
> ORDER BY clause. NULL input values are included in the array, and the
> result for an empty group is NULL, not an empty array. Note that per
> page 66, I'd expect array values in the input to array_agg() not to be
> flattened.

Should there be an inverse operator (a SRF, in this case) that returns a set from an array?

Regards,
  Jeff Davis
Re: [HACKERS] autonomous transactions
> On Wed, Jan 23, 2008 at 05:50:02PM -0500, Tom Lane wrote:
>> Simon Riggs <[EMAIL PROTECTED]> writes:
>>>> From looking at how Oracle does them, autonomous transactions are
>>>> completely independent of the transaction that originates them -- they
>>>> take a new database snapshot. This means that uncommitted changes in the
>>>> originating transaction are not visible to the autonomous transaction.
>>>
>>> Oh! Recursion depth would need to be tested for as well. Nasty.
>>
>> Seems like the cloning-a-session idea would be a possible implementation
>> path for these too.
>
> Oracle has a feature where you can effectively save a session and return
> to it. For example, if filling out a multi-page web form, you could save
> state in the database between those calls. I'm assuming that they use
> that capability for their autonomous transactions; save the current
> session to the stack, clone it, run the autonomous transaction, then
> restore the saved one.

You are describing an uncommitted transaction, not an autonomous transaction. Transactions in Oracle are not automatically committed like they are in PostgreSQL. Here is a basic example of an autonomous transaction:

create or replace procedure pr_log_error (p_error_message errorlog.message%type)
is
    pragma autonomous_transaction;
begin
    insert into errorlog (log_user, log_time, error_message)
    values (user, sysdate(), p_error_message);
    commit;
exception
    when others then
        rollback;
        raise;
end;

And then you can call it from a procedure like this:

create or replace procedure pr_example
is
begin
    null;     -- do some work
    commit;   -- commit the work
exception
    when others then
        pr_log_error(p_error_message => sqlerrm);
        rollback;
        raise;
end;

The autonomous transaction allows me to insert and commit a record in a different transaction than the calling procedure, so the calling procedure can still roll back or commit.
You can also remove the commit/rollback from pr_example and instead do it from the anonymous block that calls it. I just added it to make it clear that it is a different transaction than the error-logging transaction.

Jon
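The behavior Jon's PL/SQL demonstrates -- the logger's commit surviving the caller's rollback -- can be modeled with a toy transaction layer. This is pure illustration; nothing below is PostgreSQL or Oracle API, just a minimal sketch of commit independence:

```python
class ToyDB:
    """Minimal model: committed rows per table; each Txn buffers its writes."""
    def __init__(self):
        self.tables = {"work": [], "errorlog": []}

class Txn:
    def __init__(self, db):
        self.db, self.pending = db, []
    def insert(self, table, row):
        self.pending.append((table, row))    # buffered until commit
    def commit(self):
        for table, row in self.pending:
            self.db.tables[table].append(row)
        self.pending = []
    def rollback(self):
        self.pending = []

def pr_log_error(db, message):
    """Autonomous: runs in its own Txn and commits immediately,
    independent of the caller's still-open transaction."""
    autonomous = Txn(db)
    autonomous.insert("errorlog", message)
    autonomous.commit()

db = ToyDB()
outer = Txn(db)
outer.insert("work", "row that will be rolled back")
pr_log_error(db, "something failed")   # commits in its own transaction
outer.rollback()                       # caller's work is undone...

print(db.tables["errorlog"])   # ['something failed'] -- the log row survived
print(db.tables["work"])       # []
```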
Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable
Simon Riggs <[EMAIL PROTECTED]> writes:
> Rather than having a boolean GUC, we should have a number and make the
> parameter "synchronised_scan_threshold".

This would open up a can of worms I'd prefer not to touch, having to do with whether the buffer-access-strategy behavior should track that or not. As the note in heapam.c says,

 * If the table is large relative to NBuffers, use a bulk-read access
 * strategy and enable synchronized scanning (see syncscan.c). Although
 * the thresholds for these features could be different, we make them the
 * same so that there are only two behaviors to tune rather than four.

It's a bit late in the cycle to be revisiting that choice. Now we do already have three behaviors to worry about (BAS on and syncscan off), but throwing in a randomly settable knob will take it back to four, and we have no idea how that fourth case will behave. The other tack we could take (having the one GUC variable control both thresholds) is not good, since it will result in pg_dump trashing the buffer cache.

regards, tom lane
Re: [HACKERS] Strange locking choices in pg_shdepend.c
Decibel! <[EMAIL PROTECTED]> writes:
> Would it be worthwhile to allow for logging when a lock gets upgraded?
> That would make it easier to protect against deadlocks...

There is some debug code for that in the backend, but my experience is that it's too noisy to have on by default.

regards, tom lane
Re: [HACKERS] [PATCHES] Friendly help for psql
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Alvaro Herrera wrote:
>> To avoid the usage of unadorned "help" (which I don't think is going to
>> ever cause conflicts with a SQL command but perhaps it's better to be
>> prepared), one idea would be to respond with "please execute \help
>> instead", and then \help would emit the verbose output. Perhaps
>> eventually we could adorn it with "\help category", etc.
>
> Uh, imagine:
> test=> SELECT * from pg_class
> test-> help

You didn't read the patch: it would recognize "help" only when the input buffer is empty. The only thing it's assuming is that no SQL command will ever *begin* with the word "help".

regards, tom lane
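The rule Tom describes -- treat a bare "help" specially only when the query buffer is empty -- is about this much logic. An illustrative sketch (hypothetical function names; not the actual psql mainloop code, which is C):

```python
def classify_input(query_buffer, line):
    """Return what psql should do with `line`, per the patch's rule:
    bare 'help' is special only when nothing has been typed yet."""
    if query_buffer == "" and line.strip().lower().startswith("help"):
        return "show help"
    return "append to SQL buffer"

# Fresh prompt: 'help' gets the friendly message.
print(classify_input("", "help"))                        # show help
# Mid-statement (Bruce's example): 'help' is just an alias in the query.
print(classify_input("SELECT * from pg_class", "help"))  # append to SQL buffer
```

This is exactly why the feature only needs to assume that no SQL command ever *begins* with the word "help".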
Re: [HACKERS] [GENERAL] SHA1 on postgres 8.3
> I am not thrilled about moving _some_ of pgcrypto into the backend ---
> pgcrypto right now seems well designed and if we pull part of it out it
> seems it will be less clear than what we have now. Perhaps we just need
> to document that md5() isn't for general use and some function in
> pgcrypto should be used instead?

I think looking at this as putting some of pgcrypto into core is looking at it the wrong way. We are never going to put the whole thing into core given the current state of cryptography laws, and obviously the current status of giving users md5() and nothing else is not ideal. What we're looking for is a middle ground. It seems to me we've narrowed it down to two questions:

1) Does sha1(), or other hashing algorithms, risk running afoul of cryptography regulations? I'm 100% sure that sha1() itself is not a problem (it's even a PHP builtin, and good luck finding a box these days without that monstrosity installed). I highly doubt any of the rest (SHA*, HMAC, etc.) are a problem either: we're doing a one-way hash, not encrypting data. But common sense and cryptography have seldom been seen together since the start of the cold war, so I'll hold my final judgement.

2) Which ones do we include? Putting in sha1() seems a no-brainer, but as Joe points out, why not add all the rest at the same time?

--
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200801281506
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
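The "one-way hash, not encryption" distinction Greg draws is concrete: a hash has no key and no decrypt step, which is why it is treated differently by cryptography regulations. In Python terms (hashlib standing in here for what a hypothetical in-core sha1() would expose):

```python
import hashlib

# A hash is deterministic, fixed-length, and irreversible: there is no key
# and no decryption function.
digest = hashlib.sha1(b"").hexdigest()
print(digest)        # da39a3ee5e6b4b0d3255bfef95601890afd80709
print(len(digest))   # 40 hex chars = 160 bits, regardless of input size

# Same input always gives the same output -- useful for integrity checks
# and password verification, useless for recovering the input.
assert hashlib.sha1(b"postgres").hexdigest() == hashlib.sha1(b"postgres").hexdigest()
```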
[HACKERS] Mail issue
I had a mail issue on my end which resulted in a number of outbound emails getting stuck in a queue. They all just went out; sorry for the flood.

--
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828
Re: [HACKERS] autonomous transactions
On Wed, Jan 23, 2008 at 05:50:02PM -0500, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> >> From looking at how Oracle does them, autonomous transactions are
> >> completely independent of the transaction that originates them -- they
> >> take a new database snapshot. This means that uncommitted changes in the
> >> originating transaction are not visible to the autonomous transaction.
>
> > Oh! Recursion depth would need to be tested for as well. Nasty.
>
> Seems like the cloning-a-session idea would be a possible implementation
> path for these too.

Oracle has a feature where you can effectively save a session and return to it. For example, if filling out a multi-page web form, you could save state in the database between those calls. I'm assuming that they use that capability for their autonomous transactions; save the current session to the stack, clone it, run the autonomous transaction, then restore the saved one.

--
Decibel!, aka Jim C. Nasby, Database Architect [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828
Re: [HACKERS] [PATCHES] Friendly help for psql
> Uh, imagine:
>
> test=> SELECT * from pg_class
> test-> help
>
> Technically 'help' is now an alias for 'pg_class'. Are you suggesting
> supporting 'help' in this usage? People were saying they forget
> semicolons, so this 'help' usage is quite possible. We don't want to
> hear "Why doesn't 'help' work sometimes?"

I don't think we have to worry about supporting that case. We only support it when the command begins with "help". No ambiguity with SQL to worry about. Don't forget that ctrl-C resets a query as well, so the above situation is not as dire as it may appear to the uninitiated user.

> Supporting 'help' in psql seems like a very slippery slope. We are very
> tight in defining when an entry is psql and when it is SQL and this
> weakens that.

Does not seem slippery to me; it seems a simple, one-time fix that replaces a syntax error with a helpful message. As the instigator of this discussion, I can assure you I have no plans for anything other than "help". It's simple, standard, and points you to anything else you may need or want to do.

> What would be interesting would be if the _server_ could send back some
> message about "Use the help facility of your client application" but it
> would have to have a trailing semicolon; unlikely. :-(

Now THAT would be a bad mangling of SQL and non-SQL. We'll pretend you didn't suggest that one. :)

--
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200801281451
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
Re: [HACKERS] [PATCHES] Better default_statistics_target
On Wed, Dec 05, 2007 at 06:49:00PM +0100, Guillaume Smet wrote:
> On Dec 5, 2007 3:26 PM, Greg Sabino Mullane <[EMAIL PROTECTED]> wrote:
> > Agreed, this would be a nice 8.4 thing. But what about 8.3 and 8.2? Is
> > there a reason not to make this change? I know I've been lazy and not run
> > any absolute figures, but rough tests show that raising it (from 10 to
> > 100) results in a very minor increase in analyze time, even for large
> > databases. I think the burden of a slightly slower analyze time, which
> > can be easily adjusted, both in postgresql.conf and right before running
> > an analyze, is very small compared to the pain of some queries - which
> > worked before - suddenly running much, much slower for no apparent
> > reason at all.
>
> As Tom stated earlier, the ANALYZE slowdown is far from being the
> only consequence. The planner will also have more work to do and
> that's the hard point IMHO.

How much more? Doesn't it now use a binary search? If so, ISTM that going from 10 to 100 would at worst double the time spent finding the bucket we need. Considering that we're talking about something that takes microseconds, and that there's a huge penalty to be paid if you have bad stats estimates, that doesn't seem that big a deal. And on modern machines it's not like the additional space in the catalogs is going to kill us.

FWIW, I've never seen anything but a performance increase or no change when going from 10 to 100. In most cases there's a noticeable improvement since it's common to have over 100k rows in a table, and there's just no way to capture any kind of a real picture of that with only 10 buckets.

--
Decibel!, aka Jim C. Nasby, Database Architect [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828
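The "at worst double" claim follows from binary search being logarithmic. A quick sketch (plain Python, not planner code) of the per-lookup cost of finding a value's histogram bucket:

```python
import bisect, math

def lookup_steps(nbuckets):
    # a binary search over nbuckets histogram boundaries takes
    # roughly log2(nbuckets) comparisons
    return math.ceil(math.log2(nbuckets))

# going from 10 to 100 boundaries only doubles the per-lookup work
print(lookup_steps(10), lookup_steps(100))   # 4 7

# locating a value's bucket among 100 (hypothetical) boundary values
boundaries = list(range(0, 1000, 10))        # 100 boundaries
print(bisect.bisect_right(boundaries, 137))  # bucket index 14
```

So a 10x increase in statistics target costs roughly one doubling of bucket-lookup comparisons, which is the arithmetic behind the "microseconds" argument above.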
Re: [HACKERS] Strange locking choices in pg_shdepend.c
On Mon, Jan 21, 2008 at 04:54:06PM -0500, Tom Lane wrote:
> It's probably not a good idea to have shdepReassignOwned() take only
> AccessShareLock on pg_shdepend. Even though the function itself
> merely reads the table, it is going to call functions that will take
> RowExclusiveLock, meaning that we're setting ourselves up for potential
> deadlock failures due to lock-upgrade. It'd be safer (and faster too)
> to just hold RowExclusiveLock through the whole operation.

Just a thought... Would it be worthwhile to allow for logging when a lock gets upgraded? That would make it easier to protect against deadlocks...

--
Decibel!, aka Jim C. Nasby, Database Architect [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828
Re: [HACKERS] Spoofing as the postmaster
On Sat, Dec 22, 2007 at 09:25:05AM -0500, Bruce Momjian wrote:
> So, what solutions exist? We could require the use of port numbers less
> than 1024 which typically require root and then become a non-root user,
> but that requires root to start the server. We could put the unix

I don't know about *requiring* this, but it would certainly be a nice option to have. Right now there's absolutely no way that you could get Postgres to use a port < 1024.

--
Decibel!, aka Jim C. Nasby, Database Architect [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828
Re: [HACKERS] There's random access and then there's random access
On Wed, Dec 05, 2007 at 01:49:20AM +0000, Gregory Stark wrote:
> Regardless of what mechanism is used and who is responsible for doing it,
> someone is going to have to figure out which blocks are specifically
> interesting to prefetch. Bitmap index scans happen to be the easiest since
> we've already built up a list of blocks we plan to read. Somehow that
> information has to be pushed to the storage manager to be acted upon.
>
> Normal index scans are an even more interesting case but I'm not sure how
> hard it would be to get that information. It may only be convenient to get
> the blocks from the last leaf page we looked at, for example.

I guess it depends on how you're looking at things... I'm thinking more in terms of telling the OS to fetch stuff we're pretty sure we're going to need while we get on with other work. There are a lot of cases where you know that besides just a bitmap scan (though perhaps code-wise a bitmap scan is easier to implement...). For a seqscan, we'd want to be reading some number of blocks ahead of where we're at right now. Ditto for index pages on an index scan. In addition, when we're scanning the index, we'd definitely want to issue heap page requests asynchronously, since that gives the filesystem, etc. a better shot at re-ordering the reads to improve performance.

--
Decibel!, aka Jim C. Nasby, Database Architect [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828
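The "read some number of blocks ahead" idea can be sketched as a read-ahead window that stays a fixed distance in front of the block being processed. This is a pure-Python simulation under assumed semantics, not backend code; a real implementation would issue posix_fadvise(WILLNEED) or async reads where the sketch merely queues block numbers:

```python
from collections import deque

def scan_with_prefetch(nblocks, window=8):
    """Simulated seqscan that keeps up to `window` block requests in
    flight ahead of the block currently being processed."""
    issued = deque()          # requests "in flight" (stand-in for fadvise calls)
    processed = []
    next_to_issue = 0
    for current in range(nblocks):
        # top up the read-ahead window before doing work on `current`
        while next_to_issue < nblocks and next_to_issue <= current + window:
            issued.append(next_to_issue)
            next_to_issue += 1
        # the block we need now was always requested earlier
        assert issued[0] == current
        processed.append(issued.popleft())
    return processed

print(scan_with_prefetch(32) == list(range(32)))  # True
```

The point of the window is exactly what the email argues: by the time the scan needs block N, the request for it was issued up to `window` blocks earlier, giving the OS time and freedom to reorder the physical reads.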
Re: [HACKERS] Truncate Triggers
On Fri, Jan 25, 2008 at 11:40:19AM +0000, Simon Riggs wrote:
> (for 8.4 ...)
> I'd like to introduce triggers that fire when we issue a truncate:

Rather than focusing exclusively on TRUNCATE, how about "triggers" that fire whenever any kind of DDL operation is performed? (OK, truncate is more DML than DDL, but still.) The reason I put triggers in quotes is that I'm not suggesting we actually put triggers on the catalog tables, since we all know that's hard/impossible. Instead this would have to tie into command processing, similar to what you're proposing for truncate.

--
Decibel!, aka Jim C. Nasby, Database Architect [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828
Re: [HACKERS] CLUSTER and synchronized scans and pg_dump et al
Steve Atkins wrote:
> On Jan 28, 2008, at 8:36 AM, Tom Lane wrote:
>> Andrew Dunstan <[EMAIL PROTECTED]> writes:
>>> Kevin Grittner wrote:
>>>> It would seem reasonable to me for pg_dump to use ORDER BY to select
>>>> data from clustered tables.
>>> What will be the performance hit from doing that?
>> That worries me too. Also, in general pg_dump's charter is to reproduce
>> the state of the database as best it can, not to "improve" it.
>
> One common use of cluster around here is to act as a faster version of
> vacuum full when there's a lot of dead rows in a table. There's no intent
> to keep the table clustered on that index, and the cluster flag isn't
> removed with alter table (why bother, the only thing it affects is the
> cluster command). I'm guessing that's not unusual, and it'd lead to
> sorting tables as part of pg_dump.

I've done that too - and every time I typed that "CLUSTER ..." I thought why, oh why isn't there something like "REWRITE TABLE", which would work just like CLUSTER, but without the sorting ;-) Maybe something to put on the TODO list... We might even call it "VACUUM REWRITE" ;-)

regards, Florian Pflug
Re: [HACKERS] [PATCHES] Friendly help for psql
Alvaro Herrera wrote:
> To avoid the usage of unadorned "help" (which I don't think is going to
> ever cause conflicts with a SQL command but perhaps it's better to be
> prepared), one idea would be to respond with "please execute \help
> instead", and then \help would emit the verbose output. Perhaps
> eventually we could adorn it with "\help category", etc.

Uh, imagine:

test=> SELECT * from pg_class
test-> help

Technically 'help' is now an alias for 'pg_class'. Are you suggesting supporting 'help' in this usage? People were saying they forget semicolons, so this 'help' usage is quite possible. We don't want to hear "Why doesn't 'help' work sometimes?"

I think the fundamental problem is that most programs, like ftp, have a predefined set of single-line commands, while we have an SQL 'language' that can be multi-line and has no special markings in psql. In fact the only special markings are for help and psql commands using a backslash.

Supporting 'help' in psql seems like a very slippery slope. We are very tight in defining when an entry is psql and when it is SQL, and this weakens that.

What would be interesting would be if the _server_ could send back some message about "Use the help facility of your client application", but it would have to have a trailing semicolon; unlikely. :-(

--
Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable
On Jan 28, 2008, at 6:14 PM, Simon Riggs wrote:
> On Sun, 2008-01-27 at 21:04 -0500, Tom Lane wrote:
>> [ redirecting thread to -hackers ]
>> Neil Conway <[EMAIL PROTECTED]> writes:
>>> On Sun, 2008-01-27 at 21:54 +0000, Gregory Stark wrote:
>>>> I liked the "synchronized_sequential_scans" idea myself.
>>> I think that's a bit too long. How about "synchronized_scans", or
>>> "synchronized_seqscans"?
>> We have enable_seqscan already, so that last choice seems to fit in.
>
> If we're going to have a GUC, we may as well make it as useful as
> possible. Currently we set synch scan on when the table is larger than
> 25% of shared_buffers. So increasing shared_buffers can actually turn
> this feature off. Rather than having a boolean GUC, we should have a
> number and make the parameter "synchronised_scan_threshold". This would
> then be the size of a table above which we would perform synch scans.
> If it's set to -1, then this would be the same as "off" in all cases.
> The default value would be 25% of shared_buffers. (Think we can only do
> that at initdb time currently.) If we do that, it's clearly different
> from the enable_* parameters, so the name is easier to decide ;-)

+1. This is in fact a lot more flexible and transparent. It gives us a lot more control over the process and it is easy to explain / understand.

best regards, hans

--
Cybertec Schönig & Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at
Re: [HACKERS] [GENERAL] SHA1 on postgres 8.3
I am not thrilled about moving _some_ of pgcrypto into the backend --- pgcrypto right now seems well designed and if we pull part of it out it seems it will be less clear than what we have now. Perhaps we just need to document that md5() isn't for general use and some function in pgcrypto should be used instead?

---

Marko Kreen wrote:
> On 1/21/08, Tom Lane <[EMAIL PROTECTED]> wrote:
> > > MD5 is broken in the sense that you can create two or more meaningful
> > > documents with the same hash.
> >
> > Note that this isn't actually very interesting for the purpose for
> > which the md5() function was put into core: namely, hashing passwords
> > before they are stored in pg_authid.
>
> Note: this was a bad idea. The function that should have been
> added to core would be pg_password_hash(username, password).
>
> Adding md5() lessens the incentive to install pgcrypto or push/accept
> digest() into core, and gives the impression there will be sha1(), etc.
> in the future.
>
> Now users who want to store passwords in the database (the most
> popular usage) will probably go with md5() without bothering
> with pgcrypto. They probably see "Postgres itself uses MD5 too",
> without realizing their situation is totally different from
> the pg_authid one.
>
> It's like having a solution that is ACID-compliant 99% of the time in
> core, so why bother with the 100% one.
>
> --
> marko

--
Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
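The reason the pg_authid case differs from a bare md5(password) is that the server stores the digest of the password concatenated with the role name, so identical passwords for different users hash differently. A sketch of the suggested pg_password_hash() in Python (an illustration of that scheme, not PostgreSQL source):

```python
import hashlib

def pg_password_hash(username, password):
    """Sketch of the hash PostgreSQL stores in pg_authid:
    the literal prefix 'md5' followed by md5(password || username)."""
    digest = hashlib.md5((password + username).encode()).hexdigest()
    return "md5" + digest

# same password, different users -> different stored hashes
print(pg_password_hash("alice", "pw") == pg_password_hash("bob", "pw"))  # False
```

That username "salt" is exactly what a user storing md5(password) in their own table does not get, which is Marko's point.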
Re: [HACKERS] [PATCH] Add size/acl information when listing databases
This has been saved for the 8.4 release:
http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---

Andrew Gilligan wrote:
> On 20 Jan 2008, at 04:34, Tom Lane wrote:
> > Andrew Gilligan <[EMAIL PROTECTED]> writes:
> > > Showing the size on \l+ probably makes more sense, but I imagine that
> > > would require a very different approach due to the permissions
> > > changes?
> >
> > I haven't experimented, but perhaps something along the lines of
> >
> > case when has_database_privilege(current_user, db.oid, 'connect')
> > then pg_database_size...
> > else null
> > end
> >
> > would fix it.
>
> Yep, that seems to do the trick.
>
> I've attached an updated patch (based on 8.3RC2) that adds the ACL
> information to \l and the size to \l+ if available.
>
> Best regards,
> -Andy
>
> [ Attachment, skipping... ]

--
Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable
On Sun, 2008-01-27 at 21:04 -0500, Tom Lane wrote:
> [ redirecting thread to -hackers ]
>
> Neil Conway <[EMAIL PROTECTED]> writes:
> > On Sun, 2008-01-27 at 21:54 +0000, Gregory Stark wrote:
> >> I liked the "synchronized_sequential_scans" idea myself.
>
> > I think that's a bit too long. How about "synchronized_scans", or
> > "synchronized_seqscans"?
>
> We have enable_seqscan already, so that last choice seems to fit in.

If we're going to have a GUC, we may as well make it as useful as possible. Currently we set synch scan on when the table is larger than 25% of shared_buffers, so increasing shared_buffers can actually turn this feature off. Rather than having a boolean GUC, we should have a number and make the parameter "synchronised_scan_threshold". This would then be the size of a table above which we would perform synch scans. If it's set to -1, then this would be the same as "off" in all cases. The default value would be 25% of shared_buffers. (Think we can only do that at initdb time currently.)

If we do that, it's clearly different from the enable_* parameters, so the name is easier to decide ;-)

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
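The proposed threshold semantics are simple enough to pin down in a few lines. A sketch (plain Python modeling the rule, with hypothetical numbers; the real decision lives in the backend's scan startup code):

```python
def use_sync_scan(table_blocks, threshold):
    """Sketch of the proposed GUC semantics: a threshold of -1 means
    "off" in all cases; otherwise scans of tables larger than the
    threshold synchronize."""
    if threshold == -1:
        return False
    return table_blocks > threshold

shared_buffers = 16384                   # in blocks; hypothetical setting
default_threshold = shared_buffers // 4  # today's hardcoded 25% rule
print(use_sync_scan(8192, default_threshold))  # True
print(use_sync_scan(1024, default_threshold))  # False
print(use_sync_scan(8192, -1))                 # False
```

This also makes Simon's complaint concrete: with the hardcoded 25% rule, raising shared_buffers raises the threshold and can silently flip a table from synchronized to unsynchronized.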
Re: [HACKERS] CLUSTER and synchronized scans and pg_dump et al
>>> On Mon, Jan 28, 2008 at 10:36 AM, in message <[EMAIL PROTECTED]>, Tom Lane <[EMAIL PROTECTED]> wrote:
> in general pg_dump's charter is to reproduce
> the state of the database as best it can, not to "improve" it.

Seems that I've often seen it recommended as a way to eliminate bloat. It seems like there are some practical use cases where it would be a pain to have to do a CLUSTER right on the heels of having restored with pg_dump piped to psql.

This does seem like the right way to do it where a user really wants to maintain the physical sequence; my biggest concern is that CLUSTER is sometimes used to eliminate bloat, with no real interest in maintaining that sequence later. I'd bet that people generally do not alter the table to remove the clustered index choice, so this option could be rather painful somewhere downstream, when the sequence has become pretty random. Maybe it would make sense if it was not the default, and the issues were properly documented under the description of the option?

-Kevin
Re: [HACKERS] CLUSTER and synchronized scans and pg_dump et al
On Jan 28, 2008, at 8:36 AM, Tom Lane wrote:
> Andrew Dunstan <[EMAIL PROTECTED]> writes:
>> Kevin Grittner wrote:
>>> It would seem reasonable to me for pg_dump to use ORDER BY to select
>>> data from clustered tables.
>> What will be the performance hit from doing that?
> That worries me too. Also, in general pg_dump's charter is to reproduce
> the state of the database as best it can, not to "improve" it.

One common use of cluster around here is to act as a faster version of vacuum full when there's a lot of dead rows in a table. There's no intent to keep the table clustered on that index, and the cluster flag isn't removed with alter table (why bother, the only thing it affects is the cluster command). I'm guessing that's not unusual, and it'd lead to sorting tables as part of pg_dump.

Cheers, Steve
Re: [HACKERS] system catalog constraints question
Thank you :)

> -----Original Message-----
> From: Heikki Linnakangas [mailto:[EMAIL PROTECTED]]
> Sent: Monday, January 28, 2008 5:35 PM
> To: Gevik Babakhani
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] system catalog constraints question
>
> Gevik Babakhani wrote:
> > Is there a way to query the column constraints between the tables of
> > the system catalog? For example pg_attribute.atttypid -> pg_type.oid.
> > This is described in the docs of course, but does the system use
> > something like pg_constraint, or are the system catalog constraints
> > enforced only in the code?
>
> There is pg_depend, which is kind of like constraints, but
> enforced in code.
>
> --
> Heikki Linnakangas
> EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] GSSAPI doesn't play nice with non-canonical host names
Magnus Hagander <[EMAIL PROTECTED]> writes:
>>> While I'm complaining: that's got to be one of the least useful error
>>> messages I've ever seen, and it's for a case that's surely going to be
>>> fairly common in practice.

> AFAIK, that one is for Kerberos only. For GSSAPI, we already use the
> gss_display_status function to get the error messages. I think the problem
> here is in the Kerberos library?

Yeah, I had verified by tracing through it that the text was just what gss_display_status gave us. It could be that it's just plain broken, but I was sort of hoping that we were using it incorrectly or that there's some magic flag to set to get better messages out of it.

regards, tom lane
Re: [HACKERS] CLUSTER and synchronized scans and pg_dump et al
Andrew Dunstan <[EMAIL PROTECTED]> writes:
> Kevin Grittner wrote:
>> It would seem reasonable to me for pg_dump to use ORDER BY to select
>> data from clustered tables.

> What will be the performance hit from doing that?

That worries me too. Also, in general pg_dump's charter is to reproduce the state of the database as best it can, not to "improve" it.

regards, tom lane
Re: [HACKERS] system catalog constraints question
Gevik Babakhani wrote:
> Is there a way to query the column constraints between the tables of the
> system catalog? For example pg_attribute.atttypid -> pg_type.oid. This is
> described in the docs of course, but does the system use something like
> pg_constraint, or are the system catalog constraints enforced only in the
> code?

There is pg_depend, which is kind of like constraints, but enforced in code.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] CLUSTER and synchronized scans and pg_dump et al
"Guillaume Smet" <[EMAIL PROTECTED]> writes:
> It was my first idea but I didn't propose it as it's really a
> different thing IMHO. enable_* variables don't change the way
> PostgreSQL really does the job, as synchronize_scans (or whatever the
> name will be) does.

> And it's not very consistent with the other GUC variables (most of
> them could have "enable" in their name), but we limited the usage of
> enable_* to planner variables. I don't know if it's on purpose though.

Yeah, it is a more or less deliberate policy to use enable_ only for planner control variables, which this one certainly isn't. I seem to recall an argument also that prefixing enable_ is just noise; it doesn't add anything to your understanding of what the variable does. So far I think "synchronize_seqscans" is the best proposal.

regards, tom lane
[HACKERS] system catalog constraints question
Hi,

Is there a way to query the column constraints between the tables of the system catalog? For example pg_attribute.atttypid -> pg_type.oid. This is described in the docs of course, but does the system use something like pg_constraint, or are the system catalog constraints enforced only in the code?

Regards, Gevik.
Re: [HACKERS] CLUSTER and synchronized scans and pg_dump et al
>>> On Mon, Jan 28, 2008 at 9:00 AM, in message <[EMAIL PROTECTED]>, Andrew Dunstan <[EMAIL PROTECTED]> wrote:
> Kevin Grittner wrote:
>> On Sun, Jan 27, 2008 at 9:02 AM, in message
>> <[EMAIL PROTECTED]>, Gregory Stark <[EMAIL PROTECTED]> wrote:
>>> Perhaps we should have some form of escape hatch for pg_dump to
>>> request real physical order when dumping clustered tables.
>>
>> It would seem reasonable to me for pg_dump to use ORDER BY to select
>> data from clustered tables.
>
> What will be the performance hit from doing that?

If the rows actually are in order of the clustered index, it shouldn't add much more than the time needed to sequentially pass the clustered index, should it? Even so, perhaps there should be a command-line option on pg_dump to control whether it does this.

-Kevin
Re: [HACKERS] CLUSTER and synchronized scans and pg_dump et al
Kevin Grittner wrote:
> On Sun, Jan 27, 2008 at 9:02 AM, in message <[EMAIL PROTECTED]>,
> Gregory Stark <[EMAIL PROTECTED]> wrote:
>> Perhaps we should have some form of escape hatch for pg_dump to
>> request real physical order when dumping clustered tables.
>
> It would seem reasonable to me for pg_dump to use ORDER BY to select
> data from clustered tables. I don't see a general case for worrying
> about the order of rows returned by queries which lack an ORDER BY
> clause.

What will be the performance hit from doing that?

cheers
andrew
Re: [HACKERS] CLUSTER and synchronized scans and pg_dump et al
>>> On Sun, Jan 27, 2008 at 9:02 AM, in message <[EMAIL PROTECTED]>, Gregory Stark <[EMAIL PROTECTED]> wrote:
> Perhaps we should have some form of escape hatch for pg_dump to request
> real physical order when dumping clustered tables.

It would seem reasonable to me for pg_dump to use ORDER BY to select data from clustered tables. I don't see a general case for worrying about the order of rows returned by queries which lack an ORDER BY clause.

-Kevin
Re: [HACKERS] CLUSTER and synchronized scans and pg_dump et al
Hi Florian,

Glad to see you back!

On Jan 28, 2008 3:25 PM, Florian G. Pflug <[EMAIL PROTECTED]> wrote:
> How about enable_syncscan, or enable_seqscan_sync? It's not strictly
> something that influences the planner, but maybe it's similar enough to
> justify a similar naming?

It was my first idea but I didn't propose it as it's really a different thing IMHO. enable_* variables don't change the way PostgreSQL really does the job, as synchronize_scans (or whatever the name will be) does. And it's not very consistent with the other GUC variables (most of them could have "enable" in their name), but we limited the usage of enable_* to planner variables. I don't know if it's on purpose though.

--
Guillaume
Re: [HACKERS] CLUSTER and synchronized scans and pg_dump et al
Guillaume Smet wrote:
> On Jan 27, 2008 9:07 PM, Markus Bertheau <[EMAIL PROTECTED]> wrote:
>> 2008/1/28, Tom Lane <[EMAIL PROTECTED]>:
>>> Do we have nominations for a name? The first idea that comes to mind
>>> is "synchronized_scanning" (defaulting to ON).
>> "synchronized_sequential_scans" is a bit long, but contains the keyword
>> "sequential scans", which will ring a bell with many, more so than
>> "synchronized_scanning".
> synchronize_seqscans?

How about enable_syncscan, or enable_seqscan_sync? It's not strictly something that influences the planner, but maybe it's similar enough to justify a similar naming?

regards, Florian Pflug
Re: [PATCHES] [HACKERS] Including Snapshot Info with Indexes
On Jan 28, 2008 8:21 AM, Gokulakannan Somasundaram <[EMAIL PROTECTED]> wrote:
> I am not seeing my mail getting listed in the archives. So I am just
> resending it, in case the above one has got missed.

It was sent. Archive processing is delayed.

--
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | [EMAIL PROTECTED]
Edison, NJ 08837 | http://www.enterprisedb.com/
Re: [HACKERS] GSSAPI doesn't play nice with non-canonical host names
On Sun, Jan 27, 2008 at 09:32:54PM -0500, Stephen Frost wrote:
> > While I'm complaining: that's got to be one of the least useful error
> > messages I've ever seen, and it's for a case that's surely going to be
> > fairly common in practice. Can't we persuade GSSAPI to produce
> > something more user-friendly? At least convert "7" to "Server not
> > found in Kerberos database"?
>
> I agree, and have found it to be very frustrating while working w/
> Kerberos in general. I *think* there's a library which can convert
> those error codes (libcomm-err?), but I've not really looked into it
> yet.

AFAIK, that one is for Kerberos only. For GSSAPI, we already use the gss_display_status function to get the error messages. I think the problem here is in the Kerberos library?

//Magnus
Re: [HACKERS] Proposal: Integrity check
Simon Riggs wrote:
> On Fri, 2008-01-25 at 17:56 +0100, Zdenek Kotala wrote:
>> Regarding Robert Mach's work during Google SOC on data integrity
>> checking, I would like to improve the storage module and implement some
>> of Robert's code in the core. I would like to make the following
>> modifications:
>>
>> 1) Add ReadBuffer_noerror (recommend me a better name), a function
>> which will accept a damaged page without raising an error. This page
>> will be marked as corrupted, and when ReadBuffer touches it, it will be
>> handled in the standard way. This is important for check and repair
>> functions to process the whole table without interruption.
>
> We shouldn't let duff data into shared buffers at all. As Tom mentioned
> before.

I agree, it could cause a lot of problems.

> I think you could mix the two methods of reading buffers:
> - start a subtransaction
> - read blocks into shared buffers
> - if error, then re-read block into private memory and examine
> - carry on thru table in a new subtransaction

It seems like a good idea.

> OK with other points, except I don't want a new command. Let's do it as
> a function that can accept block ranges to check, not just whole tables.
> e.g. pg_check_blocks(17, 43) would check blocks 17 -> 43

It makes sense. I think the following functions should cover all cases:

pg_check_blocks() - whole database
pg_check_blocks(relno) - whole relation
pg_check_blocks(relno, start, stop) - selected interval
pg_check_blocks(relno, array of blocks) - selected blocks

Zdenek
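The key behavior of the proposed pg_check_blocks() is that it reports damaged pages in a range instead of aborting at the first one. A minimal sketch of that loop in Python, with a deliberately trivial stand-in for the real page validation (names, block size, and the marker check are illustrative assumptions, not the proposed C implementation):

```python
BLCKSZ = 8192  # PostgreSQL's default page size

def check_blocks(path, start=0, stop=None):
    """Sketch of pg_check_blocks(rel, start, stop): read every page in
    the range and collect the ones that fail a validity check, rather
    than erroring out on the first bad page."""
    bad = []
    with open(path, "rb") as f:
        f.seek(start * BLCKSZ)
        blkno = start
        while stop is None or blkno <= stop:
            page = f.read(BLCKSZ)
            if not page:          # past end of relation
                break
            # placeholder check: a real implementation would validate
            # the page header, checksum, and item pointers
            if page[:2] == b"\xff\xff":
                bad.append(blkno)
            blkno += 1
    return bad
```

Called without arguments it scans the whole file; with (start, stop) it covers the "selected interval" signature above, and the array-of-blocks variant would just loop over explicit block numbers.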
Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable
> >> I liked the "synchronized_sequential_scans" idea myself.
>
> > I think that's a bit too long. How about "synchronized_scans", or
> > "synchronized_seqscans"?
>
> We have enable_seqscan already, so that last choice seems to fit in.

Yes, looks good. How about synchronized_seqscan, without the plural?

Andreas
Re: [HACKERS] plperl: Documentation on BYTEA decoding is wrong
* Robert Treat:

> Note we've been using Theo's plperl bytea patch on one of our
> production servers for some time; if anyone wants access to that
> lmk.

I'm interested. Could you post a pointer to this code, please?

--
Florian Weimer <[EMAIL PROTECTED]>
BFK edv-consulting GmbH http://www.bfk.de/
Kriegsstraße 100 tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99
Re: [HACKERS] RFC: array_agg() per SQL:200n
"Neil Conway" <[EMAIL PROTECTED]> writes:

> AFAIK the conclusion reached by the previous thread was that to be type
> safe, you'd need one distinct pseudotype per aggregate function, along
> with some way to let the planner distinguish this class of pseudotypes
> from other types (in order to apply the heuristic that functions like
> these are likely to consume more memory). You could identify this class
> by an additional column in pg_type, but I think we'd need a lot of
> machinery to do this properly (e.g. to allow these types to be created
> via SQL). I wonder if this isn't over-engineering: the simple approach
> originally followed by Stephen Frost was to declare the transition value
> as, say, int8, and just disallow the transition and final functions from
> being called outside an aggregate context. AFAIK this would be safe,
> although of course it is ugly.

The alternative is to use the regular array type and have its
implementation do some magic behind the scenes.

I was already thinking we might need magic like this for read-only cases
like:

  select * ... where i in array[1,3,5,...]

or

  for i in 1..n
    var = arrayvar[i]
    ...
  end

Both of these are O(n^2), assuming the size of the array and the number
of loop iterations are both n, since each array IN scan or index lookup
is O(n).

These cases might be easier to deal with. My idea was to memoize the
array contents in a hash data structure referenced by the call site's
fn_extra pointer. The array functions would check fn_extra to see
whether the same array has previously been cached in the more efficient
form, and then use hash probes for the IN case or a C datum array for
the subscripting case.

Could the same be done at the aggregate call site? The aggregate's
transition type would be a plain anyarray as normal, but the array_accum
call would look at the call site and stash the actual contents there in
a linked list or tuplesort. The actual anyarray datum would just have a
flag saying "the data's over there".

--
Gregory Stark
EnterpriseDB          http://www.enterprisedb.com
Ask me about EnterpriseDB's On-Demand Production Tuning

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at
http://www.postgresql.org/about/donate
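The hash-memoization idea Greg describes for `i IN array` can be sketched in plain C: build a hash set from the array once, then each membership probe costs O(1) instead of an O(n) linear scan. The IntSet type and function names are invented for illustration; the real implementation would hang the structure off the function call site's fn_extra pointer and deal with datums, type-specific hashing, and nulls:

```c
#include <assert.h>
#include <stdlib.h>

/* Tiny open-addressing hash set of ints, standing in for the hashed
 * array that would be cached behind the fn_extra pointer. */
typedef struct {
    int    *slots;
    char   *used;
    size_t  cap;     /* always a power of two */
} IntSet;

/* Build the set once per distinct array (the "memoize" step). */
static IntSet *intset_build(const int *vals, size_t n)
{
    IntSet *s = malloc(sizeof *s);
    s->cap = 16;
    while (s->cap < n * 2)        /* keep load factor below 50% */
        s->cap <<= 1;
    s->slots = malloc(s->cap * sizeof *s->slots);
    s->used  = calloc(s->cap, 1);
    for (size_t i = 0; i < n; i++) {
        /* Fibonacci-style multiplicative hash, masked to table size */
        size_t h = ((size_t)(unsigned)vals[i] * 2654435761u) & (s->cap - 1);
        while (s->used[h] && s->slots[h] != vals[i])
            h = (h + 1) & (s->cap - 1);   /* linear probing */
        s->used[h]  = 1;
        s->slots[h] = vals[i];
    }
    return s;
}

/* Each IN probe is now expected O(1) instead of O(n). */
static int intset_contains(const IntSet *s, int v)
{
    size_t h = ((size_t)(unsigned)v * 2654435761u) & (s->cap - 1);
    while (s->used[h]) {
        if (s->slots[h] == v)
            return 1;
        h = (h + 1) & (s->cap - 1);
    }
    return 0;
}
```

With n probes against an n-element array this turns the overall cost from O(n^2) into O(n), which is the payoff Greg is after for the `IN array` case.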
Re: [HACKERS] SSL connections don't cope with server crash very well at all
On Sun, Jan 27, 2008 at 08:09:10PM -0500, Tom Lane wrote:
> If you do a manual "kill -9" (for testing purposes) on its connected
> server process, psql normally recovers nicely:
>
> regression=# select 1;
>  ?column?
> ----------
>         1
> (1 row)
>
> -- issue kill here in another window
> regression=# select 1;
> server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> The connection to the server was lost. Attempting reset: Succeeded.
> regression=#
>
> But try it with an SSL-enabled connection, and psql just dies rudely.
> Investigation shows that it's being killed by SIGPIPE while attempting
> to clean up the failed connection:
>
> Program received signal SIGPIPE, Broken pipe.
> 0x0030f7ec6e80 in __write_nocancel () from /lib64/libc.so.6
> (gdb) bt
> #0  0x0030f7ec6e80 in __write_nocancel () from /lib64/libc.so.6
> #1  0x003102497a27 in rl_filename_completion_function ()
>     from /lib64/libcrypto.so.6
> #2  0x003102495e5e in BIO_write () from /lib64/libcrypto.so.6
> #3  0x003877a1f449 in ssl3_write_pending () from /lib64/libssl.so.6
> #4  0x003877a1f8b6 in ssl3_dispatch_alert () from /lib64/libssl.so.6
> #5  0x003877a1d602 in ssl3_shutdown () from /lib64/libssl.so.6
> #6  0x2aac2675 in close_SSL (conn=0x642d60) at fe-secure.c:1095
> #7  0x2aabb483 in pqReadData (conn=0x642d60) at fe-misc.c:719
> #8  0x2aaba9b8 in PQgetResult (conn=0x642d60) at fe-exec.c:1223
> #9  0x2aabaa8e in PQexecFinish (conn=0x642d60) at fe-exec.c:1452
> #10 0x004075b7 in SendQuery (query=) at common.c:853
> #11 0x00409cf3 in MainLoop (source=0x30f8151680) at mainloop.c:225
> #12 0x0040c3dc in main (argc=, argv=0x100) at startup.c:352
>
> Apparently we need to do the SIGPIPE disable/enable dance around
> SSL_shutdown() as well as SSL_write(). I wonder whether we don't need
> it around SSL_read() as well --- I seem to recall that OpenSSL might
> either read or write the socket within SSL_read(), due to various
> corner cases in the SSL protocol.
>
> Comments?

Yes, AFAIK it can end up doing writes - for example if it has to
re-negotiate the encryption key. So yes, I think we need it around both.

//Magnus

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your message can
get through to the mailing list cleanly
Re: [HACKERS] GSSAPI doesn't play nice with non-canonical host names
On Sun, Jan 27, 2008 at 09:51:48PM -0500, Tom Lane wrote:
> Stephen Frost <[EMAIL PROTECTED]> writes:
>> * Tom Lane ([EMAIL PROTECTED]) wrote:
>>> Whilst trying to reproduce bug #3902 I noticed that the code doesn't
>>> work with an abbreviated host name:
>
>> Testing w/ 8.3RC2, everything seems to be working fine here:
>
> Okay, that probably means there's something wacko about my Kerberos
> setup. It's quite likely got something to do with the fact that I set
> up the KDC on the same machine where I'm doing the PG testing, which
> is surely a case that would never be sane in practice.
>
> [ thinks for a bit... ] In this context there's some ambiguity as to
> whether 'rh2' should resolve as 127.0.0.1 or the machine's real IP
> address, and no doubt something is making the wrong choice someplace.
> That's probably how the localdomain lookups got into it.

Sounds likely. FWIW, DNS issues are by far the most common problem with
Kerberos installations - at least they are on Windows.

//Magnus

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your message can
get through to the mailing list cleanly