Re: [HACKERS] Disaster!
Awesome, Tom :) I'm glad I happened to have all the data required on hand to fully analyze the problem. Let's hope this makes this failure condition go away for all future PostgreSQL users :)

Chris

On Mon, 26 Jan 2004, Tom Lane wrote:
> Okay ... Chris was kind enough to let me examine the WAL logs and
> postmaster stderr log for his recent problem, and I believe that
> I have now achieved a full understanding of what happened. The true
> bug is indeed somewhere else than slru.c, and we would not have found
> it if slru.c had had less-paranoid error checking.
[HACKERS] Corrupted db?
We got a problem ticket from a customer. Messages in the error_log indicated a problem trying to insert a duplicate value into a table. Looking at the database (v7.3.2 on Solaris) I'm puzzled by this:

syncdb=# \d
ERROR:  Cache lookup failed for relation 17075
syncdb=# \di
ERROR:  Cache lookup failed for relation 17081
syncdb=# \d property_types;
ERROR:  Cache lookup failed for relation 17075
syncdb=# \d public.property_types
         Table "public.property_types"
   Column    |           Type           | Modifiers
-------------+--------------------------+-----------
 id          | integer                  | not null
 name        | character varying(256)   | not null
 stamp       | date                     | not null
 description | character varying(1024)  |
Indexes: pt_pk primary key btree (id),
         pt_name_uid_uk unique btree (name)

syncdb=# SET search_path = public, pg_catalog;
SET
syncdb=# \d property_types;
ERROR:  Cache lookup failed for relation 17075
syncdb=# select * from pg_namespace;
  nspname   | nspowner | nspacl
------------+----------+--------
 pg_catalog |        1 | {=U}
 pg_toast   |        1 | {=}
 public     |        1 | {=UC}
 pg_temp_1  |        1 |
(4 rows)

syncdb=# select * from property_types;
 id | name | stamp | description
----+------+-------+-------------
(0 rows)

INSERT INTO property_types (id, name, description, stamp)
  VALUES (nextval('pt_seq'), 'FROM_RELATIONSHIP',
          'Describes the relationship of ...', date('today'));
ERROR:  Cannot insert a duplicate key into unique index pt_name_uid_uk

I can only guess that something is corrupted. What could lead to that? Can this be prevented or repaired?

Thank you,
Mike.
Re: [HACKERS] 7.5 change documentation
OK, I will attempt to draw together this information as it currently stands. If this makes any sense, we can discuss what the requirement/process is for regular maintenance (daily/weekly/monthly etc).

Understood to mean "changes in next release (current progress)": items that have been completed/committed since the last release, for the purpose of informing developers/testers what's new PRIOR to full release. Leaving unobstructed the functions of:
- TODO list: a combined list of desired work items (Bruce)
- Release Notes: final list of features of a release (Bruce)

This should help alpha testing, which should allow more control of what actually does get released (and therefore what the contents of the Release Notes should be).

Best Regards, Simon

> -----Original Message-----
> From: Tom Lane [mailto:[EMAIL PROTECTED]]
> Sent: Friday, January 23, 2004 20:40
> To: Neil Conway
> Cc: [EMAIL PROTECTED]; 'Jan Wieck'; 'Postgresql Hackers'
> Subject: Re: 7.5 change documentation
>
> Neil Conway <[EMAIL PROTECTED]> writes:
> > Tom Lane <[EMAIL PROTECTED]> writes:
> >> In theory there should be a section at the head of release.sgml
> >> mentioning the major changes done-so-far, but for various reasons
> >> this hasn't gotten installed in the 7.5 branch yet. (Look at the
> >> CVS versions during 7.4 development to see how we did it last time.)
>
> > Well, keep in mind we didn't do it very effectively in 7.4 :-) The
> > vast majority of changes weren't recorded there, and the ones that
> > were had to be fleshed out quite a lot in the actual release notes.
>
> > The last time that someone (Peter and myself, IIRC) suggested that we
> > really incrementally maintain the release notes during the development
> > cycle, Bruce said that he personally finds it more comfortable to
> > summarize the CVS changelogs all at once shortly before we release the
> > first beta. AFAIR that's where the discussion ended.
>
> It's fine with me if Bruce prefers to build the release notes directly
> from the change logs. As I saw it, the purpose of the temporary list of
> things-done-so-far is not to be the raw material for the release notes.
> It's to let alpha testers know about major changes that they might want
> to test. As such, it's fine that it's incomplete.
>
> The other way we could handle this goal is to be a tad more vigorous about
> checking off items as "done" in the TODO list. However, Bruce generally
> doesn't bother to make a new entry in the TODO list if someone does
> something that wasn't in the list to begin with, and so I'm not sure
> it's the right vehicle.
>
> regards, tom lane
Re: [HACKERS] cache control?
Jan,

Happy to continue the discussion... though without changing my suggestion that we defer any further, more specialised improvements for now.

> Jan Wieck replied to...
> Simon Riggs wrote:
> > If we know ahead of time that a large scan is going to have this effect,
> > why wait for the ARC to play its course, why not take exactly the same
> > action? Have large scans call StrategyHint also. (Maybe rename it...?)
> > ...of course, some extra code to establish it IS a large scan...
> > ...large table lookup should wait until a shared catalog cache is
> > implemented
>
> The problem with this is a) how to detect that something will be a large
> scan, and b) how to decide what is a large scan in the first place.

My thoughts are that we know immediately prior to execution whether or not a plan calls for a full table scan (FTS). We also know the table and therefore its size. A large table in this context is one that would disrupt the cache if it made it onto T2. We can discuss an appropriate and usefully simple rule, perhaps sizeoftable(T) > 2*C???

> Large sequential scans in warehousing are often part of more complex
> join operations.

Yes, I agree. PostgreSQL is particularly prone to this currently, because of the high number of plans that resolve to FTS. Complexity of plan shouldn't affect the basic situation: we are reading all the blocks of a table, putting them sequentially into T1, and then working on them. Plan complexity may increase the time that a T1 block stays in memory, with a subsequent increase in the probability of promotion to T2.

> And just because something returns a large number of
> result rows doesn't mean that the input data was that much.

I agree also that overall execution time may be unrelated to whether a "large" table is involved. The number of output rows shouldn't have any effect on input rows and thus the data blocks that need to be cached.

(Jan gives a detailed analysis... ending with)
> Honestly, I don't even know what type of application could possibly
> produce such a screwed access pattern. And I am absolutely confident one
> can find corner cases to wring down Oracle's complicated configuration
> harness more easily.

I agree with everything you say. The algorithm copes well with almost every sequential pattern of access, and there is significant benefit from ignoring the very, very, very rare cases that might give it problems. My thoughts are about multiple concurrent accesses, specifically FTS on large tables, rather than sequential ones.

> Buffers evicted from T1 are remembered in B1, and because of that even
> repeated sequential scans of the same large relation will only cycle
> through T1 blocks, never cause any turbulence in T2 or B2.

If we have a situation where a single backend makes repeated scans of the same table, these will be sequential and will have no effect on T1.

In a DW situation, you are likely to have one or more very popular large tables (maybe think of this as the "Fact table", if you have a dimensional design). The tables are large and therefore query execution times will be extended (and accepted by the user). In this situation it is very likely that:
i) a single user/app submits multiple requests from other windows/threads,
or simply,
ii) multiple users access the popular table.

The common effect will be concurrent, rather than sequential, access to the popular table.

Different SQL statements will have different plans and will perform scans of the same table at different rates because of other joins, more complex WHERE clauses etc. Like waves at a beach moving at different rates. Every time one scan catches up with another, it will cause T1 hits for almost the whole of the T1 list, promoting all of these blocks to the top of the T2 MRU and thus spoiling the cache - if it hits one it will probably hit most of them. This will not happen ALL the time, but I don't want it to happen EVER. Even in a DW situation, I still want to be inserting data regularly (that's how the table got big!), so I have index blocks and other stuff that I want almost permanently in memory. Concurrent access via an index might have the same effect, though less dramatically.

The closer the size of a table is to C, the greater the likelihood that these spurious cache hits will occur. (Of course, it might be argued that these are worthwhile and genuine cache hits - I argue that they are not wanted and this is the main basis of my discussion.) Of course, if a table does fit in memory then that is very good. If a table were, say, 2*C, then spurious cache hits would occur often and spoil the whole of T2.

The DBT-3 workload is very similar to the TPC-D/TPC-H workload. The test consists of a power test (all queries sequentially in random order) and a throughput test (1 or more concurrent streams, each stream executing all queries in a random order). When this benchmark first came out, most vendors chose to perform the throughput test with only 1 stream (though with parallel processing)... I would say one reason for this is poor cache management... hence recent changes in various commercial products.
[HACKERS] what does it mean
Hi,

I've upgraded my production system to 7.4.1 yesterday, and just discovered this log message: "statistic buffer is full". What does it mean? Filling up too fast, no more stats, need to grow? What should I do?

Also, many thanks for this great version of PostgreSQL, keep going...

Regards
--
Olivier PRENANT              Tel: +33-5-61-50-97-00 (Work)
6, Chemin d'Harraud Turrou        +33-5-61-50-97-01 (Fax)
31190 AUTERIVE                    +33-6-07-63-80-64 (GSM)
FRANCE                       Email: [EMAIL PROTECTED]
--
Make your life a dream, make your dream a reality. (St Exupery)
[HACKERS] pg_dump and CHECK constraints
I notice that pg_dump is still dumping CHECK constraints with the table, rather than at the very end, as it does with all the other constraints. As discussed in bug report #787, at
http://archives.postgresql.org/pgsql-bugs/2002-09/msg00278.php
this breaks your restore if your CHECK constraint uses a user-defined function, as in the sketch below.

1. Does anybody have any plans to fix this in the very near future?

2. If not, is there something that makes it particularly hard to fix? I notice, Tom, that in your reply to that bug report you intimated that this wasn't an easy fix, but I don't see why CHECK constraints couldn't be added at the end of the dump, just as all the other constraints are. Presumably, your message being late 2002, this was before pg_dump was modified to re-order stuff?

3. If we created a patch for this at my work, would it be accepted? I dunno... this looks really easy to me.

cjs
--
Curt Sampson <[EMAIL PROTECTED]> +81 90 7737 2974 http://www.NetBSD.org
Don't you know, in this new Dark Age, we're all light. --XTC
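[A minimal sketch of the failure mode and the proposed ordering. The table, function, and constraint names here are invented for illustration; they are not from the bug report. A dump that emits the CHECK inline in CREATE TABLE can fail at restore time because the user-defined function it calls is defined later in the dump; emitting it as a trailing ALTER TABLE, the way other constraints are already dumped, restores cleanly.]

-- Inline form, as pg_dump currently emits it; breaks on restore if
-- my_check() has not been created yet:
--   CREATE TABLE t (v integer CHECK (my_check(v)));

-- Reordered form: table, then function, then data, then constraint.
CREATE TABLE t (v integer);
CREATE FUNCTION my_check(integer) RETURNS boolean
    AS 'SELECT $1 > 0' LANGUAGE sql;
-- ... table data would be restored here (COPY t FROM stdin; ...) ...
ALTER TABLE t ADD CONSTRAINT t_v_check CHECK (my_check(v));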
Re: [HACKERS] Corrupted db?
Michael Brusser <[EMAIL PROTECTED]> writes:
> Looking at the database (v7.3.2 on Solaris) I'm puzzled by this:
> syncdb=# \d
> ERROR:  Cache lookup failed for relation 17075

You might try reindexing the indexes on pg_class (particularly the one on pg_class.oid). See the REINDEX man page for procedural details.

regards, tom lane
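[For readers following along, a rough sketch of what that procedure looks like; the exact invocation is on the REINDEX man page for this release, and the flags shown here are an assumption based on the 7.3-era docs (reindexing shared system catalogs required a standalone backend with system-index use disabled). The database name is from the original report.]

-- Shut down the postmaster, then start a standalone backend on the
-- damaged database, roughly:  postgres -D $PGDATA -O -P syncdb
-- (-P ignores system indexes while reading; -O allows system-table changes)
REINDEX TABLE pg_class;
-- then exit the standalone backend and restart the postmaster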
Re: [HACKERS] cache control?
Simon Riggs wrote:
> Jan,
> [...]
> My thoughts are about multiple concurrent accesses, specifically FTS on
> large tables, rather than sequential ones.

Single or multiple backends is irrelevant here, because a data block only exists once and therefore we have only one shared buffer cache.

>> Buffers evicted from T1 are remembered in B1, and because of that even
>> repeated sequential scans of the same large relation will only cycle
>> through T1 blocks, never cause any turbulence in T2 or B2.
>
> If we have a situation where a single backend makes repeated scans of
> the same table, these will be sequential and will have no effect on T1.

You really have to look at this a bit more globally, not table-related. The strategy of ARC is this: in an unknown access pattern, if a specific block is accessed less frequently than every C requests, it will only go into T1, age, and get evicted; the CDB moves to B1, gets removed from that, and is forgotten. Every block that is accessed more frequently than every C requests will still be, after its last access, in one of the four queues of the directory and will immediately go into T2.

The adjustment of the target T1 size is an attempt to catch as many newcomers as possible. If an application does many inserts, it will access new blocks very soon again, so that a small T1 is sufficient to hold them in memory until their next access, where they move into T2. For an application that does non-uniform random access to blocks (there are always bestsellers and less frequently asked items), a larger T1 might better satisfy that access pattern.

> In a DW situation, you are likely to have one or more very popular large
> tables (maybe think of this as the "Fact table", if you have a
> dimensional design). The tables are large and therefore query execution
> times will be extended (and accepted by the user). In this situation it
> is very likely that:
> i) a single user/app submits multiple requests from other
> windows/threads,
> or simply,
> ii) multiple users access the popular table.

If that causes its blocks to be requested more frequently than every C lookups, then they belong in T2.

> The common effect will be concurrent, rather than sequential, access to
> the popular table. Different SQL statements will have different plans
> and will perform scans of the same table at different rates because of
> other joins, more complex WHERE clauses etc. Like waves at a beach
> moving at different rates. Every time one scan catches up with another,
> it will cause T1 hits for almost the whole of the T1 list, promoting all
> of these blocks to the top of the T2 MRU and thus spoiling the cache -
> if it hits one it will probably hit most of them. This will not happen
> ALL the time, but I don't want it to happen EVER. Even in a DW
> situation, I still want to be inserting data regularly (that's how the
> table got big!), so I have index blocks and other stuff that I want
> almost permanently in memory. Concurrent access via an index might have
> the same effect, though less dramatically. The closer the size of a
> table is to C, the greater the likelihood that these spurious cache hits
> will occur. (Of course, it might be argued that these are worthwhile and
> genuine cache hits - I argue that they are not wanted and this is the
> main basis of my discussion.) Of course, if a table does fit in memory
> then that is very good. If a table were, say, 2*C, then spurious cache
> hits would occur often and spoil the whole of T2.

How can any generic algorithm ever sense that when the application is accessing the same blocks multiple times, it should NOT cache them? Are you asking for fine-grained tuning of cache priorities and behaviour on a per-table basis?

> The DBT-3 workload is very similar to the TPC-D/TPC-H workload. The test
> consists of a power test (all queries sequentially in random order) and
> a throughput test (1 or more concurrent streams, each stream executing
> all queries in a random order). When this benchmark first came out, most
> vendors chose to perform the throughput test with only 1 stream (though
> with parallel processing)... I would say one reason for this is poor
> cache management... hence recent changes in various commercial products.
>
> In summary, I believe there is a reasonably common effect in DW
> situations where concurrent query access to large and popular tables
> will result in undesirable cache spoiling. This effect will still occur
> even after the ARC improvements are introduced - though in every other
> case I can think of, the ARC code is a major improvement on earlier
> strategies and should be hailed as a major improvement in automatic
> performance adaptation.
>
> There are two solution ideas:
> i) change the code so that FTS on large tables use the "no cache"
> strategy that has already been developed to support Vacuum.
> ii) more complex: synchronise the FTS of the large table so that all
> backends that want scans produce only one set of I/Os and they share the
> block many times (yet still don't put it in cache!). FTS don't start at
> "the beginning" every time, they start at whatever point the
> already-running scan has reached.
Re: [HACKERS] what does it mean
[EMAIL PROTECTED] writes:
> I've upgraded my production system to 7.4.1 yesterday, and just discovered
> this log message: statistic buffer is full.

If you see this a lot, it might be worth increasing PGSTAT_RECVBUFFERSZ.

regards, tom lane
Re: [HACKERS] Named arguments in function calls
Rod Taylor wrote:
> If that was IS, then foo(x is 13) makes sense.

I like that syntax. For example

select interest(amount is 500.00, rate is 1.3)

is very readable, yet brief.
Re: [HACKERS] cache control?
Jan,

I think we should suspend further discussion for now... in summary:

ARC buffer management is an important new performance feature for 7.5; the implementation is a good one and should have positive benefit for everybody's workload. ARC will adapt to a variety of situations and has been designed to allow Vacuum to avoid interfering with user applications. That's the important bit.

The implementation notes are detailed; I've read them a few times to ensure I've got it straight. I am confident that the situation I described CAN exist with regard to multiple concurrent queries performing full table scans upon a single large table. Further debate on that point is continuing because of my poor explanation of that situation; forgive me.

Thanks very much for your further explanations and examples. I will take a more practical tack on this now: providing evidence of a real query mix that exhibits the described properties and quantifying the effects and their frequency. If it IS worth it, and I accept that it may not be, I'll have a hack at the very specialised improvement I was suggesting, for very specific workload types.

Best Regards, Simon Riggs
Re: [HACKERS] Named arguments in function calls
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> Rod Taylor wrote:
>> If that was IS, then foo(x is 13) makes sense.

> I like that syntax. For example
> select interest(amount is 500.00, rate is 1.3)
> is very readable, yet brief.

Yes, that does read well. And "IS" is already a keyword. We might have to promote it from func_name_keyword to fully reserved status, but that doesn't seem like a big loss. I could go with this. (We still need to check SQL200x though ...)

regards, tom lane
Re: [HACKERS] Named arguments in function calls
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> Rod Taylor wrote:
>> If that was IS, then foo(x is 13) makes sense.

> I like that syntax. For example
> select interest(amount is 500.00, rate is 1.3)
> is very readable, yet brief.

On second thought though, it doesn't work.

select func(x is null);

is ambiguous, especially if func() accepts boolean.

regards, tom lane
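[A hedged illustration of the ambiguity Tom describes; the function, table, and column names are invented. The same token sequence can be read either as the boolean expression (x IS NULL) passed positionally, or, under the proposed syntax, as a named argument x being set to NULL.]

-- Hypothetical one-argument function that accepts a boolean:
CREATE FUNCTION func(boolean) RETURNS boolean
    AS 'SELECT $1' LANGUAGE sql;

CREATE TABLE demo (x integer);
INSERT INTO demo VALUES (NULL);

-- Today this can only mean the boolean expression (x IS NULL)
-- evaluated against the column x, i.e. func(true) for this row:
SELECT func(x IS NULL) FROM demo;

-- With "IS" doubling as a named-argument marker, the identical text
-- could also be read as passing NULL to a parameter named x, and the
-- parser has no way to tell the two readings apart.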
Re: [HACKERS] what does it mean
Thanks for replying, Tom.

How much is a lot? It occurred ~30 times since 23:00 (it's 17:48 now).

On Mon, 26 Jan 2004, Tom Lane wrote:
> Date: Mon, 26 Jan 2004 09:48:03 -0500
> From: Tom Lane <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Cc: pgsql-hackers list <[EMAIL PROTECTED]>
> Subject: Re: [HACKERS] what does it mean
>
> [EMAIL PROTECTED] writes:
> > I've upgraded my production system to 7.4.1 yesterday, and just discovered
> > this log message: statistic buffer is full.
>
> If you see this a lot, it might be worth increasing PGSTAT_RECVBUFFERSZ.
>
> regards, tom lane

--
Olivier PRENANT              Tel: +33-5-61-50-97-00 (Work)
6, Chemin d'Harraud Turrou        +33-5-61-50-97-01 (Fax)
31190 AUTERIVE                    +33-6-07-63-80-64 (GSM)
FRANCE                       Email: [EMAIL PROTECTED]
--
Make your life a dream, make your dream a reality. (St Exupery)
Re: [HACKERS] returning PGresult as xml
Peter Eisentraut wrote:
> Let me point out an implementation I made last time this subject was
> discussed:
>
> http://developer.postgresql.org/~petere/xmltable.tar.bz2
>
> This package contains server-side functions that convert a table (more
> generally, a query result) to an XML document and/or an XML Schema, both
> mimicking the SQL/XML standard. Additionally, it contains a function to
> convert such an XML document back to a table source. I also threw in an
> XSLT stylesheet to convert an SQL/XML table to an HTML table, so you can
> more easily view the results. I also have some code in development that
> adds cursor interfaces, an XML data type, and some integration with the
> existing XPath functionality. I think that for processing XML in the
> database, and as far as following the existing standards, this is the
> direction to take.

Peter: this looks very nice. What are your intentions with this code? Put it in contrib? Also, do you intend to implement the SQL/XML functions XMLElement, XMLForest, XMLAttributes, XMLConcat and XMLAgg?

cheers

andrew
[HACKERS] Functions returning complex types.
I'm trying to use a function that returns a complex type. I have no problem creating the function, but when I try to use it I get the message:

ERROR: function in FROM has unsupported return type

Apparently, this message stems from the parser. Changing the function so that it returns a SETOF the same complex type works fine. Is this the expected behavior? Is SETOF a requirement when using complex types?

Regards,

- thomas
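[A minimal sketch of the two cases being contrasted; the type and function names are invented, not from the report. Per Tom's reply further down, the plain composite-return case is expected to work, so an error here points at something else being wrong.]

CREATE TYPE mytype AS (f1 integer, f2 integer);

-- Reported to fail in FROM for the poster:
CREATE FUNCTION one_row() RETURNS mytype
    AS 'SELECT 1, 2' LANGUAGE sql;
SELECT * FROM one_row();

-- Reported to work:
CREATE FUNCTION many_rows() RETURNS SETOF mytype
    AS 'SELECT 1, 2' LANGUAGE sql;
SELECT * FROM many_rows();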
Re: [HACKERS] what does it mean
[EMAIL PROTECTED] writes:
>>> I've upgraded my production system to 7.4.1 yesterday, and just discovered
>>> this log message: statistic buffer is full.
>>
>> If you see this a lot, it might be worth increasing PGSTAT_RECVBUFFERSZ.
>
> How much is a lot? It occurred ~30 times since 23:00 (it's 17:48 now).

Hm. I wonder whether something is interfering with the stats collection code on your system? I can't recall anyone else reporting seeing this log message, so I'm not convinced that PGSTAT_RECVBUFFERSZ is too small.

What platform are you using, again?

regards, tom lane
Re: [pgsql-hackers-win32] [HACKERS] What's left?
Dann Corbit wrote:
> I may be able to help on the localization and path stuff. We have
> solved those issues for our port of 7.1.3, and I expect the work for 7.5
> to be extremely similar.
>
> Where can I get the latest tarball for Win32 development?

CVS HEAD now has all the Win32 work.

--
Bruce Momjian                   | http://candle.pha.pa.us
[EMAIL PROTECTED]               | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup.    | Newtown Square, Pennsylvania 19073
Re: [HACKERS] what does it mean
--On Monday, January 26, 2004 12:11:19 -0500 Tom Lane <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] writes:
>>>> I've upgraded my production system to 7.4.1 yesterday, and just
>>>> discovered this log message: statistic buffer is full.
>>>
>>> If you see this a lot, it might be worth increasing PGSTAT_RECVBUFFERSZ.
>>
>> How much is a lot? It occurred ~30 times since 23:00 (it's 17:48 now).
>
> Hm. I wonder whether something is interfering with the stats collection
> code on your system? I can't recall anyone else reporting seeing this
> log message, so I'm not convinced that PGSTAT_RECVBUFFERSZ is too small.
>
> What platform are you using, again?

UnixWare 7.1.3

LER

> regards, tom lane

--
Larry Rosenman  http://www.lerctr.org/~ler
Phone: +1 972-414-9812  E-Mail: [EMAIL PROTECTED]
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749
Re: 7.5 change documentation (was Re: [HACKERS] cache control?)
Tom Lane wrote:
> "Simon Riggs" <[EMAIL PROTECTED]> writes:
> > If the TODO-list-with-dash isn't the correct place to have looked, is
> > there another list of committed changes for the next release?
>
> We tend to rely on the CVS commit logs as the definitive source. You
> can pull the info from the CVS server (I use cvs2cl.pl to format the
> results nicely), or read the archives of pgsql-committers.
>
> In theory there should be a section at the head of release.sgml
> mentioning the major changes done-so-far, but for various reasons this
> hasn't gotten installed in the 7.5 branch yet. (Look at the CVS
> versions during 7.4 development to see how we did it last time.)
>
> As far as the ARC change goes, I believe Jan still considers it a
> work-in-progress, so it may not be appropriate to list yet anyway.
> (Jan, where are you on that exactly?)
>
> > Do we need such a list? (I'd be happy to compile and maintain this if
> > it is agreed that it is a good idea to have such a document or process
> > as separate from TODO - I'll be doing this anyway before I pass further
> > comments!)
>
> If you wanted to go through the existing 7.5 commits and write up a
> new done-so-far section, it'd save someone else (like me or Bruce)
> from having to do it sometime soon ...

Doesn't Robert Treat's News Bits list all the major changes weekly? That would be a good source.

--
Bruce Momjian                   | http://candle.pha.pa.us
[EMAIL PROTECTED]               | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup.    | Newtown Square, Pennsylvania 19073
Re: [HACKERS] Functions returning complex types.
"Thomas Hallgren" <[EMAIL PROTECTED]> writes: > I'm trying to use a function that returns a complex type. I have no problem > creating the function but when I try to use it I get the message: > ERROR: function in FROM has unsupported return type AFAICS it's not possible to get that message for a function returning a composite type. You'd better show exactly what you did. regards, tom lane ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] what does it mean
On Mon, 26 Jan 2004, Tom Lane wrote:
> Date: Mon, 26 Jan 2004 12:11:19 -0500
> From: Tom Lane <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Cc: pgsql-hackers list <[EMAIL PROTECTED]>,
>     Jan Wieck <[EMAIL PROTECTED]>
> Subject: Re: [HACKERS] what does it mean
>
> [EMAIL PROTECTED] writes:
> >>> I've upgraded my production system to 7.4.1 yesterday, and just
> >>> discovered this log message: statistic buffer is full.
> >>
> >> If you see this a lot, it might be worth increasing PGSTAT_RECVBUFFERSZ.
> >
> > How much is a lot? It occurred ~30 times since 23:00 (it's 17:48 now).
>
> Hm. I wonder whether something is interfering with the stats collection
> code on your system? I can't recall anyone else reporting seeing this
> log message, so I'm not convinced that PGSTAT_RECVBUFFERSZ is too small.
>
> What platform are you using, again?

UnixWare 7.1.3 UP3

Just noticed now, it also happened on 7.3.5 (the log message was different...). Do you need something to analyze it?

> regards, tom lane

--
Olivier PRENANT              Tel: +33-5-61-50-97-00 (Work)
6, Chemin d'Harraud Turrou        +33-5-61-50-97-01 (Fax)
31190 AUTERIVE                    +33-6-07-63-80-64 (GSM)
FRANCE                       Email: [EMAIL PROTECTED]
--
Make your life a dream, make your dream a reality. (St Exupery)
Re: [HACKERS] 7.5 change documentation
Tom Lane wrote:
> Neil Conway <[EMAIL PROTECTED]> writes:
> It's fine with me if Bruce prefers to build the release notes directly
> from the change logs. As I saw it, the purpose of the temporary list of
> things-done-so-far is not to be the raw material for the release notes.
> It's to let alpha testers know about major changes that they might want
> to test. As such, it's fine that it's incomplete.
>
> The other way we could handle this goal is to be a tad more vigorous about
> checking off items as "done" in the TODO list. However, Bruce generally
> doesn't bother to make a new entry in the TODO list if someone does
> something that wasn't in the list to begin with, and so I'm not sure
> it's the right vehicle.

Right. I see TODO as a way for us to remember our limitations, and to document them for our users. Once an item is completed, it didn't seem necessary to put it on the TODO list.

--
Bruce Momjian                   | http://candle.pha.pa.us
[EMAIL PROTECTED]               | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup.    | Newtown Square, Pennsylvania 19073
Re: [HACKERS] Named arguments in function calls
On Mon, 26 Jan 2004, Tom Lane wrote:
>>> If that was IS, then foo(x is 13) makes sense.
>
>> I like that syntax. For example
>> select interest(amount is 500.00, rate is 1.3)
>> is very readable, yet brief.
>
> On second thought though, it doesn't work.
>
> select func(x is null);
>
> is ambiguous, especially if func() accepts boolean.

You're unlikely to care, but Oracle's syntax is Perlish:

select interest(amount => 500.0, rate => 1.3);

That'd be ambiguous again, though. Perhaps:

select interest(amount := 500.0, rate := 1.3);

?

Matthew.
Re: [HACKERS] Disaster!
Tom Lane wrote:
> I said:
> > If there wasn't disk space enough to hold the clog page, the checkpoint
> > attempt should have failed. So it may be that allowing a short read in
> > slru.c would be patching the symptom of a bug that is really elsewhere.
>
> After more staring at the code, I have a theory. SlruPhysicalWritePage
> and SlruPhysicalReadPage are coded on the assumption that close() can
> never return any interesting failure. However, it now occurs to me that
> there are some filesystem implementations wherein ENOSPC could be
> returned at close() rather than the preceding write(). (For instance,
> the HPUX man page for close() states that this never happens on local
> filesystems but can happen on NFS.) So it'd be possible for
> SlruPhysicalWritePage to think it had successfully written a page when
> it hadn't. This would allow a checkpoint to complete :-(
>
> Chris, what's your platform exactly, and what kind of filesystem are
> you storing pg_clog on?

We already have a TODO on fclose():

* Add checks for fclose() failure

--
Bruce Momjian                   | http://candle.pha.pa.us
[EMAIL PROTECTED]               | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup.    | Newtown Square, Pennsylvania 19073
Re: [HACKERS] Disaster!
Excellent analysis. Thanks. Are there any other cases like this?

---------------------------------------------------------------------------

Tom Lane wrote:
> Okay ... Chris was kind enough to let me examine the WAL logs and
> postmaster stderr log for his recent problem, and I believe that
> I have now achieved a full understanding of what happened. The true
> bug is indeed somewhere else than slru.c, and we would not have found
> it if slru.c had had less-paranoid error checking.
>
> The WAL log shows that checkpoints were happening every five minutes
> up to 2004-01-23 10:13:10, but no checkpoint completion record appears
> after that. However, the system remained up, with plenty of activity,
> until 10:45:24, when it was finally taken down by a panic. The last
> transaction commit records in the WAL log are
>
> commit: 14286807 at 2004-01-23 10:45:23
> commit: 14286811 at 2004-01-23 10:45:24
> commit: 14286814 at 2004-01-23 10:45:24
> commit: 14286824 at 2004-01-23 10:45:24
> commit: 14286825 at 2004-01-23 10:45:24
> commit: 14286836 at 2004-01-23 10:45:24
> commit: 14286838 at 2004-01-23 10:45:24
> commit: 14286850 at 2004-01-23 10:45:24
> commit: 14286851 at 2004-01-23 10:45:24
>
> Over in the postmaster log, the first sign of trouble is
>
> Jan 23 10:18:07 canaveral postgres[20039]: [879-1] LOG: could not close temporary statistics file "/usr/local/pgsql/data/global/pgstat.tmp.20035": No space left on device
>
> and there is a steady stream of transactions failing with out-of-space
> errors over the next half hour, but none of the failures are worse than
> a transaction abort. Finally we see
>
> Jan 23 10:45:24 canaveral postgres[57237]: [17-1] ERROR: could not access status of transaction 0
> Jan 23 10:45:24 canaveral postgres[57237]: [17-2] DETAIL: could not write to file "/usr/local/pgsql/data/pg_clog/000D" at offset 147456: No space left on device
> Jan 23 10:45:24 canaveral postgres[57237]: [18-1] WARNING: AbortTransaction and not in in-progress state
> Jan 23 10:45:24 canaveral postgres[57237]: [19-1] PANIC: could not access status of transaction 0
> Jan 23 10:45:24 canaveral postgres[57237]: [19-2] DETAIL: could not write to file "/usr/local/pgsql/data/pg_clog/000D" at offset 147456: No space left on device
> Jan 23 10:45:24 canaveral postgres[20035]: [5-1] LOG: server process (PID 57237) was terminated by signal 6
> Jan 23 10:45:24 canaveral postgres[20035]: [6-1] LOG: terminating any other active server processes
>
> after which the postmaster's recovery attempts fail, as Chris already
> detailed. (Note: the reference to "transaction 0" is not significant;
> that just happens because SimpleLruWritePage doesn't have a specific
> transaction number to blame its write failures on.)
>
> Those are the observed facts; what's the interpretation? I think it
> shows that Postgres is pretty darn robust, actually. We were able to
> stay up and do useful work for quite a long time with zero free space;
> what's more, we lost no transactions that were successfully committed.
> The data was successfully stored in preallocated WAL space. (If things
> had gone on this way for awhile longer, we would have panicked for lack
> of WAL space, but Chris was actually not anywhere near there; he'd only
> filled about two WAL segments in the half hour of operations.) Note
> also that checkpoints were attempted several times during that interval,
> and they all failed gracefully --- no panic, no incorrect WAL update.
>
> But why did this panic finally happen? The key observation is that
> the first nonexistent page of pg_clog was the page beginning with
> transaction 14286848. Neither this xact nor the following one have any
> commit or abort record in WAL, but we do see entries for 14286850 and
> 14286851. It is also notable that there is no WAL entry for extension
> of pg_clog to include this page --- normally a WAL entry is made each
> time a page of zeroes is added to pg_clog. My interpretation of the
> sequence of events is:
>
> Transaction 14286848 started, and since it was the first for its pg_clog
> page, it tried to do ZeroCLOGPage() for that page (see ExtendCLOG). This
> required making room in the in-memory clog buffers, which required
> dumping one of the previously-buffered clog pages, which failed for lack
> of disk space, leading to this log entry:
>
> Jan 23 10:45:24 canaveral postgres[57237]: [17-1] ERROR: could not access status of transaction 0
> Jan 23 10:45:24 canaveral postgres[57237]: [17-2] DETAIL: could not write to file "/usr/local/pgsql/data/pg_clog/000D" at offset 147456: No space left on device
> Jan 23 10:45:24 canaveral postgres[57237]: [18-1] WARNING: AbortTransaction and not in in-progress state
>
> (Note: page offset 147456 is the page two before the one containing xid
> 14286848. This page had been allocated in clog buffers but never yet
> successfully written to disk. Ditto for the page in between.)
Re: [HACKERS] What's planned for 7.5?
Christopher Kings-Lynne wrote:
>> - COMMENT ON [ CAST | CONVERSION | OPERATOR CLASS | LARGE OBJECT |
>>   LANGUAGE ] (Christopher)
>
> Hey Bruce,
>
> You probably should add 'Dump LOB comments in custom dump format' to the
> todo. That's the last part of that task above which I haven't done yet,
> and for various reasons probably won't have time to try for a while.
> Just so we don't forget it.

Added:

* Dump large object comments in custom dump format

--
Bruce Momjian                   | http://candle.pha.pa.us
[EMAIL PROTECTED]               | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup.    | Newtown Square, Pennsylvania 19073
Re: [HACKERS] Disaster!
On Mon, Jan 26, 2004 at 02:52:58PM +0900, Michael Glaesemann wrote:
> I don't know if the 'canaveral' prompt had anything to do with it
> (maybe it was just the subject line), but I kept thinking of shuttle
> disasters, o-rings, and plane crashes reading through this. I won't
> claim to understand everything in huge detail, but from this newbie's
> point of view, well explained! I enjoyed reading it.

Just for the record, the Canaveral you are thinking about is derived from the Spanish word "Cañaveral", which is a place where "cañas" grow (canes or stems, according to my dictionary; some sort of vegetal life form, anyway). I suppose Cape Kennedy was filled with those plants and that's what the name comes from. I dunno if Chris' machine's name derives from that or not; Merriam-Webster does not list any other meaning for that word.

--
Alvaro Herrera ()
"The Gord often wonders why people threaten never to come back after
they've been told never to return" (www.actsofgord.com)
Re: [HACKERS] Named arguments in function calls
Matthew Kirkwood <[EMAIL PROTECTED]> writes:
> ... Perhaps:
> select interest(amount := 500.0, rate := 1.3);

That might work, since := isn't a legal operator name. It might pose a conflict for clients like ECPG that like to use ":name" as a parameter indicator, but since we don't have an identifier directly following ":" it seems like they could cope.

regards, tom lane
Re: [HACKERS] Functions returning complex types.
The unsupported return type was all my fault. The Form_pg_type typrelid attribute points to the class of the relation, not the relation as such. Duh...

But now, when I actually can return complex types, I encounter another problem. It happens when I pass a complex type returned from one function as a complex parameter to another function, i.e. something like:

SELECT printMyComplexType(obtainMyComplexType());

Some research shows that the TupleTableSlot* that I create in obtainMyComplexType() using the following code:

TupleDesc tupleDesc = TypeGetTupleDesc(typeId, NIL);
TupleTableSlot* slot = TupleDescGetSlot(tupleDesc);
return TupleGetDatum(slot, tuple);

is exactly the same TupleTableSlot* that is passed into my printMyComplexType function. This is of course extremely bad, since the MemoryContext where it was allocated has gone out of scope (I guess, since this is another call). AFAICS, the way I do it is by the book. What am I doing wrong?

Regards,
Thomas Hallgren

"Tom Lane" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> "Thomas Hallgren" <[EMAIL PROTECTED]> writes:
> > I'm trying to use a function that returns a complex type. I have no
> > problem creating the function but when I try to use it I get the message:
>
> > ERROR: function in FROM has unsupported return type
>
> AFAICS it's not possible to get that message for a function returning a
> composite type. You'd better show exactly what you did.
>
> regards, tom lane
Re: [HACKERS] Functions returning complex types.
"Thomas Hallgren" <[EMAIL PROTECTED]> writes: > ... exactly the same TupleTableSlot* that is passed into my > printMyComplextType function. This is of course extremely bad since the > MemoryContext where it was allocated has gone out of scope (I guess, since > this is another call). I don't think so; unless you are hacking memory contexts internally to your function. Here's some empirical proof that the function call mechanism is not broken: regression=# create type mytype as (f1 int ,f2 int); CREATE TYPE regression=# create function obtaintype(int,int) returns mytype as regression-# 'select $1,$2' language sql; CREATE FUNCTION regression=# select * from obtaintype(1,2); f1 | f2 + 1 | 2 (1 row) regression=# create function usetype(mytype) returns int as regression-# 'select $1.f1 + $1.f2' language sql; CREATE FUNCTION regression=# select usetype(obtaintype(1,2)); usetype - 3 (1 row) regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Log rotation for pg_autovacuum
Seems posting to this list from the office didn't work...

The patch is attached as requested. This is just a quick hack, written to do what I needed at the time... consider it just a starting point for further work. I've tested it on Solaris 9 with Sun's compiler (Sun Studio 8 Compilers) but unfortunately don't have the time or facilities for more extensive tests on other OS's.

Cheers,
Mark

On Thu, 15 Jan 2004, Matthew T. O'Connor wrote:
> Christopher Kings-Lynne wrote:
> > What's the best way to do log rolling with pg_autovacuum? It doesn't
> > seem to have any syslog options, etc. Using 'tee' maybe?
>
> I got an email from Mark Hollow saying that he had implemented a syslog
> patch for pg_autovacuum. Don't know how good it is, but it might be a
> start.
>
> Mark, can you post it to the list?
>
> Matthew

Index: pg_autovacuum.c
===================================================================
RCS file: /projects/cvsroot/pgsql-server/contrib/pg_autovacuum/pg_autovacuum.c,v
retrieving revision 1.13
diff -r1.13 pg_autovacuum.c
4a5
> * Basic syslog facility added by Mark Hollow <[EMAIL PROTECTED]>
12a14,27
> /* logging macros */
> #define LOG(M, F)   log_entry(M, LOG_INFO, F)
> #define DEBUG(M, F) log_entry(M, LOG_DEBUG, F)
>
> /* log_entry: if fl is true & we're logging to a file (not syslog)
>  * then flush the output buffer.
>  */
> void
> log_entry(const char *logentry, int level, int fl)
> {
>     if (args->logtype == LOGTYPE_FILE) log_file(logentry, fl);
>     if (args->logtype == LOGTYPE_SYSLOG) log_syslog(logentry, level);
> }
>
14c29
< log_entry(const char *logentry)
---
> log_file(const char *logentry, int fl)
23a39,48
>
>     /* Flush log? */
>     if (fl) fflush(LOGOUTPUT);
> }
>
> /* log to syslog */
> void
> log_syslog(const char *logentry, int level)
> {
>     syslog(level, "%s", logentry);
37,38c62
<     log_entry("Error: cannot disassociate from controlling TTY");
<     fflush(LOGOUTPUT);
---
>     LOG("Error: cannot disassociate from controlling TTY", 1);
52,53c76
<     log_entry("Error: cannot disassociate from controlling TTY");
<     fflush(LOGOUTPUT);
---
>     LOG("Error: cannot disassociate from controlling TTY", 1);
68,69c91
<     log_entry("init_table_info: Cannot get memory");
<     fflush(LOGOUTPUT);
---
>     LOG("init_table_info: Cannot get memory", 1);
82,83c104
<     log_entry("init_table_info: malloc failed on new_tbl->schema_name");
<     fflush(LOGOUTPUT);
---
>     LOG("init_table_info: malloc failed on new_tbl->schema_name", 1);
94,95c115
<     log_entry("init_table_info: malloc failed on new_tbl->table_name");
<     fflush(LOGOUTPUT);
---
>     LOG("init_table_info: malloc failed on new_tbl->table_name", 1);
285c305
<     log_entry(logbuffer);
---
>     DEBUG(logbuffer, 0);
289c309
<     fflush(LOGOUTPUT);
---
>     if (LOGOUTPUT != NULL) fflush(LOGOUTPUT);
308,309c328
<     log_entry(logbuffer);
<     fflush(LOGOUTPUT);
---
>     DEBUG(logbuffer, 1);
363c382
<     log_entry(logbuffer);
---
>     DEBUG(logbuffer, 0);
365c384
<     log_entry(logbuffer);
---
>     DEBUG(logbuffer, 0);
367c386
<     log_entry(logbuffer);
---
>     DEBUG(logbuffer, 0);
370c389
<     log_entry(logbuffer);
---
>     DEBUG(logbuffer, 0);
373c392
<     log_entry(logbuffer);
---
>     DEBUG(logbuffer, 0);
376,377c395
<     log_entry(logbuffer);
<     fflush(LOGOUTPUT);
---
>     DEBUG(logbuffer, 1);
395,396c413
<     log_entry("init_db_list(): Error creating db_list for db: template1.");
<     fflush(LOGOUTPUT);
---
>     LOG("init_db_list(): Error creating db_list for db: template1.", 1);
470,471c487
<     log_entry("updating the database list");
<     fflush(LOGOUTPUT);
---
>     LOG("updating the database list", 1);
556c572
<     log_entry(logbuffer);
---
>     DEBUG(logbuffer, 0);
560c576
<     fflush(LOGOUTPUT);
---
>     if (LOGOUTPUT != NULL) fflush(LOGOUTPUT);
614,615c630
<     log_entry(logbuffer);
<     fflush(LOGOUTPUT);
---
>     DEBUG(logbuffer, 1);
683c698
<     log_entry(logbuffer);
---
>     LOG(logbuffer, 0);
686c701
<     log_entry(logbuffer);
---
>     LOG(logbuffer, 0);
688c703
<     log_entry(" conn is valid, we are connected");
---
>     LOG(" conn is valid, we are connected", 1);
690c705
<     log_entry(" conn is null, we are not connected.");
---
>     LOG(" conn is null, we are not connected.", 1);
Re: [HACKERS] Functions returning complex types.
I found the following piece of code in the plpgsql pl_comp.c module:

/*
 * This is a bit ugly --- need a permanent copy of the rel's tupdesc.
 * Someday all these mallocs should go away in favor of a per-function
 * memory context ...
 */
oldcxt = MemoryContextSwitchTo(TopMemoryContext);
row->rowtupdesc = CreateTupleDescCopy(RelationGetDescr(rel));
MemoryContextSwitchTo(oldcxt);

My guess is that the SQL functions do something similar (pre-compile and store all TupleDescs in TopMemoryContext), hence there's no problem with your example.

Writing a C function, I get a TupleDesc from the TypeGetTupleDesc() function that has been allocated using palloc(). I'm quite sure I don't do anything with memory contexts, and the TupleDesc really seems to go out of scope when the function returns. If I create a copy of the TupleDesc in the TopMemoryContext directly after the call to TypeGetTupleDesc, then everything works just fine.

I still suspect that something is broken with the calling mechanism, at least for languages that do not compile the stuff into permanent structures prior to evaluating.

Regards,
Thomas Hallgren

"Tom Lane" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> "Thomas Hallgren" <[EMAIL PROTECTED]> writes:
> > ... exactly the same TupleTableSlot* that is passed into my
> > printMyComplexType function. This is of course extremely bad since the
> > MemoryContext where it was allocated has gone out of scope (I guess,
> > since this is another call).
>
> I don't think so; unless you are hacking memory contexts internally to
> your function. Here's some empirical proof that the function call
> mechanism is not broken:
>
> regression=# create type mytype as (f1 int, f2 int);
> CREATE TYPE
> regression=# create function obtaintype(int,int) returns mytype as
> regression-# 'select $1,$2' language sql;
> CREATE FUNCTION
> regression=# select * from obtaintype(1,2);
>  f1 | f2
> ----+----
>   1 |  2
> (1 row)
>
> regression=# create function usetype(mytype) returns int as
> regression-# 'select $1.f1 + $1.f2' language sql;
> CREATE FUNCTION
> regression=# select usetype(obtaintype(1,2));
>  usetype
> ---------
>        3
> (1 row)
>
> regards, tom lane
Re: [HACKERS] Disaster!
> Just for the record, the Canaveral you are thinking about is derived
> from the Spanish word "Cañaveral", which is a place where "cañas" grow
> (canes or stems, according to my dictionary; some sort of vegetal life
> form, anyway). I suppose Cape Kennedy was filled with those plants and
> that's what the name comes from. I dunno if Chris' machine's name
> derives from that or not; Merriam-Webster does not list any other
> meaning for that word.

All our server machines are named after launch sites/space centres. It might have been a bit of a mistake, since we're starting to run out of names now, and the Japanese names are just too much of a mouthful :)

Chris
Re: [HACKERS] What's left?
pgman wrote:
> > > PeerDirect handles rename by just looping. We really can't delay a
> > > rename. There is discussion in the Win32 TODO detail that goes over
> > > some options, I think.
> >
> > Do we really have any problem with rename? We don't rename table files.
> > The renames I can think of are renaming temp files into place as
> > permanent ones, and there would be no open references to such a file.
>
> We do have a problem. It is with cache files read on startup, like
> pg_pwd. We can generate the file as temp, but we have to slide it in
> while a backend is not reading it. On a busy system, I am not sure how
> large a window we will get for the rename. The rename is all
> centralized in port/dirmod.c, so we can deal with it there, whatever the
> solution.
>
> We also have to do the rename during xact close because we need to hold
> locks so we are sure the files are written in the same order that they
> modify pg_shadow; waiting a long time for the rename is a serious
> problem.

I think I have a solution to this. The problem with rename is that if the file is open under Win32, we can't rename, so we loop, but we are still holding locks. My idea is to do this:

    grab lock (e.g. pg_shadow)
    write temp file
    rename temp file to realfile.new
    release lock
    rename realfile.new to realfile

In this way, no one ever has the renamed file open while we are holding the locks, we can loop without holding an exclusive lock on pg_shadow, and file writes remain in order. It's so easy, I think I could code it myself. :-)

This, along with the idea of having the checkpoint delete files that are pending, should solve both problems for us.

--
Bruce Momjian                   | http://candle.pha.pa.us
[EMAIL PROTECTED]               | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup.    | Newtown Square, Pennsylvania 19073
Re: [HACKERS] What's left?
Bruce Momjian <[EMAIL PROTECTED]> writes:
> In this way, no one ever has the renamed file open while we are holding
> the locks, we can loop without holding an exclusive lock on
> pg_shadow, and file writes remain in order.

You're doing this where exactly, and are certain that you are holding no locks why exactly? And if you aren't holding a lock, what prevents concurrency bugs?

regards, tom lane
Re: [pgsql-hackers-win32] [HACKERS] What's left?
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > In this way, no one ever has the renamed file open while we are holding
> > the locks, we can loop without holding an exclusive lock on
> > pg_shadow, and file writes remain in order.
>
> You're doing this where exactly, and are certain that you are holding no
> locks why exactly? And if you aren't holding a lock, what prevents
> concurrency bugs?

I am looking now at the relcache file, pg_pwd and pg_group. I am sure I am holding some locks, but not an exclusive lock on e.g. pg_shadow. I am working on a patch now.

I don't expect to eliminate the looping for rename, but to eliminate holding exclusive locks while doing the rename to a file actively being read. By using realfile.new, the first rename is only being done on a file that is never opened, just renamed, which should be quick. I can't think of a cleaner solution.

--
Bruce Momjian                   | http://candle.pha.pa.us
[EMAIL PROTECTED]               | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup.    | Newtown Square, Pennsylvania 19073
Re: [HACKERS] What's left?
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > In this way, no one ever has the renamed file open while we are holding
> > the locks, we can loop without holding an exclusive lock on
> > pg_shadow, and file writes remain in order.
>
> You're doing this where exactly, and are certain that you are holding no
> locks why exactly? And if you aren't holding a lock, what prevents
> concurrency bugs?

Oh, for concurrency bugs: you make realfile.new while holding the exclusive lock, so someone could come in later and replace realfile.new while I am in the rename loop, but then I just install theirs instead. I could come in after someone who has just done the rename to realfile.new but not yet tried the rename from realfile.new to realfile, but that seems OK. They will just find the file missing and fail on the rename, which is OK.

--
Bruce Momjian                   | http://candle.pha.pa.us
[EMAIL PROTECTED]               | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup.    | Newtown Square, Pennsylvania 19073
Re: [HACKERS] Extending SET SESSION AUTHORIZATION
Ezra Epstein wrote:
> I'd like to extend SET SESSION AUTHORIZATION to support a form which
> takes a password. Looking at the source it seems, other than changes to
> the parser, there are only 2 relevant functions in 2 files that would be
> affected. Each function is quite small and its function is clear.
>
> I did not find this functionality on the current to-do list:
> http://developer.postgresql.org/todo.php
> And I'm quite new to the PG backend. I don't want to code something up
> that is unwelcome by the developers. On the other hand, if
> appropriate/accepted, I'd be glad to write this little addition to the
> current functionality.

[CC to hackers added.]

Uh, a password? What purpose would that serve? Isn't that something you control when attaching to the database? Is this for prompting for a username's password? The problem is that an SQL query isn't passed securely the way libpq sends passwords, so I don't think it would be secure or wise to hardcode a password in the SQL.

--
Bruce Momjian                   | http://candle.pha.pa.us
[EMAIL PROTECTED]               | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup.    | Newtown Square, Pennsylvania 19073
Re: [HACKERS] Extending SET SESSION AUTHORIZATION
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Ezra Epstein wrote:
>> I'd like to extend SET SESSION AUTHORIZATION to support a form which
>> takes a password.

> Uh, a password? What purpose would that serve?

Indeed. SET SESSION AUTH is already allowed only to superusers --- a superuser hardly needs any additional privileges to become whoever he wants.

regards, tom lane
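[To make the thread concrete, a hedged sketch contrasting the existing command with the kind of form apparently being proposed. The syntax of the extended form is hypothetical and is not accepted by PostgreSQL; the user name is invented.]

-- Existing form, allowed only to superusers (assumes a user 'webuser'
-- already exists):
SET SESSION AUTHORIZATION 'webuser';

-- Hypothetical extended form sketched from the proposal, which would
-- authenticate the identity switch inline:
--   SET SESSION AUTHORIZATION 'webuser' PASSWORD 'secret';
-- As Bruce notes above, the password would travel in the plain query
-- string, unlike libpq's password exchange, so it would be exposed to
-- anything that can see the query text (logs, wire, pg_stat views).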