Re: [HACKERS] [GENERAL] Checkpoint request failed on version 8.2.1.

2007-01-15 Thread Tom Lane
"Takayuki Tsunakawa" <[EMAIL PROTECTED]> writes:
> BTW, why does the bgwriter try to open and write the pages of already
> dropped relations?

It does not; the problem is with stale fsync requests.

> If the relation being dropped has
> already been registered in the list of files to be fsynced, isn't it
> possible to remove the file from the list before unlinking the file,
> asking bgwriter in a similar way as ForwardFsyncRequest()?

I suggested that here
http://archives.postgresql.org/pgsql-hackers/2007-01/msg00642.php
but have received no feedback about it ...

regards, tom lane



Re: [HACKERS] xml type and encodings

2007-01-15 Thread Tom Lane
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> Andrew Dunstan wrote:
>> Are we going to ensure that what we hand back to another client has
>> an appropriate encoding decl? Or will we just remove it in all cases?

> We can't do the former, but the latter might be doable.

I think that in the case of binary output, it'd be possible for xml_send
to include an encoding decl safely, because it could be sure that that's
where the data is going forthwith.  Not sure if that's worth anything
though.  The idea of text and binary output behaving differently on this
point doesn't seem all that attractive ...

regards, tom lane



Re: [HACKERS] [GENERAL] Checkpoint request failed on version 8.2.1.

2007-01-15 Thread Takayuki Tsunakawa
From: "Magnus Hagander" <[EMAIL PROTECTED]>
> But yeah, that's probably a good idea. A quick look at the code says we
> should at least ask people who have this problem to give it a run with
> logging at DEBUG5 which should then log exactly what the errorcode was.
> Or are you seeing more places that need such logging first?

I'm sorry, we can't get the Win32 error code to be displayed.  I
got the following messages:

2007-01-16 09:24:48 DEBUG:  checkpoint starting
2007-01-16 09:24:48 ERROR:  could not open relation 1663/10819/18296:
Permission denied
2007-01-16 09:24:48 ERROR:  checkpoint request failed
2007-01-16 09:24:48 HINT:  Consult recent messages in the server log
for details.
2007-01-16 09:24:48 STATEMENT:  checkpoint;


The reason is that src/port/open.c does not use _dosmaperr().  It
converts the Win32 error code to errno directly.  EACCES is converted
from ERROR_ACCESS_DENIED only.  Mmm, we may have to compromise as
Tom-san says.

BTW, why does the bgwriter try to open and write the pages of already
dropped relations?  When dropping relations, DropRelFileNodeBuffers is
called to discard dirty buffers.  If the relation being dropped has
already been registered in the list of files to be fsynced, isn't it
possible to remove the file from the list before unlinking the file,
asking bgwriter in a similar way as ForwardFsyncRequest()?

#
The timestamp at the head of each message is noisy, since the event
viewer already records the time, isn't it?  Besides, several PostgreSQL
messages appeared as one event log entry, separated by LF instead of
CR LF.  On Windows, lines should be separated by CR LF.







[HACKERS] Encoding conversion API

2007-01-15 Thread Peter Eisentraut
I'm becoming confused by the encoding conversion API, in particular by 
the function

pg_do_encoding_conversion(unsigned char *src, int len, int src_encoding, 
int dest_encoding);

Since the function in various circumstances passes back src directly, 
you might get back a string that encompasses the equivalent of more 
than len bytes of src.  Or if you pass a string that is not 
null-terminated (which would appear to be allowed, since you pass len), 
you might get back a null-terminated string or not.  And if you don't, 
you don't know how long it is.  So it seems that the only safe way to 
work with this is that src must be a regular null-terminated string, 
but then why must len be passed?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] Autovacuum improvements

2007-01-15 Thread Darcy Buskermolen
On Monday 15 January 2007 15:23, Joshua D. Drake wrote:
> Darcy Buskermolen wrote:
> > On Sunday 14 January 2007 08:45, Joshua D. Drake wrote:
> >>> While we are talking autovacuum improvements, I'd like to also see some
> >>> better logging, something that is akin to the important information of
> >>> vacuum verbose being logged to a table or barring that the error_log. 
> >>> I'd like to be able to see what was done, and how long it took to do
> >>> for each relation touched by av.  A thought, having this information
> >>> may even be useful for the above thought of scheduler because we may
> >>> be able to build some sort of predictive scheduling into this.
> >>
> >> This plays back to the vacuum summary idea that I requested:
> >>
> >> http://archives.postgresql.org/pgsql-hackers/2005-07/msg00451.php
> >
> > Well the fsm information is available in the pg_freespace contrib module,
> > however it does not help with the "how long does it take to maintain XYZ,
> > or vacuum of relfoo did ABC".
> >
> > I'm thinking a logtable of something like the following:
> >
> > relid
> > starttime
> > elapsed_time
> > rows
> > rows_removed
> > pages
> > pages_removed
> > reusable_pages
> > cputime
> >
> > This information then could be statistically used to balance N queues to
> > provide optimal vacuuming performance.
> >
> > Josh, is this more of what you were thinking as well ?
>
> My original thought with Vacuum summary was that it would only give me
> the information I need from vacuum verbose. Vacuum Verbose is great if
> you want all the info, but normally you just want the last 5 lines :)
>
> If there were functions along with the log table that would give me the
> same info that would be great! Something like:
>
> select show_omg_vacuum_now_tables() ;)
>
> Seriously though...
>
> select show_fsm_summary() which would show information over the last 12
> hours, 24 hours or since last vacuum or something.

If it's only fsm you are thinking of then the contrib module is probably good 
enough for you.

>
> Sincerely,
>
> Joshua D. Drake
>
> >> (Man our new search engine is so much better than the old one :))
> >>
> >> Joshua D. Drake
> >>


Re: [HACKERS] xml type and encodings

2007-01-15 Thread Peter Eisentraut
I wrote:
> We need to decide on how to handle encoding information embedded in
> xml data that is passed through the client/server encoding
> conversion.

Tangentially related, I'm currently experimenting with a setup that 
stores all xml data in UTF-8 on the server, converting it back to the 
server encoding on output.  This doesn't do anything to solve the 
problem above, but it makes the internal processing much simpler, since 
all of libxml uses UTF-8 internally anyway.  Is anyone opposed to that 
setup on principle?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] Autovacuum improvements

2007-01-15 Thread Joshua D. Drake
Darcy Buskermolen wrote:
> On Sunday 14 January 2007 08:45, Joshua D. Drake wrote:
>>> While we are talking autovacuum improvements, I'd like to also see some
>>> better logging, something that is akin to the important information of
>>> vacuum verbose being logged to a table or barring that the error_log.  I'd
>>> like to be able to see what was done, and how long it took to do for each
>>> relation touched by av.  A thought, having this information may even be
>>> useful for the above thought of scheduler because we may be able to
>>> build some sort of predictive scheduling into this.
>> This plays back to the vacuum summary idea that I requested:
>>
>> http://archives.postgresql.org/pgsql-hackers/2005-07/msg00451.php
> 
> Well the fsm information is available in the pg_freespace contrib module, 
> however it does not help with the "how long does it take to maintain XYZ, or 
> vacuum of relfoo did ABC".
> 
> I'm thinking a logtable of something like the following:
> 
> relid
> starttime
> elapsed_time
> rows
> rows_removed
> pages
> pages_removed
> reusable_pages
> cputime
> 
> This information then could be statistically used to balance N queues to 
> provide optimal vacuuming performance. 
> 
> Josh, is this more of what you were thinking as well ?


My original thought with Vacuum summary was that it would only give me
the information I need from vacuum verbose. Vacuum Verbose is great if
you want all the info, but normally you just want the last 5 lines :)

If there were functions along with the log table that would give me the
same info that would be great! Something like:

select show_omg_vacuum_now_tables() ;)

Seriously though...

select show_fsm_summary() which would show information over the last 12
hours, 24 hours or since last vacuum or something.

Sincerely,

Joshua D. Drake



> 
>> (Man our new search engine is so much better than the old one :))
>>
>> Joshua D. Drake
>>


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




Re: [HACKERS] Autovacuum improvements

2007-01-15 Thread Darcy Buskermolen
On Sunday 14 January 2007 08:45, Joshua D. Drake wrote:
> > While we are talking autovacuum improvements, I'd like to also see some
> > better logging, something that is akin to the important information of
> > vacuum verbose being logged to a table or barring that the error_log.  I'd
> > like to be able to see what was done, and how long it took to do for each
> > relation touched by av.  A thought, having this information may even be
> > useful for the above thought of scheduler because we may be able to
> > build some sort of predictive scheduling into this.
>
> This plays back to the vacuum summary idea that I requested:
>
> http://archives.postgresql.org/pgsql-hackers/2005-07/msg00451.php

Well the fsm information is available in the pg_freespace contrib module, 
however it does not help with the "how long does it take to maintain XYZ, or 
vacuum of relfoo did ABC".

I'm thinking a logtable of something like the following:

relid
starttime
elapsed_time
rows
rows_removed
pages
pages_removed
reusable_pages
cputime

This information then could be statistically used to balance N queues to 
provide optimal vacuuming performance. 

Josh, is this more of what you were thinking as well ?
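
For illustration, a minimal sketch of what such a log table might look
like (column names taken from the list above; the types are only guesses,
not a worked-out design):

CREATE TABLE autovacuum_log (
    relid          oid,
    starttime      timestamptz,
    elapsed_time   interval,
    rows           bigint,
    rows_removed   bigint,
    pages          bigint,
    pages_removed  bigint,
    reusable_pages bigint,
    cputime        interval
);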

>
> (Man our new search engine is so much better than the old one :))
>
> Joshua D. Drake
>


Re: [HACKERS] xml type and encodings

2007-01-15 Thread Peter Eisentraut
Andrew Dunstan wrote:
> We should error out on any explicit encoding that conflicts with the
> client encoding. I don't like the idea of just ignoring an explicit
> encoding decl.

That is an instance of the problem of figuring out which encoding names 
are equivalent, which I believe we have settled on finding impossible.

> Are we going to ensure that what we hand back to another client has
> an appropriate encoding decl? Or will we just remove it in all cases?

We can't do the former, but the latter might be doable.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] xml type and encodings

2007-01-15 Thread Andrew Dunstan
Peter Eisentraut wrote:
> Florian G. Pflug wrote:
>> Couldn't the server change the encoding declaration inside the xml to
>> the correct
>> one (the same as client_encoding) before returning the result?
>
> The data type output function doesn't know what the client encoding is
> or whether the data will be shipped to the client at all.  But what I'm
> thinking is that we should remove the encoding declaration if possible.
> At least that would be less confusing, albeit still potentially
> incorrect if the client continues to process the document without care.

The XML Spec says:

"In the absence of information provided by an external transport protocol
(e.g. HTTP or MIME), it is a fatal error for an entity including an
encoding declaration to be presented to the XML processor in an encoding
other than that named in the declaration, or for an entity which begins
with neither a Byte Order Mark nor an encoding declaration to use an
encoding other than UTF-8. Note that since ASCII is a subset of UTF-8,
ordinary ASCII entities do not strictly need an encoding declaration."

ISTM we are reasonably entitled to require the client to pass in an xml
document that uses the client encoding, re-encoding it if necessary (and
adjusting the encoding decl if any in the process).

We should error out on any explicit encoding that conflicts with the
client encoding. I don't like the idea of just ignoring an explicit
encoding decl.

Are we going to ensure that what we hand back to another client has an
appropriate encoding decl? Or will we just remove it in all cases?

cheers

andrew




Re: [HACKERS] xml type and encodings

2007-01-15 Thread Peter Eisentraut
Florian G. Pflug wrote:
> Sorry, I don't get it - how does this work for text, then? It works
> there to dynamically recode the data from the database encoding to
> the client encoding before sending it off to the client, no?

Sure, but it doesn't change the text inside the datum.
-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] Autovacuum improvements

2007-01-15 Thread Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > Here it is.
> 
> I'd drop the InitProcess API change, which touches many more places than
> you really need, and just have InitProcess check IsAutoVacuumProcess(),
> which should be valid by the time control gets to it.  This is more like
> the way that InitPostgres handles it, anyway.

Hmm, the problem is SubPostmasterMain, which is in the EXEC_BACKEND
path.  It hasn't reached the autovacuum.c code yet, so it hasn't had the
chance to set the am_autovacuum static variable (in autovacuum.c).  I
guess the answer here is to allow that variable to be set from the
outside.

> > Note that I used the same DatabaseHasActiveBackends() function to do the
> > kill.
> 
> Agreed; maybe change the name to something that sounds less like a
> side-effect-free function?

I'm short on ideas on how to name it ...
DatabaseHasActiveBackendsAndKillAutovac() sounds a bit too much :-(
Maybe DatabaseCancelAutovacuumActivity()?  (but then it's not obvious
that it counts other processes at all)  And make it kill all autovac
processes unconditionally, which also fixes things per your comment
below:

> > Another point to make is that it only kills autovacuum, and only if no
> > other process is found.  So if there are two processes and autovacuum is
> > one of them, it will be allowed to continue.
> 
> What if there are two autovac processes, which seems likely to be
> possible soon enough?

On the other hand, I was thinking that if we're going to have an autovacuum
launcher that's continuously running, we're going to have a lot of API
changes in this area anyway, so I wasn't in a hurry to consider the
possibility of two autovacuum processes.  But I don't think it's very
important anyway.


PS -- first time I try to be strict about switching between
pgsql-hackers and pgsql-patches and already I find it a bit annoying ...
not to mention that this is probably going to look weird on the
archives.

-- 
Alvaro Herrera                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] xml type and encodings

2007-01-15 Thread Florian G. Pflug

Peter Eisentraut wrote:
> Florian G. Pflug wrote:
>> Couldn't the server change the encoding declaration inside the xml to
>> the correct one (the same as client_encoding) before returning the result?
>
> The data type output function doesn't know what the client encoding is
> or whether the data will be shipped to the client at all.  But what I'm
> thinking is that we should remove the encoding declaration if possible.
> At least that would be less confusing, albeit still potentially
> incorrect if the client continues to process the document without care.


Sorry, I don't get it - how does this work for text, then? It works there
to dynamically recode the data from the database encoding to the client
encoding before sending it off to the client, no?

greetings, Florian Pflug



Re: [HACKERS] Recent ecpg patch...

2007-01-15 Thread Magnus Hagander
Joachim Wieland wrote:
> On Thu, Jan 11, 2007 at 09:59:14PM +0100, Magnus Hagander wrote:
>> .. appears to have killed win32. It did kill my manual MSVC builds, but
>> it also seems to have killed win32 buildfarm members yak and snake:
>> http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=yak&dt=2007-01-11%2020:32:11
>> http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=snake&dt=2007-01-11%2018:30:01
> 
>> (same error on mingw and msvc)
> 
> ecpg_internal_regression_mode has to be declared at least in one file without
> the "extern" keyword.
> 
> With the attached patch I get a clean build again with MSVC.

In case nobody has confirmed that yet (I may have missed a msg or two),
with whatever got committed to cvs eventually, it builds fine on MSVC now.

//Magnus



Re: [HACKERS] ideas for auto-processing patches

2007-01-15 Thread markwkm

On 1/12/07, Andrew Dunstan <[EMAIL PROTECTED]> wrote:

> [EMAIL PROTECTED] wrote:
>> What do you think about setting up the buildfarm clients
>> with the users they are willing to test patches for, as opposed to
>> having the patch system track who the trusted users are?  My thoughts
>> are that the former is easier to implement and that it allows anyone to use
>> the buildfarm to test a patch for anyone, well, each buildfarm client
>> user permitting.
>
> We can do this, but the utility will be somewhat limited. The submitters
> will still have to be known and authenticated on the patch server. I
> think you're also overlooking one of the virtues of the buildfarm,
> namely that it does its thing unattended. If there is a preconfigured
> set of submitters/vetters then we can rely on them all to do their
> stuff. If it's more ad hoc, then when Joe Bloggs submits a spiffy new
> patch every buildfarm owner that wanted to test it would need to go and
> add him to their configured list of patch submitters. This doesn't seem
> too workable.


Ok so it really wasn't much work to put together a SOAP call that'll
return patches from everyone, a trusted group, or a specified
individual.  I put together a short perl example that illustrates some
of this:
 http://folio.dyndns.org/example.pl.txt

How does that look?

Regards,
Mark



Re: [HACKERS] Function execution costs 'n all that

2007-01-15 Thread Tom Lane
Neil Conway <[EMAIL PROTECTED]> writes:
> BTW, your proposal would still pushdown all qualifiers, right?

Yeah, I have no intention of readopting xfunc in the near future ...
especially seeing that it's possible for the user to force that
sort of thing if he really has to.

SELECT * FROM (SELECT ... OFFSET 0) ss
WHERE expensive_function(...);
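
(The OFFSET 0 in the subquery acts as an optimization fence: it keeps the
subquery from being flattened into the outer query, so the expensive
qualifier is applied only to the rows the subquery actually returns.)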

regards, tom lane



Re: [HACKERS] xml type and encodings

2007-01-15 Thread Peter Eisentraut
Florian G. Pflug wrote:
> Couldn't the server change the encoding declaration inside the xml to
> the correct
> one (the same as client_encoding) before returning the result?

The data type output function doesn't know what the client encoding is 
or whether the data will be shipped to the client at all.  But what I'm 
thinking is that we should remove the encoding declaration if possible.  
At least that would be less confusing, albeit still potentially 
incorrect if the client continues to process the document without care.
-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] xml type and encodings

2007-01-15 Thread Peter Eisentraut
Martijn van Oosterhout wrote:
> The only real alternative is to treat xml more like bytea than text
> (ie, treat the input as a stream of octets).

bytea isn't "treated" any differently than other data types.  You just 
have to take care in the client that you escape every byte greater than 
127.  The same option is available to you in xml, if you escape all 
suspicious characters using entities.  Then, the encoding declaration 
is immaterial anyway.  (Unless you allow UTF-16 into the picture, but 
let's say we exclude that implicitly.)
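
For example (a made-up snippet using the xmlparse syntax discussed in this
thread), writing a non-ASCII character as a numeric entity keeps the
transmitted octets pure ASCII, so any encoding declaration is moot:

SELECT xmlparse(document '<doc>M&#252;nchen</doc>');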

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] Function execution costs 'n all that

2007-01-15 Thread Neil Conway
On Mon, 2007-01-15 at 15:05 -0500, Tom Lane wrote:
> maybe we should just do the constant for starters and see how many
> people really want to write C-code estimators ...

+1

BTW, your proposal would still pushdown all qualifiers, right?
Hellerstein's xfunc work discusses situations in which it makes sense to
pullup expensive qualifiers above joins, for example, in order to reduce
the number of tuples the qualifier is applied to. Unfortunately, this
would probably increase the optimizer's search space by a fairly
significant degree, so it might need to be controlled by a GUC variable,
or only applied when the estimated cost of applying a qualifier is
particularly large relative to the total estimated cost of the plan.

-Neil





Re: [HACKERS] Function execution costs 'n all that

2007-01-15 Thread Tom Lane
Brian Hurt <[EMAIL PROTECTED]> writes:
> Non-developer here, but we use a lot of plpgsql functions at work.  And 
> the functions we use fall into two broad, ill-defined catagories- 
> "expensive" functions and "cheap" functions.  What I'd like as a user is 
> some way to tell the planner "this function is expensive- prefer plans 
> which call this function less even if they're otherwise more expensive" 
> or "this function is cheap, prefer plans that are otherwise less 
> expensive even if they call this function more often".  Precise cost 
> estimates aren't that important, IMHO.

Right, so a plain constant cost would be plenty for your situation.

I suspect there's an 80/20 rule at work here --- the estimator-function
side of this will take most of the effort to design/implement, but not
get used nearly as much as the plain-constant form ... maybe we should
just do the constant for starters and see how many people really want to
write C-code estimators ...

regards, tom lane



Re: [HACKERS] Function execution costs 'n all that

2007-01-15 Thread Brian Hurt

Neil Conway wrote:
> On Mon, 2007-01-15 at 10:51 -0800, Richard Troy wrote:
>> I therefore propose that the engine evaluate -
>> benchmark, if you will - all functions as they are ingested, or
>> vacuum-like at some later date (when valid data for testing may exist),
>> and assign a cost relative to what it already knows - the built-ins, for
>> example.
>
> That seems pretty unworkable. It is unsafe, for one: evaluating a
> function may have side effects (inside or outside the database), so the
> DBMS cannot just invoke user-defined functions at whim. Also, the
> relationship between a function's arguments and its performance will
> often be highly complex -- it would be very difficult, not to mention
> computationally infeasible, to reconstruct that relationship
> automatically, especially without any real knowledge about the
> function's behavior.

Non-developer here, but we use a lot of plpgsql functions at work.  And 
the functions we use fall into two broad, ill-defined categories - 
"expensive" functions and "cheap" functions.  What I'd like as a user is 
some way to tell the planner "this function is expensive- prefer plans 
which call this function less even if they're otherwise more expensive" 
or "this function is cheap, prefer plans that are otherwise less 
expensive even if they call this function more often".  Precise cost 
estimates aren't that important, IMHO.


Brian



Re: [HACKERS] Function execution costs 'n all that

2007-01-15 Thread Richard Troy

On Mon, 15 Jan 2007, Neil Conway wrote:
> On Mon, 2007-01-15 at 10:51 -0800, Richard Troy wrote:
> > I therefore propose that the engine evaluate -
> > benchmark, if you will - all functions as they are ingested, or
> > vacuum-like at some later date (when valid data for testing may exist),
> > and assign a cost relative to what it already knows - the built-ins, for
> > example.
>
> That seems pretty unworkable. It is unsafe, for one: evaluating a
> function may have side effects (inside or outside the database), so the
> DBMS cannot just invoke user-defined functions at whim. Also, the
> relationship between a function's arguments and its performance will
> often be highly complex -- it would be very difficult, not to mention
> computationally infeasible, to reconstruct that relationship
> automatically, especially without any real knowledge about the
> function's behavior.
>
> -Neil

Hi Neil,

Tom had already proposed:
>
> I'm envisioning that the CREATE FUNCTION syntax would add optional
> clauses
>
>COST function-name-or-numeric-constant
>ROWS function-name-or-numeric-constant
>
> that would be used to fill these columns.

I was considering these ideas in the mix; let the user provide either a
numeric or a function, the distinction here being that instead of running
that function at planning-time, it could be run "off-line", so to speak.

Richard

-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [HACKERS] Function execution costs 'n all that

2007-01-15 Thread Neil Conway
On Mon, 2007-01-15 at 10:51 -0800, Richard Troy wrote:
> I therefore propose that the engine evaluate -
> benchmark, if you will - all functions as they are ingested, or
> vacuum-like at some later date (when valid data for testing may exist),
> and assign a cost relative to what it already knows - the built-ins, for
> example.

That seems pretty unworkable. It is unsafe, for one: evaluating a
function may have side effects (inside or outside the database), so the
DBMS cannot just invoke user-defined functions at whim. Also, the
relationship between a function's arguments and its performance will
often be highly complex -- it would be very difficult, not to mention
computationally infeasible, to reconstruct that relationship
automatically, especially without any real knowledge about the
function's behavior.

-Neil





Re: [HACKERS] [GENERAL] Checkpoint request failed on version 8.2.1.

2007-01-15 Thread Tom Lane
Magnus Hagander <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> DEBUG5 is going to be a bit voluminous, but let's try that if we can.

> Perhaps we should switch down the DEBUG level of it, at least until we
> know what happens?

That would have to wait on another update release, or at least someone
being willing to build a nonstandard executable for Windows, so let's
first see if people are willing to do the DEBUG5 bit.

regards, tom lane



Re: [HACKERS] Function execution costs 'n all that

2007-01-15 Thread Richard Troy
On Mon, 15 Jan 2007, Tom Lane wrote:

> So I've been working on the scheme I suggested a few days ago of
> representing "equivalence classes" of variables explicitly, and avoiding
> the current ad-hocery of generating and then removing redundant clauses
> in favor of generating only the ones we want in the first place.  Any
> clause that looks like an equijoin gets sent to the EquivalenceClass
> machinery by distribute_qual_to_rels, and not put into the
> restrictlist/joinlist data structure at all.  Then we make passes over
> the EquivalenceClass lists at appropriate times to generate the clauses
> we want.  This is turning over well enough now to pass the regression
> tests,

That was quick...

> In short, this approach results in a whole lot less stability in the
> order in which WHERE clauses are evaluated.  That might be a killer
> objection to the whole thing, but on the other hand we've never made
> any strong promises about WHERE evaluation order.

Showing my ignorance here, but I've never been a fan of "syntax based
optimization," though it is better than no optimization. If people are
counting on order for optimization, then, hmmm... If you can provide a way
to at least _try_ to do better, then don't worry about it. It will improve
with time.

> Instead, I'm thinking it might be time to re-introduce some notion of
> function execution cost into the system, and make use of that info to
> sort WHERE clauses into a reasonable execution order.

Ingres did/does it that way, IIRC. It's a solid strategy.

>  This example
> would be fixed with even a very stupid rule-of-thumb about SQL functions
> being more expensive than C functions, but if we're going to go to the
> trouble it seems like it'd be a good idea to provide a way to label
> user-defined functions with execution costs.
>
> Would a simple constant value be workable, or do we need some more
> complex model (and if so what)?

Ingres would, if I'm not mistaken, learn from historical use through
histograms. Short of that, you've got classes of functions, aggregations,
for example, and there's sure to be missing information to make a great
decision at planning time. However, I take it that the cost here is
primarily CPU and not I/O. I therefore propose that the engine evaluate -
benchmark, if you will - all functions as they are ingested, or
vacuum-like at some later date (when valid data for testing may exist),
and assign a cost relative to what it already knows - the built-ins, for
example. Doing so could allow this strategy to be functional in short
order and be improved with time so all the work doesn't have to be
implemented on day 1. And, DBA/sys-admin tweaking can always be done by
updating the catalogues.

HTH,
Richard

-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/




Re: [HACKERS] [GENERAL] Checkpoint request failed on version 8.2.1.

2007-01-15 Thread Magnus Hagander
Tom Lane wrote:
> Magnus Hagander <[EMAIL PROTECTED]> writes:
>> But yeah, that's probably a good idea. A quick look at the code says we
>> should at least ask people who have this problem to give it a run with
>> logging at DEBUG5 which should then log exactly what the errorcode was.
>> Or are you seeing more places that need such logging first?
> 
> DEBUG5 is going to be a bit voluminous, but let's try that if we can.

Perhaps we should switch down the DEBUG level of it, at least until we
know what happens?

//Magnus



Re: [HACKERS] Autovacuum improvements

2007-01-15 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Note that currently there's no way for a backend to know whether another
> backend is autovacuum or not.  I thought about adding a flag to PGPROC,
> but eventually considered it ugly,

No, that was exactly the way I thought we'd do it.  One thing to note is
that to avoid race conditions, the PGPROC entry has to be marked as
autovac from the instant it's inserted into the array --- with a
separate area I think you'd have difficulty avoiding the race condition.

regards, tom lane



Re: [HACKERS] [GENERAL] Checkpoint request failed on version 8.2.1.

2007-01-15 Thread Tom Lane
Magnus Hagander <[EMAIL PROTECTED]> writes:
> But yeah, that's probably a good idea. A quick look at the code says we
> should at least ask people who have this problem to give it a run with
> logging at DEBUG5 which should then log exactly what the errorcode was.
> Or are you seeing more places that need such logging first?

DEBUG5 is going to be a bit voluminous, but let's try that if we can.

regards, tom lane



Re: [HACKERS] Autovacuum improvements

2007-01-15 Thread Alvaro Herrera
Tom Lane wrote:

> Also see Peter's nearby suggestion that we ought to wait instead of fail
> for *all* cases of somebody attached to the database.  This would adapt
> readily enough to that.
> 
> I was complaining elsewhere that I didn't want to use a sleep loop
> for fixing the fsync-synchronization issue, but CREATE/DROP DATABASE
> seems a much heavier-weight operation, so I don't feel that a sleep
> is inappropriate here.

Note that currently there's no way for a backend to know whether another
backend is autovacuum or not.  I thought about adding a flag to PGPROC,
but eventually considered it ugly, so I started coding it as a shared
memory area instead, similar to what the bgwriter uses (storing the PID
there, etc).  After that was almost done I noticed that it's not a very
good idea either because there's no way to clean the shmem if autovacuum
dies -- the only one who knows about it, postmaster, does not want to
have access to shmem, so it can't do it.

So I'm reverting to using the flag in PGPROC for now, with an eye
towards using shmem eventually if we decide that using an always-running
autovacuum launcher is a good idea.

-- 
Alvaro Herrera                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] [GENERAL] Checkpoint request failed on version 8.2.1.

2007-01-15 Thread Magnus Hagander
Tom Lane wrote:
> Magnus Hagander <[EMAIL PROTECTED]> writes:
>> Tom Lane wrote:
>>> pg_control is certainly not ever deleted or renamed, and in fact I
>>> believe there's an LWLock enforcing that only one PG process at a time
>>> is even touching it.  So we need another theory to explain this one :-(
> 
>> Right. What we need is a list of which processes have handles open to
>> the file, which can be dumped using Process Explorer (there are other
>> sysinternals tools to do it as well, but PE is probably the easiest)-
> 
> Hmm, are you just assuming that the underlying error is
> ERROR_SHARING_VIOLATION?  One of the things that's bothered me all along
> is that there are a dozen different Windows error codes that we map to
> EACCES ... perhaps it's time to think about disambiguating that a bit
> better?

I was. Using PE is just one way to prove that was it :-)

But yeah, that's probably a good idea. A quick look at the code says we
should at least ask people who have this problem to give it a run with
logging at DEBUG5 which should then log exactly what the errorcode was.
Or are you seeing more places that need such logging first?


//Magnus



Re: [HACKERS] xml type and encodings

2007-01-15 Thread Florian G. Pflug

Peter Eisentraut wrote:
> Am Montag, 15. Januar 2007 17:33 schrieb Florian G. Pflug:
>> Would this mean that if the client_encoding is for example latin1, and I
>> retrieve an xml document uploaded by a client with client_encoding utf-8
>> (and thus having encoding="c" in the xml tag), that I would get a
>> document with latin1 encoding but saying that it's utf-8 in it's xml tag?
>
> That is likely to be a consequence of this proposed behaviour, but no doubt
> not a nice one.

Couldn't the server change the encoding declaration inside the xml to
the correct one (the same as client_encoding) before returning the result?
Otherwise, parsing the xml on the client with some xml library becomes
difficult, because the library is likely to get confused by the wrong
encoding tag - and you can't even fix that by using the correct client
encoding, because you don't know what the encoding tag says until you
have retrieved the document...


greetings, Florian Pflug




Re: [HACKERS] Function execution costs 'n all that

2007-01-15 Thread Tom Lane
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> If we go this route it seems like we'll need four more columns in
>> pg_proc (cost estimation function OID, rowcount estimation function OID,
>> fallback cost constant, fallback rowcount constant).

> What would the fallbacks be for?

By "fallback" I meant "this is what to use if no estimation function is
provided".

I'm envisioning that the CREATE FUNCTION syntax would add optional
clauses

COST function-name-or-numeric-constant
ROWS function-name-or-numeric-constant

that would be used to fill these columns.
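
For instance, the numeric-constant form might end up looking something like
this (the function names and bodies are invented purely for illustration,
and the syntax is of course only a proposal at this point):

CREATE FUNCTION slow_check(integer) RETURNS boolean
    AS $$ SELECT $1 > 0 $$
    LANGUAGE sql
    COST 1000;

CREATE FUNCTION three_ints() RETURNS SETOF integer
    AS $$ SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 $$
    LANGUAGE sql
    COST 10
    ROWS 3;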

regards, tom lane



Re: [HACKERS] Function execution costs 'n all that

2007-01-15 Thread Heikki Linnakangas

Tom Lane wrote:
> Heikki Linnakangas <[EMAIL PROTECTED]> writes:
>> A simple constant would probably be enough. If we want anything fancier
>> than that, it should be up to the author of the function to define the
>> cost model as well. I'm envisioning that you could attach a custom cost
>> function to a user-defined function which would return the estimated CPU
>> cost. And # of rows returned for a set-returning function.
>
> But what will such an estimation function work on?  In general the
> planner does not have the value(s) that will be passed to the actual
> function at runtime.  It might have expressions or estimates but
> those data structures are certainly not something we could pass to
> non-C-coded functions.  Are we willing to restrict these functions
> to being coded in C, as selectivity estimation functions are?

Yeah, I don't know. If the planner knows the actual value, that would
certainly be the easiest for the UDF writer to work with. Anything more
than that gets really complicated.

> If we go this route it seems like we'll need four more columns in
> pg_proc (cost estimation function OID, rowcount estimation function OID,
> fallback cost constant, fallback rowcount constant).

What would the fallbacks be for?

> BTW, I'm thinking that a "cost constant" probably ought to be measured
> in units of cpu_operator_cost.  The default for built-in functions would
> thus be 1, at least till such time as someone wants to refine the
> estimates.  We'd probably want the default for PL and SQL functions to
> be 10 or 100 or so.

Agreed.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] xml type and encodings

2007-01-15 Thread Martijn van Oosterhout
On Mon, Jan 15, 2007 at 05:47:37PM +0100, Peter Eisentraut wrote:
> Am Montag, 15. Januar 2007 17:33 schrieb Florian G. Pflug:
> > Would this mean that if the client_encoding is for example latin1, and I
> > retrieve an xml document uploaded by a client with client_encoding utf-8
> > (and thus having encoding="c" in the xml tag), that I would get a
> > document with latin1 encoding but saying that it's utf-8 in it's xml tag?
> 
> That is likely to be a consequence of this proposed behaviour, but no doubt 
> not a nice one.

The only real alternative is to treat xml more like bytea than text
(ie, treat the input as a stream of octets). Whether that's "nice", I
have no idea.

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to 
> litigate.




Re: [HACKERS] Function execution costs 'n all that

2007-01-15 Thread Tom Lane
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> Would a simple constant value be workable, or do we need some more
>> complex model (and if so what)?

> A simple constant would probably be enough. If we want anything fancier 
> than that, it should be up to the author of the function to define the 
> cost model as well. I'm envisioning that you could attach a custom cost 
> function to a user-defined function which would return the estimated CPU 
> cost. And # of rows returned for a set-returning function.

But what will such an estimation function work on?  In general the
planner does not have the value(s) that will be passed to the actual
function at runtime.  It might have expressions or estimates but
those data structures are certainly not something we could pass to
non-C-coded functions.  Are we willing to restrict these functions
to being coded in C, as selectivity estimation functions are?

If we go this route it seems like we'll need four more columns in
pg_proc (cost estimation function OID, rowcount estimation function OID,
fallback cost constant, fallback rowcount constant).

BTW, I'm thinking that a "cost constant" probably ought to be measured
in units of cpu_operator_cost.  The default for built-in functions would
thus be 1, at least till such time as someone wants to refine the
estimates.  We'd probably want the default for PL and SQL functions to
be 10 or 100 or so.
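
(For a rough sense of scale: with the default cpu_operator_cost of 0.0025,
a cost constant of 100 would charge about 0.25 per call -- assuming the
default hasn't been changed, of course.)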

regards, tom lane



Re: [HACKERS] Function execution costs 'n all that

2007-01-15 Thread Heikki Linnakangas

Tom Lane wrote:
> Instead, I'm thinking it might be time to re-introduce some notion of
> function execution cost into the system, and make use of that info to
> sort WHERE clauses into a reasonable execution order.

That sounds like a straightforward idea.

> This example
> would be fixed with even a very stupid rule-of-thumb about SQL functions
> being more expensive than C functions, but if we're going to go to the
> trouble it seems like it'd be a good idea to provide a way to label
> user-defined functions with execution costs.

Agreed.

> Would a simple constant value be workable, or do we need some more
> complex model (and if so what)?

A simple constant would probably be enough. If we want anything fancier
than that, it should be up to the author of the function to define the
cost model as well. I'm envisioning that you could attach a custom cost
function to a user-defined function which would return the estimated CPU
cost. And # of rows returned for a set-returning function.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] xml type and encodings

2007-01-15 Thread Peter Eisentraut
Am Montag, 15. Januar 2007 17:33 schrieb Florian G. Pflug:
> Would this mean that if the client_encoding is for example latin1, and I
> retrieve an xml document uploaded by a client with client_encoding utf-8
> (and thus having encoding="c" in the xml tag), that I would get a
> document with latin1 encoding but saying that it's utf-8 in it's xml tag?

That is likely to be a consequence of this proposed behaviour, but no doubt 
not a nice one.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] xml type and encodings

2007-01-15 Thread Florian G. Pflug

Peter Eisentraut wrote:
> Am Montag, 15. Januar 2007 12:42 schrieb Nikolay Samokhvalov:
>> On 1/15/07, Peter Eisentraut <[EMAIL PROTECTED]> wrote:
>>> Client encoding is A, server encoding is B.  Client sends an xml datum
>>> that looks like this:
>>>
>>> INSERT INTO table VALUES (xmlparse(document '<?xml version="1.0" encoding="C"?>...'));
>>>
>>> Assuming that A, B, and C are all distinct, this could fail at a number
>>> of places.
>>>
>>> I suggest that we make the system ignore all encoding declarations in
>>> xml data.  That is, in the above example, the string would actually
>>> have to be encoded in client encoding A on the client, would be
>>> converted to B on the server and stored as such.  As far as I can tell,
>>> this is easily implemented and allowed by the XML standard.
>>
>> In other words, in case when B != C server must trigger an error, right?
>
> No, C is ignored in all cases.

Would this mean that if the client_encoding is for example latin1, and I
retrieve an xml document uploaded by a client with client_encoding utf-8
(and thus having encoding="c" in the xml tag), that I would get a
document with latin1 encoding but saying that it's utf-8 in it's xml tag?


greetings, Florian Pflug





[HACKERS] Function execution costs 'n all that

2007-01-15 Thread Tom Lane
So I've been working on the scheme I suggested a few days ago of
representing "equivalence classes" of variables explicitly, and avoiding
the current ad-hocery of generating and then removing redundant clauses
in favor of generating only the ones we want in the first place.  Any
clause that looks like an equijoin gets sent to the EquivalenceClass
machinery by distribute_qual_to_rels, and not put into the
restrictlist/joinlist data structure at all.  Then we make passes over
the EquivalenceClass lists at appropriate times to generate the clauses
we want.  This is turning over well enough now to pass the regression
tests, but I noticed that one query in opr_sanity got a whole lot slower
than before.  The query is

SELECT p1.opcname, p1.opcfamily
FROM pg_opclass AS p1
WHERE NOT EXISTS(SELECT 1 FROM pg_amop AS p2
 WHERE p2.amopfamily = p1.opcfamily
   AND binary_coercible(p1.opcintype, p2.amoplefttype));

and investigation showed that the plan changed from (8.2 and before)

 Seq Scan on pg_opclass p1  (cost=0.00..393.94 rows=51 width=68)
   Filter: (NOT (subplan))
   SubPlan
 ->  Seq Scan on pg_amop p2  (cost=0.00..7.66 rows=2 width=0)
   Filter: ((amopfamily = $0) AND binary_coercible($1, amoplefttype))

to

 Seq Scan on pg_opclass p1  (cost=0.00..393.94 rows=51 width=68)
   Filter: (NOT (subplan))
   SubPlan
 ->  Seq Scan on pg_amop p2  (cost=0.00..7.66 rows=2 width=0)
   Filter: (binary_coercible($1, amoplefttype) AND (amopfamily = $0))

thus resulting in many more calls of binary_coercible() which is a
pretty expensive function.  This is not too surprising: the clause
p2.amopfamily = p1.opcfamily got diverted through the EquivalenceClass
code for just long enough to end up behind the other one in the table's
restrictlist :-(

In short, this approach results in a whole lot less stability in the
order in which WHERE clauses are evaluated.  That might be a killer
objection to the whole thing, but on the other hand we've never made
any strong promises about WHERE evaluation order.

Instead, I'm thinking it might be time to re-introduce some notion of
function execution cost into the system, and make use of that info to
sort WHERE clauses into a reasonable execution order.  This example
would be fixed with even a very stupid rule-of-thumb about SQL functions
being more expensive than C functions, but if we're going to go to the
trouble it seems like it'd be a good idea to provide a way to label
user-defined functions with execution costs.

Would a simple constant value be workable, or do we need some more
complex model (and if so what)?

Comments?

regards, tom lane



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-15 Thread Alban Hertroys
Pavan Deolasee wrote:
> Simon Riggs wrote:
>> On Fri, 2006-12-29 at 20:25 -0300, Alvaro Herrera wrote:
>>> Christopher Browne wrote:
>>>
>>>> Seems to me that you could get ~80% of the way by having the simplest
>>>> "2 queue" implementation, where tables with size < some threshold get
>>>> thrown at the "little table" queue, and tables above that size go to
>>>> the "big table" queue.
>>>>
>>>> That should keep any small tables from getting "vacuum-starved."
>>
> 
> This is exactly what I am trying: a two-process autovacuum and a GUC to
> separate small tables.
> 
> In this case, one process takes up vacuuming of the small tables and the
> other process handles vacuuming of the remaining tables as well as Xid
> wraparound avoidance vacuuming. The goal is to avoid starvation of small
> tables when a large table is being vacuumed (which may take
> several hours) without adding too much complexity to the code.

Would it work to make the queues push the threshold in the direction of
the still-running queue if the other queue finishes before the
still-running one? This would achieve some kind of auto-tuning, but that is
usually tricky.

For example, what if one of the queues got stuck on a lock?

-- 
Alban Hertroys
[EMAIL PROTECTED]

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
   7500 AK Enschede

// Integrate Your World //



Re: [HACKERS] Autovacuum improvements

2007-01-15 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> I'm cooking a patch for this which seems pretty reasonable, but I'm
> having a problem: what mechanism do we have for waiting until a process
> exits?

None, and I think you probably don't want to sit on the database lock
while waiting, either.  I was envisioning a simple sleep loop, viz

for(;;)
{
acquire database lock;
foreach(PGPROC entry in that database)
{
if (it's autovac)
send sigint;
else
fail;
}
if (found any autovacs)
{
release database lock;
sleep(100ms or so);
/* loop back and try again */
}
else
break;
}

Also see Peter's nearby suggestion that we ought to wait instead of fail
for *all* cases of somebody attached to the database.  This would adapt
readily enough to that.

I was complaining elsewhere that I didn't want to use a sleep loop
for fixing the fsync-synchronization issue, but CREATE/DROP DATABASE
seems a much heavier-weight operation, so I don't feel that a sleep
is inappropriate here.

> Maybe make autovacuum acquire an LWLock at start, which it then
> keeps until it's gone, but it seems wasteful to have a lwlock just for
> that purpose.

And it doesn't scale to multiple autovacs anyway, much less the wait-for-
everybody variant.

regards, tom lane



Re: [HACKERS] [INTERFACES] ODBC: how to change search_path in DSN?

2007-01-15 Thread Carlo Stonebanks
Hi Jim,

The solution was to put "set search_path=new_path" in Connection Settings
in the ODBC configuration (on Page 2 of the dialog).
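
For the archives, the text placed in that field is just an ordinary SQL
command, something like the following (my_schema standing in for whatever
schema you want):

set search_path = my_schema;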

Carlo

-Original Message-
From: Jim Nasby [mailto:[EMAIL PROTECTED] 
Sent: January 14, 2007 10:07 PM
To: Carlo Stonebanks
Cc: [EMAIL PROTECTED]; PGSQL Hackers
Subject: Re: [INTERFACES] ODBC: how to change search_path in DSN?

On Dec 19, 2006, at 12:06 PM, Carlo Stonebanks wrote:
> I would like to create an ODBC DSN for a user to export files from
> Access to a safe area. Right now, all tables are being dumped to public
> by default.
>
> In order to use the one-click export option for Access, I would  
> like to have
> the connection for the ODBC DSN file change the default search_path to
> my_schema (and my_schema only).
>
> After seeing a post on this on another resource, I tried putting
> schema=my_schema in the connection settings text box, but the tables
> appeared in public anyway.

> I think your only option is to change the default for the user. It
would be handy if there was a way to set search_path via  
pg_service.conf, though...
--
Jim Nasby   [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)





Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-15 Thread Pavan Deolasee

Simon Riggs wrote:
> On Fri, 2006-12-29 at 20:25 -0300, Alvaro Herrera wrote:
>> Christopher Browne wrote:
>>
>>> Seems to me that you could get ~80% of the way by having the simplest
>>> "2 queue" implementation, where tables with size < some threshold get
>>> thrown at the "little table" queue, and tables above that size go to
>>> the "big table" queue.
>>>
>>> That should keep any small tables from getting "vacuum-starved."
>

This is exactly what I am trying: a two-process autovacuum and a GUC to
separate small tables.

In this case, one process takes up vacuuming of the small tables and the
other process handles vacuuming of the remaining tables as well as Xid
wraparound avoidance vacuuming. The goal is to avoid starvation of small
tables when a large table is being vacuumed (which may take
several hours) without adding too much complexity to the code.

>
> Some feedback from initial testing is that 2 queues probably isn't
> enough. If you have tables with 100s of blocks and tables with millions
> of blocks, the tables in the mid-range still lose out. So I'm thinking
> that a design with 3 queues based upon size ranges, plus the idea that
> when a queue is empty it will scan for tables slightly above/below its
> normal range. That way we wouldn't need to specify the cut-offs with a
> difficult to understand new set of GUC parameters, define them exactly
> and then have them be wrong when databases grow.
>
> The largest queue would be the one reserved for Xid wraparound
> avoidance. No table would be eligible for more than one queue at a time,
> though it might change between queues as it grows.
>
> Alvaro, have you completed your design?
>
> Pavan, what are your thoughts?
>

IMO 2-queue is a good step forward, but in the long term we may need to go
for a multiprocess autovacuum where the number and tasks of processes
are demand-based and/or user-configurable.

Another idea is to vacuum the tables in round-robin fashion,
where the quantum could be either "time" or "number of blocks". The
autovacuum process would vacuum 'x' blocks of one table and then
schedule the next table in the queue. This would avoid starvation of
small tables, though the cost of index cleanup might go up because of
increased I/O. Any thoughts on this approach?

Thanks,
Pavan






Re: [HACKERS] [INTERFACES] ODBC: how to change search_path in DSN?

2007-01-15 Thread Jim Nasby

On Dec 19, 2006, at 12:06 PM, Carlo Stonebanks wrote:
> I would like to create an ODBC DSN for a user to export files from
> Access to a safe area. Right now, all tables are being dumped to public
> by default.
>
> In order to use the one-click export option for Access, I would like to
> have the connection for the ODBC DSN file change the default search_path
> to my_schema (and my_schema only).
>
> After seeing a post on this on another resource, I tried putting
> schema=my_schema in the connection settings text box, but the tables
> appeared in public anyway.


I think your only option is to change the default for the user. It
would be handy if there was a way to set search_path via  
pg_service.conf, though...

--
Jim Nasby   [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)





Re: [HACKERS] -f option for pg_dumpall

2007-01-15 Thread Andrew Dunstan



Neil Conway wrote:
> On Thu, 2007-01-11 at 14:36 -0500, Neil Conway wrote:
>> I don't think they need to be integrated any time soon, but if we were
>> to design pg_dump and pg_dumpall from scratch, it seems more logical to
>> use a single program
>
> On thinking about this some more, it might be useful to factor much of
> pg_dump's logic for reconstructing the state of a database into a shared
> library. This would make it relatively easy for developers to plug new
> archive formats into the library (in addition to the present 3 archive
> formats), or to make use of this functionality in other applications
> that want to reconstruct the logical state of a database from the
> content of the system catalogs. We could then provide a client app
> implemented on top of the library that would provide similar
> functionality to pg_dump.
>
> Moving pg_dump's functionality into the backend has been suggested in
> the past (and rejected for good reason), but I think this might be a
> more practical method for making the pg_dump logic more easily reusable.


I like this idea. For example, we might usefully map some of this to 
psql \ commands, without having to replicate the underlying logic.


cheers

andrew



Re: [HACKERS] Autovacuum improvements

2007-01-15 Thread Alvaro Herrera
Tom Lane wrote:

> The DROP is at risk, but CREATE is also at risk because autovac feels
> free to connect to template0.  (One of the reasons we invented template0
> was to prevent CREATE DATABASE failures due to someone-else-connected,
> but autovac has broken that idea.)

ALTER DATABASE RENAME also needs the same treatment.

> Possibly we could handle these by extending create/drop db to check
> whether a process-connected-to-the-target-db is an autovac, and if so
> send it a SIGINT and wait for the process to terminate, instead of
> failing.

I'm cooking a patch for this which seems pretty reasonable, but I'm
having a problem: what mechanism do we have for waiting until a process
exits?  Maybe make autovacuum acquire an LWLock at start, which it then
keeps until it's gone, but it seems wasteful to have a lwlock just for
that purpose.

Another idea is to do kill(AutoVacPID, 0); sleep(); in a loop, but that
seems pretty stupid.

Better ideas anyone?

-- 
Alvaro Herrera                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] xml type and encodings

2007-01-15 Thread Peter Eisentraut
Am Montag, 15. Januar 2007 12:42 schrieb Nikolay Samokhvalov:
> On 1/15/07, Peter Eisentraut <[EMAIL PROTECTED]> wrote:
> > Client encoding is A, server encoding is B.  Client sends an xml datum
> > that looks like this:
> >
> > INSERT INTO table VALUES (xmlparse(document '<?xml version="1.0" encoding="C"?>...'));
> >
> > Assuming that A, B, and C are all distinct, this could fail at a number
> > of places.
> >
> > I suggest that we make the system ignore all encoding declarations in
> > xml data.  That is, in the above example, the string would actually
> > have to be encoded in client encoding A on the client, would be
> > converted to B on the server and stored as such.  As far as I can tell,
> > this is easily implemented and allowed by the XML standard.
>
> In other words, in case when B != C server must trigger an error, right?

No, C is ignored in all cases.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] xml type and encodings

2007-01-15 Thread Nikolay Samokhvalov

On 1/15/07, Peter Eisentraut <[EMAIL PROTECTED]> wrote:


> Client encoding is A, server encoding is B.  Client sends an xml datum
> that looks like this:
>
> INSERT INTO table VALUES (xmlparse(document '<?xml version="1.0" encoding="C"?>...'));
>
> Assuming that A, B, and C are all distinct, this could fail at a number
> of places.
>
> I suggest that we make the system ignore all encoding declarations in
> xml data.  That is, in the above example, the string would actually
> have to be encoded in client encoding A on the client, would be
> converted to B on the server and stored as such.  As far as I can tell,
> this is easily implemented and allowed by the XML standard.



In other words, in case when B != C server must trigger an error, right?

--
Best regards,
Nikolay