Re: [HACKERS] Bunch o' dead code in GEQO

2004-01-22 Thread Tom Lane
Neil Conway <[EMAIL PROTECTED]> writes:
> Tom Lane <[EMAIL PROTECTED]> writes:
>> Where are we going to find a representative test set of
>> dozen-or-more- way SQL join queries?

> As far as getting good lotsa-join queries, I think we can either:

>(1) generate the queries programmatically

>For example, star-schema join queries might be tractable via this
>method.

Sure, we can generate umpteen thousand star joins in no time, but they
are all the same problem.  I don't think this is particularly helpful
either for finding optimizer bugs or for making choices about
performance issues.

An example of the kind of thing I'm worried about: I realized just
yesterday that GEQO is actively broken in 7.4 because it cannot generate
"bushy" plans.  As of 7.4 there are cases involving IN constructs where
the only valid plans are bushy.  For example, in the regression database:

regression=# set geqo_threshold to 3;
SET
regression=# explain select * from tenk1 where
regression-# unique1 in (select unique2 from tenk1 t2, int4_tbl t3 where hundred = f1) 
and
regression-# unique2 in (select unique1 from tenk1 t4, int4_tbl t5 where hundred = f1);
ERROR:  failed to make a valid plan

You could test star joins all day long and not find that bug.

>(2) get the queries manually

>This would involve either writing schema and a bunch of queries for
>an "example app" (a la the Java Web Store), or getting a sanitized
>version of the schema & common queries used by a few large PG
>users. The latter might be the better way to go...

The only thing I'd really trust is a sampling of complex queries from
different real-world applications.  This will probably be hard to get,
and we can only hope to have dozens of queries not hundreds or
thousands.  We will also need to think about how we will get the
pg_statistic entries to correspond to the real-world situations.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] RFC: bufmgr locking changes

2004-01-22 Thread Tom Lane
Neil Conway <[EMAIL PROTECTED]> writes:
> Tom Lane <[EMAIL PROTECTED]> writes:
>> The latter is a really bad idea IMHO.  The io_in_progress lock can be
>> held for eons (in CPU terms) and should not be blocking people who
>> simply want to bump their refcount up and down.

> My reasoning was that the contention for the per-buffer meta data lock
> should be pretty low. The io_in_progress lock is held when we're
> either faulting a page in or flushing a page out. In the first case,
> we can't actually use the buffer no matter how we do the locking (its
> content is incomplete), so there's no effective loss in
> concurrency. In the second case, what kinds of concurrent activity can
> we allow on the buffer? (We can allow reads, of course, but I don't
> believe we can allow writes.)

True, there's no win in the read-busy case, but I think you
underestimate the value of the write-busy case.  Multiple concurrent
readers are a very important consideration.  In Postgres it is possible
for a reader to cause a write to occur (because it sets commit hint
bits, as per the SetBufferCommitInfoNeedsSave() business), and so you
could have a situation like

Reader pins page
Reader examines some tuples
Reader sets a commit bit and dirties page
...
Writer starts write
...
Reader examines some more tuples
Reader unpins page
Writer finishes write

If the reader can't unpin until the writer is done, then we will have
foreground readers blocked on the background writer process, which is
exactly what we do not want.

>> I think both of these are ill-considered micro-optimization.  How do you
>> know that the pin count can't exceed 64K?  Consider recursive plpgsql
>> functions for a likely counterexample.

> Fair enough -- I couldn't conceive of an actual scenario in which
> a single backend would acquire > 64K pins on a single buffer, but I'll
> take your word that it's not so far fetched. However, there is still
> room for improvement, IMHO: on a machine with 64-bit longs, we're
> plainly allocating 4 bytes more than we need do. Any objection if I
> change these to arrays of int32?

That seems like a reasonable compromise.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] Bunch o' dead code in GEQO

2004-01-22 Thread Neil Conway
Tom Lane <[EMAIL PROTECTED]> writes:
> Where are we going to find a representative test set of
> dozen-or-more- way SQL join queries?

Interesting that you should mention that. I've been thinking for a
while that we need a much more extensive test suite for the query
optimizer. This would allow us to more easily spot regressions in the
optimizer, to get quantifiable data on the effect of optimizer
improvements and optimizations, and it might end up being a good
general-purpose performance benchmark as well.

As far as getting good lotsa-join queries, I think we can either:

   (1) generate the queries programmatically

   For example, star-schema join queries might be tractable via this
   method. One nice benefit of generating the queries via this method
   is that it should allow us to scale the number of joins pretty
   easily. One downside might be that we wouldn't get the kind of
   diversity of queries that #2 might provide.

   (2) get the queries manually

   This would involve either writing schema and a bunch of queries for
   an "example app" (a la the Java Web Store), or getting a sanitized
   version of the schema & common queries used by a few large PG
   users. The latter might be the better way to go...

We could do both, of course, which might be the way to go.

Any thoughts?

-Neil

P.S. Unfortunately, I'm sufficiently busy right now that I won't be
able to do any work on this any time soon -- I just wanted to toss out
some ideas because I really think it's worth doing. Anyone who's
interested is more than welcome to get started.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] LWLock/ShmemIndex startup question

2004-01-22 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Andrew Dunstan wrote:
>> AFAIK the only target build environment for Windows right now is MinGW/gcc
>> 
>> If anyone knows how to get the M$ compilers to work nicely with our build
>> system that might be interesting, but probably at a later stage.

> MS C compiler usage would require gmake and a whole host of other stuff
> that doesn't seem worth doing.

Um, good point.  Porting our Makefiles to anything but gmake seems
completely out of the question.  So unless someone has a Windows build
environment that has gmake but not gcc, this is moot.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Bunch o' dead code in GEQO

2004-01-22 Thread Tom Lane
Neil Conway <[EMAIL PROTECTED]> writes:
> Tom Lane <[EMAIL PROTECTED]> writes:
>> I'm assuming that the original author of the GEQO code already did
>> that testing ...

> Removing the code without bothering to verify this assumption is a
> little unwise, IMHO:

Fair enough.  I did a little bit of poking around and it seems that
ERX (edge-recombination crossover) is still considered one of the best
available choices for solving Traveling Salesman problems via genetic
optimization.  Now there is the little issue that our problem isn't
really TSP and might behave a bit differently, but I see no evidence
to suggest that one of the other recombinator methods would do better.
They were all designed to do TSP.

The big problem here is that I don't see any practical method of
obtaining indisputable proof that one method is better than another.
Where are we going to find a representative test set of dozen-or-more-
way SQL join queries?

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Permissions and PGSQL

2004-01-22 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > On Monday 05 January 2004 12:51, Peter Eisentraut wrote:
> >> Jean-Eric Cuendet wrote:
> >>> - User permissions based on columns?
> > 
> >> Well, I have about half a patch for column privileges lying around, but
> >> I've never had enough motivation to do the other, more complicated
> >> half...
> 
> > I thought views took care of this.  Comments?
> 
> You can do it with views, but it's tedious, and more to the point the
> SQL standard says that you're supposed to be able to do it directly.
> So I think this should be on TODO.  I wouldn't put a real high priority
> on it personally, but it's a spec-compliance item ...

Added to TODO:

* Allow column-level privileges

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] LWLock/ShmemIndex startup question

2004-01-22 Thread Bruce Momjian
Claudio Natoli wrote:
> 
> 
> Tom Lane writes:
> > [cvs is your friend...]  It appears to have been added as part of the
> > MinGW porting work last May.  I don't have much faith in it; as far as
> > I heard the MinGW port never got further than making the client-side
> > code work, and so this file has no real-world testing.
> 
> FWIW, I've done a code walk-through, and it looks ok (lack of real-world
> testing notwithstanding), and actually does use the Win32 sema set. The only
> real problem is that it calls ShmemInitStruct in semget, which ultimately
> gets us into bootstrap hell (without native spinlocks, at least).
> 
> Also, as far as using it in the "hardware independent" version of spin-locks
> go, it makes kernel calls, which, as spin.c comments: "is too slow to be
> very useful".

Yep, native gcc TAS assembler should work fine on MinGW with gcc.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] LWLock/ShmemIndex startup question

2004-01-22 Thread Bruce Momjian
Tom Lane wrote:
> Claudio Natoli <[EMAIL PROTECTED]> writes:
> > Or, maybe we'll just use the tas() implementation that already exists for
> > __i386__/__x86_64__ in s_lock.h. How did I miss that?
> > Move along. Nothing to see here.
> 
> Actually, I was expecting you to complain that the s_lock.h coding is
> gcc-specific.  Which compilers do we need to support on Windows?
> 
> We might have to fall back to something comparable to the CVS-tip s_lock
> support for hppa: inline assembler in s_lock.h for gcc, and a separate
> assembly source file for use with vendor compiler(s).

MinGW only uses gcc so we are OK there.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] LWLock/ShmemIndex startup question

2004-01-22 Thread Bruce Momjian
Tom Lane wrote:
> Claudio Natoli <[EMAIL PROTECTED]> writes:
> > Tom Lane writes:
> >> I'm not sure there's any need for
> >> src/backend/port/win32/sema.c at all.
> 
> > (Do you have any idea on the historical
> > context of this code? I wondered as to, if we have no win32 port, why there
> > would be a seemingly good-to-go sema replacement?)
> 
> [cvs is your friend...]  It appears to have been added as part of the
> MinGW porting work last May.  I don't have much faith in it; as far as
> I heard the MinGW port never got further than making the client-side
> code work, and so this file has no real-world testing.

Right, that is from PeerDirect's Win32 port, with a little cleanup.  It
should work fine, or we can wack it around.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] LWLock/ShmemIndex startup question

2004-01-22 Thread Bruce Momjian
Andrew Dunstan wrote:
> Tom Lane said:
> >
> > Actually, I was expecting you to complain that the s_lock.h coding is
> > gcc-specific.  Which compilers do we need to support on Windows?
> >
> 
> AFAIK the only target build environment for Windows right now is MinGW/gcc
> 
> If anyone knows how to get the M$ compilers to work nicely with our build
> system that might be interesting, but probably at a later stage.

MS C compiler usage would require gmake and a whole host of other stuff
that doesn't seem worth doing.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Permissions and PGSQL

2004-01-22 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> On Monday 05 January 2004 12:51, Peter Eisentraut wrote:
>> Jean-Eric Cuendet wrote:
>>> - User permissions based on columns?
> 
>> Well, I have about half a patch for column privileges lying around, but
>> I've never had enough motivation to do the other, more complicated
>> half...

> I thought views took care of this.  Comments?

You can do it with views, but it's tedious, and more to the point the
SQL standard says that you're supposed to be able to do it directly.
So I think this should be on TODO.  I wouldn't put a real high priority
on it personally, but it's a spec-compliance item ...

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Permissions and PGSQL

2004-01-22 Thread Christopher Kings-Lynne
Well, I have about half a patch for column privileges lying around, but
I've never had enough motivation to do the other, more complicated
half...
Is there a TODO and TODO.detail warrented here?


I thought views took care of this.  Comments?
They're needed for SQL99 anyway I think.

Chris

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Permissions and PGSQL

2004-01-22 Thread Bruce Momjian
Robert Treat wrote:
> On Monday 05 January 2004 12:51, Peter Eisentraut wrote:
> > Jean-Eric Cuendet wrote:
> > > - User permissions based on columns? (Ex: User1 has Select on Column
> > > "CompayName" but User2 has update on column "CompanyName" while User3
> > > has create new row on table).
> >
> > Well, I have about half a patch for column privileges lying around, but
> > I've never had enough motivation to do the other, more complicated
> > half...
> >
> 
> Is there a TODO and TODO.detail warrented here?

I thought views took care of this.  Comments?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] What's left?

2004-01-22 Thread Claudio Natoli
 
Tom Lane wrote:
> Claudio Natoli <[EMAIL PROTECTED]> writes:
> > One important thing I forgot, that someone could start looking at now:
> >  *  backends keeping files open when other backends are trying to
> > delete/rename them
> 
> > We must do better for the official port,
> 
> Why?  The procedure you mentioned seems perfectly adequate to me,
> seeing that it's a bit of a corner case to start with.

Because, on occasion, I end up with GBs of log files hanging around. You can
wrack up disk space real fast that way.


> I cannot think of any way of "doing better" that wouldn't be far too
> invasive to be acceptable.

In that case, I'm more than happy to contribute the code for this solution
along with the remaining changes I've got ahead of me, but, still, if
someone thinks they can do better without a great deal of drama (which, like
you, I think is unlikely) then I'm all for it.

Cheers,
Claudio

--- 
Certain disclaimers and policies apply to all email sent from Memetrics.
For the full text of these disclaimers and policies see 
http://www.memetrics.com/emailpolicy.html";>http://www.memetrics.com/em
ailpolicy.html

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[HACKERS] I am back

2004-01-22 Thread Bruce Momjian
I have spent the past 10 days doing work for SRA, my employer, and
attending LinuxWorld in New York City.  Our BOF was well attended, with
35 people, and there were a lot of good PostgreSQL usage reports at the
event.  I have told people to post their news items to the web site or
advocacy.

I have 2k message to read, and hopefully will read them all before I
have to leave again on Tuesday.  I don't have anything scheduled after
this, so I will be around after February 6.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] What's left?

2004-01-22 Thread Tom Lane
Claudio Natoli <[EMAIL PROTECTED]> writes:
> One important thing I forgot, that someone could start looking at now:
>  *  backends keeping files open when other backends are trying to
> delete/rename them

> We must do better for the official port,

Why?  The procedure you mentioned seems perfectly adequate to me,
seeing that it's a bit of a corner case to start with.

I cannot think of any way of "doing better" that wouldn't be far too
invasive to be acceptable.

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Allow backend to output result sets in XML

2004-01-22 Thread Scott Lamb
On Jan 21, 2004, at 12:19 PM, Peter Eisentraut wrote:
I think "output XML" is just buzz.  Give us a real use scenario and an
indication that a majority also has that use scenario (vs. the other
ones listed above), then we can talk.
I do this all the time.

I have JDBC code to take a java.sql.ResultSet and push out SAX events 
in a standard schema. I also have a XSLT stylesheet that formats them 
in a decent way. In this manner, it's very easy for me to make 
database-driven webpages. I can inherit the "normal" stylesheet and 
then just code the exceptions.

However, I'm quite happy doing this on the client side. I'm not sure 
why it would be beneficial to do this as part of the PostgreSQL server.

Scott Lamb

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [pgsql-hackers-win32] [HACKERS] What's left?

2004-01-22 Thread Rob Butler
> Might I just suggest good old "C:\PostgreSQL" ?
>
> MS SQL server defaults to C:\MSSQL, so I don't think that a directory in
the
> root path is unreasonable.  Further, it makes it look more important if it
> installs in the root directory :)

Don't do that.  I hate software that does that.  To me it immediately
screams "WE DON'T CARE ABOUT DOING THINGS RIGHT!".

my 2cents
Later
Rob


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] What's left?

2004-01-22 Thread Claudio Natoli

Some fool wrote:
> It will then be a matter of fixing things like:
>  * installation directory issues (/usr/local/pgsql/bin won't work too
>well outside of the MingW environment :-)
>  * general directory handling (ie. whitespaces in directory names;
>forward/backslash path canonicalization)
>  * sync issues
>  * any missing structs/items in shared memory
>  * generally, running the test suite, and fixing whatever is busted (I'm
>at 41 tests passing now :-)

One important thing I forgot, that someone could start looking at now:
 *  backends keeping files open when other backends are trying to
delete/rename them

The port I wrote for here at work simply modified the functions in dirmod.c,
to attempt the delete (or rename), and, on a failure identifiable as being
presumably due to another process holding the file open, simply schedules
the file for deletion at system start time using the Win32 API for doing so
(hey, it is Windows, it is going to reboot sooner or later :-). In the case
of rename, just copies the existing file and schedules the original for
deletion.

Ugly, and sometimes slow where we'd rather not be, but it gets us by.

We must do better for the official port, and whilst better solutions are
obviously conceivable, AFAICS they will require some amount of backend
changes and therefore consent from main list. Someone might want to start
looking at a nice, clean solution to this.

Cheers,
Claudio

--- 
Certain disclaimers and policies apply to all email sent from Memetrics.
For the full text of these disclaimers and policies see 
http://www.memetrics.com/emailpolicy.html";>http://www.memetrics.com/em
ailpolicy.html

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [pgsql-hackers-win32] [HACKERS] What's left?

2004-01-22 Thread Claudio Natoli


> Where can I get the latest tarball for Win32 development?

There isn't a specific Win32 tarball, but you can get nightly snapshots from
the usual place (ftp://ftp.postgresql.org/pub/dev/), or pull down the tip
from CVS.

Reading back through the thread though, you'll find that the code is not yet
compilable/runnable, and you might want to rein in your enthusiasm for just
two or three weeks more (which might not be easy :-) for these last couple
bits (which are nearly ready) to get sent, approved, + applied.

Cheers,
Claudio

--- 
Certain disclaimers and policies apply to all email sent from Memetrics.
For the full text of these disclaimers and policies see 
http://www.memetrics.com/emailpolicy.html";>http://www.memetrics.com/em
ailpolicy.html

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


[HACKERS] implemented missing bitSetBit() and bitGetBit()

2004-01-22 Thread David Helgason
I needed these, so I went and implemented them myself. I have to admit 
I'm not so wise on PostgreSQL backend stuff, and so I abstained from 
editing the fmgrtab.c and postgres.bki.

Sorry for that, but it just seemed too complicated. Perhaps there 
should really be a script to do this? I could imagine I'm not the only 
one slightly daunted by these files...

There's seems to be no reason for changing the docs since the functions 
are documented as existing :)
I didn't add the other functions that one might expect to exist 
(get-/set_byte and others), since I don't really need them.

I tested this code under 7.4 since that's what I've got here, but 
imagine nothing much changed in this end of the world for 7.4.1...

This is to be appended to src/backend/utils/adt/varbit.c:

/*-
 * bitSetBit
 *
 * Given an instance of type 'bit' creates a new one with
 * the Nth bit set to the given value.
 *
 *-
 */
PG_FUNCTION_INFO_V1(bitSetBit);
Datum bitSetBit(PG_FUNCTION_ARGS)
{
VarBit  *arg1 = PG_GETARG_VARBIT_P(0);
int32   n = PG_GETARG_INT32(1);
int32   newBit = PG_GETARG_INT32(2);
VarBit  *result;
int bitlen,
bytelen,
byteNo,
bitNo;
unsigned char   oldByte,
newByte;
	bitlen = VARBITLEN(arg1);

/*
 * sanity checks!
 */
if (newBit != 0 && newBit != 1)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 errmsg("new bit must be 0 or 1")));
if (n < 0 || n >= bitlen)
ereport(ERROR,
(errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
 errmsg("bit %d outside of valid range, 0..%d",
n, bitlen - 1)));
/* Copy input bitstring */
bytelen = VARSIZE(arg1);
result = (VarBit *) palloc(bytelen);
memcpy(VARBITS(result), VARBITS(arg1), VARBITBYTES(arg1));
VARATT_SIZEP(result) = bytelen;
VARBITLEN(result) = bitlen;
/*
 * Update the bit.
 */
byteNo = n / 8;
bitNo = 7 - (n % 8);
	oldByte = ((unsigned char *) VARBITS(result))[byteNo];

if (newBit == 0)
newByte = oldByte & (~(1 << bitNo));
else
newByte = oldByte | (1 << bitNo);
((unsigned char *) VARBITS(result))[byteNo] = newByte;
PG_RETURN_VARBIT_P(result);
}
/*-
 * bitGetBit
 *
 * Given an instance of type 'bit' returns the Nth bit.
 *
 *-
 */
PG_FUNCTION_INFO_V1(bitGetBit);
Datum bitGetBit(PG_FUNCTION_ARGS)
{
VarBit  *arg1 = PG_GETARG_VARBIT_P(0);
int32   n = PG_GETARG_INT32(1);
int bitlen,
bytelen,
byteNo,
bitNo;
unsigned char theByte;
	bitlen = VARBITLEN(arg1);

/*
 * sanity check!
 */
if (n < 0 || n >= bitlen)
ereport(ERROR,
(errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
 errmsg("bit %d outside of valid range, 0..%d",
n, bitlen - 1)));
/*
 * Find the target bit
 */
byteNo = n / BITS_PER_BYTE;
bitNo = BITS_PER_BYTE - 1 - (n % BITS_PER_BYTE);
	theByte = ((unsigned char *) VARBITS(arg1))[byteNo];

	/*
	 * Shift a set bit to target position, & with the target byte, shift 
back
	 * to get integer 0 or 1
	 */
	PG_RETURN_INT32((int)(theByte & (1 << bitNo)) >> bitNo);
}

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [pgsql-hackers-win32] [HACKERS] What's left?

2004-01-22 Thread Dann Corbit
> -Original Message-
> From: Claudio Natoli [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, January 22, 2004 4:44 PM
> To: 'Andrew Dunstan '; 'pgsql-hackers-win32 '; 
> 'PostgreSQL-development '
> Subject: Re: [pgsql-hackers-win32] [HACKERS] What's left?
> 
> 
> 
> Andrew Dunstan wrote:
> > Claudio Natoli wrote:
> > 
> > > * installation directory issues (/usr/local/pgsql/bin 
> won't work too 
> > > well outside of the MingW environment :-)
> >
> > 
> > Clearly we will need an installer for a binary distribution.
> 
> Yes. To be more precise, my point was that doing so will 
> require some changes to the code (ie. configure/compile time 
> constants like PKGLIBDIR just won't do us any good).
> 
> 
> > But for now I suggest that the default prefix on Windows is
> > C:\Program Files\PostgreSQL

More properly:
%ProgramFiles%\PostgreSQL

And then, if they don't like that, let them put it wherever they darn
well please.
 
> For right now, I'd suggest a directory that doesn't have 
> whitespace and localization issues :-P

It will have to be faced.  If it cannot handle spaces or funny
characters, it will simply fail.

I may be able to help on the localization and path stuff.  We have
solved those issues for our port of 7.1.3, and I expect the work for 7.5
to be extremely similar.

Where can I get the latest tarball for Win32 development?

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [pgsql-hackers-win32] [HACKERS] What's left?

2004-01-22 Thread David Felstead
Hi all,

Might I just suggest good old "C:\PostgreSQL" ?

MS SQL server defaults to C:\MSSQL, so I don't think that a directory in the
root path is unreasonable.  Further, it makes it look more important if it
installs in the root directory :)

All the best,

-David Felstead

Claudio Natoli wrote:
> Andrew Dunstan wrote:
> > Claudio Natoli wrote:
> > 
> > > * installation directory issues (/usr/local/pgsql/bin won't work too
> > > well outside of the MingW environment :-)
> >
> > 
> > Clearly we will need an installer for a binary distribution. 
> 
> Yes. To be more precise, my point was that doing so will require some
> changes to the code (ie. configure/compile time constants like PKGLIBDIR
> just won't do us any good).
> 
> 
> > But for now I suggest that the default prefix on Windows is 
> > C:\Program Files\PostgreSQL
> 
> For right now, I'd suggest a directory that doesn't have whitespace and
> localization issues :-P
> 
> Cheers,
> Claudio
This email and any attachments may contain privileged and
confidential information and are intended for the named
addressee only. If you have received this e-mail in error,
please notify the sender and delete this e-mail immediately.
Any confidentiality, privilege or copyright is not waived or
lost because this e-mail has been sent to you in error. It
is your responsibility to check this e-mail and any 
attachments for viruses.

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [pgsql-hackers-win32] [HACKERS] What's left?

2004-01-22 Thread Claudio Natoli

Andrew Dunstan wrote:
> Claudio Natoli wrote:
> 
> > * installation directory issues (/usr/local/pgsql/bin won't work too
> > well outside of the MingW environment :-)
>
> 
> Clearly we will need an installer for a binary distribution. 

Yes. To be more precise, my point was that doing so will require some
changes to the code (ie. configure/compile time constants like PKGLIBDIR
just won't do us any good).


> But for now I suggest that the default prefix on Windows is 
> C:\Program Files\PostgreSQL

For right now, I'd suggest a directory that doesn't have whitespace and
localization issues :-P

Cheers,
Claudio

--- 
Certain disclaimers and policies apply to all email sent from Memetrics.
For the full text of these disclaimers and policies see 
http://www.memetrics.com/emailpolicy.html";>http://www.memetrics.com/em
ailpolicy.html

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] RFC: bufmgr locking changes

2004-01-22 Thread Neil Conway
  (Sorry Tom, I was meaning to reply to you once I'd had a chance to
  revise the bufmgr patch; since that seems a fair ways off, I figured
  it would be better to respond now.)

Tom Lane <[EMAIL PROTECTED]> writes:

> Neil Conway <[EMAIL PROTECTED]> writes:
>> we now hold the buffer's meta data lock while doing buffer I/O.
>
> The latter is a really bad idea IMHO.  The io_in_progress lock can be
> held for eons (in CPU terms) and should not be blocking people who
> simply want to bump their refcount up and down.

My reasoning was that the contention for the per-buffer meta data lock
should be pretty low. The io_in_progress lock is held when we're
either faulting a page in or flushing a page out. In the first case,
we can't actually use the buffer no matter how we do the locking (its
content is incomplete), so there's no effective loss in
concurrency. In the second case, what kinds of concurrent activity can
we allow on the buffer? (We can allow reads, of course, but I don't
believe we can allow writes.)

However, I'll think some more on this, you (and Jan, who raised this
point a while ago via IRC) are probably correct.

> It's possible that you could combine the io_in_progress lock with the
> cntx_lock

Yeah, that's a possibility.

>> - Remove SetBufferCommitInfoNeedsSave(). AFAICS, this is now
>>   completely equivalent to WriteNoReleaseBuffer(), so I just removed
>>   the former and replaced all the calls to it with calls to the later.
>
> The reason I've kept the separation was as a form of documentation as to
> the reason for each write.  Although they currently do the same thing,
> that might not always be true.  I'd prefer not to eliminate the
> distinction from the source code --- though I'd not object if you want
> to make SetBufferCommitInfoNeedsSave a macro that invokes the other
> routine.

Ok, fair enough -- I've changed SetBufferCommitInfoNeedsSave() to be a
macro for WriteNoReleaseBuffer().

>> - Make 'PrivateRefCount' an array of uint16s, rather than longs. This
>>   saves 2 bits * shared_buffers per backend on 32-bit machines and 6
>>   bits * shared_buffers per backend on some 64-bit machines. It means
>>   a given backend can only pin a single buffer 65,636 times, but that
>>   should be more than enough. Similarly, made LocalRefCount an array
>>   of uint16s.
>
> I think both of these are ill-considered micro-optimization.  How do you
> know that the pin count can't exceed 64K?  Consider recursive plpgsql
> functions for a likely counterexample.

Fair enough -- I couldn't conceive of an actual scenario in which
a single backend would acquire > 64K pins on a single buffer, but I'll
take your word that it's not so far fetched. However, there is still
room for improvement, IMHO: on a machine with 64-bit longs, we're
plainly allocating 4 bytes more than we need do. Any objection if I
change these to arrays of int32?

> Please put that back.  It is there to avoid unnecessary acquisitions
> of buffer locks during UnlockBuffers (which is executed during any
> transaction abort).  Without it, you will be needing to lock every
> single buffer during an abort in order to check its flags.

It seems bizarre that we need to iterate through a few thousand array
elements just to do some lock cleanup. I'll take a closer look at
this...

-Neil


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] What's left?

2004-01-22 Thread Andrew Dunstan


Claudio Natoli wrote:

* installation directory issues (/usr/local/pgsql/bin won't work too well
outside of the MingW environment :-)
Clearly we will need an installer for a binary distribution. But for now 
I suggest that the default prefix on Windows is
C:\Program Files\PostgreSQL

cheers

andrew

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] What's left?

2004-01-22 Thread Claudio Natoli

> I would like very much to help any way possible in solving any last
> remaining issues.  Once the CVS sources are compliable, it will be
> easier to make meaningful contributions.  I'm really looking 
> forward to testing and benchmarking the win32 port.  A big thanks 
> to all who continue to work so hard on this project.

To answer the question, "what's left?"

There is one patch I've submitted waiting for application (12/01/04), and
I've got another patch almost ready (adds some missing functions, and fixes
busted Makefiles etc) that, with Magnus's signal code, will give us a
compilable, runnable, and somewhat buggy Win32 port.

It will then be a matter of fixing things like:
 * installation directory issues (/usr/local/pgsql/bin won't work too well
outside of the MingW environment :-)
 * general directory handling (ie. whitespaces in directory names;
forward/backslash path canonicalization)
 * sync issues
 * any missing structs/items in shared memory
 * generally, running the test suite, and fixing whatever is busted (I'm at
41 tests passing now :-)

I'm imagining we'll be in a position to be able to start on the above in as
little as 2 weeks...

Cheers,
Claudio

--- 
Certain disclaimers and policies apply to all email sent from Memetrics.
For the full text of these disclaimers and policies see 
http://www.memetrics.com/emailpolicy.html";>http://www.memetrics.com/em
ailpolicy.html

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] Bunch o' dead code in GEQO

2004-01-22 Thread Neil Conway
Tom Lane <[EMAIL PROTECTED]> writes:
> I'm assuming that the original author of the GEQO code already did
> that testing ...

Removing the code without bothering to verify this assumption is a
little unwise, IMHO: given the low quality of the rest of the GEQO
code, I wouldn't be surprised to learn that the present default is not
optimal in some or all circumstances.

-Neil


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] [GENERAL] tablespaces a priority for 7.5?

2004-01-22 Thread lnd
>  [mailto:[EMAIL PROTECTED] On Behalf Of Tom Lane
> The current thought is to add some sort of hook that allows a 
> user-defined action to be taken at the point where a WAL file 
> would normally get recycled.  Presumably this action would 
> involve copying the WAL file to someplace else (eg a tape), 
> after which it could get recycled.  If you've got any 
> thoughts on exactly how to manage this, let's take up a 
> discussion on pgsql-hackers.


In practice it is quite OK just to have parameters like those:

-Are database WAL logs recycable or not, i.e. accumulates.

-location where to put(location's path, possibly more than one
location) those accumulating logs
-is the location mandatory or not
-log file name format: date/time/log sequence number

What's quite good to have are system catalog views for WAL logs:

log_history view: listing every log produced together with it's
sequence number, physical path - logs are assumed to have sequence numbers.
This view can recycle after N days based on some parameter.

From that view or from another it would be good to find out the last
applied WAL log when recovering.

What for are those views: they can be used to delete (yes, cleaning
after yourself is the issue) unneccesary logs (the recycling of log_history
view can/must be based on that criteria). 

More: this view would allow to figure out what logs to backup after
baseline backup (it is assumed that one may tell which WAL log is the
current/last from this view or another one). Vice versa, this view or another
one can be used to find out what WAL logs are applied during recovery andwhat
are still needed compared to the original database.


And a SQL command to complete a current WAL log (switch to a new) is
handy.



Brgds, Laimis




 



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Bunch o' dead code in GEQO

2004-01-22 Thread Tom Lane
"scott.marlowe" <[EMAIL PROTECTED]> writes:
> On Thu, 22 Jan 2004, Tom Lane wrote:
>> I'm assuming that the original author of the GEQO code already did that
>> testing ...

> Hmmm.  I was figuring he wasn't sure so he left them in for other people 
> to test.  Is this a part of the code that eats up much time, or something 
> simple and fast that isn't part of the "GEQO takes 8 seconds to plan" 
> problem?

Well, the basic plan of the GEQO code is

Step 1: generate a bunch of possible join paths at random.

Step 2: randomly select a pair of paths from the current population,
generate a new path that is some combination of these, and push it back
into the population, dropping the worst path from the population.
Repeat for a bunch of generations.

Step 3: take the best path in the final population.

The different recombination algorithms simply are different ways of
generating a "child" path given two "parent" paths in step 2.  Changing
them wouldn't affect the runtime noticeably at all --- the primary cost
is in evaluating each generated path, which is why the runtime is 
approximately the sum of the population size (step 1) and the number
of generations (step 2).  Possibly a different recombiner would give a
better chance of finding a good plan, but I'm unconvinced.  Arguably the
recombiners that are there are all wrong anyway, since they were all
invented to solve Traveling Salesman problems, which this is not quite.

The only way we can do much about the runtime is to reduce the default
population size.  With the current default parameters, a large query
will have population size 1024 and 400 generations, so about two-thirds
of the runtime is in generating the initial random population; if we
can't make a dent in that then we're not going to gain much.  The
question is what will this do to the average quality of the selected
plans.  One thing I'm thinking about is trying to improve the quality of
the initial population by immediately discarding any really bad plans
(for instance, use a heuristic that pays attention to which relations
are linked by WHERE clauses).

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Bunch o' dead code in GEQO

2004-01-22 Thread scott.marlowe
On Thu, 22 Jan 2004, Tom Lane wrote:

> "scott.marlowe" <[EMAIL PROTECTED]> writes:
> > On Wed, 21 Jan 2004, Tom Lane wrote:
> >> The GEQO planner module contains six different recombination algorithms,
> 
> > considering the recent discussion about REALLY slow query planning by the 
> > GEQO module, it might be worth testing each one to see which works best 
> > before lopping them off.
> 
> I'm assuming that the original author of the GEQO code already did that
> testing ...

Hmmm.  I was figuring he wasn't sure so he left them in for other people 
to test.  Is this a part of the code that eats up much time, or something 
simple and fast that isn't part of the "GEQO takes 8 seconds to plan" 
problem?


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] PostgreSQL installation CD based on Morphix

2004-01-22 Thread Andrew Dunstan
Jean-Michel POURE said:
>
> Morphix could be a possible solution to the MS Windows migration
> problems. It  probably involves 50 times less work than migrating
> PostgreSQL to MS Windows  (and is a complete different project except
> for the fact that it could well  bring as much new users to PostgreSQL
> than a native Win port and is less  risky).
>

This is a fantasy, IMNSHO. Windows users will want a native program that
they can run just like other Windows programs, and alongside them.
Requiring the installation of a complete Linux distribution, no matter how
nicely packaged, is hardly likely to win many Windows converts to postgres.

If I were a data center manager, I would not think of allowing such a
hybrid machine into it. I would insist that we either ran postgres as a
native windows service, or run it on a pur *nix machine, or not at all.

It would be a massive and unnecessary distraction, for no benefit, for
postgresql hackers to go down this route.

BTW, the major pieces of the Windows port are falling into place quite
nicely, with little evidence so far of the risks you mention. Of course,
there will be a few teething troubles, but nobody has yet seen a major
blocker. The biggest problems (forking and signal emulation) appear to
have been solved.

cheers

andrew




---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Bunch o' dead code in GEQO

2004-01-22 Thread Tom Lane
"scott.marlowe" <[EMAIL PROTECTED]> writes:
> On Wed, 21 Jan 2004, Tom Lane wrote:
>> The GEQO planner module contains six different recombination algorithms,

> considering the recent discussion about REALLY slow query planning by the 
> GEQO module, it might be worth testing each one to see which works best 
> before lopping them off.

I'm assuming that the original author of the GEQO code already did that
testing ...

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] Singnals code (not just win32 specific)

2004-01-22 Thread Tom Lane
Jan Wieck <[EMAIL PROTECTED]> writes:
> While talking about it, I think our usage of signals is way overloaded 
> anyway. Any ideas how to replace it all with just one signal and a 
> regular message queue?

Fooling with the definitions of SIGTERM, SIGINT, SIGQUIT would be a
really bad idea, since we have to behave reasonably when those signals
are sent to us by code not under our control.  Unix system shutdown
pretty much forces our SIGTERM behavior, for example.

Everything else pretty much already is funneled through SIGUSR1 and
SIGUSR2.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] cache control?

2004-01-22 Thread Tom Lane
"Simon Riggs" <[EMAIL PROTECTED]> writes:
> Any comments?? Takers?

Um, did you read the discussion of the ARC buffer management algorithm
that's already been implemented for 7.5?

The main objection I have to doing it Oracle's way is that that creates
more parameters that DBAs have to get right for reasonable performance.
Self-tuning algorithms are better, when available.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Bunch o' dead code in GEQO

2004-01-22 Thread scott.marlowe
On Wed, 21 Jan 2004, Tom Lane wrote:

> The GEQO planner module contains six different recombination algorithms,
> only one of which is actually used --- the others are ifdef'd out, and
> have been ever since we got the code.  Does anyone see a reason not to
> prune the deadwood?

considering the recent discussion about REALLY slow query planning by the 
GEQO module, it might be worth testing each one to see which works best 
before lopping them off.

But I don't do anything that needs GEQO, so for me, it doesn't really 
matter.


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] cache control?

2004-01-22 Thread Jan Wieck
Simon,

have you read src/backend/storage/buffer/README of current CVS tip?

The algorithm in the new replacement strategy is an attempt to figure 
that SMALL_TABLE_THRESHOLD automatically. Do you see anything that can 
be improved in that algorithm?

Jan

Simon Riggs wrote:

This discussion seems likely to have a major effect on DBT-3 (DSS-type)
performance from PostgreSQL...
On Fri, 16 Jan 2004, Michael Brusser wrote:

> Is there a way to force database to load
> a frequently-accessed table into cache and keep it there?
Scott Marlow replied...

Nope.  But there is a new cache buffer handler that may make it into
7.5
that would make that happen automagically.
The important question here is "what forces blocks out of cache?" rather
than thinking about how to directly keep them there.
Jeroen T. Vermeulen wrote:
Sent: Friday, January 16, 2004 23:02
Subject: [HACKERS] Reverse scans?
Would it be doable, and would it be useful, to try to alternate the
directions of table and index scans every time each table/index was
fully scanned?
I was thinking that it could help cache performance at various levels
in cases where data at the end of a large table, say, that remained in
memory after a scan, would otherwise be flushed out by a new scan of
the
same table.  If the next scan of the same table was to go in the other
direction, any remains of the last time around that were still in the
filesystem cache, buffer pool, hard disk cache etc. would stand a
greater
chance of being reused.
Jereon's idea is a good one when we consider the current behaviour,
which is 
that large scans are placed into block buffer cache, which then forces
other data out. I would like to question the latter behaviour, so we can
address the cause and not just the symptom.

Earlier versions of Oracle had a parameter called something like
SMALL_TABLE_THRESHOLD. Any scan of a table larger than that threshold
was NOT placed into buffer cache, but was consumed directly by the
shadow process (the backend). Teradata also uses a similar buffer
control technique for large table scans.
If a table is too large to fit into buffer, it clearly wasn't going to
be cached properly in the first place; Jereon's idea only works well for
tables near to the size of the cache. If the table is MUCH bigger then
it will have very little gain. Good original thinking, though I'm not
sure its worth it.
Oracle 9i now offers some variety for buffer cache management (as does
DB2). You can specify at the tablespace and object level whether to use
one of two buffer pools, KEEP or RECYCLE. The idea is to separate out
the two types of blocks - ones that are there because they're well used
and other blocks that are there at the moment, but unlikely to stay.
My suggestion would be to:
- split the buffer cache into two, just as Oracle does: KEEP & RECYCLE.
This could default to KEEP=66% of total memory available, but could also
be settable by init parameter.
[changes to the memory management routines]
- if we do a scan on a table whose size in blocks is more than some
fraction (25%?) of KEEP bufferpool then we place the blocks into RECYCLE
bufferpool. This can be decided immediately following optimization,
rather than including it within the optimizer decision process since we
aren't going to change the way the statement executes, we're just going
to stop it from having an adverse effect on other current or future
statements.
[additional test to set parameter, then work out where to note it]
Notice that I haven't suggested that the KEEP/RECYCLE option could be
specified at table level. That optionality sounds like a lot of extra
work, when what is needed is the automatic avoidance of cache-spoiling
behaviour. (This would still mean that very large indexes with random
request patterns would still spoil cache...maybe implement that later?)
This would remove most reasons for spoiling the cache and blocks would
then leave the cache only when they were genuinely no longer wanted.
Any comments?? Takers?

---(end of broadcast)---
TIP 6: Have you searched our list archives?
   http://archives.postgresql.org


--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


[HACKERS] PostgreSQL installation CD based on Morphix

2004-01-22 Thread Jean-Michel POURE
Dear friends,

In the past, there were several discussions in order to find suitable answers 
for MS Windows users willing to install and run PostgreSQL.

Some of you suggested that users should "install GNU/Linux over Windows". Not 
so easy for the basic MySQL or Ms Windows user, who need technical 
assistance ... and fear to destroy their machine.

Therefore, I would like to draw some attention on the Morphix CD project from  
http://sourceforge.net/projects/morphix.

Morphix is an auto-bootable Debian GNU/Linux distribution based on Knoppix.

What makes Morphix different is that the project has several graphical 
installers and wizards in preparation (written in plain C, using GTK-2 
libraries and Glade-2) ... which could possibly be used to install GNU/Linux 
and PostgreSQL tools over MS Windows.

You can download a preview CD of Morphix from this address:
http://prdownloads.sourceforge.net/morphix/MorphixCombined-Gnome-0.4-1.iso?download

Except for the installer itself, the wizards are not yet available in the main 
Morphix CD. To have a look at them, you will need to checkout from CVS 
(http://cvs.sourceforge.net/viewcvs.py/morphix) and open the files in 
Glade-2.

I also made some screenshots here:

- morphixinstaller: once Morphix has booted in demo mode, you can use this 
wizard to install Morphix on the disk. Screenshots:
http://developer.pgadmin.org/jean-michel/morphix/morphix_installer_step1.png
http://developer.pgadmin.org/jean-michel/morphix/morphix_installer_step2.png
http://developer.pgadmin.org/jean-michel/morphix/morphix_installer_step3.png
http://developer.pgadmin.org/jean-michel/morphix/morphix_installer_step4.png
http://developer.pgadmin.org/jean-michel/morphix/morphix_installer_step5.png
http://developer.pgadmin.org/jean-michel/morphix/morphix_installer_step6.png
http://developer.pgadmin.org/jean-michel/morphix/morphix_installer_step7.png

- isomorphgui: graphical distribution maker (select the requested packages for 
your distribution and burn the ISO). Screenshots:
http://developer.pgadmin.org/jean-michel/morphix/iso_morph_gui_1.png
http://developer.pgadmin.org/jean-michel/morphix/iso_morph_gui_2.png

- partitionmorpher: resize partitions on the fly (GTK-2 interface to 
libparted). Screenshots:
http://developer.pgadmin.org/jean-michel/morphix/partition_morpher_1.png
http://developer.pgadmin.org/jean-michel/morphix/partition_morpher_2.png

IMHO opinion, there is a need to study the Morphix project in more details.

Morphix could be a possible solution to the MS Windows migration problems. It 
probably involves 50 times less work than migrating PostgreSQL to MS Windows 
(and is a complete different project except for the fact that it could well 
bring as much new users to PostgreSQL than a native Win port and is less 
risky).

The possible steps for such a project could be (just a guess):

1) Validate Morphixinstaller, isomorphgui and partitionmorpher wizards. These 
wizards are still in early stages of development. They are developed by one 
person, who would probably like to receive some help.

2) Create a wizard for PostgreSQL post-installation steps using GTK-2 and 
Glade-2: configuration of hosts and authentication, tuning and optimization 
of various parameters (Mainly a graphical interface to pg_hba.conf and 
postgresql.conf).

3) Gather all known PostgreSQL related Debian packages, including pgAdmin III, 
PhpPgAdmin, as well as PHP, Apache and report tools (and much more) and burn 
Morphix-PostgreSQL CDs.

There could be a "light PostgreSQL CD" as well as a "complete PostgreSQL CD".

4) Create a single website in 30 languages (http://installer.postgresql.org?) 
dedicated to releasing the installation CD.

Based on the pgAdmin experience, there could well be more than 50.000 
downloads every month, out of which 70% would be migrating from MS Windows to 
GNU/Linux.

Comments and ideas are welcome. I would like to help anyone interested in 
becoming the lead of such a project.

Best regards,
Jean-Michel Pouré




---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[HACKERS] What's left?

2004-01-22 Thread Merlin Moncure
Hello,

I think it's safe to say there is a working implementation of a signal
handler.  The one tricky part left is to identify some smart places for
the backend to check the awaiting signal queue.  The first one is easy:
switch recv() with select() with a reasonable timeout and a poll.

If and when the signal handler gets patched into CVS, is there anything
else left that prevents the cvs version from compiling and linking?
From what I understand, Claudio's fork/exec implementation is either
complete or nearly complete.  

I would like very much to help any way possible in solving any last
remaining issues.  Once the CVS sources are compliable, it will be
easier to make meaningful contributions.  I'm really looking forward to
testing and benchmarking the win32 port.  A big thanks to all who
continue to work so hard on this project.

Merlin



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] Singnals code (not just win32 specific)

2004-01-22 Thread Jan Wieck
Magnus Hagander wrote:
Hello!

The backend signals code today uses pqsignal() instead of signal() at
all places. But it uses kill() and sigsetmask() (through the macro
PG_SETMASK) directly.
I propose to change this to pqkill() and pqsigsetmask(). In pqsignal.h,
these would be #define:d back to kill() and setsigmask() for the normal
method, but would be functions on win32.
Sounds logical and more consistent than it is now.

While talking about it, I think our usage of signals is way overloaded 
anyway. Any ideas how to replace it all with just one signal and a 
regular message queue?

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 8: explain analyze is your friend


[HACKERS] Dynamic modules and standard naming practice

2004-01-22 Thread Thomas Hallgren
On Unix, the general rule for a shared library is to prepend "lib" and
append ".so". On Windows, nothing is prepended and you add ".dll". Thus for
a module named "pljava" you'd get:

libpljava.so on Unix and:
pljava.dll on Windows.

This is in fact so common that the Java Native Interface (JNI) uses it as
the default way of finding things. When loading a library from Java, you
would say System.loadLibrary("pljava") and it would autmatically resolve to
libpljava.so and pljava.dll on the respective systems.

PostgreSQL, while substituting the postfix correctly, never seems to prepend
"lib" on a Unix system. Consequently, I have to use "libpljava" to reference
the module in SQL while I use "pljava" for the same module in Java. It would
be very nice if PostgreSQL could make an attempt to first prepend the "lib"
and if that doesn't work, try without the prefix on a Unix system.

Regards,

Thomas Hallgren



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] cache control?

2004-01-22 Thread Simon Riggs

This discussion seems likely to have a major effect on DBT-3 (DSS-type)
performance from PostgreSQL...

> On Fri, 16 Jan 2004, Michael Brusser wrote:
> 
> > Is there a way to force database to load
> > a frequently-accessed table into cache and keep it there?
> 
> Scott Marlow replied...
>
> Nope.  But there is a new cache buffer handler that may make it into
7.5
> that would make that happen automagically.

The important question here is "what forces blocks out of cache?" rather
than thinking about how to directly keep them there.

> Jeroen T. Vermeulen wrote:
> Sent: Friday, January 16, 2004 23:02
> Subject: [HACKERS] Reverse scans?
> 
> Would it be doable, and would it be useful, to try to alternate the
> directions of table and index scans every time each table/index was
> fully scanned?
> 
> I was thinking that it could help cache performance at various levels
> in cases where data at the end of a large table, say, that remained in
> memory after a scan, would otherwise be flushed out by a new scan of
the
> same table.  If the next scan of the same table was to go in the other
> direction, any remains of the last time around that were still in the
> filesystem cache, buffer pool, hard disk cache etc. would stand a
greater
> chance of being reused.

Jereon's idea is a good one when we consider the current behaviour,
which is 
that large scans are placed into block buffer cache, which then forces
other data out. I would like to question the latter behaviour, so we can
address the cause and not just the symptom.

Earlier versions of Oracle had a parameter called something like
SMALL_TABLE_THRESHOLD. Any scan of a table larger than that threshold
was NOT placed into buffer cache, but was consumed directly by the
shadow process (the backend). Teradata also uses a similar buffer
control technique for large table scans.

If a table is too large to fit into buffer, it clearly wasn't going to
be cached properly in the first place; Jereon's idea only works well for
tables near to the size of the cache. If the table is MUCH bigger then
it will have very little gain. Good original thinking, though I'm not
sure its worth it.

Oracle 9i now offers some variety for buffer cache management (as does
DB2). You can specify at the tablespace and object level whether to use
one of two buffer pools, KEEP or RECYCLE. The idea is to separate out
the two types of blocks - ones that are there because they're well used
and other blocks that are there at the moment, but unlikely to stay.

My suggestion would be to:
- split the buffer cache into two, just as Oracle does: KEEP & RECYCLE.
This could default to KEEP=66% of total memory available, but could also
be settable by init parameter.
[changes to the memory management routines]
- if we do a scan on a table whose size in blocks is more than some
fraction (25%?) of KEEP bufferpool then we place the blocks into RECYCLE
bufferpool. This can be decided immediately following optimization,
rather than including it within the optimizer decision process since we
aren't going to change the way the statement executes, we're just going
to stop it from having an adverse effect on other current or future
statements.
[additional test to set parameter, then work out where to note it]

Notice that I haven't suggested that the KEEP/RECYCLE option could be
specified at table level. That optionality sounds like a lot of extra
work, when what is needed is the automatic avoidance of cache-spoiling
behaviour. (This would still mean that very large indexes with random
request patterns would still spoil cache...maybe implement that later?)

This would remove most reasons for spoiling the cache and blocks would
then leave the cache only when they were genuinely no longer wanted.

Any comments?? Takers?


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] Singnals code (not just win32 specific)

2004-01-22 Thread Magnus Hagander
Hello!

The backend signals code today uses pqsignal() instead of signal() at
all places. But it uses kill() and sigsetmask() (through the macro
PG_SETMASK) directly.

I propose to change this to pqkill() and pqsigsetmask(). In pqsignal.h,
these would be #define:d back to kill() and setsigmask() for the normal
method, but would be functions on win32.

The other option is to provide our own kill() function and not rename,
but that could cause problems if we ever link against a runtime library
that provides kill() (say, the MS libraries) - we need to have our own
version of kill called. And overriding "standard library" functions by
linking in a function with the same name seems a bit kludgy to me.

While this should have no effect on the functionality on non-win32
backends, it does affect the code in places that are not win32-local.

Would a patch to change kill() to pqkill() be accepted? If not, do you
have a preference on another way to tackle the issue?

//Magnus


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[HACKERS] Abstract for my replication system

2004-01-22 Thread mitani
Dear Bruce:

Included is the abstract for my replication system I promised to you.

PGCluster is a replication system for PostgreSQL. 

Features of PGCluster include:

- synchronized multi-master replication
- load balancing
- re-synchronizing database cluster which is in
  lost-synchronization state

Currently PGCluster is implemented as two independent servers (load
balancer and replicator) and set of patches to PostgreSQL 7.3.4.

The English page for the PGCluster was prepared (it is still work in
progress)

Please download and try it. 
  http://hiroshima.sraw.co.jp/people/mitani/jpug/pgcluster/en/

PGCluster is released under the BSD license and currently runs on
Linux and FreeBSD.

I wish that PGCluster is helpful for many PostgreSQL users
who need synchronized replcation solution
And I want to contribute PGCluster for PostgreSQL. 
I would like you to examine. 

Sincerely yours,

--
Atsushi Mitani
[EMAIL PROTECTED]
--




---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match