Re: [HACKERS] Patch queue concern

2007-03-29 Thread Zeugswetter Andreas ADI SD

> > My feeling is we should have more regular sync points where the patch
> > queue is emptied and everything committed or rejected.
>
> No doubt, but the real problem here is that
> reviewing/committing other people's patches is not fun, it's
> just work :-(.  So it's no surprise that it tends to get put
> off.  Not sure what to do about that.

In my experience it mostly pays to keep people directly responsible for
their own work. Every intermediate tester/reviewer/coordinator tends to
reduce the submitter's feeling of responsibility.
So I could imagine a modus operandi where a submitter states: "I feel
confident that you can commit this without review, and I will be available
for fixes/additional work required."
Maybe we already have that in the form of committers who commit their own
work.

But I do feel that some patches Bruce is talking about need agreement
and help, not only review.

Andreas



Re: [HACKERS] Patch queue concern

2007-03-29 Thread Simon Riggs
On Thu, 2007-03-29 at 01:34 -0400, Tom Lane wrote:
> Simon Riggs [EMAIL PROTECTED] writes:
> > My feeling is we should have more regular sync points where the patch
> > queue is emptied and everything committed or rejected.
>
> No doubt, but the real problem here is that reviewing/committing other
> people's patches is not fun, it's just work :-(.

Gosh, you always seemed to enjoy my patches so much ;-)

We can all see how hard you and Bruce work and it's very much
appreciated, even if we don't often say so. That's why everybody else
works so hard too. Sometimes we only communicate the tensions caused by
external expectations.

> So it's no surprise
> that it tends to get put off.  Not sure what to do about that.

Well, one thing I can do is say "Thanks" now and try to do that more
regularly in the future.

The enjoyment you and others take from working on PostgreSQL is
infectious, so whatever else we do it's gotta stay fun.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-29 Thread Florian G. Pflug

Pavan Deolasee wrote:

In this specific context, this particular case is easy to handle because
we are only concerned about the serializable transactions started before
CREATE INDEX commits. If PREPARE can see the new index, it
implies that the CI transaction is committed. So a transaction
starting after that can only see the tuple version that we have indexed.


Yes, but the non-index plan PREPARE generated will be used until the end
of the session, not only until the end of the transaction. Imagine that
it wasn't explicitly PREPAREd (where you might say this is acceptable),
but rather just a query inside a plpgsql function, maybe even called
from some app using connection pooling. This means that the non-index
plan might get used for quite a long time, which contradicts the
work Tom did on plan invalidation, I think.

Maybe Tom can comment on whether it's possible to use plan invalidation
to eventually get rid of a stale plan in this context?

greetings, Florian Pflug




Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-29 Thread Pavan Deolasee

On 3/29/07, Florian G. Pflug [EMAIL PROTECTED] wrote:



Yes, but the non-index plan PREPARE generated will be used until the end
of the session, not only until the end of the transaction.



Frankly, I don't know how this works, but are you sure that the plan will
be used until the end of the session? Even if that's the case, it can
happen even today: if we create a new index, the existing sessions
will use the stale plan (assuming what you said is true).

Thanks,
Pavan

--

EnterpriseDB http://www.enterprisedb.com


Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-29 Thread Koichi Suzuki

Hi, here's some feedback on the comments:

Simon Riggs wrote:

On Wed, 2007-03-28 at 10:54 +0900, Koichi Suzuki wrote:


As written below, full page writes can be
categorized as follows:

1) Needed for crash recovery: first page update after each checkpoint.
This has to be kept in WAL.

2) Needed for archive recovery: page update between pg_start_backup and
pg_stop_backup. This has to be kept in archive log.

3) For log-shipping slave such as pg_standby: no full page writes will
be needed for this purpose.

My proposal deals with 2). So, if we mark each full_page_write, I'd
rather mark it when it is needed. We still need only one bit, because
case 3) does not need any mark.


I'm very happy with this proposal, though I do still have some points in
detailed areas.

If you accept that 1 & 2 are valid goals, then 1 & 3 or 1, 2 & 3 are
also valid goals, ISTM. i.e. you might choose to use full_page_writes on
the primary and yet would like to see optimised data transfer to the
standby server. In that case, you would need the mark.


Yes, I need the mark.  In my proposal, only unmarked full-page-writes,
which were written as the first update after a checkpoint, are to be
removed offline (by pg_compresslog).
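
To make the marking concrete, here is a minimal, self-contained sketch of
the one-bit idea; the record layout and the XLR_BKP_REQUIRED flag name are
illustrative assumptions for this sketch, not the actual patch:

/*
 * Sketch only: one spare bit in the WAL record header says whether the
 * attached full-page images are required for archive recovery (written
 * between pg_start_backup and pg_stop_backup).  Unmarked images exist
 * only for crash recovery and may be stripped by an offline tool such
 * as pg_compresslog.  All names here are hypothetical.
 */
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define XLR_BKP_BLOCK_MASK  0x0F   /* which of up to 4 backup blocks follow */
#define XLR_BKP_REQUIRED    0x10   /* hypothetical "keep me" mark */

typedef struct XLogRecHdr
{
    uint32_t xl_len;               /* length of the logical payload */
    uint8_t  xl_info;              /* flag bits, including the ones above */
} XLogRecHdr;

/* Could an offline compressor drop this record's backup blocks? */
static bool
backup_blocks_strippable(const XLogRecHdr *rec)
{
    return (rec->xl_info & XLR_BKP_BLOCK_MASK) != 0 &&
           (rec->xl_info & XLR_BKP_REQUIRED) == 0;
}

int
main(void)
{
    /* First touch of a page after a checkpoint: unmarked, strippable. */
    XLogRecHdr after_checkpoint = { 64, 0x01 };

    /* Image forced by an online backup: marked, must be kept. */
    XLogRecHdr during_backup = { 64, 0x01 | XLR_BKP_REQUIRED };

    printf("after checkpoint: strippable = %d\n",
           backup_blocks_strippable(&after_checkpoint));
    printf("during backup:    strippable = %d\n",
           backup_blocks_strippable(&during_backup));
    return 0;
}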





- Not sure why we need full_page_compress; why not just mark them
always? That harms no one. (Did someone else ask for that? If so, keep
it)

No, no one asked to have a separate option. There'll be no bad
influence in doing so.  So, if we mark each full_page_write, I'd
rather mark it when it is needed. We still need only one bit, because
case 3) does not need any mark.


OK, different question: 
Why would anyone ever set full_page_compress = off? 


Why have a parameter that does so little? ISTM this is:

i) one more thing to get wrong

ii) cheaper to mark the block when appropriate than to perform the if()
test each time. That can be done only in the path where backup blocks
are present.

iii) If we mark the blocks every time, it allows us to do an offline WAL
compression. If the blocks aren't marked that option is lost. The bit is
useful information, so we should have it in all cases.


Not only full-page-writes are written as WAL records.  In my proposal,
both the full-page-write and the logical log are written in a WAL record,
which will make WAL size slightly bigger (five percent or so).  If
full_page_compress = off, only a full-page-write will be written in a
WAL record.  I thought someone would not be happy with this size growth.


I agree to make this mandatory if everybody is happy with the extra
logical log in WAL records with full page writes.


I'd like to have your opinion.




- OTOH I'd like to see an explicit parameter set during recovery since
you're asking the main recovery path to act differently in case a single
bit is set/unset. If you are using that form of recovery, we should say
so explicitly, to keep everybody else safe.

The only thing I had to do was create dummy full page writes to
maintain LSNs. Full page writes are omitted in the archive log, but we
have to keep the LSNs the same as those in the original WAL. In this
case, recovery has to read the logical log, not the dummy full page
writes. On the other hand, if both a logical log and real full page
writes are found in a log record, the recovery has to use the real
full page writes.


I apologise for not understanding your reply, perhaps my original
request was unclear.

In recovery.conf, I'd like to see a parameter such as

dummy_backup_blocks = off (default) | on

to explicitly indicate to the recovery process that backup blocks are
present, yet they are garbage and should be ignored. Having garbage data
within the system is potentially dangerous and I want to be told by the
user that they were expecting that and it's OK to ignore that data.
Otherwise I want to throw informative errors. Maybe it seems OK now, but
the next change to the system may have unintended consequences and it
may not be us making the change. "It's OK, the Alien will never escape
from the lab" is the starting premise for many good sci-fi horrors and I
want to watch them, not be in one myself. :-)

We can call it other things, of course. e.g.
ignore_dummy_blocks
decompressed_blocks
apply_backup_blocks


So far, we don't need any modification to the recovery and redo
functions.  They ignore the dummy pages and apply the logical logs.
Also, if there are both full page writes and a logical log, the current
recovery selects the full page writes to apply.


I agree to introduce this option if the 8.3 code introduces any conflict
with the current behavior.  Or, we could introduce this option for future
safety.  Do you think we should introduce it?


If this should be introduced now, what we should do is check this
option when a dummy full-page-write appears.






Yes, I believe so. As pg_standby does not include any chance to meet
partial writes of pages, I believe you can omit all the full page
writes. Of course, as Tom Lane suggested in
http://archives.postgresql.org/pgsql-hackers/2007-02/msg00034.php
removing full

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-29 Thread Simon Riggs
On Thu, 2007-03-29 at 17:50 +0900, Koichi Suzuki wrote:

> Not only full-page-writes are written as WAL records.  In my proposal,
> both the full-page-write and the logical log are written in a WAL record,
> which will make WAL size slightly bigger (five percent or so).  If
> full_page_compress = off, only a full-page-write will be written in a
> WAL record.  I thought someone would not be happy with this size growth.

OK, I see what you're doing now and agree with you that we do need a
parameter. Not sure about the name you've chosen though - it certainly
confused me until you explained.

A parameter called "..._compress" indicates to me that it would reduce
something in size, whereas what it actually does is increase the size of
WAL slightly. We should have a parameter name that indicates what it
actually does; otherwise some people will choose to use this parameter
even when they are not using archive_command with pg_compresslog.

Some possible names...

additional_wal_info = 'COMPRESS'
add_wal_info
wal_additional_info
wal_auxiliary_info
wal_extra_data
attach_wal_info
...
others?

I've got some ideas for the future for adding additional WAL info for
various purposes, so it might be useful to have a parameter that can
cater for multiple types of additional WAL data. Or maybe we go for
something more specific like

wal_add_compress_info = on
wal_add__info ...

> > In recovery.conf, I'd like to see a parameter such as
> >
> > dummy_backup_blocks = off (default) | on
> >
> > to explicitly indicate to the recovery process that backup blocks are
> > present, yet they are garbage and should be ignored. Having garbage data
> > within the system is potentially dangerous and I want to be told by the
> > user that they were expecting that and it's OK to ignore that data.
> > Otherwise I want to throw informative errors. Maybe it seems OK now, but
> > the next change to the system may have unintended consequences and it
> > may not be us making the change. "It's OK, the Alien will never escape
> > from the lab" is the starting premise for many good sci-fi horrors and I
> > want to watch them, not be in one myself. :-)
> >
> > We can call it other things, of course. e.g.
> > ignore_dummy_blocks
> > decompressed_blocks
> > apply_backup_blocks
>
> So far, we don't need any modification to the recovery and redo
> functions.  They ignore the dummy pages and apply the logical logs.
> Also, if there are both full page writes and a logical log, the current
> recovery selects the full page writes to apply.
>
> I agree to introduce this option if the 8.3 code introduces any conflict
> with the current behavior.  Or, we could introduce this option for
> future safety.  Do you think we should introduce it?

Yes. You are skipping a correctness test and that should be by explicit
command only. It's no problem to include that as well, since you are
already having to specify pg_... decompress... but the recovery process
doesn't know whether or not you've done that.

> Anyway, could you try to run pg_standby with pg_compresslog and
> pg_decompresslog?

After freeze, yes.

 
> Additional comment on page header removal:
>
> I found that it is not simple to keep the page header in the compressed
> archive log.  Because we eliminate unmarked full page writes and shift
> the rest of the WAL file data, it is not simple to keep the page header
> as-is in the compressed archive log.  It is much simpler to remove the
> page header as well and rebuild it.  I'd like to keep the current
> implementation on this point.

OK.

This is a good feature. Thanks for your patience with my comments.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





[HACKERS] Group Commit

2007-03-29 Thread Heikki Linnakangas
I've been working on the patch to enhance our group commit behavior. The 
patch is a dirty hack at the moment, but I'm settled on the algorithm 
I'm going to use and I know the issues involved.


Here's the patch as it is if you want to try it out:
http://community.enterprisedb.com/groupcommit-pghead-2.patch

but it needs a rewrite before being accepted. It'll only work on systems
that use SysV semaphores; I needed to add a function to acquire a
semaphore with a timeout, and I've only done it for sysv_sema.c for now.
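
For reference, acquiring a SysV semaphore with a timeout looks roughly like
this; a minimal, Linux-specific sketch using semtimedop(2), not the actual
sysv_sema.c change:

/* Sketch of a timed SysV semaphore acquire via semtimedop(2) (Linux). */
#define _GNU_SOURCE
#include <stdio.h>
#include <errno.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <time.h>

int
main(void)
{
    /* One private semaphore; on Linux it starts out at 0 ("locked"). */
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid < 0) { perror("semget"); return 1; }

    struct sembuf op = { .sem_num = 0, .sem_op = -1, .sem_flg = 0 };
    struct timespec timeout = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };

    /* Blocks until the semaphore is released or ~10 ms elapse. */
    if (semtimedop(semid, &op, 1, &timeout) < 0 && errno == EAGAIN)
        printf("timed out without acquiring the semaphore\n");

    semctl(semid, 0, IPC_RMID);        /* clean up the semaphore set */
    return 0;
}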



What are the chances of getting this in 8.3, assuming that I rewrite and 
submit a patch within the next week or two?



Algorithm
-

Instead of starting a WAL flush immediately after a commit record is
inserted, we wait a while to give other backends a chance to finish
their transactions and have them flushed by the same fsync call. There
are two things we can control: how many commits to wait for (commit
group size), and for how long (timeout).


We try to estimate the optimal commit group size. The estimate is

commit group size = (# of commit records flushed + # of commit records 
arrived while fsyncing).


This is a relatively simple estimate that works reasonably well with 
very short transactions, and the timeout limits the damage when the 
estimate is not working.


There's a lot more factors we could take into account in the estimate, 
for example:
- # of backends and their states (affects how many are likely to commit 
soon)

- amount of WAL written since last XLogFlush (affects the duration of fsync)
- when exactly the commit records arrive (we don't want to wait 10 ms to
get one more commit record in, when an fsync takes 11 ms)


but I wanted to keep this simple for now.

The timeout is currently hard-coded at 1 ms. I wanted to keep it short
compared to the time it takes to fsync (somewhere in the 5-15 ms range,
depending on hardware), to limit the damage when the algorithm isn't
getting the estimate right. We could also vary the timeout, but I'm not
sure how to calculate the optimal value, and the real granularity will
depend on the system anyhow.


Implementation
--

To count the # of commits since last XLogFlush, I added a new 
XLogCtlCommit struct in shared memory:


typedef struct XLogCtlCommit
{
    slock_t     commit_lock;    /* protects the struct */
    int         commitCount;    /* # of commit records inserted since XLogFlush */
    int         groupSize;      /* current commit group size */
    XLogRecPtr  lastCommitPtr;  /* location of the latest commit record */
    PGPROC     *waiter;         /* process to signal when groupSize is reached */
} XLogCtlCommit;

Whenever a commit record is inserted in XLogInsert, commitCount is 
incremented and lastCommitPtr is updated.

When it reaches groupSize, the waiter-process is woken up.

In XLogFlush, after acquiring WALWriteLock, we wait until groupSize is 
reached (or timeout expires) before doing the flush.


Instead of the current logic to flush as much WAL as possible, we flush 
up to the last commit record. Flushing any more wouldn't save us an 
fsync later on, but might make the current fsync take longer. By doing 
that, we avoid the conditional acquire of the WALInsertLock that's in 
there currently. We make note of commitCount before starting the fsync; 
that's the # of commit records that arrived in time so that the fsync 
will flush them. Let's call that value intime.


After the fsync is finished, we update the groupSize for the next round.
The new groupSize is the current commitCount after the fsync, IOW the
number of commit records that arrived after the previous XLogFlush,
including those that arrived during the fsync itself. We update the
commitCount by decrementing it by intime.


Now we're ready for the next round, and we can release WALWriteLock.
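
To illustrate the bookkeeping, here is a simplified single-threaded model
of the counters described above; it ignores locking and real fsyncs and
only sketches the arithmetic of the estimate:

/* Simplified model of the group-size estimate (no locks, no real fsync). */
#include <stdio.h>

static int commitCount = 0;   /* commit records inserted since last flush */
static int groupSize   = 1;   /* how many commits to wait for next time */

/* XLogFlush equivalent: flush what we have, then re-estimate. */
static void
flush(int arriving_during_fsync)
{
    int intime = commitCount;          /* records this fsync will cover */

    /* ... fsync runs here; more commit records arrive meanwhile ... */
    commitCount += arriving_during_fsync;

    /* New estimate: # flushed + # arrived while fsyncing. */
    groupSize = commitCount;
    commitCount -= intime;             /* those are now safely on disk */

    printf("flushed %d, next group size %d, still pending %d\n",
           intime, groupSize, commitCount);
}

int
main(void)
{
    commitCount += 2;   /* two backends insert commit records */
    flush(3);           /* three more arrive while the fsync runs */
    flush(0);           /* next round flushes the three stragglers */
    return 0;
}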

WALWriteLock


The above would work nicely, except that a normal lwlock doesn't play
nicely. You can release and reacquire a lightweight lock in the same time
slice even when there are other backends queuing for the lock, effectively
cutting the queue.


Here's what sometimes happens, with 2 clients:

Client 1                        Client 2
do work                         do work
insert commit record            insert commit record
acquire WALWriteLock
                                try to acquire WALWriteLock, blocks
fsync
release WALWriteLock
begin new transaction
do work
insert commit record
reacquire WALWriteLock
wait for 2nd commit to arrive

Client 1 will eventually time out and commit just its own commit record. 
Client 2 should be released immediately after client 1 releases the 
WALWriteLock. It only needs to observe that its commit record has 
already been flushed and doesn't need to do anything.


To fix the above, and other race conditions like that, we need a 
specialized WALWriteLock that orders the waiters by the commit record 
XLogRecPtrs. WALWriteLockRelease wakes up all waiters that have their 
commit record already flushed. They will just fall through without 

Re: [HACKERS] Group Commit

2007-03-29 Thread Heikki Linnakangas

I wrote:
What are the chances of getting this in 8.3, assuming that I rewrite and 
submit a patch within the next week or two?


I also intend to do performance testing with different workloads to 
ensure the patch doesn't introduce a performance regression under some 
conditions.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Modifying TOAST thresholds

2007-03-29 Thread Simon Riggs
On Wed, 2007-03-28 at 14:08 -0400, Tom Lane wrote:
> Gregory Stark [EMAIL PROTECTED] writes:
> > Tom Lane [EMAIL PROTECTED] writes:
> > > I also think that we ought to add TOAST_MAX_CHUNK_SIZE to the set of
> > > compiled-in parameters that are recorded in pg_control and checked for
> > > compatibility at startup (like BLCKSZ) --- this will prevent anyone from
> > > shooting themselves in the foot while experimenting.
> >
> > Is there any reason to experiment with this? I would have thought we would
> > divorce TOAST_MAX_CHUNK_SIZE from TOAST_THRESHOLD and hard code it as the
> > same expression that's there now. Ie, the largest size that can fit in a
> > page.
>
> No, right now it's the largest size that you can fit 4 on a page.  It's
> not obvious to me that 4 is optimal once it's divorced from TOAST_THRESHOLD.
> It seems possible that the correct number is 1, and even if it's useful
> to keep the tuples smaller than that, there's no reason to assume 4 is
> the best number per page.

Well, it certainly seems worth separating them. It does seem possible
that recursive toasting affected some of the earlier results we looked
at.

Would you like me to do this, or will you?

I'll look again at the possibility for setting TOAST_THRESHOLD and
re-cast the test patch I have for production use. But either way it's
going to be a couple of days after freeze now.

I'd like to get some mechanism for reducing WAL volume into 8.3, whether
it's configurable TOAST or WAL reduction for UPDATEs. If for no other
reason than making backup and availability solutions more manageable.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] Patch queue concern

2007-03-29 Thread Gregory Stark

Tom Lane [EMAIL PROTECTED] writes:

> Simon Riggs [EMAIL PROTECTED] writes:
> > My feeling is we should have more regular sync points where the patch
> > queue is emptied and everything committed or rejected.
>
> No doubt, but the real problem here is that reviewing/committing other
> people's patches is not fun, it's just work :-(.  So it's no surprise
> that it tends to get put off.  Not sure what to do about that.

Obviously a big part of that is that we just don't have enough committers. I'm
hopeful that in time that situation will improve, but in the meantime we do
have a problem and the burden falls unfairly on a few.

Is there anything others can do to help? If non-committers like Simon or I
reviewed patches, would it be easier for you to give a quick agreement to the
comments or a "that's not an issue" comment?

It seems like we do have a few committers who should be able to review code
quality but are uncertain about making major design decisions. If, for
example, Bruce or Jan reviewed patches more invasive than they usually do for
code quality and checked with you on design questions would that be helpful?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-29 Thread Florian G. Pflug

Pavan Deolasee wrote:

On 3/29/07, Florian G. Pflug [EMAIL PROTECTED] wrote:


Yes, but the non-index plan PREPARE generated will be used until the end
of the session, not only until the end of the transaction.


Frankly, I don't know how this works, but are you sure that the plan will
be used until the end of the session? Even if that's the case, it can
happen even today: if we create a new index, the existing sessions
will use the stale plan (assuming what you said is true)


I've checked that:

test=# prepare myplan as select * from test where id=1 ;
PREPARE

test=# explain execute myplan ;
                         QUERY PLAN
-------------------------------------------------------------
 Seq Scan on test  (cost=0.00..22897.70 rows=5421 width=36)
   Filter: (id = 1)
(2 rows)

 Now I create an index in another session 

test=# explain select * from test where id=1 ;
                              QUERY PLAN
----------------------------------------------------------------------
 Bitmap Heap Scan on test  (cost=95.11..8248.45 rows=5000 width=36)
   Recheck Cond: (id = 1)
   ->  Bitmap Index Scan on idx  (cost=0.00..93.86 rows=5000 width=0)
         Index Cond: (id = 1)
(4 rows)

test=# explain execute myplan ;
                         QUERY PLAN
-------------------------------------------------------------
 Seq Scan on test  (cost=0.00..22897.70 rows=5421 width=36)
   Filter: (id = 1)
(2 rows)

!!! Index got used by the select ..  but not by execute myplan ... !!!

test=# prepare myplan2 as select * from test where id=1 ;
PREPARE
test=# explain execute myplan2 ;
                            QUERY PLAN
-----------------------------------------------------------------
 Index Scan using idx on test  (cost=0.00..8.38 rows=1 width=37)
   Index Cond: (id = 1)
(2 rows)

!!! A newly prepared plan of course uses the index !!!



So yes, plans get cached until the end of the session, and
yes, 8.2 won't notice index creation either ;-)

The open question is how CVS HEAD with plan invalidation behaves.
If it replans after the index-creating transaction commits, then
basing index validity on a snapshot will break this, because upon
replan the index might not be usable, but later on it may very
well be (and the plan invalidation machinery won't realize that).

So this might not introduce a regression compared to 8.2, but to
a future 8.3 with plan invalidation...

Sorry for being so unclear in my previous emails - I had confused
myself ;-)

greetings, Florian Pflug




Re: [HACKERS] ECPG threads test

2007-03-29 Thread Michael Meskes
On Wed, Mar 28, 2007 at 04:56:28PM +0200, Magnus Hagander wrote:
> From what I can see, the ecpg thread tests (src/interfaces/ecpg/threads)
> don't ever run. They rely on ENABLE_THREAD_SAFETY to be set, but even when
> I build with --enable-thread-safety, it's not set. This is because ecpg
> does not pull in pg_config.h, and also does not specify it on the
> commandline.

It used to include pg_config.h, but we removed it since it was too much
and instead created a file named ecpg_config.h. However, we missed
ENABLE_THREAD_SAFETY there. I just committed a fix.

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED]
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!



Re: [HACKERS] ECPG regression tests expected files

2007-03-29 Thread Michael Meskes
On Wed, Mar 28, 2007 at 07:30:21PM +0200, Magnus Hagander wrote:
> If you want to pick one early, please look at the one about the thread
> regression tests not appearing to run at all. I'd like to have that
> confirmed before I try to dig into how to fix it - in case it's not
> actually broken, and it's just me who's doing something wrong...

I just committed some stuff including your patch and a fix to the
regression test problem. At least it works on my system.

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED]
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!



Re: [HACKERS] Patch queue concern

2007-03-29 Thread Andrew Dunstan

Gregory Stark wrote:

Obviously a big part of that is that we just don't have enough committers. I'm
hopeful that in time that situation will improve but in the meantime we do
have a problem and the burden falls unfairly on a few.

Is there anything others can do to help? If non-committers like Simon or I
reviewed patches would it be easier for you to give a quick agreement to the
comments or that's not an issue comment?

It seems like we do have a few committers who should be able to review code
quality but are uncertain about making major design decisions. If, for
example, Bruce or Jan reviewed patches more invasive than they usually do for
code quality and checked with you on design questions would that be helpful?

  


I try to review things that I feel are well within my area of competence
(e.g. plperl, SQL-level commands) but I feel more hesitant about things
very deep inside the backend - there's more danger I'll miss something
subtle there.


Outside events have conspired to make both reviewing and coding harder 
for me to get done this cycle.


As for major design decisions, these should not be in the hands of a 
reviewer anyway - they should be explicitly discussed on list.


There is plenty of scope for people to review patches if they aren't 
committers. In fact, it is highly encouraged. Please review anything on 
the patch list you feel able to.



cheers

andrew



Re: [HACKERS] Patch queue concern

2007-03-29 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
> There is plenty of scope for people to review patches if they aren't
> committers. In fact, it is highly encouraged. Please review anything on
> the patch list you feel able to.

Sure.  Even if you miss things, every problem you do spot is one less...
and there's no guarantee that the eventual committer would have seen it.

regards, tom lane



Re: [HACKERS] ECPG regression tests expected files

2007-03-29 Thread Magnus Hagander
On Thu, Mar 29, 2007 at 02:04:48PM +0200, Michael Meskes wrote:
> On Wed, Mar 28, 2007 at 07:30:21PM +0200, Magnus Hagander wrote:
> > If you want to pick one early, please look at the one about the thread
> > regression tests not appearing to run at all. I'd like to have that
> > confirmed before I try to dig into how to fix it - in case it's not
> > actually broken, and it's just me who's doing something wrong...
>
> I just committed some stuff including your patch and a fix to the
> regression test problem. At least it works on my system.

Thanks. Passes the regression tests on win32 with the native threading.
I've updated and committed updates to the thread regression tests to have
them use the native threading as well. This removes the final requirement
on pthreads on win32, so that's one more non-standard build dependency no
longer needed. Yay!

//Magnus




Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-29 Thread Tom Lane
Florian G. Pflug [EMAIL PROTECTED] writes:
> Pavan Deolasee wrote:
> > Frankly, I don't know how this works, but are you sure that the plan will
> > be used until the end of the session? Even if that's the case, it can
> > happen even today: if we create a new index, the existing sessions
> > will use the stale plan (assuming what you said is true)
>
> I've checked that:

Evidently you weren't testing on HEAD.

> The open question is how CVS HEAD with plan invalidation behaves.
> If it replans after the index-creating transaction commits, then
> basing index validity on a snapshot will break this, because upon
> replan the index might not be usable, but later on it may very
> well be (and the plan invalidation machinery won't realize that)

It will replan at the first use of the plan after seeing the relcache
inval sent by commit of the index-creating transaction.  If you have
two separate transactions to create an index and then mark it valid
later, everything's fine because there are two inval events.
However, if you design something where an index becomes usable due
to the passage of time rather than an explicit mark-valid step,
there's gonna be a problem.  I'd suggest trying to stick to the
way CREATE INDEX CONCURRENTLY does it...

regards, tom lane



Re: [HACKERS] Modifying TOAST thresholds

2007-03-29 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
> Well, it certainly seems worth separating them. It does seem possible
> that recursive toasting affected some of the earlier results we looked
> at.
>
> Would you like me to do this, or will you?

I'm willing to do the code changes to separate TOAST_THRESHOLD from
the toast chunk size, but I do not have the time or facilities to do
any performance testing for different parameter choices.  Anyone want
to work on that?

> I'd like to get some mechanism for reducing WAL volume into 8.3, whether
> it's configurable TOAST or WAL reduction for UPDATEs. If for no other
> reason than making backup and availability solutions more manageable.

I think the WAL-reduction proposal needs more time and thought than is
feasible before 8.3.  OTOH, tuning the TOAST parameters seems like
something we understand well enough already, we just need to put some
cycles into testing different alternatives.  I would have no objection
to someone working on that during April and delivering a final patch
sometime before beta.
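
For anyone who wants to experiment, the arithmetic being tuned is roughly
the following; the overhead constants are rough illustrative assumptions,
not the real header sizes (the actual computation lives in the TOAST_*
macros in the source):

/* Back-of-envelope chunk sizing: largest chunk payload if we insist on
 * fitting N chunks on a page.  Overheads below are assumed round numbers. */
#include <stdio.h>

#define BLCKSZ          8192   /* default page size */
#define PAGE_OVERHEAD    100   /* assumed: page header + line pointers */
#define TUPLE_OVERHEAD    40   /* assumed: chunk tuple header + metadata */

int
main(void)
{
    for (int n = 1; n <= 4; n++)
        printf("%d chunk(s)/page -> max chunk payload ~%d bytes\n",
               n, (BLCKSZ - PAGE_OVERHEAD) / n - TUPLE_OVERHEAD);
    return 0;
}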

regards, tom lane



Re: [HACKERS] Warning on contrib/tsearch2

2007-03-29 Thread Teodor Sigaev

code: random backend code should not, not, not be using fopen()
directly.  If you lose control to an elog, which is certainly possible
seeing that this loop calls into the utils/mb subsystem, you'll leak
the file descriptor.  Use AllocateFile/FreeFile instead of fopen/fclose.

Will be fixed soon in the tsearch_core patch



I find the direct use of malloc/realloc/strdup to be poor style as well
--- backend code that is not using palloc needs to have *very* good
reason to do so, and I see none here.


Already in tsearch_core patch.



--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



[HACKERS] tsearch_core patch for inclusion

2007-03-29 Thread Teodor Sigaev

http://www.sigaev.ru/misc/tsearch_core-0.43.gz

Changes:
1 The Ispell dictionary now supports the hunspell dictionary format, which is
  used by OpenOffice >= 2.0.2
  http://wiki.services.openoffice.org/wiki/Dictionaries
  The format changes are aimed, basically, at better support for compound
  words (German, Norwegian). So, please, test it - we don't know those
  languages at all.
2 added recent fixes of contrib/tsearch2
3 fix usage of fopen/fclose

--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] Group Commit

2007-03-29 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
> I've been working on the patch to enhance our group commit behavior. The
> patch is a dirty hack at the moment, but I'm settled on the algorithm
> I'm going to use and I know the issues involved.
> ...
> The timeout is currently hard-coded at 1 ms.

This is where my bogometer triggered.  There's way too many platforms
where 1 msec timeout is a sheer fantasy.  If you cannot make it perform
well with a 10-msec timeout then I don't think it's going to be at all
portable.

Now I know that newer Linux kernels tend to ship with 1KHz scheduler
tick rate, so there's a useful set of platforms where you could make it
work even so, but I'm not really satisfied with saying this facility is
only usable if you have a fast kernel tick rate ...

regards, tom lane



Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-29 Thread Pavan Deolasee

On 3/29/07, Tom Lane [EMAIL PROTECTED] wrote:



It will replan at the first use of the plan after seeing the relcache
inval sent by commit of the index-creating transaction.  If you have
two separate transactions to create an index and then mark it valid
later, everything's fine because there are two inval events.
However, if you design something where an index becomes usable due
to the passage of time rather than an explicit mark-valid step,
there's gonna be a problem.  I'd suggest trying to stick to the
way CREATE INDEX CONCURRENTLY does it...



I had earlier proposed to do things the CIC way. But there were objections
to the additional wait introduced in CREATE INDEX, and I don't
think they were unreasonable. Maybe we can avoid the waits if there
are no HOT-chains in the table, but we still need agreement on that.

OTOH ISTM that the pg_index:xcreate solution may work fine if
we can keep the index unusable for those transactions which started
before CREATE INDEX could commit. I coded a quick prototype
where I use ActiveSnapshot in get_relation_info() to test whether the
CREATE INDEX transaction is seen as in-progress by the current
transaction. If seen as in-progress, the index is not used (even
though the CREATE INDEX is committed and hence can be seen
by SnapshotNow).
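
To illustrate the check, here is a toy, self-contained model; the snapshot
structure and visibility rule below are deliberately simplified stand-ins,
not the prototype's actual code:

/* Toy model: an index created by transaction xcreate is unusable for any
 * snapshot that still saw xcreate as in progress.  Real snapshots also
 * carry an in-progress XID array; this sketch only keeps xmin. */
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

typedef uint32_t TransactionId;

typedef struct Snapshot
{
    TransactionId xmin;   /* every XID below this was finished at snap time */
} Snapshot;

static bool
index_usable(const Snapshot *active, TransactionId index_xcreate)
{
    /* SnapshotNow already said CREATE INDEX committed; refuse the index
     * only if this transaction's snapshot predates that commit. */
    return index_xcreate < active->xmin;
}

int
main(void)
{
    TransactionId xcreate = 100;            /* the CREATE INDEX transaction */
    Snapshot old_xact = { .xmin = 90 };     /* started before the commit */
    Snapshot new_xact = { .xmin = 130 };    /* started after the commit */

    printf("old transaction may use index: %d\n",
           index_usable(&old_xact, xcreate));
    printf("new transaction may use index: %d\n",
           index_usable(&new_xact, xcreate));
    return 0;
}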

If an index which was earlier seen as unusable is marked valid as
time progresses, could there be some trouble? I mean, as long as
we don't make the index usable before all tuples which are not
indexed are DEAD, we should be fine.

Is there something I am missing? Would it help to explain the idea
if I post the patch?

CREATE INDEX and CREATE INDEX CONCURRENTLY turned
out to be much more difficult than I imagined earlier. While we are
discussing CREATE INDEX, I will post a design for CIC. I
restrained myself till now to avoid confusion, but with time running
out, it would be helpful to get agreement so that we can finish
the patch on time.

Thanks,
Pavan


--

EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-29 Thread Gregory Stark

Tom Lane [EMAIL PROTECTED] writes:

> However, if you design something where an index becomes usable due
> to the passage of time rather than an explicit mark-valid step,
> there's gonna be a problem.  I'd suggest trying to stick to the
> way CREATE INDEX CONCURRENTLY does it...

I'm a bit skeptical about the idea of CREATE INDEX (ie, non-concurrent)
creating an index that won't be used for a while. We get enough people asking
why Postgres isn't using an index as it is... Besides, it seems if people are
happy to have indexes take a long time to build they could just do a
concurrent build. The reason they do non-concurrent builds is precisely
because they're willing to take an exclusive lock in order to have them
complete as soon as possible.

Earlier we were talking about not inserting any HOT tuples until the index
became valid. The goal of having an xid on the index was so we would know when
we could start doing HOT updates again. That seems like a much lesser cost
than not being able to use the index until all live transactions exit.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [HACKERS] Group Commit

2007-03-29 Thread Heikki Linnakangas

Tom Lane wrote:

Heikki Linnakangas [EMAIL PROTECTED] writes:
I've been working on the patch to enhance our group commit behavior. The 
patch is a dirty hack at the moment, but I'm settled on the algorithm 
I'm going to use and I know the issues involved.

...
The timeout is currently hard-coded at 1 ms.


This is where my bogometer triggered.  There's way too many platforms
where 1 msec timeout is a sheer fantasy.  If you cannot make it perform
well with a 10-msec timeout then I don't think it's going to be at all
portable.

Now I know that newer Linux kernels tend to ship with 1KHz scheduler
tick rate, so there's a useful set of platforms where you could make it
work even so, but I'm not really satisfied with saying this facility is
only usable if you have a fast kernel tick rate ...


The 1 ms timeout isn't essential for the algorithm. In fact, I chose it
arbitrarily; in the quick tests I did, the length of the timeout didn't
seem to matter much. I'm running with a CONFIG_HZ=250 kernel myself, which
means that the timeout is really 4 ms on my laptop.


I suspect the tick rate largely explains why the current commit_delay
isn't very good: even though you specify it in microseconds, it
really waits a lot longer. With the proposed algorithm, the fsync is
started immediately when enough commit records have been inserted, so
the timeout only comes into play when the estimate for the group size is
too high.


With a higher-precision timer, we could vary not only the commit group 
size but also the timeout.
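
Incidentally, the effective granularity is easy to measure; here is a
quick self-contained sketch using only standard POSIX calls that shows
what a requested 1 ms sleep really costs on a given kernel:

/* Measure what a requested 1 ms sleep actually takes. */
#include <stdio.h>
#include <time.h>

int
main(void)
{
    struct timespec req = { 0, 1000 * 1000 };   /* request 1 ms */
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < 100; i++)
        nanosleep(&req, NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ms = (t1.tv_sec - t0.tv_sec) * 1000.0 +
                (t1.tv_nsec - t0.tv_nsec) / 1e6;
    printf("100 x 1 ms sleeps took %.1f ms (avg %.2f ms each)\n",
           ms, ms / 100.0);
    return 0;
}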


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Group Commit

2007-03-29 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
> Tom Lane wrote:
> > This is where my bogometer triggered.  There's way too many platforms
> > where 1 msec timeout is a sheer fantasy.  If you cannot make it perform
> > well with a 10-msec timeout then I don't think it's going to be at all
> > portable.
>
> The 1 ms timeout isn't essential for the algorithm.

OK, but when you get to performance testing, please see how well it
works at CONFIG_HZ=100.

regards, tom lane



[HACKERS] CREATE INDEX CONCURRENTLY and HOT

2007-03-29 Thread Pavan Deolasee

Sorry to start another thread while we are still discussing the CREATE
INDEX design, but I need help/suggestions to finish the patch on
time for 8.3.

We earlier thought that CREATE INDEX CONCURRENTLY (CIC)
would be simpler to do because of the existing waits in CIC.
But one major problem with CIC is that UPDATEs are allowed
while we are building the index, and these UPDATEs can create
HOT-chains which have different values for the attributes on which
we are building the new index. To keep the HOT-chain semantics
consistent across the old and new indexes, we might be forced to
delete the old index entry and reinsert a new one during the
validate_index() phase. This is of course not easy.

May I propose the following design which is less intrusive:

We do CIC in three phases:

In the first phase we just create the catalog entry for the new
index, mark the index read-only and commit the transaction.
By read-only, I mean that the index is not ready for inserts, but
is consulted during UPDATEs to decide whether to do a HOT
UPDATE or not (just like other existing indexes). We then
wait for all transactions conflicting on ShareLock to complete.
That guarantees that all the existing transactions which
cannot see the new index catalog entry are finished.

A new transaction is started. We then build the index just the
way we do today. While we are building the index, no new
HOT-chains can be created where the index keys do not
match, because the new index is consulted when deciding
whether to do a HOT UPDATE or not.

At the end of this step, the index is marked ready for
inserts, we once again wait for all transactions conflicting
on ShareLock to finish, and commit the transaction.

In the third phase, we validate the index, inserting any
missing index entries for tuples which are not HEAP_ONLY.
For HEAP_ONLY tuples we already have the index entry,
though it points to the root tuple. That's OK because we
guarantee that all tuples in the chain share the same key
with respect to the old as well as the new indexes.

We then mark the index valid and commit.

In summary, this design introduces one more transaction
and wait. But that should not be a problem because we
would anyway wait for those transactions to finish today,
though a bit later in the process.

Comments/suggestions ?

Thanks,
Pavan

--

EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] Patch queue concern

2007-03-29 Thread Bruce Momjian
Gregory Stark wrote:
 
> Bruce Momjian [EMAIL PROTECTED] writes:
>
> > > It favours people who are short-sighted and don't see what possible
> > > improvements their code has. No code in an ongoing project like this
> > > is ever completed anyways.
> >
> > It favors those who do not wait until the last minute, but complete them
> > well before the freeze date.
>
> What is this "complete" you keep talking about? Should I stop working on
> the sort/limit patch even though Heikki pointed out a few things to clean
> up and the cost model isn't updated yet, just so that you'll consider it
> "complete" and put it on the patch queue? If I don't stop working on it,
> do you think we should just ignore it even if it's in a usable state now?
> Even the cost model changes could be done pretty easily with some guidance
> from a review.

"Complete" means the author _thinks_ he is done, and has responded to all
community comments on the patch.

> > > It's also an artifact of the working model we have where patches are
> > > sent in big chunks and reviewed much later during a feature freeze. If
> > > we were committing directly into a CVS repository we would have wanted
> > > to commit these changes as soon as they were ready for committing, not
> > > wait until they're completed. Then continue working and commit further
> > > changes. It's only
> >
> > This would have CVS containing incomplete features --- and before beta,
> > we would either have to beg the authors to complete them, or rip them
> > out, neither of which we want to do.
>
> You don't want to commit something if it's in an unusable state and would
> have to be ripped out without more work. I said "as soon as they're ready
> for committing" as opposed to "completed".
>
> You're asking people if they've stopped working on patches and you're
> surprised to find that there are a lot of patches people are still
> working on.
>
> That's silly; of course people are still working on them, many of these
> tasks are open ended and can be improved as long as we have time. Just
> because they're still working on them doesn't necessarily mean what they
> have so far isn't worth committing as-is yet.

We don't want open-ended patches a few days before feature freeze.  We
want them done, at some complete stopping point, and submitted.

> > OK, but we don't want something that is ready to be committed, we need
> > it complete.
>
> So how many more releases before you think Postgres is "complete"?

I am getting tired of your semantic games, here, Greg. I have no idea
what you are trying to accomplish, but I have better things to do.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Fixing insecure security definer functions

2007-03-29 Thread Tom Lane
As was pointed out awhile ago
http://archives.postgresql.org/pgsql-general/2007-02/msg00673.php
it's insecure to run a SECURITY DEFINER function with a search_path
setting that's under the control of someone who wishes to subvert
the function.  Even for non-security-definer functions, it seems
useful to be able to select the search path for the function to use;
we've had requests for that before.  Right now, this is possible but
tedious and slow, because you really have to use a subtransaction
to ensure that the path is reset correctly on exit:

BEGIN
  SET LOCAL search_path = ...;
  ... useful work here ...
EXCEPTION
END

(In fact it's worse than that, since you can't write an EXCEPTION
without at least one WHEN clause, which is maybe something to change?)
Also, this approach isn't available in plain SQL functions.

I would like to fix this for 8.3.  I don't have a patch yet but want
to get buy-in on a design before feature freeze.  I propose the
following, fully-backward-compatible design:

1. Add a text column propath to pg_proc.  It can be either NULL or
a search path specification (interpreted the same as values for the
search_path GUC variable).  NULL means use the caller's setting, ie,
current behavior.

2. When fmgr.c sees either prosecdef or propath set for a function to be
called, it will insert the fmgr_security_definer hook into the call.
fmgr_security_definer will be responsible for establishing the correct
current-user and/or path settings and restoring them on exit.  (We could
use two independent hooks, but since these features will often be used
together, implementing both with just one hook seems reasonable.)

3. Add optional clauses to CREATE FUNCTION and ALTER FUNCTION to specify
the propath value.  I suggest, but am not wedded to,
PATH 'foo, bar'
PATH NONE
Since PATH NONE is the default, it's not really needed in CREATE
FUNCTION, but it seems useful to allow it for ALTER FUNCTION.

Comments?

regards, tom lane



Re: [HACKERS] Patch queue concern

2007-03-29 Thread Joshua D. Drake



We don't want open-ended patches a few days before feature freeze. We want
them done, at some complete stopping point, and submitted.


OK, but we don't want something that is ready to be committed, we need
it complete.
So how many more releases before you think Postgres is "complete"?


I am getting tired of your semantic games, here, Greg. I have no idea
what you are trying to accomplish, but I have better things to do.


I have to concur here. Everyone is doing the best that they can. Greg, 
how about reviewing some patches?


Joshua D. Drake



--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-29 Thread Pavan Deolasee

On 3/29/07, Gregory Stark [EMAIL PROTECTED] wrote:


Besides, it seems if people are
happy to have indexes take a long time to build they could just do a
concurrent build.



I think we discussed this earlier. One of the down-sides of CIC is that
it needs two complete heap scans. Apart from that, CIC itself needs
to wait for all existing transactions to finish, and more than one
instance of CIC cannot be run on a table.



Earlier we were talking about not inserting any HOT tuples until the index
became valid. The goal of having an xid on the index was so we would know
when we could start doing HOT updates again. That seems like a much lesser
cost than not being able to use the index until all live transactions exit.



What I am proposing is to keep the index unusable for existing transactions.
The index is available for all new transactions even if there are unfinished
existing transactions. Is that a big problem? Well, I still need buy-in and
review from Tom and others on the design, but it seems workable to me.


Thanks,
Pavan


--

EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] Patch queue concern

2007-03-29 Thread Bruce Momjian
Tom Lane wrote:
> Simon Riggs [EMAIL PROTECTED] writes:
> > My feeling is we should have more regular sync points where the patch
> > queue is emptied and everything committed or rejected.
>
> No doubt, but the real problem here is that reviewing/committing other
> people's patches is not fun, it's just work :-(.  So it's no surprise
> that it tends to get put off.  Not sure what to do about that.

Of course, writing patches isn't totally _fun_ either.

The big problem is shown in this chart:

              P a t c h   C o m p l e x i t y
Developer   |  Simple    Complex
---------------------------------
Experienced |  Easy      Medium
Novice      |  Medium    Hard

The basic problem is we have a lot of complex patches coming in, and
many from people who do not have years of experience with submitting
patches to PostgreSQL.  A complex patch from a novice user takes a lot
of time to review, and frankly, we don't have enough experienced
developers doing such reviews.  If the patch deals with an area of the
code where I am not experienced, often even I am incapable of reviewing
the patch.

The bottom line is that we are getting more novice developers faster
than we grow experienced developers.  This is no big surprise, and I
don't see a simple solution.  Odds are this is going to continue.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Patch queue concern

2007-03-29 Thread Bruce Momjian
Gregory Stark wrote:
 
> Tom Lane [EMAIL PROTECTED] writes:
>
> > Simon Riggs [EMAIL PROTECTED] writes:
> > > My feeling is we should have more regular sync points where the patch
> > > queue is emptied and everything committed or rejected.
> >
> > No doubt, but the real problem here is that reviewing/committing other
> > people's patches is not fun, it's just work :-(.  So it's no surprise
> > that it tends to get put off.  Not sure what to do about that.
>
> Obviously a big part of that is that we just don't have enough committers.
> I'm hopeful that in time that situation will improve, but in the meantime
> we do have a problem and the burden falls unfairly on a few.
>
> Is there anything others can do to help? If non-committers like Simon or I
> reviewed patches, would it be easier for you to give a quick agreement to
> the comments or a "that's not an issue" comment?

Just to clarify, the committing is the easy part.  I can do that all day
and not break a sweat.  It is making sure the patch is correct in all
aspects --- functionality, clarity, modularity, reliability, design,
etc. --- that takes lots of time, and that really can be done by anyone in
the community.  We already have people commenting on other people's patches,
and new versions appearing, and every new version makes the final job of
review/commit easier, because someone was going to have to make those
changes before the patch was applied.

> It seems like we do have a few committers who should be able to review code
> quality but are uncertain about making major design decisions. If, for
> example, Bruce or Jan reviewed patches more invasive than they usually do
> for code quality and checked with you on design questions would that be
> helpful?

I wish that would work, but the big trick is getting the entire problem
into your head, matching user behavior with our existing code, and
making those link up.  There is really no _staged_ nature to final patch
review.  People can still comment on the patch, and improve it, but the
final decision has to be a holistic one that makes sure the entire
patch is in harmony.  (I am sounding new-age here.  :-) )

You might remember that during, I think, 8.1 I started pushing patches
because no one was objecting to them, and people complained because the
patches were not complete.  The patches had problems, but I was unable to
fully understand some of them, and they had to be backed out.  Since then,
I haven't applied anything I didn't fully understand, so patches now
languish in the patch queue until an experienced PostgreSQL developer who
does fully understand them can give me a green light.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Patch queue concern

2007-03-29 Thread August Zajonc
Bruce Momjian wrote:
> > > OK, but we don't want something that is ready to be committed, we need
> > > it complete.
> >
> > So how many more releases before you think Postgres is "complete"?
>
> I am getting tired of your semantic games, here, Greg. I have no idea
> what you are trying to accomplish, but I have better things to do.

Why not just post a specific list of the patches you are thinking of? Is
it the patch queue list in total? Did I miss it?

Without specifics these things just spiral on forever, as all debates
about code do when there is no code to actually look at.

With specifics it is self-documenting and definitional. You are thinking
/ concerned about x patches. Folks can look at how to move them forward,
and it would probably help guide community attention.







Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-29 Thread Bruce Momjian
Pavan Deolasee wrote:
> > Earlier we were talking about not inserting any HOT tuples until the index
> > became valid. The goal of having an xid on the index was so we would know
> > when we could start doing HOT updates again. That seems like a much lesser
> > cost than not being able to use the index until all live transactions exit.
>
> What I am proposing is to keep the index unusable for existing transactions.
> The index is available for all new transactions even if there are unfinished
> existing transactions. Is that a big problem? Well, I still need buy-in and
> review from Tom and others on the design, but it seems workable to me.

Yes, that seems totally acceptable to me.  As I remember, the index is
usable by the transaction that created it, and new transactions.  Hard
to see how someone would have a problem with that.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Fixing insecure security definer functions

2007-03-29 Thread Merlin Moncure

On 3/29/07, Tom Lane [EMAIL PROTECTED] wrote:

As was pointed out awhile ago
http://archives.postgresql.org/pgsql-general/2007-02/msg00673.php
it's insecure to run a SECURITY DEFINER function with a search_path
setting that's under the control of someone who wishes to subvert
the function.  Even for non-security-definer functions, it seems
useful to be able to select the search path for the function to use;
we've had requests for that before.  Right now, this is possible but
tedious and slow, because you really have to use a subtransaction
to ensure that the path is reset correctly on exit:

BEGIN
  SET LOCAL search_path = ...;
  ... useful work here ...
EXCEPTION
END

(In fact it's worse than that, since you can't write an EXCEPTION
without at least one WHEN clause, which is maybe something to change?)
Also, this approach isn't available in plain SQL functions.
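
A minimal runnable form of the workaround, for illustration (names are
made up; note the dummy WHEN clause the parenthetical above complains
about):

CREATE FUNCTION set_path_demo() RETURNS integer AS $$
BEGIN
    SET LOCAL search_path = pg_catalog;
    RETURN 1;   -- useful work here
EXCEPTION
    WHEN OTHERS THEN
        -- the exception block forces a subtransaction, so the SET LOCAL
        -- is undone if an error propagates out of the function
        RAISE EXCEPTION 'set_path_demo failed';
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;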

I would like to fix this for 8.3.  I don't have a patch yet but want
to get buy-in on a design before feature freeze.  I propose the
following, fully-backward-compatible design:

1. Add a text column propath to pg_proc.  It can be either NULL or
a search path specification (interpreted the same as values for the
search_path GUC variable).  NULL means use the caller's setting, ie,
current behavior.

2. When fmgr.c sees either prosecdef or propath set for a function to be
called, it will insert the fmgr_security_definer hook into the call.
fmgr_security_definer will be responsible for establishing the correct
current-user and/or path settings and restoring them on exit.  (We could
use two independent hooks, but since these features will often be used
together, implementing both with just one hook seems reasonable.)

3. Add optional clauses to CREATE FUNCTION and ALTER FUNCTION to specify
the propath value.  I suggest, but am not wedded to,
PATH 'foo, bar'
PATH NONE
Since PATH NONE is the default, it's not really needed in CREATE
FUNCTION, but it seems useful to allow it for ALTER FUNCTION.
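
For illustration, a declaration under the proposed (not yet implemented)
syntax might read:

CREATE FUNCTION get_secret(integer) RETURNS text
    SECURITY DEFINER
    PATH 'pg_catalog, private_stuff'
    AS 'SELECT secret FROM private_stuff.secrets WHERE id = $1'
    LANGUAGE sql;

with PATH NONE (the default) giving today's use-the-caller's-path
behavior.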


fwiw, I think this is a great solution...because the default behavior
is preserved, you get by without any extra GUC settings (although
you may want to add one anyway).

maybe security definer functions should raise a warning for implicit
PATH NONE, and possibly even deprecate that behavior and force people
to type it out in future (8.4+) releases.

merlin

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Fixing insecure security definer functions

2007-03-29 Thread Stephen Frost
* Merlin Moncure ([EMAIL PROTECTED]) wrote:
 fwiw, I think this is a great solution...because the default behavior
 is preserved you get through without any extra guc settings (although
 you may want to add one anyways).

I agree that the proposed solution looks good.

 maybe security definer functions should raise a warning for implicit
 PATH NONE, and possibly even deprecate that behavior and force people
 to type it out in future (8.4+) releases.

While I agree that raising a warning makes sense, I don't believe it
should be forced.  There may be cases where, even in security definer
functions, the current search_path should be used (though, of course,
care must be taken in writing such functions).

Thanks,

Stephen




Re: [HACKERS] Patch queue concern

2007-03-29 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 The basic problem is we have a lot of complex patches coming in, and
 many from people who do not have years of experience with submitting
 patches to PostgreSQL.  A complex patch from a novice user takes a lot
 of time to review, and frankly, we don't have enough experienced
 developers doing such reviews.

Part of the issue is that we have a lot of new developers who are trying
to solve hard problems without having learned their way around the code
by fixing easy stuff.  It was easier some years ago for people to learn
that way, because there was way more low-hanging fruit back then.  But
there's still some out there.  I have a distinct sense that we are
getting patches from people who are trying to run before they've learned
to walk.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Fixing insecure security definer functions

2007-03-29 Thread Merlin Moncure

On 3/29/07, Stephen Frost [EMAIL PROTECTED] wrote:

* Merlin Moncure ([EMAIL PROTECTED]) wrote:
 fwiw, I think this is a great solution...because the default behavior
 is preserved you get through without any extra guc settings (although
 you may want to add one anyways).

I agree that the proposed solution looks good.

 maybe security definer functions should raise a warning for implicit
 PATH NONE, and possibly even deprecate that behavior and force people
 to type it out in future (8.4+) releases.

While I agree that raising a warning makes sense I don't believe it
should be forced.  There may be cases where, even in security definer
functions, the current search_path should be used (though, of course,
care must be taken in writing such functions).


I agree... I'm just suggesting making people explicitly write 'PATH NONE'
for security definer functions, because of the security risk... just a
thought, though.

merlin

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Fixing insecure security definer functions

2007-03-29 Thread Tom Lane
Stephen Frost [EMAIL PROTECTED] writes:
 * Merlin Moncure ([EMAIL PROTECTED]) wrote:
 maybe security definer functions should raise a warning for implicit
 PATH NONE, and possibly even deprecate that behavior and force people
 to type it out in future (8.4+) releases.

 While I agree that raising a warning makes sense I don't believe it
 should be forced.

A WARNING seems reasonable to me too.  I'd just do it on the combination
of SECURITY DEFINER with PATH NONE, regardless of how you typed it
exactly.  ALTERing a function into that configuration should draw the
same warning.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] tsearch_core patch for inclusion

2007-03-29 Thread Bruce Momjian

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---


Teodor Sigaev wrote:
 http://www.sigaev.ru/misc/tsearch_core-0.43.gz
 
 Changes:
 1 Ispell dictionary now supports the hunspell dictionary format, which is
   used by OpenOffice >= 2.0.2
   http://wiki.services.openoffice.org/wiki/Dictionaries
   The format changes are aimed, basically, at better support of compound
   words (German, Norwegian). So, please, test it - we don't know those
   languages at all.
 2 Added recent fixes of contrib/tsearch2
 3 Fixed usage of fopen/fclose
 
 -- 
 Teodor Sigaev   E-mail: [EMAIL PROTECTED]
 WWW: http://www.sigaev.ru/
 
 ---(end of broadcast)---
 TIP 7: You can help support the PostgreSQL project by donating at
 
 http://www.postgresql.org/about/donate

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Concurrent connections in psql

2007-03-29 Thread Sailesh Krishnamurthy
+++

We'd love this feature as it would really help us write better test cases!

Regards
Sailesh

--
Sailesh Krishnamurthy
Amalgamated Insight
[W] (650) 242-3503
[C] (650) 804-6585

-Original Message-
From: [EMAIL PROTECTED] 
[mailto:[EMAIL PROTECTED] On Behalf Of Gregory Stark
Sent: Tuesday, March 27, 2007 6:39 AM
To: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Concurrent connections in psql


Would people be interested in this feature? There was some positive reaction
from users but I'm not sure how excited developers are about complicating the
logic in psql (which is already pretty tangled).

This code bitrotted severely when Tom added the cursor support to psql. I
don't mind redoing it if people want it though. I already did a first pass at
doing so but it wasn't clear to me how best to integrate it with that cursor
support change. I elected to treat each chunk of results from the cursor as a
separate result set which makes it possible to switch connections between
chunks. That's nice but probably not really acceptable judging by how much
effort Tom went through in the cursor code to avoid having the chunks appear
as separate result sets. Probably I'll have to do more work in that area.

Are people interested in having this? The reason I think it's particularly
interesting is writing regression tests -- especially to test HOT cases.


Gregory Stark [EMAIL PROTECTED] writes:

 I mentioned this a while back, now that 8.2 is out perhaps others will be
 more interested in new code.

 Currently Postgres regression tests only test functionality within a single
 session. There are no regression tests that test the transaction semantics
 or locking behaviour across multiple transactions.

 I modified psql to allow you to open multiple connections and switch between
 them with a sort of csh job control style interface. It actually works out
 pretty well. It's fairly easy to write regression tests for basic 2-client
 or 3-client cases.

 The actual user interface may need some discussion though. I didn't want to
 play the name game so I just prefixed all my commands with c and figured we
 can always rename them later.
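
 To make that concrete, a two-client test under this interface might look
 roughly like this (command names other than \cwait are invented here for
 illustration):

 \cconnect                    -- open a second connection and switch to it
 BEGIN;
 UPDATE accounts SET balance = 0 WHERE id = 1;
 \cswitch 1                   -- back to the first connection
 \cwait                       -- settle the race with connection 2
 SELECT balance FROM accounts WHERE id = 1;  -- pre-update value until commit
 \cswitch 2
 COMMIT;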

 And experience with actually writing the tests shows that the explicit
 \cwait command which was needed to eliminate (in practice if not in theory)
 race conditions in regression tests turns out to be more flexibility than
 necessary. Since you end up having to insert one in precisely predictable
 locations -- namely after every asynchronous command and after every
 connection switch -- perhaps it would be simpler to just have a \pset cwait
 command that automatically introduces timeouts in precisely those places.

 A brief explanation including an example regression test (the SAVEPOINT
 locking bug discovered recently) and the patch here:

   http://community.enterprisedb.com/concurrent/index.html

 -- 
   Gregory Stark
   EnterpriseDB  http://www.enterprisedb.com



-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Server-side support of all encodings

2007-03-29 Thread Dezso Zoltan

Hello Everyone,

I very much understand why SJIS is not a server encoding. It contains
ASCII second bytes (including \ and ', both of which can be really
nasty inside normal SQL) and, further, half-width katakana is
represented as one-byte characters, two of which incidentally coincide
with a kanji.

My question is, however: what would be the best practice if it were
imperative to use the SJIS encoding for texts and no built-in conversions
are useful? To elaborate, I need to support Japanese emoji characters,
which are special emoticons for mobile phones. These characters are
usually in a region that is not specified by standard SJIS, and
therefore they are not properly converted to either EUC or UTF8 (which
would be my preferred choice, but unfortunately not all mobile phones
support it, so conversion is still necessary - from what I've seen,
the new SJIS_2004 map seems to define these entities, but I'm not 100%
sure they all get converted properly).

I inherited a system in which this problem is bypassed by setting the
SQL_ASCII server encoding, but that is not the best solution (full
text search is rendered useless, and occasionally the special-character
issue rears its ugly head - not only do we have to deal with normal
SQL injection, but also with encoding-based injections). (And for the
real WTF, my predecessor converted everything to EUC before inserting -
eventually losing all the emojis and creating all sorts of strange
phenomena, like tables with one column in EUC until a certain date and
SJIS from then on, while EUC for all other columns.)

Is there a way to properly deal with SJIS+emoji extensions (a patch
I'm not aware of, for example)? Is it considered a todo for future
releases, or should I consider augmenting postgres in some way (if the
latter, could you provide any pointers on how to proceed)?

Thank you,
Zaki

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Tom Lane
Sent: Monday, March 26, 2007 11:20 AM
To: ITAGAKI Takahiro
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Server-side support of all encodings

ITAGAKI Takahiro [EMAIL PROTECTED] writes:

PostgreSQL supports SJIS, BIG5, GBK, UHC and GB18030 as client encodings,
but we cannot use them as server encodings. Are there any reasons for it?


Very much so --- they aren't safe ASCII-supersets, and thus for example
the parser will fail on them.  Backend encodings must have the property
that all bytes of a multibyte character are >= 128.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-29 Thread Josh Berkus
Simon,

 OK, different question:
 Why would anyone ever set full_page_compress = off?

The only reason I can see is if compression costs us CPU but gains RAM &
I/O.  I can think of a lot of applications ... benchmarks included ...
which are CPU-bound but not RAM- or I/O-bound.  For those applications,
compression is a bad tradeoff.

If, however, CPU used for compression is made up elsewhere through smaller 
file processing, then I'd agree that we don't need a switch.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-29 Thread Gregory Stark
Pavan Deolasee [EMAIL PROTECTED] writes:

 What I am proposing is to keep index unusable for existing transactions.
 The index is available for all new transactions even if there are unfinished
 existing transactions. 

Ah thanks, that makes a lot more sense. Sorry for the false alarm.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Server-side support of all encodings

2007-03-29 Thread Martijn van Oosterhout
On Wed, Mar 28, 2007 at 10:44:00AM +0900, Dezso Zoltan wrote:
 My question is, however: what would be the best practice if it was
 imperative to use SJIS encoding for texts and no built-in conversions
 are useful? To elaborate, I need to support japanese emoji characters,
 which are special emoticons for mobile phones. These characters are
 usually in a region that is not specified by the standard SJIS,

Unicode also defines a region for user-defined characters. While it may
be odd, perhaps it would be reasonable to map the user-defined space in
SJIS to somewhere in the user-defined space in Unicode, so that you at
least get round-trip consistency.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [HACKERS] problems with plan invalidation

2007-03-29 Thread Tom Lane
Kris Jurka [EMAIL PROTECTED] writes:
 Running the JDBC driver's regression test suite for the first time in a 
 while I got a lot of failures that I would have to guess are related to 
 plan invalidation work.  Attached is a self contained test case and the 
 JDBC driver's log of what protocol messages it is sending.

I've committed a fix for this case --- please give it another try to see
if you find any other problems.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-29 Thread Simon Riggs
On Thu, 2007-03-29 at 22:08 +0530, Pavan Deolasee wrote:
 
 
 On 3/29/07, Tom Lane [EMAIL PROTECTED] wrote:
 
 It will replan at the first use of the plan after seeing the relcache
 inval sent by commit of the index-creating transaction.  If you have
 two separate transactions to create an index and then mark it valid
 later, everything's fine because there are two inval events.
 However, if you design something where an index becomes usable due
 to the passage of time rather than an explicit mark-valid step,
 there's gonna be a problem.  I'd suggest trying to stick to the
 way CREATE INDEX CONCURRENTLY does it...
 
 
 I had earlier proposed to do things the CIC way, but there were objections
 to the additional wait introduced in CREATE INDEX, and I don't
 think they were unreasonable. Maybe we can avoid the wait if there
 are no HOT-chains in the table, but we still need agreement on that.
 
 OTOH ISTM that the pg_index:xcreate solution may work fine if
 we can keep the index unusable for those transactions which started
 before CREATE INDEX could commit.

Pavan, ISTM you have misunderstood Tom slightly.

Having the index invisible to all current transactions is acceptable.

However, the other backends will not receive an invalidation event,
which means even when they start new transactions they will still not
see it, which is not acceptable.

ISTM that the run-another-transaction-afterwards idea is the only one
that does everything I think we need. I really do wish we could put in a
wait, like CIC, but I just think it will break existing programs.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-29 Thread Simon Riggs
On Thu, 2007-03-29 at 13:55 -0400, Bruce Momjian wrote:
 Pavan Deolasee wrote:
   Earlier we were talking about not inserting any HOT tuples until the index
   became valid. The goal of having an xid on the index was so we would know
   when
   we could start doing HOT updates again. That seems like a much lesser cost
   than not being able to use the index until all live transactions exit.
  
  
  What I am proposing is to keep index unusable for existing transactions.
  The index is available for all new transactions even if there are unfinished
  existing transactions. Is that a big problem ? Well, I still need buy-in and
  review from Tom and others on the design, but it seems workable to me.
 
 Yes, that seems totally acceptable to me.  As I remember, the index is
 usable by the transaction that created it, and new transactions.  Hard
 to see how someone would have a problem with that.

Agreed.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] Feature thought: idle in transaction timeout

2007-03-29 Thread Joshua D. Drake

Hello,

I ran into an interesting problem with a customer today. They are
running Jabber XCP (not the one we use). Unfortunately, the product has
a bug that causes it to leave connections sitting idle in an open
transaction. This is what it does:


BEGIN; SELECT 1;

Basically it is verifying that the connection is live. However, it never 
calls commit. So what happens? We can't vacuum ;).


Anyway, my thought is: we know when a transaction is idle, so why not
have an idle timeout after which we explicitly close the connection or
roll back or something? User configurable, of course.
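
In the meantime, such sessions are at least easy to spot (column names
as of the 8.1/8.2 statistics views, with stats_command_string on):

SELECT procpid, usename, query_start
FROM pg_stat_activity
WHERE current_query = '<IDLE> in transaction';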


Sincerely,

Joshua D. Drake



--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Modifying TOAST thresholds

2007-03-29 Thread Simon Riggs
On Thu, 2007-03-29 at 12:05 -0400, Tom Lane wrote:
 I think the WAL-reduction proposal needs more time and thought than is
 feasible before 8.3.

Agreed.

We really need to focus on the major features.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [JDBC] problems with plan invalidation

2007-03-29 Thread Kris Jurka



On Thu, 29 Mar 2007, Tom Lane wrote:


Kris Jurka [EMAIL PROTECTED] writes:

Running the JDBC driver's regression test suite for the first time in a
while I got a lot of failures that I would have to guess are related to
plan invalidation work.  Attached is a self contained test case and the
JDBC driver's log of what protocol messages it is sending.


I've committed a fix for this case --- please give it another try to see
if you find any other problems.



That seems to fix all of them.  Thanks.

Kris Jurka

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[HACKERS] timing in PostgreSQL

2007-03-29 Thread Carlos Chacon

Hi, I need your help, Postgres gurus...

I have been modifying postgres for a while... and I need to test the
performance (for now, only time) of my modifications against the original
postgres.

My first approach was to use the clock() function of the time.h C library
in the postgres.c file, but reading the file I noticed that you can set a
group of options that print statistics (general, or specific to postgres
modules, written to the server log) for each query that you send to the
postgres server...
So I did that...

Now, I don't have a clue how to interpret what postgres prints. I searched
the postgres manual but found nothing about it.

So, where can I find information about how to read these statistics? (This
sounds like a pgadmin question, but read the second one.)

And are those times exclusively postgres times, or do they include the
time postgres spends processing the query plus the time postgres waits
while other processes are running in the OS (you know, multitasking)?

By the way, is this a good approach, or will I have to take the first
approach that I mentioned?
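
For reference, the switches described above are presumably the
log_*_stats settings; their output distinguishes elapsed (wall clock)
time from user/system CPU time, which bears on the multitasking
question. A quick sketch:

SET log_statement_stats = on;  -- overall per-statement totals, in the log
-- or instead the per-module versions (cannot be combined with the above):
-- SET log_parser_stats = on;
-- SET log_planner_stats = on;
-- SET log_executor_stats = on;

EXPLAIN ANALYZE SELECT count(*) FROM pg_class;  -- per-plan-node timings

For client-visible elapsed time, including the network round trip,
psql's \timing is the usual tool.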

Thanks.

Waiting your answers,
Carlos Chacon.


Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-29 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 ISTM that the run-another-transaction-afterwards idea is the only one
 that does everything I think we need. I really do wish we could put in a
 wait, like CIC, but I just think it will break existing programs.

Actually, there's a showstopper objection to that: plain CREATE INDEX
has to be able to run within a larger transaction.  (To do otherwise
breaks pg_dump --single-transaction, just for starters.)  This means
it can *not* commit partway through.
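
(For concreteness, the behavior that must keep working is:

BEGIN;
CREATE TABLE t (a integer);
INSERT INTO t VALUES (1);
CREATE INDEX t_a_idx ON t (a);  -- plain CREATE INDEX works in a
COMMIT;                         -- transaction block; no early commit

whereas CREATE INDEX CONCURRENTLY already refuses to run inside a
transaction block.)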

Back to the drawing board :-(

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-29 Thread Florian G. Pflug

Tom Lane wrote:

Simon Riggs [EMAIL PROTECTED] writes:

ISTM that the run-another-transaction-afterwards idea is the only one
that does everything I think we need. I really do wish we could put in a
wait, like CIC, but I just think it will break existing programs.


Actually, there's a showstopper objection to that: plain CREATE INDEX
has to be able to run within a larger transaction.  (To do otherwise
breaks pg_dump --single-transaction, just for starters.)  This means
it can *not* commit partway through.


I believe the original idea was to invent some kind of "on commit, run
this transaction" hook - similar to how files are deleted on commit,
I think. At least that's how I understood "run another transaction on
commit"...

greetings, Florian Pflug


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[HACKERS] List of uncompleted patches for 8.2

2007-03-29 Thread Bruce Momjian
Here is a web site showing all the uncompleted patches for 8.2 that I am
worried about:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

I can update the list as I get feedback.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] List of uncompleted patches for 8.2

2007-03-29 Thread Bruce Momjian
Bruce Momjian wrote:
 Here is a web site showing all the uncompleted patches for 8.2 that I am

Sorry, meant 8.3.

---


 worried about:
 
   http://momjian.postgresql.org/cgi-bin/pgpatches_hold
 
 I can update the list as I get feedback.
 
 -- 
   Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
   EnterpriseDB   http://www.enterprisedb.com
 
   + If your life is a hard drive, Christ can be your backup. +
 
 ---(end of broadcast)---
 TIP 3: Have you checked our extensive FAQ?
 
http://www.postgresql.org/docs/faq

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-29 Thread Simon Riggs
On Thu, 2007-03-29 at 11:45 -0700, Josh Berkus wrote:

  OK, different question:
  Why would anyone ever set full_page_compress = off?
 
 The only reason I can see is if compression costs us CPU but gains RAM &
 I/O.  I can think of a lot of applications ... benchmarks included ...
 which are CPU-bound but not RAM- or I/O-bound.  For those applications,
 compression is a bad tradeoff.
 
 If, however, CPU used for compression is made up elsewhere through smaller 
 file processing, then I'd agree that we don't need a switch.

Koichi-san has explained things for me now.

I misunderstood what the parameter did and, reading your post, ISTM you
have as well. I do hope Koichi-san will alter the name to allow
everybody to understand what it does.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] List of uncompleted patches for 8.3

2007-03-29 Thread Bruce Momjian
[ Sorry, reposted with correct subject line.]

Here is a web site showing all the uncompleted patches for 8.3 that I
am worried about:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-29 Thread Simon Riggs
On Thu, 2007-03-29 at 17:27 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  ISTM that the run-another-transaction-afterwards idea is the only one
  that does everything I think we need. I really do wish we could put in a
  wait, like CIC, but I just think it will break existing programs.
 
 Actually, there's a showstopper objection to that: plain CREATE INDEX
 has to be able to run within a larger transaction.  (To do otherwise
 breaks pg_dump --single-transaction, just for starters.)  This means
 it can *not* commit partway through.

I agree with most of that, but that's why we are where we are, and I
don't think it's a showstopper at all.

The idea is to make note that the transaction has created an index
within a transaction block, so that after the top level transaction
commits we sneak in an extra hidden transaction to update the pg_index
tuple with the xcreate of the first transaction. 

We don't do this after the CREATE INDEX statement ends, only at the end
of the transaction in which it ran. We only do this if we are creating
an index on a table that is not a temporary table and was not created
during the transaction (so --single-transaction isn't broken and doesn't
require this additional action).

i.e. MyTransactionCreatedVisibleIndex, with special processing in
xact.c.

The only other alternative is to forcibly throw a relcache inval event
in the same circumstances without running the additional transaction,
but the solution is mostly the same.

I agree this is weird, but it is no weirder a solution than CIC was when
that first came out. I don't like it, or think it's clever; I just think
it's the only thing on the table.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] Is this an psql Error???

2007-03-29 Thread Carlos Chacon

Hi...

I was trying to execute \timing in a psql console, but the command I
always get is \t.  Is it something I'm missing, or is it an error?
...My version of postgres is 8.1.4

Bye.


Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-29 Thread Florian G. Pflug

Simon Riggs wrote:

On Thu, 2007-03-29 at 17:27 -0400, Tom Lane wrote:

Simon Riggs [EMAIL PROTECTED] writes:

ISTM that the run-another-transaction-afterwards idea is the only one
that does everything I think we need. I really do wish we could put in a
wait, like CIC, but I just think it will break existing programs.

Actually, there's a showstopper objection to that: plain CREATE INDEX
has to be able to run within a larger transaction.  (To do otherwise
breaks pg_dump --single-transaction, just for starters.)  This means
it can *not* commit partway through.


The idea is to make note that the transaction has created an index
within a transaction block, so that after the top level transaction
commits we sneak in an extra hidden transaction to update the pg_index
tuple with the xcreate of the first transaction. 


The only other alternative is to forcibly throw a relcache inval event
in the same circumstances without running the additional transaction,
but the solution is mostly the same.


I think one alternative might be to store a list of xids together with
a cached plan, and replan if the commit status (as perceived by the
transaction the plan will be executed in) of one of those xids changes.

greetings, Florian Pflug


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-29 Thread Koichi Suzuki

Josh;

I'd like to explain again what the term compression in my proposal
means, and show the resource consumption comparison with cp and gzip.


My proposal is to remove unnecessary full page writes (they are needed
in crash recovery from inconsistent or partial writes) when we copy WAL
to the archive log, and to rebuild them as dummies when we restore from
the archive log.  The dummy is needed to maintain LSNs.  So it is very,
very different from general-purpose compression such as gzip, although
pg_compresslog compresses the archive log as a result.
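
In deployment terms the tools slot into the ordinary archiving hooks,
along these lines (paths and argument order here are illustrative, not
taken from the patch):

# postgresql.conf
archive_command = 'pg_compresslog %p /archive/%f'

# recovery.conf
restore_command = 'pg_decompresslog /archive/%f %p'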


As to CPU and I/O consumption, I've already evaluated it as follows:

1) Collect all the WAL segments.
2) Copy them by different means: cp, pg_compresslog and gzip.

and compared the elapsed time as well as other resource consumption.

Benchmark: DBT-2
Database size: 120WH (12.3GB)
Total WAL size: 4.2GB (after 60min. run)
Elapsed time:
  cp:             120.6sec
  gzip:           590.0sec
  pg_compresslog:  79.4sec
Resultant archive log size:
  cp:             4.2GB
  gzip:           2.2GB
  pg_compresslog: 0.3GB
Resource consumption:
  cp:   user:   0.5sec system: 15.8sec idle:  16.9sec I/O wait: 87.7sec
  gzip: user: 286.2sec system:  8.6sec idle: 260.5sec I/O wait: 36.0sec
  pg_compresslog:
user:   7.9sec system:  5.5sec idle:  37.8sec I/O wait: 28.4sec

Because the resultant log size is considerably smaller than with cp or
gzip, pg_compresslog needs much less I/O, and because the logic is much
simpler than gzip's, it consumes little CPU.


The term compress may not be appropriate.  We might call this log
optimization instead.


So I don't see any reason why this (at least the optimization mark in
each log record) can't be integrated.


Simon Riggs wrote:

On Thu, 2007-03-29 at 11:45 -0700, Josh Berkus wrote:


OK, different question:
Why would anyone ever set full_page_compress = off?
The only reason I can see is if compression costs us CPU but gains RAM &
I/O.  I can think of a lot of applications ... benchmarks included ...
which are CPU-bound but not RAM- or I/O-bound.  For those applications,
compression is a bad tradeoff.


If, however, CPU used for compression is made up elsewhere through smaller 
file processing, then I'd agree that we don't need a switch.


As I wrote in reply to Simon's comment, I am concerned about only one
thing.

Without a switch, because both the full page writes and the
corresponding logical log are included in WAL, this will increase WAL
size slightly (maybe about five percent or so).  If everybody is happy
with this, we don't need a switch.




Koichi-san has explained things for me now.

I misunderstood what the parameter did and reading your post, ISTM you
have as well. I do hope Koichi-san will alter the name to allow
everybody to understand what it does.



Here are some candidates:
full_page_writes_optimize
full_page_writes_mark: meaning it marks each full page write as needed
in crash recovery, needed in archive recovery, and so on.


I don't insist on these names.  It would be very helpful if you have
any suggestions that reflect what it really means.


Regards;
--
Koichi Suzuki

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[HACKERS] Autovacuum vs statement_timeout

2007-03-29 Thread Tom Lane
I seem to remember that we'd agreed that autovacuum should ignore any
globally set statement_timeout, on the grounds that a poorly chosen
setting could indefinitely prevent large tables from being vacuumed.
But I do not see anything in autovacuum.c that resets the variable.
Am I just being blind?  (Quite possible, as I'm tired and under the
weather.)

The thing that brought this to mind was the idea that Mark
Shuttleworth's open problem might be triggered in part by a statement
timeout interrupting autovacuum at an inopportune point --- some logs
he sent me offlist show that he is using statement_timeout ...

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Server-side support of all encodings

2007-03-29 Thread Tatsuo Ishii
 Hello Everyone,
 
 I very much understand why SJIS is not a server encoding. It contains
 ASCII second bytes (including \ and ' both of which can be really
 nasty inside a normal sql) and further, half-width katakana is
 represented as one byte-characters, incidentally two of which coincide
 with a kanji.
 
 My question is, however: what would be the best practice if it was
 imperative to use SJIS encoding for texts and no built-in conversions
 are useful? To elaborate, I need to support japanese emoji characters,
 which are special emoticons for mobile phones. These characters are
 usually in a region that is not specified by the standard SJIS,
 therefore they are not properly converted either to EUC or UTF8 (which
 would be my prefered choice, but unfortunately not all mobile phones
 support it, so conversion is still necessary - from what i've seen,
 the new SJIS_2004 map seems to define these entities, but I'm not 100%
 sure they all get converted properly).
 
 I inherited a system in which this problem is bypassed by setting
 SQL_ASCII server encoding, but that is not the best solution (full
 text search is rendered useless and occasionally the special character
 issue rears its ugly head - not only do we have to deal with normal
 sqlinjection, but also encoding-based injections) (and for the real
 WTF, my predecessor converted everything to EUC before inserting -
 eventually losing all the emojis and creating all sorts of strange
 phenomena, like tables with one column in euc until a certain date and
 sjis from then on while euc for all other columns)
 
 Is there a way to properly deal with sjis+emoji extensions (a patch
 i'm not aware of, for example), is it considered as a todo for further
 releases or should i consider augmenting postgres in a way (if the
 latter, could you provide any pointers on how to proceed?)

You can always use CREATE CONVERSION for this kind of purpose.
Create your own conversion map between SJIS and EUC or UTF-8.
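
A sketch of that approach (myschema.sjis_emoji_to_utf8 is a
hypothetical C conversion function you would have to write to cover
the emoji code points):

CREATE DEFAULT CONVERSION sjis_emoji_to_utf8
    FOR 'SJIS' TO 'UTF8'
    FROM myschema.sjis_emoji_to_utf8;

Placed in a schema ahead of pg_catalog on the search path, a DEFAULT
conversion should then be picked up for automatic client/server
encoding conversion.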
--
Tatsuo Ishii
SRA OSS, Inc. Japan

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-29 Thread Koichi Suzuki

Hi,

Here's a patch reflecting some of Simon's comments.

1) Removed an elog call in a critical section.

2) Changed the names of the commands to pg_compresslog and pg_decompresslog.

3) Changed the diff option used to make the patch.

--
Koichi Suzuki


pg_lesslog.tgz
Description: Binary data

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-29 Thread Pavan Deolasee

On 3/30/07, Simon Riggs [EMAIL PROTECTED] wrote:



Pavan, ISTM you have misunderstood Tom slightly.



Oh, yes. Now that I re-read Tom's comment, his plan invalidation
design and code, I understand things better.



Having the index invisible to all current transactions is acceptable.



Ok.



However, the other backends will not receive an invalidation event,
which means even when they start new transactions they will still not
see it, which is not acceptable.



Agree.


ISTM that the run-another-transaction-afterwards idea is the only one

that does everything I think we need. I really do wish we could put in a
wait, like CIC, but I just think it will break existing programs.



ISTM that the run-another-transaction-afterwards idea would have the same
problem with plan invalidation. When the second transaction commits,
a relcache invalidation event is generated. The event may get consumed
by other backends, but the index may still not be usable to them because
their xid < xcreate. If no more relcache invalidation events are generated
after that, the backends would continue to use the cached plan, even
if the index becomes usable to them later.

How about storing the snapshot which we used during planning in the
CachedPlanSource, if at least one index was seen as unusable because
its CREATE INDEX transaction was seen as in-progress? In
RevalidateCachedPlan(), we check if a snapshot is set in the
CachedPlanSource and check if we are now using a different snapshot.
This triggers plan invalidation and re-planning. This would also help
us use the index early in read-committed transactions, even if the
transaction started before CREATE INDEX committed.

Thanks,
Pavan

--
Pavan Deolasee

EnterpriseDB http://www.enterprisedb.com