[HACKERS] Windows Vista support (Buildfarm Vaquita)

2007-05-09 Thread Dave Page

Hi,

I asked about this before, but the thread got hijacked to discuss 
another buildfarm failure :-(. Currently our only Windows Vista 
buildfarm member (Vaquita) fails every time (assuming it gets that far) 
on ECPG's dt_test and update tests.


I've checked the FS permissions, and see no obvious reason why the tests 
would fail, and I've tried running the tests manually and see them fail 
as well.


Can someone suggest what I might try next to resolve this? I don't 
really have the spare time to spend figuring out ECPG at the moment.


http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=vaquita&dt=2007-05-07%2020:00:05

Regards, Dave.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Seq scans roadmap

2007-05-09 Thread Zeugswetter Andreas ADI SD

  Are you filling multiple buffers in the buffer cache with a single 
  read-call?
 
  yes, needs vector or ScatterGather IO.
 
 I would expect that to get only moderate improvement.

The vast improvement comes from 256k blocksize.

 To get 
 the full benefit I would think you would want to either fire 
 off a separate thread to do the read-ahead, use libaio, or 
 funnel the read-ahead requests to a separate thread like our 
 bgwriter only it would be a bgreader or something like that.

I like bgreader :-) But that looks even more difficult than grabbing 32
[scattered or contiguous] buffers at once.
Especially in a situation where there is no concurrent load it would be
nice to do CPU work while waiting for the next read-ahead IO. If there
is enough parallel CPU load it is actually not so important. So I'd
argue that on a high-load server you get nearly all of the benefit
without any sort of aio. 

  The OS should be doing readahead for us anyway, so I don't see how 
  just issuing multiple ReadBuffers one after each other helps.
 
  Last time I looked, OS readahead was only comparable to 32k blocked
  reads. 256k blocked reads still perform way better. Also, when the OS
  is confronted with an IO storm, the 256k reads perform way better than
  OS readahead.
 
 Well that's going to depend on the OS. Last I checked Linux's 
 readahead logic is pretty straightforward and doesn't try to 
 do any better than 32k readahead and is easily fooled. 
 However I wouldn't be surprised if that's changed. 

My test was on AIX, 32 or 64k seem quite common, at least as default
setting.
Also on some OS's (like HPUX) OS readahead and writebehind strategy
changes with large IO blocksizes, imho beneficially.

Andreas



Re: [HACKERS] Allow use of immutable functions operating on constants with constraint exclusion

2007-05-09 Thread ITAGAKI Takahiro

Marshall, Steve [EMAIL PROTECTED] wrote:

 the first query would be optimized using 
 constraint exclusion, while the second query would not:
 
 SELECT * FROM test_bulletins WHERE created_at > '2006-09-09 
 05:00:00+00'::timestamptz;
 SELECT * FROM test_bulletins WHERE created_at > '2006-09-09 
 05:00:00+00'::timestamptz + '0 days'::interval;

Hmmm... CE still seems unable to optimize complex expressions.
It worked once I added a wrapper function:

CREATE FUNCTION timeadd(timestamptz, interval) RETURNS timestamptz
AS $$ SELECT $1 + $2; $$ LANGUAGE sql IMMUTABLE;

SELECT * FROM test_bulletins WHERE created_at >
timeadd('2006-09-09 05:00:00+00', '0 days');


I noticed that we should be careful about CE with prepared statements
and functions. Seamless partitioning requires more work.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center





Re: [HACKERS] Windows Vista support (Buildfarm Vaquita)

2007-05-09 Thread Michael Meskes
On Wed, May 09, 2007 at 09:21:59AM +0100, Dave Page wrote:
 I asked about this before, but the thread got hijacked to discuss 
 another buildfarm failure :-(. Currently our only Windows Vista 
 buildfarm member (Vaquita) fails every time (assuming it gets that far) 
 on ECPG's dt_test and update tests.

Dave, could you please run

insert into date_test ( d  , ts  ) values (  date '1966-01-17' ,
timestamp '2000-07-12 17:34:29' ); 

on the Vista system and then select * from date_test;?

According to the logs the insert runs successfully but the select gives
an invalid date format. Interestingly the date argument is displayed
correctly but the timestamp argument throws the invalid date error,
which does not really make sense. 

Unfortunately I do not have access to a Vista system I could use to test
and track this one down.

As far as the other message is concerned I'm at a loss. It simply
refuses to run the sql/update script. No idea why.

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED]
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!



Re: [HACKERS] Seq scans roadmap

2007-05-09 Thread Simon Riggs
On Tue, 2007-05-08 at 11:40 +0100, Heikki Linnakangas wrote:
 Here's my roadmap for the scan-resistant buffer cache and 
 synchronized scans patches.
 
 1. Fix the current vacuum behavior of throwing dirty buffers to the 
 freelist, forcing a lot of WAL flushes. Instead, use a backend-private 
 ring of shared buffers that are recycled. This is what Simon's 
 scan-resistant buffer manager did.
 
 The theory here is that if a page is read in by vacuum, it's unlikely to 
 be accessed in the near future, therefore it should be recycled. If 
 vacuum doesn't dirty the page, it's best to reuse the buffer immediately 
 for the next page. However, if the buffer is dirty (and not just because 
 we set hint bits), we ought to delay writing it to disk until the 
 corresponding WAL record has been flushed to disk.
 
 Simon's patch used a fixed size ring of buffers that are recycled, but I 
 think the ring should be dynamically sized. Start with a small ring, and 
 whenever you need to do a WAL flush to write a dirty buffer, increase 
 the ring size. On every full iteration through the ring, decrease its 
 size to trim down an unnecessarily large ring.
 
 This only alters the behavior of vacuums, and it's pretty safe to say it 
 won't get worse than what we have now. 

I think that's too much code; why not just leave it as it is? Would a
dynamic buffer be substantially better? If not, why bother?

 In the future, we can use the 
 buffer ring for seqscans as well; more on that on step 3.

There was clear benefit for that. You sound like you are suggesting we
remove the behaviour for Seq Scans, which wouldn't make much sense?

 2. Implement the list/table of last/ongoing seq scan positions. This is 
 Jeff's synchronized scans patch. When a seq scan starts on a table 
 larger than some threshold, it starts from where the previous seq scan 
 is currently, or where it ended. This will synchronize the scans so that 
 for two concurrent scans the total I/O is halved in the best case. There 
 should be no other effect on performance.
 
 If you have a partitioned table, or union of multiple tables or any 
 other plan where multiple seq scans are performed in arbitrary order, 
 this change won't change the order the partitions are scanned and won't 
 therefore ensure they will be synchronized.
 
 
 Now that we have both pieces of the puzzle in place, it's time to 
 consider what more we can do with them:
 
 
 3A. To take advantage of the cache trail of a previous seq scan, scan 
 backwards from where the previous seq scan ended, until you hit a 
 buffer that's not in cache.
 
 This will allow taking advantage of the buffer cache even if the table 
 doesn't fit completely in RAM. That can make a big difference if the 
 table size is just slightly bigger than RAM, and can avoid the nasty 
 surprise when a table grows beyond RAM size and queries start taking 
 minutes instead of seconds.
 
 This should be a non-controversial change on its own from performance 
 point of view. No query should get slower, and some will become faster. 
 But see step 3B:
 
 3B. Currently, sequential scans on a large table spoil the buffer cache 
 by evicting other pages from the cache. In CVS HEAD, as soon as the 
 table is larger than shared_buffers, the pages in the buffer won't be 
 used to speed up running the same query again, and there's no reason to 
 believe the pages read in would be more useful than any other page in 
 the database, and in particular the pages that were in the buffer cache 
 before the huge seq scan. If the table being scanned is > 5 * 
 shared_buffers, the scan will evict every other page from the cache if 
 there's no other activity in the database (max usage_count is 5).
 
 If the table is much larger than shared_buffers, say 10 times as large, 
 even with the change 3A to read the pages that are in cache first, using 
 all shared_buffers to cache the table will only speed up the query by 
 10%. We should not spoil the cache for such a small gain; we should use 
 the local buffer ring strategy instead. It's better to make queries that 
 are slow anyway a little bit slower than to make queries that are 
 normally really fast slow.
 
 
 As you may notice, 3A and 3B are at odds with each other. We can 
 implement both, but you can't use both strategies in the same scan.

Not sure I've seen any evidence of that.

Most scans will be solo and so should use the ring buffer, since there
is clear evidence of that. If there were evidence to suggest the two
patches conflict then we should turn off the ring buffer only when
concurrent scans are in progress (while remembering that concurrent
scans will not typically overlap as much as the synch scan tests show
and so for much of their execution they too will be solo).

 Therefore we need to have decision logic of some kind to figure out 
 which strategy is optimal.
 
 A simple heuristic is to decide based on the table size:
 
  < 0.1*shared_buffers  - start from page 0, keep in cache (like 

Re: [HACKERS] Windows Vista support (Buildfarm Vaquita)

2007-05-09 Thread Dave Page

Michael Meskes wrote:

Dave, could you please run

insert into date_test ( d  , ts  ) values (  date '1966-01-17' ,
timestamp '2000-07-12 17:34:29' ); 


on the Vista system and then select * from date_test;?

According to the logs the insert runs successfully but the select gives
an invalid date format. Interestingly the date argument is displayed
correctly but the timestamp argument throws the invalid date error,
which does not really make sense. 


I had to create the table manually of course, so copying what the code 
seems to do, I get:


regress1=# create table date_test (d date, ts timestamp);
CREATE TABLE
regress1=# set datestyle to iso;
SET
regress1=# insert into date_test(d, ts) values (date '1966-01-17', 
timestamp '2000-07-12 17:34:29');

INSERT 0 1
regress1=# select * from date_test;
     d      |         ts
------------+---------------------
 1966-01-17 | 2000-07-12 17:34:29
(1 row)

Which looks OK to me :-(


Unfortunately I do not have access to a Vista system I could use to test
and track this one down.


I'm happy to run any tests you like.


As far as the other message is concerned I'm at a loss. It simply
refuses to run the sql/update script. No idea why.


Oh, hang on... Vista's new 'security' features include popups that ask 
permission from the user before running any installers. One of the more 
basic checks they use is the filename - *anything* called setup.exe will 
cause user confirmation to be required before it will run. I believe for 
non-interactive sessions it'll just refuse to run. I just tried running 
update.exe myself, and yes, you guessed it, a user confirmation dialog 
popped up :-(


Can we rename the test please?

Regards, Dave.



Re: [HACKERS] Windows Vista support (Buildfarm Vaquita)

2007-05-09 Thread Peter Eisentraut
On Wednesday, 9 May 2007 13:46, Dave Page wrote:
 Can we rename the test please?

I'm thinking no.  Brain-dead systems should produce brain-dead test results.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] Windows Vista support (Buildfarm Vaquita)

2007-05-09 Thread Dave Page

Peter Eisentraut wrote:

On Wednesday, 9 May 2007 13:46, Dave Page wrote:

Can we rename the test please?


I'm thinking no.  Brain-dead systems should produce brain-dead test results.



And that helps us how exactly, on what will probably be the most widely 
used OS in the world within a few years?


Regards, Dave.



Re: [HACKERS] Windows Vista support (Buildfarm Vaquita)

2007-05-09 Thread Andrew Dunstan



Peter Eisentraut wrote:

On Wednesday, 9 May 2007 13:46, Dave Page wrote:
  

Can we rename the test please?



I'm thinking no.  Brain-dead systems should produce brain-dead test results.

  


Not doing this would seem like sheer bloody-mindedness. We have 
workarounds for craziness on many systems. Not providing for this will 
mean that we have to disable the ECPG tests for Vista. I don't see how 
that helps anyone.


More seriously, we need to get the ECPG regression test rewritten in C, 
as was done for the main line regression tests. Maybe we need to factor 
out some of that into a library so the common code can be used in both 
programs. At any rate, until this is done we can't run the ECPG tests on 
MSVC builds.


cheers

andrew



PostgreSQL wants to install, cancel or allow? (was Re: [HACKERS] Windows Vista support (Buildfarm Vaquita)

2007-05-09 Thread Ned Lilly

On 5/9/2007 7:46 AM Dave Page wrote:

Oh, hang on... Vista's new 'security' features include popups that ask 
permission from the user before running any installers. One of the more 
basic checks they use is the filename - *anything* called setup.exe will 
cause user confirmation to be required before it will run. I believe for 
non-interactive sessions it'll just refuse to run. I just tried running 
update.exe myself, and yes, you guessed it, a user confirmation dialog 
popped up :-(


You can just disable that feature by turning off User Account Control under 
the Windows Security Center...



Re: PostgreSQL wants to install, cancel or allow? (was Re: [HACKERS] Windows Vista support (Buildfarm Vaquita)

2007-05-09 Thread Dave Page

Ned Lilly wrote:

On 5/9/2007 7:46 AM Dave Page wrote:

Oh, hang on... Vista's new 'security' features include popups that ask 
permission from the user before running any installers. One of the 
more basic checks they use is the filename - *anything* called 
setup.exe will cause user confirmation to be required before it will 
run. I believe for non-interactive sessions it'll just refuse to run. 
I just tried running update.exe myself, and yes, you guessed it, a 
user confirmation dialog popped up :-(


You can just disable that feature by turning off User Account 
Control under the Windows Security Center...


Yeah, I know, but that's not really a solution we can live with.

Regards, Dave.



Re: [HACKERS] Windows Vista support (Buildfarm Vaquita)

2007-05-09 Thread Magnus Hagander
On Wed, May 09, 2007 at 08:40:24AM -0400, Andrew Dunstan wrote:
 
 
 Peter Eisentraut wrote:
 On Wednesday, 9 May 2007 13:46, Dave Page wrote:
   
 Can we rename the test please?
 
 
 I'm thinking no.  Brain-dead systems should produce brain-dead test 
 results.
 
   
 
 Not doing this would seem like sheer bloody-mindedness. We have 
 workarounds for craziness on many systems. Not providing for this will 
 mean that we have to disable the ECPG tests for Vista. I don't see how 
 that helps anyone.

Agreed.

 More seriously, we need to get the ECPG regression test rewritten in C, 
 as was done for the main line regression tests. Maybe we need to factor 
 out some of that into a library so the common code can be used in both 
 programs. At any rate, until this is done we can't run the ECPG tests on 
 MSVC builds.

IIRC, Joachim has at least started work on that. (Unrelated to vista - ecpg
tests are broken on both mingw and msvc for vista)

//Magnus




[HACKERS] Where to hook my custom access control module?

2007-05-09 Thread Karel Gardas


Hello,

first of all, I tried asking on the pgsql-general mailing list as advised, 
but got no answer.


I'd like to look at the possibility of integrating OpenPMF 
(http://www.openpmf.org) with PostgreSQL. There is the possibility of a 
weak integration using the provided rule system and SQL, but 
that way we would lose the central management functionality of OpenPMF, 
especially its policy violation notifications, and we would need to 
synchronize rules with policies from time to time. So now I am thinking 
about how to directly integrate some kind of OpenPMF policy enforcement 
point into PostgreSQL. For this I would need an access control hook inside 
PostgreSQL which would be called on every action, so that my PEP would get 
a chance to deny access. Is there anything like that supported in 
PostgreSQL?


Thanks!
Karel
--
Karel Gardas  [EMAIL PROTECTED]
ObjectSecurity Ltd.   http://www.objectsecurity.com



Re: [HACKERS] Where to hook my custom access control module?

2007-05-09 Thread Andrew Dunstan
Karel Gardas wrote:

 Hello,

 first of all, I've tried to ask on pgsql-general mailing list as advised,
 but no answer.

 I'd like to look at a possibility of integrating OpenPMF
 (http://www.openpmf.org) with the PostgreSQL. There is a possibility to
 have a weak integration by using a provided rule system and SQL, but
 this way we would lose central management functionality of OpenPMF,
 especially its policy violation notifications and we'll need to
 synchronize rules with policies from time to time. So now I think about
 how to directly integrate some kind of OpenPMF policy enforcement point
 into the PostgreSQL. For this I would need some access control hook inside
 PostgreSQL which would be called on every action and my PEP would get a
 chance to deny some access. Is there anything like that supported in the
 PostgreSQL?



Exactly what is open about this product other than the name? It looks
closed and proprietary to me.

cheers

andrew





Re: [HACKERS] [COMMITTERS] pgsql: Teach tuplesort.c about top N sorting, in which only the first

2007-05-09 Thread Jim Nasby

On May 8, 2007, at 2:24 PM, Magnus Hagander wrote:
Speaking of which, it might be interesting to actually show these values
in the stats collector. I was thinking three cols for each database
(probably the best level?) that counts each of those three counters. If
you have a lot of sorts (percentage-wise) spilling to disk, it is often
something you want to investigate, so exposing it that way seems like a
good thing.


What 3 columns? In-memory sorts, on-disk sorts, and on-disk size? (Sum
of how much spilled to disk.)


I agree that per-database makes sense, though I'd settle for
per-cluster.

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)





Re: [HACKERS] [COMMITTERS] pgsql: Teach tuplesort.c about top N sorting, in which only the first

2007-05-09 Thread Magnus Hagander
On Wed, May 09, 2007 at 10:55:12AM -0500, Jim Nasby wrote:
 On May 8, 2007, at 2:24 PM, Magnus Hagander wrote:
 Speaking of which, it might be interesting to actually show these values
 in the stats collector. I was thinking three cols for each database
 (probably the best level?) that counts each of those three counters. If
 you have a lot of sorts (percentage-wise) spilling to disk, it is often
 something you want to investigate, so exposing it that way seems like a
 good thing.
 
 What 3 columns? In-memory sorts, on-disk sorts, and on-disk size? (Sum
 of how much spilled to disk.)

I was thinking in-mem sorts, on-disk sorts, limited-by-LIMIT sorts (that
would be the new feature..)

//Magnus



Re: [HACKERS] Re: [COMMITTERS] psqlodbc - psqlodbc: Put Autotools-generated files into subdirectory

2007-05-09 Thread Robert Treat
On Monday 07 May 2007 15:52, Joshua D. Drake wrote:
 Andrew Dunstan wrote:
  Hiroshi Inoue wrote:
  Maybe it's BSD which is different from the license of psqlodbc (LGPL).
  Is there no problem with their coexistence ?
  Or is it possible for psqlodbc to be LGPL entirely ?
 
  I am having difficulty in understanding what the problem is. My
  understanding is that using BSD licensed code is ok in an LGPL project,
  but (probably) not vice versa.

 To my knowledge you can do it either way, as long as you remember that
 any changes to the lgpl code have to be released.


It's generally a very bad idea for a BSD-licensed project to include
LGPL-licensed code, because people who try to use your work in their own
projects, under the assumption that it really is BSD licensed, get bitten
when they find out that they have now illegally included code that is
licensed under some other license.

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL



Re: [HACKERS] Re: [COMMITTERS] psqlodbc - psqlodbc: Put Autotools-generated files into subdirectory

2007-05-09 Thread Alvaro Herrera
Robert Treat wrote:
 On Monday 07 May 2007 15:52, Joshua D. Drake wrote:
  Andrew Dunstan wrote:
   Hiroshi Inoue wrote:
   Maybe it's BSD which is different from the license of psqlodbc (LGPL).
   Is there no problem with their coexistence ?
   Or is it possible for psqlodbc to be LGPL entirely ?
  
   I am having difficulty in understanding what the problem is. My
   understanding is that using BSD licensed code is ok in an LGPL project,
   but (probably) not vice versa.
 
  To my knowledge you can do it either way, as long as you remember that
  any changes to the lgpl code have to be released.
 
 It's generally a very bad idea for a BSD licensed project to include lgpl 
 licensed code because people who try to use your work in their own projects, 
 under the assumption that it really is bsd licensed, get bitten when they 
 find out that they have now illegally included code that is licensed via some 
 other license.

Of course, the developer who owns the LGPL-licensed copyright is free to
relicense his work under a different license, so if the ODBC developers
want to contribute code to Postgres they can give their work under the
Postgres license.  (They must obtain permission from all the involved
developers, obviously).

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] [COMMITTERS] pgsql: Teach tuplesort.c about top N sorting, in which only the first

2007-05-09 Thread Gregory Stark

Magnus Hagander [EMAIL PROTECTED] writes:

 What 3 columns? In-memory sorts, on-disk sorts, and on-disk size?  
 (Sum of how much spilled to disk).

 I was thinking in-mem sorts, on-disk sorts, limited-by-LIMIT sorts (that
 would be the new feature..)

Tom's code distinguished in-memory, top-N, on-disk with final merge postponed,
and on-disk with materialized result. Four categories. But I think the
distinction between the two types of in-memory and the two types of on-disk
sorts is only really useful when you're looking at an individual query. And
even then probably only useful to a Postgres hacker, not a DBA.

It seems like it would be more useful to just break it down into in-memory and
on-disk but for each give number of sorts, number of tuples, and space used.

What would be really handy is breaking this down by table -- probably that
would only be possible when the sort is sorting directly a table scan. I don't
even know how easy it would be to get that information.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [HACKERS] Managing the community information stream

2007-05-09 Thread Josh Berkus

Jim,


I am sympathetic to the issues you and Andrew are describing (I
understand Bruce's stream analogy, but I think Andrew is right that
from the user's point of view, it's not usable).  But I am not
convinced that users voting on desired features will get us the
users' desired features.  The features we get are mostly the features
that have attracted developers.  The method by which that attraction
happens is interesting, but I don't think it's democratic.


Further, our community has always operated by consensus and public 
mailing-list polls when applicable, not by majority-rules votes or 
anything similar.  The only advantage I can see to allowing voting on 
TODOs would be to quickly answer the question "does anyone at all care 
about this?", but I personally am not convinced that offering 
Bugzilla-style voting would help with that at all.  On other projects, my 
experience is that people don't use the BZ voting, even projects which 
otherwise use BZ extensively.


--Josh Berkus




[HACKERS] Problem with create database ... with template

2007-05-09 Thread Oleg Bartunov

Hi there,

I'm investigating a problem my client experienced with 
create database xxx with template=yyy - they often get the error
 
ERROR:  source database yyy is being accessed by other users
 
I modelled the situation in a clean environment with a very simple perl
script (FreeBSD 6.2-STABLE, postgresql-8.2.3 from ports) and was able to
reproduce the problem. There were no other connections (I checked the log).
I tried a shell script and got the same problem. Is this a known problem?


perl testcreatedb.pl
Repeat #:1
Repeat #:2
Repeat #:3
Repeat #:4
Repeat #:5
Repeat #:6
Repeat #:7
Repeat #:8
Repeat #:9
Repeat #:10
Repeat #:11
Repeat #:12
Repeat #:13
Repeat #:14
Repeat #:15
Repeat #:16
DBD::Pg::db do failed: ERROR:  source database yyy is being accessed by other 
users
DBD::Pg::db do failed: ERROR:  source database yyy is being accessed by other 
users



--- cut here ---
#!/usr/bin/perl

use strict;
use DBI;

my $dbh;
my $rv = undef;

foreach my $i ( 1..100 ) {

    $dbh = DBI->connect('dbi:Pg:database=yyy;host=msk2.socionet.ru;port=5432',
                        'db_user', '', { RaiseError => 1 })
        or die "Can't connect to database!\n";

    print STDERR "Repeat #:$i\n";

    $rv = $dbh->do('drop database xxx');
    $rv = $dbh->do('create database xxx with TEMPLATE=yyy');

    print STDERR "error creating database: $rv\n" if ( $rv ne '0E0' );

    $dbh->disconnect;
}
--- cut here -


Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [HACKERS] Problem with create database ... with template

2007-05-09 Thread Tom Lane
Oleg Bartunov [EMAIL PROTECTED] writes:
 I'm investigating a problem my client experienced with 
 create database xxx with template=yyy - they often get error
 ERROR:  source database yyy is being accessed by other users,

Your test case simply shows that it takes a finite amount of time for
the previous backend to exit; depending on scheduler priorities it's
possible for the next instance to connect and try to CREATE DATABASE
before the previous one is gone.  My suggestion would be to connect
to some other database besides the one being copied.

regards, tom lane



Re: [HACKERS] Problem with create database ... with template

2007-05-09 Thread Andrew Dunstan



Tom Lane wrote:

Your test case simply shows that it takes a finite amount of time for
the previous backend to exit; depending on scheduler priorities it's
possible for the next instance to connect and try to CREATE DATABASE
before the previous one is gone.  My suggestion would be to connect
to some other database besides the one being copied.


  


Didn't we invent the postgres database for just this reason?

cheers

andrew



Re: [HACKERS] [PERFORM] Cannot make GIN intarray index be used by the planner

2007-05-09 Thread Tom Lane
[cc'ing to pgsql-hackers since this is looking like a contrib/intarray bug]

Valentine Gogichashvili [EMAIL PROTECTED] writes:
 here is the DT

That works fine for me in 8.2:

regression=# explain SELECT id, (myintarray_int4)
  FROM myintarray_table_nonulls
 WHERE ARRAY[8] <@ myintarray_int4;
                                    QUERY PLAN
----------------------------------------------------------------------------------
 Index Scan using idx_nonnulls_myintarray_int4_gin on myintarray_table_nonulls
   (cost=0.00..8.27 rows=1 width=36)
   Index Cond: ('{8}'::integer[] <@ myintarray_int4)
(2 rows)

What I am betting is that you've installed contrib/intarray in this
database and that's bollixed things up somehow.  In particular, intarray
tries to take over the position of default gin opclass for int4[],
and the opclass that it installs as default has operators named just
like the built-in ones.  If somehow your query is using pg_catalog.<@
instead of intarray's public.<@, then the planner wouldn't think the
index is relevant.

In a quick test your example still works with intarray installed, because
what it's really created is public.<@ (integer[], integer[]) which is
an exact match and therefore takes precedence over the built-in
pg_catalog.<@ (anyarray, anyarray).  But if for example you don't have
public in your search_path then the wrong operator would be chosen.

Please look at the pg_index entry for your index, eg

select * from pg_index where indexrelid =
'versionA.idx_nonnulls_myintarray_int4_gin'::regclass;

and see whether the index opclass is the built-in one or not.

Note to hackers: we've already discussed that intarray shouldn't be
trying to take over the default gin opclass, but I am beginning to
wonder if it still has a reason to live at all.  We should at least
consider removing the redundant operators to avoid risks like this one.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Problem with create database ... with template

2007-05-09 Thread Oleg Bartunov

On Wed, 9 May 2007, Tom Lane wrote:


Oleg Bartunov [EMAIL PROTECTED] writes:

I'm investigating a problem my client experienced with
create database xxx with template=yyy - they often get error
ERROR:  source database yyy is being accessed by other users,


Your test case simply shows that it takes a finite amount of time for
the previous backend to exit; depending on scheduler priorities it's
possible for the next instance to connect and try to CREATE DATABASE
before the previous one is gone.  My suggestion would be to connect
to some other database besides the one being copied.


I tried that, connecting to template1, but it didn't help. The problem
occurs even if I repeat 'drop; create' in one connection (the same backend).

connect
while true do
 drop database
 create database with template
end
disconnect
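Since the error is just a timing race, one client-side workaround is to treat it as transient and retry. A minimal sketch (illustrative only; `op` would wrap the actual CREATE DATABASE call through whatever driver is in use, and the error-matching predicate is an assumption):

```python
import time

def retry(op, is_transient, attempts=5, delay=0.5):
    """Generic retry helper for the race described above: the old backend
    needs a finite time to exit, so 'source database is being accessed by
    other users' can be treated as transient and retried."""
    for i in range(attempts):
        try:
            return op()
        except Exception as e:
            if not is_transient(e) or i == attempts - 1:
                raise
            time.sleep(delay)  # give the previous backend time to exit

# Simulated flaky CREATE DATABASE that succeeds on the third attempt:
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("source database is being accessed by other users")
    return "ok"

assert retry(flaky, lambda e: "being accessed" in str(e), delay=0) == "ok"
assert calls["n"] == 3
```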


Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Problem with create database ... with template

2007-05-09 Thread Tom Lane
Oleg Bartunov [EMAIL PROTECTED] writes:
 I tried that, connecting to template1, but it didn't help. The problem
 occurs even if I repeat 'drop; create' in one connection (the same backend).

Hmm.  Do you have autovacuum turned on?  8.2 isn't real smart about
dealing with autovac running in the template DB ...

FWIW, I could repeat it here easily with the given test case (new
connections being made each time) but not with a single persistent
connection.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [COMMITTERS] pgsql: Teach tuplesort.c about top N sorting, in which only the first

2007-05-09 Thread Magnus Hagander
Gregory Stark wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
 
 What 3 columns? In-memory sorts, on-disk sorts, and on-disk size?  
 (Sum of how much spilled to disk).
 I was thinking in-mem sorts, on-disk sorts, limited-by-LIMIT sorts (that
 would be the new feature..)
 
 Tom's code distinguished in-memory, top-N, on-disk with final merge postponed,
 and on-disk with materialized result. Four categories. But I think the
 distinction between the two types of in-memory and the two types of on-disk
 sorts is only really useful when you're looking at an individual query. And
 even then probably only useful to a Postgres hacker, not a DBA.

Missed the two on-disk distinctions, yeah. But you're probably right
that on-disk vs in-memory is enough; the interesting thing is to get
an indication of when you hit disk, given what that does for performance.


 It seems like it would be more useful to just break it down into in-memory and
 on-disk but for each give number of sorts, number of tuples, and space used.
 
 What would be really handy is breaking this down by table -- probably that
 would only be possible when the sort is sorting directly a table scan. I don't
 even know how easy it would be to get that information.

And how would you deal with a sort over the result of a join
or something like that - makes things a lot more complicated ;)

And the original question remains, 8.3 or 8.4...

//Magnus

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Problem with create database ... with template

2007-05-09 Thread Oleg Bartunov

On Wed, 9 May 2007, Tom Lane wrote:


Oleg Bartunov [EMAIL PROTECTED] writes:

I tried that, connecting to template1, but it didn't help. The problem
occurs even if I repeat 'drop; create' in one connection (the same backend).


Hmm.  Do you have autovacuum turned on?  8.2 isn't real smart about
dealing with autovac running in the template DB ...


yes, it's turned on. With autovac off, the problem seems to have gone away.



FWIW, I could repeat it here easily with the given test case (new
connections being made each time) but not with a single persistent
connection.


If I play with a toy (empty) database I don't see the error message.
The real database is about 800MB, so it takes about 22 seconds to create
the database. It looks like there is a time lag between completion of the
'create database ...' command and the actual release of the template db.



Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Behavior of GENERATED columns per SQL2003

2007-05-09 Thread Tom Lane
After some more study of the SQL spec, the distinction between GENERATED
ALWAYS AS IDENTITY and GENERATED BY DEFAULT AS IDENTITY is not what
I thought it was.

* As far as I can find from the spec, there is *no* difference between
the two cases for INSERT commands.  The rule is that you ignore any
user-supplied data and use the default (ie, nextval()) unless OVERRIDING
SYSTEM VALUE is specified.  It is not an error to try to insert data
into an identity column, it's just ignored unless OVERRIDING SYSTEM
VALUE.

* The difference for UPDATE commands is that you can update a BY DEFAULT
identity column to anything you want, whereas for an ALWAYS identity
it's an error to update to anything but DEFAULT (which causes a fresh
nextval() to be assigned).  Both behaviors are different from a
generated column, which is updated whether you mention it or not.

This means that GENERATED BY DEFAULT AS IDENTITY is not at all
equivalent to our historical behavior for SERIAL columns and hence we
cannot merge the two cases.
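The rules as read above can be modeled in a few lines of Python (a toy illustration of the spec semantics only, not PostgreSQL code; `nextval()` is represented as a placeholder string):

```python
def insert_value(kind, user_value=None, overriding_system_value=False):
    """INSERT rule: 'kind' is deliberately unused -- per this reading of
    the spec there is *no* difference between ALWAYS and BY DEFAULT on
    INSERT. User-supplied data is ignored in favor of nextval() unless
    OVERRIDING SYSTEM VALUE is specified."""
    if overriding_system_value and user_value is not None:
        return user_value
    return "nextval()"

def update_value(kind, new_value):
    """UPDATE rule: BY DEFAULT accepts any value; ALWAYS only accepts
    DEFAULT (which assigns a fresh nextval())."""
    if new_value == "DEFAULT":
        return "nextval()"
    if kind == "ALWAYS":
        raise ValueError("ALWAYS identity column may only be updated to DEFAULT")
    return new_value

assert insert_value("ALWAYS", 42) == "nextval()"      # user data ignored
assert insert_value("BY DEFAULT", 42) == "nextval()"  # same behavior on INSERT
assert insert_value("ALWAYS", 42, overriding_system_value=True) == 42
assert update_value("BY DEFAULT", 7) == 7
assert update_value("ALWAYS", "DEFAULT") == "nextval()"
```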

The lack of any behavioral difference for INSERT seems surprising
and counterintuitive; have I just missed something in the spec?

BTW, I found what they did about the problem that generated columns
are out of sync with their underlying columns during BEFORE-trigger
execution: in 11.39

12)If BEFORE is specified, then: 
   ...
   c) The triggered action shall not contain a field
  reference that references a field in the new transition
  variable corresponding to a generated column of T.

IOW they just pretend you can't look.  So I think we need not worry
about leaving the values out-of-date until after the triggers fire.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [COMMITTERS] pgsql: Teach tuplesort.c about top N sorting, in which only the first

2007-05-09 Thread Heikki Linnakangas

Magnus Hagander wrote:

Gregory Stark wrote:

Magnus Hagander [EMAIL PROTECTED] writes:

What 3 columns? In-memory sorts, on-disk sorts, and on-disk size?  
(Sum of how much spilled to disk).

I was thinking in-mem sorts, on-disk sorts, limited-by-LIMIT sorts (that
would be the new feature..)

Tom's code distinguished in-memory, top-N, on-disk with final merge postponed,
and on-disk with materialized result. Four categories. But I think the
distinction between the two types of in-memory and the two types of on-disk
sorts is only really useful when you're looking at an individual query. And
even then probably only useful to a Postgres hacker, not a DBA.


Missed the two on-disk distinctions, yeah. But you're probably right
that on-disk vs in-memory is enough, the interesting thing is to get
indications on when you hit disk given what it does for performance.


Keep in mind that when the sort goes to disk, it actually just means 
that it used up work_mem and switches to merge sort with tapes. In a 
typical configuration, there's plenty of RAM available to buffer the 
tapes, so the terms on-disk and in-memory sorts are misleading. If I've 
understood earlier discussion correctly, the quicksort-to-tape-sort
switchover point is not even that interesting, because the tape sort is
actually not that much slower than quicksort, as long as it fits in RAM.
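For reference, the top-N technique the commit adds can be sketched with a bounded heap (illustrative Python only, not the tuplesort.c implementation):

```python
import heapq

def top_n_smallest(stream, n):
    """Bounded-memory top-N in the spirit of the tuplesort change under
    discussion: keep only the N best values in a heap instead of sorting
    the whole input, so ORDER BY ... LIMIT N never needs to spill."""
    heap = []  # max-heap of the N smallest values seen so far (negated)
    for x in stream:
        if len(heap) < n:
            heapq.heappush(heap, -x)
        elif -heap[0] > x:
            # x beats the current worst of the top N; replace it
            heapq.heapreplace(heap, -x)
    return sorted(-v for v in heap)

# Equivalent to sorted(data)[:3], but holds only 3 values at a time:
assert top_n_smallest(iter([5, 1, 9, 3, 7, 2]), 3) == [1, 2, 3]
```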


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [COMMITTERS] pgsql: Teach tuplesort.c about top N sorting, in which only the first

2007-05-09 Thread Jim Nasby

On May 9, 2007, at 11:01 AM, Magnus Hagander wrote:

On Wed, May 09, 2007 at 10:55:12AM -0500, Jim Nasby wrote:

On May 8, 2007, at 2:24 PM, Magnus Hagander wrote:

Speaking of which, it might be interesting to actually show these values
in the stats collector. I was thinking three cols for each database
(probably the best level?) that counts each of those three counters. If
you have a lot of sorts (percentage-wise) spilling to disk, it is often
something you want to investigate, so exposing it that way seems like a
good thing.


What 3 columns? In-memory sorts, on-disk sorts, and on-disk size?
(Sum of how much spilled to disk).


I was thinking in-mem sorts, on-disk sorts, limited-by-LIMIT sorts (that
would be the new feature..)


It would also be useful to know how much got spilled to disk. If it's  
a large amount per sort, then there's probably not much you could do,  
but if it's just a tad over available memory per-sort...

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Re: [COMMITTERS] psqlodbc - psqlodbc: Put Autotools-generated files into subdirectory

2007-05-09 Thread Hiroshi Inoue

Robert Treat wrote:

On Monday 07 May 2007 15:52, Joshua D. Drake wrote:

Andrew Dunstan wrote:

Hiroshi Inoue wrote:

Maybe it's BSD which is different from the license of psqlodbc (LGPL).
Is there no problem with their coexistence ?
Or is it possible for psqlodbc to be LGPL entirely ?

I am having difficulty in understanding what the problem is. My
understanding is that using BSD licensed code is ok in an LGPL project,
but (probably) not vice versa.

To my knowledge you can do it either way, as long as you remember that
any changes to the lgpl code have to be released.



It's generally a very bad idea for a BSD licensed project to include LGPL
licensed code, because people who try to use your work in their own projects,
under the assumption that it really is BSD licensed, get bitten when they
find out that they have illegally included code that is licensed under some
other license.


The psqlodbc package is LGPL licensed and seems to have little problem
including a copy of BSD licensed code as part of it. That's what I
understand now from other developers' explanations, though I'm still a
little afraid I may be misunderstanding.

Thanks.

Hiroshi Inoue




---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Re: [COMMITTERS] psqlodbc - psqlodbc: Put Autotools-generated files into subdirectory

2007-05-09 Thread Hiroshi Inoue

Alvaro Herrera wrote:

Robert Treat wrote:

On Monday 07 May 2007 15:52, Joshua D. Drake wrote:

Andrew Dunstan wrote:

Hiroshi Inoue wrote:

Maybe it's BSD which is different from the license of psqlodbc (LGPL).
Is there no problem with their coexistence ?
Or is it possible for psqlodbc to be LGPL entirely ?

I am having difficulty in understanding what the problem is. My
understanding is that using BSD licensed code is ok in an LGPL project,
but (probably) not vice versa.

To my knowledge you can do it either way, as long as you remember that
any changes to the lgpl code have to be released.
It's generally a very bad idea for a BSD licensed project to include LGPL
licensed code because people who try to use your work in their own projects,
under the assumption that it really is BSD licensed, get bitten when they
find out that they have illegally included code that is licensed under some
other license.


Of course, the developer who owns the LGPL-licensed copyright is free to
relicense his work under a different license, so if the ODBC developers
want to contribute code to Postgres they can give their work under the
Postgres license.  (They must obtain permission from all the involved
developers, obviously).


There are no original developers left in the project now, and I don't know
where or how they are. I personally am not so eager to change the
license to BSD because it has been LGPL for so long. Alternatively, I thought
we could implement the BSD licensed autoconf macros ourselves, but I'm
not sure how that could be considered *not derived*.

Thanks.

regards,
Hiroshi Inoue




---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] Problem with CREATE LANGUAGE in CVS TIP

2007-05-09 Thread David Fetter
Folks,

While trying to test the arrays of complex types, I used a
stripped-down configure.  Too stripped down, it turns out.

CFLAGS=-O0 ./configure --prefix=/var/lib/pgsql/tip --with-pgport=2225 
--enable-cassert --enable-debug --enable-depend --enable-integer-datetimes 

I set $PGDATA to my test directory, then did

initdb -D $PGDATA -E utf8
createdb array_test
createlang plperlu array_test 

then spent some time trying to figure out why I got the following
error when I tried to create a PL/PerlU function:

psql:schema.sql:4259: ERROR: cache lookup failed for function 1000

It turns out that PL/PerlU wasn't available, even though createlang
had succeeded.  I think this is a bug in CREATE LANGUAGE, as CREATE
LANGUAGE should have failed when it found no PL/Perl compiled in.

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!
Consider donating to PostgreSQL: http://www.postgresql.org/about/donate

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Seq scans roadmap

2007-05-09 Thread CK Tan

Hi,

In reference to the seq scans roadmap, I have just submitted a patch  
that addresses some of the concerns.


The patch does this:

1. for small relations (smaller than 60% of the buffer pool), use the
current logic

2. for big relations:
- use a ring buffer in the heap scan
- pin the first 12 pages when the scan starts
- on consumption of every 4 pages, read and pin the next 4 pages
- invalidate used pages in the scan so they do not force out
other useful pages


4 files changed:
bufmgr.c, bufmgr.h, heapam.c, relscan.h
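The scan strategy above can be sketched roughly as follows (a toy Python model only, not the bufmgr.c changes; page numbers stand in for buffers and "pin" is just list membership):

```python
def ring_scan(npages, prefetch=12, batch=4):
    """Toy model of the big-relation scan described above: pin the first
    12 pages up front, then after every 4 pages consumed, read and pin
    the next 4 and drop the used ones so they cannot evict hotter pages."""
    pinned = list(range(min(prefetch, npages)))  # initially pinned pages
    next_page = len(pinned)
    consumed, order, max_pinned = 0, [], len(pinned)
    while pinned:
        order.append(pinned.pop(0))              # consume (and unpin) a page
        consumed += 1
        if consumed % batch == 0 and next_page < npages:
            new = list(range(next_page, min(next_page + batch, npages)))
            pinned.extend(new)                   # read-ahead: pin next batch
            next_page += len(new)
        max_pinned = max(max_pinned, len(pinned))
    return order, max_pinned

order, max_pinned = ring_scan(100)
assert order == list(range(100))  # every page visited exactly once, in order
assert max_pinned <= 12           # footprint stays bounded by the ring size
```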

If there is interest, I can submit another scan patch that returns
N tuples at a time, instead of the current one-at-a-time interface. This
improves code locality and further improves performance by another
10-20%.
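That batched interface would look roughly like this (an illustrative Python generator only, not the actual C interface; the batch size is an assumption):

```python
def scan_batched(tuples, batch_size=64):
    """Sketch of the N-tuples-at-a-time interface mentioned above: instead
    of one call per row, the scan hands back a batch, amortizing per-call
    overhead and improving code locality."""
    buf = []
    for t in tuples:
        buf.append(t)
        if len(buf) == batch_size:
            yield buf
            buf = []
    if buf:
        yield buf  # final partial batch

batches = list(scan_batched(range(10), batch_size=4))
assert batches == [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```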


For TPCH 1G tables, we are seeing more than 20% improvement in scans  
on the same hardware.


 
---------------------------------------------------------
- PATCHED VERSION
---------------------------------------------------------

gptest=# select count(*) from lineitem;
  count
---------
 6001215
(1 row)

Time: 2117.025 ms

---------------------------------------------------------
- ORIGINAL CVS HEAD VERSION
---------------------------------------------------------

gptest=# select count(*) from lineitem;
  count
---------
 6001215
(1 row)

Time: 2722.441 ms


Suggestions for improvement are welcome.

Regards,
-cktan
Greenplum, Inc.

On May 8, 2007, at 5:57 AM, Heikki Linnakangas wrote:


Luke Lonergan wrote:
What do you mean with using readahead inside the heapscan?
Starting an async read request?

Nope - just reading N buffers ahead for seqscans. Subsequent calls use
previously read pages. The objective is to issue contiguous reads to
the OS in sizes greater than the PG page size (which is much smaller
than what is needed for fast sequential I/O).


Are you filling multiple buffers in the buffer cache with a single
read-call? The OS should be doing readahead for us anyway, so I
don't see how just issuing multiple ReadBuffers one after each
other helps.


Yes, I think the ring buffer strategy should be used when the table size
is > 1 x bufcache and the ring buffer should be of a fixed size smaller
than L2 cache (32KB - 128KB seems to work well).


I think we want to let the ring grow larger than that for updating
transactions and vacuums, though, to avoid the WAL flush problem.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com






---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Re: [COMMITTERS] psqlodbc - psqlodbc: Put Autotools-generated files into subdirectory

2007-05-09 Thread Tom Lane
Hiroshi Inoue [EMAIL PROTECTED] writes:
 Robert Treat wrote:
 It's generally a very bad idea for a BSD licensed project to include lgpl 
 licensed code

 Psqlodbc package is LGPL licensed and seems to have little problem to
 include copy of BSD licensed code as a part of it.

Right, that direction is fine, it's the other way around that's
problematic (because adding some BSD code adds no new restrictions on
what users can do with an overall-LGPL project, whereas having some LGPL
components in a supposedly BSD project does limit what they can do with
it).  I don't see any reason why you shouldn't include those PG autoconf
macros in psqlodbc.  You just need to document that they have a BSD
license, in case someone wants to use them by themselves.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Problem with CREATE LANGUAGE in CVS TIP

2007-05-09 Thread Tom Lane
David Fetter [EMAIL PROTECTED] writes:
 It turns out that PL/PerlU wasn't available, even though createlang
 had succeeded.  I think this is a bug in CREATE LANGUAGE, as CREATE
 LANGUAGE should have failed when it found no PL/Perl compiled in.

Works for me:

$ createlang plperlu array_test
createlang: language installation failed: ERROR:  could not access file $libdir/plperl: No such file or directory

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq