[HACKERS] Windows Vista support (Buildfarm Vaquita)
Hi, I asked about this before, but the thread got hijacked to discuss another buildfarm failure :-(. Currently our only Windows Vista buildfarm member (Vaquita) fails every time (assuming it gets that far) on ECPG's dt_test and update tests. I've checked the FS permissions and see no obvious reason why the tests would fail, and I've tried running the tests manually and seen them fail as well. Can someone suggest what I might try next to resolve this? I don't really have the spare time to spend figuring out ECPG at the moment. http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=vaquita&dt=2007-05-07%2020:00:05 Regards, Dave. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Seq scans roadmap
Are you filling multiple buffers in the buffer cache with a single read-call? Yes, that needs vector or scatter/gather IO. I would expect that to get only moderate improvement; the vast improvement comes from the 256k blocksize. To get the full benefit I would think you would want to either fire off a separate thread to do the read-ahead, use libaio, or funnel the read-ahead requests to a separate thread like our bgwriter, only it would be a bgreader or something like that. I like bgreader :-) But that looks even more difficult than grabbing 32 [scattered or contiguous] buffers at once. Especially in a situation where there is no concurrent load it would be nice to do CPU work while waiting for the next read-ahead IO. If there is enough parallel CPU load it is actually not so important. So I'd argue that on a high-load server you get nearly all of the benefit without any sort of aio. The OS should be doing readahead for us anyway, so I don't see how just issuing multiple ReadBuffers one after each other helps. Last time I looked, OS readahead was only comparable to 32k blocked reads; 256k blocked reads still perform way better. Also, when the OS is confronted with an IO storm, the 256k reads perform way better than OS readahead. Well, that's going to depend on the OS. Last I checked, Linux's readahead logic is pretty straightforward, doesn't try to do any better than 32k readahead, and is easily fooled. However I wouldn't be surprised if that's changed. My test was on AIX; 32 or 64k seem quite common, at least as default settings. Also on some OS's (like HP-UX) the OS readahead and writebehind strategy changes with large IO blocksizes, imho beneficially. Andreas
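The blocksize numbers being debated above come down to amortizing per-request overhead: reading the same data in 256k requests issues 32 times fewer calls than 8k requests. A small self-contained Python illustration of that arithmetic (the file size and block sizes here are arbitrary, and counting read() calls is only a rough proxy for real per-I/O cost):

```python
import os
import tempfile

def count_reads(path, blocksize):
    """Read the whole file in blocksize chunks; return the number of
    read calls issued (a proxy for per-request overhead)."""
    calls = 0
    with open(path, "rb") as f:
        while f.read(blocksize):
            calls += 1
    return calls

# Create a 1 MB scratch file to "scan".
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * (1 << 20))
    path = tmp.name

reads_8k = count_reads(path, 8 * 1024)      # PostgreSQL's default page size
reads_256k = count_reads(path, 256 * 1024)  # the blocksize Andreas measured
os.remove(path)

assert reads_8k == 128 and reads_256k == 4  # 32x fewer requests
```

Whether fewer, larger requests actually beat OS readahead depends on the kernel, as the thread notes; the sketch only shows the request-count side of the argument.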
Re: [HACKERS] Allow use of immutable functions operating on constants with constraint exclusion
Marshall, Steve [EMAIL PROTECTED] wrote: the first query would be optimized using constraint exclusion, while the second query would not: SELECT * FROM test_bulletins WHERE created_at > '2006-09-09 05:00:00+00'::timestamptz; SELECT * FROM test_bulletins WHERE created_at > '2006-09-09 05:00:00+00'::timestamptz + '0 days'::interval; Hmmm... CE seems to be still not enough to optimize complex expressions. If I added the wrapper function, it worked. CREATE FUNCTION timeadd(timestamptz, interval) RETURNS timestamptz AS $$ SELECT $1 + $2; $$ LANGUAGE sql IMMUTABLE; SELECT * FROM test_bulletins WHERE created_at > timeadd('2006-09-09 05:00:00+00', '0 days'); I noticed that we should be careful about CE with prepared statements and functions. Seamless partitioning requires more work. Regards, --- ITAGAKI Takahiro NTT Open Source Software Center
Re: [HACKERS] Windows Vista support (Buildfarm Vaquita)
On Wed, May 09, 2007 at 09:21:59AM +0100, Dave Page wrote: I asked about this before, but the thread got hijacked to discuss another buildfarm failure :-(. Currently our only Windows Vista buildfarm member (Vaquita) fails every time (assuming it gets that far) on ECPG's dt_test and update tests. Dave, could you please run insert into date_test ( d , ts ) values ( date '1966-01-17' , timestamp '2000-07-12 17:34:29' ); on the Vista system and then select * from date_test;? According to the logs the insert runs successfully but the select gives an invalid date format. Interestingly the date argument is displayed correctly but the timestamp argument throws the invalid date error, which does not really make sense. Unfortunately I do not have access to a Vista system I could use to test and track this one down. As far as the other message is concerned I'm at a loss. It simply refuses to run the sql/update script. No idea why. Michael -- Michael Meskes Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED] Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!
Re: [HACKERS] Seq scans roadmap
On Tue, 2007-05-08 at 11:40 +0100, Heikki Linnakangas wrote: Here's my roadmap for the scan-resistant buffer cache and synchronized scans patches.

1. Fix the current vacuum behavior of throwing dirty buffers to the freelist, forcing a lot of WAL flushes. Instead, use a backend-private ring of shared buffers that are recycled. This is what Simon's scan-resistant buffer manager did. The theory here is that if a page is read in by vacuum, it's unlikely to be accessed in the near future, therefore it should be recycled. If vacuum doesn't dirty the page, it's best to reuse the buffer immediately for the next page. However, if the buffer is dirty (and not just because we set hint bits), we ought to delay writing it to disk until the corresponding WAL record has been flushed to disk. Simon's patch used a fixed-size ring of buffers that are recycled, but I think the ring should be dynamically sized. Start with a small ring, and whenever you need to do a WAL flush to write a dirty buffer, increase the ring size. On every full iteration through the ring, decrease its size to trim down an unnecessarily large ring. This only alters the behavior of vacuums, and it's pretty safe to say it won't get worse than what we have now.

I think that's too much code; why not just leave it as it is? Would a dynamic buffer be substantially better? If not, why bother?

In the future, we can use the buffer ring for seqscans as well; more on that in step 3.

There was clear benefit for that. You sound like you are suggesting removing the behaviour for seq scans, which wouldn't make much sense?

2. Implement the list/table of last/ongoing seq scan positions. This is Jeff's synchronized scans patch. When a seq scan starts on a table larger than some threshold, it starts from where the previous seq scan is currently, or where it ended. This will synchronize the scans so that for two concurrent scans the total I/O is halved in the best case. There should be no other effect on performance.
If you have a partitioned table, or a union of multiple tables, or any other plan where multiple seq scans are performed in arbitrary order, this change won't change the order the partitions are scanned in, and won't therefore ensure they will be synchronized.

Now that we have both pieces of the puzzle in place, it's time to consider what more we can do with them:

3A. To take advantage of the cache trail of a previous seq scan, scan backwards from where the previous seq scan ended, until you hit a buffer that's not in cache. This will allow taking advantage of the buffer cache even if the table doesn't fit completely in RAM. That can make a big difference if the table size is just slightly bigger than RAM, and can avoid the nasty surprise when a table grows beyond RAM size and queries start taking minutes instead of seconds. This should be a non-controversial change on its own from a performance point of view. No query should get slower, and some will become faster. But see step 3B:

3B. Currently, sequential scans on a large table spoil the buffer cache by evicting other pages from the cache. In CVS HEAD, as soon as the table is larger than shared_buffers, the pages in the buffer won't be used to speed up running the same query again, and there's no reason to believe the pages read in would be more useful than any other page in the database, and in particular the pages that were in the buffer cache before the huge seq scan. If the table being scanned is 5 * shared_buffers, the scan will evict every other page from the cache if there's no other activity in the database (max usage_count is 5). If the table is much larger than shared_buffers, say 10 times as large, even with change 3A to read the pages that are in cache first, using all of shared_buffers to cache the table will only speed up the query by 10%. We should not spoil the cache for such a small gain, and should use the local buffer ring strategy instead.
It's better to make queries that are slow anyway a little bit slower than to make queries that are normally really fast, slow. As you may notice, 3A and 3B are at odds with each other. We can implement both, but you can't use both strategies in the same scan.

Not sure I've seen any evidence of that. Most scans will be solo and so should use the ring buffer, since there is clear evidence for that. If there were evidence to suggest the two patches conflict, then we should turn off the ring buffer only when concurrent scans are in progress (while remembering that concurrent scans will not typically overlap as much as the synch scan tests show, and so for much of their execution they too will be solo).

Therefore we need to have decision logic of some kind to figure out which strategy is optimal. A simple heuristic is to decide based on the table size: 0.1*shared_buffers - start from page 0, keep in cache (like
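Item 1 of the roadmap above — the dynamically sized vacuum buffer ring — can be sketched in a few lines. This is an illustrative Python simulation of the sizing heuristic only (grow the ring when recycling a buffer forces a WAL flush, shrink after every full pass), with hypothetical names and sizes; it is not PostgreSQL code:

```python
class VacuumRing:
    """Toy model of a dynamically sized buffer ring: grow when
    recycling a dirty buffer forces a WAL flush, shrink on each
    completed pass so an oversized ring trims itself back down."""

    def __init__(self, min_size=8, max_size=256):
        self.size = min_size
        self.min_size = min_size
        self.max_size = max_size
        self.pos = 0  # position within the current pass

    def next_buffer(self, forced_wal_flush):
        # Growing on a forced WAL flush spaces dirty buffers further
        # apart, giving the WAL time to be flushed before the buffer
        # comes around for reuse again.
        if forced_wal_flush:
            self.size = min(self.size * 2, self.max_size)
        self.pos += 1
        if self.pos >= self.size:  # completed a full pass through the ring
            self.pos = 0
            self.size = max(self.size // 2, self.min_size)
        return self.size

ring = VacuumRing()
for _ in range(3):
    ring.next_buffer(forced_wal_flush=True)  # ring doubles: 16, 32, 64
assert ring.size == 64
```

The doubling/halving constants are arbitrary; the point is only that the ring converges to a size just large enough that WAL flushes stop being forced, which is the behavior described above.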
Re: [HACKERS] Windows Vista support (Buildfarm Vaquita)
Michael Meskes wrote: Dave, could you please run insert into date_test ( d , ts ) values ( date '1966-01-17' , timestamp '2000-07-12 17:34:29' ); on the Vista system and then select * from date_test;? According to the logs the insert runs successfully but the select gives an invalid date format. Interestingly the date argument is displayed correctly but the timestamp argument throws the invalid date error, which does not really make sense.

I had to create the table manually of course, so copying what the code seems to do, I get:

regress1=# create table date_test (d date, ts timestamp);
CREATE TABLE
regress1=# set datestyle to iso;
SET
regress1=# insert into date_test(d, ts) values (date '1966-01-17', timestamp '2000-07-12 17:34:29');
INSERT 0 1
regress1=# select * from date_test;
     d      |         ts
------------+---------------------
 1966-01-17 | 2000-07-12 17:34:29
(1 row)

Which looks OK to me :-(

Unfortunately I do not have access to a Vista system I could use to test and track this one down.

I'm happy to run any tests you like.

As far as the other message is concerned I'm at a loss. It simply refuses to run the sql/update script. No idea why.

Oh, hang on... Vista's new 'security' features include popups that ask permission from the user before running any installers. One of the more basic checks they use is the filename - *anything* called setup.exe will cause user confirmation to be required before it will run. I believe for non-interactive sessions it'll just refuse to run. I just tried running update.exe myself, and yes, you guessed it, a user confirmation dialog popped up :-( Can we rename the test please? Regards, Dave.
Re: [HACKERS] Windows Vista support (Buildfarm Vaquita)
On Wednesday, 9 May 2007 13:46, Dave Page wrote: Can we rename the test please? I'm thinking no. Brain-dead systems should produce brain-dead test results. -- Peter Eisentraut http://developer.postgresql.org/~petere/
Re: [HACKERS] Windows Vista support (Buildfarm Vaquita)
Peter Eisentraut wrote: On Wednesday, 9 May 2007 13:46, Dave Page wrote: Can we rename the test please? I'm thinking no. Brain-dead systems should produce brain-dead test results. And that helps us how exactly, on what will probably be the most widely used OS in the world within a few years? Regards, Dave.
Re: [HACKERS] Windows Vista support (Buildfarm Vaquita)
Peter Eisentraut wrote: On Wednesday, 9 May 2007 13:46, Dave Page wrote: Can we rename the test please? I'm thinking no. Brain-dead systems should produce brain-dead test results. Not doing this would seem like sheer bloody-mindedness. We have workarounds for craziness on many systems. Not providing for this will mean that we have to disable the ECPG tests for Vista. I don't see how that helps anyone. More seriously, we need to get the ECPG regression test rewritten in C, as was done for the mainline regression tests. Maybe we need to factor out some of that into a library so the common code can be used in both programs. At any rate, until this is done we can't run the ECPG tests on MSVC builds. cheers andrew
PostgreSQL wants to install, cancel or allow? (was Re: [HACKERS] Windows Vista support (Buildfarm Vaquita))
On 5/9/2007 7:46 AM Dave Page wrote: Oh, hang on... Vista's new 'security' features include popups that ask permission from the user before running any installers. One of the more basic checks they use is the filename - *anything* called setup.exe will cause user confirmation to be required before it will run. I believe for non-interactive sessions it'll just refuse to run. I just tried running update.exe myself, and yes, you guessed it, a user confirmation dialog popped up :-( You can just disable that feature by turning off User Account Control under the Windows Security Center...
Re: PostgreSQL wants to install, cancel or allow? (was Re: [HACKERS] Windows Vista support (Buildfarm Vaquita))
Ned Lilly wrote: On 5/9/2007 7:46 AM Dave Page wrote: Oh, hang on... Vista's new 'security' features include popups that ask permission from the user before running any installers. One of the more basic checks they use is the filename - *anything* called setup.exe will cause user confirmation to be required before it will run. I believe for non-interactive sessions it'll just refuse to run. I just tried running update.exe myself, and yes, you guessed it, a user confirmation dialog popped up :-( You can just disable that feature by turning off User Account Control under the Windows Security Center... Yeah, I know, but that's not really a solution we can live with. Regards, Dave.
Re: [HACKERS] Windows Vista support (Buildfarm Vaquita)
On Wed, May 09, 2007 at 08:40:24AM -0400, Andrew Dunstan wrote: Peter Eisentraut wrote: Am Mittwoch, 9. Mai 2007 13:46 schrieb Dave Page: Can we rename the test please? I'm thinking no. Brain-dead systems should produce brain-dead test results. Not doing this would seem like sheer bloody-mindedness. We have workarounds for craziness on many systems. Not providing for this will mean that we have to disable the ECPG tests for Vista. I don't see how that helps anyone. Agreed. More seriously, we need to get the ECPG regression test rewritten in C, as was done for the main line regression tests. Maybe we need to factor out some of that into a library so the common code can be used in both programs. At any rate, until this is done we can't run the ECPG tests on MSVC builds. IIRC, Joachim has at least started work on that. (Unrelated to vista - ecpg tests are broken on both mingw and msvc for vista) //Magnus
[HACKERS] Where to hook my custom access control module?
Hello, first of all, I've tried to ask on pgsql-general mailing list as advised, but no answer. I'd like to look at a possibility of integrating OpenPMF (http://www.openpmf.org) with the PostgreSQL. There is a possibility to have a weak integration by using a provided rule system and SQL, but this way we would lose central management functionality of OpenPMF, especially its policy violation notifications and we'll need to synchronize rules with policies from time to time. So now I think about how to directly integrate some kind of OpenPMF policy enforcement point into the PostgreSQL. For this I would need some access control hook inside PostgreSQL which would be called on every action and my PEP would get a chance to deny some access. Is there anything like that supported in the PostgreSQL? Thanks! Karel -- Karel Gardas [EMAIL PROTECTED] ObjectSecurity Ltd. http://www.objectsecurity.com
Re: [HACKERS] Where to hook my custom access control module?
Karel Gardas wrote: Hello, first of all, I've tried to ask on pgsql-general mailing list as advised, but no answer. I'd like to look at a possibility of integrating OpenPMF (http://www.openpmf.org) with the PostgreSQL. There is a possibility to have a weak integration by using a provided rule system and SQL, but this way we would lose central management functionality of OpenPMF, especially its policy violation notifications and we'll need to synchronize rules with policies from time to time. So now I think about how to directly integrate some kind of OpenPMF policy enforcement point into the PostgreSQL. For this I would need some access control hook inside PostgreSQL which would be called on every action and my PEP would get a chance to deny some access. Is there anything like that supported in the PostgreSQL? Exactly what is open about this product other than the name? It looks closed and proprietary to me. cheers andrew
Re: [HACKERS] [COMMITTERS] pgsql: Teach tuplesort.c about top N sorting, in which only the first
On May 8, 2007, at 2:24 PM, Magnus Hagander wrote: Speaking of which, it might be interesting to actually show these values in the stats collector. I was thinking three cols for each database (probably the best level?) that counts each of those three counters. If you have a lot of sorts (percentage-wise) spilling to disk, it is often something you want to investigate, so exposing it that way seems like a good thing. What 3 columns? In-memory sorts, on-disk sorts, and on-disk size? (Sum of how much spilled to disk). I agree that per-database makes sense, though I'd settle for per-cluster. -- Jim Nasby [EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Re: [HACKERS] [COMMITTERS] pgsql: Teach tuplesort.c about top N sorting, in which only the first
On Wed, May 09, 2007 at 10:55:12AM -0500, Jim Nasby wrote: On May 8, 2007, at 2:24 PM, Magnus Hagander wrote: Speaking of which, it might be interesting to actually show these values in the stats collector. I was thinking three cols for each database (probably the best level?) that counts each of those three counters. If you have a lot of sorts (percentage-wise) spilling to disk, it is often something you want to investigate, so exposing it that way seems like a good thing. What 3 columns? In-memory sorts, on-disk sorts, and on-disk size? (Sum of how much spilled to disk). I was thinking in-mem sorts, on-disk sorts, limited-by-LIMIT sorts (that would be the new feature..) //Magnus
Re: [HACKERS] Re: [COMMITTERS] psqlodbc - psqlodbc: Put Autotools-generated files into subdirectory
On Monday 07 May 2007 15:52, Joshua D. Drake wrote: Andrew Dunstan wrote: Hiroshi Inoue wrote: Maybe it's BSD which is different from the license of psqlodbc (LGPL). Is there no problem with their coexistence? Or is it possible for psqlodbc to be LGPL entirely? I am having difficulty in understanding what the problem is. My understanding is that using BSD licensed code is ok in an LGPL project, but (probably) not vice versa. To my knowledge you can do it either way, as long as you remember that any changes to the lgpl code have to be released. It's generally a very bad idea for a BSD licensed project to include lgpl licensed code because people who try and use your work in their own projects, under the assumption that it really is bsd licensed, get bitten when they find out that they have now illegally included code that is licensed via some other license. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
Re: [HACKERS] Re: [COMMITTERS] psqlodbc - psqlodbc: Put Autotools-generated files into subdirectory
Robert Treat wrote: On Monday 07 May 2007 15:52, Joshua D. Drake wrote: Andrew Dunstan wrote: Hiroshi Inoue wrote: Maybe it's BSD which is different from the license of psqlodbc (LGPL). Is there no problem with their coexistence? Or is it possible for psqlodbc to be LGPL entirely? I am having difficulty in understanding what the problem is. My understanding is that using BSD licensed code is ok in an LGPL project, but (probably) not vice versa. To my knowledge you can do it either way, as long as you remember that any changes to the lgpl code have to be released. It's generally a very bad idea for a BSD licensed project to include lgpl licensed code because people who try and use your work in their own projects, under the assumption that it really is bsd licensed, get bitten when they find out that they have now illegally included code that is licensed via some other license. Of course, the developer who owns the LGPL-licensed copyright is free to relicense his work under a different license, so if the ODBC developers want to contribute code to Postgres they can give their work under the Postgres license. (They must obtain permission from all the involved developers, obviously.) -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] [COMMITTERS] pgsql: Teach tuplesort.c about top N sorting, in which only the first
Magnus Hagander [EMAIL PROTECTED] writes: What 3 columns? In-memory sorts, on-disk sorts, and on-disk size? (Sum of how much spilled to disk). I was thinking in-mem sorts, on-disk sorts, limited-by-LIMIT sorts (that would be the new feature..) Tom's code distinguished in-memory, top-N, on-disk with final merge postponed, and on-disk with materialized result. Four categories. But I think the distinction between the two types of in-memory and the two types of on-disk sorts is only really useful when you're looking at an individual query. And even then probably only useful to a Postgres hacker, not a DBA. It seems like it would be more useful to just break it down into in-memory and on-disk but for each give number of sorts, number of tuples, and space used. What would be really handy is breaking this down by table -- probably that would only be possible when the sort is sorting directly a table scan. I don't even know how easy it would be to get that information. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Managing the community information stream
Jim, I am sympathetic to the issues you and Andrew are describing (I understand Bruce's stream analogy, but I think Andrew is right that from the user's point of view, it's not usable). But I am not convinced that users voting on desired features will get us the users' desired features. The features we get are mostly the features that have attracted developers. The method by which that attraction happens is interesting, but I don't think it's democratic. Further, our community has always operated by consensus and public mailing list poll when applicable, and not by majority-rules vote or anything similar. The only advantage I can see to allowing voting on TODOs would be to quickly answer the question does anyone at all care about this, but I personally am not convinced that offering Bugzilla-style voting would help that at all. On other projects, my experience is that people don't use the BZ voting, even projects which otherwise use BZ extensively. --Josh Berkus
[HACKERS] Problem with create database ... with template
Hi there, I'm investigating a problem my client experienced with create database xxx with template=yyy - they often get the error ERROR: source database yyy is being accessed by other users. I modelled the situation in a clean environment with a very simple perl script (FreeBSD 6.2-STABLE, postgresql-8.2.3 from ports) and was able to reproduce the problem. There were no other connections (I checked the log). I tried a shell script and got the same problem. Is this a known problem?

perl testcreatedb.pl
Repeat #:1 Repeat #:2 Repeat #:3 Repeat #:4 Repeat #:5 Repeat #:6 Repeat #:7 Repeat #:8 Repeat #:9 Repeat #:10 Repeat #:11 Repeat #:12 Repeat #:13 Repeat #:14 Repeat #:15 Repeat #:16
DBD::Pg::db do failed: ERROR: source database yyy is being accessed by other users
DBD::Pg::db do failed: ERROR: source database yyy is being accessed by other users

--- cut here ---
#!/usr/bin/perl
use strict;
use DBI;

my $dbh;
my $rv = undef;

foreach my $i ( 1..100 ) {
    $dbh = DBI->connect('dbi:Pg:database=yyy;host=msk2.socionet.ru;port=5432', 'db_user', '', { RaiseError => 1 })
        or die "Can't connect to database!\n";
    print STDERR "Repeat #:$i\n";
    $rv = $dbh->do('drop database xxx');
    $rv = $dbh->do('create database xxx with TEMPLATE=yyy');
    print STDERR "error creating database: $rv\n" if ( $rv ne '0E0' );
    $dbh->disconnect;
}
--- cut here ---

Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
Re: [HACKERS] Problem with create database ... with template
Oleg Bartunov [EMAIL PROTECTED] writes: I'm investigating a problem my client experienced with create database xxx with template=yyy - they often get error ERROR: source database yyy is being accessed by other users, Your test case simply shows that it takes a finite amount of time for the previous backend to exit; depending on scheduler priorities it's possible for the next instance to connect and try to CREATE DATABASE before the previous one is gone. My suggestion would be to connect to some other database besides the one being copied. regards, tom lane
Re: [HACKERS] Problem with create database ... with template
Tom Lane wrote: Your test case simply shows that it takes a finite amount of time for the previous backend to exit; depending on scheduler priorities it's possible for the next instance to connect and try to CREATE DATABASE before the previous one is gone. My suggestion would be to connect to some other database besides the one being copied. Didn't we invent the postgres database for just this reason? cheers andrew
Re: [HACKERS] [PERFORM] Cannot make GIN intarray index be used by the planner
[cc'ing to pgsql-hackers since this is looking like a contrib/intarray bug] Valentine Gogichashvili [EMAIL PROTECTED] writes: here is the DT

That works fine for me in 8.2:

regression=# explain SELECT id, (myintarray_int4) FROM myintarray_table_nonulls WHERE ARRAY[8] @ myintarray_int4;
                                            QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Index Scan using idx_nonnulls_myintarray_int4_gin on myintarray_table_nonulls (cost=0.00..8.27 rows=1 width=36)
   Index Cond: ('{8}'::integer[] @ myintarray_int4)
(2 rows)

What I am betting is that you've installed contrib/intarray in this database and that's bollixed things up somehow. In particular, intarray tries to take over the position of default gin opclass for int4[], and the opclass that it installs as default has operators named just like the built-in ones. If somehow your query is using pg_catalog.@ instead of intarray's public.@, then the planner wouldn't think the index is relevant. In a quick test your example still works with intarray installed, because what it's really created is public.@ (integer[], integer[]) which is an exact match and therefore takes precedence over the built-in pg_catalog.@ (anyarray, anyarray). But if for example you don't have public in your search_path then the wrong operator would be chosen. Please look at the pg_index entry for your index, e.g.

select * from pg_index where indexrelid = 'versionA.idx_nonnulls_myintarray_int4_gin'::regclass;

and see whether the index opclass is the built-in one or not. Note to hackers: we've already discussed that intarray shouldn't be trying to take over the default gin opclass, but I am beginning to wonder if it still has a reason to live at all. We should at least consider removing the redundant operators to avoid risks like this one. regards, tom lane
Re: [HACKERS] Problem with create database ... with template
On Wed, 9 May 2007, Tom Lane wrote: Oleg Bartunov [EMAIL PROTECTED] writes: I'm investigating a problem my client experienced with create database xxx with template=yyy - they often get error ERROR: source database yyy is being accessed by other users, Your test case simply shows that it takes a finite amount of time for the previous backend to exit; depending on scheduler priorities it's possible for the next instance to connect and try to CREATE DATABASE before the previous one is gone. My suggestion would be to connect to some other database besides the one being copied.

I tried that, connecting to template1, but it didn't help. The problem occurs even if I repeat 'drop;create' in one connection (the same backend):

connect
while true do
    drop database
    create database with template
end
disconnect

Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
Re: [HACKERS] Problem with create database ... with template
Oleg Bartunov [EMAIL PROTECTED] writes: I tried that connecting to template1, but it doesn't helped. The problem occurs even if I repeat 'drop;create' in one connection (the same backend). Hmm. Do you have autovacuum turned on? 8.2 isn't real smart about dealing with autovac running in the template DB ... FWIW, I could repeat it here easily with the given test case (new connections being made each time) but not with a single persistent connection. regards, tom lane
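Since Tom's diagnosis is that the error is transient — the source database is only "busy" until the previous backend or an autovacuum worker exits — a pragmatic client-side workaround is to retry the CREATE DATABASE for a bounded number of attempts. A minimal Python sketch of that retry logic; the create_database stub below is hypothetical and merely stands in for the real $dbh->do('create database ...') call in the Perl script:

```python
import time

def retry_transient(action, is_transient, attempts=5, delay=0.01):
    """Run action(); if it raises an error that is_transient() classifies
    as retryable (e.g. 'source database ... is being accessed by other
    users'), wait and try again, up to `attempts` times."""
    for attempt in range(attempts):
        try:
            return action()
        except Exception as exc:
            if attempt == attempts - 1 or not is_transient(exc):
                raise
            time.sleep(delay)

# Hypothetical stub: fails twice with the transient error, then succeeds,
# mimicking the window in which the previous backend has not yet exited.
calls = {"n": 0}
def create_database():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("source database yyy is being accessed by other users")
    return "CREATE DATABASE"

result = retry_transient(create_database,
                         lambda e: "is being accessed" in str(e))
assert result == "CREATE DATABASE" and calls["n"] == 3
```

This only papers over the race, of course; the real fixes discussed in the thread (connecting to a different database, disabling autovacuum on the template) address the cause.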
Re: [HACKERS] [COMMITTERS] pgsql: Teach tuplesort.c about top N sorting, in which only the first
Gregory Stark wrote: Magnus Hagander [EMAIL PROTECTED] writes: What 3 columns? In-memory sorts, on-disk sorts, and on-disk size? (Sum of how much spilled to disk.) I was thinking in-mem sorts, on-disk sorts, limited-by-LIMIT sorts (that would be the new feature..)

Tom's code distinguished in-memory, top-N, on-disk with final merge postponed, and on-disk with materialized result. Four categories. But I think the distinction between the two types of in-memory and the two types of on-disk sorts is only really useful when you're looking at an individual query. And even then probably only useful to a Postgres hacker, not a DBA.

Missed the two on-disk distinctions, yeah. But you're probably right that on-disk vs in-memory is enough; the interesting thing is to get an indication of when you hit disk, given what that does for performance.

It seems like it would be more useful to just break it down into in-memory and on-disk, but for each give the number of sorts, number of tuples, and space used. What would be really handy is breaking this down by table -- probably that would only be possible when the sort is sorting a table scan directly.

I don't even know how easy it would be to get that information. And how would you deal with data that's sorting the result of a join or something like that - makes things a lot more complicated ;)

And the original question remains, 8.3 or 8.4...

//Magnus
Re: [HACKERS] Problem with create database ... with template
On Wed, 9 May 2007, Tom Lane wrote:

Oleg Bartunov [EMAIL PROTECTED] writes: I tried connecting to template1, but it didn't help. The problem occurs even if I repeat 'drop;create' in one connection (the same backend).

Hmm. Do you have autovacuum turned on? 8.2 isn't real smart about dealing with autovac running in the template DB ...

Yes, it's turned on. With autovac off the problem seems to go away.

FWIW, I could repeat it here easily with the given test case (new connections being made each time) but not with a single persistent connection.

If I play with a toy (empty) database I don't see the error message. The real database is about 800 MB, so it takes about 22 seconds to create a database from it. It looks like there is a time lag between completion of the 'create database ...' command and the actual release of the template DB.

Regards, Oleg

Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
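[Editor's note: the race described above — the template database still counted as "being accessed" for a short window after the previous backend or an autovacuum worker exits — is usually handled on the client side with a bounded retry loop. A minimal sketch in Python; `run_sql` is a hypothetical callable standing in for whatever the client uses to issue SQL, and the error-message matching mirrors the ERROR text quoted in this thread.]

```python
import time

def create_db_with_retry(run_sql, dbname, template, attempts=10, delay=1.0):
    """Retry CREATE DATABASE while the template DB is still held by
    another backend (e.g. an autovacuum worker that has not yet exited).
    Returns the number of attempts it took; raises if all attempts fail."""
    for i in range(attempts):
        try:
            run_sql(f'CREATE DATABASE "{dbname}" WITH TEMPLATE "{template}"')
            return i + 1
        except RuntimeError as e:
            # Only retry on the specific "template busy" error; re-raise others.
            if "is being accessed by other users" not in str(e):
                raise
            time.sleep(delay)
    raise RuntimeError(f"template {template} still busy after {attempts} attempts")
```

This is only a sketch of the workaround, not a fix for the underlying lag Oleg observed; with autovacuum on, the window can recur on each iteration.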
Re: [HACKERS] Behavior of GENERATED columns per SQL2003
After some more study of the SQL spec, the distinction between GENERATED ALWAYS AS IDENTITY and GENERATED BY DEFAULT AS IDENTITY is not what I thought it was.

* As far as I can find from the spec, there is *no* difference between the two cases for INSERT commands. The rule is that you ignore any user-supplied data and use the default (ie, nextval()) unless OVERRIDING SYSTEM VALUE is specified. It is not an error to try to insert data into an identity column; it's just ignored unless OVERRIDING SYSTEM VALUE.

* The difference for UPDATE commands is that you can update a BY DEFAULT identity column to anything you want, whereas for an ALWAYS identity it's an error to update to anything but DEFAULT (which causes a fresh nextval() to be assigned). Both behaviors are different from a generated column, which is updated whether you mention it or not.

This means that GENERATED BY DEFAULT AS IDENTITY is not at all equivalent to our historical behavior for SERIAL columns, and hence we cannot merge the two cases. The lack of any behavioral difference for INSERT seems surprising and counterintuitive; have I just missed something in the spec?

BTW, I found what they did about the problem that generated columns are out of sync with their underlying columns during BEFORE-trigger execution; in 11.39:

    12) If BEFORE is specified, then:
    ...
    c) The triggered action shall not contain a field reference that
       references a field in the new transition variable corresponding
       to a generated column of T.

IOW they just pretend you can't look. So I think we need not worry about leaving the values out-of-date until after the triggers fire.

regards, tom lane
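[Editor's note: Tom's reading of the spec can be stated in executable form. The following is a toy Python model of the two identity variants — not PostgreSQL code; `DEFAULT` is a stand-in for the SQL keyword, and the class name is invented for illustration.]

```python
import itertools

DEFAULT = object()  # stands in for the SQL DEFAULT keyword

class IdentityColumn:
    """Toy model of SQL:2003 identity-column rules as read above."""
    def __init__(self, always: bool):
        self.always = always           # True: GENERATED ALWAYS; False: BY DEFAULT
        self._seq = itertools.count(1)

    def nextval(self):
        return next(self._seq)

    def on_insert(self, supplied=DEFAULT, overriding_system_value=False):
        # On INSERT, user-supplied data is ignored (not an error!) unless
        # OVERRIDING SYSTEM VALUE is specified -- for BOTH variants.
        if overriding_system_value and supplied is not DEFAULT:
            return supplied
        return self.nextval()

    def on_update(self, supplied):
        # On UPDATE, BY DEFAULT accepts any value; ALWAYS accepts only
        # DEFAULT, which draws a fresh sequence value.
        if supplied is DEFAULT:
            return self.nextval()
        if self.always:
            raise ValueError("cannot update GENERATED ALWAYS column to a non-DEFAULT value")
        return supplied
```

Per the first point above, `on_insert` deliberately behaves identically for both variants; only `on_update` differs.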
Re: [HACKERS] [COMMITTERS] pgsql: Teach tuplesort.c about top N sorting, in which only the first
Magnus Hagander wrote: Gregory Stark wrote: Magnus Hagander [EMAIL PROTECTED] writes: What 3 columns? In-memory sorts, on-disk sorts, and on-disk size? (Sum of how much spilled to disk.) I was thinking in-mem sorts, on-disk sorts, limited-by-LIMIT sorts (that would be the new feature..)

Tom's code distinguished in-memory, top-N, on-disk with final merge postponed, and on-disk with materialized result. Four categories. But I think the distinction between the two types of in-memory and the two types of on-disk sorts is only really useful when you're looking at an individual query. And even then probably only useful to a Postgres hacker, not a DBA.

Missed the two on-disk distinctions, yeah. But you're probably right that on-disk vs in-memory is enough; the interesting thing is to get an indication of when you hit disk, given what that does for performance.

Keep in mind that when a sort goes to disk, it actually just means that it used up work_mem and switched to a merge sort with tapes. In a typical configuration there's plenty of RAM available to buffer the tapes, so the terms "on-disk" and "in-memory" sorts are misleading. If I've understood earlier discussion correctly, the quicksort-to-tape-sort transition point is not even that interesting, because the tape sort is actually not that much slower than quicksort, as long as it fits in RAM.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
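[Editor's note: for readers unfamiliar with the feature under discussion, the "top N sorting" that tuplesort.c was just taught is the bounded-heap trick behind ORDER BY ... LIMIT N: instead of sorting the whole input, keep only the N best rows seen so far. A minimal illustrative sketch in Python — tuplesort.c's actual implementation differs.]

```python
import heapq

def top_n(values, n):
    """ORDER BY v LIMIT n without sorting the whole input: maintain a
    bounded max-heap of the n smallest values seen so far."""
    heap = []  # stores negated values, so heap[0] is the largest value kept
    for v in values:
        if len(heap) < n:
            heapq.heappush(heap, -v)
        elif -heap[0] > v:
            # v beats the current worst of the top n; swap it in
            heapq.heapreplace(heap, -v)
    return sorted(-x for x in heap)
```

The payoff is O(total * log n) time and O(n) memory, which is why a limited-by-LIMIT sort can stay in memory when a full sort of the same input would spill to disk — the distinction the proposed stats columns would surface.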
Re: [HACKERS] [COMMITTERS] pgsql: Teach tuplesort.c about top N sorting, in which only the first
On May 9, 2007, at 11:01 AM, Magnus Hagander wrote: On Wed, May 09, 2007 at 10:55:12AM -0500, Jim Nasby wrote: On May 8, 2007, at 2:24 PM, Magnus Hagander wrote: Speaking of which, it might be interesting to actually show these values in the stats collector. I was thinking three cols for each database (probably the best level?) that count each of those three counters. If you have a lot of sorts (percentage-wise) spilling to disk, it is often something you want to investigate, so exposing it that way seems like a good thing. What 3 columns? In-memory sorts, on-disk sorts, and on-disk size? (Sum of how much spilled to disk.) I was thinking in-mem sorts, on-disk sorts, limited-by-LIMIT sorts (that would be the new feature..)

It would also be useful to know how much got spilled to disk. If it's a large amount per sort, then there's probably not much you could do, but if it's just a tad over available memory per sort...

--
Jim Nasby [EMAIL PROTECTED]
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Re: [HACKERS] Re: [COMMITTERS] psqlodbc - psqlodbc: Put Autotools-generated files into subdirectory
Robert Treat wrote: On Monday 07 May 2007 15:52, Joshua D. Drake wrote: Andrew Dunstan wrote: Hiroshi Inoue wrote: Maybe it's BSD which is different from the license of psqlodbc (LGPL). Is there no problem with their coexistence? Or is it possible for psqlodbc to be LGPL entirely?

I am having difficulty understanding what the problem is. My understanding is that using BSD licensed code is OK in an LGPL project, but (probably) not vice versa.

To my knowledge you can do it either way, as long as you remember that any changes to the LGPL code have to be released. It's generally a very bad idea for a BSD licensed project to include LGPL licensed code, because people who try to use your work in their own projects, under the assumption that it really is BSD licensed, get bitten when they find out that they have illegally included code that is licensed under some other license.

The psqlodbc package is LGPL licensed and seems to have little problem including a copy of BSD licensed code as part of it. That's what I understand now from what other developers have explained. I'm still a little afraid I've misunderstood.

Thanks.
Hiroshi Inoue
Re: [HACKERS] Re: [COMMITTERS] psqlodbc - psqlodbc: Put Autotools-generated files into subdirectory
Alvaro Herrera wrote: Robert Treat wrote: On Monday 07 May 2007 15:52, Joshua D. Drake wrote: Andrew Dunstan wrote: Hiroshi Inoue wrote: Maybe it's BSD which is different from the license of psqlodbc (LGPL). Is there no problem with their coexistence? Or is it possible for psqlodbc to be LGPL entirely?

I am having difficulty understanding what the problem is. My understanding is that using BSD licensed code is OK in an LGPL project, but (probably) not vice versa.

To my knowledge you can do it either way, as long as you remember that any changes to the LGPL code have to be released. It's generally a very bad idea for a BSD licensed project to include LGPL licensed code, because people who try to use your work in their own projects, under the assumption that it really is BSD licensed, get bitten when they find out that they have illegally included code that is licensed under some other license.

Of course, the developer who owns the LGPL-licensed copyright is free to relicense his work under a different license, so if the ODBC developers want to contribute code to Postgres they can give their work under the Postgres license. (They must obtain permission from all the involved developers, obviously.)

None of the original developers are in the project now, and I don't know where or how they are. I personally am not so eager to change the license to BSD because it has been LGPL for so long. Alternatively, I thought we could implement the BSD licensed autoconf macros ourselves, but I'm not sure how that could be considered *not derived*.

Thanks.

regards, Hiroshi Inoue
[HACKERS] Problem with CREATE LANGUAGE in CVS TIP
Folks,

While trying to test the arrays of complex types, I used a stripped-down configure. Too stripped down, it turns out.

    CFLAGS=-O0 ./configure --prefix=/var/lib/pgsql/tip --with-pgport=2225 --enable-cassert --enable-debug --enable-depend --enable-integer-datetimes

I set $PGDATA to my test directory, then did

    initdb -D $PGDATA -E utf8
    createdb array_test
    createlang plperlu array_test

then spent some time trying to figure out why I got the following error when I tried to create a PL/PerlU function:

    psql:schema.sql:4259: ERROR: cache lookup failed for function 1000

It turns out that PL/PerlU wasn't available, even though createlang had succeeded. I think this is a bug in CREATE LANGUAGE, as CREATE LANGUAGE should have failed when it found no PL/Perl compiled in.

Cheers, D

-- David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778  AIM: dfetter666  Skype: davidfetter

Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
Re: [HACKERS] Seq scans roadmap
Hi,

In reference to the seq scans roadmap, I have just submitted a patch that addresses some of the concerns. The patch does this:

1. for small relations (smaller than 60% of the buffer pool), use the current logic
2. for big relations:
   - use a ring buffer in the heap scan
   - pin the first 12 pages when the scan starts
   - on consumption of every 4 pages, read and pin the next 4 pages
   - invalidate used pages in the scan so they do not force out other useful pages

4 files changed: bufmgr.c, bufmgr.h, heapam.c, relscan.h

If there is interest, I can submit another scan patch that returns N tuples at a time, instead of the current one-at-a-time interface. This improves code locality and further improves performance by another 10-20%.

For TPCH 1G tables, we are seeing more than 20% improvement in scans on the same hardware.

    -- PATCHED VERSION --
    gptest=# select count(*) from lineitem;
      count
    ---------
     6001215
    (1 row)

    Time: 2117.025 ms

    -- ORIGINAL CVS HEAD VERSION --
    gptest=# select count(*) from lineitem;
      count
    ---------
     6001215
    (1 row)

    Time: 2722.441 ms

Suggestions for improvement are welcome.

Regards, -cktan
Greenplum, Inc.

On May 8, 2007, at 5:57 AM, Heikki Linnakangas wrote: Luke Lonergan wrote: What do you mean with using readahead inside the heapscan? Starting an async read request? Nope - just reading N buffers ahead for seqscans. Subsequent calls use previously read pages. The objective is to issue contiguous reads to the OS in sizes greater than the PG page size (which is much smaller than what is needed for fast sequential I/O).

Are you filling multiple buffers in the buffer cache with a single read-call? The OS should be doing readahead for us anyway, so I don't see how just issuing multiple ReadBuffers one after each other helps.

Yes, I think the ring buffer strategy should be used when the table size is > 1 x bufcache, and the ring buffer should be of a fixed size smaller than the L2 cache (32KB - 128KB seems to work well).

I think we want to let the ring grow larger than that for updating transactions and vacuums, though, to avoid the WAL flush problem.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
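[Editor's note: the ring-buffer strategy discussed in this thread can be sketched in a few lines. This is a toy Python model with invented names — the real logic lives in bufmgr.c — showing the core idea: a large scan cycles through a small, fixed set of buffer slots instead of flooding the whole buffer cache, so at most `nslots` pages of cache are ever occupied by the scan.]

```python
class ScanRing:
    """Toy model of a seq-scan ring buffer: a fixed set of buffer slots
    reused round-robin, so a big scan cannot evict the rest of the cache."""
    def __init__(self, nslots=32):
        self.slots = [None] * nslots   # slot -> page number currently held
        self.next = 0                  # next slot to reuse

    def read_page(self, pageno, evicted_log):
        """Read pageno into the next ring slot, recording which page
        (if any) falls out of the ring. Returns the slot index used."""
        slot = self.next
        if self.slots[slot] is not None:
            evicted_log.append(self.slots[slot])  # page leaves the ring
        self.slots[slot] = pageno
        self.next = (slot + 1) % len(self.slots)
        return slot
```

The fixed ring size is what makes a small, L2-cache-sized working set possible, per Heikki's sizing note above; letting the ring grow for updating transactions and vacuums (to avoid the WAL flush problem) would just mean enlarging `slots` for those callers.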
Re: [HACKERS] Re: [COMMITTERS] psqlodbc - psqlodbc: Put Autotools-generated files into subdirectory
Hiroshi Inoue [EMAIL PROTECTED] writes: Robert Treat wrote: It's generally a very bad idea for a BSD licensed project to include LGPL licensed code. The psqlodbc package is LGPL licensed and seems to have little problem including a copy of BSD licensed code as part of it.

Right, that direction is fine; it's the other way around that's problematic (because adding some BSD code adds no new restrictions on what users can do with an overall-LGPL project, whereas having some LGPL components in a supposedly BSD project does limit what they can do with it).

I don't see any reason why you shouldn't include those PG autoconf macros in psqlodbc. You just need to document that they have a BSD license, in case someone wants to use them by themselves.

regards, tom lane
Re: [HACKERS] Problem with CREATE LANGUAGE in CVS TIP
David Fetter [EMAIL PROTECTED] writes: It turns out that PL/PerlU wasn't available, even though createlang had succeeded. I think this is a bug in CREATE LANGUAGE, as CREATE LANGUAGE should have failed when it found no PL/Perl compiled in.

Works for me:

    $ createlang plperlu array_test
    createlang: language installation failed: ERROR: could not access file $libdir/plperl: No such file or directory

regards, tom lane