Re: [HACKERS] Not ready for 8.3
Jim C. Nasby wrote: On Wed, May 16, 2007 at 07:48:10PM +0200, Magnus Hagander wrote: Dave Page wrote: I the current URLs represent the month, and the ID of the message as it comes out of the mbox I believe. We could probably write a script to dump a list of message IDs, directories and mbox positions I imagine, and then import that into a new database. Yeah, if the files still resemble real emails then we can probably come up with a way to pull the data in. We have all the mbox files, so we can import them from there as raw messages. yeah, that's clearly the best source to work from. It's *possible* work from the mhonarc files (I've done it before), but it's more work. We'd want the old URLs to be redirected too, so at some point we'll have to deal with mhonarc. Right. Grabbing the msgid alone from them shouldn't be too hard though. It's included in the meta-headers mhonarc sticks in each file, so it should be a simple regex to find it. //Magnus ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Lack of urgency in 8.3 reviewing
Marc G. Fournier wrote: --On Wednesday, May 16, 2007 20:09:44 -0400 Bruce Momjian [EMAIL PROTECTED] wrote: I think one of the things that is preventing urgency is that everyone knows we have large patches unapplied, so they know that their lack of activity is not holding up the release. Any way around that? Set a fixed date (ie. 3 weeks) and whatever isn't in gets punted to 8.4 ... if that means those 'large patches' don't get applied, so be it ... Meaning we lose a bunch of potentially very cool features, and seriously hack off the developers who put significant time and effort into them, in some cases producing numerous updates based on ongoing discussion and feedback over a number of months. And then in 8.4 we have the same problem... I think we just have to accept that we're gonna have a long feature freeze period, and ask people to help review whatever they can. Regards, Dave. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Not ready for 8.3
On Wed, May 16, 2007 at 11:50:26PM -0400, Greg Smith wrote: On Tue, 15 May 2007, Jim C. Nasby wrote: Speaking of reviewers... should we put some thought into how we can increase the number of people who can review code? It seems that's one of our biggest bottlenecks... Having recently dragged myself from never seeing the code before to being able to provide an early review to help the committers out, I can tell you a few things that would have sped up that process and therefore improve the odds more people will navigate the trail. My main difficulty was figuring out a workable way to build a local mirror of the code (so I could get off-line diffs), keep it in sync with the development tree, while branching out working areas to evaluate patches or do new development in. A good example walkthrough of how to do that using standard tools would be a big help. Hint: if you suggest CVsup I'll smack you. Overall, though, I don't think there would have been any substitute for what I learned by putting together my own small patches, so in some respects thinking about how to lower the bar for doing that would also ultimately expand the review pool. The main issues I had as a newcomer to the codebase was getting data in/out of the new code I added that was buried inside the system. Here are some examples of what I kept hoping to find documentation on: -Examples and usage guidelines for eLog and eReport (the stuff in the FAQ needs some more meat) -How to add a GUC variable. Once I've got that, now I can adjust the code in real-time just by updating it. -How to add to the statistics for some part of the system that track a new variable and follow how that moves through the collector process. If you put people into a position where they easily can poke data in at one end, see how it rattles around inside the engine by logging, and get some statistics on the result, now you've got someone who can build their own simple patches without being so frustrated. Would you be interested in providing this meat? You're uniquely qualified because your shins still smart from all the things you barked them on :) Cheers, D -- David Fetter [EMAIL PROTECTED] http://fetter.org/ phone: +1 415 235 3778AIM: dfetter666 Skype: davidfetter Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Not ready for 8.3
Magnus Hagander wrote: Right. Grabbing the msgid alone from them shouldn't be too hard though. It's included in the meta-headers mhonarc sticks in each file, so it should be a simple regex to find it. Should be easier than that - when we import the existing messages from the mbox files we should be able to figure out the current URL for each message based on the year/month/list of the mbox file, and the message number within the mbox. /D ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Lack of urgency in 8.3 reviewing
I want to help the reviewing work of ctid chain following enhancement . I've been studying the souce code which related with that part recently. :-) 2007/5/17, Dave Page [EMAIL PROTECTED]: I think we just have to accept that we're gonna have a long feature freeze period, and ask people to help review whatever they can. Regards, Dave. ---(end of broadcast)--- TIP 6: explain analyze is your friend ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Lack of urgency in 8.3 reviewing
Cui Shijun wrote: I want to help the reviewing work of ctid chain following enhancement . I've been studying the souce code which related with that part recently. :-) Please go ahead :-) Regards Dave ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Planning large IN lists
Hi, Tom Lane wrote: Neil Conway [EMAIL PROTECTED] writes: When planning queries with a large IN expression in the WHERE clause, the planner transforms the IN list into a scalar array expression. In clause_selectivity(), we estimate the selectivity of the ScalarArrayExpr by calling scalararraysel(), which in turn estimates the selectivity of *each* array element in order to determine the selectivity of the array expression as a whole. This is quite inefficient when the IN list is large. That's the least of the problems. We really ought to convert such cases into an IN (VALUES(...)) type of query, since often repeated indexscans aren't the best implementation. I thought of giving this a shot and while I was working on it, it occurred to me that we need to decide on a threshold value of the IN list size above which such transformation should take place. For small sizes of the IN list, scalararraysel() of IN list wins over the hash join involved in IN (VALUES(...)). But for larger sizes of the IN list, IN (VALUES(...)) comes out to be a clear winner. I would like to know what does the community think should be a heuristic value of the IN list size beyond which this transformation should take place. I was thinking of a GUC variable (or a hard coded value) which defaults to say 30. This is based on numbers from the following test: postgres=# create table w (w text); CREATE TABLE postgres=# \copy w from '/usr/share/dict/words' And run the following query with different IN list sizes explain analyze select * from w where w in ('one', 'two', ...); I got the following runtimes: IN list IN (VALUES(...)) IN size 150 ~2000 ms ~5500 ms 100 ~1500 ms ~4000 ms 80 ~1400 ms ~3000 ms 50 ~1400 ms ~2500 ms 30 ~1500 ms ~1500 ms 20 ~1400 ms ~1200 ms 10 ~1400 ms ~1200 ms The IN (VALUES(...)) gives an almost steady state behavior, while the IN runtimes deteriorate with growing list size. There would obviously be different conditions on which to base this value. I seek community opinion on this. -- Atul EnterpriseDB www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Planning large IN lists
Hi, Tom Lane wrote: Neil Conway [EMAIL PROTECTED] writes: When planning queries with a large IN expression in the WHERE clause, the planner transforms the IN list into a scalar array expression. In clause_selectivity(), we estimate the selectivity of the ScalarArrayExpr by calling scalararraysel(), which in turn estimates the selectivity of *each* array element in order to determine the selectivity of the array expression as a whole. This is quite inefficient when the IN list is large. That's the least of the problems. We really ought to convert such cases into an IN (VALUES(...)) type of query, since often repeated indexscans aren't the best implementation. I thought of giving this a shot and while I was working on it, it occurred to me that we need to decide on a threshold value of the IN list size above which such transformation should take place. For small sizes of the IN list, scalararraysel() of IN list wins over the hash join involved in IN (VALUES(...)). But for larger sizes of the IN list, IN (VALUES(...)) comes out to be a clear winner. I would like to know what does the community think should be a heuristic value of the IN list size beyond which this transformation should take place. I was thinking of a GUC variable (or a hard coded value) which defaults to say 30. This is based on numbers from the following test: postgres=# create table w (w text); CREATE TABLE postgres=# \copy w from '/usr/share/dict/words' And run the following query with different IN list sizes explain analyze select * from w where w in ('one', 'two', ...); I got the following runtimes: IN list IN (VALUES(...)) IN size 150 ~2000 ms ~5500 ms 100 ~1500 ms ~4000 ms 80 ~1400 ms ~3000 ms 50 ~1400 ms ~2500 ms 30 ~1500 ms ~1500 ms 20 ~1400 ms ~1200 ms 10 ~1400 ms ~1200 ms The IN (VALUES(...)) gives an almost steady state behavior, while the IN runtimes deteriorate with growing list size. There would obviously be different conditions on which to base this value. I seek community opinion on this. -- Atul EnterpriseDB www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Lack of urgency in 8.3 reviewing
On 5/17/07, Cui Shijun [EMAIL PROTECTED] wrote: I want to help the reviewing work of ctid chain following enhancement . I've been studying the souce code which related with that part recently. :-) Tom had objected to this patch on the grounds that it adds complexity without any significant gains. Though I don't completely agree with the first part, the second part is indeed debatable since the code is touched only for infrequently. Thanks, Pavan -- EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Not ready for 8.3
David Fetter wrote: My main difficulty was figuring out a workable way to build a local mirror of the code (so I could get off-line diffs), keep it in sync with the development tree, while branching out working areas to evaluate patches or do new development in. A good example walkthrough of how to do that using standard tools would be a big help. Hint: if you suggest CVsup I'll smack you. Would you be interested in providing this meat? You're uniquely qualified because your shins still smart from all the things you barked them on :) For this item at least the work has been done in showing how to set up a local mirror using rsync instead of CVSup. It's mentioned in the dev version of the docs at http://developer.postgresql.org/pgdocs/postgres/rsync.html - although frankly it would be better to import the information rather than just refer to the buildfarm HOWTO. cheers andredw ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Not ready for 8.3
On Wednesday 16 May 2007 13:02, Robert Haas wrote: I care. I want a professional easy to understand and easy to maintain that doesn't cause potential conflict with future and past development syntax. snip If people have strong opinions about the syntax, why were they not ALL expressed when the proposal was originally laid on the table? snip I haven't studied the proposed syntaxes in detail, snip lol... just thought this was ironic. :-) -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Not ready for 8.3
On Thursday 17 May 2007 08:16, Andrew Dunstan wrote: David Fetter wrote: My main difficulty was figuring out a workable way to build a local mirror of the code (so I could get off-line diffs), keep it in sync with the development tree, while branching out working areas to evaluate patches or do new development in. A good example walkthrough of how to do that using standard tools would be a big help. Hint: if you suggest CVsup I'll smack you. Would you be interested in providing this meat? You're uniquely qualified because your shins still smart from all the things you barked them on :) For this item at least the work has been done in showing how to set up a local mirror using rsync instead of CVSup. It's mentioned in the dev version of the docs at http://developer.postgresql.org/pgdocs/postgres/rsync.html - although frankly it would be better to import the information rather than just refer to the buildfarm HOWTO. And really should probably be written up into FAQ section or a full on guide for how do i get started hacking on postgresql? -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Planning large IN lists
Atul Deopujari [EMAIL PROTECTED] writes: Hi, Tom Lane wrote: That's the least of the problems. We really ought to convert such cases into an IN (VALUES(...)) type of query, since often repeated indexscans aren't the best implementation. I thought of giving this a shot and while I was working on it, it occurred to me that we need to decide on a threshold value of the IN list size above which such transformation should take place. I see no good reason to suppose that there is/should be a constant threshold --- most likely it depends on size of table, availability of indexes, etc. Having the planner try it both ways and compare costs would be best. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Not ready for 8.3
On Thu, 17 May 2007, Andrew Dunstan wrote: For this item at least the work has been done in showing how to set up a local mirror using rsync instead of CVSup...although frankly it would be better to import the information rather than just refer to the buildfarm HOWTO. The information in the buildfarm HOWTO has maybe 1/2 of what you need to know--you get a local mirror out of it but no idea how to then use that on the client side to branch and generate patches. And what it does provide is less helpful than it might be because it includes diversions specific to the buildfarm application I found confusing. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Lack of urgency in 8.3 reviewing
Pavan Deolasee wrote: On 5/17/07, Cui Shijun [EMAIL PROTECTED] wrote: I want to help the reviewing work of ctid chain following enhancement . I've been studying the souce code which related with that part recently. :-) Tom had objected to this patch on the grounds that it adds complexity without any significant gains. Though I don't completely agree with the first part, the second part is indeed debatable since the code is touched only for infrequently. Right. The reason the patch was kept in the queue is that there was discussion that HOT will exercise that part of the code a lot more than it does currently. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] BufFileWrite across MAX_PHYSICAL_FILESIZE boundary
Alvaro Herrera wrote: Bruce Momjian wrote: This has been saved for the 8.4 release: http://momjian.postgresql.org/cgi-bin/pgpatches_hold Huh, no, this is a bug and should be fixed right away. OK, moved to the 8.3 patch queue. --- --- Tom Lane wrote: Kurt Harriman [EMAIL PROTECTED] writes: Just noticed buffile.c:292 doesn't look quite right where BufFileDumpBuffer calls FileWrite: bytestowrite = FileWrite(thisfile, file-buffer, bytestowrite); It looks as though it would write the same data each time the loop is iterated. Would this be better? bytestowrite = FileWrite(thisfile, file-buffer + wpos, bytestowrite); Yeah, I think you're right. We've probably not seen this because in most usages the buffer is always block-aligned, but it looks possible for tuplestore.c to get out of alignment if its concurrent write/read feature is exercised just so. Do you want to try demonstrating the bug with a smaller-than-normal setting of MAX_PHYSICAL_FILESIZE? -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Not ready for 8.3
On Thu, 17 May 2007, Robert Treat wrote: And really should probably be written up into FAQ section or a full on guide for how do i get started hacking on postgresql? To be clear here: while there are guides for new hackers out there, they focus on the big picture. I walked from those with a good idea of how the major pieces fit together and was able to navigate the source code. It was the little details of how to actually work with the code and the repository that were the biggest drag on my progress. I used to wonder if it was just me, but watching the discussion that went along with the OS/X startup scripts thread recently made me realize how many other people struggle(d) with this as well. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Lack of urgency in 8.3 reviewing
I see... I checked part of HOT patches(patch1), and found that it involves too many things I am not currently familar with. Maybe I should change an item to work. :-( Since I only studied part of source codes about transaction processing(lmgr/MVCC/xact but without xlog.c), I want to study Group Commit patch and try to review it, any suggestions? 2007/5/17, Bruce Momjian [EMAIL PROTECTED]: Pavan Deolasee wrote: On 5/17/07, Cui Shijun [EMAIL PROTECTED] wrote: I want to help the reviewing work of ctid chain following enhancement . I've been studying the souce code which related with that part recently. :-) Tom had objected to this patch on the grounds that it adds complexity without any significant gains. Though I don't completely agree with the first part, the second part is indeed debatable since the code is touched only for infrequently. Right. The reason the patch was kept in the queue is that there was discussion that HOT will exercise that part of the code a lot more than it does currently. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Not ready for 8.3
Greg Smith wrote: On Thu, 17 May 2007, Andrew Dunstan wrote: For this item at least the work has been done in showing how to set up a local mirror using rsync instead of CVSup...although frankly it would be better to import the information rather than just refer to the buildfarm HOWTO. The information in the buildfarm HOWTO has maybe 1/2 of what you need to know--you get a local mirror out of it but no idea how to then use that on the client side to branch and generate patches. And what it does provide is less helpful than it might be because it includes diversions specific to the buildfarm application I found confusing. Sure. And of course the creation of branches and generation of patches is irrelevant to the buildfarm. All this just reinforces my point that we should remove that reference and fill in the blanks on the docs, FAQs etc. Incidentally, I don't use a repo mirror for online or offline work, although I do for buildfarm work. What I do is to have a pristine checkout of each live stable branch plus HEAD. I have a cron job that keeps these moderately up to date. When I'm working on a feature or patch, I make a copy (using cp -a) of the relevant branch and then go to work on that. If the work lasts a while, every so often I run 'cvs update' in that copy. And of course I run 'cvs update' before cutting a patch or doing a commit. When the work is done I blow away the work directory. If the work is large and I need to checkpoint certain files locally so I can roll back, I occasionally use RCS. There's no right answer on how to work - everyone uses tools they feel comfortable with. cheers andrew ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Seq scans roadmap
Hi Jeff, On 5/16/07 4:56 PM, Jeff Davis [EMAIL PROTECTED] wrote: The main benefit of a sync scan will be the ability to start the scan where other scans have already filled the I/O cache with useful blocks. This will require some knowledge of the size of the I/O cache by the syncscan logic, but that's where the largest amount of I/O cache memory (by far) is located. I don't think it's necessarily the largest by far. However, it may be the largest. Compare the size of a 32 block ring buffer (between 256KB and 1024KB) and 16,000,000 KB of RAM on a common server, automatically used to maximum extent by the OS dynamic I/O caching. If you mean that the main benefit of sync scans is to make use of blocks that happen to be in cache before the scan began, I disagree. That's not what I meant. I think that's a possible benefit, but I was unable to show any huge benefit in my tests (someone may be able to on different hardware with different test cases). I agree, I don't think this is worth pursuing. The main benefits that I see are: (1) reduce total number of blocks read from disk by making use of blocks as they are read by another concurrent seqscan. (2) eliminate the need for random I/O on concurrent sequential scans. Yes on (1), but with (2), again, the OS prefetch reduces the seeking to a minimal level. With (1), we just have to define the meaning of concurrent to be within the span of the OS I/O cache and we're describing the same effect. - Luke ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Lack of urgency in 8.3 reviewing
Cui Shijun wrote: I see... I checked part of HOT patches(patch1), and found that it involves too many things I am not currently familar with. Maybe I should change an item to work. :-( Yeah, that's one big patch.. Since I only studied part of source codes about transaction processing(lmgr/MVCC/xact but without xlog.c), I want to study Group Commit patch and try to review it, any suggestions? There's no group commit patch, just some discussion, and probably won't be until 8.4. Maybe one of these would interest you: - deferred transaction/waitless COMMIT - full page writes improvement - maintaining cluster order on insert - heap page diagnostic functions Make sure you look at the latest version of the patches. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Not ready for 8.3
On Tue, May 15, 2007 at 04:52:16PM -0400, Alvaro Herrera wrote: This is what happens with the Linux kernel. They have hundreds of developers getting their hands dirty during a previous period. Then 2.6.20 is released; the 2.6.21 merge window opens, and all sort of patches are flooded in. I hasten to point out that the Linux kernel has also had several stable releases with huge bugs -- things like massive filesystem corruption, bizarre failure cases, and nasty compatibility problems with modules across versions. I am not entirely sure that the Linux model is the one to ape. PostgreSQL has a history with remarkably few of those blunders, and I'd hate to give that up. A -- Andrew Sullivan | [EMAIL PROTECTED] If they don't do anything, we don't need their acronym. --Josh Hamilton, on the US FEMA ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] savepoints and upgrading locks
Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches It will be applied as soon as one of the PostgreSQL committers reviews and approves it. --- Jeff Davis wrote: This thread here became a TODO item: http://archives.postgresql.org/pgsql-hackers/2006-11/msg01011.php During that discussion a patch was produced that nobody seemed to have objections to. The main problem seemed to be that it wouldn't always downgrade the lock on a ROLLBACK TO (as I understand it), which doesn't seem like a problem to me. Is there a reason this isn't a part of 8.3, or was it just forgotten? Regards, Jeff Davis ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] savepoints and upgrading locks
Bruce Momjian wrote: Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches It will be applied as soon as one of the PostgreSQL committers reviews and approves it. Wait a minute. Would the mentioned patch be this one? http://article.gmane.org/gmane.comp.db.postgresql.devel.general/73330 Because it was later declared not working; see http://article.gmane.org/gmane.comp.db.postgresql.devel.general/73334 Jeff Davis wrote: This thread here became a TODO item: http://archives.postgresql.org/pgsql-hackers/2006-11/msg01011.php During that discussion a patch was produced that nobody seemed to have objections to. The main problem seemed to be that it wouldn't always downgrade the lock on a ROLLBACK TO (as I understand it), which doesn't seem like a problem to me. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Not ready for 8.3
Andrew Sullivan wrote: On Tue, May 15, 2007 at 04:52:16PM -0400, Alvaro Herrera wrote: This is what happens with the Linux kernel. They have hundreds of developers getting their hands dirty during a previous period. Then 2.6.20 is released; the 2.6.21 merge window opens, and all sort of patches are flooded in. I hasten to point out that the Linux kernel has also had several stable releases with huge bugs -- /me fondly remembers kernel 2.4. things like massive filesystem corruption, bizarre failure cases, and nasty compatibility problems with modules across versions. I am not entirely sure that the Linux model is the one to ape. PostgreSQL has a history with remarkably few of those blunders, and I'd hate to give that up. A -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] savepoints and upgrading locks
Bruce Momjian [EMAIL PROTECTED] writes: Your patch has been added to the PostgreSQL unapplied patches list at: Please take it off again. Jeff Davis wrote: This thread here became a TODO item: http://archives.postgresql.org/pgsql-hackers/2006-11/msg01011.php During that discussion a patch was produced that nobody seemed to have objections to. Apparently you stopped reading somewhere. It was pretty thoroughly destroyed beginning here: http://archives.postgresql.org/pgsql-hackers/2006-12/msg00025.php We should have spent more effort trying to think of a solution to the problem during the 8.3 development cycle. At the moment we're no further ahead than we were in December. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] mb and ecpg regression tests
We've had ecpg regression tests being doing by the buildfarm for a while, but they are not enabled for MSVC. Also, buildfarm has never supported doing MBCS regression tests. In both cases the regression tests are driven by Unix shell scripts, and in the MBCS case at least, the script would need plenty of love before it would be suitable for the buildfarm even on Unix. Back when we were thinking of supporting MSVC builds, we thought that everything would need to be a C program, and so we rewrote the main regression script in C, for example. However, now we have a build system for MSVC we can see that its main infrastructure is in fact perl, so I think what we need to do is to convert the MBCS and ECPG regression drivers to perl rather than C - which should be far, far simpler task. I'm going to try to get these done and supported by the buildfarm by the time we get to Beta. cheers andrew ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Not ready for 8.3
Joshua D. Drake wrote: Andrew Sullivan wrote: On Tue, May 15, 2007 at 04:52:16PM -0400, Alvaro Herrera wrote: This is what happens with the Linux kernel. They have hundreds of developers getting their hands dirty during a previous period. Then 2.6.20 is released; the 2.6.21 merge window opens, and all sort of patches are flooded in. I hasten to point out that the Linux kernel has also had several stable releases with huge bugs -- /me fondly remembers kernel 2.4. We keep focusing on process. I am on record as saying we can improve our processes, but the fact is our major immediate problem is person-power, not process. We need more qualified reviewers. Qualified means (to me, at least) you have to have done enough visible PostgreSQL hacking that a committer can reasonably place some level of trust in your review, thereby saving some time. That's not to say that others can't or shouldn't do reviews - every little bit helps, but if Freda Bloggs comes along with a review of some new, large, feature, she isn't helping to make the process shorter, although she might be helping to make it more robust. cheers andrew ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] mb and ecpg regression tests
Andrew Dunstan wrote: We've had ecpg regression tests being doing by the buildfarm for a while, but they are not enabled for MSVC. Also, buildfarm has never supported doing MBCS regression tests. In both cases the regression tests are driven by Unix shell scripts, and in the MBCS case at least, the script would need plenty of love before it would be suitable for the buildfarm even on Unix. Back when we were thinking of supporting MSVC builds, we thought that everything would need to be a C program, and so we rewrote the main regression script in C, for example. However, now we have a build system for MSVC we can see that its main infrastructure is in fact perl, so I think what we need to do is to convert the MBCS and ECPG regression drivers to perl rather than C - which should be far, far simpler task. I'm going to try to get these done and supported by the buildfarm by the time we get to Beta. I don't think it's essential, but one of the other reasons Magnus and I discussed for rewriting the regression tests in C was that we could include it in the future with the installer as a platform/installation confidence test. Regards, Dave. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] savepoints and upgrading locks
OK, emails moved to 8.4 queue. --- Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Your patch has been added to the PostgreSQL unapplied patches list at: Please take it off again. Jeff Davis wrote: This thread here became a TODO item: http://archives.postgresql.org/pgsql-hackers/2006-11/msg01011.php During that discussion a patch was produced that nobody seemed to have objections to. Apparently you stopped reading somewhere. It was pretty thoroughly destroyed beginning here: http://archives.postgresql.org/pgsql-hackers/2006-12/msg00025.php We should have spent more effort trying to think of a solution to the problem during the 8.3 development cycle. At the moment we're no further ahead than we were in December. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] mb and ecpg regression tests
Dave Page wrote: Andrew Dunstan wrote: We've had ecpg regression tests being doing by the buildfarm for a while, but they are not enabled for MSVC. Also, buildfarm has never supported doing MBCS regression tests. In both cases the regression tests are driven by Unix shell scripts, and in the MBCS case at least, the script would need plenty of love before it would be suitable for the buildfarm even on Unix. Back when we were thinking of supporting MSVC builds, we thought that everything would need to be a C program, and so we rewrote the main regression script in C, for example. However, now we have a build system for MSVC we can see that its main infrastructure is in fact perl, so I think what we need to do is to convert the MBCS and ECPG regression drivers to perl rather than C - which should be far, far simpler task. I'm going to try to get these done and supported by the buildfarm by the time we get to Beta. I don't think it's essential, but one of the other reasons Magnus and I discussed for rewriting the regression tests in C was that we could include it in the future with the installer as a platform/installation confidence test. That's a good point. So, do we want/need the same for MBCS and ECPG tests? If so, we should work on creating a regression shared lib so we don't have to replicate lots of code. cheers andrew ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] 8.3 release date on web site
Right now our roadmap lists the 8.3 release as July, 2007: http://www.postgresql.org/developer/roadmap While the year might be right, the month probably is not. I suggest we remove the date and replace it with undetermined until we have a clearer target date. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Not ready for 8.3
Andrew Sullivan wrote: On Tue, May 15, 2007 at 04:52:16PM -0400, Alvaro Herrera wrote: This is what happens with the Linux kernel. They have hundreds of developers getting their hands dirty during a previous period. Then 2.6.20 is released; the 2.6.21 merge window opens, and all sort of patches are flooded in. I hasten to point out that the Linux kernel has also had several stable releases with huge bugs -- things like massive filesystem corruption, bizarre failure cases, and nasty compatibility problems with modules across versions. I am not entirely sure that the Linux model is the one to ape. PostgreSQL has a history with remarkably few of those blunders, and I'd hate to give that up. Sorry, I wasn't trying to say that we should follow the Linux model all that closely. I know there are regressions in the point zero releases, and that there are bugs. But then, we're nowhere near the scale of Linux. In the press release for 8.0 we mentioned something about hundreds of developers. This was true -- but it was not in the same league as Linux's hundreds of developers. We're nowhere near the manpower that Linux has, nowhere near the amount of code these guys change. I think the figure is around 9000 lines of code changed per day, _every day_[1]. They have new drivers all the time, internal interfaces are cleaned up, new facilities are invented to support new kinds of hardware, performance improvements are made all over the place. So there are a hundred of strange machines where the thing does not boot; yes, but those bugs are fixed in 2.6.2x.1 or subsequent stable branch releases. I dare not think of the 2.2 or 2.4 disasters, where distributions were fond of backporting huge patches from the 2.3 or 2.5 development branches. I compiled my own kernel from Linus' tree back then, which worked without the strange behavior the other kernels had (maybe the distro kernels stabilized at some point, but I didn't try later in the game -- I was too used to compiling my own). Now with 2.6 I don't do that anymore (of course I don't run 2.6.22 as soon as it is out either). In my opinion Linux 2.6 is much better than Linux 2.4. I don't think we should follow either model though. We have different problems and different people. [1] http://lwn.net/Articles/232379/ -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] 8.3 release date on web site
Bruce Momjian wrote: Right now our roadmap lists the 8.3 release as July, 2007: http://www.postgresql.org/developer/roadmap While the year might be right, the month probably is not. I suggest we remove the date and replace it with undetermined until we have a clearer target date. I say we wait until we actually know where we are at. How is that comparative from Tom's triage to my email coming? With Andrew picking up three patches yesterday and the fact that some of the patches are clearly not workable for 8.3 (posix shared memory, bitmap on disk indexes(unless something has changed?), we may not be in as bad as shape as we think. Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Lack of urgency in 8.3 reviewing
Thank you for your suggestions, I am thinking about Full page writes improvement. It seems not so complicated, just fit for a novice like me. I'll work on it. :-) 2007/5/17, Heikki Linnakangas [EMAIL PROTECTED]: Cui Shijun wrote: I see... I checked part of HOT patches(patch1), and found that it involves too many things I am not currently familar with. Maybe I should change an item to work. :-( Yeah, that's one big patch.. Since I only studied part of source codes about transaction processing(lmgr/MVCC/xact but without xlog.c), I want to study Group Commit patch and try to review it, any suggestions? There's no group commit patch, just some discussion, and probably won't be until 8.4. Maybe one of these would interest you: - deferred transaction/waitless COMMIT - full page writes improvement - maintaining cluster order on insert - heap page diagnostic functions Make sure you look at the latest version of the patches. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] mb and ecpg regression tests
Andrew Dunstan [EMAIL PROTECTED] writes: Back when we were thinking of supporting MSVC builds, we thought that everything would need to be a C program, and so we rewrote the main regression script in C, for example. However, now we have a build system for MSVC we can see that its main infrastructure is in fact perl, so I think what we need to do is to convert the MBCS and ECPG regression drivers to perl rather than C - which should be far, far simpler task. I disagree with this for ecpg because (a) it would be a capability regression from the point of view of anyone building a non-perl-enabled build on Unix, and (b) we already have the C regression driver, why can't we re-use or adapt it for ecpg instead of writing and debugging and maintaining an entirely new set of test infrastructure? For the MBCS stuff I don't care, since no one runs that on a regular basis anyway (or should need to...) regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] mb and ecpg regression tests
Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Back when we were thinking of supporting MSVC builds, we thought that everything would need to be a C program, and so we rewrote the main regression script in C, for example. However, now we have a build system for MSVC we can see that its main infrastructure is in fact perl, so I think what we need to do is to convert the MBCS and ECPG regression drivers to perl rather than C - which should be far, far simpler task. I disagree with this for ecpg because (a) it would be a capability regression from the point of view of anyone building a non-perl-enabled build on Unix, and (b) we already have the C regression driver, why can't we re-use or adapt it for ecpg instead of writing and debugging and maintaining an entirely new set of test infrastructure? For the MBCS stuff I don't care, since no one runs that on a regular basis anyway (or should need to...) Well, you need perl to build from CVS, IIRC, but I take the point. I don't think ECPG can use pg_regress as is, because it does more than just run psql. We probably need to look at factoring out the common bits into a shared lib. That's not a bad idea anyway. cheers andrew ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] mb and ecpg regression tests
Andrew Dunstan [EMAIL PROTECTED] writes: I don't think ECPG can use pg_regress as is, because it does more than just run psql. We probably need to look at factoring out the common bits into a shared lib. That's not a bad idea anyway. A shared library is probably overkill (it has a lot more installation overhead than it's worth), but at least factor the source code so we don't have two copies of the common bits. It wouldn't be a bad goal to try to use the same pg_regress 2.0 for both uses, either. I'm tempted to suggest it should become a separate subdirectory under src/bin/ to remind people it's decoupled from the regression tests per se... regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Patch queue triage
Tom Lane wrote: At this point it seems nothing will be done about this issue for 8.3. * [PATCHES] plpgpsm /Pavel Stehule/ I think this has to be held for 8.4: it was submitted too late for the 8.3 feature deadline, and in fact I don't recall that there was any community discussion/consensus on accepting it into core at all. Another big problem is that it largely copies-and-pastes the plpgsql code, which I think is an unacceptable maintenance burden in the long run. We need to consider whether we can't refactor to avoid code duplication. Per my updated patch email to the lists yesterday, plus Toms' above comments *and* Alvaro's comments to me when I asked Alexey to review it... may I strongly suggest that we bounce this for further development in 8.4. Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Group Commit
This is not ready for 8.3. This has been saved for the 8.4 release: http://momjian.postgresql.org/cgi-bin/pgpatches_hold --- Heikki Linnakangas wrote: It's been known for years that commit_delay isn't very good at giving us group commit behavior. I did some experiments with this simple test case: BEGIN; INSERT INTO test VALUES (1); COMMIT;, with different numbers of concurrent clients and with and without commit_delay. Summary for the impatient: 1. Current behavior sucks. 2. commit_delay doesn't help with # of clients ~10. It does help with higher numbers, but it still sucks. 3. I'm working on a patch. I added logging to show how many commit records are flushed on each fsync. The output with otherwise unpatched PG head looks like this, with 5 clients: LOG: Flushed 4 out of 5 commits LOG: Flushed 1 out of 5 commits LOG: Flushed 4 out of 5 commits LOG: Flushed 1 out of 5 commits LOG: Flushed 4 out of 5 commits LOG: Flushed 1 out of 5 commits LOG: Flushed 4 out of 5 commits LOG: Flushed 1 out of 5 commits LOG: Flushed 3 out of 5 commits LOG: Flushed 2 out of 5 commits LOG: Flushed 3 out of 5 commits LOG: Flushed 2 out of 5 commits LOG: Flushed 3 out of 5 commits LOG: Flushed 2 out of 5 commits LOG: Flushed 3 out of 5 commits ... Here's what's happening: 1. Client 1 issues fsync (A) 2. Clients 2-5 write their commit record, and try to fsync, but they have to wait for fsync (A) to finish. 3. fsync (A) finishes, freeing client 1. 4. One of clients 2-5 starts the next fsync (B), which will flush commits of clients 2-5 to disk 5. Client 1 begins new transaction, inserts commit record and tries to fsync. Needs to wait for previous fsync (B) to finish 6. fsync B finishes, freeing clients 2-5 7. Client 1 issues fsync (C) 8. ... The 2-3-2-3 pattern can be explained with similar unfortunate resonance, but with two clients instead of client 1 in the above possibly running in separate cores (test was run on a dual-core laptop). I also draw a diagram illustrating the above, attached. I wrote a quick dirty patch for this that I'm going to refine further, but wanted to get the results out for others to look at first. I'm not posting the patch yet, but it basically adds some synchronization to the WAL flushes. It introduces a counter of inserted but not yet flushed commit records. Instead of the commit_delay, the counter is checked. If it's smaller than NBackends, the process waits until count reaches NBackends, or a timeout expires. There's two significant differences to commit_delay here: 1. Instead of waiting for commit_delay to expire, processes are woken and fsync is started immediately when we know there's no more commit records coming that we should wait for. Even though commit_delay is given in microseconds, the real granularity of the wait can be as high as 10 ms, which is in the same ball park as the fsync itself. 2. commit_delay is not used when there's less than commit_siblings non-idle backends in the system. With very short transactions, it's worthwhile to wait even if that's the case, because a client can begin and finish a transaction in much shorter time than it takes to fsync. This is what makes the commit_delay to not work at all in my test case with 2 clients. Here's a spreadsheet with the results of the tests I ran: http://community.enterprisedb.com/groupcommit-comparison.ods It contains a graph that shows that the patch works very well for this test case. It's not very good for real life as it is, though. An obvious flaw is that if you have a longer-running transaction, effect 1. goes away. Instead of waiting for NBackends commit records, we should try to guess the number of transactions that are likely to finish in a reasonably short time. I'm thinking of keeping a running average of commits per second, or # of transactions that finish while an fsync is taking place. Any thoughts? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Patch queue triage
Pavan Deolasee wrote: I suppose inserting HOT tuples without index maintenance is useful even if no changes to the space allocation is made is useful. It won't get the space usage but it would save on index thrashing. But that still implies all the code to handle scans, updates, index builds, etc. Those chunks could be separated out but you can't commit without them. There are few things that we can separate easily, like CREATE INDEX related changes, VACUUM and VACUUM FULL related changed, space reuse related changes etc. Let me give it a shot. Did we ever get a broken up patch for this? Joshua D. Drake Thanks, Pavan -- EnterpriseDB http://www.enterprisedb.com -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Patch queue triage
Joshua D. Drake wrote: Tom Lane wrote: At this point it seems nothing will be done about this issue for 8.3. * [PATCHES] plpgpsm /Pavel Stehule/ I think this has to be held for 8.4: it was submitted too late for the 8.3 feature deadline, and in fact I don't recall that there was any community discussion/consensus on accepting it into core at all. Another big problem is that it largely copies-and-pastes the plpgsql code, which I think is an unacceptable maintenance burden in the long run. We need to consider whether we can't refactor to avoid code duplication. Per my updated patch email to the lists yesterday, plus Toms' above comments *and* Alvaro's comments to me when I asked Alexey to review it... may I strongly suggest that we bounce this for further development in 8.4. Agreed. Done. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Patch queue triage
On 5/17/07, Joshua D. Drake [EMAIL PROTECTED] wrote: Pavan Deolasee wrote: There are few things that we can separate easily, like CREATE INDEX related changes, VACUUM and VACUUM FULL related changed, space reuse related changes etc. Let me give it a shot. Did we ever get a broken up patch for this? I broke the patch into 5 smaller, logical pieces and posted them on May 7th. Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Patch queue triage
Joshua D. Drake wrote: Pavan Deolasee wrote: I suppose inserting HOT tuples without index maintenance is useful even if no changes to the space allocation is made is useful. It won't get the space usage but it would save on index thrashing. But that still implies all the code to handle scans, updates, index builds, etc. Those chunks could be separated out but you can't commit without them. There are few things that we can separate easily, like CREATE INDEX related changes, VACUUM and VACUUM FULL related changed, space reuse related changes etc. Let me give it a shot. Did we ever get a broken up patch for this? Yes: http://archives.postgresql.org/pgsql-patches/2007-05/msg00065.php -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] mb and ecpg regression tests
Andrew Dunstan wrote: That's a good point. So, do we want/need the same for MBCS and ECPG tests? It would be nice for ECPG. I wasn't even aware of the MBCS stuff - if it will exercise the unicode sorting which is a little different on Windows iirc (due to the lack of full native UTF-8 support) it might be a useful addition, but I wouldn't say it was worth busting a gut over. Regards, Dave. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Patch queue triage
Heikki Linnakangas wrote: There are few things that we can separate easily, like CREATE INDEX related changes, VACUUM and VACUUM FULL related changed, space reuse related changes etc. Let me give it a shot. Did we ever get a broken up patch for this? Yes: http://archives.postgresql.org/pgsql-patches/2007-05/msg00065.php Thanks :). Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Planning large IN lists
Hi, Tom Lane wrote: Atul Deopujari [EMAIL PROTECTED] writes: Hi, Tom Lane wrote: That's the least of the problems. We really ought to convert such cases into an IN (VALUES(...)) type of query, since often repeated indexscans aren't the best implementation. I thought of giving this a shot and while I was working on it, it occurred to me that we need to decide on a threshold value of the IN list size above which such transformation should take place. I see no good reason to suppose that there is/should be a constant threshold --- most likely it depends on size of table, availability of indexes, etc. Having the planner try it both ways and compare costs would be best. Yes, letting the planner make its own decision would seem best (in accordance with what we do for different join paths). But for large IN lists, a substantial part of the planner is spent in estimating the selectivity of the ScalarArrayExpr by calling scalararraysel. If we are not eliminating this step in processing the IN list then we are not doing any optimization. Asking the planner to do scalararraysel and also compute cost of any other way and choose between the two is asking planner to do more work. Factors such as size of table, availability of index etc. would affect both the ways similarly. So, if we see a gain in the execution of the IN list due to an external factor then we will also see a similar gain in the execution of the transformed IN (VALUES(...)) clause. I agree that one value would not fit all cases. The problem with this approach is that for some cases, large IN list would perform better than the transformed IN (VALUES(...)) clause. But we know that the transformed IN (VALUES(...)) clause has almost a steady state behavior and it would not blow off the planner estimates. The error would be just marginal. -- Atul EnterpriseDB www.enterprisedb.com ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] Async commands (like drop index)
Hello, It seems that it may be useful to allow something like: DROP INDEX NOWAIT. The idea being, that the terminal will come back, the index will be dropped in the background. If it doesn't drop, it rollback like normal and logs. I bring this up now, as an idea. We can argue about it later... :) Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] 8.3 release date on web site
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 - --On Thursday, May 17, 2007 09:20:30 -0700 Joshua D. Drake [EMAIL PROTECTED] wrote: I say we wait until we actually know where we are at. How is that comparative from Tom's triage to my email coming? I agree ... a date gives ppl something to aim for, even if we have to push it back later ... - Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email . [EMAIL PROTECTED] MSN . [EMAIL PROTECTED] Yahoo . yscrappy Skype: hub.orgICQ . 7615664 -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.5 (FreeBSD) iD8DBQFGTLHE4QvfyHIvDvMRAmNvAKCxZyhh9geBp8F5PHc7ImaOEhgvGgCfSlQ7 Ztyzahdf2RU5s86fKe0Ts08= =eJL3 -END PGP SIGNATURE- ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] mb and ecpg regression tests
Andrew Dunstan wrote: We've had ecpg regression tests being doing by the buildfarm for a while, but they are not enabled for MSVC. Also, buildfarm has never supported doing MBCS regression tests. In both cases the regression tests are driven by Unix shell scripts, and in the MBCS case at least, the script would need plenty of love before it would be suitable for the buildfarm even on Unix. Back when we were thinking of supporting MSVC builds, we thought that everything would need to be a C program, and so we rewrote the main regression script in C, for example. However, now we have a build system for MSVC we can see that its main infrastructure is in fact perl, so I think what we need to do is to convert the MBCS and ECPG regression drivers to perl rather than C - which should be far, far simpler task. I'm going to try to get these done and supported by the buildfarm by the time we get to Beta. IIRC, Joachim had started working on the ecpg part. (Which is one of the reasons I haven't looked at it myself yet) Joachim, did you ever get anywhere with that? //Magnus ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Planning large IN lists
Atul Deopujari [EMAIL PROTECTED] writes: Yes, letting the planner make its own decision would seem best (in accordance with what we do for different join paths). But for large IN lists, a substantial part of the planner is spent in estimating the selectivity of the ScalarArrayExpr by calling scalararraysel. If we are not eliminating this step in processing the IN list then we are not doing any optimization. Asking the planner to do scalararraysel and also compute cost of any other way and choose between the two is asking planner to do more work. So? Better planning usually involves more work. In any case the above argument seems irrelevant, because making scalararraysel more approximate and less expensive for long lists could be done independently of anything else. Factors such as size of table, availability of index etc. would affect both the ways similarly. So, if we see a gain in the execution of the IN list due to an external factor then we will also see a similar gain in the execution of the transformed IN (VALUES(...)) clause. Incorrect. There is more than one way to do a join, and the above argument only applies if the VALUES case is planned as a nestloop with inner indexscan, which indeed is isomorphic to the scalararrayop implementation ... except that it has higher per-tuple overhead, and therefore will consistently lose, disregarding artifacts of planning costs such as how hard we try to estimate the result size. The case where VALUES is actually a better plan is where the planner switches to merge or hash join because there are too many values. In the current implementation, the planner is incapable of generating those plan shapes from a scalararrayop, and that's what I'd like to see fixed. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] 8.3 release date on web site
OK, date remains unchanged. --- Marc G. Fournier wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 - --On Thursday, May 17, 2007 09:20:30 -0700 Joshua D. Drake [EMAIL PROTECTED] wrote: I say we wait until we actually know where we are at. How is that comparative from Tom's triage to my email coming? I agree ... a date gives ppl something to aim for, even if we have to push it back later ... - Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email . [EMAIL PROTECTED] MSN . [EMAIL PROTECTED] Yahoo . yscrappy Skype: hub.orgICQ . 7615664 -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.5 (FreeBSD) iD8DBQFGTLHE4QvfyHIvDvMRAmNvAKCxZyhh9geBp8F5PHc7ImaOEhgvGgCfSlQ7 Ztyzahdf2RU5s86fKe0Ts08= =eJL3 -END PGP SIGNATURE- ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] Re: [COMMITTERS] pgsql: Fix parameter recalculation for Limit nodes: during a ReScan call
Is there still a TODO here? --- Tom Lane wrote: Log Message: --- Fix parameter recalculation for Limit nodes: during a ReScan call we must recompute the limit/offset immediately, so that the updated values are available when the child's ReScan function is invoked. Add a regression test for this, too. Bug is new in HEAD (due to the bounded-sorting patch) so no need for back-patch. I did not do anything about merging this signaling with chgParam processing, but if we were to do that we'd still need to compute the updated values at this point rather than during the first ProcNode call. Per observation and test case from Greg Stark, though I didn't use his patch. Modified Files: -- pgsql/src/backend/executor: nodeLimit.c (r1.30 - r1.31) (http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/executor/nodeLimit.c.diff?r1=1.30r2=1.31) nodeSubplan.c (r1.88 - r1.89) (http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/executor/nodeSubplan.c.diff?r1=1.88r2=1.89) pgsql/src/include/nodes: execnodes.h (r1.173 - r1.174) (http://developer.postgresql.org/cvsweb.cgi/pgsql/src/include/nodes/execnodes.h.diff?r1=1.173r2=1.174) pgsql/src/test/regress/expected: limit.out (r1.4 - r1.5) (http://developer.postgresql.org/cvsweb.cgi/pgsql/src/test/regress/expected/limit.out.diff?r1=1.4r2=1.5) pgsql/src/test/regress/sql: limit.sql (r1.4 - r1.5) (http://developer.postgresql.org/cvsweb.cgi/pgsql/src/test/regress/sql/limit.sql.diff?r1=1.4r2=1.5) ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] mb and ecpg regression tests
On Thu, May 17, 2007 at 10:14:45PM +0200, Magnus Hagander wrote: IIRC, Joachim had started working on the ecpg part. (Which is one of the reasons I haven't looked at it myself yet) Joachim, did you ever get anywhere with that? I have ecpg tests somehow running under msvc but there are still some issues. Here are the problems that I encounter(ed): a) ecpg uses a program for each test that runs and generates output. Those programs need to be compiled: in MSVC we would either compile them on the fly while testing or generate a new project for every single executable. Since the second option is overkill, Magnus suggested to try compiling them on the fly with the command line compiler. This is ugly but seems to work. b) ecpg tests use several directories. This causes several problems, one of them being the resultmap format to be changed, it currently uses: float4/i.86-pc-mingw32=... we might want to change the / to some other character here. c) ecpg uses 3 different files per test. stderr, stdout and generated source. There is not only just one file to diff against an expected file. My approach was to have separate init and test routines for each module (backend regression tests, ecpg or whatever) and share the rest. The init routine sets up stuff and the test routine knows about the specific test logic. It generates lists of files to diff. The diffing and reporting is kept common of course. A problem here are alternative comparison files. We have the resultmap entries and we have test_n.out. This has to be combined with ecpg having 3 files. We might want to use test-platform.stdout instead of test.stdout and test_1.stderr instead of test.stderr I have not yet sorted out this last item. Proposals welcome. I hope that I can fix it on the weekend or at least by the end of next week and then send my version to whoever is interested. Joachim ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] [COMMITTERS] pgsql: Fix parameter recalculation for Limit nodes: during a ReScan call
Bruce Momjian [EMAIL PROTECTED] writes: Is there still a TODO here? Tom Lane wrote: Fix parameter recalculation for Limit nodes: No, it's fixed AFAIK. Till someone finds another bug anyway ;-) regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [BUGS] Removing pg_auth_members.grantor (was Grantor name gets lost when grantor role dropped)
Is there a TODO here? --- Russell Smith wrote: Alvaro Herrera wrote: Russell Smith wrote: Alvaro Herrera wrote: Alvaro Herrera wrote: 2. decide that the standard is braindead and just omit dumping the grantor when it's no longer available, but don't remove pg_auth_members.grantor Which do people feel should be implemented? I can do whatever we decide; if no one has a strong opinion on the matter, my opinion is we do (2) which is the easiest. Here is a patch implementing this idea, vaguely based on Russell's. I haven't had time to finalize my research about this, but the admin option with revoke doesn't appear to work as expected. Here is my sample SQL for 8.2.4 create table test (x integer); \z create role test1 noinherit; create role test2 noinherit; grant select on test to test1 with grant option; grant select on test to test2; \z test set role test1; revoke select on test from test2; \z test set role test2; select * from test; reset role; revoke all on test from test2; revoke all on test from test1; drop role test2; drop role test1; drop table test; \q The privilege doesn't appear to be revoked by test1 from test2. I'm not sure if this is related, but I wanted to bring it up in light of the options we have for grantor. Humm, but the privilege was not granted by test1, but by the user you were using initially. The docs state in a note that A user can only revoke privileges that were granted directly by that user. I understand that this would apply to the grantor stuff being discussed in this thread as well, but I haven't seen anyone arguing that we should implement that for GRANT ROLE (and I asked three times if people felt it was important and nobody answered). Well, I would vote for implementing this in GRANT ROLE. I wish to use it in my security model. I don't think the spec is brain dead when you understand what it's trying to achieve. Example: 2 Groups of administrators who are allowed to grant a role to users of the system App_Admin_G1 App_Admin_G2 App_User SET ROLE App_Admin_G1 GRANT App_User TO Fred; SET ROLE App_Admin_G2 GRANT App_User TO John; SET ROLE App_Admin_G1 REVOKE App_User FROM John; As App_Admin_G1 did not grant App_User rights to John, he should not be able to take them away. I currently have a situation where I would like to be able to do the above. I have two separate departments who might grant privileges for the same application to the same user. One department administrator should not be able to revoke the privileges set by the other one. I would expect superusers to be able to revoke from anybody, or the owner. I'm not sure what the owner is when we talk about granting roles. Regards Russell Smith ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [BUGS] Removing pg_auth_members.grantor (was Grantor name gets lost when grantor role dropped)
Bruce Momjian wrote: Is there a TODO here? Yes, I think so: * Implement the SQL standard mechanism whereby REVOKE ROLE only revokes the privilege as granted by the invoking role, and not those granted by other roles -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] Patch status page
I have updated the patch status page: http://developer.postgresql.org/index.php/Todo:PatchStatus#current_8.3_patch_status I moved some patches that were clearly not ready to the 8.4 queue. Let's see what we can get applied in the next week and review our status then. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [BUGS] Removing pg_auth_members.grantor (was Grantor name gets lost when grantor role dropped)
Added to TODO: --- Alvaro Herrera wrote: Bruce Momjian wrote: Is there a TODO here? Yes, I think so: * Implement the SQL standard mechanism whereby REVOKE ROLE only revokes the privilege as granted by the invoking role, and not those granted by other roles -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] interval / interval - double operator
Yet another potential addition to the family of operators. Some guy was asking for it on IRC so... CREATE OR REPLACE FUNCTION interval_over_interval(interval, interval) RETURNS float STRICT IMMUTABLE LANGUAGE sql AS $$ SELECT extract(epoch from $1)::float / extract(epoch from $2); $$; CREATE OPERATOR / ( leftarg = interval , rightarg = interval , procedure = interval_over_interval ); ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] What is happening on buildfarm member baiji?
Are we going to apply this? I would also like to see a comment added on why we use SO_REUSEADDR. --- Magnus Hagander wrote: On Mon, May 14, 2007 at 09:34:05AM -0400, Andrew Dunstan wrote: Magnus Hagander wrote: On Mon, May 14, 2007 at 09:02:10AM -0400, Tom Lane wrote: Magnus Hagander [EMAIL PROTECTED] writes: If all we want to do is add a check that prevents two servers to start on the same port, we could do that trivially in a win32 specific way (since we'll never have unix sockets there). Just create an object in the global namespace named postgresql.interlock.portnumber or such a thing. Does it go away automatically on postmaster crash? Yes. Then I think it's worth adding, and I'd argue that as a low risk safety measure we should allow it to sneak into 8.3. I'm assuming the code involved will be quite small. Yes, see attached. BTW, did you mean 8.2? One typical case where this could happen is in an upgrade scenario, I think... //Magnus [ Attachment, skipping... ] ---(end of broadcast)--- TIP 6: explain analyze is your friend -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] interval / interval - double operator
Andrew Hammond [EMAIL PROTECTED] writes: Yet another potential addition to the family of operators. Some guy was asking for it on IRC so... CREATE OR REPLACE FUNCTION interval_over_interval(interval, interval) RETURNS float STRICT IMMUTABLE LANGUAGE sql AS $$ SELECT extract(epoch from $1)::float / extract(epoch from $2); $$; What are the grounds for defining it that way rather than some other way? regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] What is happening on buildfarm member baiji?
Bruce Momjian [EMAIL PROTECTED] writes: Are we going to apply this? Not in the form submitted so far, but I trust Magnus is working on fixing it. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] interval / interval - double operator
On 5/17/07, Tom Lane [EMAIL PROTECTED] wrote: Andrew Hammond [EMAIL PROTECTED] writes: Yet another potential addition to the family of operators. Some guy was asking for it on IRC so... CREATE OR REPLACE FUNCTION interval_over_interval(interval, interval) RETURNS float STRICT IMMUTABLE LANGUAGE sql AS $$ SELECT extract(epoch from $1)::float / extract(epoch from $2); $$; What are the grounds for defining it that way rather than some other way? The only alternative that came to mind when I wrote it was using a numeric instead of float. I couldn't see why a numeric with some arbitrary precision / scale was particularly better than just using a double precision. There's already an interval_div function in the catalog which take an interval and a double precision and returns an interval, so using floating point math already has precedent. I figured that if I went with numeric, I'd also have to have a pretty good reason to change the existing operator or it'd inconsistent. Since float (without parameters) is both shorter to type and appears to be the same as double precision (at least according to the docs), my innate lazy streak went that way. Am I missing something obvious? Andrew
Re: [HACKERS] interval / interval - double operator
Andrew Hammond [EMAIL PROTECTED] writes: On 5/17/07, Tom Lane [EMAIL PROTECTED] wrote: What are the grounds for defining it that way rather than some other way? The only alternative that came to mind when I wrote it was using a numeric instead of float. No, I'm wondering what's the justification for smashing it to a single number at all, when the inputs are three-field values. Interval divided by float doesn't produce just a float, for example. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Not ready for 8.3
On Thu, 17 May 2007, David Fetter wrote: Would you be interested in providing this meat? You're uniquely qualified because your shins still smart from all the things you barked them on :) Unfortunately I'm temporarily on the other side of this problem; all the time I have to spare right now is going to into reviewing other people's patches. Even more unfortunately, I'm not 100% happy with what I'm doing, and certainly wouldn't want to present the shoddy techniques I'm using as recommended. If anyone else has useful info on this subject (keeping a local repository in sync with HEAD via rsync while working on branches) they'd like to pass along, please drop me a message with whatever level of documentation you might have--even rough notes would be a help. Once I'm finished with the reviews I'm trying to contribute, I will sort through any suggestions I get and turn that into formal documentation. This has left enough shin damage that I'd enjoy completely slaying the cause. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Planning large IN lists
Tom Lane wrote: Atul Deopujari [EMAIL PROTECTED] writes: Yes, letting the planner make its own decision would seem best (in accordance with what we do for different join paths). But for large IN lists, a substantial part of the planner is spent in estimating the selectivity of the ScalarArrayExpr by calling scalararraysel. If we are not eliminating this step in processing the IN list then we are not doing any optimization. Asking the planner to do scalararraysel and also compute cost of any other way and choose between the two is asking planner to do more work. So? Better planning usually involves more work. In any case the above argument seems irrelevant, because making scalararraysel more approximate and less expensive for long lists could be done independently of anything else. Got you and agree with you. -- Atul EnterpriseDB www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster