Re: [HACKERS] Status of Hierarchical Queries
Gavin Sherry [EMAIL PROTECTED] writes:

> On Thu, 22 Feb 2007, Gregory Stark wrote:
>> But in a simple recursive tree search you have a node which wants to do a
>> join between the output of tree level n against some table to produce tree
>> level n+1. It can't simply execute the plan to produce tree level n since
>> that's the same tree it's executing itself. If it calls the Init method on
>> itself it'll lose all its state.
>>
>> There's another reason it can't just execute the previous node. You really
>> don't want to recompute all the results for level n when you go to produce
>> level n+1. You want to keep them around from the previous iteration.
>> Otherwise you have an n^2 algorithm.
>
> Right. When I've spent some idle cycles thinking through this in the past I
> figured that in a non-trivial query, we'd end up with a bunch of
> materialisations, one for each level of recursion. That sounds very ugly.

Well, as long as you have precisely one for each level of recursion I think you're doing ok. The problem is that if you do it the naive way you calculate the first level, then for the second level you recalculate the first level again, then for the third level you recalculate both of the previous two, ... So you end up with n copies of the first level, n-1 copies of the second level, ...

If you reuse the result sets for subsequent recursive calls then you actually only need to keep the nth level around until you're done generating the n+1 level. The trick is being able to have two different call sites in the plan tree pulling records out of the Materialize node at different points in the result set. That currently isn't possible.

--
Gregory Stark
EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at
       http://www.postgresql.org/about/donate
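The cost difference Gregory describes can be sketched outside the executor. The following is an illustrative Python model (not PostgreSQL code; the `edges` table and all function names are invented for the example): computing each recursion level once and keeping only the previous level's result set is the moral equivalent of two readers sharing one Materialize node.

```python
# Toy "tree search": edges maps a node to its children. Producing level n+1
# is a "join" of level n against this table.
edges = {1: [2, 3], 2: [4], 3: [5, 6], 4: [], 5: [], 6: []}

def next_level(level, counter):
    """Join one level against `edges` to produce the next, counting probes."""
    out = []
    for node in level:
        counter[0] += 1          # one probe of the join relation per input row
        out.extend(edges[node])
    return out

def search_naive(root, max_depth):
    """Recompute levels 0..d-1 from scratch for every level d: O(n^2) probes."""
    counter = [0]
    result = []
    for d in range(max_depth + 1):
        level = [root]
        for _ in range(d):                  # re-derives all earlier levels
            level = next_level(level, counter)
        result.extend(level)
    return result, counter[0]

def search_memoized(root, max_depth):
    """Keep the previous level until level n+1 is generated: O(n) probes."""
    counter = [0]
    level, result = [root], [root]
    for _ in range(max_depth):
        level = next_level(level, counter)  # only the latest level is retained
        result.extend(level)
    return result, counter[0]

rows_a, probes_a = search_naive(1, 2)
rows_b, probes_b = search_memoized(1, 2)
assert sorted(rows_a) == sorted(rows_b) == [1, 2, 3, 4, 5, 6]
assert probes_b < probes_a               # memoized does strictly less work
```

With a deeper tree the gap widens quadratically, which is the n^2 behaviour the naive plan-re-execution approach would exhibit.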
Re: [HACKERS] SCMS question
Warren Turkal [EMAIL PROTECTED] writes:
> On Thursday 22 February 2007 00:42, you wrote:
>> I think you just made my point for me.
> I wasn't trying to convince so much as get an opinion.

Well, sure, it's all opinion ;-). But the overall costs of changing SCMS are pretty enormous IMHO. We're not going to do it just to find out if the grass might be greener on the other side of the fence. If you'd like to see it happen then you need to make some convincing arguments ... starting with recommending one specific SCMS that we should change to. If you haven't got a clear and defensible opinion on that, then you've already lost my interest.

			regards, tom lane
Re: [HACKERS] SCMS question
Tom Lane [EMAIL PROTECTED] writes:
> Warren Turkal [EMAIL PROTECTED] writes:
>> On Thursday 22 February 2007 00:05, Tom Lane wrote:
>>> Not particularly. We keep hearing from various advocates that
>>> $foo-is-better-than-CVS, but the preferred value of $foo changes with
>>> amazing frequency, and none of the arguments seem to justify the pain of
>>> converting.
>>
>> Some of the other options just seem to have much nicer user interfaces. I
>> was playing with Bacula and they just changed to Subversion. It really is
>> much nicer than the CVS they used to use. Git seems interesting as well. I
>> guess Subversion and git are the two big ones right now. What would you
>> look for if you were to check out new SCMSes? Would you want distributed
>> like Git or centralized like CVS/Subversion?
>
> I think you just made my point for me.

Not really. That's not a preferred value of $foo changing so much as different styles of systems.

If we want to minimize the pain of changing and keep the same mode of operation, Subversion is definitely the right choice. Its goal was to provide the same operational model as CVS and fix the implementation and architectural problems. There isn't really any other mature SCM system in that style, and I've heard nothing but satisfaction from its users. It has definitely taken over CVS's top spot in free software mindshare.

If we wanted to go to a whole new way of working using a decentralized system, which might fit better with our system of submitting patches, then one of those other systems like Git might be better. While I think our system of working independently and submitting patches actually fits Git better, if we're so conservative that we're still on CVS despite its problems, I suspect we're better off not trying to change operational models at this point.

--
Gregory Stark
EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] SCMS question
Gregory Stark [EMAIL PROTECTED] writes:
> If we want to minimize the pain of changing and keep the same mode of
> operation Subversion is definitely the right choice. Its goal was to
> provide the same operational model as CVS and fix the implementation and
> architectural problems.

Erm ... but this is not an argument in favor of changing. AFAIR the only real disadvantage of CVS that we've run up against is that it's hard to shuffle files around to different directories without losing their change history (or more accurately, making the history harder to find). Now that is a pretty considerable annoyance on some days, but it's not sufficient reason to change to something else. I have no doubt that every other SCMS has annoyances of its own.

> ... if we're so conservative that we're still on CVS despite its problems I
> suspect we're better off not trying to change operational models at this
> point.

Conservatism is kind of inherent in our problem domain, no? I mean, you might have great arguments why XYZ is the best operating system since sliced bread and everyone should migrate to it immediately, and you might even be right --- but you'd be foolish to expect quick uptake by the average DBA. There is great value in being familiar with one's tools.

			regards, tom lane
Re: [HACKERS] --enable-xml instead of --with-libxml?
On 2/21/07, Alvaro Herrera [EMAIL PROTECTED] wrote:
> I think it would be better that leaving --with-libxml out (i.e. compiling
> without libxml2 support) would only disable those parts in XML
> functionality that require libxml2 for their implementation; the rest of
> the stuff should be compiled in regardless of the setting. Is this not
> what is done currently?

The thing is that some functions of the XML support are based on libxml2, and some are not. libxml2 contains useful routines to deal with XML data. Right now XMLELEMENT uses such routines and XMLPI doesn't. Actually, all SQL/XML publishing functions could be implemented without libxml2 -- but it's more convenient to use those routines in some cases... And there is no guarantee that functions that don't currently use libxml2 will not use it in future.

What I want to propose is just simplification -- consider all XML stuff as one package, including the XML type, SQL/XML publishing, XPath funcs, additional publishing functions recently added by Peter (btw, who knows -- maybe libxml2 will help to improve them somehow in future?), etc.

--
Best regards,
Nikolay
Re: [HACKERS] autovacuum next steps, take 2
> vacuum should be a process with the least amount of voodoo. If we can just
> have vacuum_delay and vacuum_threshold, where threshold allows an
> arbitrary setting of how much bandwidth we will allot to the process, then
> that is a beyond wonderful thing. It is easy to determine how much IO you
> have, and what you can spare.

The tricky part is what metric to use. IMHO IOs per second would be good. In a typical DB scenario that is the IO bottleneck, not the MB/s.

Andreas
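Expressed as a throttle, Andreas's metric is straightforward to model. Below is an illustrative Python sketch (not actual autovacuum code; `io_budget_per_sec`, `batch`, and the scheduling rule are invented for the example) of pacing a page-scanning loop to a fixed number of IOs per second rather than a byte rate:

```python
def paced_vacuum(pages, io_budget_per_sec, batch=10):
    """Return (page, delay_before_page) pairs such that the long-run rate
    never exceeds io_budget_per_sec page reads per second: after every
    `batch` reads, schedule a sleep long enough for that batch's quota."""
    schedule = []
    ios_since_sleep = 0
    for page in pages:
        delay = 0.0
        if ios_since_sleep >= batch:
            # `batch` IOs are allowed to take batch/io_budget_per_sec seconds.
            delay = batch / io_budget_per_sec
            ios_since_sleep = 0
        schedule.append((page, delay))
        ios_since_sleep += 1
    return schedule

# 100 pages at a budget of 50 IOs/sec: 9 sleeps of 0.2s each are scheduled
# (no sleep is needed before the first batch).
sched = paced_vacuum(range(100), io_budget_per_sec=50, batch=10)
total_sleep = sum(d for _, d in sched)
assert len(sched) == 100
assert abs(total_sleep - 1.8) < 1e-9
```

A byte-rate throttle would instead count tuple sizes; counting page reads matches the observation that seeks, not bandwidth, are the bottleneck on a typical database.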
Re: [HACKERS] SCMS question
Tom Lane [EMAIL PROTECTED] writes:
> Gregory Stark [EMAIL PROTECTED] writes:
>> If we want to minimize the pain of changing and keep the same mode of
>> operation Subversion is definitely the right choice. Its goal was to
>> provide the same operational model as CVS and fix the implementation and
>> architectural problems.
>
> Erm ... but this is not an argument in favor of changing. AFAIR the only
> real disadvantage of CVS that we've run up against is that it's hard to
> shuffle files around to different directories without losing their change
> history (or more accurately, making the history harder to find). Now that
> is a pretty considerable annoyance on some days, but it's not sufficient
> reason to change to something else. I have no doubt that every other SCMS
> has annoyances of its own.

Oh, we have tons of problems with CVS; it's just that we've worked around them for so long we've forgotten.

Why are side projects like bitmapped indexes and the variable varlena stuff sitting on people's personal hard drives instead of in a branch of the main tree? It makes it awfully hard for developers to collaborate if they have to mail patches back and forth, merging conflicts manually and constantly negotiating what version of Postgres the patches are against.

Why are so few people committers? The normal work-flow on other projects is that you want any substantial changes in the revision control system as soon as possible so other people can see and work with them and use the revision control tools to manage them. The review process can either back them out, or keep the production code on a separate branch and merge in the changes when they're approved.

The answer to both questions is that CVS limitations make it hard to do better. There are no access control mechanisms, and creating and managing branches is awkward and easy to get wrong. Mailing around patches basically limits us to one-person projects that can be reviewed by a single committer in a single sitting. Any larger, and the people involved have no tools to coordinate, or the committer has to deal with code drift in the main tree during the time he's reviewing the patch.

[On a related note, is something wrong with my cvs rsync tree, or is configure in the CVS repository? It's causing my patches to bloat considerably since I added one line to configure.in]

> Conservatism is kind of inherent in our problem domain, no? I mean, you
> might have great arguments why XYZ is the best operating system since
> sliced bread and everyone should migrate to it immediately, and you might
> even be right --- but you'd be foolish to expect quick uptake by the
> average DBA. There is great value in being familiar with one's tools.

It's also why we have so many posters asking whether it's really necessary for them to upgrade from 7.0, which is working fine for them. Transaction wrap-around only happens once in a blue moon. And the weekly outages for vacuum fulls are scheduled and approved by management, so we don't have to care about them any more.

--
Gregory Stark
EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] --enable-xml instead of --with-libxml?
Nikolay Samokhvalov wrote:
> What I want to propose is just simplification -- consider all XML stuff as
> one package, including XML type, SQL/XML publishing, XPath funcs,
> additional publishing functions recently added by Peter (btw, who knows --
> maybe libxml2 will help to improve them somehow in future?), etc.

Well, in terms of predictable interfaces, there is some merit in your argument.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Re: [HACKERS] SCMS question
Gregory Stark wrote:
> [on a related note, is something wrong with my cvs rsync tree or is
> configure in the CVS repository? It's causing my patches to bloat
> considerably since I added one line to configure.in]

cat CVS/Entries

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Re: [HACKERS] HOT for PostgreSQL 8.3
>> I very much like Hannu's idea, but it does present some issues.
>
> I too liked Hannu's idea initially, but Tom raised a valid concern that it
> does not address the basic issue of root tuples. According to the idea, a
> DEAD root tuple can be used for a subsequent update of the same row.

If you are reusing the existing slot of a root tuple, how is that slot likely to have room for an extra pointer and a live tuple? If the idea does not cover root reuse, we don't need pointers. IMHO we should follow the swing idea. It should even be possible to point the root to the newest dead tuple during update (if it was an index path), and reuse an older dead slot from the chain. Then we can limit the chain to the number of potentially-visible tuples + root + 2 without vacuum.

Andreas
Re: [HACKERS] SCMS question
Peter Eisentraut [EMAIL PROTECTED] writes:
> Gregory Stark wrote:
>> [on a related note, is something wrong with my cvs rsync tree or is
>> configure in the CVS repository? It's causing my patches to bloat
>> considerably since I added one line to configure.in]
>
> cat CVS/Entries

$ cat CVS/Entries
D/config
D/contrib
D/doc
D/src
/configure.in/1.501/Wed Feb 14 11:43:15 2007//
/COPYRIGHT/1.13/Thu Feb 15 14:31:14 2007//
/GNUmakefile.in/1.46/Thu Feb 15 14:31:14 2007//
/Makefile/1.13/Thu Feb 15 14:31:14 2007//
/README/1.32/Thu Feb 15 14:31:14 2007//
/README.CVS/1.3/Thu Feb 15 14:31:14 2007//
/aclocal.m4/1.18/Thu Feb 15 14:31:14 2007//
/configure/1.534/Wed Feb 7 00:28:54 2007//

--
Gregory Stark
EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] SCMS question
On Thu, 22 Feb 2007, Tom Lane wrote:
> Gregory Stark [EMAIL PROTECTED] writes:
>> If we want to minimize the pain of changing and keep the same mode of
>> operation Subversion is definitely the right choice. Its goal was to
>> provide the same operational model as CVS and fix the implementation and
>> architectural problems.
>
> Erm ... but this is not an argument in favor of changing. AFAIR the only
> real disadvantage of CVS that we've run up against is that it's hard to
> shuffle files around to different directories without losing their change
> history (or more accurately, making the history harder to find). Now that
> is a pretty considerable annoyance on some days, but it's not sufficient
> reason to change to something else. I have no doubt that every other SCMS
> has annoyances of its own.

It's not a problem for the project, but I personally experience pain with CVS. I often want to take a bunch of commits and merge them into separate trees (like Greenplum DB or my private bitmap index tree). This is a lot easier with the patch-set-based SCMs like darcs/monotone/git/etc.

Just my thoughts.

Gavin
Re: [HACKERS] Column storage positions
On Wed, 2007-02-21 at 16:57 -0300, Alvaro Herrera wrote:
> Andrew Dunstan escribió:
>> Simon Riggs wrote:
>>> I agree with comments here about the multiple orderings being a horrible
>>> source of bugs, as well as lots of coding even to make it happen at all
>>> http://archives.postgresql.org/pgsql-hackers/2006-12/msg00859.php
>>
>> I thought we were going with this later proposal of Tom's (on which he's
>> convinced me):
>> http://archives.postgresql.org/pgsql-hackers/2006-12/msg00983.php
>> - if not I'm totally confused (situation normal). The current thread
>> started with this sentence: "Inspired by this thread [1], and in
>> particular by the idea of storing three numbers (permanent ID, on-disk
>> storage position, display position) for each column, I spent a little
>> time messing around with a prototype implementation of column storage
>> positions to see what kind of difference it would make." I haven't
>> understood Alvaro to suggest not keeping 3 numbers.
>
> Right, I'm not advocating not doing that -- I'm just saying that the first
> step to that could be decoupling physical position from attr id :-)
>
> Logical column ordering (the order in which SELECT * expands to) seems to
> me to be a different feature.

Not disagreed. :-) Something very, very simple seems most likely to be an effective additional feature for 8.3. We can implement the 2/3-position version for 8.4.

--
Simon Riggs
EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] HOT for PostgreSQL 8.3
On 2/22/07, Zeugswetter Andreas ADI SD [EMAIL PROTECTED] wrote:
>>> I very much like Hannu's idea, but it does present some issues.
>>
>> I too liked Hannu's idea initially, but Tom raised a valid concern that
>> it does not address the basic issue of root tuples. According to the
>> idea, a DEAD root tuple can be used for a subsequent update of the same
>> row.
>
> If you are reusing the existing slot of a root tuple how will that slot
> likely have room for an extra pointer and a live tuple ? If the idea does
> not cover root reuse we don't need pointers.

Hannu talked about using one of xmin/xmax for storing back-pointers. There were objections to that since it breaks the xmax/xmin matching robustness that we have today.

> Imho we should follow the swing idea.

Yes, that's one option. Though given a choice, I would rather waste four bytes in the heap page than insert a new index entry. The heap tuples can be vacuumed rather more easily than the index entries which, if I am not mistaken, can not be reused even after being marked LP_DELETEd.

Thanks,
Pavan

--
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] SCMS question
Tom Lane wrote:
> Gregory Stark [EMAIL PROTECTED] writes:
>> If we want to minimize the pain of changing and keep the same mode of
>> operation Subversion is definitely the right choice. Its goal was to
>> provide the same operational model as CVS and fix the implementation and
>> architectural problems.
>
> Erm ... but this is not an argument in favor of changing. AFAIR the only
> real disadvantage of CVS that we've run up against is that it's hard to
> shuffle files around to different directories without losing their change
> history (or more accurately, making the history harder to find). Now that
> is a pretty considerable annoyance on some days, but it's not sufficient
> reason to change to something else. I have no doubt that every other SCMS
> has annoyances of its own.

Oh, goody! My favourite non-productive debate! :-)

I work daily with SVN, and it certainly has some of the CVS pain points fixed, plus one or two nice gadgets. It's annoyed me a couple of times too, although I can't remember exactly how. Let me throw another couple of data points into the mix.

1. The buildfarm is very heavily dependent on CVS, and any change to anything else will be quite painful. There is no guarantee that all the members even have SVN installed, let alone anything else. And someone would have to code and test significant client changes. That said, a lot of the tortuous logic could be removed - change detection would almost just resolve to comparing two tree numbers with SVN, for example.

2. Many people (and some buildfarm members) operate against mirrors of the main repo which are created with rsync or CVSup. I am not aware of any way to do the equivalent with SVN - any info would be gratefully received. Of course, SVN is better at disconnected operation than CVS, so it might be a non-issue for many. Even so, it might be a pity to have to forego the facility.

I have no doubt we'll change someday to something better. I don't know what it is, and I don't think we need to be in any hurry. This space is still very fluid.

cheers

andrew
Re: [HACKERS] tsearch in core patch, for inclusion
Hi,

Peter Eisentraut wrote:
> Oleg Bartunov wrote:
>> It's not so big addition to the gram.y, see a list of commands
>> http://mira.sai.msu.su/~megera/pgsql/ftsdoc/sql-commands.html.

Since we are still discussing the syntax: is there a proposal for how a function-based syntax would look?

  CREATE FULLTEXT CONFIGURATION myfts LIKE template_cfg AS DEFAULT;

just seems so much more SQL-like than:

  SELECT add_fulltext_config('myfts', 'template_cfg', True);

I admit that's a very simple and not thought-through example. But as long as those who prefer not to extend the grammar don't come up with a better alternative syntax, one easily gets the impression that extending the grammar in general is evil.

> In that proposed syntax, I would drop all =, ,, (, and ). They don't seem
> necessary and they are untypical for SQL commands. I'd compare with CREATE
> FUNCTION or CREATE SEQUENCE for SQL commands that do similar things.

Yup, I'd second that.

Regards

Markus
[HACKERS] Grouped Index Tuples
I've brought the GIT patch up-to-date with CVS head. The latest version can be found at http://community.enterprisedb.com/git/ I also reran the CPU-bound test cases with the latest patch. I want this in 8.3 in some form, and I have the time to do any required changes. If someone wants to see more tests, I can arrange that as well.

The patch is pretty big at the moment. I think the best way to proceed with this is to extract some smaller, incremental patches from it that just refactor the current b-tree code. After that, the final patch that implements GIT should be much smaller and more readable. And there's still a bunch of todo items there as well...

But before I start doing that, I need some review and general agreement on the design. What I don't want to happen is that three days after the feature freeze, someone finally looks at it and finds a major issue or just thinks it's an unreadable mess, and we no longer have the time to fix it.

One question that I'm sure someone will ask is: do we need this if we have bitmap indexes? Both aim at having a smaller index, after all. The use cases are quite different; GIT is effective whenever you have a table that's reasonably well-clustered. Unlike the bitmap indexam, GIT's effectiveness doesn't depend on the number of distinct values; in particular, it works well with unique indexes. GIT is comparable to clustered indexes in other DBMSs (in fact we might want to call GIT that in the end).

--
Heikki Linnakangas
EnterpriseDB  http://www.enterprisedb.com
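For readers unfamiliar with the idea, the space saving can be sketched roughly as follows (an illustrative Python model, not the actual patch; the grouping rule and all names are invented for the example). On a well-clustered table, one "grouped" index tuple can stand in for a run of heap tuples on the same block, trading one index pointer per row for one key range per block that is rechecked at scan time:

```python
def build_grouped_index(entries):
    """entries: (key, block) pairs in key order, as on a clustered table.
    Collapse consecutive entries on the same heap block into one grouped
    tuple (min_key, max_key, block)."""
    groups = []
    for key, block in entries:
        if groups and groups[-1][2] == block:
            lo, _, blk = groups[-1]
            groups[-1] = (lo, key, blk)     # extend the group's key range
        else:
            groups.append((key, key, block))
    return groups

def lookup(groups, heap, key):
    """Find blocks whose key range covers `key`, then recheck on the block."""
    return [
        (blk, k) for lo, hi, blk in groups if lo <= key <= hi
        for k in heap[blk] if k == key
    ]

# A perfectly clustered table: keys 1..8, four rows per heap block.
heap = {0: [1, 2, 3, 4], 1: [5, 6, 7, 8]}
entries = [(k, blk) for blk, keys in heap.items() for k in keys]
groups = build_grouped_index(entries)
assert groups == [(1, 4, 0), (5, 8, 1)]    # 2 grouped tuples instead of 8
assert lookup(groups, heap, 6) == [(1, 6)]
```

On a poorly clustered table every group degenerates to a single entry, which is why the effectiveness depends on clustering rather than on the number of distinct values.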
Re: [HACKERS] HOT for PostgreSQL 8.3
>> Imho we should follow the swing idea.
>
> Yes, thats one option. Though given a choice I would waste four bytes in
> the heap-page than inserting a new index entry.

No question about that. My point was that it would mean wasting the 2 bytes (2 must be enough for a slot pointer) on every heap tuple, hot or not. And then the decision is not so obvious anymore. If you don't have the room for the back pointer on every slot, there is no room to add one later.

Andreas
Re: [HACKERS] SCMS question
Hi,

Andrew Dunstan wrote:
> 1. The buildfarm is very heavily dependent on CVS, and any change to
> anything else will be quite painful. There is no guarantee that all the
> members even have SVN installed,

But you can guarantee they have CVS or even cvsup installed? That seems dubious to me.

> let alone anything else. And someone would have to code and test
> significant client changes. That said, a lot of the tortuous logic could
> be removed - change detection would almost just resolve to comparing two
> tree numbers with SVN, for example.

..and a *real* VCS (as in monotone :-) ) would provide not only that, but give you correctness guarantees, built-in certification of revisions (i.e. each buildfarm member could issue a cert on successful testing) and lightweight branches, so you could much more easily test experimental patches of different authors. Just to name a few additional advantages.

> 2. Many people (and some buildfarm members) operate against mirrors of the
> main repo which are created with rsync or CVSup. I am not aware of any way
> to do the equivalent with SVN - any info would be gratefully received.

You might want to have a look at svk. It can do exactly that. Thomas's blog explains how; see [1].

> Of course, SVN is better at disconnected operation than CVS,

Really? I've dropped Subversion exactly because it sucks big time when disconnected. But again, I'm probably not comparing against CVS...

> I have no doubt we'll change someday to something better. I don't know
> what it is and I don't think we need to be in any hurry. This space is
> still very fluid.

Yup. Good to hear you see it that way. As I understand it, you have good reasons to still be using CVS, but are open to good suggestions. That's a very good thing, but it easily slips by when reading all the criticism and pro-CVS statements. ;-)

Regards

Markus

[1]: Remote Backup Of A Subversion Repository
http://moelhave.dk/2006/07/remote-mirroring-a-subversion-svn-repository/
Re: [HACKERS] SCMS question
Markus Schiltknecht wrote:
> Andrew Dunstan wrote:
>> 1. The buildfarm is very heavily dependent on CVS, and any change to
>> anything else will be quite painful. There is no guarantee that all the
>> members even have SVN installed,
>
> But you can guarantee they have CVS or even cvsup installed? That seems
> dubious to me.

Getting CVS on a box is still way easier than SVN (I don't even want to talk about the more esoteric ones), especially on older and/or special platforms. As someone who operates a large number of buildfarm members, switching to something else would put a large burden (both in terms of installation and configuration changes/upgrades of the buildfarm client) on me for no apparent gain. Besides that - are all of the currently supported platforms officially supported by the proposed SCMSes?

>> let alone anything else. And someone would have to code and test
>> significant client changes. That said, a lot of the tortuous logic could
>> be removed - change detection would almost just resolve to comparing two
>> tree numbers with SVN, for example.
>
> ..and a *real* VCS (as in monotone :-) ) would provide not only that, but
> give you correctness guarantees, built in certification of revisions (i.e.
> each buildfarm member could issue a cert on successful testing) and
> lightweight branches, so you could much easier test experimental patches
> of different authors. Just to name a few additional advantages.

Most of the issues with CVS in that regard have already been worked around (and are therefore solved). But I agree that developers, especially those doing large patches over a long period of time, might gain something from another SCMS. It is not really clear what that SCMS should be, though, or whether it warrants the IMHO enormous switching costs (and the potential disruption in development until that switch is done, which might take days if not weeks).

Stefan
Re: [HACKERS] HOT for PostgreSQL 8.3
On 2/22/07, Zeugswetter Andreas ADI SD [EMAIL PROTECTED] wrote:
>> Yes, thats one option. Though given a choice I would waste four bytes in
>> the heap-page than inserting a new index entry.
>
> No question about that. My point was, that it would mean wasting the 2 (2
> must be enough for a slot pointer) bytes on every heap tuple, hot or not.
> And then the decision is not so obvious anymore. If you don't have the
> room for the back pointer on every slot, there is no room to add one
> later.

Oh yes, I agree. I was referring to the idea of line pointer redirection, which would waste four bytes (for the root line pointer) per hot-update chain. That occurs only when a tuple is HOT-updated, so there is no overhead for normal tuples. Also, since it's outside the tuple header, we don't have issues of additional space wastage because of alignment.

We would need to teach the code to ignore all such pointers which don't point to a real tuple, but only redirect us to another line pointer. We arrive at this line pointer from the index and then get redirected to another line pointer on the same page. Vacuum would need to delay freeing this line pointer until the hot-update chain is dead.

I am waiting for feedback on this since I would like to work on it next. Does anybody see any issue with this approach? Comments please.

Thanks,
Pavan

--
EnterpriseDB http://www.enterprisedb.com
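The redirection Pavan describes can be modeled in a few lines. The following is an illustrative Python sketch of the idea only (not PostgreSQL's actual page layout; the slot encodings are invented for the example): the index keeps pointing at the root slot, and when the root tuple dies, that slot becomes a redirect to the current head of the HOT chain instead of being freed.

```python
# A heap page as a list of line pointers. Each slot is either
# ("tuple", payload) or ("redirect", other_slot_number).
page = [
    ("redirect", 2),          # slot 0: dead root, now redirects within the page
    ("tuple", "old version"), # slot 1: an unrelated normal tuple
    ("tuple", "new version"), # slot 2: current head of the HOT chain
]

def heap_fetch(page, slot):
    """Follow redirects: the index entry still names the root slot."""
    kind, value = page[slot]
    while kind == "redirect":
        kind, value = page[value]   # redirects never leave the page
    return value

# The index entry for the updated row still says slot 0.
assert heap_fetch(page, 0) == "new version"
assert heap_fetch(page, 1) == "old version"
```

This is why vacuum must keep slot 0 alive while the chain lives: freeing it would strand the index entry, while freeing the old tuple versions behind the redirect is safe.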
Re: [HACKERS] tsearch in core patch, for inclusion
> In that proposed syntax, I would drop all =, ,, (, and ). They don't seem
> necessary and they are untypical for SQL commands. I'd compare with CREATE
> FUNCTION or CREATE SEQUENCE for SQL commands that do similar things.

I was looking at CREATE TYPE mostly. With removing =, ,, ( and ) in CREATE/ALTER FULLTEXT, it's needed to add several items to the unreserved_keyword list, and to increase gram.y by adding new rules similar to OptRoleList instead of the simple:

    opt_deflist: '(' def_list ')'   { $$ = $2; }
               | /*EMPTY*/          { $$ = NIL; }
               ;

Is it acceptable? The list of new keywords is: LOCALE, LEXIZE, INIT, OPT, GETTOKEN, LEXTYPES, HEADLINE

So, the syntax will be:

    CREATE FULLTEXT DICTIONARY dictname
        LEXIZE lexize_function
        [ INIT init_function ]
        [ OPT opt_text ];

    CREATE FULLTEXT DICTIONARY dictname
        [ { LEXIZE lexize_function | INIT init_function | OPT opt_text } [...] ]
        LIKE template_dictname;

--
Teodor Sigaev                      E-mail: [EMAIL PROTECTED]
                                   WWW: http://www.sigaev.ru/
[HACKERS] What is CheckPoint.undo needed for?
Hi,

I'm trying to gain a better understanding of how the postgres xlog works - especially the corner cases of WAL replay. One thing that I do not understand is what CheckPoint.undo is used for. I grepped through the source, and only see very few references to it, which either just print it, or set it to the checkpoint location or the redo location.

The only place where it is checked is:

    if (XLByteLT(checkPoint.undo, RecPtr) ||
        XLByteLT(checkPoint.redo, RecPtr))

but I don't see the reason for this check, given that undo never seems to be assigned any interesting value.

Is there any in-depth documentation about the postgres xlog implementation? What am I missing?

greetings, Florian Pflug
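For reference, XLByteLT is just an ordered comparison of two WAL positions; in the sources of this era an XLogRecPtr is a (log file id, byte offset) pair compared lexicographically. A minimal Python model of the check Florian quotes (illustrative only, not the C source; the sample positions are made up):

```python
def xl_byte_lt(a, b):
    """XLByteLT: is WAL position a strictly before position b?
    Positions are (xlogid, xrecoff) pairs; Python's tuple comparison is
    exactly the lexicographic two-field test the C macro performs."""
    return a < b

checkpoint_redo = (3, 0x120000)   # hypothetical checkpoint redo pointer
rec_ptr = (3, 0x450000)           # the checkpoint record's own location

# The recovery check fires when the redo (or undo) pointer lies strictly
# before the checkpoint record itself, i.e. earlier WAL must be replayed.
assert xl_byte_lt(checkpoint_redo, rec_ptr)
assert not xl_byte_lt(rec_ptr, checkpoint_redo)
assert not xl_byte_lt(rec_ptr, rec_ptr)   # strict: equal positions are false
```

Since undo is only ever set to the checkpoint or redo location, the undo half of the disjunction never adds anything, which is Florian's point.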
Re: [HACKERS] tsearch in core patch, for inclusion
>> CREATE FULLTEXT CONFIGURATION myfts LIKE template_cfg AS DEFAULT;
>
>> SELECT add_fulltext_config('myfts', 'template_cfg', True);

That's simple, but what about:

    CREATE FULLTEXT MAPPING ON cfgname
        FOR lexemetypename[, ...]
        WITH dictname1[, ...];

versus:

    SELECT create_fulltext_mapping(cfgname,
        '{lexemetypename[, ...]}'::text[],
        '{dictname1[, ...]}'::text[]);

Seems rather ugly to me... And the function interface does not provide autocompletion and online help in psql: \df says only the types of the arguments, not their meaning.

--
Teodor Sigaev                      E-mail: [EMAIL PROTECTED]
                                   WWW: http://www.sigaev.ru/
Re: [HACKERS] Column storage positions
On 2/22/07, Tom Lane [EMAIL PROTECTED] wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Alvaro Herrera wrote: Right, I'm not advocating not doing that -- I'm just saying that the first step to that could be decoupling physical position with attr id :-) Logical column ordering (the order in which SELECT * expands to) seems to me to be a different feature. Except in the sense that divorcing the id from the storage order makes it possible to do sanely. :-) They are different features, but they are going to hit all the same code, because the hardest part of this remains making sure that every piece of the code is using the right kind of column number. The suggestion I posted awhile ago amounts to saying that we might be able to solve that by default, by making sure that only one definition of column number is relevant to the majority of the backend and we can figure out exactly where the other definitions need to apply. But that's handwaving until someone actually does it :-( I don't really think it's just handwaving at this point because I've done a lot of it :). I'm not saying the work is done, or that a lot more testing isn't required, but at the moment I have a working system that seems to do what it needs to do to separate storage position from permanent ID/display position. And the changes to accomplish this were quite localized - namely the tuple access routines in heaptuple.c, and the small handful of places that need to construct tuple descriptors. That's pretty much it - the rest of the codebase is remains untouched. In any case I think it's foolish not to tackle both issues at once. We know we'd like to have both features and we know that all the same bits of code need to be looked at to implement either. I guess I disagree with that sentiment. 
I don't think it's necessary to bundle these two features together, even if some analysis will be duplicated between them, since they are completely distinct in a functional sense and will touch different places in the code. Smaller, more incremental changes make more sense to me. But if both-features-at-once is what the community wants, that's fine, no worries. I'll just pull my own personal hat out of the ring until someone comes along who's interested in implementing them both at the same time. phil ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
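Phil's localized-change claim lends itself to a small illustration. This is only a sketch with invented names (nothing here resembles the actual backend structures): when only the tuple-forming and tuple-deforming routines know the attnum-to-storage-slot map, everything else can keep addressing columns by their permanent IDs.

```python
# Illustrative sketch only -- classes and names are invented for the example
# and are nothing like the real heaptuple.c data structures.

class TupleDesc:
    def __init__(self, attnums, storage_order):
        # attnums: permanent column IDs (also the display order)
        # storage_order: physical layout, chosen independently,
        # e.g. to pack fixed-width columns first
        assert sorted(attnums) == sorted(storage_order)
        self.storage_order = storage_order
        # permanent ID -> physical slot
        self.slot_of = {a: i for i, a in enumerate(storage_order)}

    def form_tuple(self, values_by_attnum):
        # only this "tuple access routine" knows the physical layout
        return [values_by_attnum[a] for a in self.storage_order]

    def get_attr(self, tup, attnum):
        # callers keep using permanent attnums everywhere
        return tup[self.slot_of[attnum]]

desc = TupleDesc(attnums=[1, 2, 3], storage_order=[3, 1, 2])
t = desc.form_tuple({1: 42, 2: "xyz", 3: True})
print(desc.get_attr(t, 2))   # -> xyz, regardless of where it is stored
```

The point of the sketch is the containment Phil describes: changing `storage_order` never touches any caller, only the two accessor routines.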
Re: [HACKERS] What is CheckPoint.undo needed for?
What am I missing? Seems it's about this: http://archives.postgresql.org/pgsql-committers/2005-06/msg00085.php -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] SCMS question
Hi, [ I've CCed the monotone-devel list, as I'm sure those people are interested, too. ] Stefan Kaltenbrunner wrote: Besides that - are all of the currently supported Platforms officially supported by the proposed SCMSes ? I can only speak for monotone. We have (had) buildbots for x86 (linux, netbsd, freebsd, win32), amd64 (linux), ppc (osx) and one sparc (osol). So far all gcc compiled, AFAIK. We are very interested in increasing the portability of monotone. If you could give me (or other monotone devels) ssh access to some of the more obscure boxes, that would help a lot. Please contact me privately. most of the issues with CVS in that regard have already been worked around (and are therefore solved). Huh? How do you guarantee the correctness of a local checkout? At best, you can check an md5 sum of a tar archive, but CVS itself does almost no integrity checking. Does the buildfarm code check somehow? Against what? (Note that we've already had quite a few disk failures uncovered by monotone, which does extensive integrity checking. But I'm sure database people know how important that is, don't you?) Or quickly test experimental patches? Is that solved? Or merging between branches, to add another major annoyance of CVS (and subversion, for that matter). I currently fetch the whole PostgreSQL repository via cvsup and then import it into monotone to be able to do serious work. Of course that's possible, and you can work around all the other limitations of CVS somehow, but it's annoying. But I agree that developers, especially those doing large patches over a long period of time, might gain something from another SCMS, but it is not really clear what that SCMS should be or if it warrants the imho enormous switching costs (and the potential disruption in development until that switch is done, which might take days if not weeks). I certainly agree that switching to another VCS is a major undertaking. And I'm working on easing migration to monotone. 
And I'll quite certainly try to convince you again, *at some point in the future*. I would not vote for switching the PostgreSQL repository to monotone, yet. (As if I had a vote...;-) ) Regards Markus ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] What is CheckPoint.undo needed for?
Oops, sorry, I missed the checkpoint keyword. Teodor Sigaev wrote: What am I missing? Seems, it's about that http://archives.postgresql.org/pgsql-committers/2005-06/msg00085.php -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/ ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] autovacuum next steps, take 2
On Wed, Feb 21, 2007 at 05:40:53PM -0500, Matthew T. O'Connor wrote: My Proposal: If we require admins to identify hot tables, then:
1) Launcher fires off a worker1 into database X.
2) worker1 deals with hot tables first, then regular tables.
3) Launcher continues to launch workers to DB X every autovac naptime.
4) worker2 (or 3 or 4 etc...) sees it is alone in DB X; if so, it acts as worker1 did above. If worker1 is still working in DB X then worker2 looks for hot tables that are being starved because worker1 got busy. If worker2 finds no hot tables that need work, then worker2 exits.
Rather than requiring people to manually identify hot tables, what if we just prioritize based on table size? So if a second autovac process hits a specific database, it would find the smallest table in need of vacuuming that it should be able to complete before the next naptime and vacuum that. It could even continue picking tables until it can't find one that it could finish within the naptime. Granted, it would have to make some assumptions about how many pages it would dirty. ISTM that's a lot easier than forcing admins to mark specific tables. -- Jim Nasby [EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] SCMS question
Markus Schiltknecht wrote: Hi, Andrew Dunstan wrote: 1. The buildfarm is very heavily dependent on CVS, and any change to anything else will be quite painful. There is no guarantee that all the members even have SVN installed, But you can guarantee they have CVS or even cvsup installed? That seems dubious to me. CVSup is not required, and is absent from most existing clients. I don't use it any more since the Fedora project stopped supporting it. Buildfarm was designed to be able to run anywhere that a build from our repo could run, without requiring anything extra - I have even tried to keep to a minimum the perl modules required. The point you are missing is that, while we know existing buildfarm members all have CVS installed, we don't know that they have SVN or whatever, and requiring them to install it will involve significant distributed pain. It will also involve some considerable localised pain (probably on my part) in rewriting the client. Right now I'm thinking it might make some sense to future-proof buildfarm by creating some sort of snapshot server. OTOH, if we avoid use of whatever SCM system that the project uses, we aren't testing that part of the process. let alone anything else. And someone would have to code and test significant client changes. That said, a lot of the tortuous logic could be removed - change detection would almost just resolve to comparing two tree numbers with SVN, for example. ..and a *real* VCS (as in monotone :-) ) would provide not only that, but give you correctness guarantees, built in certification of revisions (i.e. each buildfarm member could issue a cert on successful testing) and lightweight branches, so you could much easier test experimental patches of different authors. Just to name a few additional advantages. You're making Tom's point again :-) Of course, SVN is better at disconnected operation than CVS, Really? I've dropped subversion exactly because it sucks big time when disconnected. 
But again, I'm probably not comparing against CVS... IIRC you don't need to be connected to the repo to run svn diff, whereas you do to run cvs diff. I have no doubt we'll change someday to something better. I don't know what it is and I don't think we need to be in any hurry. This space is still very fluid. Yup. Good to hear you see it that way. As I understand it, you have good reasons to still be using CVS, but are open to good suggestions. That's a very good thing, but it easily slips by when reading all the criticism and pro-CVS statements. ;-) We know the warts. If this were a green fields project there is no doubt we would not use CVS. But many proponents of other systems ignore the downside of changing. One thing I want to know is that whatever we change to will still be there, maintained and in widespread use, many years down the track. So far I am not sure about that for any possible replacement, with the possible exception of SVN. cheers andrew ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] autovacuum next steps, take 2
Jim C. Nasby wrote: On Wed, Feb 21, 2007 at 05:40:53PM -0500, Matthew T. O'Connor wrote: My Proposal: If we require admins to identify hot tables, then:
1) Launcher fires off a worker1 into database X.
2) worker1 deals with hot tables first, then regular tables.
3) Launcher continues to launch workers to DB X every autovac naptime.
4) worker2 (or 3 or 4 etc...) sees it is alone in DB X; if so, it acts as worker1 did above. If worker1 is still working in DB X then worker2 looks for hot tables that are being starved because worker1 got busy. If worker2 finds no hot tables that need work, then worker2 exits.
Rather than requiring people to manually identify hot tables, what if we just prioritize based on table size? So if a second autovac process hits a specific database, it would find the smallest table in need of vacuuming that it should be able to complete before the next naptime and vacuum that. It could even continue picking tables until it can't find one that it could finish within the naptime. Granted, it would have to make some assumptions about how many pages it would dirty. ISTM that's a lot easier than forcing admins to mark specific tables. So the heuristic would be:
* Launcher fires off workers into a database at a given interval (perhaps configurable?)
* Each worker works on tables in size order.
* If a worker ever catches up to an older worker, then the younger worker exits.
This sounds simple and workable to me; perhaps we can later modify this to include some max_workers variable so that a worker would only exit if it catches an older worker and there are max_workers currently active. Thoughts? ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
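The heuristic above can be put as a toy sketch (all names and numbers are invented; the real autovacuum code looks nothing like this): a younger worker claims tables smallest-first and exits as soon as it reaches the size an older, still-running worker is already on.

```python
# Toy model of the proposed scheduling rule; purely illustrative.

def younger_worker_plan(tables, older_worker_positions):
    """tables: (name, size) pairs still needing vacuum in this database.
    older_worker_positions: table sizes the older workers are currently on.
    Returns the tables this worker would vacuum before exiting."""
    catch_up_at = min(older_worker_positions, default=float("inf"))
    plan = []
    for name, size in sorted(tables, key=lambda t: t[1]):
        if size >= catch_up_at:
            break            # caught an older worker: let it finish the rest
        plan.append(name)
    return plan

tables = [("big_log", 500_000), ("users", 5_000), ("hot_queue", 40)]
# an older worker is already grinding through the 5,000-page table,
# so the younger one handles only the small, frequently-updated table
print(younger_worker_plan(tables, [5_000]))   # -> ['hot_queue']
```

With no older worker present, the same function simply vacuums everything in size order, which matches the "worker2 sees it is alone in DB X" case.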
Re: [HACKERS] What is CheckPoint.undo needed for?
No, you're right, it's related to the WAL undo stuff that was never actually implemented. It's dead code. Teodor Sigaev wrote: Oops, sorry, I missed the checkpoint keyword Teodor Sigaev wrote: What am I missing? Seems, it's about that http://archives.postgresql.org/pgsql-committers/2005-06/msg00085.php -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] What is CheckPoint.undo needed for?
Heikki Linnakangas wrote: No, you're right, it's related to the WAL undo stuff that was never actually implemented. It's dead code. Teodor Sigaev wrote: Oops, sorry, I missed the checkpoint keyword Teodor Sigaev wrote: What am I missing? Seems, it's about that http://archives.postgresql.org/pgsql-committers/2005-06/msg00085.php Thanks - everything makes much more sense now ;-) greetings, Florian Pflug ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] What is CheckPoint.undo needed for?
Florian G. Pflug wrote: Heikki Linnakangas wrote: No, you're right, it's related to the WAL undo stuff that was never actually implemented. It's dead code. Teodor Sigaev wrote: Oops, sorry, I missed the checkpoint keyword Teodor Sigaev wrote: What am I missing? Seems, it's about that http://archives.postgresql.org/pgsql-committers/2005-06/msg00085.php Thanks - everything makes much more sense now ;-) So if you want to submit a patch to remove it, it'll make sense for future readers as well ;-) -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] SCMS question
Hi, Andrew Dunstan wrote: CVSup is not required, and is absent from most existing clients. I don't use it any more since the Fedora project stopped supporting it. ..which is quite understandable, considering the PITA compiling modula-3 gives you (or at least has given me, it still hurts). The point you are missing is that, while we know existing buildfarm members all have CVS installed, we don't know that they have SVN or whatever, and requiring them to install it will involve significant distributed pain. Okay, I certainly agree that CVS is much more widespread and available than most (if not all) other VCSes. Let's change that ;-) It will also involve some considerable localised pain (probably on my part) in rewriting the client. Right now I'm thinking it might make some sense to future-proof buildfarm by creating some sort of snapshot server. OTOH, if we avoid use of whatever SCM system that the project uses, we aren't testing that part of the process. Did I mention that monotone is a very good snapshot server? *duck* You probably don't want to reinvent the wheel, as 'snapshot serving' is exactly what a VCS should do (among other things). You're making Tom's point again :-) Yeah, sorry, couldn't resist :-) IIRC you don't need to be connected to the repo to run svn diff, whereas you do to run cvs diff. Yes, in the simplest case of comparing against the immediate successor revision. But certainly not for: svn diff -r${FURTHER_IN_THE_PAST}, as subversion does not have that data available (nor does CVS, for that matter). We know the warts. If this were a green fields project there is no doubt we would not use CVS. But many proponents of other systems ignore the downside of changing. Well, I guess many advocates for other VCSes (like myself) simply don't particularly like to talk about the downsides... But they are probably more aware of them than most other people. 
One thing I want to know is that whatever we change to will still be there, maintained and in widespread use, many years down the track. So far I am not sure about that for any possible replacement, with the possible exception of SVN. That's certainly a valid concern, too. Probably *the* one where monotone is weaker compared to git and mercurial. :-( We are working on that issue, though. Regards Markus ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] tsearch in core patch, for inclusion
Teodor Sigaev wrote: In that proposed syntax, I would drop all =, ,, (, and ). They don't seem necessary and they are untypical for SQL commands. I'd compare with CREATE FUNCTION or CREATE SEQUENCE for SQL commands that do similar things. I was looking at CREATE TYPE mostly. Removing =, ,, (, and ) in CREATE/ALTER FULLTEXT requires adding several items to the unreserved_keyword list, and growing gram.y by adding new rules similar to OptRoleList instead of the simple opt_deflist: '(' def_list ')' { $$ = $2; } | /*EMPTY*/ { $$ = NIL; } ; Is it acceptable? The list of new keywords is: LOCALE, LEXIZE, INIT, OPT, GETTOKEN, LEXTYPES, HEADLINE So, the syntax will be CREATE FULLTEXT DICTIONARY dictname LEXIZE lexize_function [ INIT init_function ] [ OPT opt_text ]; CREATE FULLTEXT DICTIONARY dictname [ { LEXIZE lexize_function | INIT init_function | OPT opt_text } [...] ] LIKE template_dictname; If we are worried about the size of the transition table and keeping it in cache (see remarks from Tom upthread) then adding more keywords seems a bad idea, as it will surely expand the table. OTOH, I'd hate to make that a design criterion. My main worry has been that the grammar would be stable. Just to quantify all this, I did a quick check on the grammar using bison -v - we appear to have 473 terminal symbols, and 420 non-terminal symbols in 1749 rules, generating 3142 states. The biggest tables generated are yytable and yycheck, each about 90kb on my machine. cheers andrew ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] XLOG_NO_TRAN and XLogRecord.xl_xid
Hi After further reading I fear I have to bother you with another question ;-) There is a flag XLOG_NO_TRAN passed via the info parameter to XLogInsert. Now, for example, the following comment in clog.c:

/*
 * Write a TRUNCATE xlog record
 *
 * We must flush the xlog record to disk before returning --- see notes
 * in TruncateCLOG().
 *
 * Note: xlog record is marked as outside transaction control, since we
 * want it to be redone whether the invoking transaction commits or not.
 */
static void WriteTruncateXlogRec(int pageno) ...

seems to imply that (some?) WAL redo records only actually get redone if the transaction that caused them eventually committed. But given the way postgres MVCC works that doesn't make sense to me, and I also can't find any code that would actually skip xlog entries. On a related note - looking at e.g. heap_xlog_insert, it seems that the original page (before the crash), and the one reconstructed via heap_xlog_insert are only functionally equivalent, but not the same byte-wise? At least this is what doing HeapTupleHeaderSetCmin(htup, FirstCommandId); seems to imply - surely the original command id could have been higher, no? greetings, Florian Pflug ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] XLOG_NO_TRAN and XLogRecord.xl_xid
Florian G. Pflug wrote: Hi After further reading I fear I have to bother you with another question ;-) There is a flag XLOG_NO_TRAN passed via the info parameter to XLogInsert. Now, for example the following comment in clog.c /* * Write a TRUNCATE xlog record * * We must flush the xlog record to disk before returning --- see notes * in TruncateCLOG(). * * Note: xlog record is marked as outside transaction control, since we * want it to be redone whether the invoking transaction commits or not. */ static void WriteTruncateXlogRec(int pageno) ... seems to imply that (some?) WAL redo records only actually get redone if the transaction that caused them eventually committed. But given the way postgres MVCC works that doesn't make sense to me, and I also can't find any code that would actually skip xlog entries. That comment is a bit misleading, I agree. We don't skip xlog entries, they're always replayed. The xid in the WAL record is used by some WAL resource managers to reconstruct the original data. For that purpose, it might as well not be in the header, but in the data portion. It's also used in PITR to recover up to a certain transaction, and it's used to advance the next xid counter to the next unused xid after replay. On a related note - Looking at e.g. heap_xlog_insert, it seems that the original page (before the crash), and the one reconstructed via heap_xlog_insert are only functionally equivalent, but not the same byte-wise? At least this is what doing HeapTupleHeaderSetCmin(htup, FirstCommandId); seems to imply - surely the original command id could have been higher, no? Yep, that's right. The reconstructed page is not always byte-to-byte identical to the original. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] XLOG_NO_TRAN and XLogRecord.xl_xid
Heikki Linnakangas wrote: Florian G. Pflug wrote: seems to imply that (some?) WAL redo records only actually get redone if the transaction that caused them eventually committed. But given the way postgres MVCC works that doesn't make sense to me, and I also can't find any code that would actually skip xlog entries. That comment is a bit misleading, I agree. We don't skip xlog entries, they're always replayed. The xid in the WAL record is used by some WAL resource managers to reconstruct the original data. For that purpose, it might as well not be in the header, but in the data portion. It's also used in PITR to recover up to a certain transaction, and it's used to advance the next xid counter to the next unused xid after replay. Also, we skip the clog update and writing the commit record if the transaction hasn't written any WAL records that are tied to the transaction. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
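Heikki's last remark can be sketched as a toy model (invented names; nothing like the real commit path): a transaction that emitted no transactional WAL records skips both the commit record and the clog update.

```python
# Illustrative-only model of skipping the commit record for transactions
# that wrote no transactional WAL.

def commit_transaction(xact, wal, clog):
    if xact["wrote_xlog"]:
        wal.append(("XLOG_XACT_COMMIT", xact["xid"]))
        clog[xact["xid"]] = "COMMITTED"
    # otherwise: nothing needs to be made durable, and no tuple carries
    # this xid, so nobody will ever look it up in clog -- skip both steps

wal, clog = [], {}
commit_transaction({"xid": 100, "wrote_xlog": False}, wal, clog)  # read-only
commit_transaction({"xid": 101, "wrote_xlog": True}, wal, clog)
print(wal, clog)   # only xid 101 leaves a trace
```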
Re: [HACKERS] XLOG_NO_TRAN and XLogRecord.xl_xid
Heikki Linnakangas [EMAIL PROTECTED] writes: Florian G. Pflug wrote: * Note: xlog record is marked as outside transaction control, since we * want it to be redone whether the invoking transaction commits or not. That comment is a bit misleading, I agree. We don't skip xlog entries, they're always replayed. Yeah, this distinction is another bit of effectively-dead code left over from Vadim's original plan of using WAL for UNDO. I haven't worried about ripping it out because it doesn't cost much and it seems that distinguishing transactional from nontransactional changes might be useful for log analysis if nothing else. Yep, that's right. The reconstructed page is not always byte-to-byte identical to the original. We don't worry about recovering cmin/cmax since only the originating transaction would have cared. I think physical location of tuples on a page isn't reliably reproduced either. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
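A toy illustration of that last point (field names invented; nothing here resembles the real on-disk format): replaying an insert restores the data and xmin from the WAL record, but cmin is simply reset to FirstCommandId, so the rebuilt tuple is functionally equivalent without being byte-identical.

```python
# Illustrative sketch only; the dict "tuples" stand in for heap tuples.

FIRST_COMMAND_ID = 0

def replay_heap_insert(wal_record):
    # the WAL record carries the xid and the tuple data, but not cmin
    return {"xmin": wal_record["xid"],
            "cmin": FIRST_COMMAND_ID,
            "data": wal_record["data"]}

original = {"xmin": 1001, "cmin": 7, "data": b"hello"}      # pre-crash tuple
rebuilt = replay_heap_insert({"xid": 1001, "data": b"hello"})

assert rebuilt["data"] == original["data"]
assert rebuilt["xmin"] == original["xmin"]
assert rebuilt["cmin"] != original["cmin"]   # only the originating xact cared
```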
Re: [HACKERS] tsearch in core patch, for inclusion
CREATE FULLTEXT CONFIGURATION myfts LIKE template_cfg AS DEFAULT; SELECT add_fulltext_config('myfts', 'template_cfg', True); That's simple, but what about CREATE FULLTEXT MAPPING ON cfgname FOR lexemetypename[, ...] WITH dictname1[, ...]; ? SELECT create_fulltext_mapping(cfgname, '{lexemetypename[, ...]}'::text[], '{dictname1[, ...]}'::text[]); Seems rather ugly to me... The function interface may not be as elegant, but users don't have to learn new syntax. SELECT create_fulltext_mapping(cfgname, ARRAY['lex..','..'], ARRAY['...']) is readable. I agree that enhancing the parser with nonstandard constructs isn't good. And the function interface does not provide autocompletion and online help in psql. \df shows only the types of the arguments, not their meaning. Yes, I miss better support for functions in psql too. But that's a different topic. I don't see a reason why \h couldn't support functions better. Have a nice day Pavel Stehule _ Emoticons and backgrounds will liven up your MSN Messenger conversations. http://messenger.msn.cz/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] tsearch in core patch, for inclusion
Hi, Andrew Dunstan wrote: If we are worried about the size of the transition table and keeping it in cache (see remarks from Tom upthread) then adding more keywords seems a bad idea, as it will surely expand the table. OTOH, I'd hate to make that a design criterion. Yeah, me too. Especially because it's an implementation concern weighed against ease of use. (Or can somebody convince me that functions would provide a simple interface?) My main worry has been that the grammar would be stable. You mean stability of the grammar for the new additions, or of the grammar as a whole? Why are you worried about that? Just to quantify all this, I did a quick check on the grammar using bison -v - we appear to have 473 terminal symbols, and 420 non-terminal symbols in 1749 rules, generating 3142 states. The biggest tables generated are yytable and yycheck, each about 90kb on my machine. That already sounds somewhat better than Tom's 300 kb. And considering that CPU caches most probably grow faster than our grammar does... Regards Markus ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Column storage positions
On Thursday 22 February 2007 09:06, Phil Currier wrote: On 2/22/07, Tom Lane [EMAIL PROTECTED] wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Alvaro Herrera wrote: Right, I'm not advocating not doing that -- I'm just saying that the first step to that could be decoupling physical position from attr id :-) Logical column ordering (the order in which SELECT * expands) seems to me to be a different feature. Except in the sense that divorcing the id from the storage order makes it possible to do sanely. :-) They are different features, but they are going to hit all the same code, because the hardest part of this remains making sure that every piece of the code is using the right kind of column number. The suggestion I posted awhile ago amounts to saying that we might be able to solve that by default, by making sure that only one definition of column number is relevant to the majority of the backend and we can figure out exactly where the other definitions need to apply. But that's handwaving until someone actually does it :-( I don't really think it's just handwaving at this point, because I've done a lot of it :). I'm not saying the work is done, or that a lot more testing isn't required, but at the moment I have a working system that seems to do what it needs to do to separate storage position from permanent ID/display position. And the changes to accomplish this were quite localized - namely the tuple access routines in heaptuple.c, and the small handful of places that need to construct tuple descriptors. That's pretty much it - the rest of the codebase remains untouched. In any case I think it's foolish not to tackle both issues at once. We know we'd like to have both features and we know that all the same bits of code need to be looked at to implement either. I guess I disagree with that sentiment. 
I don't think it's necessary to bundle these two features together, even if some analysis will be duplicated between them, since they are completely distinct in a functional sense and will touch different places in the code. Smaller, more incremental changes make more sense to me. Can you post a patch of what you have now to -patches? But if both-features-at-once is what the community wants, that's fine, no worries. I'll just pull my own personal hat out of the ring until someone comes along who's interested in implementing them both at the same time. Are you that opposed to working on the display portions as well? You'll be a hero to thousands of mysql users if you do it. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] HOT for PostgreSQL 8.3
Yes, that's one option. Though given a choice, I would rather waste four bytes in the heap page than insert a new index entry. No question about that. My point was that it would mean wasting the 2 (2 must be enough for a slot pointer) bytes on every heap tuple, hot or not. And then the decision is not so obvious anymore. If you don't have the room for the back pointer on every slot, there is no room to add one later. Oh yes, I agree. I was referring to the idea of line pointer redirection, which would waste four bytes (for the root line pointer) per hot-update chain. That occurs only when a tuple is hot-updated. So there is no overhead for normal tuples. I think you are still misunderstanding me, sorry if I am not being clear enough. When the row is hot-updated it is too late. You do not have room in the root for the line pointer. Say a table's tuple size is typically 40 bytes. Your root slot has room for exactly 40 bytes. When the new tuple also (as expected) needs 40 bytes there is no room for the line pointer. Or are you suggesting that vacuum resizes all root tuples to make room for the extra bytes, and only then you can use it? Imho that would not be good. Imho the relink-root-to-newest-dead-tuple-during-update path is more promising, and does not depend on vacuum. If we do reuse dead tuples without vacuum we should probably, as already suggested, decouple the notion of "dead enough for reuse/vacuum" from the global xmin right from the start. Andreas ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
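The space trade-off in this exchange can be put in rough numbers (every figure below is hypothetical, chosen only to mirror the argument): reserving a back pointer in every tuple is paid up front everywhere, while a redirecting root line pointer is paid only per HOT-updated chain.

```python
# Back-of-envelope comparison; all numbers are made up for illustration.

BACK_PTR = 2         # per-tuple slot back pointer (reserve-in-every-tuple idea)
ROOT_LINE_PTR = 4    # redirecting root line pointer (per-chain idea)

n_tuples = 1_000_000
hot_chains = 100_000          # tuples that actually get HOT-updated

# Option A: every tuple, hot or not, carries the back pointer
overhead_a = n_tuples * BACK_PTR               # 2,000,000 bytes

# Option B: only HOT-update chains pay for the redirect
overhead_b = hot_chains * ROOT_LINE_PTR        #   400,000 bytes

print(overhead_a, overhead_b)
```

Which also shows why the retrofit problem matters: option A only works if the two bytes are reserved from the start, since a root slot sized for a 40-byte tuple has no spare room to grow them later.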
Re: [HACKERS] Column storage positions
In any case I think it's foolish not to tackle both issues at once. We know we'd like to have both features and we know that all the same bits of code need to be looked at to implement either. I guess I disagree with that sentiment. I don't think it's necessary to bundle these two features together, even if some analysis will be duplicated between them, since they are completely distinct in a functional sense and will touch different places in the code. I fully agree with Phil here. And I also see a lot of unhappiness from users of system tables when column numbers all over the system tables would not be logical column positions any more. Andreas ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [Monotone-devel] Re: [HACKERS] SCMS question
Hello Richard, you should probably have read the thread on the PostgreSQL -hackers mailing list I've linked to... at least you didn't make Tom's point ;-) Richard Levitte - VMS Whacker wrote: 1. Do you want to stay with CVS or do you want to move to something else? Most PostgreSQL developers currently want to stay with CVS. Only some desperate souls including myself are fiddling with other VCSes. 3. What would you want a replacement to be able to do? That's being debated, with many voices saying: CVS (plus our own hackery) provides all we need. (And be warned again: as soon as you point out an advantage of your favourite VCS, you're making Tom's point. ;-) ) So far, I'm getting the sense that there are a lot of opinions on what replacement system to use, a bit carelessly before having answered the above questions thoroughly. How did you get that impression? I'm currently *using* monotone for Postgres-R development, doing cvs_import and propagating to my branch. And I know others did the same already, too. Regards Markus ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] What is CheckPoint.undo needed for?
Alvaro Herrera wrote: Florian G. Pflug wrote: Heikki Linnakangas wrote: No, you're right, it's related to the WAL undo stuff that was never actually implemented. It's dead code. Teodor Sigaev wrote: Oops, sorry, I missed the checkpoint keyword Teodor Sigaev wrote: What am I missing? Seems, it's about that http://archives.postgresql.org/pgsql-committers/2005-06/msg00085.php Thanks - everything makes much more sense now ;-) So if you want to submit a patch to remove it, it'll make sense for future readers as well ;-) Here you go. I'm not subscribed to pgsql-patches, so I posted this here. greetings, Florian Pflug

Index: src/backend/access/transam/xlog.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.264
diff -a -u -r1.264 xlog.c
--- src/backend/access/transam/xlog.c	14 Feb 2007 05:00:40 -	1.264
+++ src/backend/access/transam/xlog.c	22 Feb 2007 15:47:52 -
@@ -4039,7 +4039,6 @@
 	/* Set up information for the initial checkpoint record */
 	checkPoint.redo.xlogid = 0;
 	checkPoint.redo.xrecoff = SizeOfXLogLongPHD;
-	checkPoint.undo = checkPoint.redo;
 	checkPoint.ThisTimeLineID = ThisTimeLineID;
 	checkPoint.nextXidEpoch = 0;
 	checkPoint.nextXid = FirstNormalTransactionId;
@@ -4698,9 +4697,8 @@
 	wasShutdown = (record->xl_info == XLOG_CHECKPOINT_SHUTDOWN);
 
 	ereport(LOG,
-			(errmsg("redo record is at %X/%X; undo record is at %X/%X; shutdown %s",
+			(errmsg("redo record is at %X/%X; shutdown %s",
 					checkPoint.redo.xlogid, checkPoint.redo.xrecoff,
-					checkPoint.undo.xlogid, checkPoint.undo.xrecoff,
 					wasShutdown ? "TRUE" : "FALSE")));
 	ereport(LOG,
 			(errmsg("next transaction ID: %u/%u; next OID: %u",
@@ -4730,20 +4728,17 @@
 	if (XLByteLT(RecPtr, checkPoint.redo))
 		ereport(PANIC,
 				(errmsg("invalid redo in checkpoint record")));
-	if (checkPoint.undo.xrecoff == 0)
-		checkPoint.undo = RecPtr;
 
 	/*
 	 * Check whether we need to force recovery from WAL. If it appears to
 	 * have been a clean shutdown and we did not have a recovery.conf file,
 	 * then assume no recovery needed.
 	 */
-	if (XLByteLT(checkPoint.undo, RecPtr) ||
-		XLByteLT(checkPoint.redo, RecPtr))
+	if (XLByteLT(checkPoint.redo, RecPtr))
 	{
 		if (wasShutdown)
 			ereport(PANIC,
-					(errmsg("invalid redo/undo record in shutdown checkpoint")));
+					(errmsg("invalid redo record in shutdown checkpoint")));
 		InRecovery = true;
 	}
 	else if (ControlFile->state != DB_SHUTDOWNED)
@@ -5850,10 +5845,9 @@
 	{
 		CheckPoint *checkpoint = (CheckPoint *) rec;
 
-		appendStringInfo(buf, "checkpoint: redo %X/%X; undo %X/%X; "
+		appendStringInfo(buf, "checkpoint: redo %X/%X; "
 						 "tli %u; xid %u/%u; oid %u; multi %u; offset %u; %s",
 						 checkpoint->redo.xlogid, checkpoint->redo.xrecoff,
-						 checkpoint->undo.xlogid, checkpoint->undo.xrecoff,
 						 checkpoint->ThisTimeLineID,
 						 checkpoint->nextXidEpoch, checkpoint->nextXid,
 						 checkpoint->nextOid,

Index: src/bin/pg_controldata/pg_controldata.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/bin/pg_controldata/pg_controldata.c,v
retrieving revision 1.32
diff -a -u -r1.32 pg_controldata.c
--- src/bin/pg_controldata/pg_controldata.c	8 Dec 2006 19:50:53 -	1.32
+++ src/bin/pg_controldata/pg_controldata.c	22 Feb 2007 15:47:52 -
@@ -168,9 +168,6 @@
 	printf(_("Latest checkpoint's REDO location:    %X/%X\n"),
 		   ControlFile.checkPointCopy.redo.xlogid,
 		   ControlFile.checkPointCopy.redo.xrecoff);
-	printf(_("Latest checkpoint's UNDO location:    %X/%X\n"),
-		   ControlFile.checkPointCopy.undo.xlogid,
-		   ControlFile.checkPointCopy.undo.xrecoff);
 	printf(_("Latest checkpoint's TimeLineID:       %u\n"),
 		   ControlFile.checkPointCopy.ThisTimeLineID);
 	printf(_("Latest checkpoint's NextXID:          %u/%u\n"),

Index: src/include/catalog/pg_control.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/catalog/pg_control.h,v
retrieving revision 1.35
diff -a -u -r1.35 pg_control.h
--- src/include/catalog/pg_control.h	5 Jan 2007 22:19:52 -	1.35
Re: [Monotone-devel] Re: [HACKERS] SCMS question
Markus Schiltknecht wrote: Richard Levitte - VMS Whacker wrote: 1. Do you want to stay with CVS or do you want to move to something else? Most PostgreSQL developers currently want to stay with CVS. Only some desperate souls including myself are fiddling with other VCSes. I really don't think this is a correct characterisation. What is true, I think, is that many remain to be convinced that the benefits of any proposed change outweigh the likely pain, and I suspect many are also uncertain about what's happening in the SCM space and prefer to wait until the dust settles some. It's also fair to say that this is a subject about which we usually get much more noise from partisans of other SCM systems than from the relatively small number of people who actually have to maintain the postgresql code. (As Tom has pointed out, our biggest pain point is the occasional wish to move things across directories.) cheers andrew ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] tsearch in core patch, for inclusion
Hi, Pavel Stehule wrote: Functions maybe don't seem effective, but users don't have to learn new syntax. Are you serious? That argument speaks exactly *for* extending the grammar. From other databases, users are used to: CREATE TABLE ... (SQL) CREATE INDEX ... (SQL) CREATE FULLTEXT INDEX ... (Transact-SQL) CREATE TABLE (... FULLTEXT ...) (MySQL) CREATE INDEX ... INDEXTYPE IS ctxsys.context PARAMETERS ... (Oracle Text) And users are constantly complaining that PostgreSQL doesn't have fulltext indexing capabilities (if they don't know about tsearch2), or about how hard it is to use tsearch2. SELECT create_fulltext_mapping(cfgname, ARRAY['lex..','..'], ARRAY['...']) is readable. Hardly. Because it's not like SQL: - it's counter-intuitive to have to SELECT when you want to CREATE something. - it's confusing to have two actions (SELECT create). - why do I have to write ARRAYs to list parameters? - it's not obvious what you're selecting (a return value?) - you have to keep track of the brackets, which can easily get mixed up with two levels of them, especially if the command runs to multiple lines. I agree that enhancing the parser with non-standard constructs isn't good. Generally? Wow! This would mean PostgreSQL would always lag behind other RDBMSes regarding ease of use. Please don't do that! Regards Markus ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [Monotone-devel] Re: [HACKERS] SCMS question
Andrew Dunstan wrote: Markus Schiltknecht wrote: Richard Levitte - VMS Whacker wrote: 1. Do you want to stay with CVS or do you want to move to something else? Most PostgreSQL developers currently want to stay with CVS. Only some desperate souls including myself are fiddling with other VCSes. I really don't think this is a correct characterisation. What is true, I think, is that many remain to be convinced that the benefits of any proposed change outweigh the likely pain, and I suspect many are also uncertain about what's happening in the SCM space and prefer to wait until the dust settles some. I believe that is much more accurate. The reality is, switching to something else will be painful. I would prefer not to be on CVS as well but it would take a lot of work and cvs does what we need it to. Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Column storage positions
Zeugswetter Andreas ADI SD wrote: And I also see a lot of unhappiness from users of system tables when column numbers all over the system tables would not be logical column positions any more. Are you arguing against the feature? Or against the suggested design? I should have thought (without much looking) one possible way to implement it would be to put Oids on pg_attribute for the permanent id, and keep attnum for the (now mutable) logical order, adding a further column for the physical order. cheers andrew ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
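[Editor's note: Andrew's three-number scheme above can be made concrete with a toy sketch. The column names attid/attlognum/attphysnum are hypothetical illustrations, not actual pg_attribute columns; the point is only that identity, display order, and storage order become independent numbers:]

```c
#include <assert.h>

#define NCOLS 3

/* Hypothetical per-attribute numbering: a permanent identity plus
 * independently mutable logical (display) and physical (storage)
 * positions, as proposed above. */
typedef struct ToyAttr
{
    int attid;      /* permanent identity, never changes */
    int attlognum;  /* 1-based logical position, mutable */
    int attphysnum; /* 1-based physical position, mutable */
} ToyAttr;

/* Present values (stored in physical order) in logical order. */
static void
physical_to_logical(const ToyAttr attrs[NCOLS],
                    const int phys_vals[NCOLS],
                    int log_vals[NCOLS])
{
    for (int i = 0; i < NCOLS; i++)
        log_vals[attrs[i].attlognum - 1] = phys_vals[attrs[i].attphysnum - 1];
}
```

A logical reorder (ALTER ... the display position) would then touch only attlognum, while a storage-layout optimization would touch only attphysnum, and dependencies could reference the stable attid.
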
Re: [HACKERS] tsearch in core patch, for inclusion
And users are constantly complaining that PostgreSQL doesn't have fulltext indexing capabilities (if they don't know about tsearch2) or about how hard it is to use tsearch2. SELECT create_fulltext_mapping(cfgname, ARRAY['lex..','..'], ARRAY['...']) is readable. Hardly. Because it's not like SQL: I have to agree here. SELECT create_fulltext_mapping(cfgname, ARRAY['lex..','..'], ARRAY['...']) is readable. Is a total no op. We might as well just leave it in contrib. Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] HOT for PostgreSQL 8.3
On 2/22/07, Zeugswetter Andreas ADI SD [EMAIL PROTECTED] wrote: I think you are still misunderstanding me, sorry if I am not being clear enough. When the row is hot-updated it is too late. You do not have room in the root for the line pointer. I think the term "line pointer" is causing some confusion here. Let me explain the idea again: Each page has a set of line pointers, or item-ids as they are referred to in the code (I shall use the term item-id hereafter). The item-id stores the offset (15 bits), length (15 bits) and two flags, LP_USED and LP_DELETE (2 bits). The root tuple is pointed to by some item-id (say, I1) on the page. When the tuple is HOT-updated, a heap-only tuple is added which is linked to the root tuple by its t_ctid and is pointed to by another item-id, I2, on the page. Now, when the root tuple subsequently becomes DEAD, I am proposing to store I2 in the offset field of I1. The length field in I1 can be set to zero (or some other special value) to mark that I1 is now just a redirection to I2 and does not point to any real tuple, dead or live. The actual storage used by the root tuple is now released (or we can add another item pointer to keep track of it so that it can be reused without vacuum of the page, but let's defer that for the time being). In short, the pointer is NOT stored in the tuple header, but in the item-id. If we do reuse dead tuples without vacuum we should probably, as already suggested, disconnect the "what is dead enough for reuse/vacuum" from global xmin right from the start. I did not get that completely. Can you elaborate on that? Once we find a heap-only DEAD tuple, we remove it from the tuple chain (and thus remove its only reference) and set LP_DELETE on its item-id. When we run out of free space in a page, we search for an item-id whose LP_DELETE is set and whose length is at least equal to the new tuple length. We reuse that item-id and the associated storage for storing the new tuple.
I hope things would be more clear once I post next version of the HOT-patch. But I really appreciate these comments. Thanks, Pavan -- EnterpriseDB http://www.enterprisedb.com
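[Editor's note: the item-id redirection Pavan describes can be sketched as a toy model. This is an illustration only, not PostgreSQL's actual ItemIdData definition or API; the struct name and helper functions are invented, though the field widths follow his description:]

```c
#include <stdint.h>

/* Toy item-id: 15-bit offset, 15-bit length, two flag bits,
 * mirroring the description above (not the real ItemIdData). */
typedef struct ToyItemId
{
    uint32_t lp_off : 15;   /* byte offset of tuple, or redirect target */
    uint32_t lp_len : 15;   /* tuple length; 0 means "redirect" here */
    uint32_t lp_used : 1;   /* LP_USED */
    uint32_t lp_delete : 1; /* LP_DELETE */
} ToyItemId;

/* Turn I1 into a pure redirect to item-id number 'target'.
 * The storage I1 used to point at can now be reclaimed, while
 * indexes keep pointing at I1. */
static void
redirect_itemid(ToyItemId *id, uint16_t target)
{
    id->lp_off = target;  /* reuse the offset field as an item-id number */
    id->lp_len = 0;       /* special value: no real tuple behind this */
}

/* An index scan arriving here must hop to lp_off instead of
 * reading tuple bytes. */
static int
itemid_is_redirect(const ToyItemId *id)
{
    return id->lp_used && id->lp_len == 0;
}
```

The key property this models is that index entries pointing at I1 stay valid after the root tuple dies: a scan reaching a redirecting I1 follows lp_off to I2 rather than reading a dead root tuple.
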
Re: [HACKERS] Column storage positions
And I also see a lot of unhappiness from users of system tables when column numbers all over the system tables would not be logical column positions any more. Are you arguing against the feature? Or against the suggested design? Against the design. I should have thought (without much looking) one possible way to implement it would be to put Oids on pg_attribute for the permanent id, and keep attnum for the (now mutable) logical order, adding a further column for the physical order. Yes, that was the idea (not oid but some number), and I am arguing against it. Imho people are used to seeing the logical position in e.g. pg_index. I know it is a lot of work to update all those dependencies in the system tables to reorder logical position, but that is the path I think should be taken. And the first step in that direction is Phil's patch. Andreas ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] SCMS question
On Thursday 22 February 2007 05:26, Andrew Dunstan wrote: 2. Many people (and some buildfarm members) operate against mirrors of the main repo which are created with rsync or CVSup. I am not aware of any way to do the equivalent with SVN - any info would be gratefully received. Of course, SVN is better at disconnected operation than CVS, so it might be a non-issue for many. Even so, it might be a pity to have to forego the facility. Check out svnsync for SVN. It's part of a normal SVN installation. With git, a checkout brings the whole repository anyway, I think. wt -- Warren Turkal (w00t) ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Column storage positions
Zeugswetter Andreas ADI SD wrote: Yes, that was the idea (not oid but some number), and I am arguing against it. Imho people are used to seeing the logical position in e.g. pg_index Which people are you talking about? In my commercial PG work I hardly ever look at a system table at all, and users shouldn't have to, IMNSHO. If you mean tools developers, then accommodating catalog changes is par for the course, I should think. cheers andrew ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] SCMS question
Andrew Dunstan [EMAIL PROTECTED] writes: 2. Many people (and some buildfarm members) operate against mirrors of the main repo which are created with rsync or CVSup. I am not aware of any way to do the equivalent with SVN - any info would be gratefully received. Of course, SVN is better at disconnected operation than CVS, so it might be a non-issue for many. Even so, it might be a pity to have to forego the facility. Well, SVN basically works by having that mirror all the time. That kind of sucks for people with many checkouts, since it takes more disk space, but it provides the same benefits as having a local mirror of the CVS repository, which is really just working around the problems with CVS. The general point about the build farms is a strong argument in favor of SVN over the new-fangled revision control systems. It would involve the least change in the operational model, and the main build farm maintainer is familiar with it... It's also the easiest to get ahold of. Easier, I would say, than CVS, for which you have to download bug fixes from various unofficial sites to get a good working version. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] tsearch in core patch, for inclusion
And users are constantly complaining that PostgreSQL doesn't have fulltext indexing capabilities (if they don't know about tsearch2) or about how hard it is to use tsearch2. SELECT create_fulltext_mapping(cfgname, ARRAY['lex..','..'], ARRAY['...']) is readable. Hardly. Because it's not like SQL: I have to agree here. SELECT create_fulltext_mapping(cfgname, ARRAY['lex..','..'], ARRAY['...']) is readable. Is a total no-op. We might as well just leave it in contrib. I am for integrating tsearch into core, why not. But I don't see a reason for special syntax. Stored procedures are exactly the right tool for it. Fulltext is standardised in SQL/MM (SQL Multimedia and Application Packages, Part 2: Full-Text). Why implement an extensive proprietary solution? If our solution is proprietary, then at least it is simple and cheap and doesn't complicate future conformance with ANSI SQL. Regards Pavel Stehule ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] tsearch in core patch, for inclusion
Pavel Stehule wrote: And users are constantly complaining that PostgreSQL doesn't have fulltext indexing capabilities (if they don't know about tsearch2) or about how hard it is to use tsearch2. SELECT create_fulltext_mapping(cfgname, ARRAY['lex..','..'], ARRAY['...']) is readable. Hardly. Because it's not like SQL: I have to agree here. SELECT create_fulltext_mapping(cfgname, ARRAY['lex..','..'], ARRAY['...']) is readable. Is a total no op. We might as well just leave it in contrib. I am for integration tsearch to core, why not. But I don't see reason for special syntax. Stored procedures is exactly good tool for it. I am not talking about stored procedures. I am talking about a very ugly, counter intuitive syntax above. Initializing full text should be as simple as: CREATE INDEX foo USING FULLTEXT(bar); (or something similar) Or: CREATE TABLE foo (id serial, names text FULLTEXT); Anything more complicated is a waste of cycles. Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [Monotone-devel] Re: [HACKERS] SCMS question
Andrew Dunstan wrote: It's also fair to say that this is a subject about which we usually get much more noise from partisans of other SCM systems than from the relatively small number of people who actually have to maintain the postgresql code. (As Tom has pointed out, our biggest pain point is the occasional wish to move things across directories.) There are more features we are missing -- we just don't know about them :-) For example, currently if I have a patch and somebody reviews it and opines that I have to change foo to bar; then I resubmit the patch. How do they find out whether I actually changed foo to bar? Currently there are two alternatives: 1. trust that I did it 2. review the whole patch again With a distributed SCM, I could just patch the code and commit a new revision in my branch to just change foo to bar, and then the reviewer can check that I truly did what he wanted. Another easy thing to do is to track the current HEAD in a branch of mine. Keeping patches up to date in parallel with other developments is easier. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] SCMS question
Warren Turkal wrote: On Thursday 22 February 2007 05:26, Andrew Dunstan wrote: 2. Many people (and some buildfarm members) operate against mirrors of the main repo which are created with rsync or CVSup. I am not aware of any way to do the equivalent with SVN - any info would be gratefully received. Of course, SVN is better at disconnected operation than CVS, so it might be a non-issue for many. Even so, it might be a pity to have to forego the facility. Check out svnsync for SVN. It's part of a normal SVN installation. Nifty. Thanks. andrew ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] tsearch in core patch, for inclusion
I am not talking about stored procedures. I am talking about a very ugly, counter-intuitive syntax above. Initializing full text should be as simple as: CREATE INDEX foo USING FULLTEXT(bar); (or something similar) Or: CREATE TABLE foo (id serial, names text FULLTEXT); Anything more complicated is a waste of cycles. Joshua D. Drake I agree. Question: what about multilanguage fulltext? CREATE INDEX foo USING FULLTEXT(bar) [ WITH czech_dictionary ]; CREATE TABLE foo (id serial, names text FULLTEXT [ (czech_dictionary, english_dictionary) ] ); Everything else we can do via SPs. Pavel Stehule ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] SCMS question
Andrew Dunstan wrote: Warren Turkal wrote: On Thursday 22 February 2007 05:26, Andrew Dunstan wrote: 2. Many people (and some buildfarm members) operate against mirrors of the main repo which are created with rsync or CVSup. I am not aware of any way to do the equivalent with SVN - Well, you could just use rsync with svn ;) any info would be gratefully received. Of course, SVN is better at disconnected operation than CVS, so it might be a non-issue for many. Even so, it might be a pity to have to forego the facility. Check out svnsync for SVN. It's part of a normal SVN installation. That is cool, I didn't know about that. Other items I find interesting about SVN: 1. It has an API that can be written to, which may or may not be helpful to buildfarm. 2. It has mindshare. I know that isn't a big deal to a lot of people here, but it is becoming the new CVS. There are others of course (like Monotone), but most projects I know are moving from CVS to SVN or starting with SVN. Joshua D. Drake Nifty. Thanks. andrew ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [Monotone-devel] Re: [HACKERS] SCMS question
Markus Schiltknecht wrote: Most PostgreSQL developers currently want to stay with CVS. Only some desperate souls including myself are fiddling with other VCSes. I think if you took a head count, a majority of developers would probably want to switch, but I doubt that there would be a consensus on what to. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Column storage positions
Yes, that was the idea (not oid but some number), and I am arguing against it. Imho people are used to seeing the logical position in e.g. pg_index Which people are you talking about? In my commercial PG work I hardly ever look at a system table at all, and users shouldn't have to IMNSHO. You are probably right. I tend to resort to command-line tools, schema dumps and system tables; probably not many other people do that. I often don't get to use my preferred toolset because it is not installed. If you mean tools developers, then accommodating catalog changes is par for the course, I should think. The question is whether the distributed work needed to get all the tools and interfaces (like JDBC, ODBC, pgAdmin) to work again isn't more than doing it in the backend would be. Since we want plan invalidation anyway, I am not sure the rest is so much. Andreas ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] tsearch in core patch, for inclusion
CREATE TABLE foo (id serial, names text FULLTEXT); Anything more complicated is a waste of cycles. Joshua D. Drake I agree. Question: what about multilanguage fulltext? CREATE INDEX foo USING FULLTEXT(bar) [ WITH czech_dictionary ]; CREATE TABLE foo (id serial, names text FULLTEXT [ (czech_dictionary, english_dictionary) ] ); Everything else we can do via SPs. That works for me, with perhaps a default mapping to locales? For example, if our locale is en_us.UTF8 we are pretty assured that we are using English. Joshua D. Drake Pavel Stehule ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] What is CheckPoint.undo needed for?
Florian G. Pflug wrote: Alvaro Herrera wrote: Florian G. Pflug wrote: Heikki Linnakangas wrote: No, you're right, it's related to the WAL undo stuff that was never actually implemented. It's dead code. Teodor Sigaev wrote: Oops, sorry, I missed the checkpoint keyword Teodor Sigaev wrote: What am I missing? Seems it's about that: http://archives.postgresql.org/pgsql-committers/2005-06/msg00085.php Thanks - everything makes much more sense now ;-) So if you want to submit a patch to remove it, it'll make sense for future readers as well ;-) Here you go. I'm not subscribed to pgsql-patches, so I posted this here. I think you should increase the pg_control version. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] HOT for PostgreSQL 8.3
I think the term "line pointer" is causing some confusion here. Let me explain the idea again: Each page has a set of line pointers, or item-ids as they are referred to in the code (I shall use the term item-id hereafter). The item-id stores the offset (15 bits), length (15 bits) and two flags, LP_USED and LP_DELETE (2 bits). The root tuple is pointed to by some item-id (say, I1) on the page. When the tuple is HOT-updated, a heap-only tuple is added which is linked to the root tuple by its t_ctid and is pointed to by another item-id, I2, on the page. Now, when the root tuple subsequently becomes DEAD, I am proposing to store I2 in the offset field of I1. The length field in I1 can be set to zero (or some other special value) to mark that I1 is now just a redirection to I2 and does not point to any real tuple, dead or live. Oh, thanks for explaining. I was really misunderstanding. Is that possible? I thought item-ids (slots) need to be in physical order. If we do reuse dead tuples without vacuum we should probably, as already suggested, disconnect the "what is dead enough for reuse/vacuum" from global xmin right from the start. I did not get that completely. Can you elaborate on that? Once we find a heap-only DEAD tuple, we remove it from the tuple chain (and thus remove its only reference) and set LP_DELETE on its item-id. When we run out of free space in a page, we search for an item-id whose LP_DELETE is set and whose length is at least equal to the new tuple length. We reuse that item-id and the associated storage for storing the new tuple. With the recent discussion about flashback (explicitly setting a snapshot to 5 min ago) and forensics, I think we should have a way to delay what is considered DEAD and available for immediate reuse. I know this can reduce the efficiency of HOT when delayed too far, but I think we should have that possibility if we reuse without vacuum. Andreas ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] tsearch in core patch, for inclusion
CREATE TABLE foo (id serial, names text FULLTEXT); Anything more complicated is a waste of cycles. Joshua D. Drake I agree. Question: what about multilanguage fulltext? CREATE INDEX foo USING FULLTEXT(bar) [ WITH czech_dictionary ]; CREATE TABLE foo (id serial, names text FULLTEXT [ (czech_dictionary, english_dictionary) ] ); Everything else we can do via SPs. That works for me, with perhaps a default mapping to locales? For example, if our locale is en_us.UTF8 we are pretty assured that we are using English. 90% yes, 10% no. A typical task in Czech: find words without accents, or find German, English, or Czech stemmed words in multilanguage documents (or different languages depending on topic). A lot of databases are at minimum bilingual (in the Czech Republic, German and Czech). Pavel p.s. missing collations are a big minus for PostgreSQL in the EU (we have some workarounds) ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Column storage positions
On Thu, 22 Feb 2007, Zeugswetter Andreas ADI SD wrote: And I also see a lot of unhappiness from users of system tables when column numbers all over the system tables would not be logical column positions any more. Right now the fact that attnum presents the logical order but not the logical position is a problem for the JDBC driver. In the presence of dropped columns there is no easy way to get from a pg_attribute entry to logical position. I would hope that a new logical position column would reflect the actual position and solve this problem. Kris Jurka ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] SCMS question
Gregory Stark wrote: Andrew Dunstan [EMAIL PROTECTED] writes: [...] It's also the easiest to get ahold of. Easier I would say than CVS which you have to download some bug fixes from various unofficial sites to get a good working version of. Just to mention it - the OpenBSD guys are working on a BSD-licensed CVS rewrite too (and I believe they strive for full compatibility first and will improve on the main weak points later on). Some information about that effort is available here: http://www.opencvs.org/. Stefan ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] What is CheckPoint.undo needed for?
Alvaro Herrera [EMAIL PROTECTED] writes: I think you should increase pg_control version. And the WAL page-header version, since this also changes WAL contents. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Column storage positions
And I also see a lot of unhappiness from users of system tables when column numbers all over the system tables would not be logical column positions any more. Right now the fact that attnum presents the logical order but not the logical position is a problem for the JDBC driver. In the presence of dropped columns there is no easy way to get from a pg_attribute entry to logical position. I would hope that a new logical position column would reflect the actual position and solve this problem. I agree, I hadn't thought of DROP COLUMN :-( DROP COLUMN should have relabeled attnum. Since it was not done then, my comments are probably moot. Andreas ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] SCMS question
Gregory Stark wrote: Andrew Dunstan [EMAIL PROTECTED] writes: 2. Many people (and some buildfarm members) operate against mirrors of the main repo which are created with rsync or CVSup. I am not aware of any way to do the equivalent with SVN - any info would be gratefully received. Of course, SVN is better at disconnected operation than CVS, so it might be a non-issue for many. Even so, it might be a pity to have to forego the facility. Well, SVN basically works by having that mirror all the time. That kind of sucks for people with many checkouts, since it takes more disk space, but it provides the same benefits as having a local mirror of the CVS repository, which is really just working around the problems with CVS. The general point about the build farms is a strong argument in favor of SVN over the new-fangled revision control systems. It would involve the least change in the operational model and the main build farm maintainer is familiar with it... Nonsense. Getting a checkout is almost as easy in Monotone as it is in CVS or Subversion. You just get an initial copy of the database and then get all your checkouts from there. In SVN there's this strange notion of keeping the latest revision within the checked-out copy: if you have more than one working copy, you get the same stuff multiple times. If you grep the whole source tree you get a lot of extraneous files in the grep result, which you then have to grep -v out, which is annoying. In Monotone the database is separate from the working copy, and from one database you can get as many working copies as you want. You don't need to mirror the repository, because you _have_ the repository. (I guess another way to put it is that mirroring the repository is part of everyday operation.) We use SVN internally at Command Prompt, and I will change to anything else any day :-) One thing that Monotone doesn't do, which we are used to, is $Id$-style keyword expansion.
The only other problem I see currently is the long time to get an initial pull. It's also the easiest to get ahold of. Easier I would say than CVS which you have to download some bug fixes from various unofficial sites to get a good working version of. Sure, anything is better than CVS :-) -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Column storage positions
Zeugswetter Andreas ADI SD wrote: And I also see a lot of unhappiness from users of system tables when column numbers all over the system tables would not be logical column positions any more. Right now the fact that attnum presents the logical order but not the logical position is a problem for the JDBC driver. In the presence of dropped columns there is no easy way to get from a pg_attribute entry to logical position. I would hope that a new logical position column would reflect the actual position and solve this problem. I agree, I hadn't thought of DROP COLUMN :-( DROP COLUMN should have relabeled attnum. Since it was not done then, my comments are probably moot. We can correct this problem now. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Column storage positions
On Thu, 22 Feb 2007, Alvaro Herrera wrote: Zeugswetter Andreas ADI SD wrote: I agree, I hadn't thought of DROP COLUMN :-( DROP COLUMN should have relabeled attnum. Since it was not done then, my comments are probably moot. We can correct this problem now. How? If attnum is serving as both physical position and logical order, how can you make it be logical position without breaking physical position? Kris Jurka ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Column storage positions
Kris Jurka wrote: On Thu, 22 Feb 2007, Alvaro Herrera wrote: Zeugswetter Andreas ADI SD wrote: I agree, I hadn't thought of drop column :-( Drop column should have relabeled attnum. Since it was not done then, my comments are probably moot. We can correct this problem now. How? If attnum is serving as both physical position and logical order, how can you make it be logical position without breaking physical position? By patching the code, of course, so that it doesn't serve as both things, which is what is being discussed. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Column storage positions
I agree, I hadn't thought of drop column :-( Drop column should have relabeled attnum. Since it was not done then, my comments are probably moot. We can correct this problem now. Do you mean fix it with the 3rd column in pg_attribute and use that, or fix attnum? :-) Imho it is a pain to need 2 numbers and a mapping in drivers. Andreas ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] autovacuum next steps, take 2
On Thu, Feb 22, 2007 at 09:32:57AM -0500, Matthew T. O'Connor wrote: Jim C. Nasby wrote: On Wed, Feb 21, 2007 at 05:40:53PM -0500, Matthew T. O'Connor wrote: My Proposal: If we require admins to identify hot tables, then: 1) Launcher fires off a worker1 into database X. 2) worker1 deals with hot tables first, then regular tables. 3) Launcher continues to launch workers to DB X every autovac naptime. 4) worker2 (or 3 or 4 etc...) sees it is alone in DB X; if so, it acts as worker1 did above. If worker1 is still working in DB X then worker2 looks for hot tables that are being starved because worker1 got busy. If worker2 finds no hot tables that need work, then worker2 exits. Rather than requiring people to manually identify hot tables, what if we just prioritize based on table size? So if a second autovac process hits a specific database, it would find the smallest table in need of vacuuming that it should be able to complete before the next naptime and vacuum that. It could even continue picking tables until it can't find one that it could finish within the naptime. Granted, it would have to make some assumptions about how many pages it would dirty. ISTM that's a lot easier than forcing admins to mark specific tables. So the heuristic would be: * Launcher fires off workers into a database at a given interval (perhaps configurable?) * Each worker works on tables in size order. * If a worker ever catches up to an older worker, then the younger worker exits. This sounds simple and workable to me; perhaps we can later modify this to include some max_workers variable so that a worker would only exit if it catches an older worker and there are max_workers currently active. That would likely result in a number of workers running in one database, unless you limited how many workers per database. And if you did that, you wouldn't be addressing the frequently updated table problem.
A second vacuum in a database *must* exit after a fairly short time so that we can go back in and vacuum the important tables again (well, or the 2nd vacuum has to periodically re-evaluate what tables need to be vacuumed). -- Jim Nasby [EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
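The catch-up rule proposed above can be sketched in a few lines. This is purely illustrative Python, not autovacuum's actual C code; the `name`, `size`, and `needs_vacuum` fields are hypothetical stand-ins for what the worker would read from the stats collector:

```python
def plan_worker(tables, older_worker_index=None):
    """Return the table names a newly launched worker would vacuum.

    tables: list of dicts with hypothetical "name", "size" and
    "needs_vacuum" keys. Workers process tables smallest-first; a
    younger worker exits as soon as it reaches the position an older
    worker in the same database has already gotten to.
    """
    ordered = sorted(tables, key=lambda t: t["size"])
    todo = []
    for i, t in enumerate(ordered):
        if older_worker_index is not None and i >= older_worker_index:
            break  # caught up to the older worker: exit
        if t["needs_vacuum"]:
            todo.append(t["name"])
    return todo
```

Because small tables vacuum quickly, a younger worker either re-covers the hot small tables or exits almost immediately, which is the behavior the proposal is after.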
Re: [HACKERS] autovacuum next steps, take 2
On Thu, Feb 22, 2007 at 09:35:45AM +0100, Zeugswetter Andreas ADI SD wrote: vacuum should be a process with the least amount of voodoo. If we can just have vacuum_delay and vacuum_threshold, where threshold allows an arbitrary setting of how much bandwidth we will allot to the process, then that is a beyond wonderful thing. It is easy to determine how much IO you have, and what you can spare. The tricky part is what metric to use. Imho IO per second would be good. In a typical DB scenario that is the IO bottleneck, not the Mb/s. Well, right now they're one and the same... but yeah, IO/sec probably does make more sense. -- Jim Nasby [EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Column storage positions
I agree, I hadn't thought of drop column :-( Drop column should have relabeled attnum. Since it was not done then, my comments are probably moot. We can correct this problem now. How? If attnum is serving as both physical position and logical order, how can you make it be logical position without breaking physical position? If you ask me, attnum would be the logical position and would be used in all other system tables. attphypos would only be used in pg_attribute. It would be quite some work to rearrange attnum in all system tables for drop column and add column before, but it would be nice for JDBC. But it seems others want this: attnum being an arbitrary number that is used in all system tables, and 2 extra columns in pg_attribute, one for logical position and one for physical position. If you want the column name corresponding to a pg_index attnum, you need a map. Andreas ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
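To illustrate the driver-side pain being discussed: with attnum alone (which keeps gaps after DROP COLUMN), a driver has to derive the gap-free logical position itself. A toy sketch, where `attphypos` and the sample rows are hypothetical, not the real catalog contents:

```python
# Hypothetical pg_attribute rows: attnum keeps its gap where column 2
# was dropped; "attphypos" is the illustrative physical-position column
# discussed in the thread, not an existing catalog column.
atts = [
    {"attname": "id",  "attnum": 1, "attphypos": 1},
    {"attname": "val", "attnum": 3, "attphypos": 2},  # attnum 2 was dropped
]

def logical_position(atts, attname):
    """Gap-free 1-based logical position, which attnum alone does not
    give once columns have been dropped -- the mapping every driver
    currently has to rebuild for itself."""
    ordered = sorted(atts, key=lambda a: a["attnum"])
    for pos, a in enumerate(ordered, start=1):
        if a["attname"] == attname:
            return pos
    return None
```

Here `val` has attnum 3 but logical position 2; a catalog column that stored the position directly would make this recomputation unnecessary.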
Re: [HACKERS] autovacuum next steps, take 2
vacuum should be a process with the least amount of voodoo. If we can just have vacuum_delay and vacuum_threshold, where threshold allows an arbitrary setting of how much bandwidth we will allot to the process, then that is a beyond wonderful thing. It is easy to determine how much IO you have, and what you can spare. The tricky part is what metric to use. Imho IO per second would be good. In a typical DB scenario that is the IO bottleneck, not the Mb/s. Well, right now they're one and the same... but yeah, IO/sec probably does make more sense. Hopefully not :-) Else you have no readahead. And that is imho the problem. You need to anticipate how many physical IOs your logical IOs cause. And this is near impossible unless we group IOs in pg itself. Andreas ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Log levels for checkpoint/bgwriter monitoring
On Mon, Feb 19, 2007 at 10:59:38PM -0500, Greg Smith wrote: I have a WIP patch that adds the main detail I have found I need to properly tune checkpoint and background writer activity. I think it's almost ready to submit (you can see the current patch against 8.2 at http://www.westnet.com/~gsmith/content/postgresql/patch-checkpoint.txt ) after making it a bit more human-readable. But I've realized that along with that, I need some guidance as to what log level is appropriate for this information. It would also be extremely useful to make checkpoint stats visible somewhere in the database (presumably via the existing stats mechanism). The log output is great for doing initial tuning, but you'd want to also be able to monitor things, which would be impractical via logging. I'm thinking just tracking how many pages had to be flushed during a checkpoint would be a good start. -- Jim Nasby [EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] Saving space for common kinds of numeric values
The numeric data type's minimum data size is 8 bytes and it can only even get that small for 0. Storing even 1 requires 10 bytes. That seems pretty abysmal. It occurs to me that we could assign special-case meanings for any datum smaller than 8 bytes. In just 2 or 4 bytes (including the 1-byte varlena header) I think we could encode many common small values. This would be a pretty straightforward change to numeric.c because all the hard work there is done using an internal representation that never reaches disk. As a result only set_var_from_num and make_result and a handful of simple functions that work directly on the packed Numeric representation would need to be adjusted at all. I'm thinking of the following two cases: 1 byte (plus 1 byte header): integer between -128..127 with dscale and weight implicitly defined to display as an integer. Ie, weight is always the number of digits in the value and dscale is always 0. That doesn't restrict the values Numeric supports, only the values it can use this representation for, so if you store -1.0 it would store it normally (or using the following option). 3 bytes (plus 1 byte header): 1 byte to store weight and dscale (one nibble each) and 2 bytes to store the value. This would let us handle the extremely common case of currency quantities which have 2 decimal places. It could store -327.68 .. 327.67 in four bytes including the varlena header. Alternatively we could do away with weight entirely for the 3 byte representation as with the 1 byte representation. That would let us store -10485.76 .. 10485.75 in 21 bits and use the remaining 3 bits to store a dscale of up to 7. I actually favour this option. There are lots of options and we could go nuts defining a meaning for every possible length up to 8, but really I think just defining a 1+1 byte encoding for small integers and a 3+1 byte encoding for many common applications would be reasonable.
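As a rough illustration of the two proposed encodings, here is a Python sketch. The function names and exact bit layout are my own illustration of the idea, not numeric.c's actual on-disk representation, and the varlena header byte is omitted:

```python
import struct

def encode_small_int(n):
    """1-byte case: integers -128..127, weight implied, dscale 0.
    Returns None when the value needs the regular representation."""
    if -128 <= n <= 127:
        return struct.pack("b", n)
    return None

def encode_scaled(value, weight, dscale):
    """3-byte case from the proposal: one nibble each for weight and
    dscale, then a 16-bit signed scaled significand (little-endian
    here, purely for illustration)."""
    if not (0 <= weight < 16 and 0 <= dscale < 16):
        return None
    scaled = round(value * 10 ** dscale)
    if not (-32768 <= scaled <= 32767):
        return None  # doesn't fit; fall back to normal Numeric
    return struct.pack("<Bh", (weight << 4) | dscale, scaled)

def decode_scaled(b):
    hdr, scaled = struct.unpack("<Bh", b)
    dscale = hdr & 0x0F
    return scaled / 10 ** dscale
```

With dscale 2 this covers exactly the -327.68 .. 327.67 currency range mentioned above, in 3 bytes plus the header byte.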
-- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [Monotone-devel] Re: SCMS question
[EMAIL PROTECTED] (Andrew Dunstan) writes: Tom has pointed out, our biggest pain point is the occasional wish to move things across directories. That's the biggest pain that people are normally aware of. There are things that people don't even bother to try to do with CVS because they are so impossible as to be stupid ideas. Notably, CVS, being an inherently centralized system, doesn't support the notion of having a secondary repository tracking projects that haven't been able to be committed into the central project. -- (reverse (concatenate 'string ofni.secnanifxunil @ enworbbc)) http://linuxfinances.info/info/finances.html Rules of the Evil Overlord #189. I will never tell the hero Yes I was the one who did it, but you'll never be able to prove it to that incompetent old fool. Chances are, that incompetent old fool is standing behind the curtain. http://www.eviloverlord.com/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [Monotone-devel] Re: [HACKERS] SCMS question
In message [EMAIL PROTECTED] on Thu, 22 Feb 2007 09:09:48 -0800, Joshua D. Drake [EMAIL PROTECTED] said: jd I believe that is much more accurate. The reality is, switching to jd something else will be painful. I would prefer not to be on CVS as jd well but it would take a lot of work and cvs does what we need it jd to. I can only tell you my own story, and that's a story of being sick and damn (oh, I've a much stronger word, but I'm too polite to use it here) tired of CVS, especially having had to deal with its absolute lack of merging capacity. It's taken trying a number of other SCMs, which were all a PITA, until I found monotone, before I decided I had found a worthy replacement. To each his or her own, I say, and as you say, changing (or moving on) takes some amount of effort, and you will not do it, as a group, before more or less all (especially any core group) thinks it's worth more than staying with the current system. Cheers, Richard -- Richard Levitte [EMAIL PROTECTED] http://richard.levitte.org/ When I became a man I put away childish things, including the fear of childishness and the desire to be very grown up. -- C.S. Lewis ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [Monotone-devel] Re: [HACKERS] SCMS question
In message [EMAIL PROTECTED] on Thu, 22 Feb 2007 17:38:26 +0100, Markus Schiltknecht [EMAIL PROTECTED] said: markus So far, I'm getting the sense that there are a lot of markus opinions on what replacement system to use, a bit carelessly markus before having answered the above questions thoroughly. markus markus How did you get that impression? You said it yourself: Most PostgreSQL developers currently want to stay with CVS. Unless there's a majority that wants to move on, I doubt there will be a move. In the end, it has to be a group effort, or it will simply not happen. Cheers, Richard - Please consider sponsoring my work on free software. See http://www.free.lp.se/sponsoring.html for details. -- Richard Levitte [EMAIL PROTECTED] http://richard.levitte.org/ When I became a man I put away childish things, including the fear of childishness and the desire to be very grown up. -- C.S. Lewis ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [Monotone-devel] Re: [HACKERS] SCMS question
If I may, I'll add a few words to this discussion: Basically, I'm seeing that three things need to be decided upon: 1. Do you want to stay with CVS or do you want to move to something else? 2. If you want to move, when? Is now a good time, or is it better to look at it another time. This may be a question of what people you have who'd do the job, what kind of time they have for the moment and so on. 3. What would you want a replacement to be able to do? When those questions are answered and people are behind it, then it's time to look at the different systems and see what's the best match to your desires. So far, I'm getting the sense that there are a lot of opinions on what replacement system to use, a bit carelessly before having answered the above questions thoroughly. HTH. Cheers, Richard -- Richard Levitte [EMAIL PROTECTED] http://richard.levitte.org/ When I became a man I put away childish things, including the fear of childishness and the desire to be very grown up. -- C.S. Lewis ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Howto change db cluster locale on-the-fly
Thanks for your answer. Is there any other risk than wrong answers when running with the wrong locale? So maybe the best bet would be: 1) drop all text/varchar user indexes 2) stop database, change the locale 3) in single user mode reindex shared tables and system tables in all databases and templates 4) start the database 5) create all text/varchar user indexes Does this sound about right? I'd like to minimize downtime... How do I do step 2), changing the locale? Thanks a lot, Kuba Martijn van Oosterhout wrote: On Mon, Feb 19, 2007 at 09:27:06AM +0100, Jakub Ouhrabka wrote: But I guess something like this would work: a) 1) drop all indexes on text/varchar columns 2) change cluster locale 3) create all indexes on text/varchar columns You're going to miss the name columns, i.e. every string index in pg_catalog. Also, there are shared tables which are used in every DB. You need to log into every DB in the cluster (don't forget template[01]) and reindex everything. So, REINDEX DATABASE; seems to be a safer bet. In general this doesn't actually work since changing the locale may make two strings equal that weren't before, thus possibly breaking a unique index, but it may be possible. I'd suggest single user mode at least, and make backups! Have a nice day, ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [Monotone-devel] Re: [HACKERS] SCMS question
On Thu, Feb 22, 2007 at 03:13:49PM +0100, Markus Schiltknecht wrote: one sparc (osol). So far all gcc compiled, AFAIK. I think that buildbot was gcc on solaris9/sparc. I care for support of monotone built with sunpro on solaris10 (and opensolaris) on x86 and sparc (but no buildbot for those). There was once some work on msvc support, but I have no idea what happened to that. patrick georgi ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] Simple Column reordering
Column storage position is the subject of many long threads in recent times. Solutions proposed for this have been both fairly complex and long enough that nothing seems likely to happen for 8.3. If I'm wrong, then of course this proposal would be superseded. I propose that at CREATE TABLE time, the column ordering is re-ordered so that the table columns are packed more efficiently. This would be a physical re-ordering, so that SELECT * and COPY without explicit column definitions would differ from the original CREATE TABLE statement. This would be an optional feature, off by default, controlled by a USERSET GUC optimize_column_order = off (default) | on When the full column ordering proposal is implemented, optimize_column_order would be set to default to on. The feature would be supported for at least one more release after this to allow bug analysis. The proposed ordering would be: 1. All fixed length columns, arranged so that alignment is efficient 2. All variable length columns All column ordering would stay as close as possible to original order No changes would be made apart from at CREATE TABLE time. The ordering would be repeatable, so that the order would not change on repeated dump/restore of a table with no changes. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
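The proposed ordering rule could look roughly like this. A sketch only: the tuple fields and alignment values are illustrative, and the real work would happen in the backend's CREATE TABLE processing; a stable sort gives the "stay as close as possible to original order" property for free:

```python
def optimize_column_order(cols):
    """cols: list of (name, fixed_size_or_None, alignment) tuples.
    Fixed-length columns come first, widest alignment first; the sort
    is stable, so columns otherwise keep their original relative
    order. All variable-length columns follow, in original order."""
    fixed = [c for c in cols if c[1] is not None]
    varlen = [c for c in cols if c[1] is None]
    fixed.sort(key=lambda c: -c[2])  # stable: ties keep original order
    return [c[0] for c in fixed + varlen]
```

Sorting by descending alignment means no padding is ever needed between fixed-length columns, and stability plus determinism gives the repeatable ordering across dump/restore that the proposal requires.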
Re: [HACKERS] Grouped Index Tuples
On Thu, 2007-02-22 at 12:47 +, Heikki Linnakangas wrote: One question that I'm sure someone will ask is do we need this if we have bitmap indexes? Both aim at having a smaller index, after all. The use cases are quite different; GIT is effective whenever you have a table that's reasonably well-clustered. Unlike the bitmap indexam, GIT's effectiveness doesn't depend on the number of distinct values; in particular it works well with unique indexes. GIT is comparable to clustered indexes in other DBMSs (in fact we might want to call GIT that in the end). A few thoughts: Whether it's proof, or merely strong interest, we need to see where the cross-over point is between bitmap indexes (BMI) and GIT. I'm assuming there is one, but others may need convincing that there is a clear win in common use cases. Graphs, numbers etc.. On your TODO list, it would be good to differentiate between ideas and must-complete items. My suggested list would be: Must complete - cost model for optimizer - user documentation - performance tests Good to complete - duplicate key flag (should increase range of overlap with BMI) - range re-check optimization (based upon your results) - better index AM - HOT, so that GIT can provide longer term benefits Others ... I seem to recall you mentioning there was a way to record that the heap tuples are stored in monotonic order and so the sort can be avoided when you return the tuples. Is that still true? Also, the patch roughly doubles the amount of WAL generated; I'm not sure what that means. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Simple Column reordering
Simon Riggs wrote: I propose that at CREATE TABLE time, the column ordering is re-ordered so that the table columns are packed more efficiently. This would be a physical re-ordering, so that SELECT * and COPY without explicit column definitions would differ from the original CREATE TABLE statement. This would be an optional feature, off by default, controlled by a USERSET GUC optimize_column_order = off (default) | on Umm, you want a GUC setting to enable standards-breaking behaviour and that will be obsolete when we do column ordering right, which is not likely to be more than one release away, and could even still happen in this coming release? I hope I haven't misunderstood. cheers andrew ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Simple Column reordering
Andrew Dunstan wrote: Simon Riggs wrote: I propose that at CREATE TABLE time, the column ordering is re-ordered so that the table columns are packed more efficiently. This would be a physical re-ordering, so that SELECT * and COPY without explicit column definitions would differ from the original CREATE TABLE statement. This would be an optional feature, off by default, controlled by a USERSET GUC optimize_column_order = off (default) | on Umm, you want a GUC setting to enable standards-breaking behaviour and that will be obsolete when we do column ordering right, which is not likely to be more than one release away, and could even still happen in this coming release? Given that we already seem to have a patch implementing a complete solution, or part thereof, this would seem a rather shortsighted proposal. Why not develop the whole thing and be done with it? I don't understand the reluctance to implementing all of it. The most serious objection I've seen, from Andreas IIRC, is that it would make drivers' lives more difficult; but really, drivers have to cope with dropped columns today which is a pain, and ISTM this proposal (not this one here, but the three-column proposal) would make that a bit simpler. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] SCMS question
On Thursday 22 February 2007 11:00, Joshua D. Drake wrote: 1. It has an api that can be written to, that may or may not be helpful to buildfarm. 2. It has mindshare. I know that isn't a big deal to a lot of people here, but it is becoming the new cvs. There are others of course (like monotone) but most projects I know are moving from cvs to svn or starting with svn. Git is also pretty cool, too. You can even present a CVS interface on a git repository. That might address the build farm issue. wt -- Warren Turkal (w00t) ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] autovacuum next steps, take 2
Jim C. Nasby wrote: On Thu, Feb 22, 2007 at 09:32:57AM -0500, Matthew T. O'Connor wrote: So the heuristic would be: * Launcher fires off workers into a database at a given interval (perhaps configurable?) * Each worker works on tables in size order. * If a worker ever catches up to an older worker, then the younger worker exits. This sounds simple and workable to me, perhaps we can later modify this to include some max_workers variable so that a worker would only exit if it catches an older worker and there are max_workers currently active. That would likely result in a number of workers running in one database, unless you limited how many workers per database. And if you did that, you wouldn't be addressing the frequently updated table problem. A second vacuum in a database *must* exit after a fairly short time so that we can go back in and vacuum the important tables again (well or the 2nd vacuum has to periodically re-evaluate what tables need to be vacuumed). I'm not sure this is a great idea, but I don't see how this would result in large numbers of workers working in one database. If workers work on tables in size order, and exit as soon as they catch up to an older worker, I don't see the problem. Newer workers are going to catch up to older workers pretty quickly, since small tables will vacuum fairly quickly. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] SCMS question
Warren Turkal wrote: On Thursday 22 February 2007 11:00, Joshua D. Drake wrote: 1. It has an api that can be written to, that may or may not be helpful to buildfarm. 2. It has mindshare. I know that isn't a big deal to a lot of people here, but it is becoming the new cvs. There are others of course (like monotone) but most projects I know are moving from cvs to svn or starting with svn. Git is also pretty cool, too. You can even present a CVS interface on a git repository. That might address the build farm issue. But it wasn't portable, last time I checked. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] SCMS question
Alvaro Herrera wrote: Warren Turkal wrote: On Thursday 22 February 2007 11:00, Joshua D. Drake wrote: 1. It has an api that can be written to, that may or may not be helpful to buildfarm. 2. It has mindshare. I know that isn't a big deal to a lot of people here, but it is becoming the new cvs. There are others of course (like monotone) but most projects I know are moving from cvs to svn or starting with svn. Git is also pretty cool, too. You can even present a CVS interface on a git repository. That might address the build farm issue. But it wasn't portable, last time I checked. Besides that, the whole idea is that buildfarm should mimic as exactly as possible how we checkout and build by hand. cheers andrew ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] SCMS question
On Thursday 22 February 2007 20:39, Alvaro Herrera wrote: Git is also pretty cool, too. You can even present a CVS interface on a git repository. That might address the build farm issue. But it wasn't portable, last time I checked. Git is in the FreeBSD ports. The cvs gateway server comes with it AFAIK. wt -- Warren Turkal (w00t) ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] SCMS question
Gregory Stark [EMAIL PROTECTED] writes: [much snipped] Why are so few people committers? ... The answer to both questions is because CVS limitations make it hard to do better. Uh, no. The reason there are so few committers is that there are so few people qualified not to break things. If we had a community with hundreds of people with commit access, very possibly we'd be feeling the need of a better SCM system. But right now, CVS is not the bottleneck, and I don't see that we'd get a payback for the pain of changing to something else. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend