Re: [HACKERS] tsearch in core patch, for inclusion
Bruce Momjian <[EMAIL PROTECTED]> writes: > Oleg Bartunov wrote: >> It's not so big addition to the gram.y, see a list of commands >> http://mira.sai.msu.su/~megera/pgsql/ftsdoc/sql-commands.html. > I looked at the diff file and the major change in gram.y is the creation > of a new object type FULLTEXT, You mean four different object types. I'm not totally clear on bison's scaling behavior relative to the number of productions, but I think there's no question that this patch will impose a measurable distributed penalty on every single query issued to Postgres by any application, whether it's heard of tsearch or not. The percentage overhead would be a lot lower if the patch were introducing a similar number of entries into pg_proc. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] New feature request: FlashBack Query
RPK wrote: > Andrew, > >>> Demanding unlimited undo at some time that is arbitrarily distant in the >>> future strikes me as wholly unreasonable. > > I did not mean asking for undo from a life-time log. Since FlashBack > Technology is already there, I just mean that the world's most advanced database > (PostgreSQL, as they say) must have an optimized way of undoing at > least a week's changes. A week's log is enough, and PostgreSQL can keep on > removing old logs automatically. > > Secondly, it must be left to the user to decide the number of days of > archive he wants to store. Again, up to a week max. You might look at storing deltas or similar (perhaps with a check table) if you need to change data a week back. Then you can just find the row representing the problematic change and delete it. If you really want to track what happens, do deltas and then, instead of deleting them, put a reversing delta in, keeping your entire audit trail. You can put materialized views on top of this if you need performance. - August
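August's delta/audit-trail suggestion can be sketched as a trigger-based history table. This is an illustrative sketch only (the `accounts` table and all names here are hypothetical, not from the thread), showing a plain audit pattern rather than a built-in flashback facility:

```sql
-- Sketch, assuming a table "accounts" exists. Each change stores the old and
-- new row images; purging history older than the user's chosen window
-- (a week here) is a single DELETE.
CREATE TABLE accounts_history (
    hist_id    bigserial PRIMARY KEY,
    changed_at timestamptz NOT NULL DEFAULT now(),
    op         text NOT NULL,   -- 'INSERT', 'UPDATE', or 'DELETE'
    old_row    accounts,        -- row image before the change; NULL for INSERT
    new_row    accounts         -- row image after the change; NULL for DELETE
);

CREATE OR REPLACE FUNCTION accounts_audit() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO accounts_history (op, new_row) VALUES (TG_OP, NEW);
    ELSIF TG_OP = 'UPDATE' THEN
        INSERT INTO accounts_history (op, old_row, new_row) VALUES (TG_OP, OLD, NEW);
    ELSE
        INSERT INTO accounts_history (op, old_row) VALUES (TG_OP, OLD);
    END IF;
    RETURN NULL;   -- AFTER trigger: the return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER accounts_audit AFTER INSERT OR UPDATE OR DELETE
    ON accounts FOR EACH ROW EXECUTE PROCEDURE accounts_audit();

-- Drop history beyond the retention window:
DELETE FROM accounts_history WHERE changed_at < now() - interval '7 days';
```

To "reverse" a bad change while keeping the full trail, as August describes, one would insert a new history row recording the compensating update rather than deleting the original history row.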
Re: [HACKERS] tsearch in core patch, for inclusion
It's not so big addition to the gram.y, see a list of commands http://mira.sai.msu.su/~megera/pgsql/ftsdoc/sql-commands.html. SQL commands make FTS syntax clear and follow tradition to manage system objects. From the user's side, I'd be very unhappy to configure FTS, which can be very complex, using functions. All we want is to provide users clear syntax. This is like the third time we have been around this problem. The syntax is clear and reasonable, IMO. Can we stop arguing about it and just include it? If there are specific issues beyond syntax, that is one thing, but at this point it seems we are arguing for the sake of arguing. Joshua D. Drake
Re: [HACKERS] timestamp subtraction (was Re: [SQL] formatting intervals with to_char)
One problem with removing justify_hours() is that this is going to return '24:00:00', rather than '1 day':

test=> select '2004-01-02 00:00:00'::timestamptz - '2004-01-01 00:00:00'::timestamptz;
 ?column?
----------
 24:00:00
(1 row)

--- Jim Nasby wrote: > On Oct 5, 2006, at 11:50 AM, Tom Lane wrote: > > regression=# select ('2006-09-15 23:59:00'::timestamp - '2006-09-01 > > 09:30:41'::timestamp); > > ?column? > > -- > > 14 days 14:28:19 > > (1 row) > > > > should be reporting '350:28:19' instead. > > > > This is a hack that was done to minimize the changes in the regression > > test expected outputs when we changed type interval from months/ > > seconds > > to months/days/seconds. But I wonder whether it wasn't a dumb idea. > > It is certainly inconsistent, as noted in the code comments. > > > > I'm tempted to propose that we remove the justify_hours call, and tell > > anyone who really wants the old results to apply justify_hours() to > > the > > subtraction result for themselves. Not sure what the fallout would > > be, > > though. > > I suspect there's applications out there that are relying on that > being nicely formatted for display purposes. > > I agree it should be removed, but we might need a form of backwards > compatibility for a version or two... > -- > Jim Nasby [EMAIL PROTECTED] > EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
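For anyone following the thread, Tom's suggested workaround is just to wrap the subtraction result explicitly; the two behaviors compare like this (justify_hours() is the existing function of that name):

```sql
-- Without the implicit justify_hours() call, subtraction would report raw hours:
SELECT '2004-01-02 00:00:00'::timestamptz
     - '2004-01-01 00:00:00'::timestamptz;    -- would yield '24:00:00'

-- Applying justify_hours() explicitly restores the day-normalized form:
SELECT justify_hours(interval '24:00:00');    -- 1 day
SELECT justify_hours(interval '350:28:19');   -- 14 days 14:28:19
```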
Re: [HACKERS] Column storage positions
On Tuesday 20 February 2007 16:07, Phil Currier wrote: > Another problem relates to upgrades. With tools like pg_migrator now > on pgfoundry, people will eventually expect quick upgrades that don't > require rewriting each table's data. Storage positions would cause a > problem for every version X -> version Y upgrade with Y >= 8.3, even > when X is also >= 8.3, because a version X table could always have > been altered without a rewrite into a structure different from what > Y's CREATE TABLE will choose. If you are using pg_migrator you're not going to be moving the datafiles on disk anyway, so pg_migrator's behavior shouldn't change terribly. If you're doing a pg_dump based upgrade, presumably pg_dump could write its CREATE statements with the columns in attstorpos order and set attnum = attstorpos, preserving the physical layout from the previous install. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
Re: [HACKERS] Column storage positions
Just my 2 cents on the proposed idea: I want to demonstrate that it is very relevant for performance. I recently did a migration from PG 8.1 to PG 8.2. During that time I was dumping the 2TB database with several very wide tables (having ~ 200 columns). And I saw that on my pretty powerful server with 8 GB RAM, an Itanium2 processor, and a large RAID which can do I/O at 100 MB/sec, the performance of pg_dump was CPU limited, and the read speed of the tables was 1-1.5 MB/sec (leading to a 2 week dumping time). I was very surprised by these times, and profiled postgres to check the reason. Here is the top of gprof:

  %   cumulative   self              self     total
 time    seconds   seconds    calls   s/call   s/call  name
 60.72     13.52     13.52  6769826     0.00     0.00  nocachegetattr
 10.58     15.88      2.36  9035566     0.00     0.00  CopyAttributeOutText
  7.22     17.49      1.61 65009457     0.00     0.00  CopySendData
  6.34     18.90      1.41        1     1.41    22.21  CopyTo

So the main slow-down of the process was all this code recomputing the boundaries of the columns. I checked that by removing one tiny varchar column and COALESCEing all NULLs, and after that the performance of pg_dump increased by more than a factor of 2! I should have reported that experience earlier... but I hope that my observations can be useful in the context of Phil's idea. regards, Sergey *** Sergey E. Koposov Max Planck Institute for Astronomy/Cambridge Institute for Astronomy/Sternberg Astronomical Institute Tel: +49-6221-528-349 Web: http://lnfm1.sai.msu.ru/~math E-mail: [EMAIL PROTECTED]
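Sergey's effect is easy to reproduce in miniature: any variable-width (or nullable) column forces offset recomputation for every column to its right, so access cost grows with column position. The following setup is illustrative only (table and column names are made up, not from the original report):

```sql
-- A leading varchar means no later column has a cacheable offset, so reading
-- c10 must walk past all preceding columns for every row (cf. nocachegetattr
-- at the top of the profile above).
CREATE TABLE wide AS
SELECT 'filler'::varchar(10) AS c0,
       g AS c1, g AS c2, g AS c3, g AS c4, g AS c5,
       g AS c6, g AS c7, g AS c8, g AS c9, g AS c10
FROM generate_series(1, 1000000) g;

\timing on
SELECT max(c1)  FROM wide;   -- cheap: c1 sits just after the varchar
SELECT max(c10) FROM wide;   -- slower: ten offsets recomputed per row
```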
[HACKERS] --enable-xml instead of --with-libxml?
Now we have --with-libxml (and the USE_LIBXML constant for #ifdef-s), which is not quite right: XML support is something more general than "using the libxml2 library". E.g., some SQL/XML publishing functions (such as XMLPI) do not deal with libxml2 at all. Also, in the future more "non-libxml" functionality could be added to "XML support" (well, Peter's recent SQL-to-XML mapping functions prove it). I think it'd be better to rename the configure option to --enable-xml and USE_LIBXML to ENABLE_XML. I'll do it if there are no objections. -- Best regards, Nikolay
Re: [HACKERS] PrivateRefCount (for 8.3)
Added to TODO: * Consider decreasing the amount of memory used by PrivateRefCount http://archives.postgresql.org/pgsql-hackers/2006-11/msg00797.php http://archives.postgresql.org/pgsql-hackers/2007-01/msg00752.php --- Simon Riggs wrote: > On Mon, 2006-11-27 at 14:42 -0500, Bruce Momjian wrote: > > Simon Riggs wrote: > > > int8 still seems like overkill. When will the ref counts go above 2 on a > > > regular basis? Surely refcount=2 is just chance at the best of times. > > > > > > Refcount -> 2 bits per value, plus a simple overflow list? That would > > > allow 0,1,2 ref counts plus 3 means look in hashtable to find real > > > refcount. > > > > At two bits, would we run into contention for the byte by multiple > > backends? > > No contention, it's a private per-backend data structure. That's why we > want to reduce the size of it so badly. > > -- > Simon Riggs > EnterpriseDB http://www.enterprisedb.com -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] fixing Makefile.shlib for solaris/gcc with -m64 flag
OK, I have made Solaris gcc the same as Linux in Makefile.shlib, patch attached. I am not backpatching this. We will get this tested for 8.3. --- Jignesh K. Shah wrote: > I dont think we solved this.. But I think the way to put -m64 should be same > as in Linux and Solaris > and not different. > > Thanks. > Regards, > Jignesh > > > Tom Lane wrote: > > Peter Eisentraut <[EMAIL PROTECTED]> writes: > >> Am Mittwoch, 17. Januar 2007 17:12 schrieb Tom Lane: > >>> "Jignesh K. Shah" <[EMAIL PROTECTED]> writes: > simple if I use -m64 for 64 bit then all end binaries are generated > 64-bit and the shared libraries are generated 32-bit and the compilation > fails (ONLY ON SOLARIS) since that particular line is only for the > condition Solaris AND gcc. > > If I use the COMPILER which is CC + CFLAGS it passes -m64 properly to it > and generates shared libraries 64-bit and the compile continues.. > >>> Hmm ... I see we're doing it that way already for some other platforms, > >>> but I can't help thinking it's a kluge. Wouldn't the correct answer be > >>> that -m64 needs to be in LDFLAGS? > > > >> The correct answer may be to put -m64 into CC. > > > > Did we conclude that that was a satisfactory solution, or is this still > > a live patch proposal? > > > > If -m64 in CC is the right solution, it should probably be mentioned in > > FAQ_Solaris. > > > > regards, tom lane > > ---(end of broadcast)--- > TIP 3: Have you checked our extensive FAQ? > >http://www.postgresql.org/docs/faq -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + Index: src/Makefile.shlib === RCS file: /cvsroot/pgsql/src/Makefile.shlib,v retrieving revision 1.108 diff -c -c -r1.108 Makefile.shlib *** src/Makefile.shlib 11 Feb 2007 19:31:45 - 1.108 --- src/Makefile.shlib 20 Feb 2007 22:45:06 - *** *** 191,197 ifeq ($(PORTNAME), solaris) ifeq ($(GCC), yes) ! 
LINK.shared = $(CC) -shared else # CFLAGS added for X86_64 LINK.shared = $(CC) -G $(CFLAGS) --- 191,197 ifeq ($(PORTNAME), solaris) ifeq ($(GCC), yes) ! LINK.shared = $(COMPILER) -shared # $(COMPILER) needed for -m64 else # CFLAGS added for X86_64 LINK.shared = $(CC) -G $(CFLAGS) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [pgsql-patches] pg_get_domaindef
Bruce Momjian wrote: I always felt it was better for us to have server functions that return schema-specific data rather than require every application to define its own functions. I realize they are duplicated in pg_dump, but even if we made an external library that pg_dump could share with applications, would it only be available to C applications? That seems quite limiting. I don't think so. I imagine that the maintainers of DBD::Pg and phppgadmin, for example, would be very likely to expose them. And I can certainly imagine using/exposing them in some psql slash commands. cheers andrew
Re: [HACKERS] DROP FUNCTION failure: cache lookup failed for relation X
Added to TODO: * Increase locking when DROPing objects so dependent objects cannot get dropped while the DROP operation is happening http://archives.postgresql.org/pgsql-hackers/2007-01/msg00937.php --- Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > Uh, where are we on this? > > Still in the think-about-it mode, personally ... my proposed fix is > certainly much too invasive to consider back-patching, so unless someone > comes up with a way-simpler idea, it's 8.3 material at best ... > > regards, tom lane -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] tsearch in core patch, for inclusion
Oleg Bartunov wrote: > On Tue, 20 Feb 2007, Alvaro Herrera wrote: > > > Bruce Momjian wrote: > >> > >> FYI, I added this to the patches queue because I think we decided > >> full-text indexing should be in the core. If I am wrong, please let me > >> know. > > > > One of the objections I remember to this particular implementation was > > that configuration should be done using functions rather than new syntax > > in gram.y. This seems a good idea because it avoids bloating the > > grammar, while still allowing dependency tracking, pg_dump support, > > syscache support etc. > > It's not so big addition to the gram.y, see a list of commands > http://mira.sai.msu.su/~megera/pgsql/ftsdoc/sql-commands.html. > SQL commands make FTS syntax clear and follow tradition to manage > system objects. From the user's side, I'd be very unhappy to configure > FTS, which can be very complex, using functions. All we want is to > provide users clear syntax. I looked at the diff file and the major change in gram.y is the creation of a new object type FULLTEXT, so you can CREATE, ALTER and DROP FULLTEXT. I don't know fulltext administration well enough, so if Oleg says a function API would be too complex, I am OK with his new parser syntax. -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
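For historical context on what this DDL-based approach looks like in practice: the patch's FULLTEXT object type was renamed before release and shipped in 8.3 as TEXT SEARCH objects, e.g.:

```sql
-- The dedicated-syntax style under discussion, as it ultimately appeared in
-- 8.3 (CREATE/ALTER/DROP over configurations, parsers, dictionaries,
-- and templates):
CREATE TEXT SEARCH CONFIGURATION public.my_english ( COPY = pg_catalog.english );

ALTER TEXT SEARCH CONFIGURATION public.my_english
    ALTER MAPPING FOR asciiword WITH english_stem;

DROP TEXT SEARCH CONFIGURATION public.my_english;
```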
Re: [HACKERS] tsearch in core patch, for inclusion
On Tue, 20 Feb 2007, Alvaro Herrera wrote: Bruce Momjian wrote: FYI, I added this to the patches queue because I think we decided full-text indexing should be in the core. If I am wrong, please let me know. One of the objections I remember to this particular implementation was that configuration should be done using functions rather than new syntax in gram.y. This seems a good idea because it avoids bloating the grammar, while still allowing dependency tracking, pg_dump support, syscache support etc. It's not so big addition to the gram.y, see a list of commands http://mira.sai.msu.su/~megera/pgsql/ftsdoc/sql-commands.html. SQL commands make FTS syntax clear and follow tradition to manage system objects. From the user's side, I'd be very unhappy to configure FTS, which can be very complex, using functions. All we want is to provide users clear syntax. Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83 ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] msvc failure in largeobject regression test
Was this problem addressed? --- Magnus Hagander wrote: > On Tue, Jan 23, 2007 at 11:39:23AM -0800, Jeremy Drake wrote: > > On Tue, 23 Jan 2007, Magnus Hagander wrote: > > > > > On Tue, Jan 23, 2007 at 09:31:40AM -0500, Andrew Dunstan wrote: > > > > Magnus Hagander wrote: > > > > >Hi! > > > > > > > > > >I get failures for the largeobject regression tests on my vc++ build. I > > > > >don't think this has ever worked, given that those tests are fairly > > > > >new. > > > > >Any quick ideas on what's wrong before I dig deeper? > > > > > > > > > > > > > > [snip] > > > > > > > > I wonder if this is a line-end issue? Assuming you are working from CVS, > > > > does your client turn \n into \r\n ? I see that other windows boxes are > > > > happily passing this test on the buildfarm, and of course the mingw cvs > > > > doesn't adjust line endings. > > > > > > Bingo! > > > > > > That's it. I copied the file in binary mode from a linux box and now it > > > passes. > > > > I thought about that when I wrote it, and thus tried it under mingw and > > cygwin without issue ;) I don't think the regression tests were in a > > position of running on the msvc build at the time... My thought for what > > to do if this did run into a problem would be an alternate output file > > that is also acceptable (I don't know what they're called but other tests > > have them IIRC). > > Either that, or we require a checkout using Unix style linefeeds. I've > confirmed that removing the file and checking it back out with "cvs --lf > update tenk.data" works - tests pass fine. > > Yet another option might be to flag that file as binary in cvs, in which > case I think cvsnt shouldn't go mess with it.
> > //Magnus -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] [pgsql-patches] pg_get_domaindef
I always felt it was better for us to have server functions that return schema-specific data rather than require every application to define its own functions. I realize they are duplicated in pg_dump, but even if we made an external library that pg_dump could share with applications, would it only be available to C applications? That seems quite limiting. Of course, if people don't need these functions, then we shouldn't have them. Seems we have to decide on this one so we can update the TODO or apply the patch. --- Tom Lane wrote: > Andrew Dunstan <[EMAIL PROTECTED]> writes: > > FAST PostgreSQL wrote: > >> Please find attached the patch with modifications > > > are you proposing to implement the other functions in this TODO item > > (pg_get_acldef(), pg_get_typedefault(), pg_get_attrdef(), > > pg_get_tabledef(), pg_get_functiondef() ) ? > > I haven't entirely understood the use case for any of these. It's not > pg_dump, for a number of reasons: one being that pg_dump still has to > support older backend versions, and another being that every time we > let backend SnapshotNow functions get involved, we take another hit to > pg_dump's claim to produce a consistent MVCC snapshot. > > But my real objection is: do we really want to support duplicative code > in both pg_dump and the backend? Updating pg_dump is already a major > PITA whenever one adds a new feature; doubling that work isn't > attractive. (And it'd be double, not just a copy-and-paste, because of > the large difference in the operating environment.) So I want to hear a > seriously convincing use-case that will justify the maintenance load we > are setting up for ourselves. "Somebody might want this" is not > adequate. > > Perhaps a better area of work would be the often-proposed refactoring of > pg_dump into a library and driver program, wherein the library could > expose individual functions such as "fetch the SQL definition of this > object".
Unfortunately, that'll be a huge project with no payoff until > the end... > > regards, tom lane -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] tsearch in core patch, for inclusion
Bruce Momjian wrote: > > FYI, I added this to the patches queue because I think we decided > full-text indexing should be in the core. If I am wrong, please let me > know. One of the objections I remember to this particular implementation was that configuration should be done using functions rather than new syntax in gram.y. This seems a good idea because it avoids bloating the grammar, while still allowing dependency tracking, pg_dump support, syscache support etc. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] No ~ operator for box, point
Added to TODO: * Add missing operators for geometric data types Some geometric types do not have the full suite of geometric operators, e.g. box @> point --- Jim Nasby wrote: > * Add missing operators for geometric data types and operators > > There are geometric data types that do not have the full suite > of geometric operators > defined; for example, box @> point does not exist. > > On Jan 26, 2007, at 9:32 PM, Bruce Momjian wrote: > > > > > Can I get a TODO on this? > > > > -- > > - > > > > Jim Nasby wrote: > >> On Jan 25, 2007, at 6:26 PM, Tom Lane wrote: > >>> Martijn van Oosterhout writes: > On Thu, Jan 25, 2007 at 01:59:33PM -0500, Merlin Moncure wrote: > > On 1/25/07, Jim C. Nasby <[EMAIL PROTECTED]> wrote: > >> decibel=# select box '((0,0),(2,2))' ~ point '(1,1)'; > >> ERROR: operator does not exist: box ~ point > > > > I don't see a reason, although you can do it with polygon and not > > box. > >>> > >>> Seems like an old oversight. > >> > >> Ok. If I ever get some time I'll submit a patch to bring everything > >> in-line (there's other missing operators as well). > >> > > Also, I can't find the ~ operator defined for polygon in the > > documentation, am I missing something? > >>> > >>> ~ is deprecated, "contains" is preferentially spelled @> now. > >> > >> Ok, I'll keep that in mind. > >> -- > >> Jim Nasby[EMAIL PROTECTED] > >> EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) > >> > >> > >> > >> ---(end of > >> broadcast)--- > >> TIP 7: You can help support the PostgreSQL project by donating at > >> > >> http://www.postgresql.org/about/donate > > > > -- > > Bruce Momjian [EMAIL PROTECTED] > > EnterpriseDBhttp://www.enterprisedb.com > > > > + If your life is a hard drive, Christ can be your backup. 
+ > > > > -- > Jim Nasby [EMAIL PROTECTED] > EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
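The workaround discussed upthread, for reference: since box ~ point (and its preferred @> spelling) was missing, converting the box to a polygon works, because polygon has the containment operator:

```sql
-- box ~ point did not exist at the time:
--   SELECT box '((0,0),(2,2))' ~ point '(1,1)';    -- ERROR: operator does not exist
-- Casting the box to a polygon sidesteps the gap, since polygon ~ point exists
-- (~ being the deprecated spelling of "contains"):
SELECT polygon(box '((0,0),(2,2))') ~ point '(1,1)';   -- true
```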
Re: [HACKERS] tsearch in core patch, for inclusion
FYI, I added this to the patches queue because I think we decided full-text indexing should be in the core. If I am wrong, please let me know. --- Teodor Sigaev wrote: > We (Oleg and me) are glad to present tsearch in core of pgsql patch. In > basic, > layout, functions, methods, types etc are the same as in current tsearch2 > with a > lot of improvements: > > - pg_ts_* tables now are in pg_catalog > - parsers, dictionaries, configurations now have owner and namespace > similar to > other pgsql's objects like tables, operator classes etc > - current tsearch configuration is managed with a help of GUC variable > tsearch_conf_name. > - choosing of tsearch cfg by locale may be done for each schema separately > - managing of tsearch configuration with a help of SQL commands, not with > insert/update/delete statements. This allows to drive dependencies, > correct dumping and dropping. > - psql support with a help of \dF* commands > - add all available Snowball stemmers and corresponding configuration > - correct memory freeing by any dictionary > > Work is sponsored by EnterpriseDB's PostgreSQL Development Fund. > > patch: http://www.sigaev.ru/misc/tsearch_core-0.33.gz > docs: http://mira.sai.msu.su/~megera/pgsql/ftsdoc/ (not yet completed and > it's > not yet a patch, just a SGML source) > > Implementation details: > - directory layout >src/backend/utils/adt/tsearch - all IO function and simple operations >src/backend/utils/tsearch - complex processing functions, including > language processing and dictionaries > - most of snowball dictionaries are placed in separate .so library and >they plug in into data base by similar way as character conversation >library does. > > If there aren't objections then we plan commit patch tomorrow or after > tomorrow. > Before committing, I'll changes oids from 5000+ to lower values to prevent > holes > in oids. And after that, I'll remove tsearch2 contrib module. 
> > -- > Teodor Sigaev E-mail: [EMAIL PROTECTED] > WWW: http://www.sigaev.ru/ -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] tsearch in core patch, for inclusion
Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches It will be applied as soon as one of the PostgreSQL committers reviews and approves it. --- Teodor Sigaev wrote: > We (Oleg and me) are glad to present tsearch in core of pgsql patch. In > basic, > layout, functions, methods, types etc are the same as in current tsearch2 > with a > lot of improvements: > > - pg_ts_* tables now are in pg_catalog > - parsers, dictionaries, configurations now have owner and namespace > similar to > other pgsql's objects like tables, operator classes etc > - current tsearch configuration is managed with a help of GUC variable > tsearch_conf_name. > - choosing of tsearch cfg by locale may be done for each schema separately > - managing of tsearch configuration with a help of SQL commands, not with > insert/update/delete statements. This allows to drive dependencies, > correct dumping and dropping. > - psql support with a help of \dF* commands > - add all available Snowball stemmers and corresponding configuration > - correct memory freeing by any dictionary > > Work is sponsored by EnterpriseDB's PostgreSQL Development Fund. > > patch: http://www.sigaev.ru/misc/tsearch_core-0.33.gz > docs: http://mira.sai.msu.su/~megera/pgsql/ftsdoc/ (not yet completed and > it's > not yet a patch, just a SGML source) > > Implementation details: > - directory layout >src/backend/utils/adt/tsearch - all IO function and simple operations >src/backend/utils/tsearch - complex processing functions, including > language processing and dictionaries > - most of snowball dictionaries are placed in separate .so library and >they plug in into data base by similar way as character conversation >library does. > > If there aren't objections then we plan commit patch tomorrow or after > tomorrow. > Before committing, I'll changes oids from 5000+ to lower values to prevent > holes > in oids. And after that, I'll remove tsearch2 contrib module. 
> > -- > Teodor Sigaev E-mail: [EMAIL PROTECTED] > WWW: http://www.sigaev.ru/ -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
[HACKERS] Column storage positions
Inspired by this thread [1], and in particular by the idea of storing three numbers (permanent ID, on-disk storage position, display position) for each column, I spent a little time messing around with a prototype implementation of column storage positions to see what kind of difference it would make. The results were encouraging: on a table with 20 columns of alternating smallint and varchar(10) datatypes, selecting the max() of one of the rightmost int columns across 1 million rows ran around 3 times faster. The same query on the leftmost varchar column (which should suffer the most from this change) predictably got a little slower (about 10%); I couldn't measure a performance drop on the rightmost varchar columns. The table's size didn't drop much in this case, but a different table of 20 alternating int and smallint columns showed a 20% slimmer disk footprint, pretty much as expected. Pgbenching showed no measurable difference, which isn't surprising since the pgbench test tables consist of just int values with char filler at the end. So here is a proposal for separating a column's storage position from its permanent ID. I've ignored the display position piece of the original thread because display positions don't do much other than save you the hassle of creating a view on top of your table, while storage positions have demonstrable, tangible benefits. And there is no reason to connect the two features; display positions can easily be added separately at a later point. We want to decouple a column's on-disk storage position from its permanent ID for two reasons: to minimize the space lost to alignment padding between fields, and to speed up access to individual fields. The system will automatically assign new storage positions when a table is created, and when a table alteration requires a rewrite (currently just adding a column with a default, or changing a column datatype). 
To allow users to optimize tables based on the fields they know will be frequently accessed, I think we should extend ALTER TABLE to accept user-assigned storage positions (something like "ALTER TABLE ALTER col SET STORAGE POSITION X"). This command would also be useful for another reason discussed below. In my prototype, I used these rules to determine columns' storage order: 1) fixed-width fields before variable-width, dropped columns always last 2) fixed-width fields ordered by increasing size 3) not-null fields before nullable fields There are other approaches worth considering - for example, you could imagine swapping the priority of rules 2 and 3. Resultant tables would generally have more alignment waste, but would tend to have slightly faster field access. I'm really not sure what the optimal strategy is since every user will have a slightly different metric for "optimal". In any event, either of these approaches is better than the current situation. To implement this, we'll need a field (perhaps attstoragepos?) in pg_attribute to hold the storage position. It will equal attnum until it is explicitly reassigned. The routines in heaptuple.c need to quickly loop through the fields of a tuple in storage order rather than attnum order, so I propose extending TupleDesc to hold an "attrspos" array that sits alongside the attrs array. In the prototype I used an array of int2 indices into the attrs array, ordered by storage position. These changes cause a problem in ExecTypeFromTLInternal: this function calls CreateTemplateTupleDesc followed by TupleDescInitEntry, assuming that attnum == attstoragepos for all tuples. With the introduction of storage positions, this of course will no longer be true. I got around this by having expand_targetlist, build_physical_tlist, and build_relation_tlist make sure each TargetEntry (for targetlists corresponding to either insert/update tuples, or base tuples pulled straight from the heap) gets a correct resorigtbl and resname. 
Then ExecTypeFromTLInternal first tries calling a new function TupleDescInitEntryAttr, which hands off to TupleDescInitEntry and then performs a syscache lookup to update the storage position using the resorigtbl. This is a little ugly because ExecTypeFromTLInternal doesn't know in advance what kind of tupledesc it's building, so it needs to fall back to the old method whenever the syscache lookup fails, but it was enough to pass the regression tests. I could use some advice on this - there's probably a better way to do it.

Another problem relates to upgrades. With tools like pg_migrator now on pgfoundry, people will eventually expect quick upgrades that don't require rewriting each table's data. Storage positions would cause a problem for every version X -> version Y upgrade with Y >= 8.3, even when X is also >= 8.3, because a version X table could always have been altered without a rewrite into a structure different from what Y's CREATE TABLE will choose. I don't think it's as simple as just using the above-mentioned ALTER TABLE extension to a
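To make the storage-order rules listed earlier concrete, here is a sketch (my illustration, not the prototype's actual code) of a sort key that encodes rules 1-3 in priority order:

```python
# Each column: (name, fixed width in bytes or None for variable-width,
#               not_null flag, dropped flag)
cols = [
    ("a", 2,    False, False),  # smallint
    ("b", None, False, False),  # varchar
    ("c", 8,    True,  False),  # bigint NOT NULL
    ("d", 4,    False, True),   # dropped int column
    ("e", 4,    True,  False),  # int NOT NULL
]

def storage_key(col):
    name, width, not_null, dropped = col
    return (
        dropped,           # rule 1: dropped columns always last
        width is None,     # rule 1: fixed-width before variable-width
        width or 0,        # rule 2: fixed-width by increasing size
        not not_null,      # rule 3: NOT NULL before nullable
    )

order = [name for name, *_ in sorted(cols, key=storage_key)]
print(order)  # ['a', 'e', 'c', 'b', 'd']
```

Swapping the priority of rules 2 and 3 is just a matter of reordering the last two tuple elements.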
Re: [HACKERS] Modifying and solidifying contrib
Are we doing this?

---

Joshua D. Drake wrote:
> Hello,
>
> With all the recent discussion on contrib modules etc., I would like to
> offer the following suggestion. I am willing to do a good portion of the
> work myself and I can get it done before feature freeze. I will need
> help with the global makefile stuff however, so that is one dependency.
>
> Add directory /modules
> Modules are compiled and installed by default but not enabled.
> Modules in 8.3 currently are:
>    pgrowlocks
>    pg_freespacemap (to be renamed pgfreespacemap to be consistent)
>    pgstattuple
>    pgcrypto
>    xml2
>    pgbuffercache
>    initagg
>
> Requirements for /modules
>    Must go through normal vetting process on -hackers
>    Must include patches to core documentation in DocBook
>    Must include test cases? I don't recall if we have regress for all
>    contrib stuff.
>
> Keep directory contrib
> Contrib is not compiled or installed by default
> Contrib in 8.3 would be:
>    start-scripts
>    pgbench (which I think really should be a foundry project)
>    vacuumlo (is this even required anymore?)
>    adminpack
>    btree_gist
>    etc...
> Requirements for /contrib
>    Must go through normal vetting process on -hackers
>    Must include README
>    Must include test cases? Same questions as for modules
>
> Thoughts, flames?
>
> Sincerely,
>
> Joshua D. Drake
>
> --
> === The PostgreSQL Company: Command Prompt, Inc. ===
> Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
> Providing the most comprehensive PostgreSQL solutions since 1997
> http://www.commandprompt.com/
>
> Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
> PostgreSQL Replication: http://www.commandprompt.com/products/

-- Bruce Momjian <[EMAIL PROTECTED]>  http://momjian.us
EnterpriseDB  http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup.
Re: [HACKERS] New feature request: FlashBack Query
On 2/20/07, Hannu Krosing <[EMAIL PROTECTED]> wrote:
> He probably meant longer transactions and several versions visible to
> different backends.

Yes, he may have... but I was responding to the statements he made.

> > but why shouldn't that be a bit slower, it isn't common practice anyway.
>
> Not for pure OLAP, at least when you have fairly fast transactions. But
> it can slow things down when you have some hotspot tables.

True, but hotspots are hotspots and no matter what caused them or where they are, they slow down performance in one area or another. Limiting hotspots is generally an application-level design decision anyway.

> > Same with rollbacks... why should they optimize for them when 97% of
> > transactions commit?
>
> Or the other way around - you should write code where most transactions
> commit ;)

That's what I said, Oracle shouldn't optimize for rollbacks when most transactions commit.

> Hmm. How can it check visibility at block level and at the same time do
> in-place updates on single tuples ?

In most cases, the block-level SCN determines transaction-level visibility. Now, row locks can exist within that page, but they don't determine visibility... they determine the UNDO location which contains the data required to rebuild a read-consistent version of the block.

> My proposal of keeping visibility info in a separate heap would help to
> get similar results, that is mostly 1 check per page. That would also
> cover much of the index lookup cases below.

Most definitely.

> I don't think we will ever move to rollback segments, but for some
> use-cases moving visibility to a separate heap could make sense.

Yes.

> And if we want to bring back time travel (see another thread about
> "Flashback Queries"), then we may end up implementing the original
> PostgreSQL design spec and make VACUUM siphon dead tuples over to
> archive relations, which already starts looking a little like rollback
> segments, only for other purposes :)

Yes.
> Using a system extensively can also create blind spots about some of the
> systems (mis)features. One learns to avoid doing some things without
> consciously knowing about it.

I've used 'em all and can certainly name issues with Oracle. However, we're discussing improving PostgreSQL, I was responding to Greg's statements, and I don't see the need to bring up unrelated Oracle implementation details which will just lead to a general anti-Oracle discussion.

> To get a really meaningful discussion we should involve someone who has
> *designed* them, not merely used them.

True. My comment was aimed more at uninformed discussion, which leads to Oracle-bashing. Those who have at least used and administered Oracle in production tend to understand Oracle's design decisions and related issues better than those who have only heard of Oracle's issues. I live in the real world and can admit certain failures of any database system regardless of which I prefer. No single database is best for every task. I just didn't want the discussion going where it normally goes: to "we're right and they're wrong". Can we move offline or to another thread if we want to continue discussing Oracle specifics? Otherwise, let's focus on flashback-like functionality in this thread.

-- Jonah H. Harris, Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
33 Wood Ave S, 3rd Floor | [EMAIL PROTECTED]
Iselin, New Jersey 08830 | http://www.enterprisedb.com/
Re: [HACKERS] New feature request: FlashBack Query
On 2/20/07, Rod Taylor <[EMAIL PROTECTED]> wrote:
> Do 97% of transactions commit because Oracle has slow rollbacks and
> developers are working around that performance issue, or because they
> really commit?

Again, off-topic, but 97% of all transactions commit according to Jim Gray and his research... not anything related to Oracle.

> I have watched several developers that would prefer to issue numerous
> selects to verify things like foreign keys in the application in order
> to avoid a rollback.

That's just bad development.

> Anyway, I don't have experience with big Oracle applications but I'm
> not so sure that 97% of transactions would commit if rollbacks were
> cheaper.

Again, those stats are not related to Oracle, but to databases in general.

-- Jonah H. Harris, Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
33 Wood Ave S, 3rd Floor | [EMAIL PROTECTED]
Iselin, New Jersey 08830 | http://www.enterprisedb.com/
Re: [HACKERS] statement_timeout doesnt work within plpgsql by design?
On Tuesday 20 February 2007 12:50, Tom Lane wrote:
> Robert Treat <[EMAIL PROTECTED]> writes:
> > pagila=# create or replace function test() returns bool as $$ begin set
> > statement_timeout = 3000; perform pg_sleep(4) ; return true; end $$
> > language plpgsql;
> > CREATE FUNCTION
>
> statement_timeout is measured across an entire interactive command, not
> individual commands within a function; and the timeout that applies to
> an interactive command is determined at its beginning. So the above
> doesn't do what you think.

Well, I'd be happy if it caused the entire function to bail out or if it caused individual statements within a function to bail out, but it does neither. I can see how that would be a bit tricky to implement, though.

-- Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
Re: [HACKERS] autovacuum next steps
I'm wondering if we can do one better... Since what we really care about is I/O responsiveness for the rest of the system, could we just time how long I/O calls take to complete? I know that gettimeofday can have a non-trivial overhead, but do we care that much about it in the case of autovac?

On Fri, Feb 16, 2007 at 05:37:26PM -0800, Ron Mayer wrote:
> Alvaro Herrera wrote:
> > Once autovacuum_naptime... autovacuum_max_workers...
> > How does this sound?
>
> The knobs exposed on autovacuum feel kinda tangential to
> what I think I'd really want to control.
>
> IMHO "vacuum_mbytes_per_second" would be quite a bit more
> intuitive than cost_delay, naptime, etc.
>
> ISTM I can relatively easily estimate and/or spec out how
> much "extra" I/O bandwidth I have per device for vacuum,
> and would pretty much want vacuum to be constantly
> running on whichever table needs it the most, so
> long as it can stay under that bandwidth limit.
>
> Could vacuum have a tunable that says "X MBytes/second"
> (perhaps per device) and have it measure how much I/O
> it's actually doing and try to stay under that limit?
>
> For more fine-grained control a cron job could go
> around setting different MBytes/second limits during
> peak times vs idle times.
>
> If people are concerned about CPU-intensive vacuums
> instead of I/O-intensive ones (does anyone experience
> that?), another tunable "vacuum_percent_of_cpu" would
> be more straightforward than delay_cost, cost_page_hit,
> etc. But I'd be a bit surprised if CPU-intensive
> vacuums are common.

-- Jim Nasby  [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)
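As a sketch of what the suggested "vacuum_mbytes_per_second" knob could look like internally (hypothetical names, my simplification, not proposed backend code): account for each I/O and sleep whenever cumulative I/O runs ahead of the configured rate:

```python
import time

class VacuumIOThrottle:
    """Keep cumulative I/O under mbytes_per_second by sleeping
    whenever we get ahead of budget (hypothetical sketch)."""

    def __init__(self, mbytes_per_second):
        self.rate = mbytes_per_second * 1024 * 1024  # bytes/sec
        self.start = time.monotonic()
        self.done = 0

    def account(self, nbytes):
        self.done += nbytes
        # Seconds by which we're ahead of the allowed schedule:
        ahead = self.done / self.rate - (time.monotonic() - self.start)
        if ahead > 0:
            time.sleep(ahead)

throttle = VacuumIOThrottle(mbytes_per_second=10)
throttle.account(8192)  # one 8K page read or written
```

Timing the I/O calls themselves, as suggested above, could then feed back into the same accounting without a separate gettimeofday per page.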
Re: [HACKERS] New feature request: FlashBack Query
On Feb 20, 2007, at 1:40 PM, Tom Lane wrote:
> RPK <[EMAIL PROTECTED]> writes:
> > I did not mean asking for undo from a life-time log. Since FlashBack
> > Technology is already there, I just mean that world's most advanced
> > database (PostgreSQL, as they say), must have an optimized way for
> > undoing of at least a week changes.
>
> You're living in a dream world. Do you know any Oracle DBs who keep
> enough rollback segments to go back a week?

Ours go for a good 6 hours sometimes :-D

// Theo Schlossnagle
// Esoteric Curio: http://www.lethargy.org/~jesus/
Re: [HACKERS] New feature request: FlashBack Query
RPK <[EMAIL PROTECTED]> writes:
> I did not mean asking for undo from a life-time log. Since FlashBack
> Technology is already there, I just mean that world's most advanced
> database (PostgreSQL, as they say), must have an optimized way for
> undoing of at least a week changes.

You're living in a dream world. Do you know any Oracle DBs who keep enough rollback segments to go back a week?

regards, tom lane
Re: [HACKERS] New feature request: FlashBack Query
Andrew,

>> Demanding unlimited undo at some time that is arbitrarily distant in the
>> future strikes me as wholly unreasonable.

I did not mean asking for undo from a life-time log. Since FlashBack Technology is already there, I just mean that the world's most advanced database (PostgreSQL, as they say) must have an optimized way of undoing at least a week's changes. A week's log is enough, and PostgreSQL can keep removing old logs automatically.

Secondly, it must be left to the user to decide the number of days of archive he wants to store. Again, up to a week max.

> RPK wrote:
>> I agree that TimeStamp creates an overhead, but I just want to know if an
>> accidental update happened to a table and this incident got traced three
>> days after, what facility PGSQL provide to bring the table to its original
>> condition. You can't wait regretting on why you did not run ROLLBACK
>> before COMMIT. (Correct me. I am only a user).
>
> Why the heck can't you create a reversing transaction? That's what
> ordinary mortals do. Demanding unlimited undo at some time that is
> arbitrarily distant in the future strikes me as wholly unreasonable.
>
> What do you mean by "accidental update"? What you really appear to mean
> is that a program or a human operator has made an error, and incorrectly
> told the database to commit a transaction. The answer surely is to
> correct the behaviour of the program or human, rather than wanting the
> database to provide an undo facility. Alternatively, this should be
> handled at the application layer, using something like table_log.
>
> Some things just don't work well with this sort of facility. Just ask
> your bookie if you can undo a bet that you "accidentally" placed with
> him and which, three days later, you discover (after the race) was a
> mistake.
cheers
andrew

--
View this message in context: http://www.nabble.com/New-feature-request%3A-FlashBack-Query-tf3245023.html#a9067564
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
Re: [HACKERS] statement_timeout doesnt work within plpgsql by design?
Robert Treat <[EMAIL PROTECTED]> writes:
> pagila=# create or replace function test() returns bool as $$ begin set
> statement_timeout = 3000; perform pg_sleep(4) ; return true; end $$
> language plpgsql;
> CREATE FUNCTION

statement_timeout is measured across an entire interactive command, not individual commands within a function; and the timeout that applies to an interactive command is determined at its beginning. So the above doesn't do what you think.

regards, tom lane
[HACKERS] statement_timeout doesnt work within plpgsql by design?
pagila=# select version();
                           version
--------------------------------------------------------------
 PostgreSQL 8.2.3 on i386-pc-solaris2.10, compiled by cc -Xa
(1 row)

pagila=# create or replace function test() returns bool as $$ begin set statement_timeout = 3000; perform pg_sleep(4) ; return true; end $$ language plpgsql;
CREATE FUNCTION

pagila=# select test();
 test
------
 t
(1 row)

pagila=# select test();
ERROR:  canceling statement due to statement timeout
CONTEXT:  SQL statement "SELECT pg_sleep(4)"
PL/pgSQL function "test" line 1 at perform

is this behavior by design? if so, why would you design it that way? :-)

-- Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
Re: [HACKERS] New feature request: FlashBack Query
Tom Lane wrote: > August Zajonc <[EMAIL PROTECTED]> writes: >> The key is how lightweight the setup could be, which matters because >> clients are not always willing to pay for a PITR setup. The low overhead >> would mean you'd feel fine about setting guc to 1hr or so. > > This would have exactly the same performance consequences as always > having an hour-old open transaction. I'm afraid that describing it > as "low overhead" is mere wishful thinking: it would cripple vacuuming > of high-update tables and greatly increase the typical load on pg_clog > and pg_subtrans. We already know that pg_subtrans contention can be a > source of context-swap storms, with the size of the window back to > GlobalXmin being the controlling factor for how bad it gets. > > It's possible that this last could be addressed by separating the > concept of "old enough to be vacuumed" from GlobalXmin, but it's > certainly not a trivial thing. Isn't GlobalXmin for open transactions? I thought the idea was that everything goes as usual, but you can flip a knob and say that vacuum doesn't vacuum anything more recent than GlobalXmin less x transactions. Then you can look at your transactions per second and get a rough window to work within. Or if there are timestamps on commits, that would switch to a time interval, which is more user friendly. You end up simply delaying when an hour's worth of transactions gets vacuumed. For folks doing nightly cron job vacuums, not too bad. Autovacuum isn't on by default :) Of course, this will be clumsy if not per database. But the thought might be to take advantage of the flashback data already present under the MVCC model as long as vacuum hasn't hit things (and being willing to stop activity on a database etc). Given that you are delaying a vacuum rather than being more aggressive, and know you can already vacuum up to a more recent transaction xmin, I dunno... Does anything depend (other than performance) on vacuum actually vacuuming as far as it can? 
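The horizon computation being suggested above can be sketched like this (my simplification; names are mine, and XID wraparound is ignored):

```python
def delayed_vacuum_horizon(global_xmin, latest_xid, flashback_xacts):
    """Vacuum only up to whichever is older: the usual open-transaction
    horizon (global_xmin), or latest_xid minus the configured
    flashback window of x transactions."""
    return min(global_xmin, latest_xid - flashback_xacts)

# Idle system: the flashback window is what holds vacuum back.
print(delayed_vacuum_horizon(global_xmin=9990, latest_xid=10000,
                             flashback_xacts=5000))  # 5000

# A long-open transaction is already the limiting factor.
print(delayed_vacuum_horizon(global_xmin=4000, latest_xid=10000,
                             flashback_xacts=5000))  # 4000
```

Since the result is never newer than global_xmin, this only ever delays vacuuming; it never vacuums tuples that an open transaction might still see.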
- August
Re: [HACKERS] New feature request: FlashBack Query
"Jonah H. Harris" <[EMAIL PROTECTED]> writes:
> On 2/20/07, Gregory Stark <[EMAIL PROTECTED]> wrote:
>> It's more like Oracle is optimized for data that's committed
>> long in the past and we're optimized for data that's
>> been recently updated.
>
> Wrong. When Oracle says it's committed, it's committed. No
> difference between when, where, and how. In Oracle, the committed
> version is *always* the first presented to the user...

Sure, and if it was committed long in the past then you can use it. If it's committed recently then you'll have to start looking up rollback data instead. The rest of your post seems to all be predicated on the idea that if data is committed then that's all you'll need to look at. But that's missing precisely the point of what I was saying:

>> In a sense then it's the opposite of what we usually say. Oracle is
>> optimized for mostly static data. Postgres is optimized for changing data.

By "changing data" I meant "data in flux", not the action of making changes to the data. Looking at data in flux in Oracle -- even other data that's unchanged but lives on the same page as some record that's in flux -- will require you to look up rollback data and possibly even have to follow many pages of chained rollback data. Looking at data in Postgres has no additional overhead when it's data in flux versus old static data.

-- Gregory Stark
EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] Multiple Storage per Tablespace, or Volumes
On Mon, Feb 19, 2007 at 07:10:52PM -0800, David Fetter wrote:
> Isn't this one of the big use cases for table partitioning?

Sure, but you can't detach that data in the meantime, AFAIK. Maybe I've missed something. If I have 10 years of finance data, and I have to keep it all online all the time, my electricity costs alone are outrageous. If I can know _where_ the data is without being able to get it until I've actually made it available, then my query tools could be smart enough to detect "table partitioned; this data is offline", I could roll back to my savepoint, return a partial result set to the user, and tell it "call back in 24 hours for your full report".

Yes, I know, hands waving in the air. But I already said I was having a "you know what would be sweet" moment.

A

-- Andrew Sullivan | [EMAIL PROTECTED]
This work was visionary and imaginative, and goes to show that visionary and imaginative work need not end up well. --Dennis Ritchie
Re: [HACKERS] New feature request: FlashBack Query
On Tue, 2007-02-20 at 10:20, Jonah H. Harris wrote:
> On 2/20/07, Gregory Stark <[EMAIL PROTECTED]> wrote:
> > I used to say that too but I've since realized it's not really true.
>
> Heh, take a joke man... I was following up on Drake's email :)
>
> But, since you want to discuss your view of the systems openly... I'll
> gladly reply :)
>
> > It's more like Oracle is optimized for data that's committed
> > long in the past and we're optimized for data that's
> > been recently updated.
>
> Wrong. When Oracle says it's committed, it's committed. No
> difference between when, where, and how. In Oracle, the committed
> version is *always* the first presented to the user... it takes time
> to go back and look at older versions;

Older versions are also committed :)

He probably meant longer transactions and several versions visible to different backends.

> but why shouldn't that be a bit slower, it isn't common practice anyway.

Not for pure OLAP, at least when you have fairly fast transactions. But it can slow things down when you have some hotspot tables.

> Same with rollbacks... why
> should they optimize for them when 97% of transactions commit?

Or the other way around - you should write code where most transactions commit ;)

> > In Oracle the data that's been committed long in the past requires no
> > transactional overhead but the data that's been recently updated
> > requires lots of work to fetch the right version.
>
> Wrong. The same transactional overhead applies to *all* data in
> Oracle no matter when it was committed. Similarly, the only
> overhead required occurs when someone is querying in serializable
> isolation or on read-committed data before or during a commit. On
> short OLTP-type transactions, Oracle has the most optimized solution.
>
> > In Postgres it's the other way around.
> > data that's been committed deleted long ago requires extra work to
> > clean up but data that's been recently changed requires little
> > additional work to see the correct version.
>
> PostgreSQL has little additional work? Like, checking the validity of
> every tuple? Oracle checks visibility at the block level, so there's
> *much* less overhead.

Hmm. How can it check visibility at block level and at the same time do in-place updates on single tuples ?

> Take most of the benchmarks which can hold ~200
> tuples per block. Tables in those benchmarks are 100+ million rows.
> On a sequential scan, Oracle would perform 500K checks, PostgreSQL
> would perform *all* 100M checks (not counting dead versions due to row
> updates and the like).

My proposal of keeping visibility info in a separate heap would help to get similar results, that is mostly 1 check per page. That would also cover much of the index lookup cases below.

> On an index scan, Oracle not only has a
> smaller index and less to check, but also knows the tuple will be
> committed and will, in most cases, not have to perform additional
> physical I/O to find the latest version of a row.

It is also the reason why you can forget about doing simultaneous data loading and queries on the same table. If you avoid doing that, then it "will, in most cases, not have to perform additional physical I/O to find the latest version of a row" ;)

> Of course, Oracle's design is much more complicated in its ability to
> build read-committed versions of the blocks at runtime; something the
> simplicity of PostgreSQL's MVCC design eliminates.
>
> > In a sense then it's the opposite of what we usually say. Oracle is
> > optimized for mostly static data. Postgres is optimized for changing
> > data.
>
> Care to share an example to prove it?
>
> Like always, there are pros and cons with both designs, but denying
> facts gets us nowhere. We're off-topic now... so we should either
> move this off line or to another thread.
> I personally don't see much of a reason to continue discussing MVCC
> designs anymore as Oracle's is patented and PostgreSQL's is highly
> unlikely to change drastically.

I don't think we will ever move to rollback segments, but for some use-cases moving visibility to a separate heap could make sense. And if we want to bring back time travel (see another thread about "Flashback Queries"), then we may end up implementing the original PostgreSQL design spec and make VACUUM siphon dead tuples over to archive relations, which already starts looking a little like rollback segments, only for other purposes :)

> As always, I'd suggest discussing improvements, not the status quo.
> Likewise, discussing Oracle's design, drawbacks, and limitations
> without having used it extensively is quite obvious to anyone familiar
> with Oracle.

Using a system extensively can also create blind spots about some of the systems (mis)features. One learns to avoid doing some things without consciously knowing about it.

> Don't get me wrong, it's fine to prefer one design to
> another, but pushing discussion items c
Re: [HACKERS] HOT WIP Patch - version 2
On Tue, Feb 20, 2007 at 08:31:45PM +0530, Pavan Deolasee wrote:
> I see your point, but as you mentioned do we really care ? The chain
> needs to be broken so that the intermediate DEAD tuples can be
> vacuumed. We can't vacuum them normally because they could
> be a part of live HOT-update chain. Resetting the HOT-updated
> status of the root tuple helps to mark the index entry LP_DELETE
> once the entire HOT-update chain is dead.
> ...

For some reason this paragraph raised a query in my mind. Will we be able to toggle this new "hot update" code at configure time, so that we can measure what sort of effect this change has once it is complete? Even if only during the early testing cycles for the next release, I think it would be useful.

Cheers,
mark

-- [EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED]
Ottawa, Ontario, Canada  http://mark.mielke.cc/
Re: [HACKERS] New feature request: FlashBack Query
> Wrong. When Oracle says it's committed, it's committed. No
> difference between when, where, and how. In Oracle, the committed
> version is *always* the first presented to the user... it takes time
> to go back and look at older versions; but why shouldn't that be a bit
> slower, it isn't common practice anyway. Same with rollbacks... why
> should they optimize for them when 97% of transactions commit?

Do 97% of transactions commit because Oracle has slow rollbacks and developers are working around that performance issue, or because they really commit?

I have watched several developers that would prefer to issue numerous selects to verify things like foreign keys in the application in order to avoid a rollback.

Anyway, I don't have experience with big Oracle applications, but I'm not so sure that 97% of transactions would commit if rollbacks were cheaper.
Re: [HACKERS] New feature request: FlashBack Query
On 2/20/07, Gregory Stark <[EMAIL PROTECTED]> wrote:
> I used to say that too but I've since realized it's not really true.

Heh, take a joke man... I was following up on Drake's email :)

But, since you want to discuss your view of the systems openly... I'll gladly reply :)

> It's more like Oracle is optimized for data that's committed
> long in the past and we're optimized for data that's
> been recently updated.

Wrong. When Oracle says it's committed, it's committed. No difference between when, where, and how. In Oracle, the committed version is *always* the first presented to the user... it takes time to go back and look at older versions; but why shouldn't that be a bit slower, it isn't common practice anyway. Same with rollbacks... why should they optimize for them when 97% of transactions commit?

> In Oracle the data that's been committed long in the past requires no
> transactional overhead but the data that's been recently updated
> requires lots of work to fetch the right version.

Wrong. The same transactional overhead applies to *all* data in Oracle no matter when it was committed. Similarly, the only overhead required occurs when someone is querying in serializable isolation or on read-committed data before or during a commit. On short OLTP-type transactions, Oracle has the most optimized solution.

> In Postgres it's the other way around. data that's been committed deleted
> long ago requires extra work to clean up but data that's been recently
> changed requires little additional work to see the correct version.

PostgreSQL has little additional work? Like, checking the validity of every tuple? Oracle checks visibility at the block level, so there's *much* less overhead. Take most of the benchmarks, which can hold ~200 tuples per block. Tables in those benchmarks are 100+ million rows. On a sequential scan, Oracle would perform 500K checks; PostgreSQL would perform *all* 100M checks (not counting dead versions due to row updates and the like).
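The visibility-check arithmetic above is easy to reproduce (same round benchmark numbers as in the email):

```python
rows = 100_000_000        # rows in the benchmark table
tuples_per_block = 200    # ~200 tuples fit per block

block_checks = rows // tuples_per_block  # one visibility check per block
tuple_checks = rows                      # one visibility check per tuple

print(block_checks)                  # 500000
print(tuple_checks // block_checks)  # 200x fewer checks at block level
```

This is the same per-page factor that the separate-visibility-heap proposal quoted elsewhere in the thread is aiming for ("mostly 1 check per page").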
On an index scan, Oracle not only has a smaller index and less to check, but also knows the tuple will be committed and will, in most cases, not have to perform additional physical I/O to find the latest version of a row. Of course, Oracle's design is much more complicated in its ability to build read-committed versions of the blocks at runtime; something the simplicity of PostgreSQL's MVCC design eliminates.

> In a sense then it's the opposite of what we usually say. Oracle is
> optimized for mostly static data. Postgres is optimized for changing data.

Care to share an example to prove it?

Like always, there are pros and cons with both designs, but denying facts gets us nowhere. We're off-topic now... so we should either move this offline or to another thread. I personally don't see much of a reason to continue discussing MVCC designs anymore, as Oracle's is patented and PostgreSQL's is highly unlikely to change drastically.

As always, I'd suggest discussing improvements, not the status quo. Likewise, discussing Oracle's design, drawbacks, and limitations without having used it extensively is quite obvious to anyone familiar with Oracle. Don't get me wrong, it's fine to prefer one design to another, but pushing discussion items comparing Oracle to PostgreSQL because of things you've heard or read somewhere isn't the same as understanding them because you've used them.

-- Jonah H. Harris, Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
33 Wood Ave S, 3rd Floor | [EMAIL PROTECTED]
Iselin, New Jersey 08830 | http://www.enterprisedb.com/
Re: [PATCHES] [HACKERS] HOT WIP Patch - version 2
On 2/20/07, Bruce Momjian <[EMAIL PROTECTED]> wrote:
> Tom Lane wrote:
> > "Recently dead" means "still live to somebody", so those tids better not
> > change either. But I don't think that's what he meant. I'm more
> > worried about the deadlock possibilities inherent in trying to upgrade a
> > buffer lock. We do not have deadlock detection for LWLocks.
>
> I am guessing he is going to have to release the lock, then ask for an
> exclusive one.

Yes, that's what is done. Since we try to prune the HOT-update chain even in the SELECT path, we upgrade the lock only if we are sure that there is at least one tuple that can be removed from the chain or the root needs to be fixed (a broken ctid chain for some reason).

Thanks,
Pavan

-- EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] HOT WIP Patch - version 2
On 2/20/07, Tom Lane <[EMAIL PROTECTED]> wrote: "Pavan Deolasee" <[EMAIL PROTECTED]> writes: > ... Yes. The HOT-updated status of the root and all intermediate > tuples is cleared and their respective ctid pointers are made to > point to themselves. Doesn't that destroy the knowledge that they form a tuple chain? While it might be that no one cares any longer, it would seem more reasonable to leave 'em chained together. I see your point, but as you mentioned, do we really care? The chain needs to be broken so that the intermediate DEAD tuples can be vacuumed. We can't vacuum them normally because they could be a part of a live HOT-update chain. Resetting the HOT-updated status of the root tuple helps to mark the index entry LP_DELETE once the entire HOT-update chain is dead. Also, if we decide to reuse the heap-only tuples without even vacuuming, breaking the chain is a better option since we then guarantee no references to the heap-only DEAD tuples. Thanks, Pavan -- EnterpriseDB http://www.enterprisedb.com
Re: [PATCHES] [HACKERS] HOT WIP Patch - version 2
Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > Pavan Deolasee wrote: > >> When following a HOT-update chain from the index fetch, if we notice that > >> the root tuple is dead and it is HOT-updated, we try to prune the chain to > >> the smallest possible length. To do that, the share lock is upgraded to an > >> exclusive lock and the tuple chain is followed till we find a > >> live/recently-dead > >> tuple. At that point, the root t_ctid is made point to that tuple. In order > > > I assume you meant recently-dead here, rather than live/recently-dead, > > because we aren't going to change live ctids, right? > > "Recently dead" means "still live to somebody", so those tids better not > change either. But I don't think that's what he meant. I'm more > worried about the deadlock possibilities inherent in trying to upgrade a > buffer lock. We do not have deadlock detection for LWLocks. I am guessing he is going to have to release the lock, then ask for an exclusive one. -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] [HACKERS] HOT WIP Patch - version 2
Bruce Momjian <[EMAIL PROTECTED]> writes: > Pavan Deolasee wrote: >> When following a HOT-update chain from the index fetch, if we notice that >> the root tuple is dead and it is HOT-updated, we try to prune the chain to >> the smallest possible length. To do that, the share lock is upgraded to an >> exclusive lock and the tuple chain is followed till we find a >> live/recently-dead >> tuple. At that point, the root t_ctid is made point to that tuple. In order > I assume you meant recently-dead here, rather than live/recently-dead, > because we aren't going to change live ctids, right? "Recently dead" means "still live to somebody", so those tids better not change either. But I don't think that's what he meant. I'm more worried about the deadlock possibilities inherent in trying to upgrade a buffer lock. We do not have deadlock detection for LWLocks. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] HOT WIP Patch - version 2
Pavan Deolasee wrote: > On 2/20/07, Bruce Momjian <[EMAIL PROTECTED]> wrote: > > > > Pavan Deolasee wrote: > > > When following a HOT-update chain from the index fetch, if we notice > > that > > > the root tuple is dead and it is HOT-updated, we try to prune the chain > > to > > > the smallest possible length. To do that, the share lock is upgraded to > > an > > > exclusive lock and the tuple chain is followed till we find a > > > live/recently-dead > > > tuple. At that point, the root t_ctid is made point to that tuple. In > > order > > > > I assume you meant recently-dead here, rather than live/recently-dead, > > because we aren't going to change live ctids, right? > > > No, I meant live or recently-dead (in fact, anything other than > HEAPTUPLE_DEAD > or HEAPTUPLE_DEAD_CHAIN). > > We are not changing the tids here, but only pruning the HOT-update chain. > After pruning, the root->t_ctid points to the oldest tuple that might be > visible to any backend. The live tuples are still identified by their > original tid and index reachable from the root tuple. I am confused. Where is the root->t_ctid stored? -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] HOT WIP Patch - version 2
On 2/20/07, Bruce Momjian <[EMAIL PROTECTED]> wrote: Pavan Deolasee wrote: > When following a HOT-update chain from the index fetch, if we notice that > the root tuple is dead and it is HOT-updated, we try to prune the chain to > the smallest possible length. To do that, the share lock is upgraded to an > exclusive lock and the tuple chain is followed till we find a > live/recently-dead > tuple. At that point, the root t_ctid is made point to that tuple. In order I assume you meant recently-dead here, rather than live/recently-dead, because we aren't going to change live ctids, right? No, I meant live or recently-dead (in fact, anything other than HEAPTUPLE_DEAD or HEAPTUPLE_DEAD_CHAIN). We are not changing the tids here, but only pruning the HOT-update chain. After pruning, the root->t_ctid points to the oldest tuple that might be visible to any backend. The live tuples are still identified by their original tid and index reachable from the root tuple. Thanks, Pavan -- EnterpriseDB http://www.enterprisedb.com
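The pruning described in this exchange — walk the chain from the root past DEAD tuples and repoint the root at the oldest tuple that might still be visible to some backend — can be sketched as a toy model. The names and statuses below are invented for illustration; this is not the actual PostgreSQL page or tuple code, and the root line pointer bookkeeping is glossed over:

```python
# Toy model of HOT-update chain pruning as discussed in the thread.
# Statuses loosely mirror the visibility outcomes mentioned in the mail;
# prune_chain is an invented helper, not a real PostgreSQL function.

DEAD = "dead"
RECENTLY_DEAD = "recently_dead"  # "still live to somebody"
LIVE = "live"

def prune_chain(chain):
    """chain[0] is the root tuple's status; chain[1:] follow t_ctid.
    Return the pruned chain: the root is repointed at the first tuple
    that is not DEAD, and the intermediate DEAD tuples drop out
    (becoming reclaimable by VACUUM)."""
    for i, status in enumerate(chain):
        if status != DEAD:
            return chain[i:]
    # Whole chain dead: the index entry can be marked LP_DELETE.
    return []
```

Under this model a chain of two dead tuples followed by a recently-dead one prunes down so the root points straight at the recently-dead tuple, matching the "oldest tuple that might be visible to any backend" rule in the mail.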
Re: [HACKERS] ToDo: add documentation for operator IS OF
> > what code do you want to remove? > Peter suggested removing the IS OF code itself. Without a replacement you'd hurt a lot of people. The time for removing it was 4 years ago. Regards Pavel Stehule ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] ToDo: add documentation for operator IS OF
Pavel Stehule wrote: > > > documentation addition and add a comment to gram.y? > > > >OK, I have votes to remove the code, remove the documentation, and keep > >all of it but document its behavior might change in the future. > > what code do you want to remove? Peter suggested removing the IS OF code itself. -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] HOT WIP Patch - version 2
Pavan Deolasee wrote: > When following a HOT-update chain from the index fetch, if we notice that > the root tuple is dead and it is HOT-updated, we try to prune the chain to > the smallest possible length. To do that, the share lock is upgraded to an > exclusive lock and the tuple chain is followed till we find a > live/recently-dead > tuple. At that point, the root t_ctid is made point to that tuple. In order I assume you meant recently-dead here, rather than live/recently-dead, because we aren't going to change live ctids, right? -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] HOT WIP Patch - version 2
"Pavan Deolasee" <[EMAIL PROTECTED]> writes: > ... Yes. The HOT-updated status of the root and all intermediate > tuples is cleared and their respective ctid pointers are made to > point to themselves. Doesn't that destroy the knowledge that they form a tuple chain? While it might be that no one cares any longer, it would seem more reasonable to leave 'em chained together. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] ToDo: add documentation for operator IS OF
> > documentation addition and add a comment to gram.y? > OK, I have votes to remove the code, remove the documentation, and keep > all of it but document its behavior might change in the future. what code do you want to remove? > I will leave the documentation, but comment it out so it doesn't appear > in the output, and state why in the comment. I will also add a TODO item. > -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] ToDo: add documentation for operator IS OF
bruce wrote: > Joe Conway wrote: > > Bruce Momjian wrote: > > > Pavel Stehule wrote: > > >> Hello, > > >> > > >> I miss doc for this operator > > > > > > Strange that IS [NOT] OF wasn't documented, especially seeing it was added in > > > PostgreSQL 7.3. Anyway, documented and backpatched to 8.2.X. > > > > Here's the reason -- see this thread: > > http://archives.postgresql.org/pgsql-patches/2003-08/msg00062.php > > Wow, interesting. I do remember that now. Should I revert the > documentation addition and add a comment to gram.y? OK, I have votes to remove the code, remove the documentation, and keep all of it but document its behavior might change in the future. I will leave the documentation, but comment it out so it doesn't appear in the output, and state why in the comment. I will also add a TODO item. -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Plan invalidation design
Simon Riggs wrote: > On Sat, 2007-02-17 at 12:48 -0500, Tom Lane wrote: > > Relcache inval casts a fairly wide net; for example, adding or dropping an > > index will invalidate all plans using the index's table whether or not > > they used that particular index, and I believe that VACUUM will also > > result in a relcache inval due to updating the table's pg_class row. > > I think this is a good thing though --- for instance, after adding an > > index it seems a good idea to replan to see if the new index is useful, > > and replanning after a VACUUM is useful if the table has changed size > > enough to warrant a different plan. OTOH this might mean that plans on a > > high-update-traffic table never survive very long because of autovacuum's > > efforts. If that proves to be a problem in practice we can look at ways > > to dial down the number of replans, but for the moment I think it's more > > important to be sure we *can* replan at need than to find ways to avoid > > replans. > > Just some info on that: In an update-intensive scenario, I'm seeing > VACUUMs every 2 minutes on the heaviest hit tables on CVS HEAD on a > medium-powered 4-CPU server. Re-planning multiple queries on 100+ > sessions every few minutes would not be good. I would think the inval would be sent if relpages changed by more than a certain threshold, say 10%. In steady state, a high-update table that's under continuous vacuum should not change size much, thus no replan. But clearly the point here is to get the inval to be sent at all, and look for inhibition mechanisms later. > Presumably ANALYZE would have the same effect? It would be nice to have a way to calculate a delta from the previous statistics snapshot and send an inval if it's appropriate. Can it be done? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
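Alvaro's threshold idea — only send the invalidation when relpages has moved by more than some fraction — amounts to a one-line check. The sketch below uses invented names (no such helper exists in the patch or in PostgreSQL); it just makes the steady-state argument concrete:

```python
def should_send_inval(old_relpages, new_relpages, threshold=0.10):
    """Decide whether a post-VACUUM relcache inval is worth sending:
    only if the table's page count changed by more than `threshold`
    (10% as suggested in the mail). A steady-state high-update table
    under continuous vacuum barely changes size, so plans survive.
    Invented helper for illustration, not actual PostgreSQL code."""
    if old_relpages == 0:
        # Table grew from empty: always worth a replan.
        return new_relpages > 0
    return abs(new_relpages - old_relpages) / old_relpages > threshold
```

With this rule, a table hovering between 100 and 105 pages never triggers a replan, while a jump to 120 pages (or a shrink to 85) does.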
Re: [HACKERS] [PATCHES] WIP patch - INSERT-able log statements
On Tue, 20 Feb 2007, Tom Lane wrote: > I can't believe that any production situation could tolerate the > overhead of one-commit-per-log-line. There aren't that many log lines, and a production environment with lots of commit throughput won't even notice. The installation I work on tuning does 300 small commits per second on a bad day. I can barely measure the overhead of whether or not the log files are involved in that if I'm importing them at the same time. The situation obviously changes if you're logging per-query level detail. > So a realistic tool for this is going to have to be able to wrap blocks > of maybe 100 or 1000 or so log lines with BEGIN/COMMIT, and that is > exactly as difficult as wrapping them with a COPY command. Thus, I > disbelieve your argument. We should not be designing this around an > assumed use-case that will only work for toy installations. Wrapping the commits in blocks to lower overhead is appropriate for toy installations, and probably medium sized ones too. Serious installations, with battery-backed cache writes and similar commit throughput enhancing hardware, can commit a low-volume stream like the logs whenever they please. That's the environment my use-case comes from. Anyway, it doesn't really matter; I can build a tool with COPY style output as well, it just won't be trivial like the INSERT one would be. My reasons for "would slightly prefer INSERT" clearly aren't strong enough to override the issues you bring up with the average case. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
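The block-wrapping Tom describes — batches of 100 or 1000 log lines between a BEGIN and a COMMIT so the import pays one commit per batch rather than one per line — could be generated like this. The table and column names are made up for illustration; this is a sketch of the batching idea, not any actual log-import tool:

```python
def batch_log_inserts(lines, batch_size=1000):
    """Yield SQL batches of INSERT statements wrapped in BEGIN/COMMIT,
    so a log import commits once per `batch_size` lines instead of
    once per line. The pg_log_import table here is hypothetical."""
    for start in range(0, len(lines), batch_size):
        batch = ["BEGIN;"]
        for line in lines[start:start + batch_size]:
            # %r gives a quoted Python literal; a real tool would use
            # proper SQL escaping or, as Tom suggests, COPY instead.
            batch.append(
                "INSERT INTO pg_log_import (line) VALUES (%r);" % line)
        batch.append("COMMIT;")
        yield "\n".join(batch)
```

Swapping the INSERT statements for a single COPY block per batch would change only the loop body, which is Tom's point that the two formats are about equally hard to batch.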
Re: [HACKERS] New feature request: FlashBack Query
RPK wrote: > > I agree that TimeStamp creates an overhead, but I just want to know if an > accidental update happened to a table and this incident got traced three > days after, what facility PGSQL provides to bring the table to its original > condition. You can't wait regretting on why you did not run ROLLBACK > before > COMMIT. (Correct me. I am only a user). > Why the heck can't you create a reversing transaction? That's what ordinary mortals do. Demanding unlimited undo at some time that is arbitrarily distant in the future strikes me as wholly unreasonable. What do you mean by "accidental update"? What you really appear to mean is that a program or a human operator has made an error, and incorrectly told the database to commit a transaction. The answer surely is to correct the behaviour of the program or human, rather than wanting the database to provide an undo facility. Alternatively, this should be handled at the application layer, using something like table_log. Some things just don't work well with this sort of facility. Just ask your bookie if you can undo a bet that you "accidentally" placed with him and which, three days later, you discover (after the race) was a mistake. cheers andrew ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] ToDo: add documentation for operator IS OF
> On Tuesday, 20 February 2007 02:13, Bruce Momjian wrote: > > Wow, interesting. I do remember that now. Should I revert the > > documentation addition and add a comment to gram.y? > I'd say remove the code. Probably nobody uses it for inheritance, but some people (me too) use it in polymorphic functions. You can remove it, but please replace it. Regards Pavel Stehule ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] New feature request: FlashBack Query
I agree that TimeStamp creates an overhead, but I just want to know if an accidental update happened to a table and this incident got traced three days after, what facility PGSQL provides to bring the table to its original condition. You can't wait regretting on why you did not run ROLLBACK before COMMIT. (Correct me. I am only a user). When talking about Oracle's technology and that it creates overhead, it is true, Oracle's database is not for ordinary machines. You can't expect performance on a normal 256 MB machine with Oracle. But still, the more the options of recovery, the better for mission-critical environments. The feature of enabling/disabling TimeStamp logging is acceptable. A user must be able to decide whether a FlashBack-type option is needed or not. In Oracle 10g we can switch off the "FlashBack" feature if we are low on resources. If PGSQL is to be used in a mission-critical situation then no company will rely on low-end machines. For these types of situations the best environment is chosen, and I think PGSQL must have this type of recovery options. The PGSQL installer can ask the user during setup to enable/disable TimeStamp Logging. Restoring the database from a backup file that was created three days ago is not feasible. The changes in other tables and the new things created would need to be done again, at the price of just undoing the last update on a particular table. Warren Turkal-5 wrote: > > On Saturday 17 February 2007 07:49, RPK wrote: >> PostgreSQL, already a mature database, needs to have more options for >> recovery as compared to proprietary databases. I just worked with >> Oracle's >> FlashBack query feature in Oracle 9i and FlashBack Table feature in 10g. >> >> Future versions of PostgreSQL must have similar features which enable >> users >> to bring Table(s) and/or Database(s) to a desired Time Stamp. > > Check out my proposal[1] for Temporal extensions. Ultimately, creating valid > time and transaction time tables would be possible through my proposal.
> Please > check it out. > > [1]http://archives.postgresql.org/pgsql-hackers/2007-02/msg00540.php > > wt > -- > Warren Turkal (w00t) > > ---(end of broadcast)--- > TIP 9: In versions below 8.0, the planner will ignore your desire to >choose an index scan if your joining column's datatypes do not >match > > -- View this message in context: http://www.nabble.com/New-feature-request%3A-FlashBack-Query-tf3245023.html#a9059865 Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] New version of IDENTITY/GENERATED
Hi, I started working on my previous patch, encouraged by the fact that it became a wishlist item for 8.3. :-) The changes in this version are: - Refreshed to almost current (5 days old) CVS version of 8.3 devel - The original SERIAL pseudo type is left alone, you _have to_ spell out GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY to get an identity column. - The "action-at-a-distance" behaviour is actually working for the IDENTITY/GENERATED columns on INSERT so the DEFAULT value is generated for them after all the regular columns were validated via ExecConstraints(). This way, if the validation fails, the sequence isn't inflated. - Test case is updated to reflect the above. - Documentation is updated, "Identity columns" has a new subsection now. - Dropped my pg_dump changes, as the altered sequence is also dumped in 8.2, thanks to Tom Lane. I am considering the following: - Since IDENTITY is a new feature (plain old SERIAL behaves the same as always) I will restore the SQL:2003 conformant check that there can be only one identity column in a table at any time. - I read somewhere (but couldn't find it now in SQL:2003) that CHECK constraints cannot be defined for GENERATED (and IDENTITY?) columns. Maybe it was in the latest draft, I have to look at it... Anyway, I have to implement checks to disallow CHECKs for such columns. - Introduce an ALTER TABLE SET|DROP IDENTITY so a serial can be "upgraded" to an identity. This way, an identity column can be built by hand and pg_dump will need it, too. SET IDENTITY will either have to issue an error if CHECKs are defined for such columns or automatically drop every such constraint. And I have a question, too. Is there a way to use ExecEvalExpr*() so values from a given tuple are used for the "current" row? E.g. at present, UPDATE table SET f1 = f1 + 1, f2 = f1 + 1; sets both fields' new value to (f1 value before UPDATE) + 1.
For a GENERATED column, the value _after_ UPDATE is needed, so CREATE TABLE table ( f1 INTEGER, f2 INTEGER GENERATED ALWAYS AS (f1 + 1)); and no matter which one of the following is used: UPDATE table SET f1 = f1 + 1; or UPDATE table SET f1 = f1 + 1, f2 = default; the f2 current value = f1 current value + 1 is always maintained. Best regards, Zoltán Böszörményi psql-serial-30.diff.gz Description: Unix tar archive ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
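The UPDATE semantics Zoltán is after — SET clauses all see pre-update values, but generated columns are recomputed against the row's post-update values — can be modeled with a small sketch. This is a conceptual toy with invented names, not the patch's ExecEvalExpr machinery:

```python
def apply_update(row, set_clauses, generated):
    """Apply an UPDATE to `row` (a dict). SQL semantics: every SET
    right-hand side is evaluated against the OLD row; generated
    columns are then recomputed against the NEW row, so invariants
    like f2 = f1 + 1 survive the update. Toy model for illustration."""
    old = dict(row)
    new = dict(row)
    for col, expr in set_clauses.items():
        new[col] = expr(old)          # RHS sees pre-update values
    for col, expr in generated.items():
        new[col] = expr(new)          # generated cols see post-update values
    return new
```

The difference between the two passes is exactly the behavior gap described in the mail: with plain SET clauses both columns see the old f1, while a generated f2 tracks the new f1.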
Re: [HACKERS] pg_proc without oid?
On Tuesday, 20 February 2007 09:24, Magnus Hagander wrote: > Ok. Will do once the entries in pg_proc are changed, so that I can still > build. It's done. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] ToDo: add documentation for operator IS OF
On Tuesday, 20 February 2007 02:13, Bruce Momjian wrote: > Wow, interesting. I do remember that now. Should I revert the > documentation addition and add a comment to gram.y? I'd say remove the code. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] New feature request: FlashBack Query
"Jonah H. Harris" <[EMAIL PROTECTED]> writes: > On 2/17/07, Joshua D. Drake <[EMAIL PROTECTED]> wrote: >> My understanding is that the main difference is that rollbacks are >> inexpensive for us, but expensive for Oracle. > > Yes, Oracle is optimized for COMMIT, we're optimized for ROLLBACK :) I used to say that too but I've since realized it's not really true. It's more like Oracle is optimized for data that's committed long in the past and we're optimized for data that's been recently updated. In Oracle the data that's been committed long in the past requires no transactional overhead but the data that's been recently updated requires lots of work to fetch the right version. In Postgres it's the other way around. Data that was deleted long ago requires extra work to clean up but data that's been recently changed requires little additional work to see the correct version. In a sense then it's the opposite of what we usually say. Oracle is optimized for mostly static data. Postgres is optimized for changing data. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] pg_proc without oid?
On Mon, Feb 19, 2007 at 11:18:36AM -0500, Tom Lane wrote: > Peter Eisentraut <[EMAIL PROTECTED]> writes: > > On Monday, 19 February 2007 16:50, Tom Lane wrote: > >> In the second place, if you don't want to predetermine OIDs for your > >> functions then they shouldn't be in hardwired pg_proc.h rows at all. > > > Where else would you put them? > > SQL script maybe, much along the lines Greg was just mentioning. > (I'd been thinking myself earlier that pg_amop/amproc/etc would be a > whole lot easier to maintain if we could feed CREATE OPERATOR CLASS > commands to the bootstrap process.) But getting there will take > nontrivial work; you can't just decide to leave out a few OIDs on the > spur of the moment. > > Magnus, I'd suggest reverting whatever you did to your MSVC script, > so we'll find out the next time someone makes this mistake... Ok. Will do once the entries in pg_proc are changed, so that I can still build. BTW, another problem with the stuff that's in there now - pg_proc.h contains description entries for the functions, but that never goes into pg_description, since there is no oid to bind it to... //Magnus ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] HOT WIP Patch - version 2
On 2/20/07, Hannu Krosing <[EMAIL PROTECTED]> wrote: On Tue, 2007-02-20 at 12:08, Pavan Deolasee wrote: What do you do, if there are no live tuples on the page ? will this un-HOTify the root and free all other tuples in HOT chain ? Yes. The HOT-updated status of the root and all intermediate tuples is cleared and their respective ctid pointers are made to point to themselves. The index entry will be marked LP_DELETE as with the normal case. VACUUM can subsequently reclaim these tuples, along with the index entry. > > The intermediate heap-only tuples are removed from the HOT-update > chain. > The HOT-updated status of these tuples is cleared and their respective > t_ctid are made to point to themselves. These tuples are not reachable > now and ready for vacuuming. Does this mean, that they are now indistinguishable from ordinary tuples ? No. The HEAP_ONLY_TUPLE flag is still set on these tuples. So you can distinguish those tuples. Maybe they could be freed right away instead of changing HOT-updated status and ctid ? Yeah, that's a good idea. I am thinking of setting the LP_DELETE flag on them while pruning. The tuple then can be reused for the next in-page HOT-update. > When we run out of space for update-within-the-block, we traverse > through all the line pointers looking for LP_DELETEd items. If any of > these > items have space large enough to store the new tuple, that item is > reused. > Does anyone see any issue with doing this ? Also, any suggestions > about doing it in a better way ? IIRC the size is determined by the next tuple pointer, so you can store new data without changing the tuple pointer only if they are exactly the same size. There is a lp_len field in the line pointer to store the length of the tuple. ISTM that we can reduce that while reusing the line pointer. But that would create a permanent hole in the page.
> we are > more concerned about the large tables, the chances of being able to > upgrade > the exclusive lock to vacuum-strength lock are high. Comments ? I'm not sure about the "we are more concerned about the large tables" part. I see it more as a device for high-update tables. This may not always be the same as "large", so there should be some fallbacks for case where you can't get the lock. Maybe just give up and move to another page ? Oh, yes. I agree. The fallback option of doing COLD update always exists. Thanks, Pavan -- EnterpriseDB http://www.enterprisedb.com
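The slot-reuse scan described above — walk the page's line pointers looking for an LP_DELETEd item whose recorded length can hold the new tuple — is simple to sketch. The structures below are invented stand-ins for the real page layout, just to make the lp_len discussion concrete:

```python
# Toy model of reusing LP_DELETEd line pointers for an in-page
# HOT update. (flags, lp_len) pairs stand in for real item pointers;
# find_reusable_slot is an invented helper, not actual page code.

LP_DELETE = "lp_delete"

def find_reusable_slot(line_pointers, needed_len):
    """Return the index of the first LP_DELETEd line pointer whose
    lp_len can hold `needed_len` bytes, or None if the update must
    fall back to a COLD update elsewhere. Note Hannu's caveat: reusing
    a larger slot for a smaller tuple leaves a hole in the page."""
    for i, (flags, lp_len) in enumerate(line_pointers):
        if flags == LP_DELETE and lp_len >= needed_len:
            return i
    return None
```

Returning None corresponds to the fallback Pavan mentions: when no reusable slot (and no free space) exists, the update goes COLD to another page.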