Re: [HACKERS] a faster compression algorithm for pg_dump
On Fri, Apr 9, 2010 at 12:17 AM, Joachim Wieland wrote:
> One question that I do not yet see answered is, do we risk violating a
> patent even if we just link against a compression library, for example
> liblzf, without shipping the actual code?

Generally, patents are infringed when the process is used, so whether we link against the code or ship it isn't really relevant: the user running the software would need a patent license either way. We want Postgres to be usable without depending on any copyright or patent licenses.

Linking against it as an option isn't nearly as bad, since the user compiling it can choose whether to include the restricted feature; that's what we do with readline. However, it's much less attractive when it restricts what file formats Postgres supports -- it means someone might generate backup dump files that they later discover they don't have a legal right to read and restore :(

-- greg

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] C-Language Fun on VC2005 ERROR: could not load library
Hi,

I'm using VC2005 to create a PG C-language function in my contrib module xml_index, which imports another library. I added the include and lib directories by changing Mkvcbuild.pm and config.pl. I then executed the following commands:

build DEBUG
perl install.pl "C:\Program Files\PostgreSQL\8.3"
initdb.exe -D "C:\Program Files\PostgreSQL\8.3\data" -E UTF8 --locale=C
pg_ctl -D "C:/Program Files/PostgreSQL/8.3/data" -l logfile start
createdb test
psql test

When I try to create the function with:

CREATE OR REPLACE FUNCTION create_xml_value_index(text,text,text) RETURNS bool
AS '$libdir/xml_index'
LANGUAGE C STRICT IMMUTABLE;

it reports:

ERROR: could not load library "C:\Program Files\PostgreSQL\8.3\lib/xml_index.dll": The specified module could not be found.

I checked the directory C:\Program Files\PostgreSQL\8.3\lib, and xml_index.dll exists. I also tried replacing "$libdir/xml_index" with "C:\\Program Files\\PostgreSQL\\8.3\\lib\\xml_index", but the error remains. Has anyone ever encountered this problem? Thanks in advance.

Best Regards
Cristian
Re: [HACKERS] GSOC PostgreSQL partitioning issue
Greg Smith wrote:
> An introduction to the current state of work in progress for adding
> improved partitioning features to PostgreSQL is documented at
> http://wiki.postgresql.org/wiki/Table_partitioning

Also, here is my latest patch for it: http://repo.or.cz/w/pgsql-fdw.git/shortlog/refs/heads/partition

I'd like to ask Necati what problem you will solve, rather than what module you will develop. Performance? Usability? Or others?

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center
[HACKERS] extended operator classes vs. type interfaces
There are a couple of features that were considered but not committed for 9.0 which require additional information about the properties of various data types. At PGEast I had a chance to talk with Jeff Davis about this briefly, and I wanted to write up some of what we talked about, plus some further thoughts of my own, before they went out of my head. The features I'm thinking about are:

1. knngist wants to use index scans to speed up queries of the form SELECT ... ORDER BY column op constant (as opposed to the existing machinery, which only knows how to use an index for SELECT ... ORDER BY column).

2. Window functions want to define windows over a range of values defined by the underlying data type. To do this, we need to define what addition and subtraction mean for a particular data type.

3. Jeff Davis is interested in implementing range types. When the underlying base type is discrete, e.g. integers, you can say that [1,3] = [1,4), but only if you know that 3 and 4 are consecutive (in that order).

All of these problems seem loosely related: we're teaching the database more about how certain types behave. But there are some important differences. In case #1, we're trying to teach the planner that if it sees a certain operator, it can map that operator onto an AM strategy - so the knowledge is fundamentally AM-specific. In the remaining cases, the information needed is a property of the underlying data type, with no natural or logical relationship to an access method. For that reason, I don't think there's going to be any clean way to create a single mechanism that encompasses all of these needs, though maybe someone has a clever idea I'm not thinking of.

What we discussed doing before for #1 (and it makes sense to me) is to add a column pg_amop.amopcategory. The existing operator class functions will constitute one category - map a boolean operator onto an index strategy number that can be used to treat a filter condition as an index qual. The functions knngist cares about will be a second category - map an operator that returns some arbitrary type that is legal in the context of an ORDER BY clause onto an index strategy number that can regurgitate the tuples in the order defined by the return type.

While it might be possible to shoehorn the remaining cases into the operator class machinery, it seems likely that it would be nothing but ugly. The whole charter of the operator class machinery, at least AIUI, is to map operators onto AM-specific index strategy numbers, and there is neither an applicable AM nor a strategy number in any of these cases. So I think it's time to create a separate concept of type interfaces (Jeff Davis proposed this name, and I like it).

What might this look like? Given a type T, I think we'd like to be able to define a type U as "the natural type to be added to or subtracted from T". As Jeff pointed out to me, this is not necessarily the same as the underlying type. For example, if T is a timestamp, U is an interval; if T is a numeric, U is also a numeric; if T is a cidr, U is an integer. Then we'd like to define a canonical addition operator and a canonical subtraction operator. I think that would be sufficient for the needs of RANGE BETWEEN ... PRECEDING AND ... FOLLOWING. It would also be nearly sufficient for range types, but in that case you also need to specify the unit increment within U - i.e. a "1" value for the datatype. It may or may not be worth building the concept of a unit increment into the type interface machinery, though: one could imagine two different range types built over the same base type with different unit increments - e.g. one timestamp range with unit increment = 1s, and one with unit increment = 1m. Under the first type [4pm,5pm) = [4pm,4:59:59pm], while under the second [4pm,5pm) = [4pm,4:59pm].
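As a purely hypothetical sketch (no such syntax exists; the CREATE TYPE INTERFACE keyword and all option names below are invented for illustration), a type interface for timestamp along the lines described above might declare:

```sql
-- Hypothetical syntax only: nothing here exists in PostgreSQL.  The
-- point is just to show what a "type interface" would need to capture:
-- the delta type U, the canonical arithmetic operators, and (optionally)
-- a unit increment for building discrete range types.
CREATE TYPE INTERFACE timestamp_arith FOR timestamp (
    DELTA TYPE     = interval,                 -- "U": the natural type to add/subtract
    ADD OPERATOR   = +(timestamp, interval),   -- canonical addition
    SUB OPERATOR   = -(timestamp, interval),   -- canonical subtraction
    UNIT INCREMENT = interval '1 second'       -- optional, per range type
);
```

Moving UNIT INCREMENT out of the interface and into the range type definition itself would be one way to allow the 1s and 1m timestamp ranges mentioned above to coexist over the same base type.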
In a previous thread on this topic, in response to a question about other possible needs for operator knowledge, Hitoshi Harada mentioned range partitioning as another possible case. If we have a partition where x ranges from 1 to 100, that's basically saying that x >= 1 and x <= 100, for suitable values of >= and <=. I think we're already habituated to doing this kind of thing by looking at the default btree opclass for the data type and looking for the operator that implements, e.g., BTLessEqualStrategyNumber. It might be cleaner to get all this information from type interfaces, but I'm not sure whether it's reasonable (either for reasons of complexity or of performance) to think about untangling all the places where we've already made this assumption and redoing them.

Thoughts?

...Robert
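For reference, the btree-opclass lookup described above can be seen directly in the system catalogs; a query along these lines shows the operator bound to btree strategy number 4 (BTLessEqualStrategyNumber, i.e. <=) for int4:

```sql
-- Find the operator implementing btree strategy 4 (<=) for integers.
-- Btree strategy numbers 1..5 correspond to <, <=, =, >=, >.
SELECT opf.opfname AS opfamily,
       amop.amopstrategy,
       amop.amopopr::regoperator AS operator
FROM pg_amop amop
JOIN pg_opfamily opf ON opf.oid = amop.amopfamily
JOIN pg_am am ON am.oid = opf.opfmethod
WHERE am.amname = 'btree'
  AND amop.amoplefttype = 'int4'::regtype
  AND amop.amoprighttype = 'int4'::regtype
  AND amop.amopstrategy = 4;
```

This is the kind of strategy-number assumption that is baked into the partitioning and planner code paths mentioned above, and that a type-interface mechanism would have to either replace or sit alongside.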
[HACKERS] a faster compression algorithm for pg_dump
I'd like to revive the discussion about offering another compression algorithm than zlib, at least for pg_dump. There has been a previous discussion here: http://archives.postgresql.org/pgsql-performance/2009-08/msg00053.php and it ended without any real result. The results so far were:

- There exist BSD-licensed compression algorithms
- Nobody knows of a patent that is in our way
- Nobody can confirm that no patent is in our way

I do see a very real demand for replacing zlib, which compresses quite well but is slow as hell. For pg_dump, what people want is cheap compression; they usually prefer an algorithm that compresses less optimally but is really fast.

One question that I do not yet see answered is: do we risk violating a patent even if we just link against a compression library, for example liblzf, without shipping the actual code?

I have checked what other projects do, especially regarding liblzf, which would be my favorite choice (BSD license, available for quite some time...), and there are other projects that actually ship the lzf code (I haven't found a project that just links to it). The most prominent projects are:

- KOffice (implements a derived version in koffice-2.1.2/libs/store/KoXmlReader.cpp)
- VirtualBox (ships it in vbox-ose-1.3.8/src/libs/liblzf-1.51)
- TuxOnIce (formerly known as suspend2 - a Linux kernel patch; ships it in the patch)

We have pg_lzcompress.c, which implements the compression routines for the tuple toaster. Are we sure that we don't violate any patents with this algorithm?

Joachim
Re: [HACKERS] Thoughts on pg_hba.conf rejection
On Wed, Apr 07, 2010 at 01:07:21PM -0400, Robert Haas wrote:
> On Wed, Apr 7, 2010 at 10:46 AM, Tom Lane wrote:
> > Simon Riggs writes:
> >> When there is a specific reject rule, why does the server say
> >> FATAL: no pg_hba.conf entry
> >
> > It's intentional. We try to expose the minimum amount of knowledge
> > about the contents of pg_hba.conf to potential attackers.
>
> The problem with the message is not that it's uninformative, but that
> it's counterfactual.
>
> ...Robert

I agree (I noticed and was bothered by this today, as a matter of irrelevant fact). I can support the idea of exposing as little as possible of pg_hba.conf, but ISTM the "no pg_hba.conf entry" message is exposing too much, by that standard. Just say something like "connection disallowed" and leave it at that -- either it's disallowed by lack of a rule, or by existence of a "reject" rule, or by something else entirely. As long as the message isn't clearly wrong in the "reject" case, as it is now.

-- Joshua Tolley / eggyknap / End Point Corporation / http://www.endpoint.com
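For context, a reject rule of the sort being discussed looks like this in pg_hba.conf (the address ranges here are made up for illustration):

```
# TYPE  DATABASE  USER  CIDR-ADDRESS  METHOD
host    all       all   10.0.0.0/8    reject
host    all       all   0.0.0.0/0     md5
```

A connection from 10.x.x.x matches the first line and is refused, yet the server reports "no pg_hba.conf entry" -- which is the counterfactual part: an entry was found; it just said to reject.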
Re: [HACKERS] GSOC PostgreSQL partitioning issue
An introduction to the current state of work in progress for adding improved partitioning features to PostgreSQL is documented at http://wiki.postgresql.org/wiki/Table_partitioning

If you can find a small, targeted piece of that overall plan that builds on the work already done, and is in the direction of the final goal here, you may be able to make useful progress in a few months' time. This area is extremely well explored already. There are 13 mailing list threads you'll need to browse through carefully just to have enough background that you're likely to build something new, rather than just wandering down a path that's already been followed but leads to a dead end.

You have picked a PostgreSQL feature that is dramatically more difficult than it appears to be, and I wouldn't expect you'll actually finish even a fraction of your goals in a summer of work. You're at least in plentiful company--most students do the same. As a rule, if you see a feature on our TODO list that looks really useful and fun to work on, it's only still there because people have tried multiple times to build it completely but not managed to do so, because it's harder than it appears. This is certainly the case with improving the partitioning support that's built into the database.

-- Greg Smith / 2ndQuadrant US / Baltimore, MD / PostgreSQL Training, Services and Support / g...@2ndquadrant.com / www.2ndQuadrant.us
Re: [HACKERS] Unsafe threading in syslogger on Windows
Andrew Dunstan wrote:
> Heikki Linnakangas wrote:
>> I'm going to see what happens if I remove all the #ifdef WIN32 blocks in
>> syslogger, and let it use pgpipe() and select() instead of the extra
>> thread.
>
> Sounds reasonable. Let's see how big the changes are on HEAD. I'm not
> sure it's worth creating a different smaller fix for the back branches.

I tried that, and got a crash somewhere in the code that inherits the syslogger pipe/socket to the child process. I don't understand why, and I don't feel like debugging any deeper into that right now. If you or someone else wants to give it a shot, that would be good. If not, I might try again some other day after sleeping on it.

Anyway, here's the patch I had in mind for the back branches.

-- Heikki Linnakangas / EnterpriseDB / http://www.enterprisedb.com

diff --git a/src/backend/postmaster/syslogger.c b/src/backend/postmaster/syslogger.c
index 22cf5f2..ee449ac 100644
--- a/src/backend/postmaster/syslogger.c
+++ b/src/backend/postmaster/syslogger.c
@@ -117,7 +117,7 @@ HANDLE syslogPipe[2] = {0, 0};
 #ifdef WIN32
 static HANDLE threadHandle = 0;
-static CRITICAL_SECTION sysfileSection;
+static CRITICAL_SECTION sysloggerSection;
 #endif
@@ -268,7 +268,8 @@ SysLoggerMain(int argc, char *argv[])
 #ifdef WIN32
 	/* Fire up separate data transfer thread */
-	InitializeCriticalSection(&sysfileSection);
+	InitializeCriticalSection(&sysloggerSection);
+	EnterCriticalSection(&sysloggerSection);

 	threadHandle = (HANDLE) _beginthreadex(NULL, 0, pipeThread, NULL, 0, NULL);
 	if (threadHandle == 0)
@@ -423,8 +424,16 @@ SysLoggerMain(int argc, char *argv[])
 		 * On Windows we leave it to a separate thread to transfer data and
 		 * detect pipe EOF. The main thread just wakes up once a second to
 		 * check for SIGHUP and rotation conditions.
+		 *
+		 * Server code isn't generally thread-safe, so we ensure that only
+		 * one of the threads is active at a time by entering the critical
+		 * section whenever we're not sleeping.
 		 */
+		LeaveCriticalSection(&sysloggerSection);
+
 		pg_usleep(1000000L);
+
+		EnterCriticalSection(&sysloggerSection);
 #endif   /* WIN32 */

 		if (pipe_eof_seen)
@@ -911,17 +920,9 @@ write_syslogger_file(const char *buffer, int count, int destination)
 	if (destination == LOG_DESTINATION_CSVLOG && csvlogFile == NULL)
 		open_csvlogfile();

-#ifdef WIN32
-	EnterCriticalSection(&sysfileSection);
-#endif
-
 	logfile = destination == LOG_DESTINATION_CSVLOG ? csvlogFile : syslogFile;
 	rc = fwrite(buffer, 1, count, logfile);

-#ifdef WIN32
-	LeaveCriticalSection(&sysfileSection);
-#endif
-
 	/* can't use ereport here because of possible recursion */
 	if (rc != count)
 		write_stderr("could not write to log file: %s\n", strerror(errno));
@@ -945,11 +946,21 @@ pipeThread(void *arg)
 	for (;;)
 	{
 		DWORD		bytesRead;
+		BOOL		result;
+
+		result = ReadFile(syslogPipe[0],
+						  logbuffer + bytes_in_logbuffer,
+						  sizeof(logbuffer) - bytes_in_logbuffer,
+						  &bytesRead, 0);

-		if (!ReadFile(syslogPipe[0],
-					  logbuffer + bytes_in_logbuffer,
-					  sizeof(logbuffer) - bytes_in_logbuffer,
-					  &bytesRead, 0))
+		/*
+		 * Enter critical section before doing anything that might touch
+		 * global state shared by the main thread. Anything that uses
+		 * palloc()/pfree() in particular are not safe outside the critical
+		 * section.
+		 */
+		EnterCriticalSection(&sysloggerSection);
+		if (!result)
 		{
 			DWORD		error = GetLastError();
@@ -966,6 +977,7 @@ pipeThread(void *arg)
 			bytes_in_logbuffer += bytesRead;
 			process_pipe_input(logbuffer, &bytes_in_logbuffer);
 		}
+		LeaveCriticalSection(&sysloggerSection);
 	}

 	/* We exit the above loop only upon detecting pipe EOF */
@@ -974,6 +986,7 @@ pipeThread(void *arg)

 	/* if there's any data left then force it out now */
 	flush_pipe_input(logbuffer, &bytes_in_logbuffer);
+	LeaveCriticalSection(&sysloggerSection);

 	_endthread();
 	return 0;
 }
@@ -1097,18 +1110,9 @@ logfile_rotate(bool time_based_rotation, int size_rotation_for)
 	_setmode(_fileno(fh), _O_TEXT); /* use CRLF line endings on Windows */
 #endif

-	/* On Windows, need to interlock against data-transfer thread */
-#ifdef WIN32
-	EnterCriticalSection(&sysfileSection);
-#endif
-
 	fclose(syslogFile);
 	syslogFile = fh;

-#ifdef WIN32
-	LeaveCriticalSection(&sysfileSection);
-#endif
-
 	/* instead of pfree'ing filename, remember it for next time */
 	if (last_file_name != NULL)
 		pfree(last_file_name);
@@ -1164,18 +1168,9 @@ logfile_rotate(bool time_based_rotation, int size_rotation_for)
 	_setmode(_fileno(fh), _O_TEXT); /* use CRLF line endings on Windows */
 #endif

-	/* On Windows, need to interlock against data-transfer thread */
-#ifdef WIN32
-	EnterCriticalSection(&sysfileSection);
-#endif
-
 	fclose(csvlogFile);
 	csvlogFile = fh;

-#ifdef WIN32
-	LeaveCriticalSection(&sysfileSection);
-#endif
-
 	/* instead of pfree'ing filename, remember it for next time */
Re: [HACKERS] A maze of twisty mailing lists all the same
On Thu, Apr 8, 2010 at 20:35, Kevin Grittner wrote:
> Greg Stark wrote:
>
>> But all it means is you get a random subset of the messages.
>> You're still missing most of the admin or sql or performance
>> related threads since they're mostly on -general anyways. Those
>> three categories cover pretty much all of -general.
>
> Perhaps -general should be eliminated in favor of more specific
> lists?

That sounds like a great way to make things harder for newbies and outsiders.

-- Magnus Hagander / Me: http://www.hagander.net/ / Work: http://www.redpill-linpro.com/
Re: [HACKERS] A maze of twisty mailing lists all the same
On Thu, Apr 8, 2010 at 7:58 PM, Ned Lilly wrote:
> +1 for the idea, and +1 for the Zork reference. Hello sailor.

FWIW, it's older than Zork. It comes from Adventure (http://en.wikipedia.org/wiki/Colossal_Cave_Adventure)

-- greg
Re: [HACKERS] GSOC PostgreSQL partitioning issue
On Thu, Apr 8, 2010 at 4:23 PM, Necati Batur wrote:
> In order to make a system change a student need to be more informed
> and experienced for the issue. Nonetheless, this step of work is
> actually not the phase of determining all the details, I guess.
> Otherwise, I would just do a few weeks of coding in summer to
> complete the project and I would be the person in charge in project
> management :)

Well, obviously there are going to be details that won't get worked out until you really settle down to do the project. But I don't think you've even really defined what exactly you would be working on. Going through your deliverables one by one:

> *The trigger based operations can be done automatically

What trigger-based operations are you talking about, and how do you plan to automate them?

> *The stored procedures can help us to do some functionalities like
> check constraint problem

What check constraint problem? PostgreSQL already has check constraints. If you think they have a problem, say what it is.

> *manual VACUUM or ANALYZE commands can be handled by using triggers
> DBMS SQL can help to provide faster executions

We already have an autovacuum daemon that automates VACUUM and ANALYZE commands, and it works pretty well. Certainly, there's room for improvement, but what do you think needs improving? What about the current implementation do you not like, and how do you propose to fix it?

> *Some more functionalities can be added to UPDATE operations to make
> administrations easy

What administrative tasks are currently difficult, and what functionality do you propose to add to simplify them?

...Robert
[HACKERS] some more data corruption
Hi,

So I'm still investigating data corruption issues. They have some serious TOAST index corruption too; for example, notice how an index scan returns invalid chunk_ids here:

select xmin, xmax, ctid, cmin, chunk_id, chunk_seq, length(chunk_data)
from pg_toast.pg_toast_18141
where chunk_id between 194679500 and 194679700
order by chunk_id;

  xmin   |  xmax   |    ctid     | cmin | chunk_id | chunk_seq | length
---------+---------+-------------+------+----------+-----------+--------
 9767631 |       0 | (2855683,1) |   27 | 94929948 |         1 |     26
 9767659 |       0 | (2855683,5) |   28 | 94929949 |         1 |     26
 9767659 |       0 | (2855683,6) |   32 | 94929950 |         1 |     26
 9767659 |       0 | (2855683,7) |   34 | 94929951 |         1 |     26
 9767797 | 9773223 | (2855684,4) |   36 | 94929958 |         1 |     27
 9767797 | 9773223 | (2855684,5) |   38 | 94929959 |         1 |     27
 9767797 |       0 | (2855684,6) |   41 | 94929960 |         1 |     27

One thing that surprised me is that they are seeing such short values being pushed out to TOAST. The rows are typically 156 bytes wide, and the values being pushed out are 30 bytes, so I fail to see why it would get done. Any theories? Is this normal?

-- Alvaro Herrera / http://www.CommandPrompt.com/ / PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] GSOC PostgreSQL partitioning issue
In order to make a system change, a student needs to be more informed and experienced about the issue. Nonetheless, this step of work is not actually the phase of determining all the details, I guess. Otherwise, I would just do a few weeks of coding in the summer to complete the project, and I would be the person in charge of project management :)

2010/4/8 Necati Batur :
> Benefits of Project
>
> Partitioning refers to splitting what is logically one large table
> into smaller physical pieces. Partitioning can provide several
> benefits:
>
> Query performance can be improved dramatically in certain situations,
> particularly when most of the heavily accessed rows of the table are
> in a single partition or a small number of partitions. The
> partitioning substitutes for leading columns of indexes, reducing
> index size and making it more likely that the heavily-used parts of
> the indexes fit in memory.
>
> When queries or updates access a large percentage of a single
> partition, performance can be improved by taking advantage of a
> sequential scan of that partition instead of using an index and random
> access reads scattered across the whole table.
>
> Bulk loads and deletes can be accomplished by adding or removing
> partitions, if that requirement is planned into the partitioning
> design. ALTER TABLE is far faster than a bulk operation. It also
> entirely avoids the VACUUM overhead caused by a bulk DELETE.
>
> Seldom-used data can be migrated to cheaper and slower storage media.
>
> Deliverables
>
> * The trigger-based operations can be done automatically
>
> * The stored procedures can help us to implement some functionality,
>   such as the check constraint problem
>
> * Manual VACUUM or ANALYZE commands can be handled by using triggers;
>   DBMS SQL can help to provide faster execution
>
> * Some more functionality can be added to UPDATE operations to make
>   administration easy
>
> Timeline (not exact, but most probable)
>
> Start June 7 and end around September 7
>
> * Warm up to the PostgreSQL environment (1-2 weeks)
>
> * Determine the exact operations to be added to PostgreSQL
>
> * Initial coding according to the work breakdown structure
>
> * Start implementing in a distributed environment to check that the
>   initial functions work
>
> * Write test cases for the code
>
> * Further implementation to fully support the ideas
>
> * Write it up on the discussion site and collect feedback
>
> * More support based on feedback
>
> * Last tests and documentation of the final operations
>
> About me
>
> I am a senior student in computer engineering at IZTECH in Turkey. My
> areas of interest are information management, OOP (Object-Oriented
> Programming), and currently bioinformatics. I have been working with an
> Assistant Professor (Jens Allmer) in the molecular biology and genetics
> department for one year. First, we worked on a protein database, 2DB,
> and we presented the project at HIBIT09. The project was "Database
> management system independence by amending 2DB with a database access
> layer". Currently, I am working on another project (Kerb) as my senior
> project, which is a general sequential task management system intended
> to reduce errors and increase time savings in biological experiments.
> We will present this project at HIBIT2010 too. Moreover, I am good at
> data structures and implementations in C.
>
> Contact: e-mails; necatiba...@gmail.com , necati_ba...@hotmail.com (msn)
Re: [HACKERS] GSOC PostgreSQL partitioning issue
On Thu, Apr 8, 2010 at 4:14 PM, Necati Batur wrote:
> The more specific of the items will just be the exact job I guess.
> However the detailed form will be hard to write now.
> Or should I explain the sql issues for each point?

Partitioning is a big project. It seems to me that if you want to have any chance of making a meaningful contribution in one summer, you're going to need to have a pretty specific idea of what you hope to accomplish up front, and I don't think you have that right now. The hard changes are not going to be adjustments to SQL syntax, but in the guts of the planner, executor, system catalogs, etc.

...Robert
Re: [HACKERS] GSOC PostgreSQL partitioning issue
The more specific version of the items will just be the exact job, I guess. However, the detailed form will be hard to write now. Or should I explain the SQL issues for each point?

2010/4/8 Necati Batur :
> Benefits of Project
>
> Partitioning refers to splitting what is logically one large table
> into smaller physical pieces. Partitioning can provide several
> benefits:
>
> Query performance can be improved dramatically in certain situations,
> particularly when most of the heavily accessed rows of the table are
> in a single partition or a small number of partitions. The
> partitioning substitutes for leading columns of indexes, reducing
> index size and making it more likely that the heavily-used parts of
> the indexes fit in memory.
>
> When queries or updates access a large percentage of a single
> partition, performance can be improved by taking advantage of a
> sequential scan of that partition instead of using an index and random
> access reads scattered across the whole table.
>
> Bulk loads and deletes can be accomplished by adding or removing
> partitions, if that requirement is planned into the partitioning
> design. ALTER TABLE is far faster than a bulk operation. It also
> entirely avoids the VACUUM overhead caused by a bulk DELETE.
>
> Seldom-used data can be migrated to cheaper and slower storage media.
>
> Deliverables
>
> * The trigger-based operations can be done automatically
>
> * The stored procedures can help us to implement some functionality,
>   such as the check constraint problem
>
> * Manual VACUUM or ANALYZE commands can be handled by using triggers;
>   DBMS SQL can help to provide faster execution
>
> * Some more functionality can be added to UPDATE operations to make
>   administration easy
>
> Timeline (not exact, but most probable)
>
> Start June 7 and end around September 7
>
> * Warm up to the PostgreSQL environment (1-2 weeks)
>
> * Determine the exact operations to be added to PostgreSQL
>
> * Initial coding according to the work breakdown structure
>
> * Start implementing in a distributed environment to check that the
>   initial functions work
>
> * Write test cases for the code
>
> * Further implementation to fully support the ideas
>
> * Write it up on the discussion site and collect feedback
>
> * More support based on feedback
>
> * Last tests and documentation of the final operations
>
> About me
>
> I am a senior student in computer engineering at IZTECH in Turkey. My
> areas of interest are information management, OOP (Object-Oriented
> Programming), and currently bioinformatics. I have been working with an
> Assistant Professor (Jens Allmer) in the molecular biology and genetics
> department for one year. First, we worked on a protein database, 2DB,
> and we presented the project at HIBIT09. The project was "Database
> management system independence by amending 2DB with a database access
> layer". Currently, I am working on another project (Kerb) as my senior
> project, which is a general sequential task management system intended
> to reduce errors and increase time savings in biological experiments.
> We will present this project at HIBIT2010 too. Moreover, I am good at
> data structures and implementations in C.
>
> Contact: e-mails; necatiba...@gmail.com , necati_ba...@hotmail.com (msn)
Re: [HACKERS] GSOC PostgreSQL partitioning issue
On Thu, Apr 8, 2010 at 3:58 PM, Necati Batur wrote:
> *The trigger based operations can be done automatically
>
> *The stored procedures can help us to do some functionalities like
> check constraint problem
>
> *manual VACUUM or ANALYZE commands can be handled by using triggers
> DBMS SQL can help to provide faster executions
>
> *Some more functionalities can be added to UPDATE operations to make
> administrations easy

I think you need to be a LOT more specific about each of these items and what you intend to do about them. You also need to explain the relationship between your work and Itagaki Takahiro's existing patch.

...Robert
[HACKERS] GSOC PostgreSQL partitioning issue
Benefits of Project

Partitioning refers to splitting what is logically one large table into smaller physical pieces. Partitioning can provide several benefits:

Query performance can be improved dramatically in certain situations, particularly when most of the heavily accessed rows of the table are in a single partition or a small number of partitions. The partitioning substitutes for leading columns of indexes, reducing index size and making it more likely that the heavily-used parts of the indexes fit in memory.

When queries or updates access a large percentage of a single partition, performance can be improved by taking advantage of a sequential scan of that partition instead of using an index and random access reads scattered across the whole table.

Bulk loads and deletes can be accomplished by adding or removing partitions, if that requirement is planned into the partitioning design. ALTER TABLE is far faster than a bulk operation. It also entirely avoids the VACUUM overhead caused by a bulk DELETE.

Seldom-used data can be migrated to cheaper and slower storage media.

Deliverables

* The trigger-based operations can be done automatically

* The stored procedures can help us to implement some functionality, such as the check constraint problem

* Manual VACUUM or ANALYZE commands can be handled by using triggers; DBMS SQL can help to provide faster execution

* Some more functionality can be added to UPDATE operations to make administration easy

Timeline (not exact, but most probable)

Start June 7 and end around September 7

* Warm up to the PostgreSQL environment (1-2 weeks)

* Determine the exact operations to be added to PostgreSQL

* Initial coding according to the work breakdown structure

* Start implementing in a distributed environment to check that the initial functions work

* Write test cases for the code

* Further implementation to fully support the ideas

* Write it up on the discussion site and collect feedback

* More support based on feedback

* Last tests and documentation of the final operations

About me

I am a senior student in computer engineering at IZTECH in Turkey. My areas of interest are information management, OOP (Object-Oriented Programming), and currently bioinformatics. I have been working with an Assistant Professor (Jens Allmer) in the molecular biology and genetics department for one year. First, we worked on a protein database, 2DB, and we presented the project at HIBIT09. The project was "Database management system independence by amending 2DB with a database access layer". Currently, I am working on another project (Kerb) as my senior project, which is a general sequential task management system intended to reduce errors and increase time savings in biological experiments. We will present this project at HIBIT2010 too. Moreover, I am good at data structures and implementations in C.

Contact: e-mails; necatiba...@gmail.com , necati_ba...@hotmail.com (msn)
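As background, the partitioning available in PostgreSQL at the time of writing is built from table inheritance, CHECK constraints, and a redirecting trigger; a minimal sketch (table and column names invented for illustration) looks like:

```sql
-- Minimal sketch of inheritance-based range partitioning (the approach
-- available in PostgreSQL 8.x/9.0); names are made up for illustration.
CREATE TABLE measurement (id int, logdate date, value int);

CREATE TABLE measurement_2010q1 (
    CHECK (logdate >= DATE '2010-01-01' AND logdate < DATE '2010-04-01')
) INHERITS (measurement);

CREATE TABLE measurement_2010q2 (
    CHECK (logdate >= DATE '2010-04-01' AND logdate < DATE '2010-07-01')
) INHERITS (measurement);

-- A trigger on the parent routes INSERTs to the right child.
CREATE OR REPLACE FUNCTION measurement_insert() RETURNS trigger AS $$
BEGIN
    IF NEW.logdate < DATE '2010-04-01' THEN
        INSERT INTO measurement_2010q1 VALUES (NEW.*);
    ELSE
        INSERT INTO measurement_2010q2 VALUES (NEW.*);
    END IF;
    RETURN NULL;   -- suppress the insert into the parent itself
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER measurement_insert_trig
    BEFORE INSERT ON measurement
    FOR EACH ROW EXECUTE PROCEDURE measurement_insert();

-- With constraint_exclusion on, the planner skips partitions whose
-- CHECK constraint contradicts the query's WHERE clause.
SET constraint_exclusion = on;
EXPLAIN SELECT * FROM measurement WHERE logdate = DATE '2010-02-14';
```

The manual bookkeeping above (triggers, constraints, partition DDL) is exactly the part that proposals in this area aim to automate.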
Re: [HACKERS] autovacuum and temp tables support
On Thu, Apr 8, 2010 at 3:22 PM, Oleg Bartunov wrote:
>> One general thought I've had is that it would be nice if the first attempt to SELECT against a table with no statistics would trigger an automatic ANALYZE by the backend on which the query was executed. It's pretty common to populate a table using INSERT, or CTAS, or COPY and then try to immediately run a query against it, and I've often found that it's necessary to insert manual analyze statements in there to get decent query plans.
>
> Oracle does this. So, isn't it worth adding support (configurable, like Oracle's optimizer_dynamic_sampling)?

Well, dynamic sampling is considerably more complicated than what I proposed, which is just to force an ordinary ANALYZE before the first query against the table. It would be a very powerful feature if we could use it to ameliorate, for example, the gross statistical errors that sometimes occur when multiple, correlated filter conditions are applied to the same base table; but I don't think it's in the direct path of solving the present complaint.

...Robert

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] autovacuum and temp tables support
On Thu, 8 Apr 2010, Robert Haas wrote:

> On Thu, Apr 8, 2010 at 2:53 PM, Alvaro Herrera wrote:
>> Oleg Bartunov wrote:
>>> our client complained about a slow query, which involves temporary tables. Analyzing them manually solved the problem. I don't remember the arguments against temporary table support by autovacuum. I'd appreciate any pointers.
>>
>> Autovacuum can't process temp tables; they could reside in a backend's private temp buffers (local memory, not shared).
>
> One general thought I've had is that it would be nice if the first attempt to SELECT against a table with no statistics would trigger an automatic ANALYZE by the backend on which the query was executed. It's pretty common to populate a table using INSERT, or CTAS, or COPY and then try to immediately run a query against it, and I've often found that it's necessary to insert manual analyze statements in there to get decent query plans.

Oracle does this. So, isn't it worth adding support (configurable, like Oracle's optimizer_dynamic_sampling)?

Regards, Oleg

Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A maze of twisty mailing lists all the same
On Thu, Apr 8, 2010 at 3:09 PM, Joshua D. Drake wrote:
> On Thu, 2010-04-08 at 15:06 -0400, Jaime Casanova wrote:
>> if we want specific topics, then remove -general, -novice, -admin
>
> This will likely never fly, see the archives.

Well, -novice should be easy... actually, it has no reason to exist. After all, what are the rules? Should you subscribe there for your first six months of using Postgres, or until you take a course?

-- Atentamente, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. +59387171157

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] autovacuum and temp tables support
Robert Haas wrote: > On Thu, Apr 8, 2010 at 2:53 PM, Alvaro Herrera >> Autovacuum can't process temp tables; they could reside in a >> backend's private temp buffers (local memory, not shared). > > it would be nice if the first attempt to SELECT against a table > with no statistics would trigger an automatic ANALYZE by the > backend on which the query was executed. +1 as an RFE -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A maze of twisty mailing lists all the same
On Thu, 2010-04-08 at 15:06 -0400, Jaime Casanova wrote:
> On Thu, Apr 8, 2010 at 2:31 PM, Robert Haas wrote:
>> On Thu, Apr 8, 2010 at 2:30 PM, Greg Stark wrote:
>>> On Thu, Apr 8, 2010 at 5:09 PM, Kevin Grittner wrote:
>>>> My set is different, but the principle is the same -- I can't find the time to read all messages to all lists (really, I've tried), so I limit by list to try to target the issues of most interest to me.
>>>
>>> But all it means is you get a random subset of the messages. You're still missing most of the admin or sql or performance related threads since they're mostly on -general anyways. Those three categories cover pretty much all of -general.
>>
>> Maybe we should remove -general. :-)
>
> if we want specific topics, then remove -general, -novice, -admin

This will likely never fly, see the archives.

Joshua D. Drake

> -- Atentamente, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. +59387171157

-- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564 Consulting, Training, Support, Custom Development, Engineering Respect is earned, not gained through arbitrary and repetitive use of Mr. or Sir.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] autovacuum and temp tables support
On Thu, Apr 8, 2010 at 2:53 PM, Alvaro Herrera wrote:
> Oleg Bartunov wrote:
>> our client complained about a slow query, which involves temporary tables. Analyzing them manually solved the problem. I don't remember the arguments against temporary table support by autovacuum. I'd appreciate any pointers.
>
> Autovacuum can't process temp tables; they could reside in a backend's private temp buffers (local memory, not shared).

One general thought I've had is that it would be nice if the first attempt to SELECT against a table with no statistics would trigger an automatic ANALYZE by the backend on which the query was executed. It's pretty common to populate a table using INSERT, or CTAS, or COPY and then try to immediately run a query against it, and I've often found that it's necessary to insert manual analyze statements in there to get decent query plans.

...Robert

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
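The workaround Robert describes — inserting a manual ANALYZE between the bulk load and the first query — looks like this in practice (the table names here are invented for illustration):

```sql
-- Populate a brand-new table with CTAS; it has no statistics yet.
CREATE TABLE recent_orders AS
    SELECT * FROM orders WHERE shipped_at > now() - interval '7 days';

-- Without this, the planner falls back on default estimates and may
-- pick a poor plan for the very next query against the table.
ANALYZE recent_orders;

-- The first real query now sees accurate row counts and distributions.
SELECT customer_id, count(*) FROM recent_orders GROUP BY customer_id;
```

The proposal in this thread is essentially to have the backend run that ANALYZE step implicitly on the first SELECT against a statistics-less table.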
Re: [HACKERS] A maze of twisty mailing lists all the same
On Thu, Apr 8, 2010 at 2:31 PM, Robert Haas wrote: > On Thu, Apr 8, 2010 at 2:30 PM, Greg Stark wrote: >> On Thu, Apr 8, 2010 at 5:09 PM, Kevin Grittner >> wrote: >>> My set is different, but the principle is the same -- I can't find >>> the time to read all messages to all lists (really, I've tried), so >>> I limit by list to try to target the issues of most interest to me. >> >> But all it means is you get a random subset of the messages. You're >> still missing most of the admin or sql or performance related threads >> since they're mostly on -general anyways. Those three categories cover >> pretty much all of -general. > > Maybe we should remove -general. :-) > > if we want specific topics, then remove -general, -novice, -admin -- Atentamente, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. +59387171157 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A maze of twisty mailing lists all the same
+1 for the idea, and +1 for the Zork reference. Hello sailor.

On 4/8/2010 1:11 AM Greg Stark wrote:
> I've often said in the past that we have too many mailing lists with overlapping and vague charters. I submit the following thread as evidence that this causes real problems. http://archives.postgresql.org/message-id/g2o4b46b5f01004010610ib8625426uae6ee90ac1435...@mail.gmail.com
>
> Because the poster chose to send it to pgsql-admin instead of pgsql-general (or pgsql-bugs) very few of the usual suspects had a chance to see it. 7 days later a question about a rather serious database corruption problem had no responses.
>
> I've never understood what the point of pgsql-admin is; just about every question posted is an "admin" question of some sort. Likewise I don't think we should have pgsql-performance or pgsql-sql or pgsql-novice -- any thread appropriate for any of these would be better served by sending it to pgsql-general anyways (with the exception of pgsql-performance, which has a weird combination of hacker threads and user performance-tuning threads). Sending threads to pgsql-general would get more eyes on them and would avoid a lot of the cross-posting headaches. What would someone subscribed to one of these lists but not pgsql-general get anyways but some random sample of threads that might be vaguely performance or admin related? They would still miss most of the administration and performance questions and discussions which happen on -general and -hackers as appropriate.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A maze of twisty mailing lists all the same
Greg Stark wrote:
> But all it means is you get a random subset of the messages. You're still missing most of the admin or sql or performance related threads since they're mostly on -general anyways. Those three categories cover pretty much all of -general.

Well, one of these more specific lists must be getting over half of the messages relevant to its title, unless things are freakishly evenly divided. Message counts in the last 30 days:

  143  -novice
  199  -sql
  321  -admin
  436  -performance
 1099  *subtotal*
 1102  -general
 2201  **total**

-Kevin

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] autovacuum and temp tables support
Oleg Bartunov wrote:
> our client complained about slow query, which involves temporary tables. Analyzing them manually solved the problem. I don't remember arguments against temporary tables support by autovacuum. I'd appreciate any pointers.

Autovacuum can't process temp tables; they could reside in a backend's private temp buffers (local memory, not shared).

-- Alvaro Herrera  http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A maze of twisty mailing lists all the same
Greg Stark wrote: > But all it means is you get a random subset of the messages. > You're still missing most of the admin or sql or performance > related threads since they're mostly on -general anyways. Those > three categories cover pretty much all of -general. Perhaps -general should be eliminated in favor of more specific lists? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A maze of twisty mailing lists all the same
On Thu, Apr 8, 2010 at 2:30 PM, Greg Stark wrote: > On Thu, Apr 8, 2010 at 5:09 PM, Kevin Grittner > wrote: >> My set is different, but the principle is the same -- I can't find >> the time to read all messages to all lists (really, I've tried), so >> I limit by list to try to target the issues of most interest to me. > > But all it means is you get a random subset of the messages. You're > still missing most of the admin or sql or performance related threads > since they're mostly on -general anyways. Those three categories cover > pretty much all of -general. Maybe we should remove -general. :-) ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A maze of twisty mailing lists all the same
On Thu, Apr 8, 2010 at 5:09 PM, Kevin Grittner wrote: > My set is different, but the principle is the same -- I can't find > the time to read all messages to all lists (really, I've tried), so > I limit by list to try to target the issues of most interest to me. But all it means is you get a random subset of the messages. You're still missing most of the admin or sql or performance related threads since they're mostly on -general anyways. Those three categories cover pretty much all of -general. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] autovacuum and temp tables support
Hi there,

our client complained about a slow query which involves temporary tables. Analyzing them manually solved the problem. I don't remember the arguments against temporary table support by autovacuum. I'd appreciate any pointers. Also, is it worth adding an autovacuum_enable_temp_tables variable to control autovacuum behaviour?

Regards, Oleg

Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
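Because autovacuum cannot see into a backend's local temp buffers (as Alvaro notes later in the thread), statistics for temporary tables have to be gathered by the session that owns them. A sketch of the manual fix the client applied (table names are invented for illustration):

```sql
-- Temp tables live in the creating backend's local buffers, so
-- autovacuum never vacuums or analyzes them; the owning session must.
CREATE TEMP TABLE scratch AS
    SELECT * FROM big_table WHERE flag = true;

-- Run in the same session, right after populating the temp table.
ANALYZE scratch;

-- Subsequent queries in this session now plan against real statistics.
SELECT count(*) FROM scratch s JOIN big_table b USING (id);
```

Any proposed autovacuum_enable_temp_tables setting would have to work around this visibility restriction, which is why the manual per-session ANALYZE is the current answer.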
Re: [HACKERS] Set LC_COLLATE to de_DE_phoneb
On Thursday, 08.04.2010 at 10:27 +0900, Takahiro Itagaki wrote:
> Frank Jagusch wrote:
>> "de_DE_phoneb" is the name of an alternative sort order in German (only a few languages have alternate sort orders). You may find some information when you search the MSDN for "de_DE_phoneb", i.e. http://msdn.microsoft.com/en-en/library/ms404373.aspx This alternate sorting is supported by the OS, but I don't know how it is supported in the msvcrt.
>
> Hmmm, I found "de-DE_phoneb" in MSDN: http://msdn.microsoft.com/en-us/library/dd374060 but setlocale("de-DE_phoneb") always fails, at least on my machine.

Is it a Windows box? Maybe you need to install some German language support?

> The doc says "de-DE_phoneb" is a locale name for MAKELCID(MAKELANGID(LANG_GERMAN, SUBLANG_GERMAN), SORT_GERMAN_PHONE_BOOK). Some of the native Win32 APIs could accept the locale and sort-order combination, but setlocale() in the CRT seems to reject it.
>
> So, you could use the locale if you find a setlocale-compatible name for "de-DE_phoneb".

As far as I investigated, I didn't find one. I hoped to find such a setlocale-compatible name, or a hint about where to search, here...

> Or, you cannot use it, unless we modify PostgreSQL to use Win32 locale functions instead of standard libc ones -- but it is hardly acceptable.

I thought PostgreSQL was using OS-specific functions for sorting (meaning Win32 functions?). This does not sound good for my request. So I ask the whole audience: are others out there asking for support for these alternate sort orders? Is it worth discussing further in this direction? Here is another idea: is there a way to define a custom collation for a database?

Thanks so far, Frank Jagusch

-- http://www.jagusch-online.de/frank

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
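On Frank's last question: as of the 8.4/9.0 era there is no way to define a custom collation inside PostgreSQL itself, but since 8.4 the collation can at least be pinned per database at creation time. A sketch — the phone-book locale name is the one under discussion here and will only be accepted if the operating system's setlocale() knows it, which is exactly what is in doubt in this thread:

```sql
-- Per-database collation (PostgreSQL 8.4 and later). template0 must be
-- used as the source when the locale differs from the cluster default.
CREATE DATABASE phonebook_db
    TEMPLATE   template0
    LC_COLLATE 'de-DE_phoneb'   -- accepted only if the OS setlocale() knows it
    LC_CTYPE   'de-DE_phoneb';
```

If setlocale() in the CRT rejects the name, as Takahiro reports, this CREATE DATABASE will fail with an "invalid locale name" error, so finding a setlocale-compatible spelling remains the prerequisite.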
Re: [HACKERS] A maze of twisty mailing lists all the same
On 4/7/10 10:11 PM, Greg Stark wrote: > Likewise I don't think we should have pgsql-performance or pgsql-sql > or pgsql-novice -- any thread appropriate for any of these would be > better served by sending it to pgsql-general anyways (with the > exception of pgsql-performance which has a weird combination of hacker > threads and user performance tuning threads). Sending threads to > pgsql-general would get more eyes on them and would avoid a lot of the > cross-posting headaches. What would someone subscribed to one of these > lists but not pgsql-general get anyways but some random sample of > threads that might be vaguely performance or admin related. They would > still miss most of the administration and performance questions and > discussions which happen on -general and -hackers as appropriate. (1) Regarding -sql and -performance, I couldn't disagree more. I agree that the charter of -admin is extremely vague. (2) This is *definitely* the wrong list for this discussion; it should be on -www. And, no, #2 was not meant to be ironic, even if it is. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby: Startup at shutdown checkpoint
On Thu, 2010-04-08 at 19:02 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
>> OK, that seems better. I'm happy with that instead.
>>
>> Have you tested this? Is it ready to commit?
>
> Only very briefly. I think the code is ready, but please review and test to see I didn't miss anything.

I'm going to need you to commit this. I'm on holiday now until 14 April, so it's not going to get a retest before then; it's not smart to commit and then go on holiday, IIRC. I've reviewed your changes and they look correct to me; the main chunk of code is mine and that was tested by me.

-- Simon Riggs www.2ndQuadrant.com

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A maze of twisty mailing lists all the same
> Perhaps further clarifying the charters of the various lists would help, but folding too much into any one list is likely to reduce the number of readers or cause "spotty" attention. (When I was attempting to follow all the lists, I'd typically give up when I fell about 6000 messages behind, and try to start up again "cold" after having missed a big interval of messages.)

I don't quite agree with this, because -general is *already* at the level where it takes a significant chunk of daily time to keep up with it. All the other mergeable lists pale in comparison to its volume. I stopped trying to read lists completely some time ago, and merely read subject lines at this point, diving into ones that seem interesting or important.

Merging the smaller lists that have a huge overlap of topics with -general would thus be a win, as there would be a larger audience to reply to, and fewer lists to administer and for people to keep track of. It would also reduce the confusion of "which list should I post this to?" I think -admin should absolutely be folded in, -sql really should as well, and slightly less so -performance and -interfaces.

-- Greg Sabino Mullane g...@turnstep.com End Point Corporation http://www.endpoint.com/ PGP Key: 0x14964AC8 201004081214 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A maze of twisty mailing lists all the same
Robert Haas wrote:
> Dave Page wrote:
>> Greg Stark wrote:
>>> Because the poster chose to send it to pgsql-admin instead of pgsql-general (or pgsql-bugs) very few of the usual suspects had a chance to see it. 7 days later a question about a rather serious database corruption problem had no responses.

I do monitor that list, and try to respond to those issues I can, but had no clue what that message was about -- so I left it for someone else to take up. I often see Tom responding to posts on that list, so I kinda figure anything serious (or where I get it wrong) will be addressed by him, but this thread makes me wonder whether we should advise people not to post there when there is any indication of possible corruption or bugs.

>>> I've never understood what the point of pgsql-admin is; just about every question posted is an "admin" question of some sort.

I think you just answered your own question. I've considered it to be a list for DBAs (or those filling that role, regardless of title) to discuss administrative and operational issues and "best practices". That seems useful to me.

>> a counter argument is that merging lists would significantly increase the traffic on -general, which may not be appreciated by the many people that are only subscribed to one or two of the affected lists. I would wager that the majority of people aren't subscribed to more than a small number of the available lists.
>
> Yeah. I read -performance, -hackers, -bugs, but not -sql, -admin, -general.

My set is different, but the principle is the same -- I can't find the time to read all messages to all lists (really, I've tried), so I limit by list to try to target the issues of most interest to me.

> Consolidating multiple mailing lists to increase viewership of certain messages is only going to work if everyone who now follows each of the smaller mailing lists does an equally good job following the bigger one. That doesn't seem like a safe assumption.

Agreed. Perhaps further clarifying the charters of the various lists would help, but folding too much into any one list is likely to reduce the number of readers or cause "spotty" attention. (When I was attempting to follow all the lists, I'd typically give up when I fell about 6000 messages behind, and try to start up again "cold" after having missed a big interval of messages.)

-Kevin

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby: Startup at shutdown checkpoint
Simon Riggs wrote: > OK, that seems better. I'm happy with that instead. > > Have you tested this? Is it ready to commit? Only very briefly. I think the code is ready, but please review and test to see I didn't miss anything. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby: Startup at shutdown checkpoint
On Thu, 2010-04-08 at 18:35 +0300, Heikki Linnakangas wrote:
>> So I have introduced the new mode ("snapshot mode") to enter hot standby anyway. That avoids us having to screw around with the loop logic for redo. I don't see any need to support the case where we have no WAL source defined, yet we want Hot Standby but we also want to allow somebody to drop a WAL file into pg_xlog at some future point. That has no use case of value AFAICS and is too complex to add at this stage of the release cycle.
>
> You don't need a new mode for that. Just do the same "are we consistent now?" check you do in the loop once before calling ReadRecord to fetch the record that follows the checkpoint pointer. Attached is a patch to show what I mean. We just need to let postmaster know that recovery has started a bit earlier, right after processing the checkpoint record, not delaying it until we've read the first record after it.

OK, that seems better. I'm happy with that instead.

Have you tested this? Is it ready to commit?

-- Simon Riggs www.2ndQuadrant.com

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unsafe threading in syslogger on Windows
Heikki Linnakangas wrote:
> [...]
> I'm going to see what happens if I remove all the #ifdef WIN32 blocks in syslogger, and let it use pgpipe() and select() instead of the extra thread.

Sounds reasonable. Let's see how big the changes are on HEAD. I'm not sure it's worth creating a different smaller fix for the back branches.

cheers

andrew

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby: Startup at shutdown checkpoint
Simon Riggs wrote: > In StartupXlog() when we get to the point where we "Find the first > record that logically follows the checkpoint", in the current code > ReadRecord() loops forever, spitting out > LOG: record with zero length at 0/C88 > ... > > That prevents us from going further down StartupXLog() to the point > where we start the InRedo loop and hence start hot standby. As long as > we retry we cannot progress further: this is the main problem. > > So in the patch, I have modified the retry test in ReadRecord() so it no > longer retries iff there is no WAL source defined. Now, when > ReadRecord() exits, record == NULL at that point and so we do not (and > cannot) enter the redo loop. Oh, I see. > So I have introduced the new mode ("snapshot mode") to enter hot standby > anyway. That avoids us having to screw around with the loop logic for > redo. I don't see any need to support the case of where we have no WAL > source defined, yet we want Hot Standby but we also want to allow > somebody to drop a WAL file into pg_xlog at some future point. That has > no use case of value AFAICS and is too complex to add at this stage of > the release cycle. You don't need a new mode for that. Just do the same "are we consistent now?" check you do in the loop once before calling ReadRecord to fetch the record that follows the checkpoint pointer. Attached is a patch to show what I mean. We just need to let postmaster know that recovery has started a bit earlier, right after processing the checkpoint record, not delaying it until we've read the first record after it. 
-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com

*** a/src/backend/access/transam/twophase.c
--- b/src/backend/access/transam/twophase.c
*************** PrescanPreparedTransactions(TransactionId **xids_p, int *nxids_p)
*** 1719,1724 ****
--- 1719,1806 ----
  }
  
  /*
+  * StandbyRecoverPreparedTransactions
+  *
+  * Scan the pg_twophase directory and setup all the required information to
+  * allow standby queries to treat prepared transactions as still active.
+  * This is never called at the end of recovery - we use
+  * RecoverPreparedTransactions() at that point.
+  *
+  * Currently we simply call SubTransSetParent() for any subxids of prepared
+  * transactions.
+  */
+ void
+ StandbyRecoverPreparedTransactions(bool can_overwrite)
+ {
+ 	DIR		   *cldir;
+ 	struct dirent *clde;
+ 
+ 	cldir = AllocateDir(TWOPHASE_DIR);
+ 	while ((clde = ReadDir(cldir, TWOPHASE_DIR)) != NULL)
+ 	{
+ 		if (strlen(clde->d_name) == 8 &&
+ 			strspn(clde->d_name, "0123456789ABCDEF") == 8)
+ 		{
+ 			TransactionId xid;
+ 			char	   *buf;
+ 			TwoPhaseFileHeader *hdr;
+ 			TransactionId *subxids;
+ 			int			i;
+ 
+ 			xid = (TransactionId) strtoul(clde->d_name, NULL, 16);
+ 
+ 			/* Already processed? */
+ 			if (TransactionIdDidCommit(xid) || TransactionIdDidAbort(xid))
+ 			{
+ 				ereport(WARNING,
+ 						(errmsg("removing stale two-phase state file \"%s\"",
+ 								clde->d_name)));
+ 				RemoveTwoPhaseFile(xid, true);
+ 				continue;
+ 			}
+ 
+ 			/* Read and validate file */
+ 			buf = ReadTwoPhaseFile(xid, true);
+ 			if (buf == NULL)
+ 			{
+ 				ereport(WARNING,
+ 						(errmsg("removing corrupt two-phase state file \"%s\"",
+ 								clde->d_name)));
+ 				RemoveTwoPhaseFile(xid, true);
+ 				continue;
+ 			}
+ 
+ 			/* Deconstruct header */
+ 			hdr = (TwoPhaseFileHeader *) buf;
+ 			if (!TransactionIdEquals(hdr->xid, xid))
+ 			{
+ 				ereport(WARNING,
+ 						(errmsg("removing corrupt two-phase state file \"%s\"",
+ 								clde->d_name)));
+ 				RemoveTwoPhaseFile(xid, true);
+ 				pfree(buf);
+ 				continue;
+ 			}
+ 
+ 			/*
+ 			 * Examine subtransaction XIDs ... they should all follow main
+ 			 * XID, and they may force us to advance nextXid.
+ 			 */
+ 			subxids = (TransactionId *)
+ 				(buf + MAXALIGN(sizeof(TwoPhaseFileHeader)));
+ 			for (i = 0; i < hdr->nsubxacts; i++)
+ 			{
+ 				TransactionId subxid = subxids[i];
+ 
+ 				Assert(TransactionIdFollows(subxid, xid));
+ 				SubTransSetParent(xid, subxid, can_overwrite);
+ 			}
+ 		}
+ 	}
+ 	FreeDir(cldir);
+ }
+ 
+ /*
   * RecoverPreparedTransactions
   *
   * Scan the pg_twophase directory and reload shared-memory state for each

*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***************
*** 494,499 ****
--- 494,501 ----
  static XLogRecPtr minRecoveryPoint; /* local copy of
  								 * ControlFile->minRecoveryPoint */
  static bool updateMinRecoveryPoint = true;
+ static bool reachedMinRecoveryPoint = false;
+ static bool InRedo = false;
  
  /*
***************
*** 547,552 ****
--- 549,555 ----
  static void ValidateXLOGDirectoryStructure(void);
  static void CleanupBackupHistory(void);
  static void UpdateMinRecoveryPoint(XLogRecPtr lsn, bool force);
  static XLogRecord *ReadRecord(XLogRecPtr *RecPtr, int emode, bool fetching_ckpt);
+ static void CheckRecoveryConsistency(void);
  static bool ValidXLOGHeader(XLogPageHead
[HACKERS] Unsafe threading in syslogger on Windows
On Windows, syslogger uses two threads. The main thread loops and polls if any SIGHUPs have been received or if the log file needs to be rotated. Another thread, "pipe thread", does ReadFile() on the pipe that other processes send their log messages to. ReadFile() blocks, and whenever new data arrives, it is processed in the pipe thread. Both threads use palloc()/pfree(), which are not thread-safe :-(. It's hard to trigger a crash because the main thread mostly just sleeps, and the pipe thread only uses palloc()/pfree() when it receives chunked messages, larger than 512 bytes. Browsing the CVS history, this was made visibly broken by the patch that introduced the message chunking. Before that the pipe thread just read from the pipe and wrote to the log file, which was safe. It has always used ereport() to report read errors, though, which can do palloc(), but there shouldn't normally be any read errors. I chatted with Magnus about this, and he suggested using a Windows critical section to make sure that only one of the threads is active at a time. That seems suitable for back-porting, but I'd like to get rid of this threading in CVS head, it seems too error-prone. The reason it uses threads like this on Windows is explained in the comments: > /* > * Worker thread to transfer data from the pipe to the current logfile. > * > * We need this because on Windows, WaitForSingleObject does not work on > * unnamed pipes: it always reports "signaled", so the blocking ReadFile won't > * allow for SIGHUP; and select is for sockets only. > */ But Magnus pointed out that our pgpipe() implementation on Windows actually creates a pair of sockets instead of pipes, for exactly that reason, so that you can use select() on the returned file descriptor. For some reason syslogger explicitly doesn't use pgpipe() on Windows, though, but calls CreatePipe(). I don't see any explanation why. 
I'm going to see what happens if I remove all the #ifdef WIN32 blocks in syslogger, and let it use pgpipe() and select() instead of the extra thread. Thoughts? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Remaining Streaming Replication Open Items
On Thu, Apr 8, 2010 at 9:56 AM, Simon Riggs wrote: > On Thu, 2010-04-08 at 09:40 -0400, Robert Haas wrote: >> On Thu, Apr 8, 2010 at 8:00 AM, Simon Riggs wrote: >> > On Thu, 2010-04-08 at 07:53 -0400, Robert Haas wrote: >> > >> >> > I do. I see no reason to do the latter, ever, so should not be added to >> >> > any TODO. >> >> >> >> Well, stopping recovery earlier would mean fewer locks, which would >> >> mean a better chance for the read-only backends to finish their work >> >> and exit quickly. But I'm not sure how much it's worth worrying >> >> about. >> > >> > The purpose of the lock is to prevent access to objects when they are in >> > inappropriate states for access. If we stopped startup and allowed >> > access, how do we know that things are in sufficiently good state to >> > allow access? We don't. If the Startup process is holding a lock then >> > that is the only safe thing to do. Otherwise we might allow access to a >> > table with a partially built index or other screw ups. >> >> Hmm. Good point. I guess you could really only stop the startup >> process safely when it wasn't holding any locks anyhow - you couldn't >> just kill it and have it release the locks. > > ... and if it isn't holding any locks at all, there is no reason to kill > Startup first => no TODO item. I think you could shut it down at the first point at which it is holding no locks, rather than letting it continue recovering and potentially retake some new locks. That would be more consistent with the general idea of what a smart shutdown is supposed to be about. I think the real question is whether it's worth the code complexity. I suspect most people use fast shutdown most of the time anyway in real-world applications. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Remaining Streaming Replication Open Items
On Thu, 2010-04-08 at 09:40 -0400, Robert Haas wrote: > On Thu, Apr 8, 2010 at 8:00 AM, Simon Riggs wrote: > > On Thu, 2010-04-08 at 07:53 -0400, Robert Haas wrote: > > > >> > I do. I see no reason to do the latter, ever, so should not be added to > >> > any TODO. > >> > >> Well, stopping recovery earlier would mean fewer locks, which would > >> mean a better chance for the read-only backends to finish their work > >> and exit quickly. But I'm not sure how much it's worth worrying > >> about. > > > > The purpose of the lock is to prevent access to objects when they are in > > inappropriate states for access. If we stopped startup and allowed > > access, how do we know that things are in sufficiently good state to > > allow access? We don't. If the Startup process is holding a lock then > > that is the only safe thing to do. Otherwise we might allow access to a > > table with a partially built index or other screw ups. > > Hmm. Good point. I guess you could really only stop the startup > process safely when it wasn't holding any locks anyhow - you couldn't > just kill it and have it release the locks. ... and if it isn't holding any locks at all, there is no reason to kill Startup first => no TODO item. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [pgadmin-hackers] Feature request: limited deletions
Hi all,

On Thu, 2010-04-08 at 07:45 -0400, Robert Haas wrote:
> >> 2010/4/8 Thom Brown :
> >> > So you could write:
> >> >
> >> > DELETE FROM massive_table WHERE id < 4000 LIMIT 1;

> I've certainly worked around the lack of this syntax more than once.
> And I bet it's not even that hard to implement.

The fact that it's not implemented has nothing to do with its complexity (in fact it is probably just a matter of enabling it) - you'll have a hard time convincing some old-time hackers on this list that the non-determinism inherent in this kind of query is acceptable ;-)

There is a workaround, which in fact works quite well:

delete from massive_table where ctid = any(array(select ctid from massive_table WHERE id < 4000 LIMIT 1));

Run an explain on it and you'll see it won't get any better, but beware that it might be less optimal than you think: you will likely be sequentially scanning the table for each chunk unless you put some selective where conditions on it too - and even then you'll still scan the whole deleted part and not just the next chunk. The deleted records won't get out of the way magically; you need to vacuum, and that's probably a problem too on a big table. So most likely it will help you less than you think on a massive table: the run time per chunk will increase with each chunk unless you're able to vacuum efficiently. In any case you need to balance the chunk size against the scanned portion of the table so that you get a reasonable run time per chunk, without too much overhead from the chunking process...

Cheers, Csaba. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] system table/view and sequence
On 7 avr, 17:44, nicolas.barb...@gmail.com (Nicolas Barbier) wrote:
> 2010/4/7 Olivier Baheux :
>
> > i'm trying to find where the sequence definition
> > (increment, minvalue, maxvalue, start, cache) is stored in system tables.
> > Atm I found everything except sequences.
>
> It's in the sequence itself (which can be accessed like a table). The
> fact that this "table" is in fact a sequence is stored in pg_class:
>
> 8<
> itsme=# CREATE TABLE a (i serial);
> NOTICE: CREATE TABLE will create implicit sequence "a_i_seq" for serial column "a.i"
> CREATE TABLE
> itsme=# SELECT * FROM a_i_seq;
>  sequence_name | last_value | increment_by |      max_value      | min_value | cache_value | log_cnt | is_cycled | is_called
> ---------------+------------+--------------+---------------------+-----------+-------------+---------+-----------+-----------
>  a_i_seq       |          1 |            1 | 9223372036854775807 |         1 |           1 |       1 | f         | f
> (1 row)
>
> itsme=# SELECT relkind FROM pg_class WHERE relname = 'a_i_seq';
>  relkind
> ---------
>  S
> (1 row)
> >8
>
> Nicolas
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hack...@postgresql.org)
> To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

It works, thanks so much. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby: Startup at shutdown checkpoint
On Thu, Apr 8, 2010 at 6:16 AM, Simon Riggs wrote: > If standby_mode is enabled and there is no source of WAL, then we get a > stream of messages saying > > LOG: record with zero length at 0/C88 > ... > > but most importantly we never get to the main recovery loop, so Hot > Standby never gets to start at all. We can't keep retrying the request > for WAL and at the same time enter the retry loop, executing lots of > things that expect non-NULL pointers using a NULL xlog pointer. This is pretty much a corner case, so I don't think it's a good idea to add a new mode to handle it. It also seems like it would be pretty inconsistent if we allow WAL to be dropped in pg_xlog, but only if we are also doing archive recovery or streaming replication. If we can't support this case with the same code path we use otherwise, I think we should revert to disallowing it. Having said that, I guess I don't understand how having a source of WAL solves the problem described above. Do we always have to read at least 1 byte of WAL from either SR or the archive before starting up? If not, why do we need to do so here? ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Remaining Streaming Replication Open Items
On Thu, Apr 8, 2010 at 8:00 AM, Simon Riggs wrote: > On Thu, 2010-04-08 at 07:53 -0400, Robert Haas wrote: > >> > I do. I see no reason to do the latter, ever, so should not be added to >> > any TODO. >> >> Well, stopping recovery earlier would mean fewer locks, which would >> mean a better chance for the read-only backends to finish their work >> and exit quickly. But I'm not sure how much it's worth worrying >> about. > > The purpose of the lock is to prevent access to objects when they are in > inappropriate states for access. If we stopped startup and allowed > access, how do we know that things are in sufficiently good state to > allow access? We don't. If the Startup process is holding a lock then > that is the only safe thing to do. Otherwise we might allow access to a > table with a partially built index or other screw ups. Hmm. Good point. I guess you could really only stop the startup process safely when it wasn't holding any locks anyhow - you couldn't just kill it and have it release the locks. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Remaining Streaming Replication Open Items
On Thu, 2010-04-08 at 07:53 -0400, Robert Haas wrote: > > I do. I see no reason to do the latter, ever, so should not be added to > > any TODO. > > Well, stopping recovery earlier would mean fewer locks, which would > mean a better chance for the read-only backends to finish their work > and exit quickly. But I'm not sure how much it's worth worrying > about. The purpose of the lock is to prevent access to objects when they are in inappropriate states for access. If we stopped startup and allowed access, how do we know that things are in sufficiently good state to allow access? We don't. If the Startup process is holding a lock then that is the only safe thing to do. Otherwise we might allow access to a table with a partially built index or other screw ups. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Remaining Streaming Replication Open Items
On Thu, Apr 8, 2010 at 7:37 AM, Simon Riggs wrote: > On Thu, 2010-04-08 at 06:58 -0400, Robert Haas wrote: > >> >> Thanks. Committed. >> > >> > Thanks. The following TODO item should be removed? >> > >> > "Redefine smart shutdown in standby mode to exist as soon as all >> > read-only connections are gone." >> > http://wiki.postgresql.org/wiki/Todo#Standby_server_mode > >> > Or change it to something like? >> > >> > "Change smart shutdown in standby mode so that it kills the startup >> > and walreceiver process before waiting for the regular backends to die >> > off" >> >> Yeah, we should do one of those two things, but I don't much care which. > > I do. I see no reason to do the latter, ever, so should not be added to > any TODO. Well, stopping recovery earlier would mean fewer locks, which would mean a better chance for the read-only backends to finish their work and exit quickly. But I'm not sure how much it's worth worrying about. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [pgadmin-hackers] Feature request: limited deletions
On Thu, Apr 8, 2010 at 7:05 AM, Thom Brown wrote: > On 8 April 2010 11:55, Ian Barwick wrote: >> >> 2010/4/8 Thom Brown : >> > I couldn't find any discussion on this, but the request is quite >> > straightforward. Implement a LIMIT on DELETE statements like SELECT >> > statements. >> > >> > So you could write: >> > >> > DELETE FROM massive_table WHERE id < 4000 LIMIT 1; >> > >> > This would allow deletions in smaller batches rather than waiting >> > potentially hours for the server to mark all those rows as deleted and >> > commit it as one massive transaction. >> >> Is this a PgAdmin-specific question? If it is, apologies I am missing >> the context. >> >> If not, this is totally the wrong list, but why not use a subquery to >> control what is deleted? > > Erm... my mistake, I thought this was on the generic hackers list. Moving > it over in this reply. I've certainly worked around the lack of this syntax more than once. And I bet it's not even that hard to implement. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Remaining Streaming Replication Open Items
On Thu, 2010-04-08 at 06:58 -0400, Robert Haas wrote: > >> Thanks. Committed. > > > > Thanks. The following TODO item should be removed? > > > > "Redefine smart shutdown in standby mode to exist as soon as all > > read-only connections are gone." > > http://wiki.postgresql.org/wiki/Todo#Standby_server_mode > > Or change it to something like? > > > > "Change smart shutdown in standby mode so that it kills the startup > > and walreceiver process before waiting for the regular backends to die off" > > Yeah, we should do one of those two things, but I don't much care which. I do. I see no reason to do the latter, ever, so should not be added to any TODO. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby: Startup at shutdown checkpoint
On Thu, 2010-04-08 at 13:33 +0300, Heikki Linnakangas wrote:
> > If standby_mode is enabled and there is no source of WAL, then we get a
> > stream of messages saying
> >
> > LOG: record with zero length at 0/C88
> > ...
> >
> > but most importantly we never get to the main recovery loop, so Hot
> > Standby never gets to start at all. We can't keep retrying the request
> > for WAL and at the same time enter the retry loop, executing lots of
> > things that expect non-NULL pointers using a NULL xlog pointer.
>
> You mean it can't find even the checkpoint record to start replaying?

Clearly I don't mean that. Otherwise it wouldn't be "start from a shutdown checkpoint". I think you are misunderstanding me. Let me explain in more detail, though please read the patch before replying.

The patch I submitted at the top of this thread works for allowing Hot Standby during recovery. Yes, of course that occurs when the database is consistent. The trick is to get recovery to the point where it can be enabled. The second patch on this thread presents a way to get the database to that point; it touches some of the other recovery code that you and Masao have worked on. We *must* touch that code if we are to enable Hot Standby in the way you desire.

In StartupXLOG(), when we get to the point where we "Find the first record that logically follows the checkpoint", in the current code ReadRecord() loops forever, spitting out

LOG: record with zero length at 0/C88
...

That prevents us from going further down StartupXLOG() to the point where we start the InRedo loop and hence start hot standby. As long as we retry we cannot progress further: this is the main problem. So in the patch, I have modified the retry test in ReadRecord() so it no longer retries when there is no WAL source defined. Now, when ReadRecord() exits, record == NULL at that point and so we do not (and cannot) enter the redo loop. So I have introduced the new mode ("snapshot mode") to enter hot standby anyway. 
That avoids us having to screw around with the loop logic for redo. I don't see any need to support the case of where we have no WAL source defined, yet we want Hot Standby but we also want to allow somebody to drop a WAL file into pg_xlog at some future point. That has no use case of value AFAICS and is too complex to add at this stage of the release cycle. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [pgadmin-hackers] Feature request: limited deletions
On 8 April 2010 11:55, Ian Barwick wrote: > 2010/4/8 Thom Brown : > > I couldn't find any discussion on this, but the request is quite > > straightforward. Implement a LIMIT on DELETE statements like SELECT > > statements. > > > > So you could write: > > > > DELETE FROM massive_table WHERE id < 4000 LIMIT 1; > > > > This would allow deletions in smaller batches rather than waiting > > potentially hours for the server to mark all those rows as deleted and > > commit it as one massive transaction. > > Is this a PgAdmin-specific question? If it is, apologies I am missing > the context. > > If not, this is totally the wrong list, but why not use a subquery to > control what is deleted? > > > Ian Barwick > Erm... my mistake, I thought this was on the generic hackers list. Moving it over in this reply. Thom
Re: [HACKERS] Remaining Streaming Replication Open Items
On Thu, Apr 8, 2010 at 2:54 AM, Fujii Masao wrote: > On Thu, Apr 8, 2010 at 10:41 AM, Robert Haas wrote: >> On Wed, Apr 7, 2010 at 8:17 AM, Simon Riggs wrote: >>> OK, that looks a lot less risky than I had understood from discussions. >>> The main thing for me is it doesn't interfere with Startup or >>> WalReceiver, so assuming it works I've got no objections. Thanks for >>> chasing this down, good addition. >> >> Thanks. Committed. > > Thanks. The following TODO item should be removed? > > "Redefine smart shutdown in standby mode to exist as soon as all > read-only connections are gone." > http://wiki.postgresql.org/wiki/Todo#Standby_server_mode > > Or change it to something like? > > "Change smart shutdown in standby mode so that it kills the startup > and walreceiver process before waiting for the regular backends to die off" Yeah, we should do one of those two things, but I don't much care which. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A maze of twisty mailing lists all the same
On Thu, Apr 8, 2010 at 3:46 AM, Dave Page wrote: > On Thu, Apr 8, 2010 at 6:11 AM, Greg Stark wrote: >> Because the poster chose to send it to pgsql-admin instead of >> pgsql-general (or pgsql-bugs) very few of the usual suspects had a >> chance to see it. 7 days later a question about a rather serious >> database corruption problem had no responses. I've never understand >> what the point of pgsql-admin is; just about every question posted is >> an "admin" question of some sort. > > I can't argue with that... but a counter argument is that merging > lists would significantly increase the traffic on -general would may > not be appreciated by the many people that are only subscribed to one > or two of the affected lists. I would wager that the majority of > people aren't subscribed to more than a small number of the available > lists. Yeah. I read -performance, -hackers, -bugs, but not -sql, -admin, -general. Consolidating multiple mailing lists to increase viewership of certain messages is only going to work if everyone who now follows each of the smaller mailing lists does an equally good job following the bigger one. That doesn't seem like a safe assumption. I might be able to buy an argument that -admin is too fuzzy to be readily distinguishable, although I don't really know since I don't read it. But -performance seems to have a fairly well-defined charter and it's a subset of messages I enjoy reading. Of course if some performance questions get posted elsewhere, yeah, I'll miss them, but oh well: reading every message on every topic hasn't seemed like a good way to address that problem. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby: Startup at shutdown checkpoint
Simon Riggs wrote:
> On Tue, 2010-04-06 at 10:22 +0100, Simon Riggs wrote:
>
>> Initial patch. I will be testing over next day. No commit before at
>> least midday on Wed 7 Apr.
>
> Various previous discussions sidelined a very important point: what
> exactly does it mean to "start recovery from a shutdown checkpoint"?

Hot standby should be possible as soon as we know that the database is consistent. That is, as soon as we've replayed WAL past the minRecoveryPoint/backupStartPoint point indicated in pg_control.

> If standby_mode is enabled and there is no source of WAL, then we get a
> stream of messages saying
>
> LOG: record with zero length at 0/C88
> ...
>
> but most importantly we never get to the main recovery loop, so Hot
> Standby never gets to start at all. We can't keep retrying the request
> for WAL and at the same time enter the retry loop, executing lots of
> things that expect non-NULL pointers using a NULL xlog pointer.

You mean it can't find even the checkpoint record to start replaying? I think the behavior in that scenario is fine as it is. The database isn't consistent (or at least we can't know if it is, because we don't know the redo pointer) until you read and replay the first checkpoint record.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby: Startup at shutdown checkpoint
On Tue, 2010-04-06 at 10:22 +0100, Simon Riggs wrote: > Initial patch. I will be testing over next day. No commit before at > least midday on Wed 7 Apr. Various previous discussions sidelined a very important point: what exactly does it mean to "start recovery from a shutdown checkpoint"? If standby_mode is enabled and there is no source of WAL, then we get a stream of messages saying LOG: record with zero length at 0/C88 ... but most importantly we never get to the main recovery loop, so Hot Standby never gets to start at all. We can't keep retrying the request for WAL and at the same time enter the retry loop, executing lots of things that expect non-NULL pointers using a NULL xlog pointer. What we are asking for here is a completely new state: the database is not "in recovery" - by definition there is nothing at all to recover. The following patch adds "Snapshot Mode", a very simple variation on the existing code - emphasis on the "simple": LOG: entering snapshot mode LOG: record with zero length at 0/C88 LOG: consistent recovery state reached at 0/C88 LOG: database system is ready to accept read only connections this mode does *not* continually check to see if new WAL files have been added. Startup just sits and waits, backends allowed. If a trigger file is specified, then we can leave recovery. Otherwise Startup process just sits doing nothing. There's possibly an argument for inventing some more special modes where we do allow read only connections but don't start the bgwriter. I don't personally wish to do this at this stage of the release cycle. The attached patch is non-invasive and safe and I want to leave it at that. I will be committing later today, unless major objections, but I ask you to read the patch before you sharpen your pen. It's simple. 
-- 
Simon Riggs
www.2ndQuadrant.com

diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index e2566a4..365cd17 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -1719,6 +1719,88 @@ PrescanPreparedTransactions(TransactionId **xids_p, int *nxids_p)
 }
 
 /*
+ * StandbyRecoverPreparedTransactions
+ *
+ * Scan the pg_twophase directory and setup all the required information to
+ * allow standby queries to treat prepared transactions as still active.
+ * This is never called at the end of recovery - we use
+ * RecoverPreparedTransactions() at that point.
+ *
+ * Currently we simply call SubTransSetParent() for any subxids of prepared
+ * transactions.
+ */
+void
+StandbyRecoverPreparedTransactions(bool can_overwrite)
+{
+	DIR		   *cldir;
+	struct dirent *clde;
+
+	cldir = AllocateDir(TWOPHASE_DIR);
+	while ((clde = ReadDir(cldir, TWOPHASE_DIR)) != NULL)
+	{
+		if (strlen(clde->d_name) == 8 &&
+			strspn(clde->d_name, "0123456789ABCDEF") == 8)
+		{
+			TransactionId xid;
+			char	   *buf;
+			TwoPhaseFileHeader *hdr;
+			TransactionId *subxids;
+			int			i;
+
+			xid = (TransactionId) strtoul(clde->d_name, NULL, 16);
+
+			/* Already processed? */
+			if (TransactionIdDidCommit(xid) || TransactionIdDidAbort(xid))
+			{
+				ereport(WARNING,
+						(errmsg("removing stale two-phase state file \"%s\"",
+								clde->d_name)));
+				RemoveTwoPhaseFile(xid, true);
+				continue;
+			}
+
+			/* Read and validate file */
+			buf = ReadTwoPhaseFile(xid, true);
+			if (buf == NULL)
+			{
+				ereport(WARNING,
+						(errmsg("removing corrupt two-phase state file \"%s\"",
+								clde->d_name)));
+				RemoveTwoPhaseFile(xid, true);
+				continue;
+			}
+
+			/* Deconstruct header */
+			hdr = (TwoPhaseFileHeader *) buf;
+			if (!TransactionIdEquals(hdr->xid, xid))
+			{
+				ereport(WARNING,
+						(errmsg("removing corrupt two-phase state file \"%s\"",
+								clde->d_name)));
+				RemoveTwoPhaseFile(xid, true);
+				pfree(buf);
+				continue;
+			}
+
+			/*
+			 * Examine subtransaction XIDs ... they should all follow main
+			 * XID, and they may force us to advance nextXid.
+			 */
+			subxids = (TransactionId *)
+				(buf + MAXALIGN(sizeof(TwoPhaseFileHeader)));
+			for (i = 0; i < hdr->nsubxacts; i++)
+			{
+				TransactionId subxid = subxids[i];
+
+				Assert(TransactionIdFollows(subxid, xid));
+				SubTransSetParent(xid, subxid, can_overwrite);
+			}
+		}
+	}
+	FreeDir(cldir);
+}
+
+/*
  * RecoverPreparedTransactions
  *
  * Scan the pg_twophase directory and reload shared-memory state for each
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 3000ab7..d26b369 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -180,6 +180,7 @@ static TimestampTz recoveryLastXTime = 0;
 
 /* options taken from recovery.conf for XLOG streaming */
 static bool StandbyMode = false;
+static bool SnapshotMode = false;
 static char *PrimaryConnInfo = NULL;
 char	   *TriggerFile = NULL;
 
@@ -3845,7 +3846,7 @@ next_record_is_invalid:
 	}
 
 	/* In standby-mode, keep trying *
Re: [HACKERS] Oddly indented raw_expression_tree_walker
Takahiro Itagaki writes: > I found raw_expression_tree_walker() is oddly indented in 8.4 and HEAD. > I expected pgindent would fix those clutter, but it could not. > Should we cleanup it manually, or leave it as-is? There is exactly zero point in a manual cleanup, because pgindent will just do it again next time. If you want to try to fix pgindent, though, have at it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A maze of twisty mailing lists all the same
On Thu, Apr 8, 2010 at 9:46 AM, Dave Page wrote: > On Thu, Apr 8, 2010 at 6:11 AM, Greg Stark wrote: >> Because the poster chose to send it to pgsql-admin instead of >> pgsql-general (or pgsql-bugs) very few of the usual suspects had a >> chance to see it. 7 days later a question about a rather serious >> database corruption problem had no responses. I've never understand >> what the point of pgsql-admin is; just about every question posted is >> an "admin" question of some sort. > > I can't argue with that... but a counter argument is that merging > lists would significantly increase the traffic on -general would may > not be appreciated by the many people that are only subscribed to one > or two of the affected lists. I would wager that the majority of > people aren't subscribed to more than a small number of the available > lists. That's actually something we can easily find out, if we can get a list of the subscribers emails into a Real Database. I know this bunch of database geeks who write strange "ess-cue-ell kweriis", or whatever they call it, to make such analysis... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walreceiver is uninterruptible on win32
On Wed, Apr 7, 2010 at 1:45 AM, Magnus Hagander wrote:
> No, I don't mean that. I mean store it in one place, and copy/link it
> into where it's used. Look at for example how crypt.c and
> getaddrinfo.c are handled in libpq.

Thanks for the advice!

> Not sure how that will play with PGXS, though, but I'm not entirely
> sure we care if it can be built that way?

Probably yes.

> If it does, there should be
> some way to get PGXS to execute that rule as well, I'm sure.

If we can copy/link the source file defining the "new PQexec" when we compile dblink, a separate DLL doesn't seem to be required. So I'll stop creating a new DLL for PGXS.

> Also note that per Tom's comments this is not a win32 only fix, so it
> shouldn't be called pgwin32_*().

Yep.

Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A maze of twisty mailing lists all the same
On Thu, Apr 8, 2010 at 8:46 AM, Dave Page wrote: > I can't argue with that... but a counter argument is ... Yes, I know. Clearly it's coffee time :-p -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [COMMITTERS] pgsql: Forbid using pg_xlogfile_name() and pg_xlogfile_name_offset()
On Thu, 2010-04-08 at 16:41 +0900, Fujii Masao wrote: > >> > Why? The tli of the last WAL record received is always the > >> > recovery target tli currently. > >> > >> True. > > > > Only in streaming mode. If you use the current TLI as I have suggested > > then it will be correct in more cases. > > pg_last_xlog_receive_location() might be executed also after archive > recovery ends. In this case, using the current tli seems not correct > because it's always different from the recovery target tli after recovery. Which is why the code I write says "if (RecoveryInProgress())" > >> Hmm, currently pg_last_xlog_receive_location() returns the last location > >> streamed via streaming replication. Should that be changed so that it > >> also advances when a WAL segment is restored from archive? It seems > >> strange that pg_last_xlog_receive_location() can be smaller than > >> pg_last_xlog_replay_location(). > > > > Yes, it should be changed. > > Should it advance when WAL file in pg_xlog is read? If not, > pg_last_xlog_receive_location() can be smaller than > pg_last_xlog_replay_location(). > > And, how far should it advance when WAL file is > partially-filled for some reasons? Just make pg_last_xlog_receive_location() do exactly the same thing as pg_last_xlog_replay_location() when working with files. No need to try to keep two things exactly in sync. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A maze of twisty mailing lists all the same
On Thu, Apr 8, 2010 at 6:11 AM, Greg Stark wrote:
> Because the poster chose to send it to pgsql-admin instead of
> pgsql-general (or pgsql-bugs) very few of the usual suspects had a
> chance to see it. 7 days later a question about a rather serious
> database corruption problem had no responses. I've never understood
> what the point of pgsql-admin is; just about every question posted is
> an "admin" question of some sort.

I can't argue with that... but a counter argument is that merging lists would significantly increase the traffic on -general, which may not be appreciated by the many people that are only subscribed to one or two of the affected lists. I would wager that the majority of people aren't subscribed to more than a small number of the available lists.

-- Dave Page EnterpriseDB UK: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [COMMITTERS] pgsql: Forbid using pg_xlogfile_name() and pg_xlogfile_name_offset()
On Thu, Apr 8, 2010 at 4:06 PM, Simon Riggs wrote:
> On Thu, 2010-04-08 at 09:54 +0300, Heikki Linnakangas wrote:
>> Fujii Masao wrote:
>> > On Wed, Apr 7, 2010 at 7:23 PM, Heikki Linnakangas wrote:
>> >> This commit is a stop-gap solution until we figure out what exactly to
>> >> do about that. Masao-san wrote a patch that included the TLI in the
>> >> string returned by pg_last_xlog_receive/replay_location() (see
>> >> http://archives.postgresql.org/message-id/3f0b79eb1003030603ibd0cbadjebb09fa424930...@mail.gmail.com
>> >> and
>> >> http://archives.postgresql.org/message-id/3f0b79eb1003300214r6cf98c46tc9be5d563ccf4...@mail.gmail.com),
>> >> but it still wasn't clear it did the right thing in corner-cases where
>> >> the TLI changes. Using GetRecoveryTargetTLI() for the tli returned by
>> >> pg_last_receive_location() seems bogus, at least.
>> >
>> > Why? The tli of the last WAL record received is always the
>> > recovery target tli currently.
>>
>> True.
>
> Only in streaming mode. If you use the current TLI as I have suggested
> then it will be correct in more cases.

pg_last_xlog_receive_location() might also be executed after archive
recovery ends. In this case, using the current tli seems incorrect,
because it's always different from the recovery target tli after recovery.

>> Hmm, currently pg_last_xlog_receive_location() returns the last location
>> streamed via streaming replication. Should that be changed so that it
>> also advances when a WAL segment is restored from archive? It seems
>> strange that pg_last_xlog_receive_location() can be smaller than
>> pg_last_xlog_replay_location().
>
> Yes, it should be changed.

Should it advance when a WAL file in pg_xlog is read? If not,
pg_last_xlog_receive_location() can be smaller than
pg_last_xlog_replay_location().

And, how far should it advance when a WAL file is
partially filled for some reason?
Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
[HACKERS] Re: [COMMITTERS] pgsql: Forbid using pg_xlogfile_name() and pg_xlogfile_name_offset()
On Thu, 2010-04-08 at 09:54 +0300, Heikki Linnakangas wrote:
> Fujii Masao wrote:
> > On Wed, Apr 7, 2010 at 7:23 PM, Heikki Linnakangas wrote:
> >> This commit is a stop-gap solution until we figure out what exactly to
> >> do about that. Masao-san wrote a patch that included the TLI in the
> >> string returned by pg_last_xlog_receive/replay_location() (see
> >> http://archives.postgresql.org/message-id/3f0b79eb1003030603ibd0cbadjebb09fa424930...@mail.gmail.com
> >> and
> >> http://archives.postgresql.org/message-id/3f0b79eb1003300214r6cf98c46tc9be5d563ccf4...@mail.gmail.com),
> >> but it still wasn't clear it did the right thing in corner-cases where
> >> the TLI changes. Using GetRecoveryTargetTLI() for the tli returned by
> >> pg_last_receive_location() seems bogus, at least.
> >
> > Why? The tli of the last WAL record received is always the
> > recovery target tli currently.
>
> True.

Only in streaming mode. If you use the current TLI as I have suggested
then it will be correct in more cases.

> Hmm, currently pg_last_xlog_receive_location() returns the last location
> streamed via streaming replication. Should that be changed so that it
> also advances when a WAL segment is restored from archive? It seems
> strange that pg_last_xlog_receive_location() can be smaller than
> pg_last_xlog_replay_location().

Yes, it should be changed.

-- 
 Simon Riggs           www.2ndQuadrant.com