Re: [HACKERS] pg_restore --no-post-data and --post-data-only
On 12/8/11 9:18 PM, Joachim Wieland wrote:
> If you ask pg_restore to restore a section out of an archive which
> doesn't have this section, there is no error and the command just
> succeeds. This is what I expected and I think it's the right thing to
> do but maybe others think that
> there should be a warning.

Andrew and I discussed this previously. It's consistent with how we treat
other options in pg_restore. It may be that we should be treating all such
options differently (but consistently); I don't think that question is
specific to this patch, though.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_restore --no-post-data and --post-data-only
On Tue, Nov 15, 2011 at 6:14 PM, Andrew Dunstan wrote:
> Updated version with pg_restore included is attached.

The patch applies with some fuzz by now, but it compiles without errors or
warnings. The feature just works. It does not add a lot of new code;
basically it parses the given options and then skips over steps depending on
the selected section. I verified the equivalence of -a and -s to the
respective sections in the different archive formats, and no surprise here
either: they were equivalent except for the header (which has a timestamp).

If you ask pg_restore to restore a section out of an archive which doesn't
have this section, there is no error and the command just succeeds. This is
what I expected and I think it's the right thing to do, but maybe others
think that there should be a warning.

In pg_restore, pre-data cannot be run in parallel; it only runs serially.
Data and post-data can run in parallel, though. This is also what I had
expected, but it might be worth adding a note about this to the
documentation.

What I didn't like about the implementation was the two set_section()
functions; I'd prefer them to move to a file that is shared between pg_dump
and pg_restore and become one function...

Minor issues:

- {"section", required_argument, NULL, 5} in pg_dump.c is not in
  alphabetical order with the other options.

- ./pg_restore --section=foobar
  pg_restore: unknown section name "foobar")

  Note the trailing ')'; it's coming from a _(...) confusion.

- Some of the lines in the patch have trailing spaces, and in the
  documentation part tabs and spaces are mixed.

- int skip is used as bool skip in dumpDumpableObject().

Joachim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
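As an illustration of the feature under review, a minimal sketch of how the
--section option is meant to be used, assuming the pre-data/data/post-data
section names discussed above (database and file names here are
hypothetical):

pg_dump -Fc -f mydb.dump mydb
pg_restore --section=pre-data -d newdb mydb.dump         # table/type definitions; runs serially
pg_restore --section=data -j 4 -d newdb mydb.dump        # table data; may run in parallel
pg_restore --section=post-data -j 4 -d newdb mydb.dump   # indexes, constraints, triggers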
Re: [HACKERS] [PATCH] PostgreSQL fails to build with 32bit MinGW-w64
Hi, > FYI I've been testing with the attached patch. > We'll need to construct a configure test for HAVE_CRTDEFS_H. Isn't it enough to add the name in configure.in and run autoconf to update configure and autoheaders to update pg_config.h.in? The check of win32 before large file perhaps should also go to configure.in, otherwise they would be wiped with next autoconf. The patch recreated with removing the #undef but adding the conditional to skip AC_SYS_LARGEFILE in configure.in and update configure by autoconf. diff --git a/config/ac_func_accept_argtypes.m4 b/config/ac_func_accept_argtypes.m4 index 1e77179..a82788d 100644 --- a/config/ac_func_accept_argtypes.m4 +++ b/config/ac_func_accept_argtypes.m4 @@ -46,7 +46,7 @@ AC_DEFUN([AC_FUNC_ACCEPT_ARGTYPES], [AC_CACHE_VAL(ac_cv_func_accept_arg1,dnl [AC_CACHE_VAL(ac_cv_func_accept_arg2,dnl [AC_CACHE_VAL(ac_cv_func_accept_arg3,dnl -[for ac_cv_func_accept_return in 'int' 'unsigned int PASCAL' 'SOCKET'; do +[for ac_cv_func_accept_return in 'int' 'unsigned int PASCAL' 'SOCKET WSAAPI'; do for ac_cv_func_accept_arg1 in 'int' 'unsigned int' 'SOCKET'; do for ac_cv_func_accept_arg2 in 'struct sockaddr *' 'const struct sockaddr *' 'void *'; do for ac_cv_func_accept_arg3 in 'int' 'size_t' 'socklen_t' 'unsigned int' 'void'; do diff --git a/configure b/configure index de9ba5a..dfdd034 100755 --- a/configure +++ b/configure @@ -10056,7 +10056,8 @@ done -for ac_header in crypt.h dld.h fp_class.h getopt.h ieeefp.h ifaddrs.h langinfo.h poll.h pwd.h sys/ioctl.h sys/ipc.h sys/poll.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/shm.h sys/socket.h sys/sockio.h sys/tas.h sys/time.h sys/ucred.h sys/un.h termios.h ucred.h utime.h wchar.h wctype.h kernel/OS.h kernel/image.h SupportDefs.h + +for ac_header in crypt.h dld.h fp_class.h getopt.h ieeefp.h ifaddrs.h langinfo.h poll.h pwd.h sys/ioctl.h sys/ipc.h sys/poll.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/shm.h sys/socket.h sys/sockio.h sys/tas.h sys/time.h sys/ucred.h sys/un.h termios.h ucred.h utime.h wchar.h wctype.h kernel/OS.h kernel/image.h SupportDefs.h crtdefs.h do as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh` if { as_var=$as_ac_Header; eval "test \"\${$as_var+set}\" = set"; }; then @@ -17997,6 +17998,7 @@ fi # compiler characteristic, but you'd be wrong. We must check this before # probing existence of related functions such as fseeko, since the largefile # defines can affect what is generated for that. +if test "$PORTNAME" != "win32" ; then # Check whether --enable-largefile was given. 
if test "${enable_largefile+set}" = set; then enableval=$enable_largefile; @@ -18353,7 +18355,7 @@ rm -rf conftest* fi fi - +fi # Check for largefile support (must be after AC_SYS_LARGEFILE) # The cast to long int works around a bug in the HP C Compiler # version HP92453-01 B.11.11.23709.GP, which incorrectly rejects @@ -18808,7 +18810,7 @@ else if test "${ac_cv_func_accept_arg3+set}" = set; then $as_echo_n "(cached) " >&6 else - for ac_cv_func_accept_return in 'int' 'unsigned int PASCAL' 'SOCKET'; do + for ac_cv_func_accept_return in 'int' 'unsigned int PASCAL' 'SOCKET WSAAPI'; do for ac_cv_func_accept_arg1 in 'int' 'unsigned int' 'SOCKET'; do for ac_cv_func_accept_arg2 in 'struct sockaddr *' 'const struct sockaddr *' 'void *'; do for ac_cv_func_accept_arg3 in 'int' 'size_t' 'socklen_t' 'unsigned int' 'void'; do diff --git a/configure.in b/configure.in index 5591b93..79180df 100644 --- a/configure.in +++ b/configure.in @@ -985,7 +985,7 @@ AC_SUBST(OSSP_UUID_LIBS) ## dnl sys/socket.h is required by AC_FUNC_ACCEPT_ARGTYPES -AC_CHECK_HEADERS([crypt.h dld.h fp_class.h getopt.h ieeefp.h ifaddrs.h langinfo.h poll.h pwd.h sys/ioctl.h sys/ipc.h sys/poll.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/shm.h sys/socket.h sys/sockio.h sys/tas.h sys/time.h sys/ucred.h sys/un.h termios.h ucred.h utime.h wchar.h wctype.h kernel/OS.h kernel/image.h SupportDefs.h]) +AC_CHECK_HEADERS([crypt.h dld.h fp_class.h getopt.h ieeefp.h ifaddrs.h langinfo.h poll.h pwd.h sys/ioctl.h sys/ipc.h sys/poll.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/shm.h sys/socket.h sys/sockio.h sys/tas.h sys/time.h sys/ucred.h sys/un.h termios.h ucred.h utime.h wchar.h wctype.h kernel/OS.h kernel/image.h SupportDefs.h crtdefs.h]) # On BSD, cpp test for net/if.h will fail unless sys/socket.h # is included first. @@ -1174,8 +1174,9 @@ fi # compiler characteristic, but you'd be wrong. We must check this before # probing existence of related functions such as fseeko, since the largefile # defines can affect what is generated for that. +if test "$PORTNAME" != "win32" ; then AC_SYS_LARGEFILE - +fi # Check for largefile support (must be after AC_SYS_LARGEFILE) AC_CHECK_SIZEOF([off_t]) diff --git a/src/include/c.h b/src/include/c.h index 0391860..14f6443 100644 --- a/src/include/c.h ++
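As an aside, the regeneration step referred to above is roughly the following
(a sketch, assuming crtdefs.h has already been added to the AC_CHECK_HEADERS
list in configure.in as in the patch):

autoconf     # regenerates ./configure from configure.in
autoheader   # regenerates src/include/pg_config.h.in, where HAVE_CRTDEFS_H would be declared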
Re: [HACKERS] [v9.2] Fix Leaky View Problem
2011/12/8 Robert Haas :
> On Sat, Dec 3, 2011 at 3:19 AM, Kohei KaiGai wrote:
>> I rebased my patch set. New functions in pg_proc.h prevented to apply
>> previous revision cleanly. Here is no functional changes.
>
> I was thinking that my version of this (attached to an email from
> earlier today) might be about ready to commit. But while I was
> trolling through the archives on this problem trying to figure out who
> to credit, I found an old complaint of Tom's that we never fixed, and
> which represents a security exposure for this patch:
>
> rhaas=# create table foo (a integer);
> CREATE TABLE
> rhaas=# insert into foo select generate_series(1,10);
> INSERT 0 10
> rhaas=# insert into foo values (1);
> INSERT 0 1
> rhaas=# analyze foo;
> ANALYZE
> rhaas=# create view safe_foo with (security_barrier) as select * from
> foo where a > 5;
> CREATE VIEW
> rhaas=# grant select on safe_foo to bob;
> GRANT
>
> Secure in the knowledge that Bob will only be able to see rows where a
> is 6 or higher, we go to bed. But Bob finds a way to outsmart us:
>
> rhaas=> create or replace function leak(integer,integer) returns
> boolean as $$begin raise notice 'leak % %', $1, $2; return false;
> end$$ language plpgsql;
> CREATE FUNCTION
> rhaas=> create operator !! (procedure = leak, leftarg = integer,
> rightarg = integer, restrict = eqsel);
> CREATE OPERATOR
> rhaas=> explain select * from safe_foo where a !! 0;
> NOTICE: leak 1 0
>                          QUERY PLAN
> --------------------------------------------------------------
>  Subquery Scan on safe_foo (cost=0.00..2.70 rows=1 width=4)
>    Filter: (safe_foo.a !! 0)
>    -> Seq Scan on foo (cost=0.00..1.14 rows=6 width=4)
>         Filter: (a > 5)
> (4 rows)
>
> OOPS. The *executor* has been persuaded not to apply the
> possibly-nefarious operator !! to the data until after applying the
> security-critical qual "a > 5". But the *planner* has no such
> compunctions, and has cheerfully leaked the most common value in the
> table, which the user wasn't supposed to see. I guess it's hopeless
> to suppose that we're going to completely conceal the list of MCVs
> from the user, since it might change the plan - and even if
> ProcessUtility_hook or somesuch is used to disable EXPLAIN, the user
> can still try to ferret out the MCVs via a timing attack. That having
> been said, the above behavior doesn't sit well with me: letting the
> user probe for MCVs via a timing attack or a plan change is one thing;
> printing them out on request is a little bit too convenient for my
> taste. :-(

Sorry, I missed this scenario, and have not investigated this code path in
detail yet.

My first impression reminds me of an idea that I proposed before, even
though it got a negative response because of the user-visible change: require
superuser privilege to create new operators, since we assume the superuser
does not set up a harmful configuration. I still think it is a workable
idea. Or maybe we can adopt a somewhat weaker restriction: functions used by
operators must have the "leakproof" property.

Is it worthwhile to have that discussion again?

Thanks,
-- 
KaiGai Kohei

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
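As an illustration of the weaker restriction mentioned above, it might look
roughly like this at the SQL level; the LEAKPROOF spelling and its exact
semantics are illustrative assumptions here, not something this patch
implements:

-- Marking a function leakproof would require superuser and would be a
-- promise that the function reveals nothing about its arguments (no
-- errors, notices, or timing that depend on the input values).
create function int4_safe_eq(integer, integer) returns boolean
    as 'select $1 = $2' language sql immutable leakproof;

-- Under such a rule, only operators backed by leakproof functions would be
-- evaluated (or used for selectivity estimation) beneath the quals of a
-- view declared WITH (security_barrier), so the leak()/!! trick above
-- could no longer fire before "a > 5" is applied.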
[HACKERS] Note about bogus pg_amop entries for commutator operators
While reviewing the SP-Gist patch I noticed an error that I've seen before,
notably in the rangetypes patch: cross-type operators should not be included
in an opclass merely because their commutators are. To be useful in an index
opclass, an operator has to take the indexed column's datatype as its *left*
input, because index scan conditions are always "indexed_column OP
comparison_value". So, for example, if you've got a GiST opclass on points,
it might be useful to include "point <@ box" in that opclass, but there is no
reason to include "box @> point" in it. (If the operators are marked as
commutators, the planner can figure out what to do with "box @>
indexed_point_col", but this is not dependent on any opclass entries.)

Both of the aforementioned patches contained not only useless pg_amop
entries, but utterly broken "support" code that would have crashed if it ever
could have been reached, because it made the wrong assumptions about which
input would be on which side of the index clause. So it finally occurred to
me to add an opr_sanity test case that accounts for this, and lo and behold,
it found three similarly bogus entries that were already in the code:

***************
*** 1077,1082 ****
--- 1092,1115 ----
  ------------+---------+-----
  (0 rows)
  
+ -- Check that each operator listed in pg_amop has an associated opclass,
+ -- that is one whose opcintype matches oprleft (possibly by coercion).
+ -- Otherwise the operator is useless because it cannot be matched to an index.
+ -- (In principle it could be useful to list such operators in multiple-datatype
+ -- btree opfamilies, but in practice you'd expect there to be an opclass for
+ -- every datatype the family knows about.)
+ SELECT p1.amopfamily, p1.amopstrategy, p1.amopopr
+ FROM pg_amop AS p1
+ WHERE NOT EXISTS(SELECT 1 FROM pg_opclass AS p2
+                  WHERE p2.opcfamily = p1.amopfamily
+                    AND binary_coercible(p2.opcintype, p1.amoplefttype));
+  amopfamily | amopstrategy | amopopr 
+ ------------+--------------+---------
+        1029 |           27 |     433
+        1029 |           47 |     757
+        1029 |           67 |     759
+ (3 rows)
+ 
  -- Operators that are primary members of opclasses must be immutable (else
  -- it suggests that the index ordering isn't fixed). Operators that are
  -- cross-type members need only be stable, since they are just shorthands

(Those are in the gist point_ops opclass, if you were wondering.)

I'm planning to leave it like this for the moment in the spgist patch, and
then go back and clean up the useless point_ops entries and their underlying
dead code in a separate commit.

Just FYI.

			regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
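To make the left-input rule concrete, a small hypothetical example (the table
and index are made up; the operators are the point/box ones mentioned above):

create table pts (p point);
create index pts_p_idx on pts using gist (p);

-- Canonical indexable form: indexed_column OP comparison_value
select * from pts where p <@ box '((0,0),(1,1))';

-- Written the other way around, the clause can still be matched to the
-- index because the planner commutes "box @> p" into "p <@ box" using the
-- operators' commutator links; no "box @> point" entry in point_ops is
-- needed (or useful) for this to happen.
select * from pts where box '((0,0),(1,1))' @> p;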
Re: [HACKERS] fix for pg_upgrade
OK, works now with the recent update. Thanks -- View this message in context: http://postgresql.1045698.n5.nabble.com/fix-for-pg-upgrade-tp3411128p5059777.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] PostgreSQL fails to build with 32bit MinGW-w64
On 12/05/2011 06:27 PM, Andrew Dunstan wrote: $ cat regression.diffs *** C:/MinGW/msys/1.0/home/pgrunner/bf/root32/HEAD/pgsql/src/test/regress/expected/float8-exp-three-digits-win32.out Fri Nov 25 14:24:49 2011 --- C:/MinGW/msys/1.0/home/pgrunner/bf/root32/HEAD/pgsql/src/test/regress/results/float8.out Mon Dec 5 18:17:36 2011 *** *** 382,388 SET f1 = FLOAT8_TBL.f1 * '-1' WHERE FLOAT8_TBL.f1 > '0.0'; SELECT '' AS bad, f.f1 * '1e200' from FLOAT8_TBL f; ! ERROR: value out of range: overflow SELECT '' AS bad, f.f1 ^ '1e200' from FLOAT8_TBL f; ERROR: value out of range: overflow SELECT 0 ^ 0 + 0 ^ 1 + 0 ^ 0.0 + 0 ^ 0.5; --- 382,396 SET f1 = FLOAT8_TBL.f1 * '-1' WHERE FLOAT8_TBL.f1 > '0.0'; SELECT '' AS bad, f.f1 * '1e200' from FLOAT8_TBL f; ! bad | ?column? ! -+-- ! |0 ! | -3.484e+201 ! | -1.0043e+203 ! |-Infinity ! | -1.2345678901234 ! (5 rows) ! SELECT '' AS bad, f.f1 ^ '1e200' from FLOAT8_TBL f; ERROR: value out of range: overflow SELECT 0 ^ 0 + 0 ^ 1 + 0 ^ 0.0 + 0 ^ 0.5; == This is apparently an optimization bug in the compiler. If I turn optimization off (CFLAGS=-O0) it goes away. Ick. So at the moment I'm a bit blocked. I can't really file a bug because the compiler can't currently be used to build postgres, I don't have time to construct a self-contained test case, and I don't want to commit changes to enable the compiler until the issue is solved. FYI I've been testing with the attached patch. We'll need to construct a configure test for HAVE_CRTDEFS_H. cheers andrew diff --git a/config/ac_func_accept_argtypes.m4 b/config/ac_func_accept_argtypes.m4 index 1e77179..a82788d 100644 --- a/config/ac_func_accept_argtypes.m4 +++ b/config/ac_func_accept_argtypes.m4 @@ -46,7 +46,7 @@ AC_DEFUN([AC_FUNC_ACCEPT_ARGTYPES], [AC_CACHE_VAL(ac_cv_func_accept_arg1,dnl [AC_CACHE_VAL(ac_cv_func_accept_arg2,dnl [AC_CACHE_VAL(ac_cv_func_accept_arg3,dnl -[for ac_cv_func_accept_return in 'int' 'unsigned int PASCAL' 'SOCKET'; do +[for ac_cv_func_accept_return in 'int' 'unsigned int PASCAL' 'SOCKET WSAAPI'; do for ac_cv_func_accept_arg1 in 'int' 'unsigned int' 'SOCKET'; do for ac_cv_func_accept_arg2 in 'struct sockaddr *' 'const struct sockaddr *' 'void *'; do for ac_cv_func_accept_arg3 in 'int' 'size_t' 'socklen_t' 'unsigned int' 'void'; do diff --git a/configure b/configure index de9ba5a..fbafa4b 100755 --- a/configure +++ b/configure @@ -17998,6 +17998,7 @@ fi # probing existence of related functions such as fseeko, since the largefile # defines can affect what is generated for that. # Check whether --enable-largefile was given. 
+if test "$PORTNAME" != "win32" ; then if test "${enable_largefile+set}" = set; then enableval=$enable_largefile; fi @@ -18352,6 +18353,7 @@ esac rm -rf conftest* fi fi +fi # Check for largefile support (must be after AC_SYS_LARGEFILE) @@ -18808,7 +18810,7 @@ else if test "${ac_cv_func_accept_arg3+set}" = set; then $as_echo_n "(cached) " >&6 else - for ac_cv_func_accept_return in 'int' 'unsigned int PASCAL' 'SOCKET'; do + for ac_cv_func_accept_return in 'int' 'unsigned int PASCAL' 'SOCKET WSAAPI'; do for ac_cv_func_accept_arg1 in 'int' 'unsigned int' 'SOCKET'; do for ac_cv_func_accept_arg2 in 'struct sockaddr *' 'const struct sockaddr *' 'void *'; do for ac_cv_func_accept_arg3 in 'int' 'size_t' 'socklen_t' 'unsigned int' 'void'; do diff --git a/src/include/c.h b/src/include/c.h index 0391860..cb9b150 100644 --- a/src/include/c.h +++ b/src/include/c.h @@ -58,7 +58,8 @@ #endif #include "postgres_ext.h" -#if _MSC_VER >= 1400 || defined(WIN64) +#define HAVE_CRTDEFS_H 1 +#if _MSC_VER >= 1400 || defined(HAVE_CRTDEFS_H) #define errcode __msvc_errcode #include #undef errcode diff --git a/src/include/port/win32.h b/src/include/port/win32.h index 34f4004..49da684 100644 --- a/src/include/port/win32.h +++ b/src/include/port/win32.h @@ -31,7 +31,7 @@ * The Mingw64 headers choke if this is already defined - they * define it themselves. */ -#if !defined(WIN64) || defined(WIN32_ONLY_COMPILER) +#if !defined(__MINGW64_VERSION_MAJOR) || defined(WIN32_ONLY_COMPILER) #define _WINSOCKAPI_ #endif #include @@ -225,9 +225,13 @@ int setitimer(int which, const struct itimerval * value, struct itimerval * ov #define fseeko(stream, offset, origin) _fseeki64(stream, offset, origin) #define ftello(stream) _ftelli64(stream) #else +#ifndef fseeko #define fseeko(stream, offset, origin) fseeko64(stream, offset, origin) +#endif +#ifndef ftello #define ftello(stream) ftello64(stream) #endif +#endif /* * Supplement to . @@ -264,6 +268,7 @@ typedef int p
Re: [HACKERS] pg_dump --exclude-table-data
On 12/08/2011 11:13 AM, Robert Haas wrote:
> On Wed, Dec 7, 2011 at 10:19 PM, Andrew Dunstan wrote:
>>> I'm also a bit concerned about the relationship between this and the
>>> existing -s option. It seems odd that you use --schema-only to get
>>> the behavior database-wide, and --exclude-table-data to get it for
>>> just one table. Is there some way we can make that a bit more
>>> consistent?
>>
>> It's consistent, and was designed to be, with the --exclude-table option.
>> I'm not sure what you want it to look like instead. But TBH I'm more
>> interested in getting the functionality than in how it's spelled.
>
> Ah, hmm. Well, maybe it's fine the way that you have it. But I'd be
> tempted to at least add a sentence to the sgml documentation for each
> option referring to the other, e.g. "To dump only schema for all tables
> in the database, see --schema-only." and "To exclude table data for only
> a subset of tables in the database, see --exclude-table-data.", or
> something along those lines.

Sure, no problem with that.

cheers

andrew

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump --exclude-table-data
On Wed, Dec 7, 2011 at 10:19 PM, Andrew Dunstan wrote:
>> I'm also a bit concerned about the relationship between this and the
>> existing -s option. It seems odd that you use --schema-only to get
>> the behavior database-wide, and --exclude-table-data to get it for
>> just one table. Is there some way we can make that a bit more
>> consistent?
>
> It's consistent, and was designed to be, with the --exclude-table option.
> I'm not sure what you want it to look like instead. But TBH I'm more
> interested in getting the functionality than in how it's spelled.

Ah, hmm. Well, maybe it's fine the way that you have it. But I'd be
tempted to at least add a sentence to the sgml documentation for each
option referring to the other, e.g. "To dump only schema for all tables in
the database, see --schema-only." and "To exclude table data for only a
subset of tables in the database, see --exclude-table-data.", or something
along those lines.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
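For reference, the difference in scope between the two options looks like
this on the command line (table and database names are hypothetical):

pg_dump --exclude-table-data=audit_log -f mydb.sql mydb    # everything, except the data of audit_log
pg_dump --schema-only -f mydb_schema.sql mydb              # schema only, for every table in the database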
Re: [HACKERS] Lots of FSM-related fragility in transaction commit
Heikki Linnakangas writes:
> On 08.12.2011 08:20, Tom Lane wrote:
>> So this is really a whole lot worse than our behavior was in pre-FSM
>> days, and it needs to get fixed.

> This bug was actually introduced only recently. Notice how the log says
> "consistent recovery state reached at 0/5D71BA8". This interacts badly
> with Fujii's patch I committed last week:

You're right, I was testing on HEAD not 9.1.x.

> That was harmless until last week, because reachedMinRecoveryPoint was
> not used for anything unless you're doing archive recovery and hot
> standby was enabled, but IMO the "consistent recovery state reached" log
> message was misleading even then. I propose that we apply the attached
> patch to master and backbranches.

Looks sane to me, though I've not tested to see what effect it has on the
case I was testing.

			regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow substitute allocators for PGresult.
Hello, The documentation had slipped my mind. This is the patch to add the documentation of PGresult custom storage. It shows in section '31.19. Alternative result storage'. regards, -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 252ff8c..dc2acb6 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -7229,6 +7229,325 @@ int PQisthreadsafe(); + + Alternative result storage + + + PGresult + PGconn + + + + As the standard usage, users can get the result of command + execution from PGresult aquired + with PGgetResult + from PGConn. While the memory areas for + the PGresult are allocated with malloc() internally within calls of + command execution functions such as PQexec + and PQgetResult. If you have difficulties to + handle the result records in the form of PGresult, you can instruct + PGconn to store them into your own storage instead of PGresult. + + + + + + PQregisterTupleAdder + + PQregisterTupleAdder + + + + + + Sets a function to allocate memory for each tuple and column + values, and add the completed tuple into your storage. + +void PQregisterTupleAdder(PGconn *conn, + addTupleFunction func, + void *param); + + + + + + + conn + + + The connection object to set the tuple adder + function. PGresult created from this connection calles + this function to store the result tuples instead of + storing into its internal storage. + + + + + func + + + Tuple adder function to set. NULL means to use the + default storage. + + + + + param + + + A pointer to contextual parameter passed + to func. + + + + + + + + + + + + + addTupleFunction + + addTupleFunction + + + + + + The type for the callback function to serve memory blocks for + each tuple and its column values, and to add the constructed + tuple into your own storage. + +typedef enum +{ + ADDTUP_ALLOC_TEXT, + ADDTUP_ALLOC_BINARY, + ADDTUP_ADD_TUPLE +} AddTupFunc; + +void *(*addTupleFunction)(PGresult *res, + AddTupFunc func, + int id, + size_t size); + + + + + Generally this function must return NULL for failure and should + set the error message + with PGsetAddTupleErrMes if the cause is + other than out of memory. This funcion must not throw any + exception. This function is called in the sequence following. + + + + Call with func + = ADDTUP_ALLOC_BINARY + and id = -1 to request the memory + for tuple used as an array + of PGresAttValue + + + Call with func + = ADDTUP_ALLOC_TEXT + or ADDTUP_ALLOC_TEXT + and id is zero or positive number + to request the memory for each column value in current + tuple. + + + Call with func + = ADDTUP_ADD_TUPLE to request the + constructed tuple to store. + + + + + Calling addTupleFunction + with func = + ADDTUP_ALLOC_TEXT is telling to return a +memory block with at least size bytes +which may not be aligned to the word boundary. + id is a zero or positive number + distinguishes the usage of requested memory block, that is the + position of the column for which the memory block is used. + + + When func + = ADDTUP_ALLOC_BINARY, this function is + telled to return a memory block with at + least size bytes which is aligned to the + word boundary. + id is the identifier distinguishes the + usage of requested memory block. -1 means that it is used as an + array of PGresAttValue to store the tuple. Zero or + positive numbers have the same meanings as for + ADDTUP_ALLOC_BINARY. + + When func + = ADDTUP_ADD_TUPLE, this function is + telled to store the PGresAttValue structure + constructed by the caller into your storage. 
The pointer to the + tuple structure is not passed so you should memorize the + pointer to the memory block passed the caller on + func + = ADDTUP_ALLOC_BINARY + with id is -1. This function must return + any non-NULL values for success. You must properly put back the + memory blocks passed to the caller for this function if needed. + + + + res + + + A pointer to the PGresult object. + + + + + func + + + An enum value telling the function to perform. + + + + + param + + + A pointer to contextual parameter passed to func. + + + + + + + + + +
Re: [HACKERS] Lots of FSM-related fragility in transaction commit
On 08.12.2011 08:20, Tom Lane wrote: I thought that removing the unreadable file would let me restart, but after "rm 52860_fsm" and trying again to start the server, there's a different problem: LOG: database system was interrupted while in recovery at 2011-12-08 00:56:11 EST HINT: This probably means that some data is corrupted and you will have to use the last backup for recovery. LOG: database system was not properly shut down; automatic recovery in progress LOG: consistent recovery state reached at 0/5D71BA8 LOG: redo starts at 0/5100050 WARNING: page 18 of relation base/27666/52860 is uninitialized CONTEXT: xlog redo visible: rel 1663/27666/52860; blk 18 PANIC: WAL contains references to invalid pages CONTEXT: xlog redo visible: rel 1663/27666/52860; blk 18 LOG: startup process (PID 14507) was terminated by signal 6 LOG: aborting startup due to startup process failure Note that this isn't even the same symptom Shraibman hit, since apparently he was failing on replaying the commit record. How is it that the main table file managed to have uninitialized pages? I suspect that "redo visible" WAL processing is breaking one or more of the fundamental WAL rules, perhaps by not generating a full-page image when it needs to. So this is really a whole lot worse than our behavior was in pre-FSM days, and it needs to get fixed. This bug was actually introduced only recently. Notice how the log says "consistent recovery state reached at 0/5D71BA8". This interacts badly with Fujii's patch I committed last week: commit 1e616f639156b2431725f7823c999486ca46c1ea Author: Heikki Linnakangas Date: Fri Dec 2 10:49:54 2011 +0200 During recovery, if we reach consistent state and still have entries in the invalid-page hash table, PANIC immediately. Immediate PANIC is much better than waiting for end-of-recovery, which is what we did before, because the end-of-recovery might not come until months later if this is a standby server. ... Recovery thinks it has reached consistency early on, so it PANICs as soon as it sees a reference to a page that belongs to a table that was later dropped. The bug here is that we consider having immediately reached consistency during crash recovery. That's a false notion: during crash recovery the database isn't consistent until *all* WAL has been replayed. We should not set reachedMinRecoveryPoint during crash recovery at all. And perhaps the flag should be renamed to reachedConsistency, to make it clear that during crash recovery it's not enough to reach the nominal minRecoveryPoint. That was harmless until last week, because reachedMinRecoveryPoint was not used for anything unless you're doing archive recovery and hot standby was enabled, but IMO the "consistent recovery state reached" log message was misleading even then. I propose that we apply the attached patch to master and backbranches. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 9bec660..9d96044 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -562,7 +562,13 @@ static TimeLineID lastPageTLI = 0; static XLogRecPtr minRecoveryPoint; /* local copy of * ControlFile->minRecoveryPoint */ static bool updateMinRecoveryPoint = true; -bool reachedMinRecoveryPoint = false; + +/* + * Have we reached a consistent database state? In crash recovery, we have + * to replay all the WAL, so reachedConsistency is never set. 
During archive + * recovery, the database is consistent once minRecoveryPoint is reached. + */ +bool reachedConsistency = false; static bool InRedo = false; @@ -6894,9 +6900,16 @@ static void CheckRecoveryConsistency(void) { /* + * During crash recovery, we don't reach a consistent state until we've + * replayed all the WAL. + */ + if (XLogRecPtrIsInvalid(minRecoveryPoint)) + return; + + /* * Have we passed our safe starting point? */ - if (!reachedMinRecoveryPoint && + if (!reachedConsistency && XLByteLE(minRecoveryPoint, EndRecPtr) && XLogRecPtrIsInvalid(ControlFile->backupStartPoint)) { @@ -6906,7 +6919,7 @@ CheckRecoveryConsistency(void) */ XLogCheckInvalidPages(); - reachedMinRecoveryPoint = true; + reachedConsistency = true; ereport(LOG, (errmsg("consistent recovery state reached at %X/%X", EndRecPtr.xlogid, EndRecPtr.xrecoff))); @@ -6919,7 +6932,7 @@ CheckRecoveryConsistency(void) */ if (standbyState == STANDBY_SNAPSHOT_READY && !LocalHotStandbyActive && - reachedMinRecoveryPoint && + reachedConsistency && IsUnderPostmaster) { /* use volatile pointer to prevent code rearrangement */ diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c index 350d434..b280434 100644 --- a/src/backend/access/transam/xlogutils.c +++ b/src/backend/access/transam/xlogutils.c @@ -85,7 +85,7 @@ log_invalid_pag