Re: [HACKERS] Getting the OID inside the executor
Thanks for the reply, Tom. As far as I know, the TID of a row can change over time (e.g. when the row is updated). Isn't that so? If so, it won't solve my problem. It says here (http://www.postgresql.org/docs/9.0/static/ddl-system-columns.html) that "The OID, or even better a user-defined serial number, should be used to identify logical rows.", so I thought I could use the OID. You are right that OIDs are 32-bit ints, so they will wrap around at some point. The reason I don't want to use a SERIAL user column is that I want the process to be transparent to the user.

What I want to accomplish is the following. I want to trace all accesses (read/update/insert) to rows. If a row was accessed twice, it should appear twice in the trace. Later (offline), I want to analyze the trace to find some patterns, and after deducing some results from the patterns, I should be able to access the original rows in the database. For example, if my pattern analysis says that row x is "best", I should be able to read x from the DB. Hence, a tuple should be uniquely identifiable (have the same ID) in the DB and in the trace.

What would you suggest for such a scenario? Thanks for the help.

On Thu, Sep 1, 2011 at 10:03 PM, Tom Lane wrote:
> Fazool writes:
>> I am implementing some functionality into Postgresql, where I want to
>> track which row was accessed by a user query. I am implementing the
>> functionality inside Postgres, so that there are no changes required
>> on client side (e.g. re-writing queries).
>> Rows are identified by OIDs, and I have set default_with_oids=true, so
>> that all tables are created with OIDs.
>
> It's a serious, serious error to suppose that OIDs are adequate
> identifiers for rows in user tables.
>
> We use OIDs to identify rows in system catalogs. We can get away with
> that, more or less, because (a) system catalogs are not likely to
> contain billions of rows, and (b) we place a unique index on OID on
> every system catalog that has OIDs. Neither of these statements is
> very tenable for user tables.
>
> You haven't said anything about what it is you actually need to
> accomplish here, but can you use TIDs as row identifiers? What's
> the required lifetime of the identifiers?
>
> regards, tom lane

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
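(For illustration, a minimal psql sketch of the TID instability mentioned above; the table and values are made up:)

    -- ctid is the system column that exposes a row's TID
    CREATE TABLE demo (id int, val text);
    INSERT INTO demo VALUES (1, 'a');
    SELECT ctid FROM demo WHERE id = 1;      -- e.g. (0,1)
    UPDATE demo SET val = 'b' WHERE id = 1;
    SELECT ctid FROM demo WHERE id = 1;      -- the new row version has a new TID, e.g. (0,2)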
Re: [HACKERS] Getting the OID inside the executor
Fazool writes:
> I am implementing some functionality into Postgresql, where I want to
> track which row was accessed by a user query. I am implementing the
> functionality inside Postgres, so that there are no changes required
> on client side (e.g. re-writing queries).
> Rows are identified by OIDs, and I have set default_with_oids=true, so
> that all tables are created with OIDs.

It's a serious, serious error to suppose that OIDs are adequate identifiers for rows in user tables.

We use OIDs to identify rows in system catalogs. We can get away with that, more or less, because (a) system catalogs are not likely to contain billions of rows, and (b) we place a unique index on OID on every system catalog that has OIDs. Neither of these statements is very tenable for user tables.

You haven't said anything about what it is you actually need to accomplish here, but can you use TIDs as row identifiers? What's the required lifetime of the identifiers?

regards, tom lane

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Getting the OID inside the executor
Hello everyone,

I am implementing some functionality in PostgreSQL where I want to track which row was accessed by a user query. I am implementing the functionality inside Postgres, so that there are no changes required on the client side (e.g. re-writing queries). Rows are identified by OIDs, and I have set default_with_oids=true, so that all tables are created with OIDs.

Now, when I run a 'select * from my_table', I can easily get the OID inside the executor, because I get a 'HeapTuple' from ExecScan(). I call HeapTupleGetOid() on the heap tuple and I have the OID. The problem is that when I run a query with a projection, e.g. 'select name from my_table', the scanner returns a null HeapTuple. How can I get the OID in such a case?

What would be the best way of doing this? Should I modify the planner to add 'get oid' as a junk filter? Would it affect something else? Any comments will be highly appreciated. Thanks!

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
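(At the SQL level the OID can still be requested explicitly alongside a projection, which is roughly what a junk-column approach would arrange internally; hypothetical table, created with OIDs:)

    CREATE TABLE my_table (name text) WITH OIDS;
    INSERT INTO my_table VALUES ('foo');
    SELECT oid, name FROM my_table;   -- oid is available even though the target list projects only "name"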
Re: [HACKERS] pg_restore --no-post-data and --post-data-only
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

> It's off topic. But I think custom format would require a major mangling
> to be able to handle a complete cluster. This isn't just a simple matter
> of programming, IMNSHO.

Oh, I meant just having it create separate custom format files for each database, as shell scripts all over the world have been doing for years. But it would be nice if it was simply built in.

- --
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201109012139
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAk5gM+oACgkQvJuQZxSWSsi+xgCfbr0q+Ilbw0JRsORLZN2pSz1r
JtcAoJaleZvW/wWtU83d9MVeOes4I6+0
=VqFQ
-----END PGP SIGNATURE-----

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] strange row number estimates in pg9.1rc1
Tom Lane wrote:
> I wrote:
> > "Sergey E. Koposov" writes:
> >> I'm seeing something weird which looks like a bug in 9.1rc1 after the
> >> upgrade 8.4->9.0->9.1 done using pg_upgrade.
>
> > Hm, I wonder what pg_upgrade left relpages/reltuples set to ...
>
> Sure enough, that's the problem. pg_upgrade leaves relpages/reltuples
> set to zero, but it also imports the visibility map pages from the old
> cluster. If the old visibility map shows the table as all-visible,
> then this happens when you try to VACUUM ANALYZE the table:
>
> 1. VACUUM doesn't process any pages, so it has no tuple density
> estimate. It leaves reltuples set to zero, but it does set relpages.
>
> 2. ANALYZE scans some part of the table. It gets a tuple density
> estimate for those pages ... but if that's only a small fraction of
> the table, it believes the zero estimate of tuple density elsewhere.
> So you get only a small update of reltuples.
>
> (The above behavior is new as of commit
> b4b6923e03f4d29636a94f6f4cc2f5cf6298b8c8, BTW.)
>
> Basically, step 1 is buggy here: if we aren't making an update to
> reltuples, we shouldn't set relpages either. Setting it nonzero
> changes the implied tuple density from "unknown" to "known zero",
> which is wrong.
>
> I'll go fix that, but I think it might be a good idea for pg_upgrade
> to think about preserving the relpages/reltuples columns ...

pg_upgrade currently only restores some oids and frozenxids. We would need to modify pg_dump --binary-upgrade mode to restore those values --- it isn't hard to do.

> PS: right now, you cannot reproduce this in a 9.0 -> HEAD upgrade,
> because of this patch:
>
> commit 00a7c9014a8fbb7388a807daeba3e0a85b49a747
> Author: Bruce Momjian
> Date: Fri Aug 19 11:20:30 2011 -0400
>
>     In pg_upgrade, don't copy visibility map files from clusters that did not
>     have crash-safe visibility maps to clusters that expect crash-safety.
>
>     Request from Robert Haas.
>
> I did reproduce it in a 9.0->9.1 test.

Right, that is expected.

-- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
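(To see the symptom Tom describes on an upgraded cluster, something along these lines works; the table name and comments are illustrative:)

    SELECT relname, relpages, reltuples FROM pg_class WHERE relname = 'mytable';
    -- right after pg_upgrade: relpages = 0, reltuples = 0
    VACUUM mytable;    -- with an all-visible VM it scans no pages, yet sets relpages
    ANALYZE mytable;   -- samples only part of the table, so reltuples stays far too low
    SELECT relname, relpages, reltuples FROM pg_class WHERE relname = 'mytable';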
Re: [HACKERS] PATCH: regular logging of checkpoint progress
I've prepared a significantly simplified version of the patch. The two main changes are (a) it does not update pg_stat_bgwriter anymore, it just prints info to the server log, and (b) a new GUC is not required, it's driven by log_checkpoints.

This version will log at least 10 'checkpoint status' lines (at 10%, 20%, 30%, ...), plus one whenever 5 seconds have elapsed since the last log message. The time is not checked for every buffer, but once every 128 buffers. So if the checkpoint is very slow you'll get a message every 5 seconds; if it's fast, you'll get 10 messages.

Tomas

diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
new file mode 100644
index 4c7cfb0..e60d304
*** a/src/backend/storage/buffer/bufmgr.c
--- b/src/backend/storage/buffer/bufmgr.c
*************** BufferSync(int flags)
*** 1175,1180 ****
--- 1175,1193 ----
  	int			num_to_write;
  	int			num_written;
  	int			mask = BM_DIRTY;
+ 
+ 	int			num_since_update;
+ 
+ 	long		curr_secs,
+ 				total_secs;
+ 	int			curr_usecs,
+ 				total_usecs;
+ 	float		curr_time,
+ 				total_time;
+ 
+ 	TimestampTz startTimestamp, lastTimestamp;
+ 
+ 	int			log_interval, check_interval;
  
  	/* Make sure we can handle the pin inside SyncOneBuffer */
  	ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
*************** BufferSync(int flags)
*** 1238,1243 ****
--- 1251,1265 ----
  	buf_id = StrategySyncStart(NULL, NULL);
  	num_to_scan = NBuffers;
  	num_written = 0;
+ 	num_since_update = 0;
+ 
+ 	startTimestamp = GetCurrentTimestamp();
+ 	lastTimestamp = startTimestamp;
+ 
+ 	/* check the progress each 128 buffers or 10% */
+ 	log_interval = (int)round(num_to_write/10);
+ 	check_interval = ((log_interval < 128) && (log_interval > 0)) ? log_interval : 128;
+ 
  	while (num_to_scan-- > 0)
  	{
  		volatile BufferDesc *bufHdr = &BufferDescriptors[buf_id];
*************** BufferSync(int flags)
*** 1261,1266 ****
--- 1283,1334 ----
  			TRACE_POSTGRESQL_BUFFER_SYNC_WRITTEN(buf_id);
  			BgWriterStats.m_buf_written_checkpoints++;
  			num_written++;
+ 			num_since_update++;
+ 
+ 			/*
+ 			 * Every time we write enough buffers (checkpoint_update_limit),
+ 			 * we log a checkpoint status message and update the bgwriter
+ 			 * stats (so that the pg_stat_bgwriter table may be updated).
+ 			 *
+ 			 * The log message contains info about total number of buffers to
+ 			 * write, how many buffers are already written, average and current
+ 			 * write speed and an estimate remaining time.
+ 			 */
+ 			if ((log_checkpoints) && (num_written % check_interval == 0)) {
+ 
+ 				TimestampDifference(lastTimestamp,
+ 									GetCurrentTimestamp(),
+ 									&curr_secs, &curr_usecs);
+ 
+ 				/* if at least 5 seconds elapsed since the last log, log */
+ 				if ((curr_secs >= 5) || (num_since_update >= log_interval)) {
+ 
+ 					TimestampDifference(startTimestamp,
+ 										GetCurrentTimestamp(),
+ 										&total_secs, &total_usecs);
+ 
+ 					curr_time = curr_secs + (float)curr_usecs / 1000000;
+ 					total_time = total_secs + (float)total_usecs / 1000000;
+ 
+ 					elog(LOG, "checkpoint status: wrote %d buffers of %d (%.1f%%) in %.1f s; "
+ 							  "average %.1f MB/s (%d buffers, %ld.%03d s), "
+ 							  "current %.1f MB/s (%d buffers, %ld.%03d s), "
+ 							  "remaining %.1f s",
+ 						 num_written, num_to_write, ((float) num_written * 100 / num_to_write)
Re: [HACKERS] sha1, sha2 functions into core?
On Wed, Aug 31, 2011 at 11:12 AM, Ross J. Reedstrom wrote:
> Hmm, this thread seems to have petered out without a conclusion. Just
> wanted to comment that there _are_ non-password storage uses for these
> digests: I use them in a context of storing large files in a bytea
> column, as a means to doing data deduplication, and avoiding pushing
> files from clients to server and back.

Yes, agreed: there is no decent content-addressing type in PostgreSQL, so one rolls their own using shas and joins; I've seen this more than once. It's a useful way to get a non-bloated index on a series of (larger than sha1) values where one only cares about the equality operator (hash indexes, as unattractive as they were before in PostgreSQL's implementation, are even less so now with streaming replication).

When the content to be addressed can be submitted from another source, anything with md5 is correctly met with suspicion. We have gone to the trouble of using pgcrypto to get sha1 access, but I know of other applications that would have preferred to use sha but settled for md5 simply because it's known to be bundled in core everywhere.

CREATE EXTENSION -- particularly if there is *any* way (is there? even with ugliness like utility statement hooks) to configure it on the provider end to not require superuser for common extensions like 'pgcrypto' -- could ablate this issue, and one could get off the hash "treadmill", including md5 -- but I think that would be a mistake. Applications need a high quality digest to enable any kind of principled content-addressing use case, and I think making that any harder than a builtin is going to negatively impact the state of things at large.

As a compromise, I'd also be happy with making CREATE EXTENSION so trivial that everyone who has that use case can get pgcrypto on any hosting provider.

-- fdr

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
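(A sketch of the content-addressing pattern being described, using pgcrypto's digest(); the table and values are made up:)

    CREATE EXTENSION pgcrypto;            -- superuser-only on most providers, which is part of the complaint
    CREATE TABLE blob (
        sha1    bytea PRIMARY KEY,        -- 20-byte digest: a small, equality-only index over large content
        content bytea NOT NULL
    );
    -- the client first asks whether the digest is already known, and only ships the bytes if not:
    SELECT 1 FROM blob WHERE sha1 = digest('file contents'::bytea, 'sha1');
    INSERT INTO blob VALUES (digest('file contents'::bytea, 'sha1'), 'file contents');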
Re: [HACKERS] pg_upgrade automatic testing
Peter Eisentraut wrote:
> On tis, 2011-08-30 at 16:25 -0400, Tom Lane wrote:
> > So I think that as given, this script is only useful for testing
> > pg_upgrade of $currentversion to $currentversion. Which is surely
> > better than no test at all, but it would not for example have caught
> > the 8.3 incompatibility that was just reported.
>
> Well, the goal was always current to current version. Cross-version
> testing is obviously important, but will be quite a bit harder.
>
> > How can we improve things here? I've toyed with the idea of
> > installing pg_regress.so so that we can refer to it relative to
> > $libdir, but that might be a bit invasive, especially if we were to
> > try to back-patch it as far as 8.3.
>
> Aside from hesitations to backpatch those sorts of changes, it would
> effectively prevent us from ever removing anything from the C libraries
> used in the regression tests, because we need to keep the symbols around
> so that the schema dump can load successfully into the new instance.
>
> I think a solution would have to be one of:
>
> 1) pg_upgrade needs a mode to cope with these situations. It can tell
> the user, I upgraded your installation, but some dynamic modules appear
> to be missing, you need to sort that out before you can put this back
> into use.
>
> 2) Design a different test schema to load into the database before
> running pg_upgrade. This would then be a one-line change in the script.

Here are the scripts I use for testing:

http://momjian.us/expire/pg_upgrade_test.tgz

-- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade automatic testing
Tom Lane wrote:
> Peter Eisentraut writes:
> > +# contrib/pg_upgrade/test.sh
> > +#
> > +# Test driver for pg_upgrade. Initializes a new database cluster,
> > +# runs the regression tests (to put in some data), runs pg_dumpall,
> > +# runs pg_upgrade, runs pg_dumpall again, compares the dumps.
>
> Hm .. my experience is that that doesn't work at all, because the
> regression tests set up assorted C functions whose implementations are
> in pg_regress.so, and it creates them with absolute path references
> to pg_regress.so. When you try to load that into another installation
> that's a different version of PG, it quite properly fails. So I think
> that as given, this script is only useful for testing pg_upgrade of
> $currentversion to $currentversion. Which is surely better than no test

Reminder --- you can't use pg_upgrade to go from the same catalog version to the same catalog version, because the catalog version is embedded in the tablespace directory name.

-- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] postgresql.conf archive_command example
On Thu, Sep 1, 2011 at 3:05 PM, Josh Berkus wrote:
> +1
>
> I'm also wondering if providing some shell script examples of a
> fault-tolerant script to handle archiving would be useful.

I think it would.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] postgresql.conf archive_command example
> Seems like it would be better to fix archive_mode so that it can be
> changed without a restart.

+1

I'm also wondering if providing some shell script examples of a fault-tolerant script to handle archiving would be useful.

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] "stored procedures"
On Thu, Sep 1, 2011 at 1:18 PM, Josh Berkus wrote:
> On 8/31/11 12:15 PM, Merlin Moncure wrote:
>> An out of process, autonomous transaction type implementation should
>> probably not sit under stored procedures for a number of reasons --
>> mainly that it's going to expose too many implementation details to
>> the user. For example, does a SP heavy app have 2*N running
>> processes? Or do we slot them into a defined number of backends for
>> that purpose? Yuck & yuck. I like the AT feature, and kludge it
>> frequently via dblink, but it's a solution for a different set of
>> problems.
>
> I think that transaction control without parallelism would be the 80%
> solution. That is, an SP has transaction control, but those
> transactions are strictly serial, and cannot be run in parallel. For
> example, if you were writing an SP in PL/pgSQL, each "BEGIN ... END"
> block would be an explicit transaction, and standalone-only statements
> be allowed between BEGIN ... END blocks, or possibly in their own
> special block type (I prefer the latter).
>
> One issue we'd need to deal with is exception control around
> single-statement transactions and non-transactional statements (VACUUM,
> CREATE INDEX CONCURRENTLY, CHECKPOINT, etc.). In some cases, the user
> is going to want to catch exceptions and abort the SP, and in other
> cases ignore them, so both need to be possible.

Yep, "+1" on that. Leaving out parallelism, and having the mechanism operate under the auspices of a single connection, makes a fine start, and perhaps is enough even in the longer run.

--
When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?"

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why buildfarm member anchovy is failing on 8.2 and 8.3 branches
I wrote:
> What *does* seem feasible is to back-port just the single change we
> actually need, by copying the two relevant macros into one of our
> config/ source files for the configure script. I've tested that in
> 8.3 and it seems to work --- at least, the generated configure script
> changes in the expected way. This also seems like a reasonably sane
> thing to back-port to 8.2. So I'll go ahead and commit those things
> and see if anchovy likes it.

So the upshot is that that fixed the 8.3 build, but anchovy is still failing on 8.2, with some different errors:

/usr/bin/ld.gold: /tmp/ccn7RPJJ.ltrans0.ltrans.o: in function base_yyparse:y.tab.c:12777: error: undefined reference to 'filtered_base_yylex'
/usr/bin/ld.gold: /tmp/ccn7RPJJ.ltrans0.ltrans.o: in function base_yyparse:gram.y:494: error: undefined reference to 'parsetree'
/usr/bin/ld.gold: /tmp/ccn7RPJJ.ltrans7.ltrans.o: in function parseTypeString:parse_type.c:445: error: undefined reference to 'raw_parser'
/usr/bin/ld.gold: /tmp/ccn7RPJJ.ltrans19.ltrans.o: in function simplify_function.128434.2836:postgres.c:544: error: undefined reference to 'raw_parser'
/usr/bin/ld.gold: /tmp/ccn7RPJJ.ltrans19.ltrans.o: in function pg_parse_and_rewrite:postgres.c:544: error: undefined reference to 'raw_parser'
/usr/bin/ld.gold: /tmp/ccn7RPJJ.ltrans19.ltrans.o: in function fmgr_sql_validator:postgres.c:544: error: undefined reference to 'raw_parser'
collect2: ld returned 1 exit status

I went so far as to install Arch Linux here, but I cannot duplicate the above. (Although I wonder whether my machine is really doing link-time optimization, since it doesn't generate any compiler warning messages during the link step, as anchovy is doing.) But these errors seem like they should be impossible anyway, since there is nothing platform-specific about our uses of any of the mentioned functions. I wonder if there is something messed up with anchovy's copy of REL8_2_STABLE. Marti, could I trouble you to blow away and recreate that machine's 8.2 checkout, as well as any compiler cache directories?

regards, tom lane

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] "stored procedures"
On 8/31/11 12:15 PM, Merlin Moncure wrote:
> An out of process, autonomous transaction type implementation should
> probably not sit under stored procedures for a number of reasons --
> mainly that it's going to expose too many implementation details to
> the user. For example, does a SP heavy app have 2*N running
> processes? Or do we slot them into a defined number of backends for
> that purpose? Yuck & yuck. I like the AT feature, and kludge it
> frequently via dblink, but it's a solution for a different set of
> problems.

I think that transaction control without parallelism would be the 80% solution. That is, an SP has transaction control, but those transactions are strictly serial, and cannot be run in parallel. For example, if you were writing an SP in PL/pgSQL, each "BEGIN ... END" block would be an explicit transaction, and standalone-only statements be allowed between BEGIN ... END blocks, or possibly in their own special block type (I prefer the latter).

One issue we'd need to deal with is exception control around single-statement transactions and non-transactional statements (VACUUM, CREATE INDEX CONCURRENTLY, CHECKPOINT, etc.). In some cases, the user is going to want to catch exceptions and abort the SP, and in other cases ignore them, so both need to be possible.

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
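(The dblink "kludge" Merlin mentions looks roughly like this; the connection string and statement are made up:)

    -- the statement runs over a separate connection and commits on its own,
    -- independently of the caller's transaction: a poor man's autonomous transaction
    SELECT dblink_exec('dbname=mydb', 'INSERT INTO audit_log VALUES (now(), ''event'')');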
Re: [HACKERS] Informix FDW - anybody working on this?
On Wed, Aug 31, 2011 at 8:13 AM, Bernd Helmle wrote:
> Out of curiosity,
>
> is anybody working on $subject? I'm currently planning to work on such a driver,
> but given the current stream of new drivers i want to make sure to not
> duplicate any efforts...

The most complete list I've seen of FDWs is on the wiki: http://wiki.postgresql.org/wiki/Foreign_data_wrappers

Note there is an ODBC FDW, which might work, but if you start working on an Informix-specific one, please add it there. (I say this, knowing that one of my co-workers has the outlines of a riak fdw he hasn't listed yet... guess I should go pester him).

Robert Treat conjecture: xzilla.net consulting: omniti.com

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] help with plug-in function for additional (partition/shard) visibility checks
Hello hackers,

I have the following problem to solve and would like to get advice on the best way to do it.

The problem: when growing a pl/proxy based database cluster, one of the main operations is splitting a partition. The standard flow is as follows:

1) make a copy of the partition's table(s) to another database
2) reconfigure pl/proxy to use 2 partitions instead of one

The easy part is making a copy of all or half of the table to another database. The hard part is fast deletion (I mean milliseconds, comparable to TRUNCATE) of the data that should not be in a partition (so that RUN ON ALL functions will continue to return the right results). It would be relatively easy if we still had RULES for SELECT available for plain tables, but even then the eventual cleanup would usually mean at least 3 passes of disk writes (set xmax, write the deleted flag, vacuum and remove).

What I would like to have is the possibility of additional visibility checks, which would run some simple C function over the tuple data (usually hash(fieldval) combined with ANDs and ORs) and return visibility (is in this partition) as a result. It would be best if this were run at so low a level that VACUUM would also use it and could clean up the foreign partition's data in one pass, without doing the delete dance first.

So finally the QUESTION: where in the code would be the best place to check for this, so that

1) both regular queries and VACUUM see it
2) the tuple data (and not only system fields or just xmin/xmax) would be available for the function to use

--
---
Hannu Krosing
PostgreSQL Unlimited Scalability and Performance Consultant
2ndQuadrant Nordic
PG Admin Book: http://www.2ndQuadrant.com/books/

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
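(For concreteness, the "delete dance" being avoided looks roughly like this for a two-way split; hashtext() stands in for whatever partitioning hash is configured, and the table/column names are made up:)

    -- on the new partition, delete the rows that logically stayed on the old one:
    DELETE FROM mytable WHERE hashtext(key) & 1 <> 1;
    VACUUM mytable;   -- only now is the space actually reclaimed
    -- and the mirror image on the old partition:
    --   DELETE FROM mytable WHERE hashtext(key) & 1 <> 0;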
Re: [HACKERS] Removal of useless include references
Tom Lane wrote:
> Alvaro Herrera writes:
> > Excerpts from Bruce Momjian's message of jue sep 01 11:04:33 -0300 2011:
> >> I have re-run the script and applied the result, again tested on BSD and
> >> Linux. I will monitor the buildfarm for possible failures.
>
> > I think anything of this sort should be tested on Windows too.
>
> Well, if Windows is broken we'll find out soon enough from the
> buildfarm. My recollection from the last go-round is that the pain
> points were in non-default #define options that Bruce hadn't tested
> and that no buildfarm critter exercised either, such as LOCK_DEBUG.

Ah, but this time I only removed includes for files I could compile with all #if markers removed. It only got 5.8k diff lines out of a possible 6.8k lines, but this seems like an acceptable cost for greater reliability.

-- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removal of useless include references
Alvaro Herrera wrote:
> Excerpts from Bruce Momjian's message of jue sep 01 11:04:33 -0300 2011:
> > Bruce Momjian wrote:
> > > OK, try #2. I already had code that removed #if/#else/#endif code in
> > > *.h files for better testing, so I extended that to all *.c files. This
> > > reduces the size of the diff from 6.6k lines to 4.7k lines but it makes
> > > it much less likely that there will be problems from running
> > > pgrminclude.
> > >
> > > The current patch is here:
> > >
> > > http://momjian.us/expire/pgrminclude.diff
> > >
> > > I tested the patch on BSD and Linux.
> >
> > I have re-run the script and applied the result, again tested on BSD and
> > Linux. I will monitor the buildfarm for possible failures.
>
> I think anything of this sort should be tested on Windows too.

Agreed, but we have so many Windows configurations I figured I would let the buildfarm test them, no?

-- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removal of useless include references
Alvaro Herrera writes:
> Excerpts from Bruce Momjian's message of jue sep 01 11:04:33 -0300 2011:
>> I have re-run the script and applied the result, again tested on BSD and
>> Linux. I will monitor the buildfarm for possible failures.

> I think anything of this sort should be tested on Windows too.

Well, if Windows is broken we'll find out soon enough from the buildfarm. My recollection from the last go-round is that the pain points were in non-default #define options that Bruce hadn't tested and that no buildfarm critter exercised either, such as LOCK_DEBUG.

regards, tom lane

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removal of useless include references
Excerpts from Bruce Momjian's message of jue sep 01 11:04:33 -0300 2011:
> Bruce Momjian wrote:
> > OK, try #2. I already had code that removed #if/#else/#endif code in
> > *.h files for better testing, so I extended that to all *.c files. This
> > reduces the size of the diff from 6.6k lines to 4.7k lines but it makes
> > it much less likely that there will be problems from running
> > pgrminclude.
> >
> > The current patch is here:
> >
> > http://momjian.us/expire/pgrminclude.diff
> >
> > I tested the patch on BSD and Linux.
>
> I have re-run the script and applied the result, again tested on BSD and
> Linux. I will monitor the buildfarm for possible failures.

I think anything of this sort should be tested on Windows too.

-- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removal of useless include references
Bruce Momjian wrote:
> Bruce Momjian wrote:
> > Tom Lane wrote:
> > > Bruce Momjian writes:
> > > > It has been years since I ran src/tools/pginclude/pgrminclude to remove
> > > > unnecessary include files. (I have already fixed things so include
> > > > files can be compiled on their own.)
> > > >
> > > > The attached patch removes unneeded include references, and marks some
> > > > includes as needing to be skipped by pgrminclude.
> > > >
> > > > I am sure applying this patch will break builds on some platforms and
> > > > some option combinations so I will monitor the buildfarm when I apply it
> > > > and make adjustments.
> > >
> > > The last time you did this was in July 2006. It took us two weeks to
> > > mostly recover, but we were still dealing with some fallout in December, cf
> > > http://archives.postgresql.org/pgsql-hackers/2006-12/msg00491.php
> > >
> > > We had the buildfarm then, had had it for a couple years. The notion
> > > that watching the buildfarm is enough is fully disproven by history.
> > >
> > > Unless you have a better test plan than last time (which this isn't),
> > > I don't think this should be done at all. The benefits are microscopic
> > > and the pain real.
> >
> > I don't have a better plan. There are #ifdef code blocks that often
> > don't get processed and therefore this can't be done better. I will
> > abandon the idea.
>
> OK, try #2. I already had code that removed #if/#else/#endif code in
> *.h files for better testing, so I extended that to all *.c files. This
> reduces the size of the diff from 6.6k lines to 4.7k lines but it makes
> it much less likely that there will be problems from running
> pgrminclude.
>
> The current patch is here:
>
> http://momjian.us/expire/pgrminclude.diff
>
> I tested the patch on BSD and Linux.

I have re-run the script and applied the result, again tested on BSD and Linux. I will monitor the buildfarm for possible failures. This is not something we are going to do regularly, but probably every five years like this time.

-- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] mb_regress.sh gripes
On Thu, Aug 18, 2011 at 6:19 PM, Josh Kupershmidt wrote:
> A few gripes about mb_regress.sh:
>
> 1. No exit code is specified, so even if there are differences
> between results/ and expected/ the script will still return 0.
>
> 2. The 'dropdb' command is used to wipe out the "utf8" database
> before the run. This generates an error message like:
> dropdb: database removal failed: ERROR: database "utf8" does not exist
> the first time you run the script. IMO it would be less startling to
> just print a NOTICE here.
>
> 3. No error checking for whether createdb succeeds.
>
> The attached patch fixes these problems.

Committed, with some changes. I used the new --if-exists option for dropdb rather than doing it as you had it here; I assume this may have been the motivation for that patch. I also just made the exit code 1 no matter how many failures there were. That seems more normal, and I wasn't altogether certain that $((expr)) is completely portable. I also set the execute bit on the script.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.1] sepgsql - userspace access vector cache
> On Fri, Aug 26, 2011 at 5:32 AM, Kohei KaiGai wrote:
> > Yes. It also caches an expected security label when a client being
> > labeled as "scontext" tries to execute a procedure being labeled as
> > "tcontext", to reduce number of system call invocations on fmgr_hook
> > and needs_fmgr_hook.
> > If the expected security label is not same with "scontext", it means
> > the procedure performs as a trusted procedure that switches security
> > label of the client during its execution; like a security invoker
> > function.
> > A pair of security labels are the only factor to determine whether the
> > procedure is a trusted-procedure, or not. Thus, it is suitable to
> > cache in userspace avc.
>
> I've committed this, but I still think it would be helpful to revise
> that comment. The term "boosted up" is not very precise or
> informative. Could you submit a separate, comment-only patch to
> improve this?

OK, please wait for a few days.

Thanks,
--
NEC Europe Ltd, SAP Global Competence Center
KaiGai Kohei

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] dblink make fails under postgresql 8.4.4 on mac osx 10.4.11
On Wed, Aug 31, 2011 at 9:45 AM, Gary Merkel wrote:
> Having trouble installing dblink under PostgreSQL 8.4.4 on MAC OS X 10.4.11
>
> Running make gives the following error:
>
> sed 's,MODULE_PATHNAME,$libdir/dblink,g' dblink.sql.in >dblink.sql
> gcc -no-cpp-precomp -O2 -Wall -Wmissing-prototypes -Wpointer-arith
> -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv
> -I../../src/interfaces/libpq -I. -I../../src/include -c -o dblink.o dblink.c
> dblink.c: In function 'get_pkey_attnames':
> dblink.c:1698: error: 'SnapshotNow' undeclared (first use in this function)
> dblink.c:1698: error: (Each undeclared identifier is reported only once
> dblink.c:1698: error: for each function it appears in.)
> make: *** [dblink.o] Error 1
>
> Does anyone know a fix to this?

This is hard to understand, because SnapshotNow is defined in utils/tqual.h, and dblink.c includes that file. Also, I checked out the official sources at tag REL8_4_4 and it looks to me like there's no reference to SnapshotNow on line 1698. That line for me is blank, and the only reference I see to SnapshotNow is on line 1715. So I suspect that what you've got is a modified version of the sources that is somehow messed up.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.1] sepgsql - userspace access vector cache
On Fri, Aug 26, 2011 at 5:32 AM, Kohei KaiGai wrote:
> Yes. It also caches an expected security label when a client being
> labeled as "scontext" tries to execute a procedure being labeled as
> "tcontext", to reduce number of system call invocations on fmgr_hook
> and needs_fmgr_hook.
> If the expected security label is not same with "scontext", it means
> the procedure performs as a trusted procedure that switches security
> label of the client during its execution; like a security invoker
> function.
> A pair of security labels are the only factor to determine whether the
> procedure is a trusted-procedure, or not. Thus, it is suitable to
> cache in userspace avc.

I've committed this, but I still think it would be helpful to revise that comment. The term "boosted up" is not very precise or informative. Could you submit a separate, comment-only patch to improve this?

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
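(For context, a trusted procedure is one whose label differs from the client's in the way described above; a sketch of how such a label is attached, with a made-up function name and a label string assumed to follow the sepgsql documentation's example policy:)

    SECURITY LABEL FOR selinux
        ON FUNCTION customer_credit(int)
        IS 'system_u:object_r:sepgsql_trusted_proc_exec_t:s0';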
Re: [HACKERS] WIP: SP-GiST, Space-Partitioned GiST
Hi!

I expect some problems in the support of comparison operators for text, because locale string comparison can have unexpected behaviour. Let's see an example. Create a table with words and add an extra leading space to some of them.

test=# create table dict(id serial, word text);
NOTICE: CREATE TABLE will create implicit sequence "dict_id_seq" for serial column "dict.id"
CREATE TABLE
test=# \copy dict(word) from '/usr/share/dict/american-english';
test=# update dict set word = ' '||word where id%2=0;
UPDATE 49284

I use Ubuntu 11.04 with the ru_RU.utf8 locale, so the comparison operators ignore leading spaces.

test=# select * from dict where word between 'cart' and 'cary';
  id   |      word
-------+-----------------
  3029 | Carter
  3031 | Cartesian
  3033 | Carthage's
  3035 | Cartier
  3037 | Cartwright
  3039 | Caruso
  3041 | Carver
 28419 | cart
 28421 | carted
 28423 | cartel's
 28425 | cartilage
 28427 | cartilages
 28429 | carting
 28431 | cartographer's
 28433 | cartography
 28435 | carton
 28437 | cartons
 28439 | cartoon's
 28441 | cartooning
 28443 | cartoonist's
 28445 | cartoons
 28447 | cartridge's
 28449 | carts
 28451 | cartwheel's
 28453 | cartwheeling
 28455 | carve
 28457 | carver
 28459 | carvers
 28461 | carving
 28463 | carvings
  3030 |  Carter's
  3032 |  Carthage
  3034 |  Carthaginian
  3036 |  Cartier's
  3038 |  Cartwright's
  3040 |  Caruso's
  3042 |  Carver's
 28420 |  cart's
 28422 |  cartel
 28424 |  cartels
 28426 |  cartilage's
 28428 |  cartilaginous
 28430 |  cartographer
 28432 |  cartographers
 28434 |  cartography's
 28436 |  carton's
 28438 |  cartoon
 28440 |  cartooned
 28442 |  cartoonist
 28444 |  cartoonists
 28446 |  cartridge
 28448 |  cartridges
 28450 |  cartwheel
 28452 |  cartwheeled
 28454 |  cartwheels
 28456 |  carved
 28458 |  carver's
 28460 |  carves
 28462 |  carving's
(59 rows)

But if I create an spgist index, the query result differs.

test=# create index dict_idx on dict using spgist (word);
CREATE INDEX
test=# select * from dict where word between 'cart' and 'cary';
  id   |      word
-------+-----------------
 28419 | cart
 28421 | carted
 28423 | cartel's
 28425 | cartilage
 28427 | cartilages
 28429 | carting
 28431 | cartographer's
 28433 | cartography
 28435 | carton
 28437 | cartons
 28439 | cartoon's
 28441 | cartooning
 28443 | cartoonist's
 28445 | cartoons
 28447 | cartridge's
 28449 | carts
 28451 | cartwheel's
 28453 | cartwheeling
 28455 | carve
 28457 | carver
 28459 | carvers
 28461 | carving
 28463 | carvings
(23 rows)

-- With best regards, Alexander Korotkov.
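(For comparison, 9.1's per-expression COLLATE clause shows what a plain byte-wise ordering returns for the same query; it is only an assumption that this is effectively what the index is doing, but the result matches the 23 rows above, since both the leading space and the capital letters sort below 'cart' under "C":)

    select * from dict where word COLLATE "C" between 'cart' and 'cary';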
Re: [HACKERS] WIP: SP-GiST, Space-Partitioned GiST
This is an updated SP-GiST patch, which fixes one bug and replaces the test with a locale-independent one.

On Wed, 31 Aug 2011, Oleg Bartunov wrote:

> Hi there,
>
> attached is our WIP-patch for 9.2 development source tree, which provides
> implementation of SP-GiST (prototype was presented at PGCon-2011, see
> http://www.pgcon.org/2011/schedule/events/309.en.html and presentation
> for details) as a core feature.
>
> Main differences from prototype version:
>
> 1. Now it's part of pg core, not contrib module
> 2. It provides more operations for quadtree and suffix tree
> 3. It uses clustering algorithm of nodes on disk and has much better
> utilization of disk space. Fillfactor is supported
> 4. Some corner cases were eliminated
> 5. It provides support for concurency and recovery (inserts are logged,
> supports for deletes, and log replay will be added really soon)
>
> So, now code contains almost all possible overhead of production code
> and we ask hackers to test performance on real data sets. We expect the
> same performance for random data (since almost no overlaps) and much
> better performance on real-life data, plus much better index creation
> time. Also, we appreciate your comments and suggestions about API.
>
> Regards,
> Oleg
> _
> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
> Sternberg Astronomical Institute, Moscow University, Russia
> Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
> phone: +007(495)939-16-83, +007(495)939-23-83

Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

spgist_patch-0.85.gz
Description: Binary data

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] postgresql.conf archive_command example
2011/9/1 Dimitri Fontaine :
> Peter Eisentraut writes:
>> Well, we could make initdb patch it up, but that might seem excessive.
>
> I sometime wonder if archive_mode shouldn't default to "on" with the
> archive_command set to either '/bin/true' or 'rem' for windows.
>
> That allows to install proper archiving without restart, but the
> tradeoff is of course that you need to restart to enable some
> optimisation cases by turning archive_mode off.

Seems like it would be better to fix archive_mode so that it can be changed without a restart.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Fast GiST index build
On 01.09.2011 12:23, Alexander Korotkov wrote:
> On Thu, Sep 1, 2011 at 12:59 PM, Heikki Linnakangas<
> heikki.linnakan...@enterprisedb.com> wrote:
>
>> So I changed the test script to generate the table as:
>>
>> CREATE TABLE points AS SELECT random() as x, random() as y FROM
>> generate_series(1, $NROWS);
>>
>> The unordered results are in:
>>
>> testname                    | nrows |    duration     | accesses
>> ----------------------------+-------+-----------------+----------
>> points unordered buffered   | 25000 | 05:56:58.575789 |  2241050
>> points unordered auto       | 25000 | 05:34:12.187479 |  2246420
>> points unordered unbuffered | 25000 | 04:38:48.663952 |  2244228
>>
>> Although the buffered build doesn't lose as badly as it did with more
>> overlap, it still doesn't look good :-(. Any ideas?
>
> But it's still a lot of overlap. It's about 220 accesses per small area
> request. It's about 10 - 20 times greater than should be without overlaps.

Hmm, those "accesses" numbers are actually quite bogus for this test. I changed the creation of the table as you suggested, so that all x and y values are in the range 0.0 - 1.0, but I didn't change the loop to calculate those accesses, so it still queried for boxes in the range 0 - 10.

That makes me wonder, why does it need 220 accesses on average to satisfy queries most of which lie completely outside the range of actual values in the index? I would expect such queries to just look at the root node, conclude that there can't be any matching tuples, and return immediately.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Fast GiST index build
On Thu, Sep 1, 2011 at 12:59 PM, Heikki Linnakangas <
heikki.linnakan...@enterprisedb.com> wrote:

> So I changed the test script to generate the table as:
>
> CREATE TABLE points AS SELECT random() as x, random() as y FROM
> generate_series(1, $NROWS);
>
> The unordered results are in:
>
> testname                    | nrows |    duration     | accesses
> ----------------------------+-------+-----------------+----------
> points unordered buffered   | 25000 | 05:56:58.575789 |  2241050
> points unordered auto       | 25000 | 05:34:12.187479 |  2246420
> points unordered unbuffered | 25000 | 04:38:48.663952 |  2244228
>
> Although the buffered build doesn't lose as badly as it did with more
> overlap, it still doesn't look good :-(. Any ideas?

But it's still a lot of overlap. It's about 220 accesses per small-area request, which is about 10 - 20 times greater than it should be without overlaps. If we roughly assume that 10 times more overlap makes 1/10 of the tree be used for actual inserts, then that part of the tree can easily fit in the cache.

You can try my splitting algorithm on your test setup (in this case I advise starting from a smaller number of rows, 100 M for example). I'm asking Oleg for real-life datasets which cause trouble in practice. Probably those datasets are even larger, or the new linear split produces less overlap on them.

-- With best regards, Alexander Korotkov.
Re: [HACKERS] WIP: Fast GiST index build
On 30.08.2011 13:38, Alexander Korotkov wrote:
> On Tue, Aug 30, 2011 at 1:08 PM, Heikki Linnakangas<
> heikki.linnakan...@enterprisedb.com> wrote:
>
>> Thanks. Meanwhile, I hacked together my own set of test scripts, and let
>> them run over the weekend. I'm still running tests with ordered data, but
>> here are some preliminary results:
>>
>> testname                    | nrows |    duration     | accesses
>> ----------------------------+-------+-----------------+----------
>> points unordered auto       | 25000 | 08:08:39.174956 |  3757848
>> points unordered buffered   | 25000 | 09:29:16.47012  |  4049832
>> points unordered unbuffered | 25000 | 03:48:10.999861 |  4564986
>>
>> As you can see, the results are very disappointing :-(. The buffered
>> builds take a lot *longer* than unbuffered ones. I was expecting the
>> buffering to be very helpful at least in these unordered tests. On the
>> positive side, the buffering made index quality somewhat better (accesses
>> column, smaller is better), but that's not what we're aiming at.
>>
>> What's going on here? This data set was large enough to not fit in RAM,
>> the table was about 8.5 GB in size (and I think the index is even larger
>> than that), and the box has 4GB of RAM. Does the buffering only help with
>> even larger indexes that exceed the cache size even more?
>
> This seems pretty strange for me. Time of unbuffered index build shows that
> there is not bottleneck at IO. That radically differs from my experiments.
> I'm going to try your test script on my test setup. While I have only
> express assumption that random function appears to be somewhat bad.
> Thereby unordered dataset behave like the ordered one. Can you rerun tests
> on your test setup with dataset generation on the backend like this?
>
> CREATE TABLE points AS (SELECT point(random(), random() FROM
> generate_series(1,1000));

So I changed the test script to generate the table as:

CREATE TABLE points AS SELECT random() as x, random() as y FROM generate_series(1, $NROWS);

The unordered results are in:

testname                    | nrows |    duration     | accesses
----------------------------+-------+-----------------+----------
points unordered buffered   | 25000 | 05:56:58.575789 |  2241050
points unordered auto       | 25000 | 05:34:12.187479 |  2246420
points unordered unbuffered | 25000 | 04:38:48.663952 |  2244228

Although the buffered build doesn't lose as badly as it did with more overlap, it still doesn't look good :-(. Any ideas?

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] setlocale() on Windows is broken
On 31.08.2011 16:05, Heikki Linnakangas wrote:
> While looking through old emails, I bumped into this:
> http://archives.postgresql.org/message-id/25219.1303306...@sss.pgh.pa.us
>
> To recap, setlocale() on Windows is broken for locale names that contain
> dots or apostrophes in the country name. That includes "Hong Kong S.A.R.",
> "Macau S.A.R.", and "U.A.E." and "People's Republic of China". In April, I
> put in a hack to initdb to map those problematic names to aliases that
> don't contain dots:
>
> People's Republic of China -> China
> Hong Kong S.A.R. -> HKG
> U.A.E. -> ARE
> Macau S.A.R. -> ZHM
>
> However, Hiroshi pointed out in the thread linked above that that doesn't
> completely solve the problem. If you set locale to "HKG", for example,
> setlocale(LC_ALL, NULL) still returns the full name, "Hong Kong S.A.R.",
> and if you feed that back to setlocale() it fails. In particular,
> check_locale() uses "saved = setlocale(LC_XXX, NULL)" to get the current
> value, and tries to restore it later with "setlocale(LC_XXX, saved)".
>
> At first, I thought I should revert my hack in initdb, since it's not
> fully solving the problem anyway. But it doesn't really help - you run
> into the same issue if you set locale to one of those aliases manually.
> And that's exactly what users will have to do if we don't map those
> locales automatically.
>
> Microsoft should fix their bug. I don't have much faith in that happening,
> however. So, I think we should move the mapping from initdb to somewhere
> in src/port, so that the mapping is done every time setlocale() is called.
> That would fix the problem with check_locale(): even though
> "setlocale(LC_XXX, NULL)" returns a value that won't work, the setlocale()
> call to restore it would map it to an alias that does work again.
>
> In addition to that, I think we should check the return value of
> setlocale() in check_locale(), and throw a warning if restoring the old
> locale fails. The session's locale will still be screwed, but at least
> you'll know if it happens.

I've committed a patch along those lines. It turned out to be pretty difficult to reproduce user-visible buggy behavior caused by this bug, so for the sake of the archives, here's a recipe on that:

1. Set system locale to "Chinese_Hong Kong S.A.R..950"

2. initdb -D data --locale="Arabic_ARE"

3. Launch psql.

CREATE TABLE foo (a text);
INSERT INTO foo VALUES ('a'), ('A');
-- Verify that the order is 'a', 'A'
SELECT * FROM foo ORDER BY a;
-- This fails, as it should
CREATE DATABASE postgres WITH LC_COLLATE='C' TEMPLATE=template0;
-- This also fails, as it should
CREATE DATABASE postgres WITH LC_COLLATE='C' TEMPLATE=template0;
-- The order returned by this is now wrong: 'A', 'a'
SELECT * FROM foo ORDER BY a;

It's a bizarre looking sequence, but that does it.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] postgresql.conf archive_command example
Peter Eisentraut writes:
> Well, we could make initdb patch it up, but that might seem excessive.

I sometimes wonder if archive_mode shouldn't default to "on" with the archive_command set to either '/bin/true' or 'rem' for Windows.

That allows installing proper archiving without a restart, but the tradeoff is of course that you need a restart to enable some optimisation cases by turning archive_mode off.

Regards,
-- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers