[HACKERS] About the parameter of API: PQprepared
In libpq, the definition is: PGresult *PQprepare(PGconn *conn, const char *stmtName, const char *query, int nParams, const Oid *paramTypes). Could we remove the parameter nParams? e.g. insert into foo(id, name, address) values ($1, $2, $3). PostgreSQL could presumably parse the prepared SQL statement to get the real parameter count. Or is there another way to do it?
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
Robert Haas wrote: I think you have to resign yourself to the fact that a user who can see only a subset of the rows in a table may very well see apparent foreign-key violations. But so what? So you're leaking information about the rows that they're not supposed to be able to see. This is not what I would call national-security-grade information hiding --- leastwise *I* certainly wouldn't store nuclear weapon design information in such a database. The people that the NSA wants to defend against are more than smart enough, and persistent enough, to extract information through such loopholes. If your plan is to refuse to implement row-level security until someone produces a design that can never leak information under any circumstances regardless of how the user configures it, then I strongly suspect we'll be waiting forever. The designs being put forward here are capable of being configured in an insecure fashion, but that's true of any other security mechanism we offer, too.

In my understanding, the security evaluation criteria do not require preventing this kind of information leak (called an illicit information flow, or covert channel) in all cases. ISO/IEC 15408 is a set of specifications for the security functionality of IT products. We can find its requirements for illicit information flows as follows:
- FDP_IFF.3 Limited illicit information flows, requires the SFP to cover illicit information flows, but not necessarily eliminate them.
- FDP_IFF.4 Partial elimination of illicit information flows, requires the SFP to cover the elimination of some (but not necessarily all) illicit information flows.
- FDP_IFF.5 No illicit information flows, requires the SFP to cover the elimination of all illicit information flows.
(*) FDP_IFF = Functionalities of Data Protection, Information Flow control Functions
(*) A larger trailing number means a stricter requirement.
(*) Common Criteria part 2 at http://www.commoncriteriaportal.org/files/ccfiles/CCPART2V3.1R2.pdf
At least, we can say it is not an all-or-nothing matter.

SE-PostgreSQL needs to be good enough for the NSA, but row-level security in general does not.

BTW, it seems someone misunderstands me as working as an agent of the NSA. That's a joke, isn't it? I've put my effort into improving the security of open source software, not into working for an overseas intelligence agency. Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei [EMAIL PROTECTED] -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
Well, the PGACE documentation says: http://code.google.com/p/sepgsql/wiki/WhatIsPGACE Datum pgacePreparePlanCheck(Relation rel) In the latest patch, this hook is replaced by pgaceBeginPerformCheckFK() and pgaceEndPerformCheckFK(), but its purpose is unchanged. Sorry for the confusing legacy description. I have updated it to reflect the latest patches. Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei [EMAIL PROTECTED] -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parallel pg_restore - WIP patch
Andrew Dunstan wrote: Attached is my WIP patch for parallel pg_restore. It's still very rough, but seems to work. Anyone who can test this with highend equipment would be helping some.

Tried playing with this (on a 22GB compressed dump using 4 connections) but it does not seem to work at all for me:

pg_restore: [custom archiver] out of memory
pg_restore: [custom archiver] could not uncompress data: invalid block type
pg_restore: [custom archiver] out of memory
pg_restore: [custom archiver] could not uncompress data: invalid stored block lengths
pg_restore: [custom archiver] out of memory
pg_restore: [custom archiver] out of memory
pg_restore: [custom archiver] could not uncompress data: invalid distance too far back
pg_restore: [custom archiver] could not uncompress data: invalid distances set
pg_restore: [custom archiver] could not uncompress data: invalid code lengths set
pg_restore: [custom archiver] could not uncompress data: incorrect data check
pg_restore: [custom archiver] could not uncompress data: invalid code lengths set
pg_restore: [custom archiver] out of memory
pg_restore: [custom archiver] out of memory
pg_restore: [custom archiver] out of memory
pg_restore: [custom archiver] out of memory
pg_restore: [custom archiver] could not uncompress data: invalid literal/length code
pg_restore: [custom archiver] could not uncompress data: invalid literal/length code
pg_restore: [custom archiver] could not uncompress data: invalid block type

Each pg_restore process seems to eat a few gigabytes of memory in a few seconds. Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [PATCHES] [HACKERS] Infrastructure changes for recovery
On Thu, 2008-09-25 at 18:28 -0400, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: Version 7 Anyway, that's sufficiently bad that I'm bouncing the patch for reconsideration.

No problem, I understand this needs discussion. There's less detail here than first appears. There are some basic points to consider from which all else follows.

After reading this for awhile, I realized that there is a rather fundamental problem with it: it switches into consistent recovery mode as soon as it's read WAL beyond ControlFile->minRecoveryPoint. In a crash recovery situation that typically is before the last checkpoint (if indeed it's not still zero), and what that means is that this patch will activate the bgwriter and start letting in backends instantaneously after a crash, long before we can have any certainty that the DB state really is consistent. In a normal crash recovery situation this would be easily fixed by simply not letting it go to consistent recovery state at all, but what about recovery from a restartpoint? We don't want a slave that's crashed once to never let backends in again. But I don't see how to determine that we're far enough past the restartpoint to be consistent again. In crash recovery we assume (without proof ;-)) that we're consistent once we reach the end of valid-looking WAL, but that rule doesn't help for a slave that's following a continuing WAL sequence. Perhaps something could be done based on noting when we have to pull in a WAL segment from the recovery_command, but it sounds like a pretty fragile assumption.

Seems like we just say we only signal the postmaster if InArchiveRecovery. Archive recovery from a restartpoint is still archive recovery, so this shouldn't be a problem in the way you mention. The presence of recovery.conf overrides all other cases.

Some other issues I noted before giving up:

All of these issues raised can be addressed, but I think the main decision we need to make is not so much about running other processes but about when it can start and when they have to change mode. When they can start seems solvable, as above. When/how they must change state from recovery to normal mode seems more difficult. State change must be atomic across all processes, but also done at a micro level so that XLogFlush tests for the state change. The overhead of continually checking seems high, so I am tempted to say let's just kick 'em all off and then let them back on again. That's easily accomplished for bgwriter without anybody noticing much. For Hot Standby that would mean that a failover would kick off all query backends. I can see why that would be both desirable and undesirable. Anyway, from here I propose:
* we keep the shutdown checkpoint
* we kick off bgwriter (and any children) then let 'em back on again so they can initialise in a different mode.
To do that, I just need to dust off a previous version of the patch. So we can sort this out quickly if we have a clear way to proceed. -- other comments relate to this current patch, so further discussion of the points below may not be required, if we agree how to proceed as above.

* I'm a bit uncomfortable with the fact that the IsRecoveryProcessingMode flag is read and written with no lock. There's no atomicity or concurrent-write problem, sure, but on a multiprocessor with weak memory ordering guarantees (eg PPC) readers could get a fairly stale value of the flag.
The false to true transition happens before anyone except the startup process is running, so that's no problem; the net effect is then that backends might think that recovery mode was still active for awhile after it wasn't. This seems a bit scary, eg in the patch as it stands that'll disable XLogFlush calls that should have happened. You could fix that by grabbing/releasing some spinlock (any old one) around the accesses, but are any of the call sites performance-critical? The one in XLogInsert looks like it is, if nothing else.

Agreed. It's not a dynamic state, so I can fix that inside IsRecoveryProcessingMode() with a local state to make the check faster.

bool
IsRecoveryProcessingMode(void)
{
    if (!IsRecoveryProcessingMode)
        return false;

    {
        /* use volatile pointer to prevent code rearrangement */
        volatile XLogCtlData *xlogctl = XLogCtl;

        SpinLockAcquire(&xlogctl->mode_lck);
        RecoveryProcessingMode = XLogCtl->IsRecoveryProcessingMode;
        SpinLockRelease(&xlogctl->mode_lck);
    }

    return IsRecoveryProcessingMode;
}

This only applies if we decide not to kick everybody off, change state and then let them back on again.

* I kinda think you broke XLogFlush anyway. It's certainly clear that the WARNING case at the bottom is unreachable with the patch, and I think that means that you've messed up an important error robustness behavior. Is it still possible to get out of recovery mode if
Re: [HACKERS] PostgreSQL future ideas
[EMAIL PROTECTED] (Gevik Babakhani) writes: The advantage of C++ is that it would reduce the lot of OO code written in C in PostgreSQL, but it is such a big effort to do that for a small gain. It will increase the number of bugs. Do not forget also that a C++ compiler is not so common (or so good) on different platforms. If somebody is interested in that, yes, but as a fork (PostgreSQL++ :-). Reducing OO code that is written in C is one of my major interests. After some investigating myself it appears that having the codebase fully rewritten in C++ will have an impact on the performance. So I guess such an effort will result in the code being more C++ish and fully OO, being a mixture in C with some OO taste.

I'm not convinced that it would be a good idea at all to make the system fully OO, nor that C++ would be a meaningful tool to use to that end. After all, C++ can certainly be used in decidedly non-OO ways. For instance, STL is NOT an OO framework, and the author of STL, obviously something of a fan of C++, characterizes OO as almost as much of a hoax as Artificial Intelligence. http://en.wikipedia.org/wiki/Object-oriented_programming#Criticism I tend to agree with that characterization. Further, C++ suffers from the same not-OO-at-its-base problem as Java, which contributes complexity as well as hurting the OO-ness of it.

A better idea is to start to use C99 in PostgreSQL ;-). I have not investigated this yet. But I am very interested to know what the advantages would be to upgrade the code to C99 standards.

It would give us heartburn on any platforms where the preferred compiler doesn't grok C99, for sure. As much as I'm ok with using GCC, it would seem unfortunate to force people into using GCC everywhere, and preclude using other compilers. (And actually, I'm more ambivalent about GCC than that; I'm not totally happy with how GCC has gone, but that's another tale for another day...) -- (format nil [EMAIL PROTECTED] cbbrowne acm.org) http://linuxdatabases.info/info/linux.html NT 5.0 is the last nail in the Unix coffin. Interestingly, Unix isn't in the coffin... It's wondering what the heck is sealing itself into a wooden box 6 feet underground... -- Jason McMullan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: move column defaults into pg_attribute along with attacl
Alvaro Herrera wrote: Tom Lane wrote: I think adding a subobject column to pg_shdepend is probably the best answer --- we only didn't do that to start with because we thought it wasn't needed. Yep. I did consider adding it, but there was no use for it at the time so I just left it out. It's not like it's very difficult to add it at this point. Sorry, I lost track here and would like to know the recommended course of action for ACLs. I see these issues: * make pg_attribute store VARLENA and/or NOT NULL fields * dependency tracking of defaults (and ACLs) * where to place ACLs (pg_attribute, merge with pg_attrdef or yet another pg_attracl table) So far I understood that merging pg_attrdef into pg_attribute is unwanted due to complexity at DROP TABLE. What does the subobject column for pg_shdepend buy us? Clarification appreciated. Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parallel pg_restore - WIP patch
Hi, As I'm interested in this topic, I thought I'd take a look at the patch. I have no capability to test it on high end hardware but did some basic testing on my workstation and a basic review of the patch. I somehow had the impression that instead of creating a new connection for each restore item we would create the processes at the start and then send them the dumpId's they should be restoring. That would allow the controller to batch dumpId's together and expect the worker to process them in a transaction. But this is probably just an idea I created in my head.

Do we know why we experience tuple concurrently updated errors if we spawn threads too fast?

I completed some test restores using the pg_restore from head with the patch applied. The dump was a custom dump created with pg 8.2 and restored to an 8.2 database. To confirm this would work, I completed a restore using the standard single threaded mode. The schema restored successfully. The only errors reported involved non-existent roles. When I attempt to restore using parallel restore I get out of memory errors reported from _PrintData. The code returning the error is:

_PrintData(...
    while (blkLen != 0)
    {
        if (blkLen + 1 > ctx->inSize)
        {
            free(ctx->zlibIn);
            ctx->zlibIn = NULL;
            ctx->zlibIn = (char *) malloc(blkLen + 1);
            if (!ctx->zlibIn)
                die_horribly(AH, modulename, "out of memory\n");

            ctx->inSize = blkLen + 1;
            in = ctx->zlibIn;
        }

It appears from my debugging and looking at the code that in _PrintData;

    lclContext *ctx = (lclContext *) AH->formatData;

the memory context is shared across all threads. Which means that it's possible the memory contexts are stomping on each other. My GDB skills are not up to being able to reproduce this in a gdb session, as there are forks going on all over the place. And if you process them in a serial fashion, there aren't any errors. I'm not sure of the fix for this. But in a parallel environment it doesn't seem possible to store the memory context in the AH.

I also receive messages saying pg_restore: [custom archiver] could not read from input file: end of file. I have not investigated these further as my current guess is they are linked to the out of memory error. Given I ran into this error at my first testing attempt I haven't evaluated much else at this point in time. Now all this could be because I'm using the 8.2 archive, but it works fine in single restore mode. The dump file is about 400M compressed and an entire archive schema was removed from the restore path with a custom restore list. Command line used:

PGPORT=5432 ./pg_restore -h /var/run/postgresql -m4 --truncate-before-load -v -d tt2 -L tt.list /home/mr-russ/pg-index-test/timetable.pgdump 2> log.txt

I sent the log and this email originally to the list, but I think the attachment was too large, so I've resent without any attachments. Since my initial testing, Stefan has confirmed the problem I am having. If you have any questions, would like me to run other tests or anything, feel free to contact me. Regards Russell -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [PATCHES] [HACKERS] Infrastructure changes for recovery
On Fri, 2008-09-26 at 11:20 +0100, Simon Riggs wrote: After reading this for awhile, I realized that there is a rather fundamental problem with it: it switches into consistent recovery mode as soon as it's read WAL beyond ControlFile->minRecoveryPoint. In a crash recovery situation that typically is before the last checkpoint (if indeed it's not still zero), and what that means is that this patch will activate the bgwriter and start letting in backends instantaneously after a crash, long before we can have any certainty that the DB state really is consistent. In a normal crash recovery situation this would be easily fixed by simply not letting it go to consistent recovery state at all, but what about recovery from a restartpoint? We don't want a slave that's crashed once to never let backends in again. But I don't see how to determine that we're far enough past the restartpoint to be consistent again. In crash recovery we assume (without proof ;-)) that we're consistent once we reach the end of valid-looking WAL, but that rule doesn't help for a slave that's following a continuing WAL sequence. Perhaps something could be done based on noting when we have to pull in a WAL segment from the recovery_command, but it sounds like a pretty fragile assumption.

Seems like we just say we only signal the postmaster if InArchiveRecovery. Archive recovery from a restartpoint is still archive recovery, so this shouldn't be a problem in the way you mention. The presence of recovery.conf overrides all other cases.

Anticipating your possible responses, I would add this also: There has long been an annoying hole in the PITR scheme, which is the support of recovery using a crashed database. That is there to support split mirror snapshots, but it creates a loophole where we don't know the min recovery location, circumventing the care we (you!) took to put stop/start backup in place. I think we need to add a parameter to recovery.conf that people can use to specify a minRecoveryPoint iff there is no backup label file. They can work out what this should be by following this procedure, which we should document:
* split mirror, so you have an offline copy of the crashed database
* copy the database away to backup
* go to the running database and run pg_current_xlog_insert_location()
* use the value to specify recovery_min_location
If they don't specify this, then bgwriter will not start and you cannot run in Hot Standby mode. Their choice, so we need not worry then about the loophole any more. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
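To make Simon's procedure above concrete, here is a sketch of what recovery.conf on the copied-away cluster might look like. recovery_min_location is only the parameter being proposed in this message, not an existing setting, and the LSN shown is a made-up value standing in for the output of pg_current_xlog_insert_location() on the running primary at the time the mirror was split.

# recovery.conf on the split-mirror copy (sketch; paths and values are examples)
restore_command = 'cp /archive/%f %p'

# Proposed, not yet implemented: the WAL location the copy must replay past
# before it may be considered consistent, taken from
# pg_current_xlog_insert_location() on the live server when the mirror was split.
recovery_min_location = '2/5A3F1C80'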
Re: [HACKERS] parallel pg_restore - WIP patch
Russell Smith wrote: Hi, As I'm interested in this topic, I thought I'd take a look at the patch. I have no capability to test it on high end hardware but did some basic testing on my workstation and a basic review of the patch. I somehow had the impression that instead of creating a new connection for each restore item we would create the processes at the start and then send them the dumpId's they should be restoring. That would allow the controller to batch dumpId's together and expect the worker to process them in a transaction. But this is probably just an idea I created in my head.

Yes it is. To do that I would have to invent a protocol for talking to the workers, etc, and there is not the slightest chance I would get that done by November. And I don't see the virtue in processing them all in a transaction. I've provided a much simpler means of avoiding WAL logging of the COPY.

Do we know why we experience tuple concurrently updated errors if we spawn threads too fast?

No. That's an open item.

I completed some test restores using the pg_restore from head with the patch applied. The dump was a custom dump created with pg 8.2 and restored to an 8.2 database. To confirm this would work, I completed a restore using the standard single threaded mode. The schema restored successfully. The only errors reported involved non-existent roles. When I attempt to restore using parallel restore I get out of memory errors reported from _PrintData. The code returning the error is:

_PrintData(...
    while (blkLen != 0)
    {
        if (blkLen + 1 > ctx->inSize)
        {
            free(ctx->zlibIn);
            ctx->zlibIn = NULL;
            ctx->zlibIn = (char *) malloc(blkLen + 1);
            if (!ctx->zlibIn)
                die_horribly(AH, modulename, "out of memory\n");

            ctx->inSize = blkLen + 1;
            in = ctx->zlibIn;
        }

It appears from my debugging and looking at the code that in _PrintData;

    lclContext *ctx = (lclContext *) AH->formatData;

the memory context is shared across all threads. Which means that it's possible the memory contexts are stomping on each other. My GDB skills are not up to being able to reproduce this in a gdb session, as there are forks going on all over the place. And if you process them in a serial fashion, there aren't any errors. I'm not sure of the fix for this. But in a parallel environment it doesn't seem possible to store the memory context in the AH.

There are no threads, hence nothing is shared. fork() creates a new process, not a new thread, and all they share are file descriptors.

I also receive messages saying pg_restore: [custom archiver] could not read from input file: end of file. I have not investigated these further as my current guess is they are linked to the out of memory error. Given I ran into this error at my first testing attempt I haven't evaluated much else at this point in time. Now all this could be because I'm using the 8.2 archive, but it works fine in single restore mode. The dump file is about 400M compressed and an entire archive schema was removed from the restore path with a custom restore list. Command line used:

PGPORT=5432 ./pg_restore -h /var/run/postgresql -m4 --truncate-before-load -v -d tt2 -L tt.list /home/mr-russ/pg-index-test/timetable.pgdump 2> log.txt

I've attached the log.txt file so you can review the errors that I saw. I have adjusted the out of memory error to include a number to work out which one was being triggered. So you'll see 5 out of memory in the log file, which corresponds to the code above.

However, there does seem to be something odd happening with the compression lib, which I will investigate. Thanks for the report.
cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: new border setting in psql
So what have we decided about this suggestion? Should I submit the patch or just forget about it? So far some people like it and some people think that it is unnecessary. No one so far has suggested that it would harm the system or people's use of it. -- D'Arcy J.M. Cain [EMAIL PROTECTED] | Democracy is three wolves http://www.druid.net/darcy/| and a sheep voting on +1 416 425 1212 (DoD#0082)(eNTP) | what's for dinner. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parallel pg_restore - WIP patch
Stefan Kaltenbrunner wrote: Andrew Dunstan wrote: Attached is my WIP patch for parallel pg_restore. It's still very rough, but seems to work. Anyone who can test this with highend equipment would be helping some. tried playing with this(on a 22Gb compressed dump using 4 connections) but it does not seem to work at all for me: pg_restore: [custom archiver] out of memory pg_restore: [custom archiver] could not uncompress data: invalid block type pg_restore: [custom archiver] out of memory pg_restore: [custom archiver] could not uncompress data: invalid stored block lengths pg_restore: [custom archiver] out of memory pg_restore: [custom archiver] out of memory pg_restore: [custom archiver] could not uncompress data: invalid distance too far back pg_restore: [custom archiver] could not uncompress data: invalid distances set pg_restore: [custom archiver] could not uncompress data: invalid code lengths set pg_restore: [custom archiver] could not uncompress data: incorrect data check pg_restore: [custom archiver] could not uncompress data: invalid code lengths set pg_restore: [custom archiver] out of memory pg_restore: [custom archiver] out of memory pg_restore: [custom archiver] out of memory pg_restore: [custom archiver] out of memory pg_restore: [custom archiver] could not uncompress data: invalid literal/length code pg_restore: [custom archiver] could not uncompress data: invalid literal/length code pg_restore: [custom archiver] could not uncompress data: invalid block type each pg_restore process seem to eat a few gigabytes of memory in a few seconds. Ouch. Ok, Thanks for the report. I will investigate. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
We already have an optional OID system column that can be specified during table creation (WITH OIDS). We could have another optional oid column (WITH ROW SECURITY) called security_context which would store the oid of the role that can see the row; if the oid is zero (InvalidOid),

A role alone is not sufficient. It needs to be the proposed mapping to pg_security.

anyone can see it. SE-PostgreSQL would default to WITH ROW SECURITY and use the oid to look up strings in pg_security.

The above explanation is not correct, as Tom mentioned. The security system column is declared as TEXT type, however, every tuple has an Oid value that points into the pg_security system catalog. This avoids wasting storage. When a user reads the system column, it is translated from the Oid to the text representation.

Imho the important points Bruce wanted to make are:
1. there is only one extra oid storage column per row (regardless of whether it is translated to text upon select); this is already the case in the patch.
2. the column(s) are system columns, so they do not show up in select *.
I think having access to the oid directly could be beneficial to performance, e.g. a smart client could cache pg_security and map the OIDs to text locally instead of transferring the quite verbose text representation for every row. That may be moot, because showing the security_context definitely sounds more like an admin kind of functionality. Traditionally the column would probably be oid and SQL would need to cast it for the text representation (e.g. security_context::regsecurity). Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parallel pg_restore - WIP patch
Andrew Dunstan wrote: Do we know why we experience tuple concurrently updated errors if we spawn threads too fast? No. That's an open item.

Okay, I'll see if I can have a little more of a look into it. No promises, as the restore isn't playing nicely.

the memory context is shared across all threads. Which means that it's possible the memory contexts are stomping on each other. My GDB skills are not up to being able to reproduce this in a gdb session, as there are forks going on all over the place. And if you process them in a serial fashion, there aren't any errors. I'm not sure of the fix for this. But in a parallel environment it doesn't seem possible to store the memory context in the AH.

There are no threads, hence nothing is shared. fork() creates a new process, not a new thread, and all they share are file descriptors.

However, there does seem to be something odd happening with the compression lib, which I will investigate. Thanks for the report.

I'm sorry, I meant processes there. I'm aware there are no threads. But my feeling was that when you forked with open files you got all of the open file properties, including positions, and as you dupped the descriptor, you share all that it's pointing to with every other copy of the descriptor. My brief research on that shows that in 2005 there was a kernel mailing list discussion on this issue. http://mail.nl.linux.org/kernelnewbies/2005-09/msg00479.html was quite informative for me. I again could be wrong, but it's worth a read. If it is true, then the file needs to be reopened by each child; it can't use the duplicated descriptor. I haven't had a chance to implementation-test it as it's late here. But I'd take a stab that it will solve the compression library problems. I hope this helps, not hinders. Russell. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
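A minimal sketch (not pg_restore code) of the behaviour Russell describes: after fork(), parent and child share a single open file description, so a read() in one process moves the file offset seen by the other, whereas a fresh open() in the child would give it an independent offset. The file name is hypothetical.

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/wait.h>

int
main(void)
{
    char buf[4];
    int  fd = open("dump.bin", O_RDONLY);   /* hypothetical archive file */

    if (fd < 0)
    {
        perror("open");
        return 1;
    }

    if (fork() == 0)
    {
        /* child: this read advances the offset shared with the parent */
        read(fd, buf, sizeof(buf));
        _exit(0);
    }
    wait(NULL);

    /*
     * Parent: the offset is now 4, not 0, because the inherited descriptor
     * shares its file position with the child's copy.  A worker that needs
     * its own position must open() the file again rather than reuse fd.
     */
    printf("parent offset after child read: %ld\n",
           (long) lseek(fd, 0, SEEK_CUR));
    close(fd);
    return 0;
}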
Re: [HACKERS] FSM, now without WAL-logging
Heikki Linnakangas wrote: Zdenek Kotala wrote: Attached is a new version, now with WAL-logging of the FSM truncation. I decided to go with the separate WAL record for that, rather than piggybacking on the smgrtruncate's WAL record. It seems much better from a modularity point of view this way. I've also worked on the comments, renamed many of the internal functions in a more coherent scheme, and I also started using the struct FSMAddress you suggested a while ago. But I don't think I've changed anything that could explain that crash. Let me know if it still doesn't work.

This version works on my old repo. I performed a performance test (iGEN) on a SUN x4600 with 60 concurrent users and see these results:

Original:
-
MQThL (Maximum Qualified Throughput LIGHT): 1209.60 tpm
MQThM (Maximum Qualified Throughput MEDIUM): 2576.72 tpm
MQThH (Maximum Qualified Throughput HEAVY): 2191.20 tpm
TRANSACTION MIX
Total number of transactions = 181232
TYPE         TX. COUNT   MIX
Light:       30240       16.69%
Medium:      64418       35.54%
DSS:         19865       10.96%
Heavy:       54780       30.23%
Connection:  11929       6.58%
RESPONSE TIMES   AVG.   MAX.   90TH
Light            0.304  6.405  0.400
Medium           0.317  6.533  0.400
DSS              0.266  6.343  0.020
Heavy            0.361  6.737  3.000
Connections      0.264  5.983  0.400
Number of users = 60
Sum of Avg. RT * TPS for all Tx. Types = 32.770142

FSM with WAL
-
MQThL (Maximum Qualified Throughput LIGHT): 1199.36 tpm
MQThM (Maximum Qualified Throughput MEDIUM): 2569.12 tpm
MQThH (Maximum Qualified Throughput HEAVY): 2171.64 tpm
TRANSACTION MIX
Total number of transactions = 180625
TYPE         TX. COUNT   MIX
Light:       29984       16.60%
Medium:      64228       35.56%
DSS:         20181       11.17%
Heavy:       54291       30.06%
Connection:  11941       6.61%
RESPONSE TIMES   AVG.   MAX.   90TH
Light            0.309  6.560  0.400
Medium           0.323  6.529  0.400
DSS              0.268  6.327  0.020
Heavy            0.360  6.675  3.000
Connections      0.274  6.359  0.400
Number of users = 60
Sum of Avg. RT * TPS for all Tx. Types = 32.845712

FSM no WAL last version
---
MQThL (Maximum Qualified Throughput LIGHT): 1207.92 tpm
MQThM (Maximum Qualified Throughput MEDIUM): 2611.84 tpm
MQThH (Maximum Qualified Throughput HEAVY): 2177.68 tpm
TRANSACTION MIX
Total number of transactions = 18
TYPE         TX. COUNT   MIX
Light:       30198       16.57%
Medium:      65296       35.83%
DSS:         20118       11.04%
Heavy:       54442       29.88%
Connection:  12168       6.68%
RESPONSE TIMES   AVG.   MAX.   90TH
Light            0.301  6.106  0.400
Medium           0.315  6.130  0.400
DSS              0.261  5.977  0.020
Heavy            0.361  6.220  3.000
Connections      0.260  6.044  0.400
Number of users = 60
Sum of Avg. RT * TPS for all Tx. Types = 32.696832
-

I don't see any big difference. Throughput is similar. Only response time seems to be better with your last FSM version. I'm personally happy with the performance. Zdenek -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] About the parameter of API: PQprepared
iihero [EMAIL PROTECTED] writes: In libpq, the definition is like: PGresult * PQprepare(PGconn *conn, const char *stmtName, const char *query, int nParams, const Oid *paramTypes) Could we remove the parameter nParams? No. We are not going to break every existing user of the function. But you can already set it to zero if you don't feel like prespecifying the parameter types. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
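For anyone following the question, a minimal sketch (not from the thread) of what Tom describes: passing zero for nParams and NULL for paramTypes leaves every parameter type for the server to infer when it parses the statement; the real parameter count only matters at execution time. The connection string, table, and statement name are placeholders.

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn     *conn = PQconnectdb("dbname=test");   /* placeholder conninfo */
    PGresult   *res;
    const char *values[3] = {"1", "fred", "12 Some Street"};

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    /* nParams = 0, paramTypes = NULL: no types pre-specified, server infers $1..$3 */
    res = PQprepare(conn, "ins_foo",
                    "insert into foo(id, name, address) values ($1, $2, $3)",
                    0, NULL);
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
        fprintf(stderr, "prepare failed: %s", PQerrorMessage(conn));
    PQclear(res);

    /* the actual number of parameters is supplied when executing */
    res = PQexecPrepared(conn, "ins_foo", 3, values, NULL, NULL, 0);
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
        fprintf(stderr, "insert failed: %s", PQerrorMessage(conn));
    PQclear(res);

    PQfinish(conn);
    return 0;
}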
Re: [HACKERS] FSM, now without WAL-logging
Zdenek Kotala wrote: I performed performance test (iGEN) on SUN x4600 with 60 concurrent users and see result: ... I don't see any big difference. Throughput is similar. Only response time seems to be better with your last FSM version. I personally happy with performance.

Thanks! I've been running DBT-2 tests myself, and I'm not seeing any difference there either:

testno  TPM        NO90  time  comment
36      14051.724        6h    fsm-nowal
35      14210.761        6h    CVS HEAD
34      14391.066        2h    fsm-nowal
33      14420.868        2h    CVS HEAD

The NO90 is the 90% percentile of New Order transaction response time. There's big variation there in the 6h tests, because of a huge dip in performance near the end of the test, both with and without the patch. I don't have an explanation for the dips, but it throws off the response times for the whole tests. Given that in the 2h tests, which is exactly the same as the 6h test, just shorter, there's no degradation in the response times, I'm not worried about that. I don't have access to the site where I used to publish the test results earlier, but let me know if you want to see the full test results, and I'll try to zip them up and FTP somewhere (~500 MB uncompressed). I've also tried various pgbench tests, on a RAM disk and otherwise, as well as the table population test I ran earlier, and don't see any difference in performance. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parallel pg_restore - WIP patch
Russell Smith wrote: I'm sorry, I meant processes there. I'm aware there are no threads. But my feeling was that when you forked with open files you got all of the open file properties, including positions, and as you dupped the descriptor, you share all that it's pointing to with every other copy of the descriptor. My brief research on that shows that in 2005 there was a kernel mailing list discussion on this issue. http://mail.nl.linux.org/kernelnewbies/2005-09/msg00479.html was quite informative for me. I again could be wrong but worth a read. If it is true, then the file needs to be reopened by each child, it can't use the duplicated descriptor. I haven't had a change to implementation test is as it's late here. But I'd take a stab that it will solve the compression library problems. I hope this helps, not hinders I'm sure that's the problem. Should be fairly easily fixable, I believe. Thanks for the info. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
On Thu, Sep 25, 2008 at 08:57:46PM -0400, Tom Lane wrote: Another point is that the proposed behavior leaks quite a lot of information, since it will fail operations on the basis of tuples that supposedly aren't visible to the invoking user. While I admit that it's hard to see an alternative if we're to preserve FK integrity, I have to worry that this definition isn't going to satisfy the tin-foil-hat brigade that are supposed to be the main users of SEPostgres. If the goal is you don't know the row is there, this doesn't seem to meet it. The above point, and other similar ones in every discussion of the proposed functionality, makes me think once again either that the requirements for this feature aren't understood by everyone, or else that they're not actually explicit enough. I have a feeling it's the latter. Certainly, I've not yet read a complete security analysis of a data system security plan that outlines why the proposed model is correct. What I think is really happening with this development is that the SE-Linux understanding of security enhancement has been taken as the correct analysis for how one secures an information system. That deep assumption appears to me to be informing much of the development of SE-PostgreSQL. In particular, that deep assumption includes an assumption that consistency of access control trumps all. The Postgres developers who are questioning the SE approach are (I think) coming at this from the point of view of data systems developers, where consistency of the data set trumps all. I suspect that the tension between these approaches will not be reconciled without a fairly complete outline of possible security models for data systems, their relationship to what the OS security people have decided is the right thing to do, and the trade-offs necessary to make different priorities work. Some of the trade offs may include things like violate traditional understanding of data set consistency and possible disclosure of existence of datum. I think this will be a lot of work, and I'm not volunteering to do it. I nevertheless think that without it, the SE-PostgreSQL features will continue to be a very awkward fit. A -- Andrew Sullivan [EMAIL PROTECTED] +1 503 667 4564 x104 http://www.commandprompt.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
On Thu, Sep 25, 2008 at 10:32:24PM -0400, Tom Lane wrote: I can't escape the lurking suspicion that some bright folk inside the NSA have spent years thinking about this and have come up with some reasonably self-consistent definition of row hiding in a SQL database. But have they published it where we can find it? I have a couple contacts in the security world who might be able to help with references. I'm asking them now. A -- Andrew Sullivan [EMAIL PROTECTED] +1 503 667 4564 x104 http://www.commandprompt.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] FSM, now without WAL-logging
Heikki Linnakangas wrote: Zdenek Kotala wrote: snip I've also tried various pgbench tests, on a RAM disk and otherwise, as well as the table population test I ran earlier, and don't see any difference in performance. I think performance is OK. Zdenek -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
Andrew Sullivan [EMAIL PROTECTED] writes: The above point, and other similar ones in every discussion of the proposed functionality, makes me think once again either that the requirements for this feature aren't understood by everyone, or else that they're not actually explicit enough. I have a feeling it's the latter. Yeah, I think that's exactly the problem here: we've got this large patch and no agreement on just what requirements it's supposed to meet. Perhaps others see it differently, but I feel like I'm being told that whatever the patch does is the right thing by definition ... and yet it doesn't seem to meet what I would think are the likely requirements of the users who might actually want such features. Agreeing on the requirements seems like a necessary condition for arriving at any consensus on a patch. Where can we get some evidence that would convince everyone that the requirements for a highly secure database are X, Y and Z? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
Zeugswetter Andreas OSB sIT wrote: anyone can see it. SE-PostgreSQL would default to WITH ROW SECURITY and use the oid to look up strings in pg_security. The above explanation is not correct, as Tom mentioned. The security system column is declared as TEXT type, however, every tuple has an Oid value that points into the pg_security system catalog. This avoids wasting storage. When a user reads the system column, it is translated from the Oid to the text representation. Imho the important points Bruce wanted to make are: 1. there is only one extra oid storage column per row (regardless of whether it is translated to text upon select); this is already the case in the patch. 2. the column(s) are system columns, so they do not show up in select *. I think having access to the oid directly could be beneficial to performance, e.g. a smart client could cache pg_security and map the OIDs to text locally instead of transferring the quite verbose text representation for every row. That may be moot, because showing the security_context definitely sounds more like an admin kind of functionality. Traditionally the column would probably be oid and SQL would need to cast it for the text representation (e.g. security_context::regsecurity).

In most cases, SE-PostgreSQL does not need to translate the security identifier into its text representation, because it caches the results of access checks between the client and the recently used security contexts. SE-PostgreSQL can make its decision by referring to the internal hash table keyed by the security Oid. (See src/backend/security/sepgsql/avc.c.) When a user asks to see the security_context, it is necessary to look up pg_security to translate the security Oid into the text representation, but I guess that is not frequent. Thanks, -- KaiGai Kohei [EMAIL PROTECTED] -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
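A schematic illustration (not the actual avc.c code, and every name here is invented) of the caching idea KaiGai describes: decisions are keyed by the numeric security Oid carried on each tuple, so the verbose text form of the context never has to be produced on the hot path.

#include <stdbool.h>
#include <stdio.h>

typedef unsigned int Oid;

/* one cached decision: security Oid -> allowed/denied */
typedef struct AvcEntry
{
    Oid  secid;
    bool valid;
    bool allowed;
} AvcEntry;

#define AVC_SLOTS 256
static AvcEntry avc_cache[AVC_SLOTS];

/* stand-in for the real policy check; behaviour invented for the example */
static bool
check_policy_for(Oid secid)
{
    return (secid % 2) == 0;
}

/* look up a decision by security Oid; only cache misses consult the policy */
static bool
avc_permit(Oid secid)
{
    AvcEntry *e = &avc_cache[secid % AVC_SLOTS];

    if (!e->valid || e->secid != secid)
    {
        e->secid = secid;
        e->allowed = check_policy_for(secid);
        e->valid = true;
    }
    return e->allowed;
}

int
main(void)
{
    /* repeated checks against the same Oid reuse the cached decision */
    printf("%d %d %d\n", avc_permit(16385), avc_permit(16385), avc_permit(16386));
    return 0;
}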
[HACKERS] planned maintenance downtime - tribble.postgresql.org
The sysadmin team would like to announce a planned maintenance window for OS related updates on tribble.postgresql.org starting Sunday Sep 28 07:00 GMT (expected to last for an hour) affecting the following publicly visible services: cvs.postgresql.org wwwmaster.postgresql.org www.pgadmin.org doxygen.postgresql.org wiki.postgresql.org I would ask people to hold off on any changes or commits to the affected services during that time period until you see an explicit "it's done". regards Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parallel pg_restore - WIP patch
This version of the patch should fix the shared file descriptor bug Russell Smith noticed. It also disables the 1/2 second sleep between forks, so the performance on a small db (regression) is vastly improved. cheers andrew Index: pg_backup.h === RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_backup.h,v retrieving revision 1.47 diff -c -r1.47 pg_backup.h *** pg_backup.h 13 Apr 2008 03:49:21 - 1.47 --- pg_backup.h 26 Sep 2008 15:15:38 - *** *** 123,128 --- 123,130 int suppressDumpWarnings; /* Suppress output of WARNING entries * to stderr */ bool single_txn; + int number_of_threads; + booltruncate_before_load; bool *idWanted; /* array showing which dump IDs to emit */ } RestoreOptions; *** *** 165,170 --- 167,173 extern void CloseArchive(Archive *AH); extern void RestoreArchive(Archive *AH, RestoreOptions *ropt); + extern void RestoreArchiveParallel(Archive *AH, RestoreOptions *ropt); /* Open an existing archive */ extern Archive *OpenArchive(const char *FileSpec, const ArchiveFormat fmt); Index: pg_backup_archiver.c === RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_backup_archiver.c,v retrieving revision 1.158 diff -c -r1.158 pg_backup_archiver.c *** pg_backup_archiver.c 5 Sep 2008 23:53:42 - 1.158 --- pg_backup_archiver.c 26 Sep 2008 15:15:39 - *** *** 27,38 --- 27,50 #include unistd.h + #include sys/types.h + #include sys/wait.h + + #ifdef WIN32 #include io.h #endif #include libpq/libpq-fs.h + typedef struct _parallel_slot + { + pid_t pid; + TocEntry *te; + DumpId dumpId; + } ParallelSlot; + + #define NO_SLOT (-1) const char *progname; *** *** 70,76 --- 82,99 static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim); static OutputContext SetOutput(ArchiveHandle *AH, char *filename, int compression); static void ResetOutput(ArchiveHandle *AH, OutputContext savedContext); + static bool work_is_being_done(ParallelSlot *slot, int n_slots); + static int get_next_slot(ParallelSlot *slots, int n_slots); + static TocEntry *get_next_work_item(ArchiveHandle *AH); + static void prestore(ArchiveHandle *AH, TocEntry *te); + static void mark_work_done(ArchiveHandle *AH, pid_t worker, ParallelSlot *slots, int n_slots); + static int _restore_one_te(ArchiveHandle *ah, TocEntry *te, RestoreOptions *ropt,bool is_parallel); + static void _reduce_dependencies(ArchiveHandle * AH, TocEntry *te); + static void _fix_dependency_counts(ArchiveHandle *AH); + static void _inhibit_data_for_failed_table(ArchiveHandle *AH, TocEntry * te); + + static ArchiveHandle *GAH; /* * Wrapper functions. 
*** *** 125,137 /* Public */ void RestoreArchive(Archive *AHX, RestoreOptions *ropt) { ArchiveHandle *AH = (ArchiveHandle *) AHX; TocEntry *te; teReqs reqs; OutputContext sav; - bool defnDumped; AH-ropt = ropt; AH-stage = STAGE_INITIALIZING; --- 148,529 /* Public */ void + RestoreArchiveParallel(Archive *AHX, RestoreOptions *ropt) + { + + ArchiveHandle *AH = (ArchiveHandle *) AHX; + ParallelSlot *slots; + int next_slot; + TocEntry *next_work_item = NULL; + int work_status; + pid_t ret_child; + int n_slots = ropt-number_of_threads; + TocEntry *te; + teReqsreqs; + + + /* AH-debugLevel = 99; */ + /* some routines that use ahlog() don't get passed AH */ + GAH = AH; + + ahlog(AH,1,entering RestoreARchiveParallel\n); + + + slots = (ParallelSlot *) calloc(sizeof(ParallelSlot),n_slots); + AH-ropt = ropt; + + if (ropt-create) + die_horribly(AH,modulename, + parallel restore is incompatible with --create\n); + + if (ropt-dropSchema) + die_horribly(AH,modulename, + parallel restore is incompatible with --clean\n); + + if (!ropt-useDB) + die_horribly(AH,modulename, + parallel restore requires direct database connection\n); + + + #ifndef HAVE_LIBZ + + /* make sure we won't need (de)compression we haven't got */ + if (AH-compression != 0 AH-PrintTocDataPtr != NULL) + { + for (te = AH-toc-next; te != AH-toc; te = te-next) + { + reqs = _tocEntryRequired(te, ropt, false); + if (te-hadDumper (reqs REQ_DATA) != 0) + die_horribly(AH, modulename, + cannot restore from compressed archive (compression not supported in this installation)\n); + } + } + #endif + + ahlog(AH, 1, connecting to database for restore\n); + if (AH-version K_VERS_1_3) + die_horribly(AH, modulename, + direct database connections are not supported in pre-1.3 archives\n); + + /* XXX Should get this from the archive */ + AHX-minRemoteVersion = 070100; + AHX-maxRemoteVersion = 99; + + /* correct dependency counts in case we're doing a
Re: [HACKERS] PostgreSQL future ideas
On Sep 25, 2008, at 5:50 PM, Chris Browne wrote: [EMAIL PROTECTED] (Gevik Babakhani) writes: Advantage of C++ is that it reduce lot of OO code written in C in PostgreSQL, but it is so big effort to do that without small gain. It will increase number of bugs. Do not forget also that C++ compiler is not so common (so good) on different platforms. If somebody interesting in that yes but like a fork ( PostgreSQL++ :-). Reducing OO code that is written in C is one of my major interests. After some investigating myself it appears that having the codebase fully (rewritten in C++ will have an impact on the performance. So I guess such an effort will result the code being more C++ish and fully OO, being a mixture in C with some OO taste. I'm not convinced that it would a good idea at all to make the system fully OO, nor that C++ would be a meaningful tool to use to that end. After all, C++ can certainly be used in decidedly non-OO ways. For instance, STL is NOT an OO framework, and the author of STL, obviously something of a fan of C++, characterizes OO as almost as much of a hoax as Artificial Intelligence. http://en.wikipedia.org/wiki/Object-oriented_programming#Criticism I tend to agree with that characterization. Further, C++ suffers from the same not OO at its base problem of Java, which contributes complexity as well as hurting the OO-ness of it. Better idea is to start to use C99 in PostgreSQL ;-). I have not investigated this yet. But I am very interested to know what the advantages would be to upgrade the code to C99 standards. It would give us heartburn on any platforms where the preferred compiler doesn't grok C99, for sure. As much as I'm ok with using GCC, it would seem unfortunate to force people into using GCC everywhere, and preclude using other compilers. (And actually, I'm more ambivalent about GCC than that; I'm not totally happy with how GCC has gone, but that's another tale for another day...) Speaking of language choice, no one said that _all_ the source code would need to be rewritten. It would be nice, for example, if PostgreSQL rewrote the current GUC system with a glue language like Lua (which is also very C-like). Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PostgreSQL future ideas
A.M. wrote: Speaking of language choice, no one said that _all_ the source code would need to be rewritten. It would be nice, for example, if PostgreSQL rewrote the current GUC system with a glue language like Lua (which is also very C-like). No it wouldn't. All it would mean is that you'd need developers fluent in both languages. If this were a greenfields project we might well now make choices other than those made in the past, but that doesn't mean we should constantly revisit those decisions. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] lock contention on parallel COPY ?
I'm currently playing with Andrew's parallel restore patch and it seems that pg is far from taking advantage of the hardware I have for testing (Dual Quad Core Xeon attached to a Netapp with 68 spindles). With a concurrency of 4 I see iowait hovering at ~1%, CPU load at 20% peak and around 15 context switches/s. The load seems to progress at around the same rate as a single backend restore does on the same box. A profile during the load (a fair number of tables sized ~10-30M rows each) looks fairly similar to:

samples   %        symbol name
1933314   21.8884  LWLockAcquire
1677808   18.9957  XLogInsert
848227     9.6034  LWLockRelease
414179     4.6892  DoCopy
332633     3.7660  CopyReadLine
266580     3.0181  UnpinBuffer
221693     2.5099  heap_formtuple
176939     2.0033  .plt
171842     1.9455  PinBuffer
160470     1.8168  GetNewObjectId
154095     1.7446  heap_insert
151813     1.7188  s_lock
117849     1.3343  LockBuffer
109530     1.2401  hash_search_with_hash_value
102169     1.1567  PageAddItem
 91151     1.0320  pg_verify_mbstr_len
 82538     0.9345  CopyGetData

Using --truncate-before-load seems to help a bit but it still seems to be only barely utilizing the available resources. Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] lock contention on parallel COPY ?
Stefan Kaltenbrunner [EMAIL PROTECTED] writes: samples %symbol name 1933314 21.8884 LWLockAcquire 1677808 18.9957 XLogInsert 8482279.6034 LWLockRelease 4141794.6892 DoCopy 3326333.7660 CopyReadLine 2665803.0181 UnpinBuffer 2216932.5099 heap_formtuple I suppose Andrew didn't yet put in the hack to avoid WAL logging during the COPY commands. The LWLockAcquires are presumably blocking on WALInsertLock, given that XLogInsert is also right up there ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] lock contention on parallel COPY ?
Tom Lane wrote: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: samples %symbol name 1933314 21.8884 LWLockAcquire 1677808 18.9957 XLogInsert 8482279.6034 LWLockRelease 4141794.6892 DoCopy 3326333.7660 CopyReadLine 2665803.0181 UnpinBuffer 2216932.5099 heap_formtuple I suppose Andrew didn't yet put in the hack to avoid WAL logging during the COPY commands. The LWLockAcquires are presumably blocking on WALInsertLock, given that XLogInsert is also right up there ... Yes I did. That's what the --truncate-before-load switch does (or should do). cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] lock contention on parallel COPY ?
Andrew Dunstan [EMAIL PROTECTED] writes: Tom Lane wrote: I suppose Andrew didn't yet put in the hack to avoid WAL logging Yes I did. That's what the --truncate-before-load switch does (or should do). Well, it doesn't seem to be having any effect. Maybe Stefan is testing a configuration with xlog archiving enabled? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] lock contention on parallel COPY ?
Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Tom Lane wrote: I suppose Andrew didn't yet put in the hack to avoid WAL logging Yes I did. That's what the --truncate-before-load switch does (or should do). Well, it doesn't seem to be having any effect. Maybe Stefan is testing a configuration with xlog archiving enabled?

Heh, no log archiving - I actually said that I'm now playing with --truncate-before-load, which seems to cause a noticeable performance increase (as in IO generated), but I still see 13 context switches/s and a profile that looks like:

samples  %        symbol name
55526    16.5614  LWLockAcquire
29721     8.8647  DoCopy
26581     7.9281  CopyReadLine
25105     7.4879  LWLockRelease
15743     4.6956  PinBuffer
14725     4.3919  heap_formtuple
 9748     2.9075  GetNewObjectId
 8059     2.4037  pg_verify_mbstr_len
 6825     2.0356  hash_search_with_hash_value
 6386     1.9047  s_lock
 5645     1.6837  heap_insert
 5631     1.6795  PageAddItem
 4723     1.4087  pg_atoi

Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] lock contention on parallel COPY ?
Stefan Kaltenbrunner [EMAIL PROTECTED] writes: heh no log archiving - I actually said that I'm now playing with --truncate-before-load which seems to cause a noticeable performance (as in IO generated) increase but I still see 13 context switches/s and a profile that looks like: samples %symbol name 5552616.5614 LWLockAcquire 29721 8.8647 DoCopy 26581 7.9281 CopyReadLine 25105 7.4879 LWLockRelease 15743 4.6956 PinBuffer 14725 4.3919 heap_formtuple Still a lot of contention for something, then. You might try turning on LWLOCK_STATS (this only requires recompiling storage/lmgr/lwlock.c) to get some evidence about what. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
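For anyone wanting to reproduce this, a sketch of one way to follow Tom's suggestion; the exact hook may vary by version, so treat the build step as an assumption rather than a recipe.

/*
 * Enable per-lock statistics by defining LWLOCK_STATS and rebuilding
 * src/backend/storage/lmgr/lwlock.c, e.g. by adding the line below near
 * the top of that file (or passing the define on the compiler command
 * line, e.g. make COPT='-DLWLOCK_STATS') and re-running make.
 */
#define LWLOCK_STATS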
Re: [HACKERS] lock contention on parallel COPY ?
Tom Lane wrote: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: heh no log archiving - I actually said that I'm now playing with --truncate-before-load which seems to cause a noticeable performance (as in IO generated) increase but I still see 13 context switches/s and a profile that looks like: samples %symbol name 5552616.5614 LWLockAcquire 29721 8.8647 DoCopy 26581 7.9281 CopyReadLine 25105 7.4879 LWLockRelease 15743 4.6956 PinBuffer 14725 4.3919 heap_formtuple Still a lot of contention for something, then. You might try turning on LWLOCK_STATS (this only requires recompiling storage/lmgr/lwlock.c) to get some evidence about what. that one generates a huge amount of logs - output for ~60s into the load is available here: http://www.kaltenbrunner.cc/files/lwstats.txt (21MB!) Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Meridiem markers (was: [BUGS] Incorrect invalid AM/PM string error from to_timestamp)
Hey folks, Tom pointed out in the parent thread [1] that the error message for bogus AM/PM markers in to_timestamp is pretty lame: ERROR: invalid AM/PM string I agree, and once I started thinking about this, I came up with other gripes concerning the treatment of 12-hour time in to_timestamp, the use of the meridiem markers (AM/PM) in particular. Currently, Postgres accepts four separate flavours for specifying meridiem markers, given by uppercase/lowercase and with/without periods: * am/pm * AM/PM * a.m./p.m. * A.M./P.M. What I find surprising about the implementation is that you must specify the correct formatting pattern for the particular flavour of meridiem marker in your input string, or you get the aforementioned lame error message. Consider: postgres=# select to_timestamp('11:47 pm 27 Sep 2008', 'HH:MI PM DD Mon '); ERROR: invalid AM/PM string Here it seems to me that Postgres is being unnecessarily pedantic. Yes, the case of the marker differs from the case of the formatting keyword, but is that really grounds for an ERROR? The user's intention is perfectly unambiguous, so why not just accept the input? I would go so far as to say that we should accept any of the 8 valid meridiem markers, regardless of which flavour is indicated by the formatting keyword. Day and month names already work this way. We don't throw an error if a user specifies a mixed-case month name like Sep but uses the uppercase formatting keyword MON. I suspect that the current behaviour isn't so much a deliberate choice to be draconian as it is a consequence of the way the code was developed; as the inverse of to_char(). One way to tidy this up would be to re-implement the meridiem markers using the seq_search functions, i.e., make it work like the day and month names. This would make it easy to accept any flavour of marker, and the error messages thrown for bogus input would then be the same as those for bogus day and month names. Note that all of the above applies equally to the era markers B.C. and A.D. Comments? Cheers, BJ [1] http://archives.postgresql.org/message-id/[EMAIL PROTECTED] -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
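To make the proposal a bit more concrete, here is a rough standalone sketch of a seq_search-style lookup that accepts any spelling of the meridiem marker case-insensitively. The function and array names are invented for illustration; this is not the actual formatting.c code.

    #include <stdio.h>
    #include <string.h>
    #include <strings.h>

    /* Hypothetical table of accepted spellings, scanned case-insensitively,
     * in the spirit of the seq_search() approach used for day/month names. */
    static const char *const meridiem_names[] = {"a.m.", "p.m.", "am", "pm", NULL};

    static int
    meridiem_search(const char *src, int *consumed)
    {
        for (int i = 0; meridiem_names[i] != NULL; i++)
        {
            size_t len = strlen(meridiem_names[i]);

            if (strncasecmp(src, meridiem_names[i], len) == 0)
            {
                *consumed = (int) len;
                return (i % 2);     /* 0 = AM, 1 = PM, whatever the spelling */
            }
        }
        return -1;                  /* caller raises the usual "invalid value" error */
    }

    int
    main(void)
    {
        int used;

        printf("pm -> %d\n", meridiem_search("pm 27 Sep", &used));   /* 1 */
        printf("P.M. -> %d\n", meridiem_search("P.M.", &used));      /* 1 */
        printf("xx -> %d\n", meridiem_search("xx", &used));          /* -1 */
        return 0;
    }

With a lookup like this, the formatting keyword only determines how to_char prints the marker; on input, any of the eight flavours would be accepted, matching the day/month-name behaviour.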
Re: [HACKERS] Meridiem markers (was: [BUGS] Incorrect invalid AM/PM string error from to_timestamp)
Brendan Jurd [EMAIL PROTECTED] writes: One way to tidy this up would be to re-implement the meridiem markers using the seq_search functions, i.e., make it work like the day and month names. This would make it easy to accept any flavour of marker, and the error messages thrown for bogus input would then be the same as those for bogus day and month names. Note that all of the above applies equally to the era markers B.C. and A.D. +1 ... making these work like month names makes sense to me. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] lock contention on parallel COPY ?
Stefan Kaltenbrunner [EMAIL PROTECTED] writes: that one generates a huge amount of logs - output for ~60s into the load is available here: http://www.kaltenbrunner.cc/files/lwstats.txt (21MB!) Huh ... essentially all the contention is for OidGenLock. I take it you're loading into a table that has OIDs? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PostgreSQL future ideas
On Fri, Sep 26, 2008 at 11:52 AM, Andrew Dunstan [EMAIL PROTECTED] wrote: Speaking of language choice, no one said that _all_ the source code would need to be rewritten. It would be nice, for example, if PostgreSQL rewrote the current GUC system with a glue language like Lua (which is also very C-like). No it wouldn't. All it would mean is that you'd need developers fluent in both languages. Having done quite a bit of internals work with SAP DB (which is an amalgamation of C, C++, and Pascal), I completely agree. The entire system, if possible, should be in a single language. -- Jonah H. Harris, Senior DBA myYearbook.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] lock contention on parallel COPY ?
Tom Lane wrote: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: that one generates a huge amount of logs - output for ~60s into the load is available here: http://www.kaltenbrunner.cc/files/lwstats.txt (21MB!) Huh ... essentially all the contention is for OidGenLock. I take it you're loading into a table that has OIDs? hmm that particular database has its ancient roots in 7.3 (or 7.4) and indeed it seems that some tables in the dump are preceded by a SET default_with_oids = true; Will go and eliminate those and try again. Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] lock contention on parallel COPY ?
On Fri, 2008-09-26 at 12:38 -0400, Tom Lane wrote: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: heh no log archiving - I actually said that I'm now playing with --truncate-before-load which seems to cause a noticeable performance (as in IO generated) increase but I still see 13 context switches/s and a profile that looks like: samples % symbol name 55526 16.5614 LWLockAcquire 29721 8.8647 DoCopy 26581 7.9281 CopyReadLine 25105 7.4879 LWLockRelease 15743 4.6956 PinBuffer 14725 4.3919 heap_formtuple Still a lot of contention for something, then. You might try turning on LWLOCK_STATS (this only requires recompiling storage/lmgr/lwlock.c) to get some evidence about what. Probably loading a table with a generated PK or loading data in ascending sequence, so it's contending heavily for the rightmost edge of the index. We need to load data a block at a time and buffer the inserts into the index also, so we don't need to lock/unlock per row. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] lock contention on parallel COPY ?
Simon Riggs [EMAIL PROTECTED] writes: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: samples % symbol name 55526 16.5614 LWLockAcquire 29721 8.8647 DoCopy 26581 7.9281 CopyReadLine 25105 7.4879 LWLockRelease 15743 4.6956 PinBuffer 14725 4.3919 heap_formtuple Probably loading a table with a generated PK or loading data in ascending sequence, so it's contending heavily for the rightmost edge of the index. No, given that DoCopy and CopyReadLine are right up there, I think we're still looking at the COPY phase, not index building. The profile will probably change completely once index building starts... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Meridiem markers (was: [BUGS] Incorrect invalid AM/PM string error from to_timestamp)
On Fri, Sep 26, 2008 at 11:25 AM, Brendan Jurd [EMAIL PROTECTED] wrote: One way to tidy this up would be to re-implement the meridiem markers using the seq_search functions, i.e., make it work like the day and month names. This would make it easy to accept any flavour of marker, and the error messages thrown for bogus input would then be the same as those for bogus day and month names. Yeah, if we use seq_search then it should be a pretty easy conversion, so +1. However that still leaves the original complaint around (at least IMHO): select to_timestamp('AN', 'AM'); ERROR: invalid AM/PM string select to_timestamp('11:47 PM 27 Sep a2008', 'HH:MI PM DD Mon '); ERROR: invalid value for in source string Now arguably most to_timestamp calls are going to be short, so I can easily look for the in my string and see what I did wrong (and DETAIL: provides Value must be an integer in the second case)... So really maybe that's good enough, I dunno... Personally I find the output of my dumb patch to be better than both above: select to_timestamp('AN', 'AM'); ERROR: invalid AM/PM string for 'AN' And we could improve that by only showing node->key->len chars, and make that work for both AM/PM and the others, etc. Then again maybe it's not worth it? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Meridiem markers (was: [BUGS] Incorrect invalid AM/PM string error from to_timestamp)
Alex Hunsaker [EMAIL PROTECTED] writes: However that still leaves the original complaint around (at least IMHO): select to_timestamp('AN', 'AM'); ERROR: invalid AM/PM string select to_timestamp('11:47 PM 27 Sep a2008', 'HH:MI PM DD Mon '); ERROR: invalid value for in source string Yeah, it would be a lot better if it said ERROR: invalid value for : a2008 The DETAIL is good too, but it's no substitute for showing the exact substring that the code is unhappy about. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
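A standalone sketch of the kind of message construction being asked for: quote just the substring the parser choked on, together with the format keyword it was parsing. In the backend this would go through ereport()/errmsg(); the helper below and its arguments are invented for illustration only.

    #include <stdio.h>
    #include <string.h>

    /* Sketch: format an error string that shows the offending part of the
     * input next to the format keyword it was being parsed for. */
    static void
    format_bad_field_error(const char *src, int field_len, const char *keyword,
                           char *buf, size_t buflen)
    {
        char bad[32];
        int  n = field_len;

        if (n > (int) sizeof(bad) - 1)
            n = (int) sizeof(bad) - 1;
        memcpy(bad, src, n);
        bad[n] = '\0';
        snprintf(buf, buflen, "invalid value \"%s\" for \"%s\"", bad, keyword);
    }

    int
    main(void)
    {
        char msg[128];

        format_bad_field_error("a2008", 5, "YYYY", msg, sizeof(msg));
        puts(msg);          /* invalid value "a2008" for "YYYY" */
        return 0;
    }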
Re: [HACKERS] lock contention on parallel COPY ?
Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: samples % symbol name 55526 16.5614 LWLockAcquire 29721 8.8647 DoCopy 26581 7.9281 CopyReadLine 25105 7.4879 LWLockRelease 15743 4.6956 PinBuffer 14725 4.3919 heap_formtuple Probably loading a table with a generated PK or loading data in ascending sequence, so it's contending heavily for the rightmost edge of the index. No, given that DoCopy and CopyReadLine are right up there, I think we're still looking at the COPY phase, not index building. The profile will probably change completely once index building starts... yeah this profile is only showing the COPY phase ... Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] lock contention on parallel COPY ?
On Fri, 2008-09-26 at 14:00 -0400, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: samples % symbol name 55526 16.5614 LWLockAcquire 29721 8.8647 DoCopy 26581 7.9281 CopyReadLine 25105 7.4879 LWLockRelease 15743 4.6956 PinBuffer 14725 4.3919 heap_formtuple Probably loading a table with a generated PK or loading data in ascending sequence, so it's contending heavily for the rightmost edge of the index. No, given that DoCopy and CopyReadLine are right up there, I think we're still looking at the COPY phase, not index building. The profile will probably change completely once index building starts... Sorry, was assuming we were loading with indexes on, which is wrong. Agree the profile looks odd. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] About the parameter of API: PQprepared
On Fri, Sep 26, 2008 at 2:45 AM, iihero [EMAIL PROTECTED] wrote: In libpq, the definition is like: PGresult * PQprepare(PGconn *conn, const char *stmtName, const char *query, int nParams, const Oid *paramTypes) Could we remove the parameter nParams? e.g. insert into foo(id, name, address) values ($1, $2, $3) PostgreSQL possibly can parse the prepared sql statement to get the real paramters count. Or, is there another alternate way? if you are looking for an easier way to do things like prepared statements, etc. over libpq, you may want to look at libpqtypes (it's an external library that will require a patched libpq for versions < 8.4): http://libpqtypes.esilo.com/ merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
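For reference, a minimal sketch of how the current API is used: the caller declares the number of $n parameters up front and later binds exactly that many values. The connection string, table, and statement name below are assumed for illustration (compile with -lpq).

    #include <stdio.h>
    #include <libpq-fe.h>

    int
    main(void)
    {
        /* Connection string and table "foo" are assumed to exist */
        PGconn     *conn = PQconnectdb("dbname=postgres");
        const char *values[3] = {"1", "Alice", "Somewhere"};
        PGresult   *res;

        if (PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            PQfinish(conn);
            return 1;
        }

        /* nParams = 3 matches the three $n placeholders; paramTypes = NULL
         * lets the server infer the parameter types */
        res = PQprepare(conn, "ins_foo",
                        "insert into foo(id, name, address) values ($1, $2, $3)",
                        3, NULL);
        if (PQresultStatus(res) != PGRES_COMMAND_OK)
            fprintf(stderr, "prepare failed: %s", PQerrorMessage(conn));
        PQclear(res);

        /* The same parameter count is supplied again at execution time */
        res = PQexecPrepared(conn, "ins_foo", 3, values, NULL, NULL, 0);
        if (PQresultStatus(res) != PGRES_COMMAND_OK)
            fprintf(stderr, "execute failed: %s", PQerrorMessage(conn));
        PQclear(res);

        PQfinish(conn);
        return 0;
    }

Since paramValues is a bare array with no terminator, the explicit count also tells libpq how many values to marshal; dropping nParams from PQprepare alone would still leave it required at execution time.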
Re: [HACKERS] lock contention on parallel COPY ?
On Fri, 2008-09-26 at 20:07 +0200, Stefan Kaltenbrunner wrote: Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: samples %symbol name 5552616.5614 LWLockAcquire 29721 8.8647 DoCopy 26581 7.9281 CopyReadLine 25105 7.4879 LWLockRelease 15743 4.6956 PinBuffer 14725 4.3919 heap_formtuple Probably loading a table with a generated PK or loading data in ascending sequence, so its contending heavily for the rightmost edge of the index. No, given that DoCopy and CopyReadLine are right up there, I think we're still looking at the COPY phase, not index building. The profile will probably change completely once index building starts... yeah this profile is only showing the COPY phase ... Try using this Postgres core patch. It's a updated version of my fast_copy.v4.patch from Patches: Bulk Insert tuning 20 Mar 2008 -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support Index: src/backend/access/heap/heapam.c === RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/access/heap/heapam.c,v retrieving revision 1.263 diff -c -r1.263 heapam.c *** src/backend/access/heap/heapam.c 11 Sep 2008 14:01:09 - 1.263 --- src/backend/access/heap/heapam.c 26 Sep 2008 19:16:27 - *** *** 1728,1733 --- 1728,1748 } } + /* + * Begin/End Bulk Inserts + * + */ + void + heap_begin_bulk_insert(void) + { + ReleaseBulkInsertBufferIfAny(); + } + + void + heap_end_bulk_insert(void) + { + ReleaseBulkInsertBufferIfAny(); + } /* * heap_insert - insert tuple into a heap *** *** 1755,1765 */ Oid heap_insert(Relation relation, HeapTuple tup, CommandId cid, ! bool use_wal, bool use_fsm) { TransactionId xid = GetCurrentTransactionId(); HeapTuple heaptup; Buffer buffer; if (relation-rd_rel-relhasoids) { --- 1770,1781 */ Oid heap_insert(Relation relation, HeapTuple tup, CommandId cid, ! bool use_wal, bool use_fsm, bool bulk_insert_request) { TransactionId xid = GetCurrentTransactionId(); HeapTuple heaptup; Buffer buffer; + bool bulk_insert = bulk_insert_request !relation-rd_istemp; if (relation-rd_rel-relhasoids) { *** *** 1812,1820 else heaptup = tup; ! /* Find buffer to insert this tuple into */ ! buffer = RelationGetBufferForTuple(relation, heaptup-t_len, ! InvalidBuffer, use_fsm); /* NO EREPORT(ERROR) from here till changes are logged */ START_CRIT_SECTION(); --- 1828,1845 else heaptup = tup; ! /* ! * Find buffer to insert this tuple into ! */ ! if (bulk_insert) ! { ! buffer = RelationGetBufferForTuple(relation, heaptup-t_len, ! GetBulkInsertBuffer(), use_fsm, true); ! SetBulkInsertBuffer(buffer); ! } ! else ! buffer = RelationGetBufferForTuple(relation, heaptup-t_len, ! InvalidBuffer, use_fsm, false); /* NO EREPORT(ERROR) from here till changes are logged */ START_CRIT_SECTION(); *** *** 1893,1899 END_CRIT_SECTION(); ! UnlockReleaseBuffer(buffer); /* * If tuple is cachable, mark it for invalidation from the caches in case --- 1918,1930 END_CRIT_SECTION(); ! /* ! * Keep buffer pinned if we are in bulk insert mode ! */ ! if (bulk_insert) ! LockBuffer(buffer, BUFFER_LOCK_UNLOCK); ! else ! UnlockReleaseBuffer(buffer); /* * If tuple is cachable, mark it for invalidation from the caches in case *** *** 1930,1936 Oid simple_heap_insert(Relation relation, HeapTuple tup) { ! return heap_insert(relation, tup, GetCurrentCommandId(true), true, true); } /* --- 1961,1967 Oid simple_heap_insert(Relation relation, HeapTuple tup) { ! 
return heap_insert(relation, tup, GetCurrentCommandId(true), true, true, false); } /* *** *** 2553,2559 { /* Assume there's no chance to put heaptup on same page. */ newbuf = RelationGetBufferForTuple(relation, heaptup-t_len, ! buffer, true); } else { --- 2584,2590 { /* Assume there's no chance to put heaptup on same page. */ newbuf = RelationGetBufferForTuple(relation, heaptup-t_len, ! buffer, true, false); } else { *** *** 2570,2576 */ LockBuffer(buffer, BUFFER_LOCK_UNLOCK); newbuf = RelationGetBufferForTuple(relation, heaptup-t_len, ! buffer, true); } else { --- 2601,2607 */ LockBuffer(buffer, BUFFER_LOCK_UNLOCK); newbuf = RelationGetBufferForTuple(relation, heaptup-t_len, ! buffer, true, false); } else { Index: src/backend/access/heap/hio.c === RCS
Re: [HACKERS] About the parameter of API: PQprepared
Merlin Moncure wrote: On Fri, Sep 26, 2008 at 2:45 AM, iihero [EMAIL PROTECTED] wrote: In libpq, the definition is like: PGresult * PQprepare(PGconn *conn, const char *stmtName, const char *query, int nParams, const Oid *paramTypes) Could we remove the parameter nParams? e.g. insert into foo(id, name, address) values ($1, $2, $3) PostgreSQL possibly can parse the prepared sql statement to get the real paramters count. Or, is there another alternate way? if you are looking for easier way to do things like prepared statements, etc. over libpq, you may want to look at libpqtypes (it's a external library that will require a patched libpq for versions 8.4): http://libpqtypes.esilo.com/ merlin Sounds like you could make use of libpqtypes, which abstracts you from the standard parameterized API. Latest version of libpqtypes is finally online (v1.2a): http://pgfoundry.org/frs/?group_id=1000370 NOTE: requires using cvs head because libpqtypes needs the new libpq-events feature. -- Andrew Chernow eSilo, LLC every bit counts http://www.esilo.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parallel pg_restore - WIP patch
Andrew Dunstan wrote: This version of the patch should fix the shared file descriptor bug Russell Smith noticed. It also disables the 1/2 second sleep between forks, so the performance on a small db (regression) is vastly improved. this works better but there is something fishy still - using the same dump file I get a proper restore using pg_restore normally. If I however use -m for a parallel one I only get parts (in this case only 243 of the 709 tables) of the database restored ... Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parallel pg_restore - WIP patch
Stefan Kaltenbrunner wrote: Andrew Dunstan wrote: This version of the patch should fix the shared file descriptor bug Russell Smith noticed. It also disables the 1/2 second sleep between forks, so the performance on a small db (regression) is vastly improved. this works better but there is something fishy still - using the same dump file I get a proper restore using pg_restore normally. If I however use -m for a parallel one I only get parts (in this case only 243 of the 709 tables) of the database restored ... Yes, there are several funny things going on, including some stuff with dependencies. I'll have a new patch tomorrow with luck. Thanks for testing. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parallel pg_restore - WIP patch
On Fri, 26 Sep 2008 17:10:44 -0400 Andrew Dunstan [EMAIL PROTECTED] wrote: Yes, there are several funny things going on, including some stuff with dependencies. I'll have a new patch tomorrow with luck. Thanks for testing. O.k. I took a look at the patch itself and although I don't understand all of it there were a couple of red flags to me: + if (ropt->create) + die_horribly(AH, modulename, + "parallel restore is incompatible with --create\n"); + This seems like an odd limitation. In my mind, the schema would not be restored in parallel. The schema before data would restore as a single thread. Even the largest schemas would only take minutes (if that). Thus something like --create should never be a problem. I also noticed you check if we have zlib? Is it even possible to use the c format without it? (that would be new to me). I noticed this line: + while((next_work_item = get_next_work_item(AH)) != NULL) + { + /* XXX need to improve this test in case there is no table data */ + /* need to test for indexes, FKs, PK, Unique, etc */ + if(strcmp(next_work_item->desc, "TABLE DATA") == 0) + break; + (void) _restore_one_te(AH, next_work_item, ropt, false); + + next_work_item->prestored = true; + + _reduce_dependencies(AH,next_work_item); + } Instead of the TABLE DATA compare, perhaps it makes sense to back patch pg_dump to have a line delimiter in the TOC? That way even if there is no TABLE DATA there would be a delimiter that says: --- BEGIN TABLE DATA --- END TABLE DATA Thus if nothing is there... nothing is there? + /* delay just long enough between forks to give the catalog some +* breathing space. Without this sleep I got +* tuple concurrently updated errors. +*/ + pg_usleep(50); + continue; /* in case the slots are not yet full */ + } Could that be solved with a lock instead? Once the lock is released Anyway... just some thoughts. I apologize if I misunderstood the patch. Sincerely, Joshua D. Drake cheers andrew -- The PostgreSQL Company since 1997: http://www.commandprompt.com/ PostgreSQL Community Conference: http://www.postgresqlconference.org/ United States PostgreSQL Association: http://www.postgresql.us/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
Dear colleagues, I said earlier I'd ask around about some of the literature on security controls vs. database accessibility and side channels. I did, and I heard back. One person told me that this conference often has things on this topic: http://www.ieee-security.org/TC/SP-Index.html From my brief glimpse of the TOCs from the proceedings, as well as some spelunking in the ACM guide, it seems to me that some people have already worked out what ought to happen in many of these cases, and all we need to do is write down what we think ought to happen for the various use cases. I note in particular that an awful lot of work seems to be coming out of the health care sector in this area. That strikes me as at least as good a guide as national security concerns, and anything that one might want to do probably ought to be able to cope with at least those two caricatures of use cases. I also found a 2007 doctoral thesis by Azhar Rauf, Colorado Technical University, _A tradeoff analysis between data accessibility and inference control for row, column, and cell level security in relational databases_. The title and abstract make me think it might be worth looking at. Hope this is helpful, A -- Andrew Sullivan [EMAIL PROTECTED] +1 503 667 4564 x104 http://www.commandprompt.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
KaiGai Kohei wrote: SE-PostgreSQL needs to be good enough for the NSA, but row-level security in general does not. BTW, it seems to me someone misunderstand I works as an agent of NSA. Is it a humor, itn't it? I've paid my effort to improve the security of open source software, not for overseas intelligence agency. The comment was not directed at you. The comment was that SE-Linux was perhaps designed by the NSA, so an implementation matching SE-Linux will be good enough for the NSA. The comment goes on to say that we might want something better or more consistent than the NSA requirements. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
Tom Lane wrote: Robert Haas [EMAIL PROTECTED] writes: I think you have to resign yourself to the fact that a user who can see only a subset of the rows in a table may very well see apparent foreign-key violations. But so what? So you're leaking information about the rows that they're not supposed to be able to see. This is not what I would call national-security-grade information hiding --- leastwise *I* certainly wouldn't store nuclear weapon design information in such a database. The people that the NSA wants to defend against are more than smart enough, and persistent enough, to extract information through such loopholes. I can't escape the lurking suspicion that some bright folk inside the NSA have spent years thinking about this and have come up with some reasonably self-consistent definition of row hiding in a SQL database. But have they published it where we can find it? I am confused how knowing that a sequence number used for a primary key exists or doesn't exist is leaking _meaningful_ information. People might know the sequence number exists, but how is that information useful. Now, if natural keys are used, that is a different story. I am, of course, supportive of digging deeper to find the best possible behavior. I am also supportive of making row-level security an SQL-level feature that can be used beyond SE-Linux, and will allow the feature to be tested on all platforms. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parallel pg_restore - WIP patch
Joshua Drake wrote: On Fri, 26 Sep 2008 17:10:44 -0400 Andrew Dunstan [EMAIL PROTECTED] wrote: Yes, there are several funny things going on, including some stuff with dependencies. I'll have a new patch tomorrow with luck. Thanks for testing. O.k. I took a look at the patch itself and although I don't understand all of it there were a couple of red flags to me: + if (ropt->create) + die_horribly(AH, modulename, + "parallel restore is incompatible with --create\n"); + This seems like an odd limitation. In my mind, the schema would not be restored in parallel. The schema before data would restore as a single thread. Even the largest schemas would only take minutes (if that). Thus something like --create should never be a problem. Originally I had everything restoring in parallel. Now I am in fact (as the patch should have shown you) restoring the first part in a single thread like you say. Thus I probably can relax that restriction. I will look and see. I also noticed you check if we have zlib? Is it even possible to use the c format without it? (that would be new to me). I noticed this line: + while((next_work_item = get_next_work_item(AH)) != NULL) + { + /* XXX need to improve this test in case there is no table data */ + /* need to test for indexes, FKs, PK, Unique, etc */ + if(strcmp(next_work_item->desc, "TABLE DATA") == 0) + break; + (void) _restore_one_te(AH, next_work_item, ropt, false); + + next_work_item->prestored = true; + + _reduce_dependencies(AH,next_work_item); + } Instead of the TABLE DATA compare, perhaps it makes sense to back patch pg_dump to have a line delimiter in the TOC? That way even if there is no TABLE DATA there would be a delimiter that says: --- BEGIN TABLE DATA --- END TABLE DATA Thus if nothing is there... nothing is there? The TOC isn't stored as a text file. So we'll need to look by entry tags. It's no big deal - there aren't a huge number. + /* delay just long enough between forks to give the catalog some + * breathing space. Without this sleep I got + * tuple concurrently updated errors. +*/ + pg_usleep(50); + continue; /* in case the slots are not yet full */ + } Could that be solved with a lock instead? Once the lock is released That sleep is now gone. Anyway... just some thoughts. I apologize if I misunderstood the patch. No problem. Thanks for looking. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
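To make the dependency handling a bit more concrete, here is a toy standalone sketch of the scheduling idea: an item becomes runnable once all of its prerequisites have been restored, and finishing an item decrements the counters of its dependents. The struct and function names are simplified stand-ins, not the real pg_backup_archiver.c code.

    #include <stdbool.h>
    #include <stdio.h>

    /* Toy TOC entry: each item tracks how many prerequisites are still
     * outstanding; a worker may pick it up once that count hits zero. */
    typedef struct TocItem
    {
        const char      *desc;
        int              deps_remaining;
        bool             restored;
        struct TocItem  *next;
    } TocItem;

    static TocItem *
    get_next_work_item(TocItem *head)
    {
        for (TocItem *te = head; te != NULL; te = te->next)
            if (!te->restored && te->deps_remaining == 0)
                return te;
        return NULL;                /* nothing runnable right now */
    }

    static void
    reduce_dependencies(TocItem *finished, TocItem **dependents, int ndeps)
    {
        finished->restored = true;
        for (int i = 0; i < ndeps; i++)
            if (dependents[i]->deps_remaining > 0)
                dependents[i]->deps_remaining--;
    }

    int
    main(void)
    {
        TocItem  data  = {"TABLE DATA", 0, false, NULL};
        TocItem  index = {"INDEX", 1, false, NULL};
        TocItem *deps_of_data[] = {&index};

        data.next = &index;

        TocItem *te = get_next_work_item(&data);     /* picks TABLE DATA */
        printf("restoring %s\n", te->desc);
        reduce_dependencies(te, deps_of_data, 1);

        te = get_next_work_item(&data);              /* now the index is runnable */
        printf("restoring %s\n", te->desc);
        return 0;
    }

The bug being reported (only part of the tables restored) would correspond, in this toy model, to items whose counters never reach zero and are therefore never picked up.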
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
Bruce Momjian [EMAIL PROTECTED] writes: I am confused how knowing that a sequence number used for a primary key exists or doesn't exist is leaking _meaningful_ information. People might know the sequence number exists, but how is that information useful. Now, if natural keys are used, that is a different story. Right. It might be that securing a database requires not just some security mechanisms but also some database design rules (like don't allow foreign keys except on synthetic IDs). But it seems to me that we are just flailing around in the dark because we don't have that bigger picture of how the features would actually get used. The literature pointers that Andrew just gave us seem promising to me. Who's going to go searching for some useful info? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PostgreSQL future ideas
[EMAIL PROTECTED] (Andrew Dunstan) writes: A.M. wrote: Speaking of language choice, no one said that _all_ the source code would need to be rewritten. It would be nice, for example, if PostgreSQL rewrote the current GUC system with a glue language like Lua (which is also very C-like). No it wouldn't. All it would mean is that you'd need developers fluent in both languages. I expect it would be both a little better *and* a little worse than that. On the better side, I don't expect that, in this instance, there would be terribly much need for anything but the shallowest understanding of Lua. If this were all there was to it, I'd contend that there's little to object to. However, there's a pretty considerable worse side, namely that developers would need to understand the API for interfacing between the bits of C that are the Lua 'external interface' and how that gets plumbed into PostgreSQL. *That's* got very little to do with language, per se; it has to do with the implementation of the language. -- let name=cbbrowne and tld=acm.org in String.concat @ [name;tld];; http://linuxdatabases.info/info/oses.html Real concurrency---in which one program actually continues to function while you call up and use another---is more amazing but of small use to the average person. How many programs do you have that take more than a few seconds to perform any task? -- New York Times, 4/25/89 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in ILIKE?
Tom Lane wrote: I think it's just a bug in 8.3. Well, here's a patch that I think fixes it. If you're happy I'll apply it to HEAD and 8.3. cheers andrew ? .deps Index: like_match.c === RCS file: /cvsroot/pgsql/src/backend/utils/adt/like_match.c,v retrieving revision 1.21 diff -c -r1.21 like_match.c *** like_match.c 1 Mar 2008 03:26:34 - 1.21 --- like_match.c 26 Sep 2008 22:38:26 - *** *** 96,105 { if (*p == '\\') { ! /* Next byte must match literally, whatever it is */ NextByte(p, plen); ! if ((plen <= 0) || *p != *t) return LIKE_FALSE; } else if (*p == '%') { --- 96,108 { if (*p == '\\') { ! /* Next char must match literally, whatever it is */ NextByte(p, plen); ! if ((plen <= 0) || TCHAR(*p) != TCHAR(*t)) return LIKE_FALSE; + NextChar(t,tlen); + NextChar(p,plen); + continue; } else if (*p == '%') { -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in ILIKE?
Andrew Dunstan [EMAIL PROTECTED] writes: Tom Lane wrote: I think it's just a bug in 8.3. Well, here's a patch that I think fixes it. If you're happy I'll apply it to HEAD and 8.3. That patch isn't gonna apply to HEAD ;-). Also the introduction of NextChar is simply broken, as it will skip additional bytes of a multibyte char without having compared 'em. All you need AFAICS is to put TCHAR()s into the *p != *t comparison, so that it matches the case for ordinary characters. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] allow has_table_privilege(..., 'usage') on sequences
At 2008-09-22 12:54:34 -0500, [EMAIL PROTECTED] wrote: can we tell there is consensus in create a new has_sequence_privilege()? Abhijit will you make it? if not i can make a try... Yes, I'll do it. -- ams -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in ILIKE?
Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Tom Lane wrote: I think it's just a bug in 8.3. Well, here's a patch that I think fixes it. If you're happy I'll apply it to HEAD and 8.3. That patch isn't gonna apply to HEAD ;-). Also the introduction of NextChar is simply broken, as it will skip additional bytes of a multibyte char without having compared 'em. All you need AFAICS is to put TCHAR()s into the *p != *t comparison, so that it matches the case for ordinary characters. I'll have another look. What happens in that case though if you have escape+x where x is a multibyte char? cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in ILIKE?
Andrew Dunstan [EMAIL PROTECTED] writes: I'll have another look. What happens in that case though if you have escape+x where x is a multibyte char? TCHAR()'s just a no-op in multibyte encodings --- we handle ILIKE using a preliminary downcasing pass in that case. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
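A standalone illustration of the single-byte case under discussion: after a backslash, the next pattern byte must match the next text byte literally, but for ILIKE the comparison still has to be case-folded the same way ordinary bytes are, which is what wrapping both operands in TCHAR() achieves. The helper below is only an illustration, not the actual like_match.c code.

    #include <ctype.h>
    #include <stdbool.h>
    #include <stdio.h>

    /* Compare one escaped pattern byte against one text byte.  For ILIKE in a
     * single-byte encoding both sides are folded before comparison; for LIKE
     * they are compared as-is. */
    static bool
    escaped_byte_matches(unsigned char pattern_byte, unsigned char text_byte,
                         bool case_insensitive)
    {
        if (case_insensitive)
            return tolower(pattern_byte) == tolower(text_byte);
        return pattern_byte == text_byte;
    }

    int
    main(void)
    {
        printf("%d\n", escaped_byte_matches('G', 'g', true));   /* 1: ILIKE matches */
        printf("%d\n", escaped_byte_matches('G', 'g', false));  /* 0: LIKE does not */
        return 0;
    }

In multibyte encodings the downcasing happens in a preliminary pass over both strings, so the literal byte comparison there needs no folding at all.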
Re: [HACKERS] Bug in ILIKE?
Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: I'll have another look. What happens in that case though if you have escape+x where x is a multibyte char? TCHAR()'s just a no-op in multibyte encodings --- we handle ILIKE using a preliminary downcasing pass in that case. Ah. Of course. Will fix. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Meridiem markers (was: [BUGS] Incorrect invalid AM/PM string error from to_timestamp)
On Sat, Sep 27, 2008 at 4:08 AM, Tom Lane [EMAIL PROTECTED] wrote: Alex Hunsaker [EMAIL PROTECTED] writes: However that still leaves the original complaint around (at least IMHO): select to_timestamp('AN', 'AM'); ERROR: invalid AM/PM string select to_timestamp('11:47 PM 27 Sep a2008', 'HH:MI PM DD Mon '); ERROR: invalid value for in source string Yeah, it would be a lot better if it said ERROR: invalid value for : a2008 The DETAIL is good too, but it's no substitute for showing the exact substring that the code is unhappy about. I agree. And with my proposed changes to the meridiem/era marker code, it will be much easier to improve the error messages in a consistent way. I'll work on a patch for the Nov commitfest. Cheers, BJ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
The literature pointers that Andrew just gave us seem promising to me. Who's going to go searching for some useful info? I couldn't find much on the pages linked directly from the link Andrew sent, but a Google search for site:ieee-security.org database turned up the following: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.48.2185 A MAC policy framework for multilevel relational databases by Xiaolei Qian, Teresa F. Lunt I haven't read through this in detail yet, but it appears to be highly relevant to Tom's concerns about referential integrity. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
On Fri, Sep 26, 2008 at 06:15:46PM -0400, Bruce Momjian wrote: I am confused how knowing that a sequence number used for a primary key exists or doesn't exist is leaking _meaningful_ information. This sort of side-channel intelligence is _exactly_ how certain kinds of security exploits work: I'm not supposed to know that _x_ exists; but by knowing key-of-_x_, I learn that _x_ exists. From existence, I can infer something, and from that inference I construct an attack that was supposed to be forestalled by the access controls. I am by no means a security expert, but I know enough about the area to know that it is very hard to get right, and that seemingly insignificant flaws in design turn out to be major vulnerabilities. To speak about something I do know about, when DNS was designed, nobody could have imagined that the widespread availability of recursion would turn out to be a flaw. Today, it turns out that open recursion can be used in an attack that magnifies the attacker's outbound traffic by many orders of magnitude. This sort of surprise side effect is why I am so anxious that something advertised as a security system fit really well with the proposed use cases. A -- Andrew Sullivan [EMAIL PROTECTED] +1 503 667 4564 x104 http://www.commandprompt.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: I am confused how knowing that a sequence number used for a primary key exists or doesn't exist is leaking _meaningful_ information. People might know the sequence number exists, but how is that information useful. Now, if natural keys are used, that is a different story. Right. It might be that securing a database requires not just some security mechanisms but also some database design rules (like don't allow foreign keys except on synthetic IDs). But it seems to me that we are just flailing around in the dark because we don't have that bigger picture of how the features would actually get used. The literature pointers that Andrew just gave us seem promising to me. Who's going to go searching for some useful info? I found this paper from 1996: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.33.5950 Full PDF at link in right column. The interesting chapters are chapter 3, that talks about ENTITY AND REFERENTIAL INTEGRITY IN MLS DATABASES and chapter 4, COVERT CHANNELS. It mentions polyinstantiation: These security considerations have led to the notion of polyinstantiation [Denning 87]. Polyinstantiation forces a relation to contain multiple tuples with the same primary key but distinguishable by their classification levels or by the non-primary key attributes of the relation [Lunt 91]. which I think we want to avoid. It also talks about cases where the primary and foreign key rows have identical or different security settings. It talks about COVERT CHANNELS, which is information leaking. And it mentions TCSEC (Trusted Computer System Evaluation Criteria): http://en.wikipedia.org/wiki/Trusted_Computer_System_Evaluation_Criteria which I think is the proper term for the security target we are trying to address, or at least one of the targets. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches
Tom Lane wrote: Andrew Sullivan [EMAIL PROTECTED] writes: The above point, and other similar ones in every discussion of the proposed functionality, makes me think once again either that the requirements for this feature aren't understood by everyone, or else that they're not actually explicit enough. I have a feeling it's the latter. Yeah, I think that's exactly the problem here: we've got this large patch and no agreement on just what requirements it's supposed to meet. Perhaps others see it differently, but I feel like I'm being told that whatever the patch does is the right thing by definition ... and yet it doesn't seem to meet what I would think are the likely requirements of the users who might actually want such features. Indeed, I may not have stated the motivation and the goal/target of SE-PostgreSQL explicitly. As I have repeated several times, SE-PostgreSQL applies the security policy of SELinux to achieve consistency in access controls. This feature makes it possible to restrict a client's privileges on accesses to database objects, just as when it accesses filesystem objects. Its background is our nightmare with web application flaws. The major purpose of this feature is to provide the most important component needed to run enterprise-class web applications with a least-privilege set that is consistent across the whole system. I know there are various opinions about SELinux, but it has succeeded in running server applications with least privilege and in preventing attacks from malicious ones. However, web applications have some characteristics different from server applications. The LAPP software stack is a very popular and widely accepted architecture for providing web services, but I had a concern about it from a security viewpoint. The LAPP stack is constructed from four components, and each of them provides its own security mechanism. For example, the OS (Linux) provides filesystem permissions and SELinux, the RDBMS (PostgreSQL) provides database ACLs, and so on. In addition, most web applications store their data in databases, not only in filesystems. This prevents web applications from working with a least-privilege set, because in-kernel SELinux cannot trap accesses to database objects. In this situation we have to set up database ACLs and choose a proper database role, but there is no explicit relationship between database roles and the least-privilege set (called a security context), so a web application can/must select an arbitrary one. In my vision, Apache assigns its content handler an individual security context based on HTTP authentication, source IP address and so on, just before the web application is invoked. Because the web application works with an individual least-privilege set, its accesses to the filesystem are restricted by the security policy. In a similar way, its accesses to the database are also restricted via SE-PostgreSQL by the same policy, with the same privilege set. (Please note that SE-PostgreSQL obtains the security context of the peer process using the getpeercon() API provided by SELinux.) This architecture makes it possible to minimize security-sensitive code, like opening the database connection. If we adopt this approach, a web application can also select a database role, but the outline of its privileges is defined by the security policy. In recent years, web application flaws have been a nightmare for us. A recent report said that 95% of the significant incidents in Japan in the first half of 2008 were related to web applications, and 75% of them were SQL injection.
My ultimate goal is to help this situation with mandatory access control and a least-privilege set for the whole LAPP stack. The fine-grained access controls are provided for practical advantage. One requirement I received is to apply SE-PostgreSQL as the backend of a medical information system shared by several small hospitals. An electronic chart has a common format, so it is a simple design to store the charts within a single table, with individual tuples carrying individual access rights so that a chart is only exposed to the patient's doctor and medical staff. The actual origin of the fine-grained access controls is a sense of rivalry towards the prior commercial security-aware RDBMSs (like Oracle Label Security and IBM DB2); however, they are necessary in order to make the access controls useful. If we don't have fine-grained controls, any web application has to have privileges to access all patients' charts, and is vulnerable to SQL injection. It is a real practical advantage, I think. Sorry for my long description. Thanks, Agreeing on the requirements seems like a necessary condition for arriving at any consensus on a patch. Where can we get some evidence that would convince everyone that the requirements for a highly secure database are X, Y and Z? regards, tom lane -- KaiGai Kohei [EMAIL PROTECTED] -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers