Re: [HACKERS] mapping object names to role IDs
* Robert Haas (robertmh...@gmail.com) wrote:
> Long story short, this is kind of a mess.

I agree that it's a bit of a mess.

> What I would propose is that we create a new source file somewhere
> (maybe utils/cache), move all of the other functions of this type
> there, give them standardized names, and provide them all with an
> argument that specifies whether an error is to be thrown if the object
> doesn't exist.

Something which has come up in the past is that putting all the
functions that do the same kind of thing, but operate on different types
of objects, into the same backend file/area ends up meaning that such an
area has an untold amount of knowledge about everything. Additionally,
what *does* go into said area has tons of things that are only related
by the kind of operation, not because they actually have anything to do
with each other. This was one of the complaints levied at the approach
of moving all the ACL checking into one place.

I think it would be good to have a consistent naming/calling scheme for
these various functions, but I'm not sure that moving them all to the
same place makes sense.

	Thanks,

		Stephen
Re: [HACKERS] mapping object names to role IDs
Stephen Frost sfr...@snowman.net writes:
> * Robert Haas (robertmh...@gmail.com) wrote:
>> Long story short, this is kind of a mess.

> ... I think it would be good to have a consistent naming/calling
> scheme for these various functions, but I'm not sure that moving them
> all to the same place makes sense.

I'm with Stephen on this one. I agree that standardizing the function
names and behavior would be a good idea, but don't try to put them all
in one place.

BTW, the plain-name cases should be const char *, else some callers
will have to cast away const. You could possibly make an argument for
const List * in the qualified-name cases, but we don't do that anywhere
else so I think it'd just look funny here (and would require internally
casting away const, too).

			regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] mapping object names to role IDs
On Sun, May 23, 2010 at 11:10 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> I'm with Stephen on this one. I agree that standardizing the function
> names and behavior would be a good idea, but don't try to put them all
> in one place.

Some of the existing functions are in lsyscache.c, some are in files in
the commands directory, and some are in files in the parser directory;
also, even between commands and parser, not every object type has its
own file. It would be nice to bring some consistency to where the
functions are located as well as what they do. Any thoughts on how to
achieve that?

> BTW, the plain-name cases should be const char *, else some callers
> will have to cast away const. You could possibly make an argument for
> const List * in the qualified-name cases, but we don't do that
> anywhere else so I think it'd just look funny here (and would require
> internally casting away const, too).

Makes sense.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] [PATCH] Move 'long long' check to c.h
Stephen Frost sfr...@snowman.net writes:
> While reviewing bfba40e2c7b3909d3de13bd1b83b7e85fa8dfec2 (mmm, we like
> git diff -p), I noted that c.h is already included by both extern.h
> and ecpg.header through postgres_fe.h. Given this and that we're
> already doing a lot of similar #define's there (unlike in those other
> files), I felt c.h was a more appropriate place. Putting it in c.h
> also means we don't have to duplicate that code.

Ugh. Moving that to c.h doesn't render it not junk code. (For one
thing, it will not operate as intended if you haven't previously
#included limits.h, which in fact is not included in c.h.) If we need
this we should do it properly with autoconf.

			regards, tom lane
Re: [HACKERS] mapping object names to role IDs
Robert Haas robertmh...@gmail.com writes:
> On Sun, May 23, 2010 at 11:10 AM, Tom Lane t...@sss.pgh.pa.us wrote:
>> I'm with Stephen on this one. I agree that standardizing the function
>> names and behavior would be a good idea, but don't try to put them
>> all in one place.

> Some of the existing functions are in lsyscache.c, some are in files
> in the commands directory, and some are in files in the parser
> directory; also, even between commands and parser, not every object
> type has its own file. It would be nice to bring some consistency to
> where the functions are located as well as what they do. Any thoughts
> on how to achieve that?

I think both Stephen and I are saying we don't see merit in that.
Moving around pre-existing functions won't accomplish much except
causing include-list churn. Let's just standardize the names/APIs and
be done.

			regards, tom lane
Re: [HACKERS] [PATCH] Move 'long long' check to c.h
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Ugh. Moving that to c.h doesn't render it not junk code. (For one
> thing, it will not operate as intended if you haven't previously
> #included limits.h, which in fact is not included in c.h.)

Doh.

> If we need this we should do it properly with autoconf.

My autoconf foo is not very good, but once I finish a couple of other
things I'll take a shot at doing it that way.

	Thanks!

		Stephen
Re: [HACKERS] beta testing - planner bug - ERROR: XX000: failed to build any 2-way joins
Pavel Stehule pavel.steh...@gmail.com writes:
> postgres=# explain SELECT name FROM person WHERE name IN (
>     SELECT name FROM person p
>     LEFT JOIN person_data ON p.id = person_data.id);
> ERROR:  failed to build any 2-way joins

Fixed, thanks.

			regards, tom lane
Re: [HACKERS] beta testing - pg_upgrade bug fix - double free
Pavel Stehule pavel.steh...@gmail.com writes:
> --- a/contrib/pg_upgrade/check.c
> +++ b/contrib/pg_upgrade/check.c
> @@ -154,7 +154,6 @@ issue_warnings(migratorContext *ctx, char *sequence_script_file_name)
>  			  ctx->new.bindir, ctx->new.port, sequence_script_file_name,
>  			  ctx->logfile);
>  	unlink(sequence_script_file_name);
> -	pg_free(sequence_script_file_name);
>  	check_ok(ctx);
>  }

Done, thanks.

			regards, tom lane
Re: [HACKERS] mapping object names to role IDs
On Sun, May 23, 2010 at 11:30 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> I think both Stephen and I are saying we don't see merit in that.
> Moving around pre-existing functions won't accomplish much except
> causing include-list churn. Let's just standardize the names/APIs and
> be done.

Where do we put the new functions?

-- 
Robert Haas
Re: [HACKERS] mapping object names to role IDs
Robert Haas robertmh...@gmail.com writes:
> On Sun, May 23, 2010 at 11:30 AM, Tom Lane t...@sss.pgh.pa.us wrote:
>> I think both Stephen and I are saying we don't see merit in that.
>> Moving around pre-existing functions won't accomplish much except
>> causing include-list churn. Let's just standardize the names/APIs
>> and be done.

> Where do we put the new functions?

Probably in files that are already concerned with each object type.

			regards, tom lane
Re: [HACKERS] mapping object names to role IDs
On Sun, May 23, 2010 at 1:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>> Where do we put the new functions?
>
> Probably in files that are already concerned with each object type.

Not every object type has a file, and the existing functions are split
across three different directories, sometimes in files that don't
really pertain to the object type being dealt with. I think this is
going to be difficult to maintain if we intentionally spread out the
parallel code across essentially the entire backend. But I guess I can
code it up and we can argue about it then.

-- 
Robert Haas
Re: [HACKERS] mapping object names to role IDs
Robert Haas robertmh...@gmail.com writes:
> Not every object type has a file, and the existing functions are split
> across three different directories, sometimes in files that don't
> really pertain to the object type being dealt with. I think this is
> going to be difficult to maintain if we intentionally spread out the
> parallel code across essentially the entire backend. But I guess I
> can code it up and we can argue about it then.

The only thing that seems really significantly parallel is the error
message to be issued for object-not-found. I would suggest maybe
putting the code in lsyscache.c, except that lsyscache functions
generally are not expected to throw error on object-not-found.

As for "not every object type has a file", there is certainly code
someplace that would be calling these functions. Wherever (the
preponderance of) such calls are would be an appropriate place for the
function, IMO.

			regards, tom lane
[HACKERS] Exposing the Xact commit order to the user
In some systems (data warehousing, replication), the order of commits is
important, since that is the order in which changes have become visible.
This information could theoretically be extracted from the WAL, but
scanning the entire WAL just to extract this tidbit of information would
be excruciatingly painful.

The following is based on ideas that emerged during last week's PGCon.
Consider it an implementation proposal, if you like.

We introduce a new set of files. The files represent segments of an
infinite array of structures. The present segments are the available
window of data. Similar to CLOG files, the individual file name will
represent the high bits of a serial number; the offset of the record
inside the file represents the low bits of the serial.

The system will have postgresql.conf options for enabling/disabling the
whole shebang, how many shared buffers to allocate for managing access
to the data, and to define the retention period of the data based on
data volume and/or age of the commit records.

Each record of the Transaction Commit Info consists of

     txid          xci_transaction_id
     timestamptz   xci_begin_timestamp
     timestamptz   xci_commit_timestamp
     int64         xci_total_rowcount

32 bytes total.

CommitTransaction() inside of xact.c will call a function that inserts
a new record into this array. The operation will for most of the time
be nothing more than taking a spinlock and adding the record to shared
memory. All the data for the record is readily available, does not
require further locking, and can be collected locally before taking the
spinlock.

The begin_timestamp is the transaction's idea of CURRENT_TIMESTAMP, the
commit_timestamp is what CommitTransaction() just decided to write into
the WAL commit record, and the total_rowcount is the sum of inserted,
updated and deleted heap tuples during the transaction, which should be
easily available from the statistics collector, unless row stats are
disabled, in which case the datum would be zero.
The function will return the sequence number, which CommitTransaction()
in turn will record in the WAL commit record together with the
begin_timestamp. While both the begin and the commit timestamp are
crucial to determine what data a particular transaction should have
seen, the row count is not, and will not be recorded in WAL.

Checkpoint handling will call a function to flush the shared buffers.
Together with this, the information from WAL records will be sufficient
to recover this data (except for row counts) during crash recovery.

Exposing the data will be done via a set returning function. The SRF
takes two arguments: the maximum number of rows to return and the last
serial number processed by the reader. The advantage of such an SRF is
that the result can be used in a query that right away delivers audit
or replication log information in transaction commit order. The SRF
can return an empty set if no further transactions have committed
since, or an error if data segments needed to answer the request have
already been purged.

Purging of the data will be possible in several different ways.
Autovacuum will call a function that drops segments of the data that
are outside the postgresql.conf configuration with respect to maximum
age or data volume. There will also be a function reserved for
superusers to explicitly purge the data up to a certain serial number.

Comments, suggestions?

Jan

-- 
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin
Re: [HACKERS] Exposing the Xact commit order to the user
> Exposing the data will be done via a set returning function. The SRF
> takes two arguments: the maximum number of rows to return and the
> last serial number processed by the reader. The advantage of such an
> SRF is that the result can be used in a query that right away
> delivers audit or replication log information in transaction commit
> order. The SRF can return an empty set if no further transactions
> have committed since, or an error if data segments needed to answer
> the request have already been purged.

In light of the proposed purging scheme, how would it be able to
distinguish between those two cases (nothing there yet vs. was there
but purged)?

-- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8
Re: [HACKERS] Specification for Trusted PLs?
Tom Lane wrote:
> Robert Haas robertmh...@gmail.com writes:
>> So... can we get back to coming up with a reasonable definition,
>
> (1) no access to system calls (including file and network I/O)

If a PL has file access to its own sandbox (similar to what flash seems
to do in web browsers), could that be considered trusted?
Re: [HACKERS] Exposing the Xact commit order to the user
On Sun, May 23, 2010 at 4:21 PM, Jan Wieck janwi...@yahoo.com wrote:
> The system will have postgresql.conf options for enabling/disabling
> the whole shebang, how many shared buffers to allocate for managing
> access to the data and to define the retention period of the data
> based on data volume and/or age of the commit records.

It would be nice if this could just be managed out of shared_buffers
rather than needing to configure a separate pool just for this feature.
But I'm not sure how much work that is, and if it turns out to be too
ugly then I'd say it's not a hard requirement. In general, I think we
talked during the meeting about the desirability of folding specific
pools into shared_buffers rather than managing them separately, but I'm
not aware that we have any cases where we do that today, so it might be
hard (or not).

> Each record of the Transaction Commit Info consists of
>
>      txid          xci_transaction_id
>      timestamptz   xci_begin_timestamp
>      timestamptz   xci_commit_timestamp
>      int64         xci_total_rowcount
>
> 32 bytes total.

Are we sure it's worth including the row count? I wonder if we ought to
leave that out and let individual clients of the mechanism track that
if they're so inclined, especially since it won't be reliable anyway.

> CommitTransaction() inside of xact.c will call a function that
> inserts a new record into this array. The operation will for most of
> the time be nothing more than taking a spinlock and adding the record
> to shared memory. All the data for the record is readily available,
> does not require further locking, and can be collected locally before
> taking the spinlock.

What happens when you need to switch pages?

> The function will return the sequence number, which
> CommitTransaction() in turn will record in the WAL commit record
> together with the begin_timestamp. While both the begin and the
> commit timestamp are crucial to determine what data a particular
> transaction should have seen, the row count is not, and will not be
> recorded in WAL.
It would certainly be better if we didn't have to bloat the commit xlog
records to do this. Is there any way to avoid that?

> Checkpoint handling will call a function to flush the shared buffers.
> Together with this, the information from WAL records will be
> sufficient to recover this data (except for row counts) during crash
> recovery.

Right.

> Exposing the data will be done via a set returning function. The SRF
> takes two arguments: the maximum number of rows to return and the
> last serial number processed by the reader. The SRF can return an
> empty set if no further transactions have committed since, or an
> error if data segments needed to answer the request have already been
> purged.
>
> Purging of the data will be possible in several different ways.
> Autovacuum will call a function that drops segments of the data that
> are outside the postgresql.conf configuration with respect to maximum
> age or data volume. There will also be a function reserved for
> superusers to explicitly purge the data up to a certain serial
> number.

Dunno if autovacuuming this is the right way to go. Seems like that
could lead to replication breaks, and it's also more work than not
doing that. I'd just say that if you turn this on you're responsible
for pruning it, full stop.

> Anyone who trades liberty for security deserves neither
> liberty nor security. -- Benjamin Franklin

+1.

-- 
Robert Haas
Re: [HACKERS] Specification for Trusted PLs?
On 5/23/2010 6:14 PM, Ron Mayer wrote:
> If a PL has file access to its own sandbox (similar to what flash
> seems to do in web browsers), could that be considered trusted?

That is a good question. Currently, the first of all TRUSTED languages,
PL/Tcl, would allow the function of a lesser privileged user to access
the global objects of every other database user created within the same
session. These are per-backend, in-memory objects, but nonetheless, an
evil function could just scan the per-backend Tcl namespace and look
for compromising data, and that's not exactly what TRUSTED is all
about.

In the case of Tcl it is possible to create a separate safe interpreter
per DB role to fix this. I actually think this would be the right thing
to do.

Jan
Re: [HACKERS] Exposing the Xact commit order to the user
On 5/23/2010 4:48 PM, Greg Sabino Mullane wrote:
> In light of the proposed purging scheme, how would it be able to
> distinguish between those two cases (nothing there yet vs. was there
> but purged)?

There is a difference between an empty result set and an exception.

Jan
Re: [HACKERS] Exposing the Xact commit order to the user
On 5/23/2010 8:38 PM, Robert Haas wrote:
> It would be nice if this could just be managed out of shared_buffers
> rather than needing to configure a separate pool just for this
> feature. But I'm not sure how much work that is, and if it turns out
> to be too ugly then I'd say it's not a hard requirement. In general,
> I think we talked during the meeting about the desirability of
> folding specific pools into shared_buffers rather than managing them
> separately, but I'm not aware that we have any cases where we do that
> today, so it might be hard (or not).

I'm not sure the retention policies of the shared buffer cache, the WAL
buffers, CLOG buffers and every other thing we try to cache are that
easy to fold into one single set of logic. But I'm all ears.

> Are we sure it's worth including the row count? I wonder if we ought
> to leave that out and let individual clients of the mechanism track
> that if they're so inclined, especially since it won't be reliable
> anyway.

Nope, we (my belly and I) are not sure about the absolute worth of the
row count. It would be a convenient number to have there, but I can
live without it.
> What happens when you need to switch pages?

Then the code will have to grab another free buffer or evict one.

> It would certainly be better if we didn't have to bloat the commit
> xlog records to do this. Is there any way to avoid that?

If you can tell me how a crash recovering system can figure out what
the exact sequence number of the WAL commit record at hand should be,
let's rip it.

> Dunno if autovacuuming this is the right way to go. Seems like that
> could lead to replication breaks, and it's also more work than not
> doing that.
> I'd just say that if you turn this on you're responsible for pruning
> it, full stop.

It is an option. "Keep it until I tell you" is a perfectly valid
configuration option. One you probably don't want to forget about, but
valid nonetheless.

Jan
Re: [HACKERS] Specification for Trusted PLs?
Jan Wieck wrote:
> In the case of Tcl it is possible to create a separate safe
> interpreter per DB role to fix this. I actually think this would be
> the right thing to do.

I think that would probably be serious overkill. Maybe a data stash per
role, rather than an interpreter per role, would be doable. It would
certainly be more lightweight.

ISTM we are in danger of confusing several different things. A user
that doesn't want data to be shared should not stash it in global
objects. But to me, trusting a language is not about making data
private, but about not allowing the user to do things that are
dangerous, such as referencing memory, or the file system, or the
operating system, or network connections, or loading code which might
do any of those things.

cheers

andrew
Re: [HACKERS] Specification for Trusted PLs?
On 21/05/10 23:55, Josh Berkus wrote:
> So, here's a working definition:
>
> 1) cannot directly read or write files on the server.

It must also prevent PL-user-level access to file descriptors already
open by the backend. That's implicitly covered in the above, but should
probably be explicit.

> 2) cannot bind network ports
> 3) uses only the SPI interface to interact with postgresql tables etc.
> 4) does any logging only using elog to the postgres log

5) Cannot dynamically load shared libraries from user-supplied
locations (eg in Python, 'import' of a module that had a .so component
would be blocked unless it was in the core module path)

> a) it seems like there should be some kind of restriction on access
> to memory, but I'm not clear on how that would be defined.

Like:

5) Has no way to directly access backend memory, ie doesn't have
PL-user-accessible pointers or user access to any C-level calls that
take/return them. Data structures containing pointers must be opaque
to the PL user.

The idea being that if you have no access to C APIs that work with
pointers to memory, and you can't use files (/dev/mem, /proc/self/mem,
etc), you can't work with backend memory directly.

-- 
Craig Ringer
Re: [HACKERS] Specification for Trusted PLs?
On 5/23/2010 10:04 PM, Andrew Dunstan wrote:
> I think that would probably be serious overkill. Maybe a data stash
> per role rather than an interpreter per role would be doable. It would
> certainly be more lightweight.
>
> ISTM we are in danger of confusing several different things. A user
> that doesn't want data to be shared should not stash it in global
> objects. But to me, trusting a language is not about making data
> private, but about not allowing the user to do things that are
> dangerous, such as referencing memory, or the file system, or the
> operating system, or network connections, or loading code which might
> do any of those things.

How is "loading code which might do any of those things" different from
writing a stored procedure that accesses data a careless superuser left
in a global variable? Remember, the code of a PL function is open
source - as in, everyone can select from pg_proc. You really don't
expect anyone to scan for your global variables just because they can
write functions in the same language?
Jan -- Anyone who trades liberty for security deserves neither liberty nor security. -- Benjamin Franklin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
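[Editorial sketch.] The shared-interpreter leak Jan describes, and the two fixes proposed above (one safe interpreter per DB role vs. a per-role data stash), can be illustrated with a minimal Python model. This is purely for illustration, not PostgreSQL or PL/Tcl code; `run_shared` and `run_per_role` are invented names.

```python
# One interpreter per backend: every role's code executes against the
# same global namespace, so role "bob" can read what "alice" stashed.
shared_globals = {}

# One namespace per role (analogous to a safe Tcl interpreter per DB
# role): each role only ever sees its own globals.
per_role_globals = {}

def run_shared(role, code):
    # All roles share one namespace -- the problematic status quo.
    exec(code, shared_globals)

def run_per_role(role, code):
    # Each role gets (and keeps) its own namespace.
    ns = per_role_globals.setdefault(role, {})
    exec(code, ns)

# alice stashes something sensitive in a global variable.
run_shared("alice", "secret = 'alice-password'")
leak = shared_globals.get("secret")            # visible to any role

run_per_role("alice", "secret = 'alice-password'")
visible_to_bob = per_role_globals.setdefault("bob", {}).get("secret")

print(leak)             # alice's secret leaks in the shared model
print(visible_to_bob)   # None: isolated in the per-role model
```

The same isolation argument applies whether the per-role container is a full interpreter or just a data stash; the stash is cheaper because only the variable table is duplicated, not the whole language runtime.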
Re: [HACKERS] Specification for Trusted PLs?
On 22/05/10 02:12, Robert Haas wrote:

On Fri, May 21, 2010 at 1:58 PM, David Fetter da...@fetter.org wrote: On Fri, May 21, 2010 at 01:45:45PM -0400, Stephen Frost wrote: * David Fetter (da...@fetter.org) wrote: That is *precisely* the business we need to be in, at least for the languages we ship, and it would behoove us to test languages we don't ship so we can warn people when they don't pass.

Ok, let's start with something simpler first, though: I'm sure we can pull in the glibc regression tests and run them too. You know, just in case there's a bug there, somewhere.

That's a pretty pure straw-man argument. I expect much higher quality trolling. D-.

I'm sorely tempted to try to provide some higher-quality trolling, but in all seriousness I think that (1) we could certainly use much better regression tests in many areas, of which this is one, and (2) it will never be possible to catch all security bugs via regression testing, in particular because they typically stem from cases people didn't consider. So... can we get back to coming up with a reasonable definition, and if somebody wants to write some regression tests, all the better?

Personally, I don't think a PL should be trusted unless it _does_ define a whitelist of operations. Experience in the wider world has shown that this is the only approach that works. Regression testing to make sure all possible approaches to accessing unsafe features are blocked is doomed to have holes wherever there's another approach that hasn't been thought of yet.

Perl's new approach is whitelist based. Python's restricted mode failed not least because it was a blacklist and people kept on finding ways around it. Lua and JavaScript are great examples of whitelist approaches, where the language just doesn't expose features that are dangerous - in fact, the core language doesn't even *have* those features. PL/pgSQL is the same, and works well as a trusted language for that reason.
Java's SecurityManager is whitelist based (allowed classes, allowed operations), and has proved very secure.

-- Craig Ringer
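[Editorial sketch.] The whitelist-vs-blacklist distinction Craig draws can be made concrete with a toy evaluator. A minimal sketch in Python (not how any actual PL implements trust): only explicitly allowed AST node types are permitted, so anything not anticipated - calls, attribute access, imports - is rejected by default rather than requiring someone to have thought of blocking it.

```python
import ast

# Whitelist of allowed syntax: arithmetic expressions over constants
# and pre-supplied names, nothing else.
ALLOWED = (ast.Expression, ast.BinOp, ast.UnaryOp, ast.Constant,
           ast.Add, ast.Sub, ast.Mult, ast.Div, ast.USub,
           ast.Name, ast.Load)

def safe_eval(expr, names=None):
    tree = ast.parse(expr, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED):
            # Default-deny: unanticipated constructs are refused.
            raise ValueError("disallowed construct: %s"
                             % type(node).__name__)
    return eval(compile(tree, "<safe>", "eval"),
                {"__builtins__": {}}, names or {})

print(safe_eval("2 * (3 + 4)"))   # -> 14, arithmetic is whitelisted

try:
    # A function call is not on the whitelist, so this never executes.
    safe_eval("__import__('os').system('echo pwned')")
except ValueError as e:
    print("blocked:", e)
```

A blacklist version of the same evaluator would have to enumerate every dangerous construct up front, which is exactly the approach that failed for Python's restricted mode.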
Re: [HACKERS] Specification for Trusted PLs?
Jan Wieck wrote:

ISTM we are in danger of confusing several different things. A user who doesn't want data to be shared should not stash it in global objects. But to me, trusting a language is not about making data private; it is about not allowing the user to do things that are dangerous, such as referencing memory, or the file system, or the operating system, or network connections, or loading code which might do any of those things.

How is loading code which might do any of those things different from writing a stored procedure that accesses data a careless superuser left in a global variable? Remember, the code of a PL function is open, in the sense that everyone can select it from pg_proc. You really don't expect anyone to scan for your global variables just because they can write functions in the same language?

Well, that threat arises from the unsafe actions of the careless superuser, and we could at least ameliorate it by providing a per-role data stash, at very little cost, as I mentioned. It's not like we don't know about such threats, and I'm certainly not pretending they don't exist. The 9.0 PL/Perl docs say:

The %_SHARED variable and other global state within the language is public data, available to all PL/Perl functions within a session. Use with care, especially in situations that involve use of multiple roles or SECURITY DEFINER functions.

But the threats I was referring to arise if the language allows them to, without any requirement for unsafe actions by another user. Protecting against those is the essence of trustedness, in my mind at least.

cheers

andrew
Re: [HACKERS] Stefan's bug (was: max_standby_delay considered harmful)
On Wed, May 19, 2010 at 2:47 PM, Fujii Masao masao.fu...@gmail.com wrote:

Oh, right. How about allowing the postmaster to invoke walreceiver only in the PM_STARTUP, PM_RECOVERY, PM_HOT_STANDBY, or PM_WAIT_READONLY states? We can keep walreceiver alive until all read-only backends have gone, and prevent unexpected startup of walreceiver.

Yes, that seems like something we should be checking, if we aren't already. I'll do that.

Here is the updated version. I added the above-mentioned check to the patch.

Regards,

-- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center

fix_smart_shutdown_in_recovery_v4_fujii.patch Description: Binary data
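[Editorial sketch.] The check Fujii describes - launching walreceiver only from a whitelist of postmaster states - has the following shape. The actual patch is C inside the postmaster; this Python model is illustration only, the state names mirror the ones in the mail, and `may_start_walreceiver` is an invented name.

```python
from enum import Enum, auto

class PMState(Enum):
    # A subset of postmaster states, for illustration.
    PM_INIT = auto()
    PM_STARTUP = auto()
    PM_RECOVERY = auto()
    PM_HOT_STANDBY = auto()
    PM_WAIT_READONLY = auto()
    PM_WAIT_BACKENDS = auto()
    PM_SHUTDOWN = auto()

# States in which starting walreceiver is legitimate: recovery-related
# states, plus PM_WAIT_READONLY so walreceiver can stay alive until all
# read-only backends have gone.
WALRECEIVER_OK = {
    PMState.PM_STARTUP,
    PMState.PM_RECOVERY,
    PMState.PM_HOT_STANDBY,
    PMState.PM_WAIT_READONLY,
}

def may_start_walreceiver(pm_state):
    # Default-deny: any state outside the whitelist (e.g. during a
    # smart shutdown) refuses the launch, preventing an unexpected
    # walreceiver startup.
    return pm_state in WALRECEIVER_OK

print(may_start_walreceiver(PMState.PM_RECOVERY))   # True
print(may_start_walreceiver(PMState.PM_SHUTDOWN))   # False
```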