Re: [HACKERS] Correctly producing array literals for prepared statements
> This is only true for server encodings. In a client library I think
> you lose on this and do have to deal with it. I'm not sure what client
> encodings we do support that aren't ascii-supersets though, it's
> possible none of them generate quote characters this way.

We have a clear definition of which encodings are for client use only (mb/pg_wchar.h):

    /* followings are for client encoding only */
    PG_SJIS,           /* Shift JIS (Windows-932) */
    PG_BIG5,           /* Big5 (Windows-950) */
    PG_GBK,            /* GBK (Windows-936) */
    PG_UHC,            /* UHC (Windows-949) */
    PG_GB18030,        /* GB18030 */
    PG_JOHAB,          /* EUC for Korean JOHAB */
    PG_SHIFT_JIS_2004, /* Shift-JIS-2004 */

--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
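[Editor's illustration, not part of the exchange: these client-only encodings are exactly where naive byte-wise quoting breaks, because a multibyte character can end in a byte equal to ASCII '\'. A minimal Python check, using the standard cp932 codec as a stand-in for Shift JIS/Windows-932:]

```python
# '表' (U+8868) encodes in Shift JIS (Windows-932, Python codec "cp932")
# as the two bytes 0x95 0x5C; the trailing byte is ASCII '\'.
# An escaper that scans bytes without tracking the encoding would treat
# that 0x5C as a backslash and corrupt the character (or enable injection).
encoded = "表".encode("cp932")
print(encoded.hex())          # 955c
print(encoded[1:] == b"\\")   # True: the second byte looks like a backslash
```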
Re: [HACKERS] WIP: cross column correlation ...
> Personally, I think the first thing we ought to do is add a real, bona
> fide planner hint to override the selectivity calculation manually,
> maybe something like this:
>
> WHERE (x < 5 AND y = 1) SELECTIVITY (0.1);

If you're going to go that far, why not just collect statistics on that specific predicate? ie,

    ANALYZE SELECTIVITY ON tablename (x, y) WHERE (x < 5 AND y = 1);

Then it won't fall subject to all of the pitfalls that Tom outlines below. Selectivities are easy to estimate if we know the predicate. They only become hard when they have to work for every possible predicate.

Best,
Nathan
Re: [HACKERS] SSI bug?
Dan Ports wrote:

> The obvious solution to me is to just keep the lock on both the old
> and new page.

That's the creative thinking I was failing to do. Keeping the old lock will generate some false positives, but those will be rare and don't compromise correctness -- they just carry the cost of starting the transaction over. In exchange you buy a savings in predicate lock acquisition. In particular, by dodging LW locking for a large percentage of calls, you help scalability. It can't cause false negatives, so correctness is OK -- it's all about net costs.

> I was going to bemoan the extra complexity this would add -- but
> actually, couldn't we just replace PredicateLockPageCombine with a
> call to PredicateLockPageSplit since they'd now do the same thing?

I'd be inclined to leave the external interface alone, in case we conceive of an even better implementation. We can just remove the removeOld bool from the parameter list of the static TransferPredicateLocksToNewTarget function, and keep the behavior from the "false" case.

-Kevin
Re: [HACKERS] Correctly producing array literals for prepared statements
On Wed, Feb 23, 2011 at 4:16 AM, Peter Geoghegan wrote:

> Since Postgres only supports encodings that are ASCII supersets, I
> don't believe that I have to consider encoding - only my clients do.

This is only true for server encodings. In a client library I think you lose on this and do have to deal with it. I'm not sure what client encodings we do support that aren't ASCII supersets, though; it's possible none of them generate quote characters this way.

I'm a bit surprised libpqxx isn't using binary mode internally, though. That would at least avoid the problems with encoding. However, I'm not sure things like the array binary format are really stable and portable enough to use from a client library. Some datatypes might be dependent on the server ABI (floats -- I'm looking at you), so that might make it difficult or impossible.

--
greg
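[Editor's illustration of the float concern: in binary mode a float8 travels as its raw 8-byte pattern in network byte order, so client and server must agree on the floating-point representation. A Python sketch of that byte layout, assuming IEEE 754 on both ends — it demonstrates the format, not what a non-IEEE server would send:]

```python
import struct

# On an IEEE 754 platform, a float8's binary wire value is the 8-byte
# big-endian bit pattern of the double; the client decodes with the
# same layout. A server with a different float ABI would break this.
wire = struct.pack("!d", 1.5)       # bytes such a server would send for 1.5
print(wire.hex())                   # 3ff8000000000000
value = struct.unpack("!d", wire)[0]
print(value)                        # 1.5
```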
[HACKERS] Correctly producing array literals for prepared statements
I'm investigating the possibility of developing a utility function for our C++ client library, libpqxx, that produces array literals that can be used in prepared statements. This problem appears to be a bit of a tar pit, so I'm hoping that someone can help me out.

My goal is to produce a template function that accepts arbitrarily nested standard library containers, that contain at the most nested level constants/literals of some type that can be fed into a stream, such as an int or a std::string.

I'm aware that I cannot assume that array elements are delimited by a comma, even for built-in types. I thought that I would put the onus on the client to specify the correct delimiter, by checking pg_type themselves if necessary, but default to ','. Is this a reasonable approach?

Escaping/quoting individual elements seems tricky. I have produced a generic and superficially well behaved implementation by using double quotes for constants. However, I have now opened the door to malicious parties injecting multiple array elements where only one is allowed, or causing malformed array literal errors by simply including a double quote of their own. It's not clear where the responsibility should rest for escaping constants/ensuring that constants don't contain double quotes. Can someone suggest a better approach? I can't very well use single quotes, because they are escaped/doubled up when we pass the array literal to something similar to PQexecPrepared(), and they shouldn't be - strings end up looking like this: "'has errant single quotes on either side'".

Since Postgres only supports encodings that are ASCII supersets, I don't believe that I have to consider encoding - only my clients do.

Can someone please point me in the direction of an established client library/driver where all corner cases are covered, or at least enough of them to produce a net gain in usefulness? There may well be additional subtleties that have not occurred to me.
--
Regards,
Peter Geoghegan
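[Editor's sketch of the element-quoting rules under discussion — illustrative Python, not libpqxx code. It assumes the default ',' delimiter and an ASCII-superset encoding, and ignores NULL elements and types such as box that use ';':]

```python
def escape_element(s: str) -> str:
    # Inside a double-quoted array element, backslash and double quote
    # must each be preceded by a backslash.
    return '"' + s.replace('\\', '\\\\').replace('"', '\\"') + '"'

def array_literal(seq, delim: str = ',') -> str:
    # Recursively render nested sequences as nested array literals,
    # double-quoting every scalar element.
    parts = []
    for item in seq:
        if isinstance(item, (list, tuple)):
            parts.append(array_literal(item, delim))
        else:
            parts.append(escape_element(str(item)))
    return '{' + delim.join(parts) + '}'

print(array_literal(['a"b', 'c\\d']))   # {"a\"b","c\\d"}
print(array_literal([[1, 2], [3, 4]]))  # {{"1","2"},{"3","4"}}
```

Quoting every element sidesteps the injection problem Peter describes: an embedded double quote or backslash can no longer terminate the element or split it in two.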
Re: [HACKERS] TODO: You can alter it, but you can't view it
> Right now pg_options_to_table() is not documented. Should it be?

Yes, I think so.

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
Re: [HACKERS] WIP: cross column correlation ...
Robert Haas writes:

> On Tue, Feb 22, 2011 at 9:43 PM, Tom Lane wrote:
>> One of the criteria we've always had for a suitable hint-or-whatever-
>> you-call-it design is that it *not* involve decorating the queries.
> [ snip ]
> To put that another way, it's true that some people can't adjust their
> queries, but also some people can. It's true that nonstandard stuff
> sucks, but queries that don't work suck, too. And as for better
> solutions, how many major release cycles do we expect people to wait
> for them?

Well, a decorating-the-queries solution that isn't utter crap is not going to be a small amount of work, either.

regards, tom lane
Re: [HACKERS] WIP: cross column correlation ...
On Tue, Feb 22, 2011 at 9:43 PM, Tom Lane wrote:

> Robert Haas writes:
>> /me prepares to go down in flames.
>>
>> Personally, I think the first thing we ought to do is add a real, bona
>> fide planner hint to override the selectivity calculation manually,
>> maybe something like this:
>>
>> WHERE (x < 5 AND y = 1) SELECTIVITY (0.1);
>
> One of the criteria we've always had for a suitable hint-or-whatever-
> you-call-it design is that it *not* involve decorating the queries.
> There are a number of reasons for that, some of the killer ones being
>
> (1) People frequently *can't* adjust their queries that way, because
> they're coming out of some broken query generator or other. (Crappy
> query generators are of course one of the prime reasons for
> poor-performing queries in the first place, so you can't write this off
> as not being a key use case.)
>
> (2) Anything we do like that, we'd be locked into supporting forever,
> even after we think of better solutions.
>
> (3) People don't like decorating their queries with nonstandard stuff;
> it smells of vendor lock-in. Especially if it's actually SQL syntax
> and not comments. Once you put something into the DML it's just too
> hard to fix applications to get rid of it (the inverse case of point
> #1).

Those are real problems, but I still want it. The last time I hit this problem I spent two days redesigning my schema and adding triggers all over the place to make things work. If I had been dealing with a 30TB database instead of a 300MB database I would have been royally up a creek.

To put that another way, it's true that some people can't adjust their queries, but also some people can. It's true that nonstandard stuff sucks, but queries that don't work suck, too. And as for better solutions, how many major release cycles do we expect people to wait for them? Even one major release cycle is an eternity when you're trying to get the application working before your company runs out of money, and this particular problem has had a lot of cycles expended on it without producing anything very tangible (proposed patch, which like you I can't spare a lot of cycles to look at just now, possibly excepted).

I agree that if we can get something that actually works that doesn't involve decorating the queries, that is better. But I would surely rather decorate the queries than rewrite the entire application around the problem.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] WIP: cross column correlation ...
Robert Haas writes:

> /me prepares to go down in flames.
>
> Personally, I think the first thing we ought to do is add a real, bona
> fide planner hint to override the selectivity calculation manually,
> maybe something like this:
>
> WHERE (x < 5 AND y = 1) SELECTIVITY (0.1);

One of the criteria we've always had for a suitable hint-or-whatever-you-call-it design is that it *not* involve decorating the queries. There are a number of reasons for that, some of the killer ones being:

(1) People frequently *can't* adjust their queries that way, because they're coming out of some broken query generator or other. (Crappy query generators are of course one of the prime reasons for poor-performing queries in the first place, so you can't write this off as not being a key use case.)

(2) Anything we do like that, we'd be locked into supporting forever, even after we think of better solutions.

(3) People don't like decorating their queries with nonstandard stuff; it smells of vendor lock-in. Especially if it's actually SQL syntax and not comments. Once you put something into the DML it's just too hard to fix applications to get rid of it (the inverse case of point #1).

I haven't looked at Hans' patch in any detail, and don't intend to do so while the CF is still running; but at least he got this point right.

regards, tom lane
Re: [HACKERS] Binary in/out for aclitem
On Tue, Feb 22, 2011 at 9:26 PM, Tom Lane wrote:

>> Well, unfortunately, there's an awful lot of information that can only
>> be obtained in a reasonable way by introspection of the system
>> catalogs. If you want to know whether user A can select from table B,
>> there's really no sensible way of obtaining that without parsing the
>> aclitem entries in some fashion, and unfortunately that's just the tip
>> of the iceberg.
>
> Um, that question is precisely why we invented the has_foo_privilege
> class of functions.

OK, fair point.

> I would *much* rather see users applying those
> functions than looking at ACLs directly --- and if there's some
> reasonable use-case that those don't cover, let's talk about that.

Well, the obvious applications are "I'd like to know who has permissions on this item" and "I'd like to know what this user has permissions to do".

>> Now, if you were to propose adding a well-designed set of DCL commands
>> to expose this kind of information to clients in a more structured
>> way, I would be the first to applaud. LIST TABLES? SHOW GRANTS TO?
>> Sign me up! (I got a request for the latter just today.) But until
>> then, if you need this information, you don't have much choice but to
>> pull it out of the system catalogs; and if JDBC would rather use
>> binary format to talk to the server, I don't particularly see any
>> reason to say "no". If we prefer to expose the text format rather
>> than anything else, that's OK with me, but I do think it would make
>> sense to expose something.
>
> Well, to go back to the binary-format issue, if we're going to insist
> that all standard types have binary I/O support then we should actually
> do that, not accept piecemeal patches that move us incrementally in that
> direction without establishing a policy. To my mind, "establishing a
> policy" would include adding a type_sanity regression test query that
> shows there are no missing binary I/O functions.

That's hard to disagree with. +1 for accepting a patch that fixes all the remaining cases, but not anything less than that.

> smgr: let's just get rid of that useless vestigial type.

Heck yeah. Even if we someday want to go back to supporting more than one smgr, the chances that we're going to want to do it this way are just about zero. This part probably merits its own commit, though.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Binary in/out for aclitem
Robert Haas writes:

> On Tue, Feb 22, 2011 at 8:20 PM, Tom Lane wrote:
>> ... But my question isn't about that; it's about why
>> aclitem should be considered a first-class citizen. It makes me
>> uncomfortable that client apps are looking at it at all, because any
>> that do are bound to get broken in the future, even assuming that they
>> get the right answers today. I wonder how many such clients are up to
>> speed for per-column privileges and non-constant default privileges for
>> instance. And sepgsql is going to cut them off at the knees.
>
> Well, unfortunately, there's an awful lot of information that can only
> be obtained in a reasonable way by introspection of the system
> catalogs. If you want to know whether user A can select from table B,
> there's really no sensible way of obtaining that without parsing the
> aclitem entries in some fashion, and unfortunately that's just the tip
> of the iceberg.

Um, that question is precisely why we invented the has_foo_privilege class of functions. I would *much* rather see users applying those functions than looking at ACLs directly --- and if there's some reasonable use-case that those don't cover, let's talk about that.

> Now, if you were to propose adding a well-designed set of DCL commands
> to expose this kind of information to clients in a more structured
> way, I would be the first to applaud. LIST TABLES? SHOW GRANTS TO?
> Sign me up! (I got a request for the latter just today.) But until
> then, if you need this information, you don't have much choice but to
> pull it out of the system catalogs; and if JDBC would rather use
> binary format to talk to the server, I don't particularly see any
> reason to say "no". If we prefer to expose the text format rather
> than anything else, that's OK with me, but I do think it would make
> sense to expose something.

Well, to go back to the binary-format issue, if we're going to insist that all standard types have binary I/O support then we should actually do that, not accept piecemeal patches that move us incrementally in that direction without establishing a policy. To my mind, "establishing a policy" would include adding a type_sanity regression test query that shows there are no missing binary I/O functions. As of HEAD, we have:

    postgres=# select typname, typtype from pg_type where typreceive = 0 or typsend = 0;
         typname      | typtype
    ------------------+---------
     smgr             | b
     aclitem          | b
     gtsvector        | b
     any              | p
     trigger          | p
     language_handler | p
     internal         | p
     opaque           | p
     anyelement       | p
     anynonarray      | p
     anyenum          | p
     fdw_handler      | p
    (12 rows)

Possibly we could exclude pseudotypes from the policy, on the grounds that there are never really values of those types anyway. But other than that, we have:

smgr: let's just get rid of that useless vestigial type.

aclitem: as per this thread, using the text representation as the binary representation seems reasonable, or at least it doesn't make anything any worse.

gtsvector: this is strictly an internal type, so it probably doesn't need actual I/O support, but we could put in a couple of dummy functions that just throw ERRCODE_FEATURE_NOT_SUPPORTED.

Maybe the right plan would be to give all the pseudotypes error-throwing binary I/O functions too. Then, if anyone lobbies for not throwing an error (as per what we just did with "void"), at least it doesn't take a catversion bump to fix it.

If someone wanted to propose doing all that, I could get behind it. But I'm not excited about debating this one datatype at a time.

regards, tom lane
Re: [HACKERS] Snapshot synchronization, again...
On Tue, Feb 22, 2011 at 8:55 PM, Alvaro Herrera wrote:

> The current discussion seems to have reached the same conclusion as the
> last one: the snapshot info shouldn't leave the server, but be
> transmitted internally -- the idea of the temp file seems prevalent.
> Here's an attempt at a detailed spec:
>
> By invoking pg_export_snapshot(), a transaction can export its current
> transaction snapshot.
>
> To export a snapshot, the transaction creates a temp file containing all
> information about the snapshot, including the BackendId and
> TransactionId of the creating transaction. The file name is determined
> internally by the exporting transaction, and returned by the function
> that creates the snapshot. The act of exporting a snapshot causes a
> TransactionId to be acquired, if there is none.
>
> Care is taken that no existing snapshot file is ever overwritten, to
> prevent security problems. A transaction may export any number of
> snapshots.
>
> The "cookie" to be passed around, then, is a temp file identifier. This
> cookie gets passed as START TRANSACTION (snapshot='foo'), which sets the
> transaction's snapshot to the one determined by the identifier. The
> importing transaction must have isolation level serializable or
> repeatable read declared on the same START TRANSACTION command; an
> attempt to import a snapshot into a read committed transaction or one
> with unspecified isolation level causes the transaction to get into
> aborted state (so you still need to say ROLLBACK to get out of it. This
> is necessary to avoid the session to proceed involuntarily with the
> wrong snapshot.)
>
> Similarly, if the snapshot is not available, an error is raised and the
> transaction block remains in aborted state. This can happen if the
> originating transaction ended after you opened the file, for example.
>
> The BackendId and TransactionId of the exporting transaction let the
> importing backend determine the validity of the snapshot beyond the mere
> existence of the file.
>
> The snapshot temp file is to be marked for deletion on transaction end.
> All snapshot temp files are also deleted on crash recovery, to prevent
> dangling files (I need some help here to determine how this plays with
> hot standby.)

I think this can be done within StartupXLOG(), right around where we ResetUnloggedRelations(UNLOGGED_RELATION_CLEANUP).

> Privileges: Any role can export or import a snapshot. The rationale
> here is that exporting a snapshot doesn't cause any more harm than
> holding a transaction open, so if you let your users do that, then
> there's no extra harm. Similarly, there's no extra privilege given to a
> role that imports a snapshot that cannot be obtained by sending a START
> TRANSACTION command at the right time.

Agree.

> Note: this proposal doesn't mention restrictions on having
> subtransactions in the exporting transaction. This is because I haven't
> figured them out, not because I think there shouldn't be any.

I think it's probably sufficient to say that a snapshot can only be exported from the toplevel transaction.

Overall, this proposal is fine with me. But then I wasn't the one who complained about it last time.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] WIP: cross column correlation ...
2011/2/22 PostgreSQL - Hans-Jürgen Schönig:

> how does it work? we try to find suitable statistics for an arbitrary
> length list of conditions so that the planner can use it directly rather
> than multiplying all the selectivities. this should make estimates a lot
> more precise. the current approach can be extended to work with
> expressions as well as "straight" conditions.

/me prepares to go down in flames.

Personally, I think the first thing we ought to do is add a real, bona fide planner hint to override the selectivity calculation manually, maybe something like this:

    WHERE (x < 5 AND y = 1) SELECTIVITY (0.1);

Then, having provided a method for the DBA to extinguish the raging flames of searing agony which are consuming them while a crocodile chews off their leg and their boss asks them why they didn't use Oracle, we can continue bikeshedding about the best way of fixing this problem in a more user-transparent fashion.

As to the approach you've proposed here, I'm not sure I understand what this is actually doing. Selectivity estimates aren't made directly for predicates; they're made based on MCV and histogram information for predicates.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
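[Editor's illustration of the "multiplying all the selectivities" problem the patch targets: the planner assumes predicates are independent, which can badly underestimate the joint selectivity of correlated columns. A toy Python simulation (hypothetical data, not PostgreSQL code):]

```python
import random

random.seed(0)
# Toy table where y = 1 exactly when x < 5: the columns are fully correlated.
rows = [(x, 1 if x < 5 else 0) for x in (random.randrange(10) for _ in range(10000))]

sel_x = sum(1 for x, _ in rows if x < 5) / len(rows)              # ~0.5
sel_y = sum(1 for _, y in rows if y == 1) / len(rows)             # ~0.5
actual = sum(1 for x, y in rows if x < 5 and y == 1) / len(rows)  # ~0.5

print(sel_x * sel_y)  # ~0.25: the independence-assumption estimate
print(actual)         # ~0.5:  the true joint selectivity, off by 2x
```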
Re: [HACKERS] Snapshot synchronization, again...
The current discussion seems to have reached the same conclusion as the last one: the snapshot info shouldn't leave the server, but be transmitted internally -- the idea of the temp file seems prevalent. Here's an attempt at a detailed spec:

By invoking pg_export_snapshot(), a transaction can export its current transaction snapshot.

To export a snapshot, the transaction creates a temp file containing all information about the snapshot, including the BackendId and TransactionId of the creating transaction. The file name is determined internally by the exporting transaction, and returned by the function that creates the snapshot. The act of exporting a snapshot causes a TransactionId to be acquired, if there is none.

Care is taken that no existing snapshot file is ever overwritten, to prevent security problems. A transaction may export any number of snapshots.

The "cookie" to be passed around, then, is a temp file identifier. This cookie gets passed as START TRANSACTION (snapshot='foo'), which sets the transaction's snapshot to the one determined by the identifier. The importing transaction must have isolation level serializable or repeatable read declared on the same START TRANSACTION command; an attempt to import a snapshot into a read committed transaction or one with unspecified isolation level causes the transaction to get into aborted state (so you still need to say ROLLBACK to get out of it. This is necessary to keep the session from involuntarily proceeding with the wrong snapshot.)

Similarly, if the snapshot is not available, an error is raised and the transaction block remains in aborted state. This can happen if the originating transaction ended after you opened the file, for example.

The BackendId and TransactionId of the exporting transaction let the importing backend determine the validity of the snapshot beyond the mere existence of the file.

The snapshot temp file is to be marked for deletion on transaction end. All snapshot temp files are also deleted on crash recovery, to prevent dangling files (I need some help here to determine how this plays with hot standby.)

Privileges: Any role can export or import a snapshot. The rationale here is that exporting a snapshot doesn't cause any more harm than holding a transaction open, so if you let your users do that, then there's no extra harm. Similarly, there's no extra privilege given to a role that imports a snapshot that cannot be obtained by sending a START TRANSACTION command at the right time.

Note: this proposal doesn't mention restrictions on having subtransactions in the exporting transaction. This is because I haven't figured them out, not because I think there shouldn't be any.

--
Álvaro Herrera
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
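[Editor's sketch of the proposed flow. This is the proposal's syntax, not anything that exists today; 'snapshot-file-id' stands in for whatever identifier pg_export_snapshot() would return:]

```sql
-- Session 1: export the current transaction snapshot (proposed function).
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT pg_export_snapshot();  -- returns an identifier, e.g. 'snapshot-file-id'

-- Session 2: import it; per the spec, the isolation level must be
-- declared in the same START TRANSACTION command.
START TRANSACTION ISOLATION LEVEL REPEATABLE READ (snapshot = 'snapshot-file-id');
-- ... both sessions now see data as of the same snapshot ...
```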
Re: [HACKERS] SSI bug?
On Tue, Feb 22, 2011 at 05:54:49PM -0600, Kevin Grittner wrote:

> I'm not sure it's safe to assume that the index page won't get
> reused before the local lock information is cleared. In the absence
> of a clear proof that it is safe, or some enforcement mechanism to
> ensure that it is, I don't think we should make this assumption.
> Off-hand I can't think of a clever way to make this safe which would
> cost less than taking out the LW lock and checking the definitive
> shared memory HTAB, but that might be for lack of creative thinking
> at the moment.

Hmm. Yeah, I wasn't sure about that one, and having now thought about it some more I think it isn't safe -- consider adding a lock on an index page concurrently with another backend merging that page into another one.

The obvious solution to me is to just keep the lock on both the old and new page. The downside is that because this requires allocating a new lock and is in a context where we're not allowed to fail, we'll need to fall back on acquiring the relation lock just as we do for page splits. I was going to bemoan the extra complexity this would add -- but actually, couldn't we just replace PredicateLockPageCombine with a call to PredicateLockPageSplit, since they'd now do the same thing?

> The only alternative I see would be to use some form of asynchronous
> notification of the new locks so that the local table can be
> maintained. That seems overkill without some clear evidence that it
> is needed.

I agree. It is certainly weird and undesirable that the backend-local lock table is not always accurate, but I don't see a good way to keep it up to date without the cure being worse than the disease.

> I *really* wouldn't want to go back to needing LW locks
> to maintain this info; as you know (and stated only for the benefit
> of the list), it was a pretty serious contention point in early
> profiling and adding the local table was a big part of getting an
> early benchmark down from a 14+% performance hit for SSI to a 1.8%
> performance hit.

Yes, it's definitely important for a backend to be able to check whether it's already holding a lock (even if that's just a hint) without having to take locks.

Let me add one more piece of info for the benefit of the list: a backend's local lock table contains not just locks held by the backend, but also an entry and refcount for every parent of a lock it holds. This is used to determine when to promote to one of the coarser-grained parent locks. It's both unnecessary and undesirable for that info to be in shared memory.

Dan

--
Dan R. K. Ports
MIT CSAIL
http://drkp.net/
Re: [HACKERS] Binary in/out for aclitem
[ removing Radoslaw from the CC list, as his email is bouncing every time ]

On Tue, Feb 22, 2011 at 8:20 PM, Tom Lane wrote:

> Robert Haas writes:
>> On Tue, Feb 22, 2011 at 5:24 PM, Tom Lane wrote:
>>> It'd be more future-proof than this patch, but I'm still unconvinced
>>> about the use-case.
>>
>> Do we want to intentionally make binary format a second-class citizen?
>
> Well, it's not exactly a first-class citizen; compare for instance the
> amount of verbiage in the docs about text I/O formats versus the amount
> about binary formats. But my question isn't about that; it's about why
> aclitem should be considered a first-class citizen. It makes me
> uncomfortable that client apps are looking at it at all, because any
> that do are bound to get broken in the future, even assuming that they
> get the right answers today. I wonder how many such clients are up to
> speed for per-column privileges and non-constant default privileges for
> instance. And sepgsql is going to cut them off at the knees.

Well, unfortunately, there's an awful lot of information that can only be obtained in a reasonable way by introspection of the system catalogs. If you want to know whether user A can select from table B, there's really no sensible way of obtaining that without parsing the aclitem entries in some fashion, and unfortunately that's just the tip of the iceberg. One of the first applications I wrote for PG included a tool to upgrade the production schema to match the dev schema, which promptly got broken when (I'm dating myself here[1]) PG added support for dropping columns (7.3) and recording the grantor on aclitems (7.4). I'm not going to claim that there aren't better ways of trying to solve the problems that I was trying to solve that day, but at the time it seemed like the best solution, and if I had a dollar for every other person who has written a similar application, I am pretty sure I could afford at least a pizza, if not dinner at Fogo de Chao.

Now, if you were to propose adding a well-designed set of DCL commands to expose this kind of information to clients in a more structured way, I would be the first to applaud. LIST TABLES? SHOW GRANTS TO? Sign me up! (I got a request for the latter just today.) But until then, if you need this information, you don't have much choice but to pull it out of the system catalogs; and if JDBC would rather use binary format to talk to the server, I don't particularly see any reason to say "no". If we prefer to expose the text format rather than anything else, that's OK with me, but I do think it would make sense to expose something.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] Insert obligatory joke about how no one else would.
[HACKERS] psql tab-completion for CREATE UNLOGGED TABLE
Here is a patch to support CREATE UNLOGGED TABLE in psql tab-completion. It also fixes a bug where DROP was unexpectedly completed with TEMP and UNIQUE, and cleans up code for DROP OWNED BY in drop_command_generator().

--
Itagaki Takahiro

psql_tab_completion_for_unlogged-20110223.patch
Description: Binary data
Re: [HACKERS] Binary in/out for aclitem
Robert Haas writes: > On Tue, Feb 22, 2011 at 5:24 PM, Tom Lane wrote: >> It'd be more future-proof than this patch, but I'm still unconvinced >> about the use-case. > Do we want to intentionally make binary format a second-class citizen? Well, it's not exactly a first-class citizen; compare for instance the amount of verbiage in the docs about text I/O formats versus the amount about binary formats. But my question isn't about that; it's about why aclitem should be considered a first-class citizen. It makes me uncomfortable that client apps are looking at it at all, because any that do are bound to get broken in the future, even assuming that they get the right answers today. I wonder how many such clients are up to speed for per-column privileges and non-constant default privileges for instance. And sepgsql is going to cut them off at the knees. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Snapshot synchronization, again...
On Tue, Feb 22, 2011 at 8:00 PM, Joachim Wieland wrote: > Both Tom and Robert voted quite explicitly against the > store-in-shared-memory idea. No, I voted *for* that approach. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Binary in/out for aclitem
On Tue, Feb 22, 2011 at 5:24 PM, Tom Lane wrote: > Andrew Dunstan writes: >> On 02/22/2011 05:04 PM, Tom Lane wrote: >>> I think this one has got far less use-case than the other, and I don't >>> want to expose the internal representation of ACLITEM to the world. > >> The sendv for enums sends the label, and ISTR there are some others that >> send the text representation also. Would that be better? > > It'd be more future-proof than this patch, but I'm still unconvinced > about the use-case. Do we want to intentionally make binary format a second-class citizen? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Snapshot synchronization, again...
On Tue, Feb 22, 2011 at 3:34 PM, Heikki Linnakangas wrote: > On 22.02.2011 16:29, Robert Haas wrote: >> On Tue, Feb 22, 2011 at 8:58 AM, Heikki Linnakangas >> wrote: >>> No, the hash is stored in shared memory. The hash of the garbage has to >>> match. >> >> Oh. Well that's really silly. At that point you might as well just >> store the snapshot and an integer identifier in shared memory, right? > > Yes, that's the point I was trying to make. I believe the idea of a hash was > that it takes less memory than storing the whole snapshot (and more > importantly, a fixed amount of memory per snapshot). But I'm not convinced > either that dealing with a hash is any less troublesome. Both Tom and Robert voted quite explicitly against the store-in-shared-memory idea. Others don't want to allow people to request arbitrary snapshots, and still others wanted to pass the snapshot through the client so that in the future we could also request snapshots from standby servers. The hash idea seemed to at least make nobody unhappy. For easier review, here are a few links to the previous discussion: http://archives.postgresql.org/pgsql-hackers/2010-12/msg00361.php http://archives.postgresql.org/pgsql-hackers/2010-12/msg00383.php http://archives.postgresql.org/pgsql-hackers/2010-12/msg00481.php http://archives.postgresql.org/pgsql-hackers/2010-12/msg02454.php Why exactly, Heikki, do you think the hash is more troublesome? And how could we validate/invalidate snapshots without the checksum (assuming the through-the-client approach instead of storing the whole snapshot in shared memory)? Joachim -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
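To make the checksum question concrete, here is a toy sketch of the validate-by-hash scheme under discussion. The hash function (FNV-1a) and the serialized snapshot text are stand-ins invented for illustration; only the shape of the check matters: the exporting backend stores a hash in shared memory, and the importing backend recomputes it over whatever the client handed back.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/*
 * Sketch of hash-validated snapshot passing: the exporting backend
 * hashes the serialized snapshot and stores only the hash in shared
 * memory; the importing backend recomputes the hash over the snapshot
 * data received from the client and accepts it only on a match.
 * FNV-1a stands in for whatever hash would really be used; the
 * "shared memory" slot is just a caller-held variable here.
 */
static uint64_t
snapshot_hash(const unsigned char *data, size_t len)
{
    uint64_t h = 14695981039346656037ULL;   /* FNV offset basis */
    size_t   i;

    for (i = 0; i < len; i++)
    {
        h ^= data[i];
        h *= 1099511628211ULL;              /* FNV prime */
    }
    return h;
}

static int
snapshot_import_ok(uint64_t stored_hash,
                   const unsigned char *client_data, size_t len)
{
    return snapshot_hash(client_data, len) == stored_hash;
}
```

The point of contention above is whether maintaining these stored hashes is really simpler than just keeping the snapshots themselves in shared memory.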
Re: [HACKERS] review: FDW API
I wrote: > I did a bit of poking around here, and came to the following > conclusions: > 1. We don't want to add another RTEKind. RTE_RELATION basically has the > semantics of "anything with a pg_class OID", so it ought to include > foreign tables. Therefore the fix ought to be to add relkind to > RangeTblEntry. (BTW, so far as I can tell, RTE_SPECIAL is obsolete: > there are assorted switch cases that handle it, but no place that can > generate the value. I'm inclined to delete it while we are messing > with this.) > 2. In the current code layout, to make sense of relkind you need to > #include catalog/pg_class.h where the values for relkind are #defined. > I dislike the idea of that being true for a field of such a widely-known > struct as RangeTblEntry. Accordingly, I suggest that we move those > values into parsenodes.h. (Perhaps we could convert them to an enum, > too, though still keeping the same ASCII values.) > 3. We can have the rewriter update an RTE's stored value of relkind > during AcquireRewriteLocks: it opens the rel for each RTE_RELATION entry > anyway, so copying over the relkind is essentially free. While it's not > instantly obvious that that is "soon enough", I think that it is, since > up to the point of acquiring a lock there we can't assume that the rel > isn't being changed or dropped underneath us --- that is, any earlier > test on an RTE's relkind might be testing just-obsoleted state anyway. > 4. I had hoped that we might be able to get rid of some pre-existing > syscache lookups, but at least so far as the parse/plan/execute chain > is concerned, there don't seem to be any other places that need to > fetch a relkind based on just an RTE entry. > So point #4 is a bit discouraging, but other than that, this seems like > a fairly straightforward exercise. I'm inclined to go ahead and do it, > unless there are objections. Applied, except I ended up not moving the RELKIND #defines as suggested in point #2. 
Those #defines are used by pg_dump, and having pg_dump.c #include parsenodes.h seemed like a bad idea. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI bug?
Dan Ports wrote: > On Tue, Feb 22, 2011 at 10:51:05AM -0600, Kevin Grittner wrote: > The theory before was that the local lock table would only have > false negatives, i.e. if it says we hold a lock then we really do. > That makes it a useful heuristic because we can bail out quickly > if we're trying to re-acquire a lock we already hold. It seems > worthwhile to try to preserve that. > > This property holds as long as the only time one backend edits > another's lock list is to *add* a new lock, not remove an existing > one. Fortunately, this is true most of the time (at a glance, it > seems to be the case for the new tuple update code). > > There are two exceptions; I think they're both OK but we need to > be careful here. > - if we're forced to promote an index page lock's granularity to > avoid running out of shared memory, we remove the fine-grained > one. This is fine as long as we relax our expectations to "if > the local lock table says we hold a lock, then we hold that > lock or one that covers it", which is acceptable for current > users of the table. That makes sense. This one sounds OK. > - if we combine two index pages, we remove the lock entry for the > page being deleted. I think that's OK because the page is being > removed so we will not make any efforts to lock it. I'm not sure it's safe to assume that the index page won't get reused before the local lock information is cleared. In the absence of a clear proof that it is safe, or some enforcement mechanism to ensure that it is, I don't think we should make this assumption. Off-hand I can't think of a clever way to make this safe which would cost less than taking out the LW lock and checking the definitive shared memory HTAB, but that might be for lack of creative thinking at the moment. >> Yeah, as far as I can recall the only divergence was in *page* >> level entries for *indexes* until this latest patch. We now have >> *tuple* level entries for *heap* relations, too. > > *nod*. 
> I'm slightly concerned about the impact of that on > granularity promotion -- the new locks created by heap updates > won't get counted toward the lock promotion thresholds. That's not > a fatal problem or anything, but it could make granularity > promotion less effective at conserving lock entries. This pattern doesn't seem to come up very often in most workloads. Since it's feeding into a heuristic which already triggers pretty quickly I think we should be OK with this. It makes me less tempted to tinker with the threshold for promoting tuple locks to page locks, though. The only alternative I see would be to use some form of asynchronous notification of the new locks so that the local table can be maintained. That seems overkill without some clear evidence that it is needed. I *really* wouldn't want to go back to needing LW locks to maintain this info; as you know (and stated only for the benefit of the list), it was a pretty serious contention point in early profiling and adding the local table was a big part of getting an early benchmark down from a 14+% performance hit for SSI to a 1.8% performance hit. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] How to extract a value from a record using attnum or attname?
Andrew Dunstan wrote: > Have you performance tested it? Scanning pg_index for index > columns for each row strikes me as likely to be unpleasant. I haven't, yet. I had rather assumed that the index info for a relation would have a high probability of being cached during execution of an AFTER trigger for that relation, so I think we're talking RAM access here. It didn't seem sane to try to create an HTAB for this and worry about invalidation of it, etc. If there's a faster way to get to the info without going to such extremes, I'd be happy to hear about it. (At least I avoided building and parsing a query to get at it.) > Also, the error messages seem to need a bit of work (no wonder > they seemed familiar to me :) ) [blush] I was just trying to write code that "fits in with surrounding code", as recommended. You mean I should change the function name in the message from the name of the function I copied *from* to the name of the function I copied *to*? Well, I *guess* it still fits in ;-) Seriously, thanks for pointing that out. Will fix. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI bug?
On Tue, Feb 22, 2011 at 10:51:05AM -0600, Kevin Grittner wrote: > Dan Ports wrote: > > > It looks like CheckTargetForConflictsIn is making the assumption > > that the backend-local lock table is accurate, which was probably > > even true at the time it was written. > > I remember we decided that it could only be false in certain ways > which allowed us to use it as a "lossy" first-cut test in a couple > places. I doubt that we can count on any of that any longer, and > should drop those heuristics. Hmm. The theory before was that the local lock table would only have false negatives, i.e. if it says we hold a lock then we really do. That makes it a useful heuristic because we can bail out quickly if we're trying to re-acquire a lock we already hold. It seems worthwhile to try to preserve that. This property holds as long as the only time one backend edits another's lock list is to *add* a new lock, not remove an existing one. Fortunately, this is true most of the time (at a glance, it seems to be the case for the new tuple update code). There are two exceptions; I think they're both OK but we need to be careful here. - if we're forced to promote an index page lock's granularity to avoid running out of shared memory, we remove the fine-grained one. This is fine as long as we relax our expectations to "if the local lock table says we hold a lock, then we hold that lock or one that covers it", which is acceptable for current users of the table. - if we combine two index pages, we remove the lock entry for the page being deleted. I think that's OK because the page is being removed so we will not make any efforts to lock it. > Yeah, as far as I can recall the only divergence was in *page* > level entries for *indexes* until this latest patch. We now have > *tuple* level entries for *heap* relations, too. *nod*. 
I'm slightly concerned about the impact of that on granularity promotion -- the new locks created by heap updates won't get counted toward the lock promotion thresholds. That's not a fatal problem or anything, but it could make granularity promotion less effective at conserving lock entries. > > The solution is only slightly more complicated than just removing > > the assertion. > > That's certainly true for that one spot, but we need an overall > review of where we might be trying to use LocalPredicateLockHash for > "first cut" tests as an optimization. Yes, I'd planned to go through the references to LocalPredicateLockHash to make sure none of them were making any unwarranted assumptions about the results. Fortunately, there are not too many of them... Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
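The heuristic being discussed in this thread can be reduced to a toy model. In the sketch below, plain arrays stand in for the real local and shared hash tables, and a counter stands in for LWLock-protected shared-memory trips; everything here is invented for illustration. It shows why the weakened invariant -- a local hit means we hold that lock *or one that covers it* -- still lets the fast path skip shared memory safely:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model of the backend-local predicate lock table heuristic.
 * local_held[] may contain stale entries after granularity promotion,
 * but only entries that are covered by some coarser lock we do hold,
 * so a local hit is still a safe reason to skip the shared table.
 */
#define MAX_TARGETS 8

static bool local_held[MAX_TARGETS];    /* backend-local, no locking */
static bool shared_held[MAX_TARGETS];   /* stands in for the shared HTAB */
static int  shared_lookups;             /* counts simulated LWLock trips */

static void
acquire_lock(int target)
{
    if (local_held[target])
        return;                 /* fast path: no shared-memory access */
    shared_lookups++;           /* slow path: would take the LWLock here */
    shared_held[target] = true;
    local_held[target] = true;
}

/*
 * Promotion: replace a fine-grained lock with a covering coarse one,
 * dropping the fine entry from shared memory but leaving the stale
 * local entry behind -- the "covered by" case from the thread.
 */
static void
promote_lock(int fine, int coarse)
{
    shared_held[fine] = false;
    shared_held[coarse] = true;
    local_held[coarse] = true;
}
```

The stale local entry for the fine-grained target never causes a wrong answer, because re-acquiring it is a no-op that is covered by the coarse lock; that is the property the thread is trying to preserve.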
Re: [HACKERS] Re: [GENERAL] How to extract a value from a record using attnum or attname?
On 02/22/2011 05:32 PM, Kevin Grittner wrote: [moving to -hackers with BC to -general] Dimitri Fontaine wrote: "Kevin Grittner" writes: PL/pgSQL seems tantalizingly close to being useful for developing a generalized trigger function for notifying the client of changes. I don't know whether I'm missing something or whether we're missing a potentially useful feature here. Does anyone see how to fill in where the commented question is, or do I need to write this function in C? See those: http://tapoueh.org/articles/blog/_Dynamic_Triggers_in_PLpgSQL.html http://www.pgsql.cz/index.php/PL_toolbox_%28en%29#Record.27s_functions for i in array_lower(keycols, 1)..array_upper(keycols, 1) loop select quote_ident(attname) from pg_catalog.pg_attribute where attrelid = tg_relid and attnum = keycols[i]::oid Beware of attisdropped, which I've not fixed in the published URL before (the tapoueh.org one). Thanks. In the absence of an earlier response, though, I went ahead and wrote the attached, which has passed some initial programmer testing and is scheduled to start business analyst testing tomorrow with the application software for production deployment in a couple months. We probably won't go back to PL/pgSQL for this now. I'm assuming that while I have an AccessShareLock on the index relation for the primary key, any attributes it tells me are used by that relation will not have the attisdropped flag set? What this trigger function does is to issue a NOTIFY to the channel specified as a parameter to the function in CREATE TRIGGER (with 'tcn' as the default), and a payload consisting of the table name, a code for the operation (Insert, Update, or Delete), and the primary key values. So, an update to a Party record for us might generate this NOTIFY payload: "Party",U,"countyNo"='71',"caseNo"='2011CF001234',"partyNo"='1' This is one of those things which our shop needs, but I was planning to post it for the first 9.2 CF fest to see if anyone else was interested. 
It struck me while typing this post that for general use the schema would probably need to be in there, but I'll worry about that later, if anyone else *is* interested. If anyone wants it I can provide Java code to tear apart the NOTIFY payloads using the Pattern and Matcher classes. I'll add to the first 9.2 CF referencing this post. Have you performance tested it? Scanning pg_index for index columns for each row strikes me as likely to be unpleasant. Also, the error messages seem to need a bit of work (no wonder they seemed familiar to me :) ) cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [GENERAL] How to extract a value from a record using attnum or attname?
[moving to -hackers with BC to -general] Dimitri Fontaine wrote: > "Kevin Grittner" writes: > >> PL/pgSQL seems tantalizingly close to being useful for developing >> a generalized trigger function for notifying the client of >> changes. I don't know whether I'm missing something or whether >> we're missing a potentially useful feature here. Does anyone see >> how to fill in where the commented question is, or do I need to >> write this function in C? > > See those: > > http://tapoueh.org/articles/blog/_Dynamic_Triggers_in_PLpgSQL.html > http://www.pgsql.cz/index.php/PL_toolbox_%28en%29#Record.27s_functions > >> for i in array_lower(keycols, 1)..array_upper(keycols, 1) loop >> select quote_ident(attname) from pg_catalog.pg_attribute >> where attrelid = tg_relid and attnum = keycols[i]::oid > > Beware of attisdropped, which I've not fixed in the published URL > before (the tapoueh.org one). Thanks. In the absence of an earlier response, though, I went ahead and wrote the attached, which has passed some initial programmer testing and is scheduled to start business analyst testing tomorrow with the application software for production deployment in a couple months. We probably won't go back to PL/pgSQL for this now. I'm assuming that while I have an AccessShareLock on the index relation for the primary key, any attributes it tells me are used by that relation will not have the attisdropped flag set? What this trigger function does is to issue a NOTIFY to the channel specified as a parameter to the function in CREATE TRIGGER (with 'tcn' as the default), and a payload consisting of the table name, a code for the operation (Insert, Update, or Delete), and the primary key values. So, an update to a Party record for us might generate this NOTIFY payload: "Party",U,"countyNo"='71',"caseNo"='2011CF001234',"partyNo"='1' This is one of those things which our shop needs, but I was planning to post it for the first 9.2 CF fest to see if anyone else was interested. 
It struck me while typing this post that for general use the schema would probably need to be in there, but I'll worry about that later, if anyone else *is* interested. If anyone wants it I can provide Java code to tear apart the NOTIFY payloads using the Pattern and Matches classes. I'll add to the first 9.2 CF referencing this post. -Kevin *** a/src/backend/utils/adt/trigfuncs.c --- b/src/backend/utils/adt/trigfuncs.c *** *** 13,21 */ #include "postgres.h" ! #include "access/htup.h" #include "commands/trigger.h" #include "utils/builtins.h" /* --- 13,25 */ #include "postgres.h" ! #include "executor/spi.h" ! #include "catalog/indexing.h" ! #include "commands/async.h" #include "commands/trigger.h" #include "utils/builtins.h" + #include "utils/fmgroids.h" + #include "utils/tqual.h" /* *** *** 93,95 suppress_redundant_updates_trigger(PG_FUNCTION_ARGS) --- 97,261 return PointerGetDatum(rettuple); } + + + /* + * Copy from s (for source) to r (for result), wrapping with q (quote) + * characters and doubling any quote characters found. + */ + static char * + strcpy_quoted(char *r, const char *s, const char q) + { + *r++ = q; + while (*s) + { + if (*s == q) + *r++ = q; + *r++ = *s; + s++; + } + *r++ = q; + return r; + } + + /* + * triggered_change_notification + * + * This trigger function will send a notification of data modification with + * primary key values.The channel will be "tcn" unless the trigger is + * created with a parameter, in which case that parameter will be used. 
+ */ + Datum + triggered_change_notification(PG_FUNCTION_ARGS) + { + TriggerData *trigdata = (TriggerData *) fcinfo->context; + Trigger*trigger; + int nargs; + HeapTuple trigtuple, + newtuple; + HeapTupleHeader trigheader, + newheader; + Relationrel; + TupleDesc tupdesc; + RelationindexRelation; + ScanKeyData skey; + SysScanDesc scan; + HeapTuple indexTuple; + char *channel; + charoperation; + charpayload[200]; + char *p; + boolfoundPK; + + /* make sure it's called as a trigger */ + if (!CALLED_AS_TRIGGER(fcinfo)) + ereport(ERROR, + (errcode(ERRCODE_E_R_I_E_TRIGGER_PROTOCOL_VIOLATED), + errmsg("triggered_change_notification: must be called as trigger"))); + + /* and that it's called after the change */ + if (!TRIGGER_FIRED_AFTER(trigdata->tg_event)) + ereport(ERROR, + (errcode(ERRCODE_E_R_I_E_TRIGGER_PROTOCOL_VIOLATED), +e
Re: [HACKERS] Binary in/out for aclitem
Andrew Dunstan writes: > On 02/22/2011 05:04 PM, Tom Lane wrote: >> I think this one has got far less use-case than the other, and I don't >> want to expose the internal representation of ACLITEM to the world. > The sendv for enums sends the label, and ISTR there are some others that > send the text representation also. Would that be better? It'd be more future-proof than this patch, but I'm still unconvinced about the use-case. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Binary in/out for aclitem
On 02/22/2011 05:04 PM, Tom Lane wrote: Radosław Smogura writes: Actually one more POD is left: it's aclitem :). In Java, for example, it is used to obtain column privileges; I assume some folks may want to use it too. I think this one has got far less use-case than the other, and I don't want to expose the internal representation of ACLITEM to the world. So, nope, not excited about it. The sendv for enums sends the label, and ISTR there are some others that send the text representation also. Would that be better? cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Binary in/out for aclitem
Radosław Smogura writes: > Actually one more POD is left: it's aclitem :). In Java, for example, it is used to > obtain column privileges; I assume some folks may want to use it too. I think this one has got far less use-case than the other, and I don't want to expose the internal representation of ACLITEM to the world. So, nope, not excited about it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pl/python quoting functions
Committed this, with two changes: Changed some things around with the way const char * is propagated. Just casting it away is not nice. Also dropped the error tests in the _quote.sql regression test. This generates three different wordings of error messages from Python with 2.6, 3.1, and 3.2, which I don't care to maintain. Maybe one day we'll have a better solution for this. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] PgEast talks and trainings up
Hello, Per the customary URL: https://www.postgresqlconference.org/ JD -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579 Consulting, Training, Support, Custom Development, Engineering http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Binary in/out for aclitem
Hi, Actually one more POD is left: it's aclitem :). In Java, for example, it is used to obtain column privileges; I assume some folks may want to use it too. I tested only recv :-( Actually I don't know if the idea of such a format is OK, but my intention was to send role names, so the driver doesn't need to ask for the name (like for regproc), and to keep together the human-based approach (text) and the binary approach. Kind regards, Radek diff --git a/.gitignore b/.gitignore index 1be11e8..0d594f9 100644 --- a/.gitignore +++ b/.gitignore @@ -17,3 +17,5 @@ objfiles.txt /GNUmakefile /config.log /config.status +/nbproject/private/ +/nbproject diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c index 691ba3b..33877cb 100644 --- a/src/backend/utils/adt/acl.c +++ b/src/backend/utils/adt/acl.c @@ -33,6 +33,7 @@ #include "utils/lsyscache.h" #include "utils/memutils.h" #include "utils/syscache.h" +#include "libpq/pqformat.h" typedef struct @@ -78,6 +79,10 @@ static void putid(char *p, const char *s); static Acl *allocacl(int n); static void check_acl(const Acl *acl); static const char *aclparse(const char *s, AclItem *aip); + +/** Assigns default grantor and send warning. */ +static void aclitem_assign_default_grantor(AclItem *aip); + static bool aclitem_match(const AclItem *a1, const AclItem *a2); static int aclitemComparator(const void *arg1, const void *arg2); static void check_circularity(const Acl *old_acl, const AclItem *mod_aip, @@ -209,6 +214,14 @@ putid(char *p, const char *s) *p = '\0'; } +/** Assigns default grantor and send warning. 
*/ +void aclitem_assign_default_grantor(AclItem *aip) { +aip->ai_grantor = BOOTSTRAP_SUPERUSERID; +ereport(WARNING, +(errcode(ERRCODE_INVALID_GRANTOR), + errmsg("defaulting grantor to user ID %u", +BOOTSTRAP_SUPERUSERID))); +} /* * aclparse * Consumes and parses an ACL specification of the form: @@ -343,11 +356,7 @@ aclparse(const char *s, AclItem *aip) } else { - aip->ai_grantor = BOOTSTRAP_SUPERUSERID; - ereport(WARNING, -(errcode(ERRCODE_INVALID_GRANTOR), - errmsg("defaulting grantor to user ID %u", - BOOTSTRAP_SUPERUSERID))); +aclitem_assign_default_grantor(aip); } ACLITEM_SET_PRIVS_GOPTIONS(*aip, privs, goption); @@ -643,6 +652,143 @@ aclitemout(PG_FUNCTION_ARGS) PG_RETURN_CSTRING(out); } +/** Do binary read of aclitem. Input format is same as {@link aclitem_recv}, but + * special algorithm is used to determine grantee's and grantor's OID. The reason + * is to keep backward "information" compatiblity with text mode - typical + * client (which gets instructions from user) + * may be much more interested in sending grantee and grantors name then + * OID. Detailed rule is as follow: + * If message contains only oids and privs (no name and names' length) then + * use passed OIDs (message may be truncated, we accept this, + * but both, two last fields must be not present). + * If grantee's name len or grantor's name len is {@code -1} then use respecitve + * OIDs. + * If name length is not {@code -1} then find OID for given part, and + * ensure that respective OID is {@code 0} or is equal to found OID. + * If grantor's OID is {@code 0} and grantor's name lenght is {@code -1} or + * truncated then assign superuser as grantor. 
+ */ +Datum +aclitemrecv(PG_FUNCTION_ARGS) { +StringInfo buf = (StringInfo) PG_GETARG_POINTER(0); +AclItem*aip; +intgRawLen; +char *gVal = NULL; +intgValLen; +OidgOid; + +aip = (AclItem *) palloc(sizeof(AclItem)); +aip->ai_grantee = pq_getmsgint(buf, 4); +aip->ai_grantor = pq_getmsgint(buf, 4); +aip->ai_privs = pq_getmsgint(buf, 4); + +/* Client has passed names. */ +if (buf->cursor < buf->len) { +/* Read grantee. */ +gRawLen = pq_getmsgint(buf, 4); +if (gRawLen != -1) { +gVal = pq_getmsgtext(buf, gRawLen, &gValLen); +gOid = get_role_oid(gVal, false); +if (aip->ai_grantee != 0 && aip->ai_grantee != gOid) { +ereport(ERROR, +(errcode(ERRCODE_INVALID_BINARY_REPRESENTATION), +errmsg("grantee's OID is not 0 and passed grantee's name and grantee's OID doesn't match"))); + +} +} + +/* Read grantor. */ +gRawLen = pq_getmsgint(buf, 4); +if (gRawLen != -1) { +gVal = pq_getmsgtext(buf, gRawLen, &gValLen); +gOid = get_role_oid(gVal, false); +if (aip->ai_grantor != 0 && aip->ai_grantor != gOid) { +ereport(ERROR, +(errcode(ERRCODE_INVALID_BINARY_REPRESENTATION), +errmsg("grantor's OID is not 0 and passed grantor's name and grantor's OID doesn't match"))); +} +}else { +gVal = NULL; +} +} +
[HACKERS] WIP: cross column correlation ...
hello everybody, we have spent some time in finally attacking cross column correlation, as this is an issue which has kept bugging us in a couple of applications for some years. this is a WIP patch which can do: special cross column correlation specific syntax: CREATE CROSS COLUMN STATISTICS ON tablename (field, ...); DROP CROSS COLUMN STATISTICS ON tablename (field, ...); we use specific syntax because we simply cannot keep track of all possible correlations in the DB, so the admin can take care of things explicitly. some distant day somebody might want to write a mechanism to derive the desired stats automatically, but this is beyond the scope of our project for now. as far as the patch is concerned: it is patched nicely into clauselist_selectivity(), but has some rough edges; even when a cross-col stat is found, the single col selectivities are still counted ( = lowering the selectivity even more), this is a TODO. this patch adds the grammar and the start of planner integration with a static selectivity value for now; the previous discussion about cross-column statistics can be continued and perhaps come to fruition soon. how does it work? we try to find suitable statistics for an arbitrary length list of conditions so that the planner can use them directly rather than multiplying all the selectivities. this should make estimates a lot more precise. the current approach can be extended to work with expressions as well as "straight" conditions. goal: to make cross column correlation work for 9.2 ... the purpose of this mail is mostly to get the race for a patch going and to see if the approach as such is reasonable / feasible. many thanks, hans cross-column-v5.patch Description: Binary data -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
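For readers new to the thread, the estimation error being attacked is easy to demonstrate: with only per-column statistics the planner multiplies the individual clause selectivities, which is valid only if the columns are independent. The numbers below are invented for illustration:

```c
#include <assert.h>

/*
 * What clauselist_selectivity() effectively does today for two
 * otherwise-unrelated clauses: multiply the per-column selectivities,
 * i.e. assume the columns are independent.
 */
static double
independent_estimate(double sel_x, double sel_y)
{
    return sel_x * sel_y;
}

/* How far off the independence assumption is for a given predicate. */
static double
underestimation_factor(double actual_sel, double estimated_sel)
{
    return actual_sel / estimated_sel;
}
```

If 10% of rows match x < 5 and 10% match y = 1, the planner estimates 1% for the conjunction; if the columns are strongly correlated and 9% of rows actually match both, the estimate is off by 9x, which is exactly the gap a cross-column statistic (or the joint-predicate ANALYZE suggested upthread) would close.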
Re: [HACKERS] Sync Rep v17
On Tue, Feb 22, 2011 at 5:04 AM, Greg Smith wrote: > Daniel Farina wrote: >> >> As it will be somewhat hard to prove the durability guarantees of >> commit without special heroics, unless someone can suggest a >> mechanism. > > Could you introduce a hack creating deterministic server side crashes in > order to test this out? The simplest thing that comes to mind is a rule > like "kick shared memory in the teeth to force a crash after every 100 > commits", then see if #100 shows up as expected. Pick two different small > numbers for the interval and you could probably put that on both sides to > simulate all sorts of badness. I probably could via function, would a kill -9 also be of interest to you? -- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Sync Rep v17
On Sat, Feb 19, 2011 at 11:26 PM, Robert Haas wrote: > > DEBUG: write 0/3027BC8 flush 0/3014690 apply 0/3014690 > DEBUG: released 0 procs up to 0/3014690 > DEBUG: write 0/3027BC8 flush 0/3027BC8 apply 0/3014690 > DEBUG: released 2 procs up to 0/3027BC8 > WARNING: could not locate ourselves on wait queue > server closed the connection unexpectedly > This probably means the server terminated abnormally > before or while processing the request. > The connection to the server was lost. Attempting reset: DEBUG: you can make this happen more easily, i just ran "pgbench -n -c10 -j10 test" and got that warning and sometimes a segmentation fault and sometimes a failed assertion and the problematic code starts at src/backend/replication/syncrep.c:277, here are my suggestions on that code. still i get a failed assertion because of the second Assert (i think we should just remove that one) *** SyncRepRemoveFromQueue(void) *** 288,299 if (proc->lwWaitLink == NULL) elog(WARNING, "could not locate ourselves on wait queue"); ! proc = proc->lwWaitLink; } if (proc->lwWaitLink == NULL) /* At tail */ { ! Assert(proc == MyProc); /* Remove ourselves from tail of queue */ Assert(queue->tail == MyProc); queue->tail = proc; --- 288,300 if (proc->lwWaitLink == NULL) elog(WARNING, "could not locate ourselves on wait queue"); ! else ! proc = proc->lwWaitLink; } if (proc->lwWaitLink == NULL) /* At tail */ { ! Assert(proc != MyProc); /* Remove ourselves from tail of queue */ Assert(queue->tail == MyProc); queue->tail = proc; -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte y capacitación de PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
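The list manipulation at issue reduces to unlinking a node from a singly-linked queue that tracks head and tail pointers. The sketch below is a standalone model (not the actual PGPROC/syncrep structures) of the traversal being patched, including the not-found case that should produce a warning rather than walking off the end of the list:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Reduced model of removing ourselves from a wait queue: each node
 * links forward via lwWaitLink, and the queue tracks head and tail.
 * The key points mirrored from the fix above: stop advancing when the
 * search runs out of nodes, and report absence instead of crashing.
 */
typedef struct Proc
{
    struct Proc *lwWaitLink;
} Proc;

typedef struct
{
    Proc *head;
    Proc *tail;
} WaitQueue;

/* Unlink 'me' from the queue; returns 0 on success, -1 if absent. */
static int
queue_remove(WaitQueue *q, Proc *me)
{
    if (q->head == me)
    {
        q->head = me->lwWaitLink;
        if (q->tail == me)
            q->tail = NULL;
        me->lwWaitLink = NULL;
        return 0;
    }
    for (Proc *p = q->head; p != NULL; p = p->lwWaitLink)
    {
        if (p->lwWaitLink == me)
        {
            p->lwWaitLink = me->lwWaitLink;
            if (q->tail == me)
                q->tail = p;    /* we were at the tail */
            me->lwWaitLink = NULL;
            return 0;
        }
    }
    return -1;                  /* not on the queue: warn, don't crash */
}
```

In the real code the subtleties are the predecessor search under concurrent modification and keeping the tail pointer consistent, which is where the original assertions went wrong.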
Re: [HACKERS] Re: [COMMITTERS] pgsql: Make a hard state change from catchup to streaming mode.
On Fri, Feb 18, 2011 at 11:50 AM, Robert Haas wrote: > On Fri, Feb 18, 2011 at 10:41 AM, Tom Lane wrote: >> Robert Haas writes: >>> On Fri, Feb 18, 2011 at 10:09 AM, Simon Riggs wrote: Make a hard state change from catchup to streaming mode. More useful state change for monitoring purposes, plus a required change for synchronous replication patch. >> >>> As far as I can see, this patch was not posted or discussed before >>> commit, and I'm not sure it's the behavior everyone wants. It has the >>> effect of preventing the system from ever going backwards from >>> "streaming" to "catchup". Is that what we want? >> >> That seems like a very bad idea from here. Being able to go back to >> catchup after loss of the streaming connection is essential for >> robustness. If we now have to restart the slave for that to happen, >> it's not an improvement. > > No, that's not the case where it matters. The state would get reset > on reconnect. The problem is when, say, the master server is > generating WAL at a rate which exceeds the network bandwidth of the > link between the master and the standby. The previous coding will > make us flip back into the catchup state when that happens. > > Actually, the old code is awfully sensitive, and knowing that you are > not caught up is not really enough information: you need to know how > far behind you are, and how long you've been behind for. I'm guessing > that Simon intended this patch to deal with that problem, but it's not > the right fix. So am I the only one who thinks this needs to be reverted? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Void binary patch
Robert Haas writes: > On Tue, Feb 22, 2011 at 10:15 AM, Tom Lane wrote: >> Yeah, this has been discussed before. >> >> Even though this patch is far past the CF deadline, I'm a bit tempted to >> push it into 9.1 anyway, just so we can check off that problem. > +1. Done. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Void binary patch
On Tue, Feb 22, 2011 at 8:22 AM, rsmogura wrote: > On Tue, 22 Feb 2011 08:12:23 -0600, Merlin Moncure wrote: >> >> On Tue, Feb 22, 2011 at 6:01 AM, Robert Haas >> wrote: >>> >>> On Sun, Feb 20, 2011 at 5:20 AM, Radosław Smogura >>> wrote: Just patch for missing procedures for void send/recv >>> >>> What problem does this fix? >> >> void returning functions may not be called when binary protocol is >> requested currently. this is annoying: some drivers that wrap libpq >> or the protocol directly use the binary mode exclusively and this >> causes headaches for them. put another way, 'void' is the only POD >> type missing send/recv. >> >> merlin > > Just curious what POD means? POD = 'Plain Old Data' -- one of the core types. :-). merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI bug?
hi,

> "Kevin Grittner" wrote:
>
>> I'm proceeding on this basis.
>
> Result attached.  I found myself passing around the tuple xmin value
> just about everywhere that the predicate lock target tag was being
> passed, so it finally dawned on me that this logically belonged as
> part of the target tag.  That simplified the changes, and the net
> result of following Heikki's suggestion here is the reduction of
> total lines of code by 178 while adding coverage for missed corner
> cases and fixing bugs.
>
> Thanks again, Heikki!
>
> I will test this some more tomorrow.  So far I haven't done more than
> ensure it passes the standard regression tests and the isolation
> tests added for SSI.  The latter took a while because the hash_any
> function was including uninitialized bytes past the length of the tag
> in its calculations.  We should probably either fix that or document
> it.  I had to add another couple of bytes to the tag to get it to a
> four-byte boundary to fix it.  Easy once you know that's how it
> works...
>
> The attached patch can also be viewed here:
>
> http://git.postgresql.org/gitweb?p=users/kgrittn/postgres.git;a=commitdiff;h=46fd5ea6728b566c521ec83048bc00a207289dd9
>
> If this stands up to further testing, the only issue I know of for
> the 9.1 release is to update the documentation of shared memory usage
> to include the new structures.
>
> -Kevin

I tested ede45e90dd1992bfd3e1e61ce87bad494b81f54d + ssi-multi-update-1.patch with my application and got the following assertion failure.

YAMAMOTO Takashi

Core was generated by `postgres'.
Program terminated with signal 6, Aborted.
#0  0xbbba4cc7 in _lwp_kill () from /usr/lib/libc.so.12
(gdb) bt
#0  0xbbba4cc7 in _lwp_kill () from /usr/lib/libc.so.12
#1  0xbbba4c85 in raise (s=6) at /siro/nbsd/src/lib/libc/gen/raise.c:48
#2  0xbbba445a in abort () at /siro/nbsd/src/lib/libc/stdlib/abort.c:74
#3  0x0833d9f2 in ExceptionalCondition (
    conditionName=0x8493f6d "!(locallock != ((void *)0))",
    errorType=0x8370451 "FailedAssertion", fileName=0x8493ec3 "predicate.c",
    lineNumber=3657) at assert.c:57
#4  0x0827977e in CheckTargetForConflictsIn (targettag=0xbfbfce78)
    at predicate.c:3657
#5  0x0827bd74 in CheckForSerializableConflictIn (relation=0x99b5bad4,
    tuple=0xbfbfcf78, buffer=234) at predicate.c:3772
#6  0x080a5be8 in heap_update (relation=0x99b5bad4, otid=0xbfbfd038,
    newtup=0x99a0aa08, ctid=0xbfbfd03e, update_xmax=0xbfbfd044, cid=1,
    crosscheck=0x0, wait=1 '\001') at heapam.c:2593
#7  0x081c81ca in ExecModifyTable (node=0x99a0566c) at nodeModifyTable.c:624
#8  0x081b2153 in ExecProcNode (node=0x99a0566c) at execProcnode.c:371
#9  0x081b0f82 in standard_ExecutorRun (queryDesc=0x99b378f0,
    direction=ForwardScanDirection, count=0) at execMain.c:1247
#10 0xbbaaf352 in pgss_ExecutorRun (queryDesc=0x99b378f0,
    direction=ForwardScanDirection, count=0) at pg_stat_statements.c:541
#11 0xbbab5ee5 in explain_ExecutorRun (queryDesc=0x99b378f0,
    direction=ForwardScanDirection, count=0) at auto_explain.c:201
#12 0x08288ae3 in ProcessQuery (plan=0x99bb404c,
    sourceText=0x99b3781c "UPDATE file SET ctime = $1 WHERE fileid = $2",
    params=, dest=0x84d1f38, completionTag=0xbfbfd2f0 "") at pquery.c:197
#13 0x08288d0a in PortalRunMulti (portal=0x99b3f01c, isTopLevel=,
    dest=dwarf2_read_address: Corrupted DWARF expression.
) at pquery.c:1268
#14 0x0828984a in PortalRun (portal=0x99b3f01c, count=2147483647,
    isTopLevel=1 '\001', dest=0x99b07428, altdest=0x99b07428,
    completionTag=0xbfbfd2f0 "") at pquery.c:822
#15 0x08286c22 in PostgresMain (argc=2, argv=0xbb9196a4,
    username=0xbb9195f8 "takashi") at postgres.c:2004
#16 0x082413f6 in ServerLoop () at postmaster.c:3590
#17 0x082421a8 in PostmasterMain (argc=3, argv=0xbfbfe594) at postmaster.c:1110
#18 0x081e0d09 in main (argc=3, argv=0xbfbfe594) at main.c:199
(gdb) fr 4
#4  0x0827977e in CheckTargetForConflictsIn (targettag=0xbfbfce78)
    at predicate.c:3657
3657		Assert(locallock != NULL);
(gdb) list
3652
3653		locallock = (LOCALPREDICATELOCK *)
3654			hash_search_with_hash_value(LocalPredicateLockHash,
3655						targettag, targettaghash,
3656						HASH_FIND, NULL);
3657		Assert(locallock != NULL);
3658		Assert(locallock->held);
3659		locallock->held = false;
3660
3661		if (locallock->childLocks == 0)
(gdb)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/
Re: [HACKERS] pg_resetxlog display bogosity
2011/2/22 Alvaro Herrera : > Excerpts from Bruce Momjian's message of vie feb 18 23:41:18 -0300 2011: >> >> Is this a TODO item? > > Only to me, it seems. Looks like your suggestion has had a positive impact so far :-) +1 for fixing the bogus output now rather than waiting for 9.2 via a TODO. > > > -- > Álvaro Herrera > The PostgreSQL Company - Command Prompt, Inc. > PostgreSQL Replication, Consulting, Custom Development, 24x7 support > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- Cédric Villemain 2ndQuadrant http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_basebackup and wal streaming
On 18.02.2011 12:02, Magnus Hagander wrote: Better late than never (or?), here's the final cleanup of pg_streamrecv for moving into the main distribution, per discussion back in late dec or early jan. It also includes the "stream logs in parallel to backup" part that was not completed on pg_basebackup. Looks reasonable at a quick glance. + /* Already know when to stop, compare to the position we got */ That sentence sounds broken. + * The background stream will use it's own database connection so we can s/it's/its/ -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI bug?
Dan Ports wrote: > It looks like CheckTargetForConflictsIn is making the assumption > that the backend-local lock table is accurate, which was probably > even true at the time it was written. I remember we decided that it could only be false in certain ways which allowed us to use it as a "lossy" first-cut test in a couple places. I doubt that we can count on any of that any longer, and should drop those heuristics. > the new changes for tuple versions make it more likely that this > will actually come up. Yeah, as far as I can recall, the only divergence was in *page* level entries for *indexes* until this latest patch. We now have *tuple* level entries for *heap* relations, too. > The solution is only slightly more complicated than just removing > the assertion. That's certainly true for that one spot, but we need an overall review of where we might be trying to use LocalPredicateLockHash for "first cut" tests as an optimization. > Unless Kevin beats me to it, I'll put together a patch later > tonight or tomorrow. (I'm at the airport right now.) It would be great if you could get this one. Thanks. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_resetxlog display bogosity
Excerpts from Bruce Momjian's message of vie feb 18 23:41:18 -0300 2011: > > Is this a TODO item? Only to me, it seems. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SR standby hangs
Greg Stark writes: > On Tue, Feb 22, 2011 at 12:55 PM, Robert Haas wrote: >> A little OT, but ISTM that the buffer pin mechanism by its nature is >> prone to lock upgrade hazards. > Except that pins don't block exclusive locks so there's no deadlock risk. > The oddity here is on Vacuums super-exclusive "lock" which is the real > equivalent of an "exclusive lock". However there's the added bonus > that there can only be one vacuum on a table at a time. That makes it > safe

We have seen deadlocks arising from this type of scenario:

	autovac has vacuum lock on table X
	autovac blocks waiting for cleanup lock on buffer B in X
	process P has pin on B due to a suspended query (e.g. a cursor)
	P tries to get exclusive lock on X, is blocked by autovac's lock

The heavyweight-lock manager fails to recognize deadlock because it doesn't know about the buffer-level LWLock. > It might be interesting to have autovacuum skip a block it finds > pinned for too long. +1, although as somebody pointed out nearby, this will only be legal if it's not a vacuum-to-prevent-wraparound situation. > Incidentally, even if we allowed multiple vacuum processes per table I > think it could be coded to be safe as long as each vacuum only needs > to acquire the super exclusive lock on a single block at a time and > doesn't try to acquire other locks while holding it. IIRC, it's cleaning the indexes that is problematic. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PostgreSQL FDW update
On Mon, Feb 21, 2011 at 10:31 AM, Tom Lane wrote: > Heikki Linnakangas writes: >> I needed something to test the FDW API patch with, and didn't want to >> get involved in the COPY API changes, and also wanted to have something >> that needs real connection management and can push down quals. So I >> updated the postgresql_fdw patch to work with the latest FDW patch. > >> Here. It's a bit of a mess, but it works for simple queries.. > > I'm going to set this CF item back to Waiting on Author, because (a) > there doesn't seem to be an actual concrete patch presented at the > moment, only a stack of hard-to-follow deltas, and (b) the patch > certainly needs updates for the extension mechanism and committed > FDW API anyway. Is anyone actually working on a new version of this patch sufficiently rapidly that we can expect a new version in the next day or two? If not, I think we mark this one Returned with Feedback and revisit it for 9.2. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Void binary patch
On Tue, Feb 22, 2011 at 10:15 AM, Tom Lane wrote: > Merlin Moncure writes: >> On Tue, Feb 22, 2011 at 6:01 AM, Robert Haas wrote: >>> What problem does this fix? > >> void returning functions may not be called when binary protocol is >> requested currently. this is annoying: some drivers that wrap libpq >> or the protocol directly use the binary mode exclusively and this >> causes headaches for them. put another way, 'void' is the only POD >> type missing send/recv. > > Yeah, this has been discussed before. > > Even though this patch is far past the CF deadline, I'm a bit tempted to > push it into 9.1 anyway, just so we can check off that problem. +1. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Sync Rep v17
Marti Raudsepp writes: > On Tue, Feb 22, 2011 at 07:38, Fujii Masao wrote: >> + SpinLockAcquire(&WalSndCtl->ctlmutex); >> + result = WalSndCtl->sync_rep_service_available; >> + SpinLockRelease(&WalSndCtl->ctlmutex); >> volatile pointer needs to be used to prevent code rearrangement. > I don't think that's necessary. Spinlock functions already prevent > reordering using __asm__ __volatile__ You're mistaken. We started using that volatile-pointer convention after noting that some compilers would misoptimize the code otherwise. It's not a problem with LWLock-protected stuff because the LWLock calls are actual out-of-line function calls, and the compiler knows it doesn't know what those functions might do. But gcc is a lot more willing to reorder stuff around asm operations, so you can't assume that SpinLockAcquire/SpinLockRelease are equally safe. The way to prevent optimization bugs is to make sure that the fetches/stores protected by a spinlock are done through volatile pointers. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
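Tom's point can be illustrated with a minimal, self-contained sketch. Nothing here is real PostgreSQL code: `slock_t` and the two lock functions are trivial stand-ins (the real ones are inline asm, which is exactly why the compiler feels free to reorder plain loads and stores around them), and the struct is modeled loosely on the quoted patch.

```c
#include <assert.h>

/* Toy stand-ins for slock_t and the spinlock primitives. */
typedef int slock_t;
static void SpinLockAcquire(slock_t *lock) { *lock = 1; }
static void SpinLockRelease(slock_t *lock) { *lock = 0; }

/* Hypothetical shared struct, loosely following the quoted patch. */
typedef struct WalSndCtlData
{
	slock_t		ctlmutex;
	int			sync_rep_service_available;	/* protected by ctlmutex */
} WalSndCtlData;

static int
get_sync_rep_service_available(WalSndCtlData *ctl)
{
	/*
	 * The convention Tom describes: read the protected field through a
	 * volatile-qualified pointer, so the compiler cannot hoist the fetch
	 * above SpinLockAcquire or sink it below SpinLockRelease.
	 */
	volatile WalSndCtlData *vctl = ctl;
	int			result;

	SpinLockAcquire(&ctl->ctlmutex);
	result = vctl->sync_rep_service_available;
	SpinLockRelease(&ctl->ctlmutex);
	return result;
}
```

With an LWLock the out-of-line function call acts as a compiler barrier for free; with a spinlock, the volatile pointer is what supplies that guarantee.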
Re: [HACKERS] SR standby hangs
On Tue, Feb 22, 2011 at 12:55 PM, Robert Haas wrote: > A little OT, but ISTM that the buffer pin mechanism by its nature is > prone to lock upgrade hazards. A cleanup lock is essentially an > access exclusive lock on the buffer, while a buffer pin is an access > share lock. In the middle, we have the usual share and exclusive > (content) locks. We regularly try to upgrade buffer pins to any of > the higher locking levels, which is quite unlike what we do with > regular (heavyweight) locks, where we take some pains to avoid lock > upgrades. Except that pins don't block exclusive locks, so there's no deadlock risk. The oddity here is vacuum's super-exclusive "lock", which is the real equivalent of an "exclusive lock". However, there's the added bonus that there can only be one vacuum on a table at a time. That makes it safe. It might be interesting to have autovacuum skip a block it finds pinned for too long. Since it's willing to abort entirely if someone gets a conflicting lock, skipping a block doesn't seem like a big deal. It won't let us advance the frozenxid but it'll speed up a subsequent vacuum. At some point I wonder if we'll find ourselves wanting to keep track of frozenxid per block and keep a kind of priority queue of blocks to vacuum based on their age. I continue to wonder whether we're reinventing standard garbage collection processes. Incidentally, even if we allowed multiple vacuum processes per table I think it could be coded to be safe as long as each vacuum only needs to acquire the super exclusive lock on a single block at a time and doesn't try to acquire other locks while holding it. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
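Greg's skip-pinned-blocks idea can be sketched as a toy simulation. None of this is real vacuum code: in the server the "is it pinned?" test would presumably be a conditional cleanup-lock attempt rather than a plain counter check, and the function name and arrays are invented for illustration.

```c
#include <assert.h>

#define NBLOCKS 8

/*
 * Toy model: walk the heap blocks; any block someone still has pinned is
 * skipped rather than waited on, and the caller learns how many blocks
 * were left for a later vacuum pass.  Because skipped blocks go
 * unprocessed, this pass cannot advance the table's frozenxid -- which is
 * why the idea is off-limits for anti-wraparound vacuums.
 */
static int
vacuum_skip_pinned(const int pin_count[NBLOCKS], int cleaned[NBLOCKS])
{
	int			skipped = 0;

	for (int blk = 0; blk < NBLOCKS; blk++)
	{
		if (pin_count[blk] > 0)
		{
			cleaned[blk] = 0;	/* pinned: leave it for a later vacuum */
			skipped++;
		}
		else
			cleaned[blk] = 1;	/* got the "cleanup lock": process it */
	}
	return skipped;
}
```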
Re: [HACKERS] OUTER keyword
Robert Haas writes: > On Tue, Feb 22, 2011 at 10:10 AM, Tom Lane wrote: >> I don't see a good reason to change it. The SQL standard is perfectly >> clear that OUTER is a fully reserved word. > My vote would be to change it. We don't normally reserve keywords > unnecessarily. Well, we don't like *upgrading* keywords without real good reason, but OUTER has had its current classification since forever. The argument for trying to downgrade it was to avoid breaking a plpgsql function that used to work, but I don't have a lot of sympathy for that. There are any number of cases where you used to be able to get away with using reserved words as plpgsql variable names and now cannot, and most of them are not going to be fixable like this. The scenario that concerns me is that some future SQL spec addition uses OUTER in such a way that it has to be reserved again, which isn't going to bother the committee any since they already think it's reserved. Then we have to upgrade it, and all we've accomplished is to encourage people to write unportable, non-future-proof code. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Void binary patch
Merlin Moncure writes: > On Tue, Feb 22, 2011 at 6:01 AM, Robert Haas wrote: >> What problem does this fix? > void returning functions may not be called when binary protocol is > requested currently. this is annoying: some drivers that wrap libpq > or the protocol directly use the binary mode exclusively and this > causes headaches for them. put another way, 'void' is the only POD > type missing send/recv. Yeah, this has been discussed before. Even though this patch is far past the CF deadline, I'm a bit tempted to push it into 9.1 anyway, just so we can check off that problem. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] OUTER keyword
On Tue, Feb 22, 2011 at 10:10 AM, Tom Lane wrote: > Heikki Linnakangas writes: >> On 22.02.2011 16:58, Bruce Momjian wrote: >>> It this a TODO? > >> If we want to change OUTER, we should just do it now. If not, I don't >> see a TODO here. > > I don't see a good reason to change it. The SQL standard is perfectly > clear that OUTER is a fully reserved word. My vote would be to change it. We don't normally reserve keywords unnecessarily. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] OUTER keyword
Heikki Linnakangas writes: > On 22.02.2011 16:58, Bruce Momjian wrote: >> It this a TODO? > If we want to change OUTER, we should just do it now. If not, I don't > see a TODO here. I don't see a good reason to change it. The SQL standard is perfectly clear that OUTER is a fully reserved word. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] OUTER keyword
On 22.02.2011 16:58, Bruce Momjian wrote: Heikki Linnakangas wrote: On 04.10.2010 18:23, Tom Lane wrote: I wrote: Heikki Linnakangas writes: Why is OUTER a type_func_name_keyword? The grammar doesn't require that, it could as well be unreserved. Hm, you sure? All the JOIN-related keywords used to need to be at least that to avoid conflicts, IIRC. Yes. OUTER is just an optional noise word in LEFT/RIGHT OUTER JOIN. Actually, on reflection, it's possible that only JOIN itself really needs that treatment (because it can be followed by a left paren). We might have made the JOIN modifier words the same level for consistency or something. If we can back off both INNER and OUTER to unreserved, it might be worth doing. I'd be a little more worried about reducing LEFT/RIGHT/FULL, even if it works at the moment. No, can't change INNER, that creates conflicts. SELECT * FROM pg_class inner JOIN pg_namespace nsp ON nsp.oid = relnamespace; is ambiguous, "inner" could be either an alias name for pg_class or part of "INNER JOIN". I bumped into the OUTER case because we had a test case in the EnterpriseDB test suite using OUTER as a PL/pgSQL variable name. It used to work, at least in simple cases where you don't try to use "LEFT OUTER JOIN", in 8.4 when PL/pgSQL replaced it with $1 in any SQL statements before passing them to the backend. But not anymore in 9.0. Is this a TODO? If we want to change OUTER, we should just do it now. If not, I don't see a TODO here. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] OUTER keyword
Heikki Linnakangas wrote: > On 04.10.2010 18:23, Tom Lane wrote: > > I wrote: > >> Heikki Linnakangas writes: > >>> Why is OUTER a type_func_name_keyword? The grammar doesn't require that, > >>> it could as well be unreserved. > > > >> Hm, you sure? All the JOIN-related keywords used to need to be at least > >> that to avoid conflicts, IIRC. > > Yes. OUTER is just an optional noise word in LEFT/RIGHT OUTER JOIN. > > > Actually, on reflection, it's possible that only JOIN itself really > > needs that treatment (because it can be followed by a left paren). > > We might have made the JOIN modifier words the same level for > > consistency or something. If we can back off both INNER and OUTER > > to unreserved, it might be worth doing. I'd be a little more worried > > about reducing LEFT/RIGHT/FULL, even if it works at the moment. > > No, can't change INNER, that creates conflicts. > > SELECT * FROM pg_class inner JOIN pg_namespace nsp ON nsp.oid = > relnamespace; > > is ambiguous, "inner" could be either an alias name for pg_class or part > of "INNER JOIN". > > I bumped into the OUTER case because we had a test case in the > EnterpriseDB test suite using OUTER as a PL/pgSQL variable name. It used > to work, at least in simple cases where you don't try to use "LEFT OUTER > JOIN", in 8.4 when PL/pgSQL replaced it with $1 in any SQL statements > before passing them to the backend. But not anymore in 9.0. Is this a TODO? -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Snapshot synchronization, again...
On Tue, Feb 22, 2011 at 9:34 AM, Heikki Linnakangas wrote: >> Oh. Well that's really silly. At that point you might as well just >> store the snapshot and an integer identifier in shared memory, right? > > Yes, that's the point I was trying to make. I believe the idea of a hash was > that it takes less memory than storing the whole snapshot (and more > importantly, a fixed amount of memory per snapshot). But I'm not convinced > either that dealing with a hash is any less troublesome. OK, sorry for taking a while to get the point. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Snapshot synchronization, again...
On 22.02.2011 16:29, Robert Haas wrote: On Tue, Feb 22, 2011 at 8:58 AM, Heikki Linnakangas wrote: On 22.02.2011 15:52, Robert Haas wrote: On Tue, Feb 22, 2011 at 8:01 AM, Heikki Linnakangas wrote: Yes. It would be good to perform those sanity checks anyway. I don't think it's good; I think it's absolutely necessary. Otherwise someone can generate arbitrary garbage, hash it, and feed it to us. No? No, the hash is stored in shared memory. The hash of the garbage has to match. Oh. Well that's really silly. At that point you might as well just store the snapshot and an integer identifier in shared memory, right? Yes, that's the point I was trying to make. I believe the idea of a hash was that it takes less memory than storing the whole snapshot (and more importantly, a fixed amount of memory per snapshot). But I'm not convinced either that dealing with a hash is any less troublesome. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Snapshot synchronization, again...
On Tue, Feb 22, 2011 at 8:58 AM, Heikki Linnakangas wrote: > On 22.02.2011 15:52, Robert Haas wrote: >> >> On Tue, Feb 22, 2011 at 8:01 AM, Heikki Linnakangas >> wrote: >>> >>> Yes. It would be good to perform those sanity checks anyway. >> >> I don't think it's good; I think it's absolutely necessary. Otherwise >> someone can generate arbitrary garbage, hash it, and feed it to us. >> No? > > No, the hash is stored in shared memory. The hash of the garbage has to > match. Oh. Well that's really silly. At that point you might as well just store the snapshot and an integer identifier in shared memory, right? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Void binary patch
On Tue, 22 Feb 2011 08:12:23 -0600, Merlin Moncure wrote: On Tue, Feb 22, 2011 at 6:01 AM, Robert Haas wrote: On Sun, Feb 20, 2011 at 5:20 AM, Radosław Smogura wrote: Just patch for missing procedures for void send/recv What problem does this fix? void returning functions may not be called when binary protocol is requested currently. this is annoying: some drivers that wrap libpq or the protocol directly use the binary mode exclusively and this causes headaches for them. put another way, 'void' is the only POD type missing send/recv. merlin Just curious what POD means? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CopyReadLineText optimization revisited
Robert Haas wrote: > On Fri, Feb 18, 2011 at 9:35 PM, Bruce Momjian wrote: > > Was this implemented? Is it a TODO? > > It's not entirely clear to me that there's a meaningful win here. > Speeding up COPY is already on the TODO list, with this link: > > http://archives.postgresql.org/pgsql-hackers/2008-02/msg00954.php > > ...and it may be that some of the ideas proposed there are more > worthwhile than this one. > > But maybe that same TODO item could also get a link to the start of > this thread. OK, I added a link to this thread to the existing COPY TODO performance item. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Void binary patch
On Tue, Feb 22, 2011 at 6:01 AM, Robert Haas wrote: > On Sun, Feb 20, 2011 at 5:20 AM, Radosław Smogura > wrote: >> Just patch for missing procedures for void send/recv > > What problem does this fix? void returning functions may not be called when binary protocol is requested currently. this is annoying: some drivers that wrap libpq or the protocol directly use the binary mode exclusively and this causes headaches for them. put another way, 'void' is the only POD type missing send/recv. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Snapshot synchronization, again...
On 22.02.2011 15:52, Robert Haas wrote: On Tue, Feb 22, 2011 at 8:01 AM, Heikki Linnakangas wrote: Yes. It would be good to perform those sanity checks anyway. I don't think it's good; I think it's absolutely necessary. Otherwise someone can generate arbitrary garbage, hash it, and feed it to us. No? No, the hash is stored in shared memory. The hash of the garbage has to match. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Snapshot synchronization, again...
On Tue, Feb 22, 2011 at 8:01 AM, Heikki Linnakangas wrote: > This is hashing, not encryption, there is no key. The point is that even if > the attacker has the hash value and knows the algorithm, he can not > construct *another* snapshot that has the same hash. What good does that do us? > Yes. It would be good to perform those sanity checks anyway. I don't think it's good; I think it's absolutely necessary. Otherwise someone can generate arbitrary garbage, hash it, and feed it to us. No? > But even if we don't allow it, there's no harm in sending the whole snapshot > to the client, anyway. Ie. instead of "1" as the identifier, use the > snapshot itself. That leaves the door open for allowing it in the future, > should we choose to do so. The door is open either way, AFAICS: we could eventually allow: BEGIN TRANSACTION (SNAPSHOT '1'); and also BEGIN TRANSACTION (SNAPSHOT '{xmin 123 xmax 456 xids 128 149 201}'); -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Void binary patch
On Tue, 22 Feb 2011 07:01:02 -0500, Robert Haas wrote: On Sun, Feb 20, 2011 at 5:20 AM, Radosław Smogura wrote: Just patch for missing procedures for void send/recv What problem does this fix? Can not execute stored procedures in JDBC with out arguments, and I think functions returning void as well, plus some other "minors". Of course, with binary mode. Regards, Radek -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] UNLOGGED tables in psql \d
On Tue, Feb 22, 2011 at 8:03 AM, Cédric Villemain wrote: > The fact that you didn't see it might be enough to reconsider the way > we display the unlogged state (and temp state) of a relation. > > Maybe something like a "Durability: normal, temp, unlogged" line at the bottom of > the \d output ? The term we use internally is "persistence", but I'm not really sure it's worth making the \d output longer. In fact I'd much rather find some way of going the other direction. Note also that the temp-ness of a table can already be inferred from the schema. Another thought is that we might eventually have global temporary tables, where the schema is globally visible (like a permanent or unlogged table) but each backend sees only its own contents. So whatever we come up with here should generalize to that case as well. Still another point is that "temp" can apply to any object type, but "unlogged" can only apply to tables, indexes, and sequences. (We don't currently implement it for sequences, though.) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CopyReadLineText optimization revisited
On Fri, Feb 18, 2011 at 9:35 PM, Bruce Momjian wrote: > Was this implemented? Is it a TODO? It's not entirely clear to me that there's a meaningful win here. Speeding up COPY is already on the TODO list, with this link: http://archives.postgresql.org/pgsql-hackers/2008-02/msg00954.php ...and it may be that some of the ideas proposed there are more worthwhile than this one. But maybe that same TODO item could also get a link to the start of this thread. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Sync Rep v17
Daniel Farina wrote:
> As it will be somewhat hard to prove the durability guarantees of
> commit without special heroics, unless someone can suggest a
> mechanism.

Could you introduce a hack creating deterministic server side crashes in order to test this out? The simplest thing that comes to mind is a rule like "kick shared memory in the teeth to force a crash after every 100 commits", then see if #100 shows up as expected. Pick two different small numbers for the interval and you could probably put that on both sides to simulate all sorts of badness.

-- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] UNLOGGED tables in psql \d
2011/2/22 Itagaki Takahiro : > On Tue, Feb 22, 2011 at 18:11, Cédric Villemain > wrote: >> 2011/2/22 Itagaki Takahiro : >>> psql \d(+) doesn't show any information about UNLOGGED and TEMP attributes >>> for the table. So, we cannot know the table is unlogged or not unless >>> we directly select from pg_class.relpersistence. Is this a TODO item? >> >> I believe it is in the "title" of the table presented by \d (Table, >> Unlogged table, Temp table) > > Ah, I see. Thank you. They are shown as "Unlogged Table" and > "Unlogged Index". > > - We don't have "Temp" for temp tables. Should we have? Hum. First, I would say yes, to be more consistent with the unlogged case; but second, we must refrain from outputting too much information where it is not relevant. The fact that you didn't see it might be enough to reconsider the way we display the unlogged state (and temp state) of a relation. Maybe something like a "Durability: normal, temp, unlogged" line at the bottom of the \d output ? > - The head characters of the second words would be small letters. > We describe other objects as "Composite type", "Foreign table", or so. > > -- > Itagaki Takahiro > -- Cédric Villemain 2ndQuadrant http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Snapshot synchronization, again...
On 22.02.2011 14:24, Robert Haas wrote:
> On Tue, Feb 22, 2011 at 1:59 AM, Heikki Linnakangas wrote:
>> If you don't use a cryptographically secure hash, it's easy to
>> construct a snapshot with the same hash as an existing snapshot, with
>> more or less arbitrary contents.
>
> And if you did use a cryptographically secure hash, it would still be
> easy, unless there is some plausible way of keeping the key a secret,
> which I doubt.

This is hashing, not encryption, there is no key. The point is that even if the attacker has the hash value and knows the algorithm, he can not construct *another* snapshot that has the same hash. A cryptographically strong hash function has that property, second preimage resistance, whereas for a regular CRC or similar it's easy to create an arbitrary input that has any given checksum.

> The original reason why we went with this design (send the snapshot to
> the client and then have the client send it back to the server) was
> because some people thought it could be useful to take a snapshot from
> the master and recreate it on a standby, or perhaps the other way
> around. If that's the goal, checking whether the snapshot being
> imported is one that's already in use is missing the point. We have to
> rely on our ability to detect, when importing the snapshot, any
> anomalies that could lead to system instability; and allow people to
> import an arbitrary snapshot that meets those constraints, even one
> that the user just created out of whole cloth.

Yes. It would be good to perform those sanity checks anyway.

> Now, as I said before, I think that's a bad idea. I think we should
> NOT be designing the system to allow publication of arbitrary
> snapshots; instead, the backend that wishes to export its snapshot
> should so request, getting back a token (like "1") that other backends
> can then pass to START TRANSACTION (SNAPSHOT '1'), or whatever syntax
> we end up using.

Well, I'm not sure if we should allow importing arbitrary (as long as they're sane) snapshots or not; the arguments for and against that are both compelling. But even if we don't allow it, there's no harm in sending the whole snapshot to the client, anyway. Ie. instead of "1" as the identifier, use the snapshot itself. That leaves the door open for allowing it in the future, should we choose to do so.

> In that design, there's no need to validate snapshots because they
> never leave the server's memory space, which IMHO is as it should be.

I agree that if we don't allow importing arbitrary snapshots, we should use some out-of-band communication to get the snapshot to the backend. IOW not rely on a hash.

> Another reason I don't like this approach is because it's possible
> that the representation of snapshots might change in the future. Tom
> mused a while back about using an LSN as a snapshot (a snapshot sees
> all transactions committed prior to that LSN) and there are other
> plausible changes, too. If we expose the internal details of the
> snapshot mechanism to users, then (1) it becomes a user-visible
> feature with backward compatibility implications if we choose to break
> it down the road

Clients should treat the snapshot as an opaque block.

> and (2) all future snapshot representations must be able to be fully
> sanity checked on import. Right now it's fairly easy to verify that
> the snapshot's xmin isn't too old and that the remaining XIDs are in a
> reasonable range and that's probably all we really need to be certain
> of, but it's not clear that some future representation would be
> equally easy to validate.

Perhaps, although I can't imagine a representation that wouldn't be equally easy to validate.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
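[An editorial aside to make Heikki's CRC point concrete: CRC-32 is linear over GF(2), so for equal-length inputs crc(a XOR b XOR c) equals crc(a) XOR crc(b) XOR crc(c) -- an attacker can splice known messages to hit any target checksum without any search. The sketch below is illustrative only (crc32_simple and xor3 are made-up helper names, not PostgreSQL code); it implements the standard reflected CRC-32 used by zlib.]

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/* Bitwise CRC-32 (IEEE polynomial, reflected form, as used by zlib). */
static uint32_t
crc32_simple(const unsigned char *buf, size_t len)
{
    uint32_t    crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++)
    {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* XOR three equal-length messages together into "out". */
static void
xor3(const unsigned char *a, const unsigned char *b,
     const unsigned char *c, unsigned char *out, size_t len)
{
    for (size_t i = 0; i < len; i++)
        out[i] = a[i] ^ b[i] ^ c[i];
}
```

[Because the init and final-xor constants cancel for an odd number of terms, the spliced message's CRC is fully predictable from the three inputs' CRCs -- exactly the property a second-preimage-resistant hash must not have.]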
Re: [HACKERS] SR standby hangs
On Sun, Feb 20, 2011 at 12:39 PM, Greg Stark wrote: > On Fri, Feb 18, 2011 at 6:59 PM, Andrew Dunstan wrote: >> The server is running as a warm standby, and the client's application tries >> to connect to both the master and the slave, accepting whichever lets it >> connect (hence hot standby is not turned on). >>... >> #2 0x005de645 in LockBufferForCleanup () at bufmgr.c:2432 >> #3 0x00463733 in heap_xlog_clean (lsn=, > > Hm, if hot standby isn't turned on why are we bothering with locks at > all? But that said is there *anything* else happening in the database > aside from recovery? Are there any errors in the database log? > > But this still shouldn't block. It's either blocked locking the buffer > or blocked waiting for the buffer to become unpinned. It would be nice > to get a backtrace from a debugging build which wouldn't have some of > the functions inlined. It would be really nice to see the pin count on > the buffer in question -- perhaps it has gotten out of sync or > underflowed? I realize now that it would also be useful to see the state of the LWLock on the buffer. A little OT, but ISTM that the buffer pin mechanism by its nature is prone to lock upgrade hazards. A cleanup lock is essentially an access exclusive lock on the buffer, while a buffer pin is an access share lock. In the middle, we have the usual share and exclusive (content) locks. We regularly try to upgrade buffer pins to any of the higher locking levels, which is quite unlike what we do with regular (heavyweight) locks, where we take some pains to avoid lock upgrades. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Snapshot synchronization, again...
On Tue, Feb 22, 2011 at 1:59 AM, Heikki Linnakangas wrote: > Really? The idea of the hash is to prevent you from importing arbitrary > snapshots into the system, allowing only copying snapshots that are in use > by another backend. The purpose of the hash is to make sure the snapshot the > client passes in is identical to one used by another backend. If that's the purpose of the hash, it is utterly useless. The computation performed by the backend can easily be replicated by an adversary. > If you don't use a cryptographically secure hash, it's easy to construct a > snapshot with the same hash as an existing snapshot, with more or less > arbitrary contents. And if you did use a cryptographically secure hash, it would still be easy, unless there is some plausible way of keeping the key a secret, which I doubt. The original reason why we went with this design (send the snapshot to the client and then have the client send it back to the server) was because some people thought it could be useful to take a snapshot from the master and recreate it on a standby, or perhaps the other way around. If that's the goal, checking whether the snapshot being imported is one that's already in use is missing the point. We have to rely on our ability to detect, when importing the snapshot, any anomalies that could lead to system instability; and allow people to import an arbitrary snapshot that meets those constraints, even one that the user just created out of whole cloth. Now, as I said before, I think that's a bad idea. I think we should NOT be designing the system to allow publication of arbitrary snapshots; instead, the backend that wishes to export its snapshot should so request, getting back a token (like "1") that other backends can then pass to START TRANSACTION (SNAPSHOT '1'), or whatever syntax we end up using. In that design, there's no need to validate snapshots because they never leave the server's memory space, which IMHO is as it should be. 
If someone can develop a convincing argument that transmitting snapshots between the master and standby is a good idea, we can still accommodate that: the master-standby connection is now full-duplex. In fact, we may want to do that anyway for other reasons (see Kevin Grittner's musings on this point vs. true serializability). Another reason I don't like this approach is because it's possible that the representation of snapshots might change in the future. Tom mused a while back about using an LSN as a snapshot (a snapshot sees all transactions committed prior to that LSN) and there are other plausible changes, too. If we expose the internal details of the snapshot mechanism to users, then (1) it becomes a user-visible feature with backward compatibility implications if we choose to break it down the road and (2) all future snapshot representations must be able to be fully sanity checked on import. Right now it's fairly easy to verify that the snapshot's xmin isn't too old and that the remaining XIDs are in a reasonable range and that's probably all we really need to be certain of, but it's not clear that some future representation would be equally easy to validate. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
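[An editorial aside: the sanity checks described above -- xmin not too old, running XIDs in a reasonable range -- can be sketched roughly as below. The struct and function names are hypothetical, not PostgreSQL's real SnapshotData, and plain integer comparisons stand in for the wraparound-aware TransactionIdPrecedes() that real code would need. The example snapshot values come from the '{xmin 123 xmax 456 xids 128 149 201}' form mentioned earlier in the thread.]

```c
#include <stdint.h>
#include <stdbool.h>
#include <assert.h>

typedef uint32_t TransactionId;

/* Hypothetical flattened snapshot -- NOT PostgreSQL's real SnapshotData. */
typedef struct ImportedSnapshot
{
    TransactionId xmin;         /* lowest XID still considered running */
    TransactionId xmax;         /* first XID not yet assigned */
    int           nxip;         /* number of in-progress XIDs */
    TransactionId xip[64];      /* in-progress XIDs */
} ImportedSnapshot;

/*
 * Reject a snapshot whose xmin predates the oldest xmin still protected
 * from vacuum, or whose running-XID list falls outside [xmin, xmax).
 * Illustrative only: real code must handle XID wraparound.
 */
static bool
snapshot_is_sane(const ImportedSnapshot *snap, TransactionId oldest_xmin)
{
    if (snap->xmin < oldest_xmin || snap->xmin > snap->xmax)
        return false;

    for (int i = 0; i < snap->nxip; i++)
    {
        if (snap->xip[i] < snap->xmin || snap->xip[i] >= snap->xmax)
            return false;
    }
    return true;
}
```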
Re: [HACKERS] Void binary patch
On Sun, Feb 20, 2011 at 5:20 AM, Radosław Smogura wrote: > Just patch for missing procedures for void send/recv What problem does this fix? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] UNLOGGED tables in psql \d
On Tue, Feb 22, 2011 at 18:11, Cédric Villemain wrote: > 2011/2/22 Itagaki Takahiro : >> psql \d(+) doesn't show any information about UNLOGGED and TEMP attributes >> for the table. So, we cannot know the table is unlogged or not unless >> we directly select from pg_class.relpersistence. Is this a TODO item? > > I believe it is in the "title" of the table presented by \d (Table, > Unlogged table, Temp table) Ah, I see. Thank you. They are shown as "Unlogged Table" and "Unlogged Index". - We don't have "Temp" for temp tables. Should we have? - The head characters of the second words would be small letters. We describe other objects as "Composite type", "Foreign table", or so. -- Itagaki Takahiro -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Sync Rep v17
On Tuesday 22 February 2011 09:59:21 Marti Raudsepp wrote: > On Tue, Feb 22, 2011 at 07:38, Fujii Masao wrote: > > + SpinLockAcquire(&WalSndCtl->ctlmutex); > > + result = WalSndCtl->sync_rep_service_available; > > + SpinLockRelease(&WalSndCtl->ctlmutex); > > > > volatile pointer needs to be used to prevent code rearrangement. > > I don't think that's necessary. Spinlock functions already prevent > reordering using __asm__ __volatile__ > > Otherwise, surely they would be utterly broken? It's not the spinlock that's the problem, but that "result" may already be loaded into a register. That's not prohibited by __asm__ __volatile__. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
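[An editorial aside: the convention Fujii is asking for is to dereference shared structures only through a volatile-qualified pointer inside the spinlock critical section, so the compiler cannot satisfy the read from a register cached before SpinLockAcquire(). A minimal sketch with placeholder lock type and no-op macros standing in for the real ones from s_lock.h; the function name is made up for illustration.]

```c
#include <stdbool.h>

/* Placeholder lock type and no-op macros (the real ones live in s_lock.h). */
typedef int slock_t;
#define SpinLockAcquire(lock)   ((void) (lock))
#define SpinLockRelease(lock)   ((void) (lock))

typedef struct WalSndCtlData
{
    slock_t ctlmutex;
    bool    sync_rep_service_available;
} WalSndCtlData;

static bool
read_sync_rep_available(WalSndCtlData *ctl)
{
    /*
     * Access shared state through a volatile-qualified pointer so the
     * compiler must load the field between the lock calls instead of
     * reusing a value it already holds in a register.
     */
    volatile WalSndCtlData *walsndctl = ctl;
    bool        result;

    SpinLockAcquire(&walsndctl->ctlmutex);
    result = walsndctl->sync_rep_service_available;
    SpinLockRelease(&walsndctl->ctlmutex);
    return result;
}
```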
Re: [HACKERS] UNLOGGED tables in psql \d
2011/2/22 Itagaki Takahiro : > psql \d(+) doesn't show any information about UNLOGGED and TEMP attributes > for the table. So, we cannot know the table is unlogged or not unless > we directly select from pg_class.relpersistence. Is this a TODO item? > > The same issue is in TEMP tables, but we can determine them by their > schema; they are always created in pg_temp_N schema. I believe it is in the "title" of the table presented by \d (Table, Unlogged table, Temp table) -- Cédric Villemain 2ndQuadrant http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Sync Rep v17
On Tue, Feb 22, 2011 at 07:38, Fujii Masao wrote: > + SpinLockAcquire(&WalSndCtl->ctlmutex); > + result = WalSndCtl->sync_rep_service_available; > + SpinLockRelease(&WalSndCtl->ctlmutex); > volatile pointer needs to be used to prevent code rearrangement. I don't think that's necessary. Spinlock functions already prevent reordering using __asm__ __volatile__ Otherwise, surely they would be utterly broken? Regards, Marti -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers