[HACKERS] proposal: lob conversion functionality
Hello,

I had to enhance my older project, where XML documents are parsed and created on the server side in PL/pgSQL and PL/Perl procedures. We would like to use the LO API for client/server communication, but we have to parse/serialize large objects on the server side. I found that there is no simple API for working with large objects from PL languages without access to the file system. I had to use ugly hacks:

CREATE OR REPLACE FUNCTION parser.save_as_lob(bytea)
RETURNS oid AS $$
DECLARE
  _loid oid;
  _substr bytea;
BEGIN
  _loid := lo_creat(-1);
  FOR i IN 0..length($1)/2048
  LOOP
    _substr := substring($1 FROM i * 2048 + 1 FOR 2048);
    IF _substr <> '' THEN
      INSERT INTO pg_largeobject(loid, pageno, data)
        VALUES(_loid, i, _substr);
    END IF;
  END LOOP;

  EXECUTE format('GRANT SELECT ON LARGE OBJECT %s TO ohs', _loid);
  RETURN _loid;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER STRICT SET search_path = 'pg_catalog';

and

CREATE OR REPLACE FUNCTION fbuilder.attachment_to_xml(attachment oid)
RETURNS xml AS $$
DECLARE
  b_cum bytea = '';
  b bytea;
BEGIN
  FOR b IN SELECT l.data
             FROM pg_largeobject l
            WHERE l.loid = attachment_to_xml.attachment
            ORDER BY l.pageno
  LOOP
    b_cum := b_cum || b;
  END LOOP;
  IF NOT FOUND THEN
    RETURN NULL;
  ELSE
    RETURN xmlelement(NAME "attachment",
                      encode(b_cum, 'base64'));
  END IF;
END;
$$ LANGUAGE plpgsql STRICT SECURITY DEFINER SET search_path = 'pg_catalog';

These functions could be simplified if we supported something like encode and decode for large objects. So my proposal is to create the functions:

* lo_encode(loid oid) .. returns bytea
* lo_encode(loid oid, encoding text) .. returns text
* lo_make(loid oid, data bytea)
* lo_make(loid oid, data text, encoding text)

This would simplify all transformations between large objects and varlena values. A known limit is 1GB for varlena, but that is still relatively high.

Notes, comments?

Regards

Pavel

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
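For contrast with the server-side hacks above, the client-side LO API the message refers to can be driven from libpq. A minimal sketch (the connection string and payload are placeholders, and this requires a reachable server, so it is illustrative only):

```c
/* Sketch: import a memory buffer into a new large object using the
 * client-side libpq large-object API.  "dbname=test" is a placeholder
 * connection string. */
#include <stdio.h>
#include <string.h>
#include "libpq-fe.h"
#include "libpq/libpq-fs.h"     /* INV_READ / INV_WRITE */

int
main(void)
{
    PGconn *conn = PQconnectdb("dbname=test");

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    /* Large-object descriptors are only valid inside a transaction. */
    PQclear(PQexec(conn, "BEGIN"));

    const char buf[] = "<doc>payload</doc>";
    Oid loid = lo_creat(conn, INV_READ | INV_WRITE);
    int fd = lo_open(conn, loid, INV_WRITE);

    lo_write(conn, fd, buf, sizeof(buf) - 1);
    lo_close(conn, fd);

    PQclear(PQexec(conn, "COMMIT"));
    printf("created large object %u\n", loid);

    PQfinish(conn);
    return 0;
}
```

The pain point in the proposal is precisely that this chunked read/write interface exists for clients but has no bytea/text-level equivalent callable from PL functions.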
Re: [HACKERS] Proposal: leave a hint when switching logging away from stderr
On Fri, Aug 09, 2013 at 06:59:13PM -0400, Tom Lane wrote:
> Noah Misch writes:
> > On Thu, Aug 08, 2013 at 10:32:17PM -0400, Tom Lane wrote:
> >> This patch arranges to emit a hint message when/if we switch away from
> >> logging to the original postmaster stderr during startup. There are two
> >> cases to cover: we're still using LOG_DESTINATION_STDERR but redirecting
> >> stderr to a syslogger process, or we stop writing to stderr altogether,
> >> presumably in favor of going to syslog or something.
>
> > At LOG level, this feels a bit chatty: it's a 100% increase in startup-time
> > messages if you count both the main message and the HINT. I can't think of
> > another program with configuration-directed logging that does this on every
> > startup. Makes perfect sense to me at DEBUG1, though, and that would have
> > been enough for the situation you cite above.
>
> Hm. That would be enough for users who think to increase log_min_messages
> while trying to resolve their problem. But what I'm mainly worried about
> here is people who are relative novices, so I don't have a lot of
> confidence that the patch would still help them if we made the message not
> appear at default logging verbosity.
>
> Also, I'm not sure that the chattiness argument is relevant, because no
> message will be emitted at all unless you're switching to some log target
> different from the postmaster's initial stderr. So the message won't show
> up in the "official" log target files, only in an arguably vestigial
> startup-time-messages-only file.

Perhaps the chatter would most affect use, typically casual, of pg_ctl
without "-l" or similar.

> Does that ameliorate your concern, or do you still want it to be DEBUG1?
> I'd be happier with DEBUG1 than with no message at all, but I don't think
> it's going to help as many people at DEBUG1 as it would at LOG level.

I think of the "implicit sequence" messages we moved from NOTICE to DEBUG1
somewhat recently. No doubt those messages had helped at times, but they
didn't quite carry their weight at NOTICE. My gut prediction is that this
will fall in that same utility range. But you make a valid point about
noise in the startup log being easier to discount.

--
Noah Misch
EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] killing pg_dump leaves backend process
Tatsuo Ishii writes:
> I noticed pg_dump does not exit gracefully when killed.
> start pg_dump
> kill pg_dump by ctrl-c
> ps x
> 27246 ?      Ds    96:02 postgres: t-ishii dbt3 [local] COPY
> 29920 ?      S      0:00 sshd: ishii@pts/5
> 29921 pts/5  Ss     0:00 -bash
> 30172 ?      Ss     0:00 postgres: t-ishii dbt3 [local] LOCK TABLE waiting
> As you can see, after killing pg_dump, a backend process is (LOCK
> TABLE waiting) left behind. I think this could be easily fixed by
> adding signal handler to pg_dump so that it catches the signal and
> issues a query cancel request.

If we think that's a problem (which I'm not convinced of) then pg_dump is
the wrong place to fix it. Any other client would behave the same if it
were killed while waiting for some backend query. So the right fix would
involve figuring out a way for the backend to kill itself if the client
connection goes away while it's waiting.

regards, tom lane
Re: [HACKERS] mvcc catalo gsnapshots and TopTransactionContext
On 2013-08-09 14:11:46 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2013-08-08 09:27:24 -0400, Robert Haas wrote:
> >> How can it be safe to try to read catalogs if the transaction is aborted?
>
> > Well. It isn't. At least not in general. The specific case triggered
> > here though are cache invalidations being processed which can lead to
> > the catalog being read (pretty crummy, but not easy to get rid
> > of). That's actually safe since before we process the invalidations we
> > have done:
> > 1) CurrentTransactionState->state = TRANS_ABORT
> > 2) RecordTransactionAbort(), marking the transaction as aborted in the
> >    clog
> > 3) marked subxacts as aborted
> > 4) ProcArrayEndTransaction() (for toplevel ones)
>
> > Due to these any tqual stuff will treat the current (sub-)xact and its
> > children as aborted. So the catalog lookups will use the catalog in a
> > sensible state.
>
> I don't have any faith in this argument. You might be right that we'll
> correctly see our own output rows as aborted, but that's barely the tip
> of the iceberg of risk here. Is it safe to take new locks in an aborted
> transaction? (What if we're already past the lock-release point in
> the abort sequence?)

Don't get me wrong. I find the idea of doing catalog lookups during abort
horrid. But it's been that way for at least 10 years (I checked 7.4), so
it has at least some resemblance of working. Today we do a good bit less
than back then; for one, we don't do a full cache reload during abort
anymore, just for the index support infrastructure. Also, you've reduced
the amount of lookups a bit with the relmapper introduction.

> For that matter, given that we don't know what
> exactly caused the transaction abort, how safe is it to do anything at
> all --- we might for instance be nearly out of memory. If the catalog
> reading attempt itself fails, won't we be in an infinite loop of
> transaction aborts?

Looks like that's possible, yes. There seem to be quite some other
opportunities for this to happen if you look at the amount of work done
in AbortSubTransaction(). I guess it rarely happens because we previously
release some memory...

> I could probably think of ten more risks if I spent a few more minutes
> at it.

No need to convince me here. I could neither believe we were doing this,
nor figure out why it even "works", for the first hour of looking at it.

> Cache invalidation during abort should *not* lead to any attempt to
> immediately revalidate the cache. No amount of excuses will make that
> okay. I have not looked to see just what the path of control is in this
> particular case, but we need to fix it, not paper over it.

I agree, although that's easier said than done in the case of
subtransactions. The problem we have there is that it's perfectly valid
to still have references to a relation from the outer, not aborted,
transaction. Those need to be valid for anybody looking at the relcache
entry after we've processed the ROLLBACK TO/...

I guess the fix is something like what we do in the commit case, where we
transfer invalidations to the parent transaction. If we then process
local invalidations *after* we've cleaned up the subtransaction
completely we should be fine. We already do an implicit
CommandCounterIncrement() in CommitSubTransaction()...

Greetings,

Andres Freund

--
Andres Freund  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] killing pg_dump leaves backend process
I noticed pg_dump does not exit gracefully when killed.

start pg_dump
kill pg_dump by ctrl-c
ps x

27246 ?      Ds    96:02 postgres: t-ishii dbt3 [local] COPY
29920 ?      S      0:00 sshd: ishii@pts/5
29921 pts/5  Ss     0:00 -bash
30172 ?      Ss     0:00 postgres: t-ishii dbt3 [local] LOCK TABLE waiting

As you can see, after killing pg_dump, a backend process is left behind
(LOCK TABLE waiting). I think this could be easily fixed by adding a
signal handler to pg_dump so that it catches the signal and issues a
query cancel request.

Thoughts?

--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
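The kind of fix proposed above — a client-side signal handler that issues a cancel request — can be sketched with libpq's cancellation API. The handler and connection string here are illustrative, not pg_dump's actual code, and running it requires a live server:

```c
/* Sketch: catch SIGINT in a libpq client and send a query cancel, so
 * the backend does not keep waiting on a lock after ctrl-c.  Names and
 * the "dbname=test" conninfo are placeholders. */
#include <signal.h>
#include <stdio.h>
#include "libpq-fe.h"

static PGcancel *volatile cancel_obj = NULL;

static void
sigint_handler(int signo)
{
    char errbuf[256];

    /* PQcancel is documented as safe to call from a signal handler. */
    if (cancel_obj)
        (void) PQcancel(cancel_obj, errbuf, sizeof(errbuf));
}

int
main(void)
{
    PGconn   *conn = PQconnectdb("dbname=test");
    PGresult *res;

    if (PQstatus(conn) != CONNECTION_OK)
        return 1;

    cancel_obj = PQgetCancel(conn);
    signal(SIGINT, sigint_handler);

    /* A possibly long-waiting command; ctrl-c now cancels it
     * server-side instead of leaving the backend stuck. */
    res = PQexec(conn, "LOCK TABLE some_table");
    PQclear(res);

    PQfreeCancel(cancel_obj);
    PQfinish(conn);
    return 0;
}
```

As Tom notes downthread, this only helps for clients that install such a handler; a killed client that never gets to run the handler still leaves the backend waiting.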
Re: [HACKERS] confusing error message
On 08/10/2013 02:43 AM, Stephen Frost wrote:
> * Craig Ringer (cr...@2ndquadrant.com) wrote:
>> More seriously, with interpolated strings for relation names etc
>> it can be hard to know which chunks to search for, and search
>> engines aren't always good at having the whole message thrown at
>> them.
>
> It's not perfect, but if the searches are getting to mailing list
> archive messages then it's clearly possible for us to do better.
> Having an embedded code or URL or what-have-you might be an option
> too, but we need the documentation first in any case, so we could
> do these other things later..

Well said; you're quite right. I complain myself that a focus on
perfection can prevent progress. Let's start with "helpful".

My first two would be:

"No pg_hba.conf entry for ..."

and

"fe_sendauth: no password supplied"

as both seem to confuse new users endlessly.

--
Craig Ringer  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Proposal: leave a hint when switching logging away from stderr
On 08/09/2013 03:40 PM, Tom Lane wrote:
> In principle I see the risk, but I don't think I've ever seen an actual
> report of someone getting confused this way by an on-the-fly logging
> parameter change. Whereas there are numerous examples in the archives
> of people not realizing that "pg_ctl -l foo" doesn't necessarily mean
> that all the useful log output is in file foo. (Usually it's because
> they're using a logging setup chosen by some packager, not by themselves.)
> So I'm feeling that what you're suggesting is solving a different and
> far less pressing problem than what I'm on about.

No question. That's why I suggested it as a TODO item instead of anyone
working on it right now.

The main benefit I see for this is security, especially with ALTER SYSTEM
SET pending. Switching log destinations is a good way to cover your
tracks if you have some kind of temporary superuser access (for example,
by exploiting a SECURITY DEFINER function). If the switch were recorded
somewhere other than the new log location, it would provide a little more
sleuthing information for later auditors.

> I did think a little bit about how to do it. For parameters that affect
> where the logging collector writes data (probably only log_directory is
> worth special handling), it would be quite easy to make the got_SIGHUP
> code segment in syslogger.c emit a log message just before switching the
> active value. However, if you want something similar for log_destination,
> syslog_facility, syslog_ident, or event_source, it's far more problematic
> because those settings affect the behavior of individual processes, and
> so there's no unique point where we're switching from one log target to
> another. We could have the postmaster report a value change but it's
> likely that that message would not appear very close to the end of the
> messages directed to the old target.

But wait, there are more complications: what if you're switching
log_directory because the disk on the old log location is full? Then we
*can't* emit a log entry on switch, because we can't write it. Like I
said, complicated out of proportion to the benefit, at least right now.

> Another point here is that if you're tailing the current log file,
> a plain old rotation (no parameter change anywhere) would also cut
> you off without obvious notice.

Yeah, I'm not concerned about that, since the location of the new log is
predictable.

> Maybe we shouldn't think about this
> as a "change of parameter" problem, but just say it's worth emitting
> an "end of log file" message anytime we're about to change to a new log
> file, no matter the reason. But again, that's specific to the logging
> collector case and doesn't help with any other log target. I also note
> that I'm not familiar with any system logging tool that emits such
> messages.

I'm not familiar with other system logging tools.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [HACKERS] Proposal: leave a hint when switching logging away from stderr
Noah Misch writes:
> On Thu, Aug 08, 2013 at 10:32:17PM -0400, Tom Lane wrote:
>> This patch arranges to emit a hint message when/if we switch away from
>> logging to the original postmaster stderr during startup. There are two
>> cases to cover: we're still using LOG_DESTINATION_STDERR but redirecting
>> stderr to a syslogger process, or we stop writing to stderr altogether,
>> presumably in favor of going to syslog or something.

> At LOG level, this feels a bit chatty: it's a 100% increase in startup-time
> messages if you count both the main message and the HINT. I can't think of
> another program with configuration-directed logging that does this on every
> startup. Makes perfect sense to me at DEBUG1, though, and that would have
> been enough for the situation you cite above.

Hm. That would be enough for users who think to increase log_min_messages
while trying to resolve their problem. But what I'm mainly worried about
here is people who are relative novices, so I don't have a lot of
confidence that the patch would still help them if we made the message not
appear at default logging verbosity.

Also, I'm not sure that the chattiness argument is relevant, because no
message will be emitted at all unless you're switching to some log target
different from the postmaster's initial stderr. So the message won't show
up in the "official" log target files, only in an arguably vestigial
startup-time-messages-only file.

Does that ameliorate your concern, or do you still want it to be DEBUG1?
I'd be happier with DEBUG1 than with no message at all, but I don't think
it's going to help as many people at DEBUG1 as it would at LOG level.

regards, tom lane
Re: [HACKERS] [PATCH] Statistics collection for CLUSTER command
On 08/09/2013 10:37 PM, Stefan Kaltenbrunner wrote:
>>> On 08/08/2013 01:52 PM, Vik Fearing wrote:
>>>> I would add this to the next commitfest but I seem to be unable to
>>>> log in with my community account (I can log in to the wiki). Help
>>>> appreciated.
>
> hmm looks like your account may be affected by one of the buglets
> introduced (and fixed shortly afterwards) during the move of the main
> infrastructure to debian wheezy - please try logging in to the main
> website and change your password at least once. That should make it
> work again for the commitfest app...

That worked. Thank you.

Vik
Re: [HACKERS] Proposal: leave a hint when switching logging away from stderr
On Thu, Aug 08, 2013 at 10:32:17PM -0400, Tom Lane wrote:
> The attached patch is motivated by
> http://www.postgresql.org/message-id/cajyqwwryt9rmbzs-sh6ucr1otg4joxqkdf-fkoyp6pv12t0...@mail.gmail.com
> This patch arranges to emit a hint message when/if we switch away from
> logging to the original postmaster stderr during startup. There are two
> cases to cover: we're still using LOG_DESTINATION_STDERR but redirecting
> stderr to a syslogger process, or we stop writing to stderr altogether,
> presumably in favor of going to syslog or something.

At LOG level, this feels a bit chatty: it's a 100% increase in startup-time
messages if you count both the main message and the HINT. I can't think of
another program with configuration-directed logging that does this on
every startup. Makes perfect sense to me at DEBUG1, though, and that would
have been enough for the situation you cite above.

--
Noah Misch
EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] Proposal: leave a hint when switching logging away from stderr
Josh Berkus writes:
> Tom,
>> I thought about trying to leave similar breadcrumbs if the logging
>> parameters are changed while the postmaster is running, but it would add a
>> fair amount of complication to the patch, and I'm not sure there's a lot
>> of value in it. On-the-fly logging parameter changes don't happen without
>> active DBA involvement, so it's a lot harder to credit that somebody would
>> not be expecting the data to start going somewhere else.

> Well, I think doing that ALSO would be worthwhile for the TODO list.
> I've often wished, for example, that if we switch log_directory the
> *last* message in the old log file be "reloading postgresql with new
> configuration" or something similar, so that I would know to look for a
> new log file somewhere else. If you are, for example, logging only
> errors, you wouldn't necessarily realize that logging on the file you're
> tailing/monitoring has stopped.

In principle I see the risk, but I don't think I've ever seen an actual
report of someone getting confused this way by an on-the-fly logging
parameter change. Whereas there are numerous examples in the archives
of people not realizing that "pg_ctl -l foo" doesn't necessarily mean
that all the useful log output is in file foo. (Usually it's because
they're using a logging setup chosen by some packager, not by themselves.)
So I'm feeling that what you're suggesting is solving a different and
far less pressing problem than what I'm on about.

I did think a little bit about how to do it. For parameters that affect
where the logging collector writes data (probably only log_directory is
worth special handling), it would be quite easy to make the got_SIGHUP
code segment in syslogger.c emit a log message just before switching the
active value. However, if you want something similar for log_destination,
syslog_facility, syslog_ident, or event_source, it's far more problematic
because those settings affect the behavior of individual processes, and
so there's no unique point where we're switching from one log target to
another. We could have the postmaster report a value change but it's
likely that that message would not appear very close to the end of the
messages directed to the old target.

Another point here is that if you're tailing the current log file, a
plain old rotation (no parameter change anywhere) would also cut you off
without obvious notice. Maybe we shouldn't think about this as a "change
of parameter" problem, but just say it's worth emitting an "end of log
file" message anytime we're about to change to a new log file, no matter
the reason. But again, that's specific to the logging collector case and
doesn't help with any other log target. I also note that I'm not familiar
with any system logging tool that emits such messages.

regards, tom lane
Re: [HACKERS] mvcc catalo gsnapshots and TopTransactionContext
On Fri, Aug 09, 2013 at 02:11:46PM -0400, Tom Lane wrote:
> Cache invalidation during abort should *not* lead to any attempt to
> immediately revalidate the cache. No amount of excuses will make that
> okay. I have not looked to see just what the path of control is in this
> particular case, but we need to fix it, not paper over it.

+1. What if (sub)transaction end only manipulated the local invalidation
message queue for later processing? Actual processing would happen after
CleanupSubTransaction() returns control to the owning xact, or at the
start of the next transaction for a top-level ending.

--
Noah Misch
EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] [PATCH] Statistics collection for CLUSTER command
On 08/09/2013 12:02 AM, Vik Fearing wrote:
> On 08/08/2013 07:57 PM, Stefan Kaltenbrunner wrote:
>> On 08/08/2013 01:52 PM, Vik Fearing wrote:
>>> I would add this to the next commitfest but I seem to be unable to log
>>> in with my community account (I can log in to the wiki). Help
>>> appreciated.
>> would be a bit easier to diagnose if we knew your community account name
>
> Sorry, it's "glaucous".

hmm looks like your account may be affected by one of the buglets
introduced (and fixed shortly afterwards) during the move of the main
infrastructure to debian wheezy - please try logging in to the main
website and change your password at least once. That should make it work
again for the commitfest app...

Stefan
Re: [HACKERS] Proposal: leave a hint when switching logging away from stderr
Josh, Tom,

* Josh Berkus (j...@agliodbs.com) wrote:
> > Does this rise to the level of a usability bug that ought to be
> > back-patched? As I said, we've seen this type of thinko multiple
> > times before.
>
> Hmmm. On the one hand, I can't see the harm in it. On the other hand,
> I'm reluctant to introduce non-critical behavior changes into
> backbranches no matter how minor. What if we just put this in 9.3 and up?

I'd be fine w/ this going into 9.3. What was perhaps not entirely clear
from my last mail is that I was complaining about the autovacuum changes
in 9.2.3 (iirc?) which caused it to be more "chatty", which likely
surprised some poor DBAs.

Thanks,

Stephen
Re: [HACKERS] pg_dump and schema names
On Fri, Aug 9, 2013 at 02:15:31PM -0400, Bruce Momjian wrote:
> On Fri, Aug 9, 2013 at 01:39:35PM -0400, Tom Lane wrote:
> > Bruce Momjian writes:
> > > On Fri, Aug 9, 2013 at 12:53:20PM -0400, Tom Lane wrote:
> > >> This really requires more than no attention to the comments, especially
> > >> since you just removed the only apparent reason for _getObjectDescription
> > >> to make a distinction between objects whose name includes a schema and
> > >> those that don't.
> >
> > > I am confused. Are you saying I didn't read the comments, or that I can
> > > now merge the schema-qualified and non-schema-qualified object sections?
> >
> > Well, it's certainly not immediately obvious why we shouldn't merge them.
> > But I would have expected the function's header comment to now explain
> > that the output is intentionally not schema-qualified and assumes that the
> > search path is set for the object's schema if any.
>
> OK, done with the attached patch. The dump output is unchanged.

To be honest, I never got around to modifying the comments because I
expected someone to say the patch was wrong. I also didn't expect to find
dead code in there.

--
Bruce Momjian  http://momjian.us
EnterpriseDB   http://enterprisedb.com

+ It's impossible for everything to be true. +
Re: [HACKERS] confusing error message
* Craig Ringer (cr...@2ndquadrant.com) wrote:
> More seriously, with interpolated strings for relation names etc it can
> be hard to know which chunks to search for, and search engines aren't
> always good at having the whole message thrown at them.

It's not perfect, but if the searches are getting to mailing list archive
messages then it's clearly possible for us to do better. Having an
embedded code or URL or what-have-you might be an option too, but we need
the documentation first in any case, so we could do these other things
later..

Thanks,

Stephen
Re: [HACKERS] pg_dump and schema names
On Fri, Aug 9, 2013 at 01:39:35PM -0400, Tom Lane wrote:
> Bruce Momjian writes:
> > On Fri, Aug 9, 2013 at 12:53:20PM -0400, Tom Lane wrote:
> >> This really requires more than no attention to the comments, especially
> >> since you just removed the only apparent reason for _getObjectDescription
> >> to make a distinction between objects whose name includes a schema and
> >> those that don't.
>
> > I am confused. Are you saying I didn't read the comments, or that I can
> > now merge the schema-qualified and non-schema-qualified object sections?
>
> Well, it's certainly not immediately obvious why we shouldn't merge them.
> But I would have expected the function's header comment to now explain
> that the output is intentionally not schema-qualified and assumes that the
> search path is set for the object's schema if any.

OK, done with the attached patch. The dump output is unchanged.

> > Also, this seems like dead code as there is no test for "INDEX" in the
> > if() block it exists in:
>
> >	/*
> >	 * Pre-7.3 pg_dump would sometimes (not always) put a fmtId'd name
> >	 * into te->tag for an index. This check is heuristic, so make its
> >	 * scope as narrow as possible.
> >	 */
> >	if (AH->version < K_VERS_1_7 &&
> >		te->tag[0] == '"' &&
> >		te->tag[strlen(te->tag) - 1] == '"' &&
> >		strcmp(type, "INDEX") == 0)
> >		appendPQExpBuffer(buf, "%s", te->tag);
> >	else
>
> Huh, yeah it is dead code, since _printTocEntry doesn't call this function
> for "INDEX" objects. And anyway I doubt anybody still cares about reading
> 7.2-era archive files. No objection to removing that.

Removed.

--
Bruce Momjian  http://momjian.us
EnterpriseDB   http://enterprisedb.com

+ It's impossible for everything to be true. +

diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
new file mode 100644
index cd7669b..5204ceb
*** a/src/bin/pg_dump/pg_backup_archiver.c
--- b/src/bin/pg_dump/pg_backup_archiver.c
*** _selectTablespace(ArchiveHandle *AH, con
*** 2879,2889
  /*
   * Extract an object description for a TOC entry, and append it to buf.
   *
!  * This is not quite as general as it may seem, since it really only
!  * handles constructing the right thing to put into ALTER ... OWNER TO.
!  *
!  * The whole thing is pretty grotty, but we are kind of stuck since the
!  * information used is all that's available in older dump files.
   */
  static void
  _getObjectDescription(PQExpBuffer buf, TocEntry *te, ArchiveHandle *AH)
--- 2879,2885
  /*
   * Extract an object description for a TOC entry, and append it to buf.
   *
!  * This is used for ALTER ... OWNER TO.
   */
  static void
  _getObjectDescription(PQExpBuffer buf, TocEntry *te, ArchiveHandle *AH)
*** _getObjectDescription(PQExpBuffer buf, T
*** 2895,2901
  		strcmp(type, "MATERIALIZED VIEW") == 0)
  		type = "TABLE";

! 	/* objects named by a schema and name */
  	if (strcmp(type, "COLLATION") == 0 ||
  		strcmp(type, "CONVERSION") == 0 ||
  		strcmp(type, "DOMAIN") == 0 ||
--- 2891,2897
  		strcmp(type, "MATERIALIZED VIEW") == 0)
  		type = "TABLE";

! 	/* objects that don't require special decoration */
  	if (strcmp(type, "COLLATION") == 0 ||
  		strcmp(type, "CONVERSION") == 0 ||
  		strcmp(type, "DOMAIN") == 0 ||
*** _getObjectDescription(PQExpBuffer buf, T
*** 2903,2937
  		strcmp(type, "TYPE") == 0 ||
  		strcmp(type, "FOREIGN TABLE") == 0 ||
  		strcmp(type, "TEXT SEARCH DICTIONARY") == 0 ||
! 		strcmp(type, "TEXT SEARCH CONFIGURATION") == 0)
! 	{
! 		appendPQExpBuffer(buf, "%s ", type);
! 		if (te->namespace && te->namespace[0])	/* is null pre-7.3 */
! 			appendPQExpBuffer(buf, "%s.", fmtId(te->namespace));
!
! 		/*
! 		 * Pre-7.3 pg_dump would sometimes (not always) put a fmtId'd name
! 		 * into te->tag for an index. This check is heuristic, so make its
! 		 * scope as narrow as possible.
! 		 */
! 		if (AH->version < K_VERS_1_7 &&
! 			te->tag[0] == '"' &&
! 			te->tag[strlen(te->tag) - 1] == '"' &&
! 			strcmp(type, "INDEX") == 0)
! 			appendPQExpBuffer(buf, "%s", te->tag);
! 		else
! 			appendPQExpBuffer(buf, "%s", fmtId(te->tag));
! 		return;
! 	}
!
! 	/* objects named by just a name */
! 	if (strcmp(type, "DATABASE") == 0 ||
  		strcmp(type, "PROCEDURAL LANGUAGE") == 0 ||
  		strcmp(type, "SCHEMA") == 0 ||
  		strcmp(type, "FOREIGN DATA WRAPPER") == 0 ||
  		strcmp(type, "SERVER") == 0 ||
  		strcmp(type, "USER MAPPING") == 0)
  	{
  		appendPQExpBuffer(buf, "%s %s", type, fmtId(te->tag));
  		return;
  	}
--- 2899,2914
  		strcmp(type, "TYPE") == 0 ||
  		strcmp(type, "FOREIGN TABLE") == 0 ||
  		strcmp(type, "TEXT SEARCH DICTIONARY") == 0 ||
! 		strcmp(type, "TEXT SEARCH CONFIGURATION") == 0 ||
! 		/* non-schema-specified objects */
! 		strcmp(type, "DATABASE") == 0 ||
! 		strcmp(type, "PROCEDURAL
Re: [HACKERS] mvcc catalo gsnapshots and TopTransactionContext
Andres Freund writes:
> On 2013-08-08 09:27:24 -0400, Robert Haas wrote:
>> How can it be safe to try to read catalogs if the transaction is aborted?

> Well. It isn't. At least not in general. The specific case triggered
> here though are cache invalidations being processed which can lead to
> the catalog being read (pretty crummy, but not easy to get rid
> of). That's actually safe since before we process the invalidations we
> have done:
> 1) CurrentTransactionState->state = TRANS_ABORT
> 2) RecordTransactionAbort(), marking the transaction as aborted in the
>    clog
> 3) marked subxacts as aborted
> 4) ProcArrayEndTransaction() (for toplevel ones)

> Due to these any tqual stuff will treat the current (sub-)xact and its
> children as aborted. So the catalog lookups will use the catalog in a
> sensible state.

I don't have any faith in this argument. You might be right that we'll
correctly see our own output rows as aborted, but that's barely the tip
of the iceberg of risk here. Is it safe to take new locks in an aborted
transaction? (What if we're already past the lock-release point in
the abort sequence?) For that matter, given that we don't know what
exactly caused the transaction abort, how safe is it to do anything at
all --- we might for instance be nearly out of memory. If the catalog
reading attempt itself fails, won't we be in an infinite loop of
transaction aborts? I could probably think of ten more risks if I spent
a few more minutes at it.

Cache invalidation during abort should *not* lead to any attempt to
immediately revalidate the cache. No amount of excuses will make that
okay. I have not looked to see just what the path of control is in this
particular case, but we need to fix it, not paper over it.

regards, tom lane
Re: [HACKERS] pg_dump and schema names
Bruce Momjian writes:
> On Fri, Aug 9, 2013 at 12:53:20PM -0400, Tom Lane wrote:
>> This really requires more than no attention to the comments, especially
>> since you just removed the only apparent reason for _getObjectDescription
>> to make a distinction between objects whose name includes a schema and
>> those that don't.
> I am confused. Are you saying I didn't read the comments, or that I can
> now merge the schema-qualified and non-schema-qualified object sections?

Well, it's certainly not immediately obvious why we shouldn't merge them.  But I would have expected the function's header comment to now explain that the output is intentionally not schema-qualified and assumes that the search path is set for the object's schema if any.

> Also, this seems like dead code as there is no test for "INDEX" in the
> if() block it exists in:
>     /*
>      * Pre-7.3 pg_dump would sometimes (not always) put a fmtId'd name
>      * into te->tag for an index. This check is heuristic, so make its
>      * scope as narrow as possible.
>      */
>     if (AH->version < K_VERS_1_7 &&
>         te->tag[0] == '"' &&
>         te->tag[strlen(te->tag) - 1] == '"' &&
>         strcmp(type, "INDEX") == 0)
>         appendPQExpBuffer(buf, "%s", te->tag);
>     else

Huh, yeah it is dead code, since _printTocEntry doesn't call this function for "INDEX" objects.  And anyway I doubt anybody still cares about reading 7.2-era archive files.  No objection to removing that.

regards, tom lane

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: leave a hint when switching logging away from stderr
Tom,

> I thought about trying to leave similar breadcrumbs if the logging
> parameters are changed while the postmaster is running, but it would add a
> fair amount of complication to the patch, and I'm not sure there's a lot
> of value in it.  On-the-fly logging parameter changes don't happen without
> active DBA involvement, so it's a lot harder to credit that somebody would
> not be expecting the data to start going somewhere else.

Well, I think doing that ALSO would be worthwhile for the TODO list.  I've often wished, for example, that if we switch log_directory the *last* message in the old log file be "reloading postgresql with new configuration" or something similar, so that I would know to look for a new log file somewhere else.  If you are, for example, logging only errors, you wouldn't necessarily realize that logging on the file you're tailing/monitoring has stopped.

The "active DBA involvement" argument doesn't hold much water given the many avenues for someone to accidentally introduce a configuration change they didn't intend.  However, I also realize that the complexity of this feature's implementation would likely eclipse its usefulness.  As such, I'd like to put it on the TODO list for some future occasion when we need to mess with log-switching code *anyway* and can include this.

> > Thoughts? In particular, anyone want to bikeshed on the message wording?
> Does this rise to the level of a usability bug that ought to be
> back-patched?  As I said, we've seen this type of thinko multiple
> times before.

Hmmm.  On the one hand, I can't see the harm in it.  On the other hand, I'm reluctant to introduce non-critical behavior changes into backbranches no matter how minor.  What if we just put this in 9.3 and up?

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal for XML Schema Validation
Hi Craig Ringer,

Yeah, you are right indeed.  I tried to answer your questions in three sections below.

(A) XML Schema update

A user may wish to update a schema in the future, so we will provide one more function, UPDATE_XML_SCHEMA("","","").  Here we are assuming the URL of the schema acts as a primary key, and we will overwrite the content of the .xsd field; there is no provision for altering a selective portion of the xsd in place.  Whenever an update to a schema happens, before committing the change we will run a small algorithm:

    1. for each table in which this schema is used
    2.     if (!validate xml document for each row)
    3.         abort/exit with error;
    4. commit

If the user modifies the schema only by adding some extra optional tags, there won't be any error; errors will arise in cases such as adding new compulsory/required tags, or when the datatype in the .xsd for a certain tag is modified.  This is beyond our control; the obvious solution you suggested could be used: the user manually goes through the rows that violate the schema (for simplicity, we will mention the violating row numbers in our error message) and modifies or deletes them as required.  (A similar situation arises if you have a table t(a,b,c) with a lot of data and later want to add a primary key constraint on column 'a': if the data in column 'a' is not unique, the operation fails, and the user has to handle it manually, perhaps by deleting or modifying the offending rows.)

(B) Alter Table

The sole purpose of the USE_SCHEMA keyword is to mimic Oracle (somewhere on the Oracle site I found this type of syntax).  I may not be correct, but a check constraint is normally used only to limit value ranges, so it seemed better to introduce a new, meaningful keyword; otherwise there is no problem embedding this feature in CHECK().

(C) Yes, there are memory-management-related issues with libxml, as mentioned at http://wiki.postgresql.org/wiki/XML_Support#Implementation_Issues It is also mentioned there that the issue can be resolved (how? don't know!).
Thanks, Vikrantsingh & Pridhvi IIIT Bangalore From: Craig Ringer Sent: Friday, August 09, 2013 8:27 AM To: Kodamasimham Pridhvi (MT2012066) Cc: pgsql-hackers@postgresql.org; Bisen Vikrantsingh Mohansingh MT2012036 Subject: Re: [HACKERS] Proposal for XML Schema Validation On 08/09/2013 12:39 AM, Kodamasimham Pridhvi (MT2012066) wrote: > > Objective: To Add XML Schema validation and xmlvalidate functions (SQL:2008) > > Description: > We’ve gone through current support of xml in postgreSQL and found that there > is a check for well-formedness of xml document while inserting and updating. > We want to extend this feature by adding xml schema validation. >We will be providing user with DDL commands for creating and deleting > XML Schema, also provision of associating xml schema with table while > creation of new table or while altering table structure, we are planning to > use libxml2 library. Proposed syntax is given below. The first thing that comes to mind here is "what if the user wants to update/replace the schema" ? How would you handle re-validating the fields? Sure, updating XML schemas is not a great idea, but it doesn't stop people doing it. It might be reasonable to say "if you want to do this you have to drop the dependent constraints, drop the schema, re-create the schema and re-create the schema constraints" though. Why extend the create table / alter table syntax with "USE_SCHEMA"? Is there a compatibility/standards reason to do this? If not, what advantage does this provide over using a suitable CHECK constraint? IIRC there were some memory management issues with libxml2 in Pg. Anyone remember anything about that? -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] confusing error message
I seem to remember somebody proposed an errurl() macro so that we could add URLs to certain particularly confusing error reports.  [searches the archives]  Bah, that was me, and some other ideas were proposed: http://www.postgresql.org/message-id/48ca9d5f.6060...@esilo.com -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump and schema names
On Fri, Aug 9, 2013 at 12:53:20PM -0400, Tom Lane wrote:
> Bruce Momjian writes:
> > On Fri, Aug 9, 2013 at 01:48:43AM -0400, Tom Lane wrote:
> >> The practical difficulties involved can be seen by reading the comments
> >> and code for _getObjectDescription().
> > Yes, I looked at that.  Seems _getObjectDescription() is only called
> > from _printTocEntry(), and that function has a call to
> > _selectOutputSchema() at the top, so we already know we have search_path
> > set to the proper schema.
> > The attached patch removes the unnecessary schema qualification for
> > ALTER OWNER, and the attached dump file shows a two-schema dump that
> > restores just fine.
> This really requires more than no attention to the comments, especially
> since you just removed the only apparent reason for _getObjectDescription
> to make a distinction between objects whose name includes a schema and
> those that don't.

I am confused.  Are you saying I didn't read the comments, or that I can now merge the schema-qualified and non-schema-qualified object sections?

Also, this seems like dead code as there is no test for "INDEX" in the if() block it exists in:

    /*
     * Pre-7.3 pg_dump would sometimes (not always) put a fmtId'd name
     * into te->tag for an index. This check is heuristic, so make its
     * scope as narrow as possible.
     */
    if (AH->version < K_VERS_1_7 &&
        te->tag[0] == '"' &&
        te->tag[strlen(te->tag) - 1] == '"' &&
        strcmp(type, "INDEX") == 0)
        appendPQExpBuffer(buf, "%s", te->tag);
    else

Please advise.

-- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] confusing error message
On 08/10/2013 12:09 AM, Stephen Frost wrote:
> Perhaps we should add an area to our documentation which provides
> more information about the specific error messages which PostgreSQL
> returns?  That's not a terribly exciting bit of documentation to
> write, but perhaps it would be very useful for our users.

    PG00204  Error reading control file

More seriously, with interpolated strings for relation names etc it can be hard to know which chunks to search for, and search engines aren't always good at having the whole message thrown at them.

I'm not actually proposing in-text message identifiers ... I'm sure enough people have ignored all sanity and reason and parsed message strings that this would cause breakage all over the place.  It'd also be a nightmare for translators and would easily be confused for an SQLSTATE.

A separate field in the structured messages would be saner, but of course most clients would not then display it so the user would never know it was there, much the same way few people seem to realise you can get the function, source file and line from error messages already.  After all, how many people do you think run with VERBOSITY=verbose in psql?

-- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump and schema names
Bruce Momjian writes:
> On Fri, Aug 9, 2013 at 01:48:43AM -0400, Tom Lane wrote:
>> The practical difficulties involved can be seen by reading the comments
>> and code for _getObjectDescription().
> Yes, I looked at that.  Seems _getObjectDescription() is only called
> from _printTocEntry(), and that function has a call to
> _selectOutputSchema() at the top, so we already know we have search_path
> set to the proper schema.
> The attached patch removes the unnecessary schema qualification for
> ALTER OWNER, and the attached dump file shows a two-schema dump that
> restores just fine.

This really requires more than no attention to the comments, especially since you just removed the only apparent reason for _getObjectDescription to make a distinction between objects whose name includes a schema and those that don't.

regards, tom lane

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] question about HTTP API
Josh Berkus writes: > Agreed. Too bad you can't do this as an extension, it would allow you > to rev releases a lot faster than once a year. > Actually, maybe you should look at "what is the minimum patch required > to enable a webserver extension", with the idea that most of the > webserver code would still live outside the core? That way you could > continue to develop it a lot faster. +1. I think for reasons such as security, a lot of people would rather *not* see any such thing in core anyway, independent of development issues. It's also far from clear that there is only one desirable behavior of this sort, so a design path that offers the possibility of multiple webserver implementations as separate extensions seems attractive. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump and schema names
On Fri, Aug 9, 2013 at 01:48:43AM -0400, Tom Lane wrote:
> Bruce Momjian writes:
> > pg_dump goes to great lengths not to hard-code the schema name into
> > commands like CREATE TABLE, instead setting the search_path before
> > creating the table; these commands:
> >
> >     CREATE SCHEMA xx;
> >     CREATE TABLE xx.test(x int);
> >
> > generate this output:
> >
> >     SET search_path = xx, pg_catalog;
> >     CREATE TABLE test (
> >         x integer
> >     );
> >
> > If you dump a schema and want to reload it into another schema, you
> > should only need to update that one search_path line.  However, later in
> > the dump file, we hardcode the schema name for setting the object owner:
> >
> >     ALTER TABLE xx.test OWNER TO postgres;
> >
> > Could we use search_path here to avoid the schema designation?
>
> Perhaps, but that's not likely to reduce the number of places you have to
> edit, unless your dump is only one schema anyway.
>
> The practical difficulties involved can be seen by reading the comments
> and code for _getObjectDescription().

Yes, I looked at that.  Seems _getObjectDescription() is only called from _printTocEntry(), and that function has a call to _selectOutputSchema() at the top, so we already know we have search_path set to the proper schema.

The attached patch removes the unnecessary schema qualification for ALTER OWNER, and the attached dump file shows a two-schema dump that restores just fine.  Basically, if we are going to use search_path to avoid schema specification, we should do it in ALTER OWNER too.

-- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true.
+

diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
new file mode 100644
index cd7669b..0a79489
*** a/src/bin/pg_dump/pg_backup_archiver.c
--- b/src/bin/pg_dump/pg_backup_archiver.c
*** _getObjectDescription(PQExpBuffer buf, T
*** 2906,2913 ****
  		 strcmp(type, "TEXT SEARCH CONFIGURATION") == 0)
  	{
  		appendPQExpBuffer(buf, "%s ", type);
- 		if (te->namespace && te->namespace[0])	/* is null pre-7.3 */
- 			appendPQExpBuffer(buf, "%s.", fmtId(te->namespace));
  
  		/*
  		 * Pre-7.3 pg_dump would sometimes (not always) put a fmtId'd name
--- 2906,2911 ----

--
-- PostgreSQL database dump
--

SET statement_timeout = 0;
SET lock_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SET check_function_bodies = false;
SET client_min_messages = warning;

--
-- Name: xx; Type: SCHEMA; Schema: -; Owner: postgres
--

CREATE SCHEMA xx;

ALTER SCHEMA xx OWNER TO postgres;

--
-- Name: yy; Type: SCHEMA; Schema: -; Owner: postgres
--

CREATE SCHEMA yy;

ALTER SCHEMA yy OWNER TO postgres;

--
-- Name: plpgsql; Type: EXTENSION; Schema: -; Owner:
--

CREATE EXTENSION IF NOT EXISTS plpgsql WITH SCHEMA pg_catalog;

--
-- Name: EXTENSION plpgsql; Type: COMMENT; Schema: -; Owner:
--

COMMENT ON EXTENSION plpgsql IS 'PL/pgSQL procedural language';

SET search_path = xx, pg_catalog;

SET default_tablespace = '';

SET default_with_oids = false;

--
-- Name: test2; Type: TABLE; Schema: xx; Owner: postgres; Tablespace:
--

CREATE TABLE test2 (
    x integer
);

ALTER TABLE test2 OWNER TO postgres;

SET search_path = yy, pg_catalog;

--
-- Name: zz; Type: TABLE; Schema: yy; Owner: postgres; Tablespace:
--

CREATE TABLE zz (
    x integer
);

ALTER TABLE zz OWNER TO postgres;

SET search_path = xx, pg_catalog;

--
-- Data for Name: test2; Type: TABLE DATA; Schema: xx; Owner: postgres
--

COPY test2 (x) FROM stdin;
\.

SET search_path = yy, pg_catalog;

--
-- Data for Name: zz; Type: TABLE DATA; Schema: yy; Owner: postgres
--

COPY zz (x) FROM stdin;
\.

--
-- Name: public; Type: ACL; Schema: -; Owner: postgres
--

REVOKE ALL ON SCHEMA public FROM PUBLIC;
REVOKE ALL ON SCHEMA public FROM postgres;
GRANT ALL ON SCHEMA public TO postgres;
GRANT ALL ON SCHEMA public TO PUBLIC;

--
-- PostgreSQL database dump complete
--

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: leave a hint when switching logging away from stderr
Stephen Frost writes: > * Tom Lane (t...@sss.pgh.pa.us) wrote: >> Does this rise to the level of a usability bug that ought to be >> back-patched? As I said, we've seen this type of thinko multiple >> times before. > For this, I'd say to not back-patch it; we seem to have had enough fun > with changing error messaging in back branches already lately (eg: > the recent autovacuum changes..). Well, since these would be new messages, they'd just not get translated (until the translators got around to them). Seems like the worst case scenario is that someone who didn't understand the English version would remain as clueless as before, which isn't much of a downside. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] question about HTTP API
> For my patch, I plan to use pre-forked bgworkers which have already
> connected to the backend, so that populating the relcache and other process
> startup costs don't impact on the HTTP response time. (This still means
> queries are being planned and function code is being compiled for each
> request, of course...)
>
> This is going to be a very long series of patches, but IMHO we have to
> start somewhere! For some applications, performance is far less important
> than ease-of-use and ease-of-deployment.

Agreed.  Too bad you can't do this as an extension, it would allow you to rev releases a lot faster than once a year.

Actually, maybe you should look at "what is the minimum patch required to enable a webserver extension", with the idea that most of the webserver code would still live outside the core?  That way you could continue to develop it a lot faster.

Also, if all aspects of the web services model (management of sessions, security, etc.) need to be a core PostgreSQL patch, you're in for a really long set of arguments since there's no one "best" way to do these things.  Keeping the web services engine outside the core would let you not have those arguments on this list, which otherwise would likely cause the feature to miss 9.4.

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: leave a hint when switching logging away from stderr
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Thoughts? In particular, anyone want to bikeshed on the message wording? Looks like a good idea to me and the wording looks fine to me. > Does this rise to the level of a usability bug that ought to be > back-patched? As I said, we've seen this type of thinko multiple > times before. For this, I'd say to not back-patch it; we seem to have had enough fun with changing error messaging in back branches already lately (eg: the recent autovacuum changes..). Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] confusing error message
* Tom Lane (t...@sss.pgh.pa.us) wrote: > This is all sufficiently bizarre that I don't know if there's an > easy explanation. It occurs to me that users, when faced with complex error messages, are very likely to go to their favorite search engine with it and rarely does that lead them to any documentation on the subject but instead they get mailing list archives or q-and-a type of websites. Perhaps we should add an area to our documentation which provides more information about the specific error messages which PostgreSQL returns? That's not a terribly exciting bit of documentation to write, but perhaps it would be very useful for our users. Thoughts? Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] mvcc catalog snapshots and TopTransactionContext
On Tue, Aug 06, 2013 at 09:06:59AM +0200, Andres Freund wrote: > On 2013-08-05 13:09:31 -0400, Noah Misch wrote: > > When we call AtEOSubXact_Inval() or AtEOXact_Inval() with a relation still > > open, we can potentially get a relcache rebuild and therefore a syscache > > lookup as shown above. CommitSubTransaction() is also potentially affected, > > though I don't have an SQL-level test case for that. It calls > > CommandCounterIncrement() after moving to TRANS_COMMIT. That CCI had better > > find no invalidations of open relations, or we'll make syscache lookups. > > (In > > simple tests, any necessary invalidations tend to happen at the CCI in > > CommitTransactionCommand(), so the later CCI does in fact find nothing to > > do. > > I have little confidence that should be counted upon, though.) > > > How might we best rearrange things to avoid these hazards? > > Ok. After a good bit of code reading, I think this isn't an actual bug > but an overzealous Assert(). I think it should be > Assert(IsTransactionBlock()) not Assert(IsTransactionState()); IsTransactionBlock() is for higher-level things that care about actual use of BEGIN. It's false in the middle of executing a single-statement transaction, but that's of course a perfectly valid time for syscache lookups. > The reason for that is that when we do the AtEO(Sub)?Xact_Inval(), we've > already done a RecordTransactionAbort(true|false) and > CurrentTransactionState->state = TRANS_ABORT. So the visibility routines > have enough information to consider rows created by the aborted > transaction as invisible. > > I am not really happy with the RelationReloadIndexInfo()s in > RelationClearRelation() when we're in an aborted state, especially as > the comment surrounding them are clearly out of date, but I don't see a > bug there anymore. Interesting. 
-- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
On Fri, Aug 09, 2013 at 03:08:45PM +0200, Dimitri Fontaine wrote: > Stephen Frost writes: > > A shared catalog which defined which *database* to run the trigger > > in, with a way to fire off a new backend worker in that database > > and tell it to run the trigger, might be interesting and would > > deal with the issue that the trigger would behave differently > > depending on the database connected to. That would bring along > > other issues, of course, but it seemed an interesting enough idea > > to mention. > > Yeah, I like that approach. The only drawback is that it requires > having PLproxy in core first, or something like Foreign Functions or > something. SQL/MED does define such an API. Whether we find it useful enough to make it the default way of doing things is a separate matter. I'll do some research. Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal for XML Schema Validation
Craig Ringer writes: > On 08/09/2013 05:55 PM, Kodamasimham Pridhvi (MT2012066) wrote: >> Only sole purpose of making use of keyword USE_SCHEMA is to mimic oracle >> (somewhere on >> oracle site i found this type of syntax) > Well, there's certainly precedent for that - see to_char, the various > different BEGIN permutations, etc. > I would suggest doing that as a second separate step though. First > produce a function based interface that can be tried and tested without > the need to mess with the syntax and the parser. Then once that's in > good shape propose a patch that adds the compatibility syntax. TBH I think any such syntax would be rejected. We have enough trouble dealing with the SQL standards committee's creative ideas about weird syntax with unnecessary keywords. Oracle compatibility is not going to be enough of an argument for inventing another keyword. Especially not if it has to be reserved, which seems rather likely given where you're proposing to put it. Having to add another catalog column for the sole use of this feature is another thing that's unlikely to fly. (A general rule of thumb is that if a proposed feature imposes overhead on everybody, whether they ever use that feature or not, it had better be something that a pretty large percentage of people *will* use. I doubt this meets that standard.) So if you can do it along the lines of CHECK(xml_validates(xml_col_name, 'schema name')), I would strongly urge you to pursue that path. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] mvcc catalog snapshots and TopTransactionContext
On 2013-08-09 09:05:51 -0400, Robert Haas wrote:
> On Thu, Aug 8, 2013 at 1:28 PM, Andres Freund wrote:
> > Well. It isn't. At least not in general. The specific case triggered
> > here though are cache invalidations being processed which can lead to
> > the catalog being read (pretty crummy, but not easy to get rid
> > of). That's actually safe since before we process the invalidations we
> > have done:
> > 1) CurrentTransactionState->state = TRANS_ABORT
> > 2) RecordTransactionAbort(), marking the transaction as aborted in the clog
> > 3) marked subxacts as aborted
> > 4) ProcArrayEndTransaction() (for toplevel ones)
> >
> > Due to these any tqual stuff will treat the current (sub-)xact and its
> > children as aborted. So the catalog lookups will use the catalog in a
> > sensible state.
> >
> > Now, one could argue that it's certainly not safe for anything but
> > xact.c itself to play such games. And would be pretty damn right. We
> > could add some flag to signal catcache.c to temporarily use
> > Assert(IsTransactionBlock()) instead of IsTransactionStmt() but that
> > seems overly complex. I think the danger of code doing stuff in an
> > aborted transaction isn't that big.
> >
> > This certainly deserves a good comment...
>
> Do you want to propose something?

I can, but it will have to wait a couple of days.  I am only still online because my holiday plans didn't 100% work out and got delayed by a day...

> I basically don't have a very good feeling about this.  Processing
> invalidations should invalidate stuff, not try to reread it.

I agree.  But fixing that seems to require a good amount of surgery.  The problem is that currently we cannot know for sure some index doesn't still use the index support infrastructure :(.

> You may
> be right that our MVCC snapshot is OK at the point invalidations are
> processed, but we don't know what caused the transaction to abort in
> the first place.

Well, we know it has been an ERROR and nothing worse.
And that it successfully longjmp'ed up the way to postgres.c or one of the PLs. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal for XML Schema Validation
On 08/09/2013 05:55 PM, Kodamasimham Pridhvi (MT2012066) wrote: > (B) Alter Table > >Only sole purpose of making use of keyword USE_SCHEMA is to mimic oracle > (somewhere on > oracle site i found this type of syntax) Well, there's certainly precedent for that - see to_char, the various different BEGIN permutations, etc. I would suggest doing that as a second separate step though. First produce a function based interface that can be tried and tested without the need to mess with the syntax and the parser. Then once that's in good shape propose a patch that adds the compatibility syntax. Among other things, if you're not adding new syntax you're more likely to be able to prototype this as an extension. I'm very far from being an expert in getting patches into Pg, though, so please don't just take my word for it. > I may not be correct but check constraint is only used to > limit the value ranges. A CHECK constraint can be any logic that refers only to the current row. Using it with non-immutable (stable/volatile) functions isn't prevented, but is also not a great idea, so updating an xsd would be a concern, but it'd otherwise be fine. > yes , there are memory management related issue with libxml as mentioned on > below link > http://wiki.postgresql.org/wiki/XML_Support#Implementation_Issues > It is also mention there that this issue can be resolved(how? don't know!). Well, if you're planning on relying on libxml in core (and it'll have to be in core if you're adding new syntax) then you'll need a solid, well researched answer to that one or an alternative XML library that's portable and doesn't have those issues. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE EVENT TRIGGER syntax
On Mon, Aug 5, 2013 at 4:53 PM, Dimitri Fontaine wrote: > Bruce Momjian writes: >> So do we want to keep that "AND" in the 9.3beta and 9.4 documentation? > > The grammar as in gram.y still allows the AND form, and I think we're > used to maintain documentation that matches the code here. So I think it > makes sense to remove both capabilities as we failed to deliver any > other filter. > > But if we wanted to clean that, what about having the grammar check for > the only one item we support rather than waiting until into > CreateEventTrigger() to ereport a syntax error? I have found that it's generally better to recognize such errors in the post-parse phase rather than during parsing. When you start adding more options, that tends to quickly become the only workable option anyway. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
Stephen Frost writes: > A shared catalog which defined which *database* to run the trigger in, > with a way to fire off a new backend worker in that database and tell it > to run the trigger, might be interesting and would deal with the issue > that the trigger would behave differently depending on the database > connected to. That would bring along other issues, of course, but it > seemed an interesting enough idea to mention. Yeah, I like that approach. The only drawback is that it requires having PLproxy in core first, or something like Foreign Functions or something. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] mvcc catalog snapshots and TopTransactionContext
On Thu, Aug 8, 2013 at 1:28 PM, Andres Freund wrote: > Well. It isn't. At least not in general. The specific case triggered > here though are cache invalidations being processed which can lead to > the catalog being read (pretty crummy, but not easy to get rid > of). That's actually safe since before we process the invalidations we > have done: > 1) CurrentTransactionState->state = TRANS_ABORT > 2) RecordTransactionAbort(), marking the transaction as aborted in the > clog > 3) marked subxacts as aborted > 4) ProcArrayEndTransaction() (for toplevel ones) > > Due to these any tqual stuff will treat the current (sub-)xact and its > children as aborted. So the catalog lookups will use the catalog in a > sensible state. > > Now, one could argue that it's certainly not safe for anything but > xact.c itself to play such games. And would be pretty damn right. We > could add some flag to signal catcache.c to temporarily use > Assert(IsTransactionBlock()) instead of IsTransactionStmt() but that > seems overly complex. I think the danger of code doing stuff in an > aborted transaction isn't that big. > > This certainly deserves a good comment... Do you want to propose something? I basically don't have a very good feeling about this. Processing invalidations should invalidate stuff, not try to reread it. You may be right that our MVCC snapshot is OK at the point invalidations are processed, but we don't know what caused the transaction to abort in the first place. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])
* Robert Haas (robertmh...@gmail.com) wrote: > On Sun, Aug 4, 2013 at 4:26 PM, Dimitri Fontaine > wrote: > > What I'm yet unsure about is that there's a consensus that the use cases > > are worthy of a new shared catalog in the system. Also I didn't look how > > hard it is to actually provide for it. > > A new shared catalog wouldn't actually help, because the actual > procedure to be run has to live in pg_proc, which is not shared. And > that has references to all sorts of other things (like pg_language) > that aren't shared either. A shared catalog which defined which *database* to run the trigger in, with a way to fire off a new backend worker in that database and tell it to run the trigger, might be interesting and would deal with the issue that the trigger would behave differently depending on the database connected to. That would bring along other issues, of course, but it seemed an interesting enough idea to mention. Thanks, Stephen
Re: [HACKERS] Proposal for XML Schema Validation
Hi, In support of the proposal, I'm extending it by providing a use case scenario. Consider a table student(id int, profile xml) where you are storing student ids and their profiles in XML format. Let the XML format for a profile be as follows (the XML markup was stripped in archiving): profile.xml === xyz abc 20 DB101 bangalore 560100 There may be a situation where the course tag is missing in an XML file. The insertion of this file will not prompt any error message (even though it is one of the important fields) because the current version of PostgreSQL only checks for well-formedness of the XML document; there is no provision for validation against an XML schema. Initially the user has to register his schema by using our function REGISTER_XML_SCHEMA( "", "", "") where the user has to provide the URI for the schema, a namespace to avoid namespace clashes, and the complete schema as text; while creating the table the user must associate the schema with the column as below: Create table ( , xml USE_SCHEMA ); e.g.: Create table student (id int, profile xml USE_SCHEMA "http://www.example.com/profile"); So we are providing a facility to register an XML schema against a column with datatype xml in a table, so that whenever an insertion into that column happens the query processor (or whichever unit is involved in this process) will first call our function validate_xml_schema() (as proposed in the previous mail), which will decide whether to proceed or abort the insertion. In this way we can ensure that all the documents in the xml column are valid against the schema. Thanks, Vikrantsingh From: Kodamasimham Pridhvi (MT2012066) Sent: Thursday, August 08, 2013 10:09 PM To: pgsql-hackers@postgresql.org Cc: Bisen Vikrantsingh Mohansingh MT2012036 Subject: RE: [HACKERS] Proposal for XML Schema Validation Objective: To add XML Schema validation and xmlvalidate functions (SQL:2008) Description: We've gone through the current support for XML in PostgreSQL and found that there is a check for well-formedness of an XML document while inserting and updating.
We want to extend this feature by adding XML schema validation. We will provide the user with DDL commands for creating and deleting an XML schema, and also provision for associating an XML schema with a table while creating a new table or while altering the table structure; we are planning to use the libxml2 library. The proposed syntax is given below. 1. End user perspective: 1.1 DDL 1.1.1 Register xmlschema Syntax: REGISTER_XML_SCHEMA( "", "", "") We will save this information in a system catalog. 1.1.2 Delete xmlschema Syntax: DELETE_XML_SCHEMA( "") 1.1.3 Modification to the Create Table command Syntax: Create table ( , xml USE_SCHEMA ) We will keep a flag in the catalog for xml schema validation for each table. If an xml schema is specified then on every insert/update SQL query we will call validate_xml_schema() [currently the built-in xml_is_well_formed() is called while inserting/updating; we can place our function call just next to it] 1.1.4 Similarly for the Alter Table command 2. Developer perspective 2.1 C function for validation of an xml doc 2.1.1 Validating XML Syntax: int validate_xml_schema(char xml[], char xml_schema[]) This function will return 0 if validation succeeds, else a respective error code (which we will define later). We are planning to use libxml2. This function will be called on every insert/update SQL query. From: Andrew Dunstan Sent: Thursday, August 08, 2013 6:54 PM To: Kodamasimham Pridhvi (MT2012066) Cc: pgsql-hackers@postgresql.org; Bisen Vikrantsingh Mohansingh MT2012036; r...@iiitb.ac.in Subject: Re: [HACKERS] Proposal for XML Schema Validation On 08/08/2013 12:42 AM, Kodamasimham Pridhvi (MT2012066) wrote: > Hello pgsql-hackers , > With reference to "Add XML Schema validation and xmlvalidate > functions (SQL:2008)" in the ToDo list, we have gone through the pgsql mailing > list but we didn't find any significant work in this area, so we are > proposing our own model for xml schema validation .
please kindly go > through it and let us know how we can improve it. Please find the > attached proposal document. Please post your proposal as text, not as a PDF attachment. That's what is preferred on this mailing list. cheers andrew
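Taking the earlier advice in this thread to try a function-based interface first, the proposed registration DDL could be prototyped with ordinary SQL objects. Everything below is a hypothetical sketch of the proposal's own names — none of these objects exist in PostgreSQL:

```sql
-- Hypothetical prototype of the proposed registration interface, stored in
-- an ordinary table instead of a new system catalog.
CREATE TABLE xml_schema_registry (
    schema_uri text PRIMARY KEY,   -- the schema URI acts as the key
    target_ns  text NOT NULL,      -- namespace, to avoid clashes
    schema_doc text NOT NULL       -- the complete .xsd as text
);

CREATE FUNCTION register_xml_schema(uri text, ns text, doc text)
RETURNS void LANGUAGE sql AS
$$ INSERT INTO xml_schema_registry VALUES (uri, ns, doc) $$;

CREATE FUNCTION delete_xml_schema(uri text)
RETURNS void LANGUAGE sql AS
$$ DELETE FROM xml_schema_registry WHERE schema_uri = uri $$;
```

The proposed validate_xml_schema() C function would then look schemas up in this table, with no new parser syntax required.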
Re: [HACKERS] Proposal for XML Schema Validation
Hi Craig Ringer, (Sorry for reposting; it got posted in a different thread previously.) Yeah, you are right indeed. I tried to answer your questions in three sections as below. (A) XML Schema update The user may wish to update the schema in future, so we will provide one more function, UPDATE_XML_SCHEMA("","",""). Here we are assuming the URL of the schema as a primary key, and we will overwrite the content of the .xsd field; there is no provision for altering a selected portion of the xsd in place. Whenever an update to the schema happens, before committing the changes we will run a small algorithm as below: 1. For each table in which this schema is used 2. if (!validate xml document for each row) 3. abort/exit with error; 4. commit If the user modifies the schema by adding some extra optional tags then there won't be any error; errors will arise in cases such as adding new compulsory/required tags, or when the datatype in the .xsd for a certain tag is modified. This is beyond our control; an obvious solution, as suggested by you, could be used: the user will manually go through the rows which violate the schema (for simplicity, we will mention the row numbers which violate the schema in our error message) and do modification/deletion as required. (A similar case: suppose you have a table t(a,b,c) with a lot of data, and later on you want to add a primary key constraint to column 'a'; if the data in column 'a' is not unique then it may fail, and the user has to manually handle this situation, perhaps by deleting or modifying the respective rows.) (B) Alter Table The sole purpose of using the keyword USE_SCHEMA is to mimic Oracle (somewhere on the Oracle site I found this type of syntax). I may not be correct, but a check constraint is only used to limit value ranges.
So it is better to introduce a new meaningful keyword, or else there is no problem embedding this feature within CHECK() (C) Yes, there are memory management related issues with libxml, as mentioned at the link below: http://wiki.postgresql.org/wiki/XML_Support#Implementation_Issues It is also mentioned there that this issue can be resolved (how? we don't know yet!). Thanks, Pridhvi & Vikrantsingh IIIT Bangalore From: Craig Ringer Sent: Friday, August 09, 2013 8:27 AM To: Kodamasimham Pridhvi (MT2012066) Cc: pgsql-hackers@postgresql.org; Bisen Vikrantsingh Mohansingh MT2012036 Subject: Re: [HACKERS] Proposal for XML Schema Validation On 08/09/2013 12:39 AM, Kodamasimham Pridhvi (MT2012066) wrote: > > Objective: To Add XML Schema validation and xmlvalidate functions (SQL:2008) > > Description: > We've gone through current support of xml in postgreSQL and found that there > is a check for well-formedness of xml document while inserting and updating. > We want to extend this feature by adding xml schema validation. > We will be providing user with DDL commands for creating and deleting > XML Schema, also provision of associating xml schema with table while > creation of new table or while altering table structure, we are planning to > use libxml2 library. Proposed syntax is given below. The first thing that comes to mind here is "what if the user wants to update/replace the schema"? How would you handle re-validating the fields? Sure, updating XML schemas is not a great idea, but it doesn't stop people doing it. It might be reasonable to say "if you want to do this you have to drop the dependent constraints, drop the schema, re-create the schema and re-create the schema constraints" though. Why extend the create table / alter table syntax with "USE_SCHEMA"? Is there a compatibility/standards reason to do this? If not, what advantage does this provide over using a suitable CHECK constraint? IIRC there were some memory management issues with libxml2 in Pg. Anyone remember anything about that?
-- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
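The re-validation algorithm in (A) above could be sketched in PL/pgSQL along these lines. This is a hedged illustration only: xml_schema_registry, xml_schema_usage and validate_xml_schema() are hypothetical names from the proposal, not existing PostgreSQL objects.

```sql
-- Hypothetical sketch of UPDATE_XML_SCHEMA with the abort-on-violation loop.
CREATE FUNCTION update_xml_schema(uri text, ns text, doc text)
RETURNS void LANGUAGE plpgsql AS $$
DECLARE
    r   record;
    bad bigint;
BEGIN
    UPDATE xml_schema_registry
       SET target_ns = ns, schema_doc = doc
     WHERE schema_uri = uri;

    -- Step 1: for each table/column registered against this schema ...
    FOR r IN SELECT table_name, column_name
               FROM xml_schema_usage
              WHERE schema_uri = uri
    LOOP
        -- Step 2: re-validate every existing row against the new document.
        EXECUTE format(
            'SELECT count(*) FROM %I WHERE NOT validate_xml_schema(%I, $1)',
            r.table_name, r.column_name)
          INTO bad
          USING doc;
        -- Step 3: abort on any violation; the error rolls back the catalog
        -- change too, so step 4 (commit) only happens when everything passes.
        IF bad > 0 THEN
            RAISE EXCEPTION '% row(s) in %.% violate schema %',
                  bad, r.table_name, r.column_name, uri;
        END IF;
    END LOOP;
END;
$$;
```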
Re: [HACKERS] question about HTTP API
On 9 Aug 2013 17:03, "Greg Stark" wrote: > I looked at the wiki and thought it had a lot of good ideas but also a lot of good questions. do you have any idea how to tackle the session problem? > [...] > A decent HTTP RPC layer will need to have some way of creating a session and issuing multiple requests on that session. That session will need to be stored and available for future requests. The obvious concern is state like the current database, current role, gucs, and prepared queries. But even if you're prepared to discard those for a stateless interface the performance issues of not having a relcache built will be pretty severe. The performance certainly will be poor to start with, yes. Sessions and HTTP simply don't go together, and so I think we need to accept that each request is going to be stateless. (We could use Websockets, and pass the socket to libpq, but that hardly counts as an HTTP API.) For my patch, I plan to use pre-forked bgworkers which have already connected to the backend, so that populating the relcache and other process startup costs don't impact the HTTP response time. (This still means queries are being planned and function code is being compiled for each request, of course...) This is going to be a very long series of patches, but IMHO we have to start somewhere! For some applications, performance is far less important than ease-of-use and ease-of-deployment. Regards, Andrew Tipton
Re: [HACKERS] question about HTTP API
On Fri, Aug 9, 2013 at 9:21 AM, Andrew Tipton wrote: > I recently threw together a quick-and-dirty prototype of this idea. It > was an external tool which used the libmicrohttpd library to accept > incoming requests, convert them to a SQL query (which called a stored > procedure), and return the query results. (It allowed *any* content-type > to be returned, not just JSON.) I only got as far as handling GET > requests. The code is available here: I looked at the wiki and thought it had a lot of good ideas but also a lot of good questions. Do you have any idea how to tackle the session problem? Postgres has always assumed session == backend == connection. 2PC prepared transactions are the one main break in this model, and they can take a lot of short cuts because they know there will be no more operations in the transaction aside from commit or rollback. A decent HTTP RPC layer will need to have some way of creating a session and issuing multiple requests on that session. That session will need to be stored and available for future requests. The obvious concern is state like the current database, current role, gucs, and prepared queries. But even if you're prepared to discard those for a stateless interface the performance issues of not having a relcache built will be pretty severe. I suspect this is something better built into something like pgbouncer, which already has to deal with multiplexing many clients onto a single connection. -- greg
Re: [HACKERS] question about HTTP API
On Fri, Aug 9, 2013 at 3:44 AM, Josh Berkus wrote: > Well, there's HTSQL: http://htsql.org/ > > Other than that, no. I was thinking of creating a general tool as a > custom background worker, which would take stored procedure calls and > pass them through to PostgreSQL, returning results as JSON. Mainly > because I need it for a project. However, this wouldn't accept any query. I'm actually in the process of writing an HTTP server that lives inside Postgres. I hope to have a WIP patch ready in the next week or two, and then (if all goes well) submit it for CF2. [There are a few impediments to doing this as an extension module, which I shall detail as part of the WIP patch...] Why integrate a webserver with Postgres? Well, the trend that we're seeing in web development is to push much of the display logic into client-side Javascript frameworks and expose the database through an HTTP API. (Good examples of this are Parse and Meteor.) CouchDB can even host the application's static content alongside the data. As a result, many applications don't need any middle-tier Python/Ruby/Java framework at all. One of my goals is to allow Postgres to directly serve HTTP requests and return arbitrary content (e.g. text/html in addition to JSON) directly to end-user browsers. With the JSON datatype and PL/v8, code can even be re-used on both client and server. Getting rid of an entire middle tier would make small-scale application development dramatically easier. Regards, Andrew Tipton
Re: [HACKERS] question about HTTP API
On Fri, Aug 9, 2013 at 2:44 AM, Szymon Guz wrote: > > Do we have any attempts of implementation the HTTP server described at > http://wiki.postgresql.org/wiki/HTTP_API? > > It seems like there are design ideas only. Are there any ideas about > implementation like using some existing http servers or writing everything > from scratch? > I recently threw together a quick-and-dirty prototype of this idea. It was an external tool which used the libmicrohttpd library to accept incoming requests, convert them to a SQL query (which called a stored procedure), and return the query results. (It allowed *any* content-type to be returned, not just JSON.) I only got as far as handling GET requests. The code is available here: http://code.malloclabs.com/pghttpd.v1 I'm also aware of an nginx module (ngx_postgres) that lets you transform requests into queries against a Postgres database, but it isn't "generic" -- you have to configure it for each URL that should be handled. Regards, Andrew Tipton
Re: [HACKERS] [PATCH] Statistics collection for CLUSTER command
> Thank you, but it seems you've duplicated the title from the other patch > (and thanks for adding that one, too!). Indeed, possibly a wrong copy-paste. Fixed. -- Fabien.
Re: [HACKERS] [PATCH] Statistics collection for CLUSTER command
On 08/08/2013 02:26 PM, Fabien COELHO wrote: > >> As part of routine maintenance monitoring, it is interesting for us to >> have statistics on the CLUSTER command (timestamp of last run, and >> number of runs since stat reset) like we have for (auto)ANALYZE and >> (auto)VACUUM. Patch against today's HEAD attached. >> >> I would add this to the next commitfest but I seem to be unable to log >> in with my community account (I can log in to the wiki). Help >> appreciated. > > Done. > Thank you, but it seems you've duplicated the title from the other patch (and thanks for adding that one, too!). https://commitfest.postgresql.org/action/patch_view?id=1190 Vik