Re: [HACKERS] plperlu problem with utf8
On Sat, Dec 18, 2010 at 20:29, David E. Wheeler wrote:
> On Dec 17, 2010, at 9:32 PM, David Christensen wrote:
> latin=# SELECT * FROM perlgets('“hello”');
>  length │ is_utf8
> ────────┼─────────
>      11 │ f
>
> (Yes I used Latin-1 curly quotes in that last example).

Erm, latin1 does not have curly quotes; Windows-1252 does. Those are utf8 quotes AFAICT, so 11 is actually right (that's 3 bytes per quote, which is where the 11 comes from). If latin1 did have quotes and you used them, you would have gotten the same answer, as latin1 is a single-byte encoding. :) I think your terminal is tripping you up here. Postgres also gives the same length:

latin1=# select length('“hello”');
 length
--------
     11
(1 row)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Can postgres create a file with physically contiguous blocks.
Folks, does Postgres make an effort to create files with physically contiguous blocks? Thanks!
Re: [HACKERS] Re: Proposed Windows-specific change: Enable crash dumps (like core files)
On 18/12/2010 1:13 AM, Magnus Hagander wrote: On Fri, Dec 17, 2010 at 17:42, Magnus Hagander wrote: On Fri, Dec 17, 2010 at 17:24, Craig Ringer wrote: On 17/12/2010 7:17 PM, Magnus Hagander wrote: Now, that's annoying. So clearly we can't use that function to determine which version we're on. Seems it only works for "image help api", and not the general thing. According to http://msdn.microsoft.com/en-us/library/ms679294(v=vs.85).aspx, we could look for: SymEnumLines - if present, we have at least 6.1. However, I don't see any function that appeared in 6.0 only.. Actually, I'm wrong - there are functions enough to determine the version. So here's a patch that tries that.

Great. I pulled the latest from your git tree, tested that, and got much better results. Crashdump size is back to what I expected. In my test code, fcinfo->args and fcinfo->argnull can be examined without problems. Backtraces look good; see below. It seems to be including backend private memory again now. Thanks _very_ much for your work on this.

fcinfo->flinfo is still inaccessible, but I suspect it's in shared memory, as it's at 0x0135 . Ditto fcinfo->resultinfo and fcinfo->context. This has me wondering - is it going to be necessary to dump shared memory to make many backtraces useful? I just responded to Tom mentioning that the patch doesn't currently dump shared memory, but I hadn't realized the extent to which it's used for _lots_ more than just disk buffers.

I'm not sure how to handle dumping shared_buffers when someone might be using multi-gigabyte shared_buffers, though. Dumping the whole lot would risk sudden out-of-disk-space issues, slowdowns as dumps are written, and the backend being "frozen" as it's being dumped could delay the system coming back up again. Trying to selectively dump critical parts could cause dumps to fail if the system is in early startup or a bad state.
The same concern applies to writing backend private memory; it's fine most of the time, but if you're doing data warehousing queries with 2GB of work_mem, it's going to be nasty having all that extra disk I/O and disk space use, not to mention the hold-up while the dump is written. If this is something we want people running in production "just in case" or to track down rare / hard-to-reproduce faults, that'll be a problem. OTOH, we can't really go poking around in palloc contexts to decide what to dump.

I guess we could always do a small, minimalist minidump, then write _another_ dump that attempts to include select parts of shm and backend private memory.

I just thought of two other things, too:

- Is it possible for this handler to be called recursively if it fails during the handler call? If so, do we need to uninstall the handler before attempting a dump to avoid such recursion? I need to do some testing and dig around MSDN to find out more about this.

- Can asynchronous events like signals (or their win32 emulation) interrupt an executing crash handler, or are they blocked before the crash handler is called? If they're not blocked, do we need to try to block them before attempting a dump? Again, I need to do some reading on this.

Anyway, here's an example of the backtraces I'm currently getting. They're clearly missing some parameters (in shm? Unsure) but provide source file+line, argument values where resolvable, and the call stack itself. Locals are accessible at all levels of the stack when you go poking around in windbg.

This dump file has an exception of interest stored in it. The stored exception information can be accessed via .ecxr.
(930.12e8): Access violation - code c005 (first/second chance not available) eax=00bce2c0 ebx=72d0e800 ecx=02e4 edx=72cb81c8 esi=00f0 edi=0930 eip=771464f4 esp=00bce294 ebp=00bce2a4 iopl=0 nv up ei pl zr na pe nc cs=001b ss=0023 ds=0023 es=0023 fs=003b gs= efl=0246 ntdll!KiFastSystemCallRet: 771464f4 c3 ret 0:000> .ecxr eax= ebx= ecx=015fd7d8 edx=7362100f esi=015fd7c8 edi=015fd804 eip=73621052 esp=00bcf284 ebp=015fd7c8 iopl=0 nv up ei pl zr na pe nc cs=001b ss=0023 ds=0023 es=0023 fs=003b gs= efl=00010246 crashme!crashdump_crashme+0x2: 73621052 c7000100mov dword ptr [eax],1ds:0023:= 0:000> kp *** Stack trace for last set context - .thread/.cxr resets it ChildEBP RetAddr 00bcf280 0031c797 crashme!crashdump_crashme(struct FunctionCallInfoData * fcinfo = 0x015e3318)+0x2 [c:\users\craig\developer\postgres\contrib\crashme\crashme.c @ 14] 00bcf2e4 0031c804 postgres!ExecMakeFunctionResult(struct FuncExprState * fcache = 0x015e3318, struct ExprContext * econtext = 0x00319410, char * isNull = 0x "", ExprDoneCond * isDone = 0x7362100f)+0x427 [c:\users\craig\developer\postgres\src\backend\executor\execqual.c @ 1824] 00bcf30c 0031b760 postgres!ExecEvalFunc(struct FuncExprState * fcache = 0x00
[HACKERS] can shared cache be swapped to disk?
Hi, folks! I see that the shared cache is implemented with System V shared memory. I wonder whether data in this area can be swapped out to disk. Isn't it bad that we read data from disk, put it in the shared cache, and then the shared cache gets swapped back out to disk? Why not use shmctl(..SHM_LOCK..) to pin the data in main memory? Thanks!
Re: [HACKERS] Re: Proposed Windows-specific change: Enable crash dumps (like core files)
On 17/12/2010 11:17 PM, Tom Lane wrote:
> So I'd be happy to have it enabled - given that we want a default. There is absolutely nothing more frustrating than having a crash dump that hasn't got the information you need. Please turn it on by default.

There have to be limits to that, though. dbghelp.dll can dump shared memory, too - but it's not going to be practical to write or work with many-gigabyte dumps most of the time, and neither my initial patch nor Magnus's recent reworking of it ask dbghelp.dll to include shared memory in the dump.

It's possible to tell dbghelp.dll to include specific additional memory ranges, so it may well be worth including specific critical areas within the shared memory block while omitting the bulk of the buffers. Doing that safely in an unsafe we're-crashing environment might not be simple, though, especially if you're not even sure whether you got far into startup before crashing.

--
Craig Ringer
Tech-related writing at http://soapyfrogs.blogspot.com/
Re: [HACKERS] serializable lock consistency
On Wed, Dec 15, 2010 at 9:20 AM, Florian Pflug wrote:
> On Dec14, 2010, at 15:01 , Robert Haas wrote:
>> On Tue, Dec 14, 2010 at 7:51 AM, Florian Pflug wrote: - serializable lock consistency - I am fairly certain this needs rebasing. I don't have time to deal with it right away. That sucks, because I think this is a really important change.
>>> I can try to find some time to update the patch if it suffers from bit-rot. Would that help?
>>
>> Yes!
>
> I've rebased the patch to the current HEAD, and re-run my FK concurrency test suite, available from https://github.com/fgp/fk_concurrency, to verify that things still work.
>
> I've also added asserts to the callers of heap_{update,delete,lock_tuple} to verify (and document) that update_xmax may only be InvalidTransactionId if a lockcheck_snapshot is passed to heap_{update,delete,lock_tuple}.
>
> Finally, I've improved the explanation in src/backend/executor/README of how row locks and REPEATABLE READ transactions interact, and tried to state the guarantees provided by FOR SHARE and FOR UPDATE locks more precisely.
>
> I've published my work to https://github.com/fgp/postgres/tree/serializable_lock_consistency, and attached an updated patch. I'd be happy to give write access to that GIT repository to anyone who wants to help getting this committed.

I am having a hard time understanding this patch. I decided to start with the README, and I'm still lost. :-(

My understanding of the problem is as follows. Acquiring a lock on a tuple prevents the tuple from being concurrently updated. You might take such a lock on a tuple T, make some other modification to the database M, and commit, in the hope that your lock will prevent a concurrent transaction from updating T without seeing M. However, it doesn't work. When your lock on T is released, a concurrent serializable transaction will still be looking at the database using its original snapshot, and therefore won't see M.
In effect, you might as well not have bothered obtaining a lock at all. (You'd get the same wrong answer that way, without unnecessary blocking!) To fix this, one must ensure that if a tuple T is locked - either for update or for share - by transaction A, and if a serializable transaction B whose snapshot doesn't permit it to see the effects of A subsequently attempts to update T, it must roll back.

Questions:

1. Is my understanding of the problem correct?
2. Is my understanding of the fix correct?
3. Is that what this patch implements?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] SQL/MED - file_fdw
On Sat, Dec 18, 2010 at 10:43 PM, Itagaki Takahiro wrote:
> On Sun, Dec 19, 2010 at 12:18, Robert Haas wrote:
>> I'm sort of suspicious of the fact that BeginCopyTo() is a shell around BeginCopy() while BeginCopyFrom() does a whole bunch of other stuff. I haven't grokked what the code is doing here well enough to have a concrete proposal though...
>
> I added Begin/EndCopyTo() just because the internal code looks symmetric. The proposal doesn't change behaviors of COPY commands at all. It just exports a part of COPY FROM codes as "File Reader" so that the file_fdw external module can reuse the code. I believe we have the conclusion that we should avoid code duplication to read files in the prior discussion.

I'm not questioning any of that. But I'd like the resulting code to be as maintainable as we can make it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Extensions, patch v20 (bitrot fixes) (was: [HACKERS] Extensions, patch v19 (encoding brainfart fix))
On Sat, Dec 18, 2010 at 10:04 PM, Robert Haas wrote:
> - Has the issue of changing custom_variable_classes from PGC_SIGHUP to PGC_SUSET been discussed? I am not sure whether that's an OK thing to do. If it is OK, then the documentation also needs updating.
>
> - This comment looks like leftovers:
>
> + /* FIXME: add PGC_EXTENSION so that we don't abuse PGC_SIGHUP here? */
> + SetConfigOption("custom_variable_classes",
> +                 newval, PGC_SIGHUP, PGC_S_EXTENSION);
>
> Apologies if I missed the previous discussion of this, but why are we adding a new GUC context?

Looking at this a little more, I am inclined to think that ExtensionSetCVC() is entirely unacceptable. Our backend startup cost is high enough already. Sequentially scanning the pg_extension catalog on every startup to spare the DBA the trouble of setting up postgresql.conf strikes me as a bad trade-off.

If you were to come back and say that custom_variable_classes is a vile hack from the darkest reaches of wherever vile hacks originate from, I would agree with you. Having a GUC that controls what names can be used as GUCs is probably a bad design, and I'd like to come up with something better. But I don't think this is it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] SQL/MED - file_fdw
On Sun, Dec 19, 2010 at 12:18, Robert Haas wrote:
> I'm sort of suspicious of the fact that BeginCopyTo() is a shell around BeginCopy() while BeginCopyFrom() does a whole bunch of other stuff. I haven't grokked what the code is doing here well enough to have a concrete proposal though...

I added Begin/EndCopyTo() just because the internal code looks symmetric. The proposal doesn't change the behavior of COPY commands at all. It just exports part of the COPY FROM code as a "File Reader" so that the file_fdw external module can reuse it. I believe we reached the conclusion in the prior discussion that we should avoid duplicating the file-reading code.

We could arrange the COPY TO code along the same lines as the COPY FROM APIs, but I haven't, and won't do that at this time, because it is not required by SQL/MED at all. If we do, it would be "File Writer" APIs, like:

cstate = BeginCopyTo(...);
while (tuple = ReadTupleFromSomewhere())
{
    /* write the tuple into a TSV/CSV file */
    NextCopyTo(cstate, tuple);
}
EndCopyTo(cstate);

--
Itagaki Takahiro
Re: Extensions, patch v20 (bitrot fixes) (was: [HACKERS] Extensions, patch v19 (encoding brainfart fix))
On Dec 18, 2010, at 7:04 PM, Robert Haas wrote:
> - Did we decide to ditch the encoding parameter for extension scripts and mandate UTF-8?

+1 It was certainly suggested. I think it's a good idea, at least for a first cut.

Best,

David
Re: [HACKERS] plperlu problem with utf8
On Dec 17, 2010, at 9:32 PM, David Christensen wrote:
> +1 on the original sentiment, but only for the case that we're dealing with data that is passed in/out as arguments. In the case that the server_encoding is UTF-8, this is as trivial as a few macros on the underlying SVs for text-like types. If the server_encoding is SQL_ASCII (= byte soup), this is a trivial case of doing nothing with the conversion regardless of data type. For any other server_encoding, the data would need to be converted from the server_encoding to UTF-8, presumably using the built-in conversions before passing it off to the first code path. A similar handling would need to be done for the return values, again datatype-dependent.

+1

> Recent upgrades of the Encode module included with perl 5.10+ have caused issues wherein circular dependencies between Encode and Encode::Alias have made it impossible to load in a Safe container without major pain. (There may be some better options than I'd had on a previous project, given that we're embedding our own interpreters and accessing more through the XS guts, so I'm not ruling out this possibility completely).

Fortunately, thanks to Tim Bunce, PL/Perl no longer relies on Safe.pm.

>> Well that works for me. I always use UTF8. Oleg, what was the encoding of your database where you saw the issue?
>
> I'm not sure what the current plperl runtime does as far as marshaling this, but it would be fairly easy to ensure the parameters came in in perl's internal format given a server_encoding of UTF8 and some type introspection to identify the string-like types/text data. (Perhaps any type which had a binary cast to text would be a sufficient definition here. Do domains automatically inherit binary casts from their originating types?)

Their labels are TEXT. I believe that the only type that should not be treated as text is bytea.

>>> 2) its not utf8, so we just leave it as octets.
>> Which means Perl will assume that it's Latin-1, IIUC.
>
> This is sub-optimal for non-UTF-8-encoded databases, for reasons I pointed out earlier. This would produce bogus results for any non-UTF-8, non-ASCII, non latin-1 encoding, even if it did not generally bite most people in general usage.

Agreed.

> This example seems bogus; wouldn't length be 3 if this is the example text this was run with? Additionally, since all ASCII is trivially UTF-8, I think a better example would be using a string with hi-bit characters so if this was improperly handled the lengths wouldn't match; length($all_ascii) == length(encode_utf8($all_ascii)) vs length($hi_bit) < length(encode_utf8($hi_bit)). I don't see that this test shows us much with the test case as given. The is_utf8() function merely returns the state of the SV_utf8 flag, which doesn't speak to UTF-8 validity (i.e., this need not be set on ascii-only strings, which are still valid in the UTF-8 encoding), nor does it indicate that there are no hi-bit characters in the string (i.e., with encode_utf8($hi_bit_string)), the source string $hi_bit_string (in perl's internal format) with hi-bit characters will have the utf8 flag set, but the return value of encode_utf8 will not, even though the underlying data, as represented in perl will be identical).

Sorry, I probably had a pasto there. How about this?

CREATE OR REPLACE FUNCTION perlgets(
    TEXT
) RETURNS TABLE(length INT, is_utf8 BOOL) LANGUAGE plperl AS $$
    my $text = shift;
    return_next {
        length  => length $text,
        is_utf8 => utf8::is_utf8($text) ? 1 : 0
    };
$$;

utf8=# SELECT * FROM perlgets('“hello”');
 length │ is_utf8
────────┼─────────
      7 │ t

latin=# SELECT * FROM perlgets('“hello”');
 length │ is_utf8
────────┼─────────
     11 │ f

(Yes I used Latin-1 curly quotes in that last example). I would argue that it should output the same as the first example. That is, PL/Perl should have decoded the latin-1 before passing the text to the Perl function.
>> In a latin-1 database:
>>
>> latin=# select * from perlgets('foo');
>>  length │ is_utf8
>> ────────┼─────────
>>       8 │ f
>> (1 row)
>>
>> I would argue that in the latter case, is_utf8 should be true, too. That is, PL/Perl should decode from Latin-1 to Perl's internal form.
>
> See above for discussion of the is_utf8 flag; if we're dealing with latin-1 data or (more precisely in this case) data that has not been decoded from the server_encoding to perl's internal format, this would exactly be the expectation for the state of that flag.

Right. I think that it *should* be decoded.

>> Interestingly, when I created a function that takes a bytea argument, utf8 was *still* enabled in the utf-8 database. That doesn't seem right to me.
>
> I'm not sure what you mean here, but I do think that if bytea is identi
Re: [HACKERS] SQL/MED - file_fdw
On Fri, Dec 17, 2010 at 11:01 PM, Itagaki Takahiro wrote:
> On Fri, Dec 17, 2010 at 11:49, Shigeru HANADA wrote:
>> I've just moved permission check and read-only check from BeginCopy() to DoCopy(). Please see attached patch.
>
> Thanks!
>
> Are there any objections for the change? If acceptable, I'd like to apply it prior to SQL/MED and file_fdw patches.

I think at a bare minimum the functions you're adding should have a comment explaining what they do, what their arguments mean, etc. I'm sort of suspicious of the fact that BeginCopyTo() is a shell around BeginCopy() while BeginCopyFrom() does a whole bunch of other stuff. I haven't grokked what the code is doing here well enough to have a concrete proposal though...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Extensions, patch v20 (bitrot fixes) (was: [HACKERS] Extensions, patch v19 (encoding brainfart fix))
On Sat, Dec 18, 2010 at 2:39 PM, Dimitri Fontaine wrote: > Here it is, attached. The problem was of course due to git branches and > missing local pulling and merging. Going gradually from 7 to 2 branches > causes that, sometimes. I spent a little time looking at this tonight. I'm going to give you the same general advice that I've given other people who have submitted very large patches of this type: it'll be a lot easier to get this committed if we can break it into smaller pieces. A pretty easy thing to do would be to separate the changes that turn the existing contrib modules into extensions into its own patch. A possibly more controversial idea is to actually remove the notion of a relocatable extension from this patch, and all the supporting machinery, and make that a follow-on patch. I think it was arranged like that before and somehow got collapsed, but maybe the notion is worth revisiting, because this is a lot of code: 224 files changed, 4713 insertions(+), 293 deletions(-) That issue aside, I think there is still a fair amount of cleanup and code review that needs to happen here before this can be considered for commit. Here are a few things I noticed on a first read-through: - pg_execute_sql_string() and pg_execute_sql_file() are documented, but are not actually part of the patch. - The description of the NO USER DATA option is almost content-free. It basically says "this option is here because it wouldn't work if we didn't have this option". - listDependentObjects() appears to be largely cut-and-pasted from some other function. It calls AcquireDeletionLock() and the comments mention constructing a list of objects to delete. It sure doesn't seem right to be acquiring AccessExclusiveLocks on lots of things in a function that's there to support the pg_extension_objects SRF. 
- This bit certainly violates our message style guidelines:

+    elog(NOTICE,
+         "Installing extension '%s' from '%s', in schema %s, %s user data",
+         stmt->extname,
+         get_extension_absolute_path(control->script),
+         schema ? schema : "public",
+         create_extension_with_user_data ? "with" : "without");

User-facing messages need to be ereport, not elog, so they can be translated, and mustn't be assembled from pieces, as you've done with "%s user data".

- Has the issue of changing custom_variable_classes from PGC_SIGHUP to PGC_SUSET been discussed? I am not sure whether that's an OK thing to do. If it is OK, then the documentation also needs updating.

- This comment looks like leftovers:

+ /* FIXME: add PGC_EXTENSION so that we don't abuse PGC_SIGHUP here? */
+ SetConfigOption("custom_variable_classes",
+                 newval, PGC_SIGHUP, PGC_S_EXTENSION);

Apologies if I missed the previous discussion of this, but why are we adding a new GUC context?

- Did we decide to ditch the encoding parameter for extension scripts and mandate UTF-8?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] keeping a timestamp of the last stats reset (for a db, table and function)
Tomas Vondra writes:
> I've done several small changes to the patch, namely
> - added docs for the functions (in SGML)
> - added the same thing for background writer
> So I think now it's 'complete' and I'll add it to the commit fest in a few minutes.

Please split this into separate patches for database-level and table-level tracking, because I think it's highly likely that the latter will get rejected. We have had multiple complaints already about the size of the stats file for databases with many tables. I don't believe that it's worth bloating the per-table entries even further with this information. Tracking reset time per-database doesn't seem like a problem, though.

regards, tom lane
Re: [HACKERS] keeping a timestamp of the last stats reset (for a db, table and function)
Dne 12.12.2010 03:47, Robert Haas napsal(a): > On Sat, Dec 11, 2010 at 4:40 PM, wrote: >>> Hello >>> >>> you have to respect pg coding style: >>> >>> a) not too long lines >>> b) not C++ line comments >> >> OK, thanks for the notice. I've fixed those two problems. > > Please add this to the currently-open commitfest. > > https://commitfest.postgresql.org/action/commitfest_view/open I've done several small changes to the patch, namely - added docs for the functions (in SGML) - added the same thing for background writer So I think now it's 'complete' and I'll add it to the commit fest in a few minutes. regards Tomas diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 5fd0213..40c6c6b 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -600,6 +600,15 @@ postgres: user database host + pg_stat_get_db_last_stat_reset_time(oid) + timestamptz + + Time of the last reset of statistics for this database (as a result of executing + pg_stat_reset function) + + + + pg_stat_get_numscans(oid) bigint @@ -725,6 +734,15 @@ postgres: user database host + pg_stat_get_last_stat_reset_time(oid) + timestamptz + + Time of the last reset of statistics on this table (as a result of executing + pg_stat_reset or pg_stat_reset_single_table_counters) + + + + pg_stat_get_vacuum_count(oid) bigint @@ -879,6 +897,15 @@ postgres: user database host + pg_stat_get_function_last_stat_reset_time(oid) + timestamptz + + Time of the last reset of statistics for this function (as a result of executing + pg_stat_reset or pg_stat_reset_single_function_counters) + + + + pg_stat_get_xact_function_calls(oid) bigint @@ -1064,6 +1091,15 @@ postgres: user database host + pg_stat_get_bgwriter_last_stat_reset_time() + timestamptz + + Time of the last reset of statistics for the background writer (as a result of executing + pg_stat_reset_shared('bgwriter')) + + + + pg_stat_get_buf_written_backend() bigint diff --git a/src/backend/catalog/system_views.sql 
b/src/backend/catalog/system_views.sql index 346eaaf..3e1cca3 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -310,6 +310,7 @@ CREATE VIEW pg_stat_all_tables AS pg_stat_get_last_autovacuum_time(C.oid) as last_autovacuum, pg_stat_get_last_analyze_time(C.oid) as last_analyze, pg_stat_get_last_autoanalyze_time(C.oid) as last_autoanalyze, +pg_stat_get_last_stat_reset_time(C.oid) as last_stat_reset, pg_stat_get_vacuum_count(C.oid) AS vacuum_count, pg_stat_get_autovacuum_count(C.oid) AS autovacuum_count, pg_stat_get_analyze_count(C.oid) AS analyze_count, @@ -502,7 +503,8 @@ CREATE VIEW pg_stat_database AS pg_stat_get_db_tuples_fetched(D.oid) AS tup_fetched, pg_stat_get_db_tuples_inserted(D.oid) AS tup_inserted, pg_stat_get_db_tuples_updated(D.oid) AS tup_updated, -pg_stat_get_db_tuples_deleted(D.oid) AS tup_deleted +pg_stat_get_db_tuples_deleted(D.oid) AS tup_deleted, +pg_stat_get_db_last_stat_reset_time(D.oid) AS last_stat_reset FROM pg_database D; CREATE VIEW pg_stat_user_functions AS @@ -512,7 +514,8 @@ CREATE VIEW pg_stat_user_functions AS P.proname AS funcname, pg_stat_get_function_calls(P.oid) AS calls, pg_stat_get_function_time(P.oid) / 1000 AS total_time, -pg_stat_get_function_self_time(P.oid) / 1000 AS self_time +pg_stat_get_function_self_time(P.oid) / 1000 AS self_time, +pg_stat_get_function_last_stat_reset_time(P.oid) AS last_stat_reset FROM pg_proc P LEFT JOIN pg_namespace N ON (N.oid = P.pronamespace) WHERE P.prolang != 12 -- fast check to eliminate built-in functions AND pg_stat_get_function_calls(P.oid) IS NOT NULL; @@ -538,7 +541,8 @@ CREATE VIEW pg_stat_bgwriter AS pg_stat_get_bgwriter_maxwritten_clean() AS maxwritten_clean, pg_stat_get_buf_written_backend() AS buffers_backend, pg_stat_get_buf_fsync_backend() AS buffers_backend_fsync, -pg_stat_get_buf_alloc() AS buffers_alloc; +pg_stat_get_buf_alloc() AS buffers_alloc, +pg_stat_get_bgwriter_last_stat_reset_time() AS last_stat_reset; CREATE VIEW 
pg_user_mappings AS SELECT diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c index c3c136a..1fd5e6a 100644 --- a/src/backend/postmaster/pgstat.c +++ b/src/backend/postmaster/pgstat.c @@ -249,6 +249,8 @@ static void pgstat_sighup_handler(SIGNAL_ARGS); static PgStat_StatDBEntry *pgstat_get_db_entry(Oid databaseid, bool cr
Re: [HACKERS] plperlu problem with utf8
On Dec 17, 2010, at 10:46 PM, Alex Hunsaker wrote:
>> But that's a separate issue from the, erm, inconsistency with which PL/Perl treats encoding and decoding of its inputs and outputs.
>
> Yay! So I think we can finally agree that for Oleg's original test case postgres was getting it right. I hope? :)

Yes, and so was Perl.

David
Re: [HACKERS] pg_ctl and port number detection
Andrew Dunstan wrote:
> On 12/18/2010 06:23 PM, Bruce Momjian wrote:
>>> If you really think that pulling a port number out of the pid file is an improvement over what pg_ctl does now, then you need to start by storing the port number, as such, in the pid file. Not something that might or might not be related to the port number. But what we have to discuss before that is whether we mind having a significant postmaster version dependency in pg_ctl.
>> OK, good point on the version issue. Let's see if we get more complaints before changing this. Thanks.
>
> Wasn't there a proposal to provide an explicit port parameter to pg_ctl, instead of relying on PGPORT? That would probably be a small advance.

I do not remember that suggestion. I wonder if we should write the port number as the 4th line in postmaster.pid, and then in a few major releases start relying on it. We could fall back to our existing code if there is no 4th line.

--
Bruce Momjian http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +
Re: [HACKERS] pg_ctl and port number detection
On 12/18/2010 06:23 PM, Bruce Momjian wrote:
> If you really think that pulling a port number out of the pid file is an improvement over what pg_ctl does now, then you need to start by storing the port number, as such, in the pid file. Not something that might or might not be related to the port number. But what we have to discuss before that is whether we mind having a significant postmaster version dependency in pg_ctl.
>
> OK, good point on the version issue. Let's see if we get more complaints before changing this. Thanks.

Wasn't there a proposal to provide an explicit port parameter to pg_ctl, instead of relying on PGPORT? That would probably be a small advance.

cheers

andrew
Re: [HACKERS] pg_ctl and port number detection
Tom Lane wrote:
> Bruce Momjian writes:
>> Tom Lane wrote:
>>> Bruce Momjian writes:
>>>> pg_ctl already knows the data directory. If the file is missing, the server is not running. If the file exists, the first number on the last line, divided by 1000, is the port number.
>>> That's somewhere between fragile and outright wrong.
>> Please explain why my idea is not an improvement.
> Because it's assuming that those numbers are sysv shmem keys derived in a particular way. We have platforms on which that is wrong, Windows being the most obvious example. Reading the shmem key assignment code closely will suggest to you other ways that this could fail. Not to mention that people propose getting rid of sysv shmem approximately every other month, and perhaps someday that will actually happen; whereupon whatever might get logged in postmaster.pid could be something completely different.

Yeah, I was afraid of Windows.

> If you really think that pulling a port number out of the pid file is an improvement over what pg_ctl does now, then you need to start by storing the port number, as such, in the pid file. Not something that might or might not be related to the port number. But what we have to discuss before that is whether we mind having a significant postmaster version dependency in pg_ctl.

OK, good point on the version issue. Let's see if we get more complaints before changing this. Thanks.

--
Bruce Momjian http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +
Re: [HACKERS] pg_ctl and port number detection
Bruce Momjian writes: > Tom Lane wrote: >> Bruce Momjian writes: >>> pg_ctl already knows the data directory. If the file is missing, the >>> server is not running. If the file exists, the first number on the last >>> line, divided by 1000, is the port number. >> That's somewhere between fragile and outright wrong. > Please explain why my idea is not an improvement. Because it's assuming that those numbers are sysv shmem keys derived in a particular way. We have platforms on which that is wrong, Windows being the most obvious example. Reading the shmem key assignment code closely will suggest to you other ways that this could fail. Not to mention that people propose getting rid of sysv shmem approximately every other month, and perhaps someday that will actually happen; whereupon whatever might get logged in postmaster.pid could be something completely different. If you really think that pulling a port number out of the pid file is an improvement over what pg_ctl does now, then you need to start by storing the port number, as such, in the pid file. Not something that might or might not be related to the port number. But what we have to discuss before that is whether we mind having a significant postmaster version dependency in pg_ctl. regards, tom lane
Re: [HACKERS] pg_ctl and port number detection
Tom Lane wrote: > Bruce Momjian writes: > > pg_ctl.c::test_postmaster_connection() has some fragile code that tries > > to detect the server port number by looking in the pg_ctl -o string, > > It may be fragile, but it works; or at least I've not heard complaints > about it lately. True. > > I think a simpler solution would be to look in postmaster.pid: > > pg_ctl already knows the data directory. If the file is missing, the > > server is not running. If the file exists, the first number on the last > > line, divided by 1000, is the port number. > > That's somewhere between fragile and outright wrong. Please explain why my idea is not an improvement. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] pg_ctl and port number detection
Bruce Momjian writes: > pg_ctl.c::test_postmaster_connection() has some fragile code that tries > to detect the server port number by looking in the pg_ctl -o string, It may be fragile, but it works; or at least I've not heard complaints about it lately. > I think a simpler solution would be to look in postmaster.pid: > pg_ctl already knows the data directory. If the file is missing, the > server is not running. If the file exists, the first number on the last > line, divided by 1000, is the port number. That's somewhere between fragile and outright wrong. regards, tom lane
Re: [HACKERS] unlogged tables
On Sat, Dec 18, 2010 at 12:27 PM, Kevin Grittner wrote: > Robert Haas wrote: > >> If there's any third-party code out there that is checking >> rd_istemp, it likely also needs to be revised to check whether >> WAL-logging is needed, not whether the relation is temp. The way >> I've coded it, such code will fail to compile, and can be very >> easily fixed by substituting a call to RelationNeedsWAL() or >> RelationUsesLocalBuffers() or RelationUsesTempNamespace(), >> depending on which property the caller actually cares about. > > Hmm... This broke the SSI patch, which was using rd_istemp to omit > conflict checking where it was set to true. The property I care > about is whether tuples in one backend can be read by a transaction > in a different backend, which I assumed would not be true for > temporary tables. Which of the above would be appropriate for that > use? RelationUsesLocalBuffers(). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
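To make the distinction concrete, here is a toy Python model of the three properties Robert lists (the persistence table and Python names are illustrative assumptions, not the real C macros from rel.h): unlogged tables skip WAL but live in shared buffers, so their tuples are visible to other backends, while only temp tables use backend-local buffers.

```python
# Toy model (not the real rel.h macros) of the three per-relation
# properties, keyed by persistence:
#   (needs WAL, uses backend-local buffers, lives in a temp namespace)
PROPS = {
    "permanent": (True,  False, False),
    "unlogged":  (False, False, False),  # no WAL, but shared buffers
    "temp":      (False, True,  True),
}

def relation_needs_wal(p):
    return PROPS[p][0]

def relation_uses_local_buffers(p):
    return PROPS[p][1]

# An unlogged table's tuples are visible to other backends, so SSI
# conflict checking must not be skipped for it:
print(relation_uses_local_buffers("unlogged"))  # False
```

This is why cross-backend visibility tracks RelationUsesLocalBuffers() rather than "is it WAL-logged" or "is it temp".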
Re: [HACKERS] exceptions not present in plpy with Python 3
On 18/12/10 18:56, Jan Urbański wrote: > I'm not a Python 3 expert, but I nicked some code from the Internet and > came up with this patch (passes regression tests on both Python 2 and 3). I tried to be too cute with the regression test, it fails with Python 2.3.7 (the latest 2.3 release). Attached is a fixed test that should work in Pythons 2.3+ I don't want to open that can of worms just yet, but do we have any formal policy on the oldest Python version we support? I tested on 2.3 just because looking at http://python.org/download/releases/ I saw that 2.2 was last released in 2003, so I thought 2.3 was as far as I'd go... Cheers, Jan *** src/pl/plpython/expected/plpython_test.out --- src/pl/plpython/expected/plpython_test.out 2010-12-18 19:29:48.0 +0100 *** *** 35,40 --- 35,53 willem doe => {fname: willem, lname: doe, userid: 3, username: w_doe} (3 rows) + -- check module contents + CREATE FUNCTION module_contents() RETURNS text AS + $$ + contents = list(filter(lambda x: not x.startswith("__"), dir(plpy))) + contents.sort() + return ", ".join(contents) + $$ LANGUAGE plpythonu; + select module_contents(); + module_contents + --- + Error, Fatal, SPIError, debug, error, execute, fatal, info, log, notice, prepare, warning + (1 row) + CREATE FUNCTION elog_test() RETURNS void AS $$ plpy.debug('debug') *** src/pl/plpython/plpython.c --- src/pl/plpython/plpython.c 2010-12-18 19:21:35.0 +0100 *** *** 3204,3209 --- 3204,3238 /* + * Exception initialization + */ + + static void + PLy_initialize_exceptions(PyObject *plpy) + { + #if PY_MAJOR_VERSION < 3 + PyObject *plpy_dict = PyModule_GetDict(plpy); + #endif + + PLy_exc_error = PyErr_NewException("plpy.Error", NULL, NULL); + PLy_exc_fatal = PyErr_NewException("plpy.Fatal", NULL, NULL); + PLy_exc_spi_error = PyErr_NewException("plpy.SPIError", NULL, NULL); + + #if PY_MAJOR_VERSION >= 3 + Py_INCREF(PLy_exc_error); + PyModule_AddObject(plpy, "Error", PLy_exc_error); + Py_INCREF(PLy_exc_fatal); + PyModule_AddObject(plpy, 
"Fatal", PLy_exc_fatal); + Py_INCREF(PLy_exc_spi_error); + PyModule_AddObject(plpy, "SPIError", PLy_exc_spi_error); + #else + PyDict_SetItemString(plpy_dict, "Error", PLy_exc_error); + PyDict_SetItemString(plpy_dict, "Fatal", PLy_exc_fatal); + PyDict_SetItemString(plpy_dict, "SPIError", PLy_exc_spi_error); + #endif + } + + /* * language handler and interpreter initialization */ *** *** 3211,3217 static PyMODINIT_FUNC PyInit_plpy(void) { ! return PyModule_Create(&PLy_module); } #endif --- 3240,3254 static PyMODINIT_FUNC PyInit_plpy(void) { ! PyObject *m; ! ! m = PyModule_Create(&PLy_module); ! if (m == NULL) ! return NULL; ! ! PLy_initialize_exceptions(m); ! ! return m; } #endif *** *** 3290,3297 PyObject *main_mod, *main_dict, *plpy_mod; ! PyObject *plpy, ! *plpy_dict; /* * initialize plpy module --- 3327, PyObject *main_mod, *main_dict, *plpy_mod; ! PyObject *plpy; /* * initialize plpy module *** *** 3305,3322 plpy = PyModule_Create(&PLy_module); #else plpy = Py_InitModule("plpy", PLy_methods); #endif - plpy_dict = PyModule_GetDict(plpy); /* PyDict_SetItemString(plpy, "PlanType", (PyObject *) &PLy_PlanType); */ - PLy_exc_error = PyErr_NewException("plpy.Error", NULL, NULL); - PLy_exc_fatal = PyErr_NewException("plpy.Fatal", NULL, NULL); - PLy_exc_spi_error = PyErr_NewException("plpy.SPIError", NULL, NULL); - PyDict_SetItemString(plpy_dict, "Error", PLy_exc_error); - PyDict_SetItemString(plpy_dict, "Fatal", PLy_exc_fatal); - PyDict_SetItemString(plpy_dict, "SPIError", PLy_exc_spi_error); - /* * initialize main module, and add plpy */ --- 3341,3352 plpy = PyModule_Create(&PLy_module); #else plpy = Py_InitModule("plpy", PLy_methods); + /* for Python 3 we initialized the exceptions in PyInit_plpy */ + PLy_initialize_exceptions(plpy); #endif /* PyDict_SetItemString(plpy, "PlanType", (PyObject *) &PLy_PlanType); */ /* * initialize main module, and add plpy */ *** src/pl/plpython/sql/plpython_test.sql --- src/pl/plpython/sql/plpython_test.sql 2010-12-18 19:27:58.0 +0100 
*** *** 25,30 --- 25,39 select argument_test_one(users, fname, lname) from users where lname = 'doe' order by 1; + -- check module contents + CREATE FUNCTION module_contents() RETURNS text AS + $$ + contents = list(filter(lambda x: not x.startswith("__"), dir(plpy))) + contents.sort() + return ", ".join(contents) + $$ LANGUAGE plpythonu; + + select module_contents(); CREATE FUNCTION elog_test() RETURNS void AS $$ -- Sent via
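As an aside on the portability fix in the test above: on Python 3, filter() returns an iterator (which has no .sort()), and sorted() only appeared in Python 2.4 — hence the list(filter(...)) plus .sort() combination. A stand-alone illustration, using a SimpleNamespace as a stand-in for the plpy module:

```python
import types

# Stand-in for the plpy module, with the attributes the regression
# test cares about (this is a mock, not the real PL/Python module).
plpy = types.SimpleNamespace(Error=Exception, Fatal=Exception,
                             SPIError=Exception, execute=lambda q: None)

# list() is needed because Python 3's filter() returns an iterator;
# .sort() is used instead of sorted(), which predates Python 2.4.
contents = list(filter(lambda x: not x.startswith("__"), dir(plpy)))
contents.sort()
print(", ".join(contents))  # Error, Fatal, SPIError, execute
```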
[HACKERS] pg_ctl and port number detection
pg_ctl.c::test_postmaster_connection() has some fragile code that tries to detect the server port number by looking in the pg_ctl -o string, postgresql.conf, the PGPORT environment variable, and finally using the default port number. I think a simpler solution would be to look in postmaster.pid:

10231
/u/pgsql/data
5432001 45481984

pg_ctl already knows the data directory. If the file is missing, the server is not running. If the file exists, the first number on the last line, divided by 1000, is the port number. We can then use this port number for libpq to check for connectivity. Comments? -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
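A minimal sketch of the proposed parsing (a hypothetical helper, not pg_ctl code) — note it bakes in the assumption that the number on the last line is a sysv shmem key of the form port * 1000 + n, which is exactly the platform dependency Tom objects to elsewhere in this thread:

```python
# Hypothetical sketch: take the first number on the last line of
# postmaster.pid and divide by 1000 to recover the port.  This only
# works where the shmem key is derived as port * 1000 + n, which is
# not true on every platform (e.g. Windows).
def port_from_pidfile(contents):
    last_line = contents.strip().splitlines()[-1]
    return int(last_line.split()[0]) // 1000

example = "10231\n/u/pgsql/data\n5432001 45481984\n"
print(port_from_pidfile(example))  # 5432
```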
[HACKERS] exceptions not present in plpy with Python 3
Hi, there seems to be a problem in the way we add exceptions to the plpy module in PL/Python compiled with Python 3k. Try this: DO $$ plpy.SPIError $$ language plpython3u; I'm not a Python 3 expert, but I nicked some code from the Internet and came up with this patch (passes regression tests on both Python 2 and 3). The funny thing is that it never blew up earlier, because we only used plpy.SPIError in except: blocks that weren't even executed, as errors in plpy.execute just terminate the function. With my changes they turn into catchable exceptions, and so accessing plpy.SPIError in Python 3 becomes essential. BTW: do we have any buildfarm animal that uses Python 3? Cheers, Jan *** src/pl/plpython/expected/plpython_test.out --- src/pl/plpython/expected/plpython_test.out 2010-12-18 18:46:17.0 +0100 *** *** 35,40 --- 35,53 willem doe => {fname: willem, lname: doe, userid: 3, username: w_doe} (3 rows) + -- check module contents + CREATE FUNCTION module_contents() RETURNS text AS + $$ + contents = dir(plpy) + contents.sort() + return ", ".join(contents) + $$ LANGUAGE plpythonu; + select module_contents(); + module_contents + --- + Error, Fatal, SPIError, __doc__, __name__, __package__, debug, error, execute, fatal, info, log, notice, prepare, warning + (1 row) + CREATE FUNCTION elog_test() RETURNS void AS $$ plpy.debug('debug') *** src/pl/plpython/plpython.c --- src/pl/plpython/plpython.c 2010-12-18 18:39:00.0 +0100 *** *** 3204,3209 --- 3204,3238 /* + * Exception initialization + */ + + static void + PLy_initialize_exceptions(PyObject *plpy) + { + #if PY_MAJOR_VERSION < 3 + PyObject *plpy_dict = PyModule_GetDict(plpy); + #endif + + PLy_exc_error = PyErr_NewException("plpy.Error", NULL, NULL); + PLy_exc_fatal = PyErr_NewException("plpy.Fatal", NULL, NULL); + PLy_exc_spi_error = PyErr_NewException("plpy.SPIError", NULL, NULL); + + #if PY_MAJOR_VERSION >= 3 + Py_INCREF(PLy_exc_error); + PyModule_AddObject(plpy, "Error", PLy_exc_error); + Py_INCREF(PLy_exc_fatal); + 
PyModule_AddObject(plpy, "Fatal", PLy_exc_fatal); + Py_INCREF(PLy_exc_spi_error); + PyModule_AddObject(plpy, "SPIError", PLy_exc_spi_error); + #else + PyDict_SetItemString(plpy_dict, "Error", PLy_exc_error); + PyDict_SetItemString(plpy_dict, "Fatal", PLy_exc_fatal); + PyDict_SetItemString(plpy_dict, "SPIError", PLy_exc_spi_error); + #endif + } + + /* * language handler and interpreter initialization */ *** *** 3211,3217 static PyMODINIT_FUNC PyInit_plpy(void) { ! return PyModule_Create(&PLy_module); } #endif --- 3240,3254 static PyMODINIT_FUNC PyInit_plpy(void) { ! PyObject *m; ! ! m = PyModule_Create(&PLy_module); ! if (m == NULL) ! return NULL; ! ! PLy_initialize_exceptions(m); ! ! return m; } #endif *** *** 3290,3297 PyObject *main_mod, *main_dict, *plpy_mod; ! PyObject *plpy, ! *plpy_dict; /* * initialize plpy module --- 3327, PyObject *main_mod, *main_dict, *plpy_mod; ! PyObject *plpy; /* * initialize plpy module *** *** 3305,3322 plpy = PyModule_Create(&PLy_module); #else plpy = Py_InitModule("plpy", PLy_methods); #endif - plpy_dict = PyModule_GetDict(plpy); /* PyDict_SetItemString(plpy, "PlanType", (PyObject *) &PLy_PlanType); */ - PLy_exc_error = PyErr_NewException("plpy.Error", NULL, NULL); - PLy_exc_fatal = PyErr_NewException("plpy.Fatal", NULL, NULL); - PLy_exc_spi_error = PyErr_NewException("plpy.SPIError", NULL, NULL); - PyDict_SetItemString(plpy_dict, "Error", PLy_exc_error); - PyDict_SetItemString(plpy_dict, "Fatal", PLy_exc_fatal); - PyDict_SetItemString(plpy_dict, "SPIError", PLy_exc_spi_error); - /* * initialize main module, and add plpy */ --- 3341,3352 plpy = PyModule_Create(&PLy_module); #else plpy = Py_InitModule("plpy", PLy_methods); + /* for Python 3 we initialized the exceptions in PyInit_plpy */ + PLy_initialize_exceptions(plpy); #endif /* PyDict_SetItemString(plpy, "PlanType", (PyObject *) &PLy_PlanType); */ /* * initialize main module, and add plpy */ *** src/pl/plpython/sql/plpython_test.sql --- src/pl/plpython/sql/plpython_test.sql 
2010-12-18 18:44:19.0 +0100 *** *** 25,30 --- 25,39 select argument_test_one(users, fname, lname) from users where lname = 'doe' order by 1; + -- check module contents + CREATE FUNCTION module_contents() RETURNS text AS + $$ + contents = dir(plpy) + contents.sort() + return ", ".join(contents) + $$ LANGUAGE plpythonu; + + select module_contents(); CREATE FUNCTION
Re: [HACKERS] Extensions, patch v19 (encoding brainfart fix)
Itagaki Takahiro writes: > You probably compared wrong versions of source trees. The patch contains > many diffs not related to EXTENSION. It cannot be applied cleanly. Ouch, sorry about that, will revise soon'ish. Meanwhile the git repo is still available here: http://git.postgresql.org/gitweb?p=postgresql-extension.git;a=shortlog;h=refs/heads/extension > BTW, only earthdistance.sql.in has @extsch...@. > Didn't you forget to remove it? No. It so happens that earthdistance depends on cube, and that we agreed not to implement dependencies between extensions for the first patch. The problem here is that ALTER EXTENSION earthdistance SET SCHEMA foo; would relocate cube objects too, because there's nothing to stop the recursive walking at the right spot. So earthdistance.control states that earthdistance is not relocatable for now. And a non-relocatable extension will use @extschema@ so that users still get a say in where to install it… > I've not read the patch well yet, but I'd like to make sure of > two specs in the patch: > > #1. I found the patch specifies "version" in each control file. We will > need to maintain them manually, but I assume it was a conclusion in the > discussion for v18 patch. So, no more changes are required here. Exactly, the consensus seems to be that we *want* to maintain separate version numbers for each contrib. > #2. The patch replaces "\i XXX.sql" to "CREATE EXTENSION XXX". They are a > bit different because CREATE EXTENSION uses the installed sql files instead > of the source directory. But I think this is the correct behavior. We should > have used only installed files because they are used in "make *installcheck*". Yeah. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
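For context, a control file along the lines Dimitri describes might look roughly like this (a hypothetical sketch — the key names follow the discussion in this thread, not necessarily the v19 patch verbatim):

```ini
# earthdistance.control (sketch)
comment = 'great-circle distances on the surface of the Earth'
version = '1.0'
# not relocatable, so @extschema@ is substituted at install time
relocatable = false
```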
Re: [HACKERS] unlogged tables
Robert Haas wrote: > If there's any third-party code out there that is checking > rd_istemp, it likely also needs to be revised to check whether > WAL-logging is needed, not whether the relation is temp. The way > I've coded it, such code will fail to compile, and can be very > easily fixed by substituting a call to RelationNeedsWAL() or > RelationUsesLocalBuffers() or RelationUsesTempNamespace(), > depending on which property the caller actually cares about. Hmm... This broke the SSI patch, which was using rd_istemp to omit conflict checking where it was set to true. The property I care about is whether tuples in one backend can be read by a transaction in a different backend, which I assumed would not be true for temporary tables. Which of the above would be appropriate for that use? -Kevin
Re: [HACKERS] proposal : cross-column stats
On 18.12.2010 16:59, Florian Pflug wrote: > On Dec18, 2010, at 08:10 , t...@fuzzy.cz wrote: > >>> On Dec17, 2010, at 23:12 , Tomas Vondra wrote: Well, not really - I haven't done any experiments with it. For two columns selectivity equation is (dist(A) * sel(A) + dist(B) * sel(B)) / (2 * dist(A,B)) where A and B are columns, dist(X) is number of distinct values in column X and sel(X) is selectivity of column X. >>> >>> Huh? This is the selectivity estimate for "A = x AND B = y"? Surely, >>> if A and B are independent, the formula must reduce to sel(A) * sel(B), >>> and I cannot see how that'd work with the formula above. >> >> Yes, it's a selectivity estimate for P(A=a and B=b). It's based on >> conditional probability, as >> >> P(A=a and B=b) = P(A=a|B=b)*P(B=b) = P(B=b|A=a)*P(A=a) >> >> and "uniform correlation" assumption so that it's possible to replace the >> conditional probabilities with constants. And those constants are then >> estimated as dist(A)/dist(A,B) or dist(B)/dist(A,B). > > Ok, I think I understand this now. The selectivity equation actually > *does* reduce to sel(A) * sel(B), *if* we pick a very simple estimate > for sel(A). > > Take the clause "A = x AND B = y" for example. Without knowing anything > about x and y, reasonable guesses for sel(A=x) and sel(B=y) are > > sel(A=x) = 1 / dist(A) > sel(B=y) = 1 / dist(B). > > This is also what we do currently, according to var_eq_non_const() in > src/backend/utils/adt/selfuncs.c, if we don't have any additional > knowledge about x (Actually, we also factor the probability of A being > NULL into this). > > With these estimates, your formula becomes > > sel(A=x,B=y) = 1 / dist(A,B). > > and if A and B are uncorrelated, dist(A,B) ~= dist(A) * dist(B), thus > > sel(A=x,B=y) = sel(A=x) * sel(B=y). > > If, however, y is a constant, then we use the MKVs to estimate sel(B=y) > (var_eq_const() in src/backend/utils/adt/selfuncs.c).
> If
>
> sel(B=y) ~= 0,
>
> we'd currently also conclude that
>
> sel(A=x,B=y) ~= 0.
>
> With the "uniform correlation" approach, we'd instead estimate
>
> sel(A=x,B=y) ~= sel(A=x) / dist(B)
>
> assuming that dist(A,B) ~= dist(A)*dist(B), meaning A,B are uncorrelated. If dist(B) is small, this estimate is much worse than what we'd currently get, since we've effectively ignored the information that the restriction B=y alone guarantees that only very few rows will match.

Well, I guess you're right. Let's see two examples - uncorrelated and highly correlated (and then a few comments on what I think you're missing).

uncorrelated

Let there be 10 distinct values in A, 100 distinct values in B, and there are all possible combinations (10x100 = 1000). All the values (and combinations) are uniformly distributed. The current estimate is correct, i.e.

sel(A=a) = 1/10 = 1/dist(A)
sel(B=b) = 1/100 = 1/dist(B)
sel(A=a, B=b) = sel(A) * sel(B) = 1/1000 /* due to AVI */

the proposed estimate is

sel(A=a, B=b) = (dist(A)*sel(A=a) + dist(B)*sel(B=b)) / (2*dist(A,B))
              = (10 * 1/10 + 100 * 1/100) / 2000 = 1/1000

so actually it produces the same estimate, but that's due to the uniform distribution assumption. Let's say the selectivities for A and B are very different - A is 10x less common, B is 10x more common (let's say we know this). Then the current estimate is still correct, i.e. it gives

sel(A=a) = 1/100
sel(B=b) = 1/10
=> sel(A=a,B=b) = 1/1000

but the new estimate is

(10 * 1/100 + 100 * 1/10) / 2000 = (101/10) / 2000 ~ 1/200

So yes, in case of uncorrelated data this overestimates the result.

highly correlated

Again, let dist(A)=10, dist(B)=100. But this time let B=>A, i.e. for each value in B there is a unique value in A. Say A=mod(B,10) or something like that. So now there is dist(A,B)=100.
Assuming uniform distribution, the correct estimate is

sel(A=a,B=b) = sel(B=b) = 1/100

and the current estimate is

sel(A=a,B=b) = sel(A=a) * sel(B=b) = 1/10 * 1/100 = 1/1000

The proposed estimate is

(10 * 1/10 + 100 * 1/100) / 200 = 2/200 = 1/100

which is correct. Without the uniform distribution (again, sel(A)=1/100, sel(B)=1/10), we currently get (just as before)

sel(A=a,B=b) = 1/10 * 1/100 = 1/1000

which is 100x less than the correct value (sel(B)=1/10). The new estimate yields

(10*1/100 + 100*1/10) / 200 = (1010/100) / 200 ~ 1/20

which is not as accurate as with uniform distribution, but it's considerably more accurate than the current estimate (1/1000).

comments

I really don't think this is a perfect solution giving 100% accurate results in all cases. But the examples above illustrate that it gives much better estimates in case of correlated data. It seems to me you're missing one very important thing - this was not meant as a new default way to do estimates. It was meant as an option when the user (DBA, developer, ...) realizes the current solution gives reall
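The arithmetic in these examples is easy to check mechanically. A short Python sketch (the helper name `proposed` is mine, not from any patch) recomputing the four estimates under the proposed formula:

```python
# sel(A=a,B=b) = (dist(A)*sel(A) + dist(B)*sel(B)) / (2*dist(A,B))
def proposed(dist_a, sel_a, dist_b, sel_b, dist_ab):
    return (dist_a * sel_a + dist_b * sel_b) / (2.0 * dist_ab)

# uncorrelated, uniform: same as the current estimate, 1/1000
print(proposed(10, 0.1, 100, 0.01, 1000))   # ~0.001
# uncorrelated, skewed selectivities: overestimates, ~1/200
print(proposed(10, 0.01, 100, 0.1, 1000))   # ~0.00505
# correlated (dist(A,B)=100), uniform: 1/100, the true value
print(proposed(10, 0.1, 100, 0.01, 100))    # ~0.01
# correlated, skewed: ~1/20 vs the current estimate's 1/1000
print(proposed(10, 0.01, 100, 0.1, 100))    # ~0.0505
```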
Re: [HACKERS] Crash on attempt to connect to nonstarted server
bruce wrote: > Magnus Hagander wrote: > > I get a crash on win32 when connecting to a server that's not started. > > In fe-connect.c, we have: > > > > display_host_addr = (conn->pghostaddr == NULL) && > > (strcmp(conn->pghost, host_addr) != 0); > > > > In my case, conn->pghost is NULL at this point, as is > > conn->pghostaddr. Thus, it crashes in strcmp(). > > I have researched this with Magnus, and was able to reproduce the > failure. It happens only on Win32 because that is missing unix-domain > sockets so "" maps to localhost, which is an IP address. I have applied > the attached patch. The new output is: > > $ psql test > psql: could not connect to server: Connection refused > Is the server running on host "???" and accepting > TCP/IP connections on port 5432? > > Note the "???". This happens because the mapping of "" to localhost > happens below the libpq library variable level.

I made a mistake in the fix from yesterday. I added code to test for NULL, but in fact what I should have done was to allow a NULL hostname to trigger the printing of the IP address, because it will never match an IP number. The attached, applied patch fixes this, and uses the same logic as in connectDBStart() to print 'localhost' for connection failures. I also added code to use DefaultHost consistently in that file. With the patch, the new Win32 output for a down server is:

$ psql test
psql: could not connect to server: Connection refused
	Is the server running on host "localhost" (127.0.0.1) and accepting
	TCP/IP connections on port 5432?

$ psql -h "" test
psql: could not connect to server: Connection refused
	Is the server running on host "localhost" (127.0.0.1) and accepting
	TCP/IP connections on port 5432?

I am amazed we were printing "???" all these years on this very popular platform. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true.
+

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 8d9400b..bf8beb7 100644
*** /tmp/rgI5Ne_fe-connect.c	Sat Dec 18 11:21:07 2010
--- src/interfaces/libpq/fe-connect.c	Sat Dec 18 11:10:09 2010
*** connectFailureMessage(PGconn *conn, int
*** 1030,1049 ****
  	else
  		strcpy(host_addr, "???");
  	display_host_addr = (conn->pghostaddr == NULL) &&
! 		(conn->pghost != NULL) &&
! 		(strcmp(conn->pghost, host_addr) != 0);
  	appendPQExpBuffer(&conn->errorMessage,
  		libpq_gettext("could not connect to server: %s\n"
  			"\tIs the server running on host \"%s\"%s%s%s and accepting\n"
  			"\tTCP/IP connections on port %s?\n"),
  		SOCK_STRERROR(errorno, sebuf, sizeof(sebuf)),
! 		conn->pghostaddr ? conn->pghostaddr
! 		: (conn->pghost ? conn->pghost
! 		: "???"),
  		/* display the IP address only if not already output */
  		display_host_addr ? " (" : "",
  		display_host_addr ? host_addr : "",
--- 1030,1054 ----
  	else
  		strcpy(host_addr, "???");
+ 	/*
+ 	 * If the user did not supply an IP address using 'hostaddr', and
+ 	 * 'host' was missing or does not match our lookup, display the
+ 	 * looked-up IP address.
+ 	 */
  	display_host_addr = (conn->pghostaddr == NULL) &&
! 		((conn->pghost == NULL) ||
! 		(strcmp(conn->pghost, host_addr) != 0));
  	appendPQExpBuffer(&conn->errorMessage,
  		libpq_gettext("could not connect to server: %s\n"
  			"\tIs the server running on host \"%s\"%s%s%s and accepting\n"
  			"\tTCP/IP connections on port %s?\n"),
  		SOCK_STRERROR(errorno, sebuf, sizeof(sebuf)),
! 		(conn->pghostaddr && conn->pghostaddr[0] != '\0') ? conn->pghostaddr
! 		: (conn->pghost && conn->pghost[0] != '\0') ? conn->pghost
! 		: DefaultHost,
  		/* display the IP address only if not already output */
  		display_host_addr ? " (" : "",
  		display_host_addr ? host_addr : "",
*** connectDBStart(PGconn *conn)
*** 1304,1310 ****
  	UNIXSOCK_PATH(portstr, portnum, conn->pgunixsocket);
  #else
  	/* Without Unix sockets, default to localhost instead */
! 	node = "localhost";
  	hint.ai_family = AF_UNSPEC;
  #endif	/* HAVE_UNIX_SOCKETS */
  	}
--- 1309,1315 ----
  	UNIXSOCK_PATH(portstr, portnum, conn->pgunixsocket);
  #else
  	/* Without Unix sockets, default to localhost instead */
! 	node = DefaultHost;
  	hint.ai_family = AF_UNSPEC;
  #endif	/* HAVE_UNIX_SOCKETS */
  	}
*** ldapServiceLookup(const char *purl, PQco
*** 3388,3394 ****
  	/* hostname */
  	hostname = url + strlen(LDAP_URL);
  	if (*hostname == '/')	/* no hostname? */
! 		hostname = "localhost";	/* the default */
  	/* dn, "distinguished name" */
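In Python pseudocode (hypothetical names, mirroring rather than quoting the C), the corrected failure-message logic amounts to:

```python
DEFAULT_HOST = "localhost"  # stand-in for libpq's DefaultHost

def failure_host(pghost, pghostaddr, host_addr):
    """Return the host string shown in the connection-failure message."""
    # Show the resolved IP when no hostaddr was given and the host name
    # is either absent or differs from the lookup result.
    display_ip = pghostaddr is None and (pghost is None or pghost != host_addr)
    # NULL or empty strings both fall through to the default.
    shown = pghostaddr or pghost or DEFAULT_HOST
    return "%s (%s)" % (shown, host_addr) if display_ip else shown

print(failure_host(None, None, "127.0.0.1"))  # localhost (127.0.0.1)
print(failure_host("", None, "127.0.0.1"))    # localhost (127.0.0.1)
```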
Re: [HACKERS] proposal : cross-column stats
On Dec18, 2010, at 08:10 , t...@fuzzy.cz wrote: >> On Dec17, 2010, at 23:12 , Tomas Vondra wrote: >>> Well, not really - I haven't done any experiments with it. For two >>> columns selectivity equation is >>> >>> (dist(A) * sel(A) + dist(B) * sel(B)) / (2 * dist(A,B)) >>> >>> where A and B are columns, dist(X) is number of distinct values in >>> column X and sel(X) is selectivity of column X. >> >> Huh? This is the selectivity estimate for "A = x AND B = y"? Surely, >> if A and B are independent, the formula must reduce to sel(A) * sel(B), >> and I cannot see how that'd work with the formula above. > > Yes, it's a selectivity estimate for P(A=a and B=b). It's based on > conditional probability, as > > P(A=a and B=b) = P(A=a|B=b)*P(B=b) = P(B=b|A=a)*P(A=a) > > and "uniform correlation" assumption so that it's possible to replace the > conditional probabilities with constants. And those constants are then > estimated as dist(A)/dist(A,B) or dist(B)/dist(A,B). Ok, I think I understand this now. The selectivity equation actually *does* reduce to sel(A) * sel(B), *if* we pick a very simple estimate for sel(A). Take the clause "A = x AND B = y" for example. Without knowing anything about x and y, reasonable guesses for sel(A=x) and sel(B=y) are sel(A=x) = 1 / dist(A) sel(B=y) = 1 / dist(B). This is also what we do currently, according to var_eq_non_const() in src/backend/utils/adt/selfuncs.c, if we don't have any additional knowledge about x (Actually, we also factor the probability of A being NULL into this). With these estimates, your formula becomes sel(A=x,B=y) = 1 / dist(A,B). and if A and B are uncorrelated, dist(A,B) ~= dist(A) * dist(B), thus sel(A=x,B=y) = sel(A=x) * sel(B=y). If, however, y is a constant, then we use the MKVs to estimate sel(B=y) (var_eq_const() in src/backend/utils/adt/selfuncs.c). If sel(B=y) ~= 0, we'd currently also conclude that sel(A=x,B=y) ~= 0. 
With the "uniform correlation" approach, we'd instead estimate sel(A=x,B=y) ~= sel(A=x) / dist(B) assuming that dist(A,B) ~= dist(A)*dist(B), meaning A,B are uncorrelated. If dist(B) is small, this estimate is much worse than what we'd currently get, since we've effectively ignored the information that the restriction B=y alone guarantees that only very few rows will match. best regards, Florian Pflug
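Florian's reduction checks out numerically. A tiny verification (arbitrary dist values and a float tolerance; the helper name is mine): with sel(A=x) = 1/dist(A) and sel(B=y) = 1/dist(B), the estimator collapses to 1/dist(A,B), and under independence, where dist(A,B) = dist(A)*dist(B), that equals sel(A=x)*sel(B=y).

```python
# Proposed estimator:
#   (dist(A)*sel(A) + dist(B)*sel(B)) / (2*dist(A,B))
def proposed(da, sa, db, sb, dab):
    return (da * sa + db * sb) / (2.0 * dab)

da, db = 7, 13
dab = da * db  # independence assumption: dist(A,B) = dist(A)*dist(B)
est = proposed(da, 1.0 / da, db, 1.0 / db, dab)
assert abs(est - 1.0 / dab) < 1e-12              # collapses to 1/dist(A,B)
assert abs(est - (1.0 / da) * (1.0 / db)) < 1e-12  # = sel(A)*sel(B)
```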