[HACKERS] ANALYZE to be ignored by VACUUM
When there are a heavily updated table and a large table at the same time, ANALYZE against the large table disturbs other autovacuums and HOT updates. In my workload, ANALYZE takes long time (1min at statistics_target = 10, and 5min at 100), but the updated table needs to be vacuumed every 30 seconds because seqscans run on the table repeatedly. ANALYZE is a transaction. As long as long transactions are alive, VACUUM cannot remove dead tuples deleted after those transaction began. HOT also cannot work under long transactions. We will be happy if VACUUM can get along with long transactions, but it requires some kinds of complex managements of vacuum horizon. I have no idea for it... So, I'm targeting only ANALZYE for now by changing ANALYZE to be ignored by VACUUM. It is just same as VACUUM that has already been ignored by other VACUUMs since version 8.2. My proposal is splitting ANALYZEs with use_own_xacts (by VACUUM ANALYZE, autovacuum or database-wide analyze) to two transactions: T1: acquire_sample_rows() T2: compute_stats() and update_attstats() and set PROC_IN_VACUUM during T1. T1 takes long time because read pages to sample rows. T2 is relatively short because stats are calculated in memory, without i/o. T2 is needed for consistent snapshot because user-defined functions in expression indexes might use it. Is it an acceptable approach? Or am I missing something? Regards, --- ITAGAKI Takahiro NTT Open Source Software Center ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Reworking WAL locking
On Thu, 2008-02-14 at 16:50 -0500, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: On Thu, 2008-02-14 at 13:52 -0500, Tom Lane wrote: [ still staring at the code ... ] Something that might be interesting though is to try to move some of the buffer control logic overhead out of WALInsertLock's domain and into WALWriteLock's domain. I like this one because its another example that contention is not static, seems like its more often cyclical. The WALInsertLock is held longer than normal when we cross page boundaries, creating a pulsing effect thru the lock queue. Yeah, significantly longer than normal if you assume the normal case is to write a WAL record that's much less than a full page. But has anyone seen direct evidence that that's an important effect? I was just proposing this on speculation --- if there's already evidence that that behavior is a problem, it'd be interesting to see it. A tracepoint in AdvanceXLInsertBuffer() would help there. I would expect lock time to vary with number of cache lines touched (approximated by record size) as well as whether the record crossed a page boundary. Very large WAL records are more likely to cross boundaries, so the effect will show itself most when we write small WAL records. BTW, we'd probably need to do something like this even if we then go forward with your original idea. If we're going to allow multiple backends to be inserting WAL records into the-same-or-different WAL buffers concurrently, we can't have that same code responsible for initializing empty buffers. You've sold me already! I will return to the other part of the idea, but just too busy now to think and reply in full. I agree with the issues you raised earlier. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] wishlist for 8.4
Decibel! wrote: On Feb 14, 2008, at 10:06 AM, Tom Lane wrote: =?UTF-8?B?UmFwaGHDq2wgSmFjcXVvdA==?= [EMAIL PROTECTED] writes: so, I propose the use of NEW[variable_containing_the_column_name] (which can obviously be extended to any tuples) to allow such access. what do you experts think ? Zero chance. plplgsql is a strongly typed language, and a construct like that couldn't have any known-in-advance data type. Would it be reasonable to teach EXECUTE about NEW and OLD? That should allow the OP to do what he's looking for... You could have a function get_attribute_as_text(NEW, 'id') or even get_attribute_quoted(NEW, 'id') It would be nice to have a more dynamic language built-in. I'm not aware of any BSD-licensed dynamic languages though. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[HACKERS] subquery in limit
(just as an example): select * from test order by a limit (select count(*)*0.9 from test); is not doable in postgresql. Someone recently asked on IRC about, SELECT TOP 90 PERCENT type of query in m$sql. Any ideas how should this be approach in psql. I ask here, because you guys probably can tell why the first query won't work (subquery is not allowed as limit's argument, why?). cheers. -- Grzegorz Jaskiewicz [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] wishlist for 8.4
On Thu, Feb 14, 2008 at 11:06:42AM -0500, Tom Lane wrote: =?UTF-8?B?UmFwaGHDq2wgSmFjcXVvdA==?= [EMAIL PROTECTED] writes: so, I propose the use of NEW[variable_containing_the_column_name] (which can obviously be extended to any tuples) to allow such access. what do you experts think ? Zero chance. plplgsql is a strongly typed language, and a construct like that couldn't have any known-in-advance data type. If variable_containing_the_column_name was a string then I agree, if columns were first class objects then it wouldn't be a problem. Introducing sub-types would make this all nice and tidy, but unfortunately sub-types are a long way away from what PG currently supports and I don't know the SQL spec well enough to know if there would be any serious incompatibilities with them--they are pretty expressive so I'd doubt there would be any fundamental incompatibility. If you want an example, Java has sub-types and 1.5+ has parametric polymorphism making it possible to create an interface that named the column you were interested in and then to write a function like: interface ColumnRow,ColType { public ColType get(Row r); } public Row,ColType ColType foo(Row row, ColumnRow,ColType col) { return col.get(row); } In PG you'd want the sub-types to be generated automatically, the above example is somewhat unreadable. The only real purpose of the above was to show how a strong type system can express what the OP wanted. The anyelement and supporting polymorphism would all be subsumed into the features provided by sub-typing and would all be done in a much tidier way. It's a very big change for PG though! Sam ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] wishlist for 8.4
Raphaƫl Jacquot wrote: I recently found myself trying to build a trigger to modify some fields in a good dozen similarly structured tables in which the similar columns had different names. in fact, I got stuck in pl/pgsql with the fact that there's no way to access the NEW tuple in an indirect way, having the name of the column in some variable. (I found that it could be done in plperl, but that left me with a taste of un-completeness...) It's ugly, but you could play tricks with EXECUTE. Like: CREATE OR REPLACE FUNCTION emp_stamp() RETURNS trigger AS $emp_stamp$ DECLARE empname text; BEGIN EXECUTE 'SELECT ('''||new||'''::emp).empname' INTO empname; RAISE NOTICE 'new empname is: %', empname; RETURN NEW; END; $emp_stamp$ LANGUAGE plpgsql; Not sure the quoting is right... -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] subquery in limit
I have no idea why you can't do a subquery in the limit but you can reference a function: create table test as select * from pg_tables; create or replace function fn_count(p_sql varchar) returns int as $$ declare v_count int; begin execute p_sql into v_count; return v_count; end; $$ language 'plpgsql' security definer; select * from test limit fn_count('select round(count(*)*0.9) from test'); And I'm sure someone will point out a more efficient way to write my function without using pl/pgsql. :) Jon -Original Message- From: [EMAIL PROTECTED] [mailto:pgsql-hackers- [EMAIL PROTECTED] On Behalf Of Grzegorz Jaskiewicz Sent: Friday, February 15, 2008 5:35 AM To: pgsql-hackers@postgresql.org Subject: [HACKERS] subquery in limit (just as an example): select * from test order by a limit (select count(*)*0.9 from test); is not doable in postgresql. Someone recently asked on IRC about, SELECT TOP 90 PERCENT type of query in m$sql. Any ideas how should this be approach in psql. I ask here, because you guys probably can tell why the first query won't work (subquery is not allowed as limit's argument, why?). cheers. -- Grzegorz Jaskiewicz [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] ANALYZE to be ignored by VACUUM
ITAGAKI Takahiro [EMAIL PROTECTED] writes: In my workload, ANALYZE takes long time (1min at statistics_target = 10, and 5min at 100), but the updated table needs to be vacuumed every 30 seconds because seqscans run on the table repeatedly. There is something *seriously* wrong with that. If vacuum can complete in under 30 seconds, how can analyze take a minute? (I'm also wondering whether you'll still need such frantic vacuuming with HOT...) So, I'm targeting only ANALZYE for now by changing ANALYZE to be ignored by VACUUM. I think we need to understand what the real problem is with your test case. This proposal seems very messy/ugly to me, and I'm unconvinced that it solves anything. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] subquery in limit
Roberts, Jon [EMAIL PROTECTED] writes: I have no idea why you can't do a subquery in the limit It hasn't seemed worth putting any effort into --- AFAIR this is the first time anyone's even inquired about it. As you say, you can always use a function. And I'm sure someone will point out a more efficient way to write my function without using pl/pgsql. :) Only that it doesn't seem a particularly bright idea to use SECURITY DEFINER for a function that will execute any arbitrary caller-provided SQL ... regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] subquery in limit
Roberts, Jon [EMAIL PROTECTED] writes: I have no idea why you can't do a subquery in the limit It hasn't seemed worth putting any effort into --- AFAIR this is the first time anyone's even inquired about it. As you say, you can always use a function. And I'm sure someone will point out a more efficient way to write my function without using pl/pgsql. :) Only that it doesn't seem a particularly bright idea to use SECURITY DEFINER for a function that will execute any arbitrary caller-provided SQL ... LOL! I knew something in my code would trigger a response. :) Jon ---(end of broadcast)--- TIP 6: explain analyze is your friend
[HACKERS] Last call for objections: SELECT alias without AS for IDENTs only?
Is anyone opposed to applying the patch shown here http://archives.postgresql.org/pgsql-hackers/2008-02/msg00434.php (plus appropriate docs changes etc)? This allows AS to be omitted in a SELECT output-column alias clause, but only if the desired alias name lexes as an IDENT, that is, it's not any known keyword (reserved or otherwise). It would be nicer if it allowed any unreserved keyword too; but per discussion, the only way that could happen is if we made some currently unreserved keywords reserved, which does not seem like a net improvement. The documentation would probably have to say something like You can omit AS only if the desired alias name is not any known keyword (see Appendix C). For protection against possible future keyword additions, it is recommended that you always either use AS or double-quote the alias name. On the whole this form of the patch seems like a reasonable compromise --- it doesn't lose any existing functionality, it doesn't take much work, and it should at least dampen complaints about our lack of spec compliance on this point. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] wishlist for 8.4
On Fri, Feb 15, 2008 at 10:06:49AM +, Richard Huxton wrote: Decibel! wrote: On Feb 14, 2008, at 10:06 AM, Tom Lane wrote: =?UTF-8?B?UmFwaGHDq2wgSmFjcXVvdA==?= [EMAIL PROTECTED] writes: so, I propose the use of NEW[variable_containing_the_column_name] (which can obviously be extended to any tuples) to allow such access. what do you experts think ? Zero chance. plplgsql is a strongly typed language, and a construct like that couldn't have any known-in-advance data type. Would it be reasonable to teach EXECUTE about NEW and OLD? That should allow the OP to do what he's looking for... You could have a function get_attribute_as_text(NEW, 'id') or even get_attribute_quoted(NEW, 'id') It would be nice to have a more dynamic language built-in. I'm not aware of any BSD-licensed dynamic languages though. Perl is BSD-compatible. Cheers, David. -- David Fetter [EMAIL PROTECTED] http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: [EMAIL PROTECTED] Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] CVS repository invalid revision
Hello there, I'm running a conversion of PostgreSQL's CVS repository, but I'm stuck on a revision that cscvs fails to parse. The hint that the error gives me is: Parser error: failed to parse revision data line (line: 'date: 2000/12/04 01:20:38; author: tgl; state: Exp; lines: ') That's the literal output -- I'm not sure what the linebreak after lines: means. Can someone help me out by finding and fixing the corrupted revision? Id be most grateful. -- Christian Robottom Reis | http://async.com.br/~kiko/ | [+55 16] 3376 0125 ---(end of broadcast)--- TIP 6: explain analyze is your friend
[HACKERS] NetBSD/dtime_t
Could anyone please tell me how NetBSD defines dtime_t? We have a buildfarm failure on canary. I know which change triggered this. I enabled typedef timestamp dtime_t; Which was mishandled by ecpg in Informix mode before and didn't work at all if you were using the data type. Michael -- Michael Meskes Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED] Go VfL Borussia! Go SF 49ers! Use Debian GNU/Linux! Use PostgreSQL! ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] NetBSD/dtime_t
On Fri, 15 Feb 2008, Michael Meskes wrote: Could anyone please tell me how NetBSD defines dtime_t? We have a buildfarm failure on canary. /usr/include/sys/types.h says: typedef int32_t dtime_t;/* on-disk time_t */ Kris Jurka ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] PG-relevant bug fixes just pushed in Fedora 8
If you happen to use Fedora 8, you might be interested in these just-closed bugs: https://bugzilla.redhat.com/show_bug.cgi?id=430983 Fixes the problem of gssapi reporting uninterpreted numeric error codes instead of useful error messages when there's a Kerberos problem, as we previously saw in this thread: http://archives.postgresql.org/pgsql-hackers/2008-01/msg01004.php https://bugzilla.redhat.com/show_bug.cgi?id=432203 Fixes flex 2.5.33 to not generate half a dozen no previous prototype warnings for each .l file. Note if you are working with a CVS pull you'll need to make maintainer-clean to get rid of the broken current output files, else the warnings will remain. If you're on another distro and see these problems, you might want to pester your distributor to apply the same fixes. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings