[HACKERS] SET WITHOUT OIDS and VACUUM badness?
This is what we did:

 0. BEGIN;
 1. ALTER TABLE ... SET WITHOUT OIDS
 2. A bunch of things are selected out of this table and inserted into
    another (using INSERT ... SELECT)
 3. An index is created on a timestamp field on this table
 4. Then there's an update on a related table, that selects stuff from
    this table
 5. Renames a column
 6. Drops a constraint
 7. Adds a foreign key
 8. Drops 8 columns
 9. Drops 2 indexes
10. Drops 3 triggers
11. Then a tsearch 'txtidx' field is updated, and then cancelled halfway
    through
12. ROLLBACK;
13. VACUUM FULL forums_posts;

Then we get thousands of these:

WARNING: relation forums_posts TID 22763/10: OID is invalid
WARNING: relation forums_posts TID 22763/11: OID is invalid
WARNING: relation forums_posts TID 22763/12: OID is invalid
WARNING: relation forums_posts TID 22763/13: OID is invalid
WARNING: relation forums_posts TID 22763/14: OID is invalid
WARNING: relation forums_posts TID 22763/15: OID is invalid
WARNING: relation forums_posts TID 22763/16: OID is invalid
WARNING: relation forums_posts TID 22763/17: OID is invalid
WARNING: relation forums_posts TID 22764/1: OID is invalid
WARNING: relation forums_posts TID 22764/2: OID is invalid
WARNING: relation forums_posts TID 22764/3: OID is invalid
WARNING: relation forums_posts TID 22764/4: OID is invalid

This seems to be reproducible...

Chris
Re: [HACKERS] SET WITHOUT OIDS and VACUUM badness?
On Wed, 21 Jan 2004, Christopher Kings-Lynne wrote:

> This seems to be reproducible...

Here is a smaller example that shows the problem:

CREATE TABLE foo (a INT);
BEGIN;
ALTER TABLE foo SET WITHOUT OIDS;
INSERT INTO foo VALUES (5);
ROLLBACK;
VACUUM FULL foo;

It's easy to guess what is causing this, but I'll leave that to the person who wants to fix it.

-- 
/Dennis Björklund
Re: [HACKERS] SET WITHOUT OIDS and VACUUM badness?
On Wed, 21 Jan 2004, Christopher Kings-Lynne wrote:

> This is what we did:
> 0. BEGIN;
> 1. ALTER TABLE ... SET WITHOUT OIDS
> 12. ROLLBACK;
> 13. VACUUM FULL forums_posts;

The problem here is that this conditional doesn't take into account the change in state which the above transaction causes:

    if (onerel->rd_rel->relhasoids && !OidIsValid(HeapTupleGetOid(tuple)))

Tuples inserted after step one have no (valid) OID. However, since we roll back, the change to pg_class.relhasoids = 'f' is rolled back as well. The only solution I can think of is removing the test or storing relhasoids as a per-tuple flag (argh).

Gavin
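(A sketch, not from the original thread, of how to watch the state mismatch Gavin describes, reusing Dennis's table from the previous message:)

    CREATE TABLE foo (a INT);          -- gets OIDs by default in this era
    BEGIN;
    ALTER TABLE foo SET WITHOUT OIDS;
    INSERT INTO foo VALUES (5);        -- this tuple is written with no OID
    ROLLBACK;                          -- ...but the catalog change is undone

    SELECT relhasoids FROM pg_class WHERE relname = 'foo';
    -- returns 't' again, so VACUUM expects every tuple to carry an OID
    VACUUM FULL foo;                   -- WARNING: ... OID is invalid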
Re: [HACKERS] SET WITHOUT OIDS and VACUUM badness?
On Wed, 21 Jan 2004, Gavin Sherry wrote:

> On Wed, 21 Jan 2004, Christopher Kings-Lynne wrote:
>
> > This is what we did:
> > 0. BEGIN;
> > 1. ALTER TABLE ... SET WITHOUT OIDS
> > 12. ROLLBACK;
> > 13. VACUUM FULL forums_posts;
>
> The problem here is that this conditional doesn't take into account the
> change in state which the above transaction causes:
>
>     if (onerel->rd_rel->relhasoids && !OidIsValid(HeapTupleGetOid(tuple)))
>
> Tuples inserted after step one have no (valid) OID. However, since we
> roll back, the change to pg_class.relhasoids = 'f' is rolled back as
> well. The only solution I can think of is removing the test or storing
> relhasoids as a per-tuple flag (argh).

What am I talking about? Can't we test for:

    (tuple)->t_infomask & HEAP_HASOID

instead of:

    onerel->rd_rel->relhasoids

Gavin
Re: [HACKERS] Slow query - index not used
> What's the actual distribution of values in these columns? Are you
> searching for values that are particularly common or uncommon?

This column always has a predefined set of values. Usually the app would search for one of the existing values.

Total records: 74654

    nt_note_name       | count
    -------------------+-------
    Collection         | 10068
    Component          |  1200
    CustTicket         | 15009
    Deliver            |  1201
    Download           |  1999
    GroupProv          |   464
    IP_News            |  5950
    IP_Publish_Request |  4000
    IP_Usage           |  2000
    KnowledgeBase      | 15002
    LevelInfo          |    10
    OtherParam         |  4000
    Request            |  4501
    TestMethod         |  4050
    VerTech            |  4000
    Version            |  1200

I started from scratch: took the DEFAULT_STATISTICS_TARGET parameter out of the config file, restarted the database, ran VACUUM ANALYZE, and then got these statistics:

# explain select count (1) from note_links_aux where nt_note_name = 'KnowledgeBase';
 Aggregate  (cost=1982.68..1982.68 rows=1 width=0)
   ->  Seq Scan on note_links_aux  (cost=0.00..1970.18 rows=5002 width=0)
         Filter: (nt_note_name = 'KnowledgeBase'::character varying)

# explain select count (1) from note_links_aux where nt_note_name = 'OtherParam';
 Aggregate  (cost=1984.78..1984.78 rows=1 width=0)
   ->  Seq Scan on note_links_aux  (cost=0.00..1970.18 rows=5840 width=0)
         Filter: (nt_note_name = 'OtherParam'::character varying)

# explain select count (1) from note_links_aux where nt_note_name = 'LevelInfo';
 Aggregate  (cost=58.91..58.91 rows=1 width=0)
   ->  Index Scan using nla_nt_note_name_fk_i on note_links_aux  (cost=0.00..58.87 rows=15 width=0)
         Index Cond: (nt_note_name = 'LevelInfo'::character varying)

# explain select count (1) from note_links_aux where nt_note_name = 'NoSuchThing';
 Aggregate  (cost=5.83..5.83 rows=1 width=0)
   ->  Index Scan using nla_nt_note_name_fk_i on note_links_aux  (cost=0.00..5.83 rows=1 width=0)
         Index Cond: (nt_note_name = 'NoSuchThing'::character varying)

So the 'rows' values are incorrect. Also, looking at the queries with 'KnowledgeBase' and 'OtherParam': does a seq scan make sense? I mean, 'rows' has a value of about 5000 records out of the total of 75000 records in the table. This ratio does not seem high enough to assume that an index scan won't be beneficial. And even if we consider the real number of records (about 15000), this is still only 20% of the total. Would an index scan still be faster?

Sorry if I put more info here than you need.
Thanks,
Mike.
Re: [HACKERS] Slow query - index not used
On Wed, 21 Jan 2004, Michael Brusser wrote:

> So the 'rows' values are incorrect.

You can increase the statistics gathering for that column with ALTER TABLE and probably get better estimates.

> Also, looking at the queries with 'KnowledgeBase' and 'OtherParam':
> does a seq scan make sense? I mean, 'rows' has a value of about 5000
> records out of the total of 75000 records in the table.

It doesn't sound so strange to me. I don't know exactly what limits postgresql uses, but it probably needs to fetch every page in the table to find all those 5000 records. If it has to do that, then the index scan would not help that much (it might even make it slower).

It's easy to test what happens if it does the index scan instead of the seq scan. Just do

    SET enable_seqscan TO false;

before you try the query. Then you can compare the times with and without the index scan.

Remember, even if it finds a row in the index, it still has to fetch the actual row from the table as well. So if it needs to fetch all pages from the table, the total amount of I/O is all of the table plus the relevant part of the index. Whether that is faster or not depends on such things as whether the data is already cached in memory. Setting effective_cache_size correctly lets postgresql take into account how much file cache you have, which can affect the plan.

-- 
/Dennis Björklund
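(A sketch of the two experiments Dennis suggests, not from the original mail; the table and column names come from Michael's post, and the statistics target of 100 is an arbitrary illustration:)

    -- gather a larger sample for the skewed column, then re-analyze:
    ALTER TABLE note_links_aux ALTER COLUMN nt_note_name SET STATISTICS 100;
    ANALYZE note_links_aux;

    -- temporarily discourage sequential scans to compare real timings:
    SET enable_seqscan TO false;
    EXPLAIN ANALYZE
        SELECT count(1) FROM note_links_aux
        WHERE nt_note_name = 'KnowledgeBase';
    SET enable_seqscan TO true;    -- restore the default afterwards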
Re: [HACKERS] Allow backend to output result sets in XML
Tom Lane [EMAIL PROTECTED] writes:

> Brian Moore [EMAIL PROTECTED] writes:
> > i would like to begin work on the TODO item
> >     Allow backend to output result sets in XML
>
> I am not sure why it's phrased that way --- surely the code to hack on
> is the client side, not the backend. Otherwise you need a protocol
> revision to make this happen, which implies hacking *both* ends.

Presumably libpq would continue to use the binary protocol, but other clients could bypass libpq and just stream ASCII XML queries.

Personally I don't see any point in XML, but if there were a standard query protocol then a client could send queries to any database that supported it without using any libraries. That might be useful. Of course you could do that without XML, but people seem to get more excited about complying with standards when they invoke XML.

> psql already has some code to output results as HTML tables; I'd think
> adding functionality in that vicinity would be the way to go.

That could also be useful, mainly in that it could include metadata from the query as well as the data, allowing import tools for programs like spreadsheets to do more intelligent things with the data than they currently can.

-- 
greg
Re: [HACKERS] Allow backend to output result sets in XML
Greg Stark wrote:

> Personally I don't see any point in XML, but if there were a standard
> query protocol then a client could send queries to any database that
> supported it without using any libraries. That might be useful. Of
> course you could do that without XML, but people seem to get more
> excited about complying with standards when they invoke XML.

Hm. I have to deal with XML quite frequently because I do a lot of DX with the gov't and other entities that are rapidly standardizing on XML. I like Oracle's approach to XML: using object-relational mappings to allow composition of documents server-side, based on natural data relationships. The XML document becomes something like a specialized view. It would save me tons of app-level coding if the server could do this for me. Since Postgres is already fairly Oracle-ish in design, IMO this is definitely the way to go (XQuery = insanity).

A FE/BE protocol revision would be useful but not necessary... the XML doc could be returned as a scalar. Right now I think all XML processing is done in app-level code, because the server (due to limitations of SQL) is usually unable to return data the way you want it... so simply adding XML output to psql would be fairly useless for most real tasks (if it weren't, someone would have done it a long time ago). Also, contrib/xml can already handle most of the simple things.

Merlin
Re: [HACKERS] Allow backend to output result sets in XML
Greg Stark wrote:

> Tom Lane [EMAIL PROTECTED] writes:
> > Brian Moore [EMAIL PROTECTED] writes:
> > > i would like to begin work on the TODO item
> > >     Allow backend to output result sets in XML
> >
> > I am not sure why it's phrased that way --- surely the code to hack on
> > is the client side, not the backend. Otherwise you need a protocol
> > revision to make this happen, which implies hacking *both* ends.
>
> Presumably libpq would continue to use the binary protocol, but other
> clients could bypass libpq and just stream ASCII XML queries.

I would think that you would still use libpq, with the binary protocol understanding an XML header request of some sort?

J

> Personally I don't see any point in XML, but if there were a standard
> query protocol then a client could send queries to any database that
> supported it without using any libraries. That might be useful. Of
> course you could do that without XML, but people seem to get more
> excited about complying with standards when they invoke XML.
>
> > psql already has some code to output results as HTML tables; I'd think
> > adding functionality in that vicinity would be the way to go.
>
> That could also be useful, mainly in that it could include metadata from
> the query as well as the data, allowing import tools for programs like
> spreadsheets to do more intelligent things with the data than they
> currently can.

-- 
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - [EMAIL PROTECTED] - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL
Re: [HACKERS] Slow query - index not used
Michael Brusser [EMAIL PROTECTED] writes:

> So the 'rows' values are incorrect.

You sound like you are expecting them to be exact. They're just estimates. They're all plenty close enough for planning purposes, except maybe the one for 'KnowledgeBase' is a little further off than I would have expected. That could just be a chance sampling thing --- if you rerun ANALYZE and then check again, how much does that estimate move around in a few tries?

> I mean 'rows' has a value of about 5000 records out of the total of
> 75000 records in the table. This ratio does not seem high enough to
> assume that an index scan won't be beneficial.

You're mistaken. You didn't give any indication of how wide the rows are in this table, but supposing for purposes of argument that they're roughly 100 bytes apiece, there would be room for 80 rows on each 8K page of the table. A scan that needs to visit 1/15th of the table is statistically certain to read nearly every page of the table, many of them multiple times. (This assumes the needed rows are randomly scattered, which wouldn't be true if the table is clustered on the index, but you didn't say anything about having done that.) So an indexscan doesn't save any I/O, and may cost I/O if some of the pages fall out of buffer cache between visits. Add to that the penalty for random instead of sequential reads, and the cost to read the index itself, and you come out way behind a seqscan.

This all assumes that I/O is far more expensive than CPU cycles. If you have a database that's small enough to fit in RAM, then the cost of checking the WHERE clauses at each tuple could become the dominant factor. If that's your situation, you should look at altering the planner's cost parameters --- in particular, lower random_page_cost and increase the various CPU cost settings.

With the default cost settings, the planner will not pick an indexscan if it thinks that more than a couple percent of the table needs to be visited. (I think the breakeven point is dependent on table size, but I don't have the details in my head right now.) You can experiment with EXPLAIN ANALYZE and the various enable_xxx settings to see whether it's right or not in your environment. See the pgsql-performance archives for more discussion.

			regards, tom lane
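(A sketch of the cost-parameter experiment Tom outlines, for the fits-in-RAM case; not from the original mail, and the values are illustrative starting points rather than recommendations:)

    -- era defaults: random_page_cost = 4, effective_cache_size = 1000 (8K pages)
    SET random_page_cost = 2;            -- cached data makes random reads cheaper
    SET effective_cache_size = 50000;    -- roughly 400 MB of file cache, in 8K pages
    EXPLAIN ANALYZE
        SELECT count(1) FROM note_links_aux
        WHERE nt_note_name = 'KnowledgeBase';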
Re: [HACKERS] Allow backend to output result sets in XML
Please forgive me if this is silly, but if you wanted XML from the server, couldn't you just write a PL/Perl untrusted function that takes a SELECT statement as its parameter and returns a single scalar containing the XML?

- The XML:: modules in Perl help with the XML formatting
- DBD::PgSPI could be handed the query as-is
- No change to BE/FE or wire protocols
- No impact on people who don't want it
- Probably works across versions with minimal fuss

Returning a simple XML structure with column names and rows should only take a few lines. (I'd write an example if I knew XML:: better.)

I'll go back to lurking now. Thanks all for the great database!
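(No example was posted, so here is a rough sketch of what such a function could look like. It hand-rolls the escaping instead of using the XML:: modules, and it calls spi_exec_query, the in-server query interface that later PL/Perl versions provide; with the PL/Perl of this era you would go through DBD::PgSPI instead, as the list above suggests. The name select_to_xml is made up, and the dollar quoting is a latter-day convenience.)

    CREATE OR REPLACE FUNCTION select_to_xml(text) RETURNS text AS $$
        my ($query) = @_;
        my $rv  = spi_exec_query($query);
        my $xml = qq{<?xml version="1.0" encoding="UTF-8"?>\n<result>\n};
        foreach my $row (@{ $rv->{rows} }) {
            $xml .= "  <row>\n";
            foreach my $col (sort keys %$row) {
                my $val = defined $row->{$col} ? $row->{$col} : '';
                # minimal escaping; XML::Writer and friends do this properly
                for ($val) { s/&/&amp;/g; s/</&lt;/g; s/>/&gt;/g; }
                $xml .= "    <$col>$val</$col>\n";
            }
            $xml .= "  </row>\n";
        }
        return $xml . "</result>\n";
    $$ LANGUAGE plperlu;

    -- e.g.: SELECT select_to_xml('SELECT name, age FROM person');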
Re: [HACKERS] Allow backend to output result sets in XML
Tom Lane wrote:

> Brian Moore [EMAIL PROTECTED] writes:
> > i would like to begin work on the TODO item
> >     Allow backend to output result sets in XML
>
> I am not sure why it's phrased that way --- surely the code to hack on
> is the client side, not the backend. Otherwise you need a protocol
> revision to make this happen, which implies hacking *both* ends.
>
> psql already has some code to output results as HTML tables; I'd think
> adding functionality in that vicinity would be the way to go.

Converting a libpq result set (or a JDBC result set or ...) to an XML document should be a trivial string-concatenation job that anyone can implement in half an hour. The more interesting questions are: What XML schema do you want to use, and why? What do you want to do with the XML in the first place? Would a streaming interface be better? Do you want a text document or a preparsed structure? What good would a, say, libpq implementation be if it's more work to make a wrapper in one of the other language bindings than to implement it from scratch there?

I think XML output is just buzz. Give us a real use scenario, and an indication that a majority also has that use scenario (vs. the other ones listed above); then we can talk.
[HACKERS] testing mail relays ...
We lost a mail relay, so testing to make sure the configuration is still working after replacing it with other local machines we have more control over ...

Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]           Yahoo!: yscrappy              ICQ: 7615664
Re: [HACKERS] Allow backend to output result sets in XML
Peter Eisentraut wrote:

> I think XML output is just buzz. Give us a real use scenario, and an
> indication that a majority also has that use scenario (vs. the other
> ones listed above); then we can talk.

Consider:

create table person (name varchar primary key, age int);
create table account (number varchar primary key,
                      name varchar references person);

insert into person values ('Fred', 35);
insert into person values ('Barney', 37);
insert into account values ('1234', 'Fred');
insert into account values ('5678', 'Fred');
insert into account values ('abcd', 'Barney');

test=# select * from person into xml natural;
<?xml version="1.0" encoding="UTF-8" ?>
<result>
  <row n="1">
    <person>
      <name>Fred</name>
      <age>35</age>
      <account>
        <number>1234</number>
      </account>
      <account>
        <number>5678</number>
      </account>
    </person>
  </row>
  <row n="2">
  [...]

Now consider:

select * from person into xml natural
    namespace is 'some_uri' schema is 'person.xsd';

This returns the result set above, but with schema and namespace declarations included. Of course, there is tons of complexity hiding in there, but is this worth considering?

Merlin
Re: [HACKERS] CTTAS w/ DISTINCT ON crashes backend
The crash I'm getting can be boiled down to this:

regression=# create table fooey(f1 int) without oids;
CREATE TABLE
regression=# insert into fooey values(11);
INSERT 0 1
regression=# create temp table fooey2 as select distinct * from fooey;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!

I'm getting an Assert failure, which no doubt detects the problem much sooner than you were. The problem is in adding OIDs to rows that initially did not have 'em when returned from the SELECT DISTINCT plan. So your best immediate workaround is to create the first temp table with oids, or create the second one without.

			regards, tom lane
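(A sketch of the two workarounds Tom mentions, not from the original mail; the WITH OIDS / WITHOUT OIDS clause on CREATE TABLE AS is assumed to be available in the build at hand:)

    -- either give the source table OIDs to begin with...
    create table fooey(f1 int) with oids;
    -- ...or create the copy without them, so the SELECT DISTINCT plan
    -- never has to add OIDs to rows that lack them:
    create temp table fooey2 without oids as select distinct * from fooey;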
Re: [HACKERS] CTTAS w/ DISTINCT ON crashes backend
Tom Lane wrote:

> The crash I'm getting can be boiled down to this:
>
> regression=# create table fooey(f1 int) without oids;
> CREATE TABLE
> regression=# insert into fooey values(11);
> INSERT 0 1
> regression=# create temp table fooey2 as select distinct * from fooey;
> server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> !
>
> I'm getting an Assert failure, which no doubt detects the problem much
> sooner than you were. The problem is in adding OIDs to rows that
> initially did not have 'em when returned from the SELECT DISTINCT plan.

Okay.

> So your best immediate workaround is to create the first temp table
> with oids, or create the second one without.

Thanks!

Mike Mascari