Re: [GENERAL] A query planner that learns
Scott Marlowe wrote:
> While all the talk of a hinting system over in hackers and perform is good, and I have a few queries that could live with a simple hint system pop up now and again, I keep thinking that a query planner that learns from its mistakes over time is far more desirable. Is it reasonable or possible for the system to have a way to look at query plans it's run, look for obvious mistakes it's made, like being off by a factor of 10 or more in estimations, and slowly learn to apply its own hints?

Technically it is very feasible. But I think you might want to check US Patent 6,763,359 before you start writing any code.

> It seems to me the first logical step would be having the ability to flip a switch and when the postmaster hits a slow query, it saves both the query that ran long, as well as the output of explain or explain analyze or some bastardized version missing some of the inner timing info. Even just saving the parts of the plan where the planner thought it would get 1 row and instead got 350,000 and was using a nested loop to join would be VERY useful. I could see something like that eventually evolving into a self-tuning system.

I think it would be a good start if we could specify a log_selectivity_error_threshold: if estimates are more than that factor off, the query, parameters and planner estimates get logged for later analysis. That would be driven entirely by selectivity estimates and not by (estimated) cost, since cost is influenced by outside factors such as other processes competing for resources.

If a system for statistical hints emerges from the current discussion we would indeed have the input to start tuning the selectivity estimations.

Jochem

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
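The 1-row-estimated versus 350,000-actual case described above is exactly what EXPLAIN ANALYZE already exposes; a purely illustrative sketch of what a log_selectivity_error_threshold check could compare (the table is hypothetical and the GUC does not exist):

```sql
-- Purely illustrative: 'orders' is a hypothetical table, and
-- log_selectivity_error_threshold is a proposed, not an existing, setting.
-- EXPLAIN ANALYZE reports both the planner's estimate and the real row count:
EXPLAIN ANALYZE
SELECT * FROM orders WHERE status = 'pending';

-- Each plan node carries both numbers, e.g.:
--   Seq Scan on orders (cost=0.00..35.50 rows=1 ...) (actual ... rows=350000 loops=1)
-- The proposed check would log the query and parameters whenever
--   greatest(rows_estimated, rows_actual) / least(rows_estimated, rows_actual)
-- exceeds the configured threshold factor.
```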
Re: [GENERAL] [JDBC] Is what I want possible and if so how?
Csaba Nagy wrote:
> On Mon, 2006-07-03 at 17:03, Tom Lane wrote:
>> status and TX2's select will not return the row. This isn't entirely perfect because LIMIT acts before FOR UPDATE: TX2's select will return nothing, rather than selecting the next available row as you might wish. So you might want to retry the select several times before deciding there's nothing to do.
>
> We do have a table like this, and in fact we did observe this behavior: if multiple clients ask for a row at the same time, the first gets something and the rest nothing. We're actually still looking for an optimal solution for this... For now, we added a random field to the table (with values 0-9), and the clients ask with a where clause for a random value in this field. This way there's a good chance the clients will not step on each other's toes (i.e. the row asked for is not locked by another client). It is still necessary to retry a few times, but after introducing this random number mechanism we did notice a significant performance improvement in emptying the queue... so it must work somehow. It's true that we usually have 10-15 clients constantly polling the queue, and the queue itself is usually loaded with at least a few hundred tasks, so the random numbers are reasonably distributed to be effective.
>
> Now I wonder if there's some other way to get the same result without an additional column in the table?

For a small number of processes and a large difference between the 'lookup' speed and the 'work', I have used a two-step process where you first get a batch of records and then try them all in rapid succession. In pseudocode:

SELECT * FROM table WHERE condition LIMIT number_of_queue_processes + 1;
LOOP;
  BEGIN;
  SELECT * FROM table WHERE condition AND pk = xxx LIMIT 1 FOR UPDATE NOWAIT;
  do something;
  COMMIT;
END;

Jochem
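Written out against a concrete (hypothetical) queue table tasks(id, status), the two-step pseudocode above becomes something like the sketch below; NOWAIT makes a lock conflict fail immediately instead of blocking, so the worker can simply move on to the next candidate row:

```sql
-- Step 1: grab a batch of candidate ids without locking anything.
SELECT id FROM tasks
WHERE status = 'pending'
LIMIT 11;  -- number_of_queue_processes + 1

-- Step 2: for each candidate id in turn, try to lock just that row.
BEGIN;
SELECT * FROM tasks
WHERE status = 'pending' AND id = 42   -- one id from the batch
FOR UPDATE NOWAIT;
-- on success: do the work, then mark the task done
UPDATE tasks SET status = 'done' WHERE id = 42;
COMMIT;
-- on a "could not obtain lock" error: ROLLBACK and try the next id
```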
[GENERAL] Open Source database comparison
FYI: http://www.fabalabs.org/research/papers/FabalabsResearchPaper-OSDBMS-Eval.pdf

Jochem
Re: [GENERAL] pg_dump 8.0.3 failing against PostgreSQL 7.3.2
Tom Lane wrote:
> I wrote:
>> [ light dawns ] You've created a table named "text", haven't you?

Yes, there is such a table. But even if I put the schema with that table in the search_path I can't reproduce the error from psql.

> You need this patch.

I prefer the interpretation "My customer needs to change his schema if he wants backups" :)

I will test the patch, but it will take a few days because I haven't got the toolchain set up.

Thanx,

Jochem
Re: [GENERAL] pg_dump 8.0.3 failing against PostgreSQL 7.3.2
Tom Lane wrote:
> Well, that trace makes it look like it's unhappy about the "null::text" in the command, because there is no other typecast in the SELECT target statement. Looking at the 7.3 code, the only very plausible reason for the failure is if either "unknown" or "text" has disappeared from pg_type, so that one of the typeidIsValid tests in can_coerce_type fails. But that doesn't explain why you don't see the failure interactively --- seems like "select null::text" should always fail in that database, if that's where the problem is. I confess to bewilderment ... anyone have a clue?

If I check pg_type in the database with the problem for typname unknown or text I get 3 rows. In other databases I get only 2 rows.

Jochem

Problem database (wide columns trimmed for readability):

wedstrijdzeilen=> select * from pg_type where typname in ('unknown','text') order by typname;

 typname | typnamespace | typlen | typbyval | typtype | typrelid | typinput  | typoutput  | typalign | typstorage
---------+--------------+--------+----------+---------+----------+-----------+------------+----------+-----------
 text    |           11 |     -1 | f        | b       |        0 | textin    | textout    | i        | x
 text    |      7965528 |      4 | t        | c       |  8330945 | record_in | record_out | i        | p
 unknown |           11 |     -1 | f        | b       |        0 | unknownin | unknownout | i        | p
(3 rows)

wedstrijdzeilen=> select * from pg_class where oid = 8330945;

 relname | relnamespace | reltype | relfilenode | reltoastrelid | relkind | relnatts | relacl
---------+--------------+---------+-------------+---------------+---------+----------+----------------------------------------------------------
 text    |      7965528 | 8330946 |     8330945 |       8330947 | r       |        2 | {=,wedstrijdzeilen_admin=arwdRxt,wedstrijdzeilen_user=r}
(1 row)

Reference database:

jochemd=> select * from pg_type where typname in ('unknown','text') order by typname;

 typname | typnamespace | typlen | typbyval | typtype | typrelid | typinput  | typoutput  | typalign | typstorage
---------+--------------+--------+----------+---------+----------+-----------+------------+----------+-----------
 text    |           11 |     -1 | f        | b       |        0 | textin    | textout    | i        | x
 unknown |           11 |     -1 | f        | b       |        0 | unknownin | unknownout | i        | p
(2 rows)
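For anyone diagnosing a similar case: joining against pg_namespace makes the shadowing visible immediately, since it shows which schema each pg_type row lives in (pg_catalog's rows are the built-ins; anything else is a user object):

```sql
-- Show every type named 'text' or 'unknown' together with its schema.
-- typtype 'b' = base type, 'c' = composite (row) type of a table;
-- typrelid points at the pg_class entry for composite types.
SELECT n.nspname AS schema, t.typname, t.typtype, t.typrelid
FROM pg_type t
JOIN pg_namespace n ON n.oid = t.typnamespace
WHERE t.typname IN ('text', 'unknown')
ORDER BY t.typname, n.nspname;
```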
Re: [GENERAL] pg_dump 8.0.3 failing against PostgreSQL 7.3.2
Tom Lane wrote:
> [ studies 7.3 code a bit ] It might work better to set the breakpoint at elog_message_prefix, assuming you've got logging dialed down to the point where only actual ERRORs go to the log.

Attaching to process 10284
0x403827df in ?? ()
(gdb) symbol-file postmaster
Reading symbols from postmaster...done.
(gdb) break elog_message_prefix
Breakpoint 1 at 0x16e73f
(gdb) continue
Continuing.

Breakpoint 1, 0x16e73f in elog_message_prefix ()
(gdb) bt
#0  0x16e73f in elog_message_prefix ()
#1  0x16da26 in elog ()
#2  0x82b3d in typecast_expression ()
#3  0x818d9 in transformExpr ()
#4  0x89d4d in transformTargetEntry ()
#5  0x8a021 in transformTargetList ()
#6  0x3cb78 in transformSelectStmt ()
#7  0x3ab6f in transformStmt ()
#8  0x3a79c in parse_analyze ()
#9  0x110574 in pg_analyze_and_rewrite ()
#10 0x110923 in pg_exec_query_string ()
#11 0x112a91 in PostgresMain ()
#12 0xf4eae in DoBackend ()
#13 0xf463d in BackendStartup ()
#14 0xf3040 in ServerLoop ()
#15 0xf2502 in PostmasterMain ()
#16 0xc9926 in main ()
(gdb) detach
Detaching from program: process 10284
(gdb) quit
[EMAIL PROTECTED]:/usr/local/pgsql/bin>

Jochem
Re: [GENERAL] pg_dump 8.0.3 failing against PostgreSQL 7.3.2
Tom Lane wrote:
> Jochem van Dieten <[EMAIL PROTECTED]> writes:
>> (gdb) break elog
>> Breakpoint 1 at 0x16d8f8
>> (gdb) bt
>> #0  0x403ca553 in ?? () from /usr/lib/libc.so.28.5
>> #1  0x10e604 in mdread ()
>> #2  0x10f31f in smgrread ()
>
> You forgot to "continue" until the breakpoint is reached --- this trace just indicates where the backend happened to be when you stopped it by attaching.

That's different from backtracing a core dump :) This better?

Attaching to process 22733
0x403827df in ?? ()
(gdb) symbol-file postmaster
Reading symbols from postmaster...done.
(gdb) break elog
Breakpoint 1 at 0x16d8f8
(gdb) continue
Continuing.

Breakpoint 1, 0x16d8f8 in elog ()
(gdb) bt
#0  0x16d8f8 in elog ()
#1  0x110abb in pg_exec_query_string ()
#2  0x112a91 in PostgresMain ()
#3  0xf4eae in DoBackend ()
#4  0xf463d in BackendStartup ()
#5  0xf3040 in ServerLoop ()
#6  0xf2502 in PostmasterMain ()
#7  0xc9926 in main ()
(gdb) detach

Jochem
Re: [GENERAL] pg_dump 8.0.3 failing against PostgreSQL 7.3.2
Tom Lane wrote:
> Jochem van Dieten <[EMAIL PROTECTED]> writes:
>> Below I have copy pasted the console log which has some additional information. This bug, or a related one, seems to have been registered previously as bug #1455 http://archives.postgresql.org/pgsql-bugs/2005-02/msg00021.php
>
> Yeah. We never heard back from that person, so I thought he'd resolved his problem, but maybe there's something in common. It's *really* odd that you can do the same command by hand and it doesn't fail. Just to get the Windows version out of the loop: if you run 7.3's own pg_dump directly on the BSD box, does it fail the same way?

That fails due to a pg_dump issue that was fixed in 7.4: http://archives.postgresql.org/pgsql-general/2003-01/msg6.php

> The only way I can think of to get more info is to try to get a stack trace for the error --- that is, attach to the backend process with gdb, set a breakpoint at elog, let it run till the error occurs, and then "bt" to see how control got to the failure. Can you do that?

OpenBSD man pages are said to be very good :)

> It will help if you do something like export PGOPTIONS="-W 30" on the client side before starting pg_dump --- that will provide a 30-second delay during connection to give you time to identify the connected backend and attach to it with gdb.

On Windows that would be 'set PGOPTIONS="-W 30"'. Once I do that, I get the following error:

C:\Program Files\PostgreSQL\8.0\bin>set PGOPTIONS="-W 30"
C:\Program Files\PostgreSQL\8.0\bin>pg_dump wedstrijdzeilen > "z:\backup\databases\2005-06-06\wedstrijdzeilen.sql"
WARNING: postgres: invalid command line arguments
Try -? for help.
pg_dump: [archiver (db)] connection to database "wedstrijdzeilen" failed: server closed the connection unexpectedly
        This probably means the server terminated abnormally before or while processing the request.

(pg_dump -? appears to suggest that -W is used for the "force password prompt" connection option.)

Exclusively locking pg_proc from another connection is also a nice way to slow down pg_dump :)

[EMAIL PROTECTED]:/usr/local/pgsql/bin> gdb attach 7761
warning: failed to install memory consistency checks; configuration should define NO_MMCHECK or MMCHECK_FORCE
GNU gdb 4.16.1
Copyright 1996 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-unknown-openbsd3.2"...
attach: No such file or directory.
/usr/local/pgsql/bin/7761: No such file or directory.
Attaching to process 7761
0x403ca553 in ?? ()
(gdb) symbol-file postmaster
Reading symbols from postmaster...done.
(gdb) break elog
Breakpoint 1 at 0x16d8f8
(gdb) bt
#0  0x403ca553 in ?? () from /usr/lib/libc.so.28.5
#1  0x10e604 in mdread ()
#2  0x10f31f in smgrread ()
#3  0x103827 in ReadBufferInternal ()
#4  0x1035ea in ReadBuffer ()
#5  0x16fc5 in _bt_getbuf ()
#6  0x19da0 in _bt_step ()
#7  0x1941d in _bt_next ()
#8  0x17573 in btgettuple ()
#9  0x170523 in FunctionCall2 ()
#10 0x1330d in index_getnext ()
#11 0x12dac in systable_getnext ()
#12 0xa5a3b in RelationBuildTriggers ()
#13 0x167849 in RelationBuildDesc ()
#14 0x168837 in RelationIdGetRelation ()
#15 0xd378 in relation_open ()
#16 0x992d8 in LockTableCommand ()
#17 0x114c11 in ProcessUtility ()
#18 0x110a3a in pg_exec_query_string ()
#19 0x112a91 in PostgresMain ()
#20 0xf4eae in DoBackend ()
#21 0xf463d in BackendStartup ()
#22 0xf3040 in ServerLoop ()
#23 0xf2502 in PostmasterMain ()
#24 0xc9926 in main ()
(gdb) detach
Detaching from program: process 7761
(gdb) quit
[EMAIL PROTECTED]:/usr/local/pgsql/bin>

Jochem
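A possible explanation for the WARNING above, offered only as a guess: unlike Unix shells, cmd.exe keeps the double quotes as part of the variable's value, so the backend may be receiving the quote characters inside its option string. Setting the variable without quotes could behave differently:

```
:: cmd.exe: quotes become part of the value, so set the variable unquoted
set PGOPTIONS=-W 30
pg_dump wedstrijdzeilen > "z:\backup\databases\2005-06-06\wedstrijdzeilen.sql"
```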
[GENERAL] pg_dump 8.0.3 failing against PostgreSQL 7.3.2
I am experiencing a problem with dumping one specific database on a cluster. All 143 other databases dump without giving errors.

The server runs PostgreSQL 7.3.2 on OpenBSD (I know :). pg_dump is version 8.0.3 on Windows (upgraded from 8.0.1 which had the same problem). The error message is:

pg_dump: SQL command failed
pg_dump: Error message from server: ERROR: Cannot cast type "unknown" to text
pg_dump: The command was: SELECT proretset, prosrc, probin, null::text as proargnames, provolatile, proisstrict, prosecdef, (SELECT lanname FROM pg_catalog.pg_language WHERE oid = prolang) as lanname FROM pg_catalog.pg_proc WHERE oid = '8331054'::pg_catalog.oid

Below I have copy-pasted the console log which has some additional information. This bug, or a related one, seems to have been registered previously as bug #1455 http://archives.postgresql.org/pgsql-bugs/2005-02/msg00021.php

Jochem

C:\Program Files\PostgreSQL\8.0\bin>psql --version
psql (PostgreSQL) 8.0.3

C:\Program Files\PostgreSQL\8.0\bin>pg_dump wedstrijdzeilen > "z:\backup\databases\2005-06-06\wedstrijdzeilen.sql"
Password:
pg_dump: SQL command failed
pg_dump: Error message from server: ERROR: Cannot cast type "unknown" to text
pg_dump: The command was: SELECT proretset, prosrc, probin, null::text as proargnames, provolatile, proisstrict, prosecdef, (SELECT lanname FROM pg_catalog.pg_language WHERE oid = prolang) as lanname FROM pg_catalog.pg_proc WHERE oid = '8331054'::pg_catalog.oid

C:\Program Files\PostgreSQL\8.0\bin>psql wedstrijdzeilen
Password:
Welcome to psql 8.0.3, the PostgreSQL interactive terminal.

Type: \copyright for distribution terms
      \h for help with SQL commands
      \? for help with psql commands
      \g or terminate with semicolon to execute query
      \q to quit

Warning: Console code page (437) differs from Windows code page (1252): 8-bit characters may not work correctly. See psql reference page "Notes for Windows users" for details.

oli=> select version();
                               version
---------------------------------------------------------------------
 PostgreSQL 7.3.2 on i386-unknown-openbsd3.2, compiled by GCC 2.95.3
(1 row)

wedstrijdzeilen=> SELECT proretset, prosrc, probin, null::text as proargnames, provolatile, proisstrict, prosecdef, (SELECT lanname FROM pg_catalog.pg_language WHERE oid = prolang) as lanname FROM pg_catalog.pg_proc WHERE oid = '8331054'::pg_catalog.oid;

(record shown expanded for readability)
proretset   | t
prosrc      | SELECT j.parent_id as id, (SELECT singular FROM dswz.translations WHERE id = t.name AND language_id = (SELECT dswz.check_creator_id($1, $2, $3, $4))), street, house_number, zipcode, city, (SELECT singular FROM dswz.translations WHERE id = nations.name AND language_id = (SELECT dswz.check_creator_id($1, $2, $3, $4))) FROM dswz.view_addresses a LEFT JOIN dswz.nations ON country = nations.id, dswz.joins($1, $2, $3, $4) j, dswz.object_types t WHERE j.child_id = a.id AND t.id = a.type AND j.cancel_time > now();
probin      | -
proargnames |
provolatile | v
proisstrict | f
prosecdef   | t
lanname     | sql
(1 row)
Re: [GENERAL] PostgreSQL vs. InnoDB performance
Peter Eisentraut wrote:
> On a particular system, loading 1 million rows (100 bytes, nothing fancy) into PostgreSQL one transaction at a time takes about 90 minutes. Doing the same in MySQL/InnoDB takes about 3 minutes. InnoDB is supposed to have a similar level of functionality as far as the storage manager is concerned, so I'm puzzled about how this can be. Does anyone know whether InnoDB is taking some kind of questionable shortcuts it doesn't tell me about?

MySQL/InnoDB offers the same knobs to force commits to disk as PostgreSQL does. Look at innodb_flush_log_at_trx_commit and innodb_flush_method: http://dev.mysql.com/doc/mysql/en/innodb-start.html

Jochem
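For reference, these are the settings in question; with the values below InnoDB performs a log flush per commit, which is the durability level comparable to PostgreSQL running with fsync on. A my.cnf sketch (values shown are the defaults as I understand them):

```ini
[mysqld]
# 1 = write and sync the log at every commit (durable, comparable to fsync=on)
# 2 = write at commit, sync ~once per second (can lose ~1s of commits on OS crash)
# 0 = write and sync ~once per second (can lose ~1s even on a mysqld crash)
innodb_flush_log_at_trx_commit = 1
# how data files and log are flushed; alternatives include O_DSYNC
innodb_flush_method = fdatasync
```

If a benchmark shows a 30x speedup with supposedly equivalent durability, checking that the first setting is actually 1 is the obvious first step.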
Re: [GENERAL] Stored Procedures woes
"Andrew Hall" wrote:
> We are using a number of stored procedures that are called often from our client programs. I include one here as an example. The problem we are seeing is that when executing some of these that deal with a large number of records, they begin execution and never return. The process handling the request chews 97% of CPU resources and must be cancelled before it will release. We have tried examining where in the process it stops via the debug output, but this has proven to be fruitless as A. the problem doesn't happen with all the SP's (some other SP's deal with far more data but don't have the problem), and B. it doesn't always happen consistently with the SP's that seem to cause problems. What we do know is that the machine doesn't seem to be low on memory, never returns any error, and it happens regardless of machine or PG version (we've tried 7.4.1 and 7.4.2).

Wouldn't the following query be functionally the same as the procedure you posted (if you fix the rate and the groupid)? If so, does it perform better and how does the explain look?

UPDATE user_sessions
SET cost_bytes_in = a.costIn, cost_bytes_out = a.costOut
FROM (
    SELECT user_id,
           CASE WHEN (hit_bytes_in + miss_bytes_in) * $rate / 1048576 < 0.0001
                THEN 0
                ELSE (hit_bytes_in + miss_bytes_in) * $rate / 1048576
           END AS costIn,
           CASE WHEN (hit_bytes_out + miss_bytes_out) * $rate / 1048576 < 0.0001
                THEN 0
                ELSE (hit_bytes_out + miss_bytes_out) * $rate / 1048576
           END AS costOut
    FROM user_sessions
    WHERE group_id = $groupId
) a
WHERE user_sessions.group_id = $groupId
  AND user_sessions.user_id = a.user_id;

Jochem
Re: [GENERAL] functions/operators with 2 sets as arguments
Tom Lane wrote:
> Jochem van Dieten <[EMAIL PROTECTED]> writes:
>> While trying to create an operator to test if 2 intervals overlap, I ran into the following limitation: ERROR: Functions and operators can take only one set argument (PostgreSQL 7.3.2 on OpenBSD)
>
> It's still there, and is unlikely to go away (I think future development is more likely to go in the direction of deprecating/removing set-valued functions in SELECT expressions than extending the facility). Why would interval overlap testing require such a function? What are you trying to do, exactly? Also, have you looked at the SQL OVERLAPS operator? Your problem might be solved already, if what you are after corresponds to what the SQL authors thought would be useful ...

The SQL OVERLAPS predicate is almost what I want, but I would like to have an operator that returns TRUE if the end of the first interval is the beginning of the second interval. But since 2 sets can't be used, I will fix it by simply adding an extra condition to the query.

Jochem

--
When you don't want to be surprised by the revolution organize one yourself - Loesje
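For the archives, the extra condition can simply be ORed with the standard predicate. A sketch over two literal (start, end) periods; note that OVERLAPS rejects periods that merely touch, which is exactly the case being added back:

```sql
-- True when the two periods share any point in time, OR when one ends
-- exactly where the other begins (the case OVERLAPS alone rejects).
SELECT (s1, e1) OVERLAPS (s2, e2)
       OR e1 = s2   -- first period ends where the second begins
       OR e2 = s1   -- second period ends where the first begins
       AS overlaps_or_touches
FROM (SELECT TIMESTAMP '2006-01-01' AS s1, TIMESTAMP '2006-01-10' AS e1,
             TIMESTAMP '2006-01-10' AS s2, TIMESTAMP '2006-01-20' AS e2) AS t;
-- here OVERLAPS alone is false, but e1 = s2 makes the combined test true
```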
Re: [GENERAL] To Postgres Devs : Wouldn't changing the select limit
Bruce Momjian wrote:
>>> Bruce Momjian writes:
>>> Break the SQL code that has been implemented for prior versions?? Bummer ;((.
>>>> Yes, but we don't follow the MySQL behavior, which we copied when we added LIMIT. Seems we should agree with their implementation.
>>> Isn't it much worse to not follow PostgreSQL behavior than to not follow MySQL behavior?
>
> Another idea: because our historical Limit #,# differs from MySQL, one idea is to disable LIMIT #,# completely and instead print an error stating they have to use LIMIT # OFFSET #. Although that would break both MySQL and old PostgreSQL queries, it would not generate incorrect results.

I would say the relevant behaviour is neither the one that MySQL historically uses nor the one that PostgreSQL historically uses, but the one that is specified in the relevant standards. Since nobody has brought this up yet, I presume these standards leave the implementation of LIMIT open (I tried to google it myself, but I couldn't find it). Is that correct, or do any of the SQL standards specify a behaviour?

Jochem
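To make the ambiguity concrete (the table name is a placeholder): the same two-number form reads differently in the two systems, while the spelled-out form is unambiguous:

```sql
-- MySQL reads this as "skip 5 rows, then return 10":
--   SELECT * FROM items ORDER BY id LIMIT 5,10;
-- historical PostgreSQL read the same text as "return 5 rows, skipping 10".
-- The unambiguous spelling accepted by PostgreSQL is:
SELECT * FROM items ORDER BY id LIMIT 10 OFFSET 5;
```

(For what it's worth, the SQL standard of the day did not define LIMIT at all; a standard spelling, OFFSET ... ROWS FETCH FIRST ... ROWS ONLY, only arrived later in SQL:2008.)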
Re: [GENERAL] ODBC error
Michelle Murrain wrote:
> I'm trying to isolate a problem - using Cold Fusion, and sending to pg a large text block, I'm getting an error:
>
> ODBC Error Code = 08S01 (Communication link failure)
> Error while executing the query; Query string is too long
>
> Is this a Cold Fusion problem, or a postgresODBC driver problem? I know that pg now has no limit on text objects.
>
> Using pg 7.1, on Debian stable, using CF 4.5

What version of the ODBC driver and can you post the CF code?

Jochem
Re: [GENERAL] PostgresQL equivalent of NOCOUNT
Tom Lane wrote:
> Jochem van Dieten <[EMAIL PROTECTED]> writes:
>> Does PostgreSQL have some way to make update, insert and delete queries not return the number of affected rows? I know that in MS SQL one would use NOCOUNT for that.
>
> Uh ... why? Seems like a useless anti-feature. Certainly suppressing the count wouldn't save a noticeable number of cycles.

I am not in it for the cycles, just for the laziness ;) I am currently working with a ColdFusion frontend through ODBC, and ColdFusion is unable to return 2 resultsets for one call to cfquery (the ColdFusion query implementation). In MS SQL I would use the query below to suppress one resultset and return the primary key of the recently inserted record.

SET NOCOUNT ON
INSERT INTO <table> (<columns>) VALUES (<values>)
SELECT @@IDENTITY AS 'Identity'
SET NOCOUNT OFF

I was wondering if something like that is possible in PostgreSQL. I know I can wrap it in a transaction and do a second query or build a procedure to do it, but this would be much easier (presuming I can use currval('primary_key_seq') instead of @@IDENTITY). Any suggestions?

Jochem
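A sketch of the currval() variant mentioned above, assuming a hypothetical table whose serial primary key is backed by the sequence mytable_id_seq:

```sql
BEGIN;
INSERT INTO mytable (name) VALUES ('example');
-- currval() returns the value most recently generated by nextval()
-- in this session, so concurrent inserts by other sessions can't interfere.
SELECT currval('mytable_id_seq') AS id;
COMMIT;
```

(Much later PostgreSQL releases, 8.2 onwards, added INSERT ... RETURNING, which collapses this into a single statement.)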
[GENERAL] PostgresQL equivalent of NOCOUNT
Does PostgreSQL have some way to make update, insert and delete queries not return the number of affected rows? I know that in MS SQL one would use NOCOUNT for that.

TIA,

Jochem