Re: [GENERAL] sslv3 alert illegal parameter
Pedro Doria Meunier wrote:
> My server started spitting this out every second!
> LOG: could not accept SSL connection: sslv3 alert illegal parameter
> PostgreSQL 8.3.7 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33)
> Server key in place.. What can I do about it?!

Several ideas:

"Started spitting": that suggests that this is a new phenomenon. What changes have there been since before it started?

"Every second": try to find out what tries to connect every second. Set log_connections to on and find out the IP address.

Once you have found the culprit, figure out how it tries to connect.

Yours,
Laurenz Albe

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
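For anyone wanting to follow that advice, the logging setup might look like this in postgresql.conf (the prefix format shown is just one possible choice; %h adds the remote host to each log line):

```
# postgresql.conf -- identify what connects every second
log_connections = on
log_line_prefix = '%t [%p] %h '   # timestamp, backend PID, remote host
```

Reload the server (pg_ctl reload) for the changes to take effect, then watch the log for the offending address.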
Re: [GENERAL] Performance problem with low correlation data
Since no one replied to http://www.mail-archive.com/pgsql-general@postgresql.org/msg133360.html, I tried another approach.

I can't cluster the whole table every day; it would take too long (as I said, the table has 60M rows, and I have hundreds of them). Plus, it wouldn't really make much sense: the only portion of the table that needs clustering is the one written after the last CLUSTER command (since no row is ever deleted or updated, only inserted every 15 minutes). So I thought: I'll cluster only the part that has been written each day:

begin;
lock table testinsert in ACCESS EXCLUSIVE MODE;
insert into testinsert
    select ne_id+10, t, v from testinsert
    where t between '2009-08-01 00:00:00' and '2009-08-02 00:00:00'
    order by ne_id, t;
DELETE from testinsert
    where t between '2009-08-01 00:00:00' and '2009-08-02 00:00:00'
    and ne_id < 10;
update testinsert set ne_id = ne_id - 10
    where t between '2009-08-01 00:00:00' and '2009-08-02 00:00:00';
commit;

This would run after midnight on 2009-08-02; the next day would use different time values. What I'm trying to do here is cluster on (ne_id, t) the portion of the table written each day.
Well, I guess the table is laid out as expected, but in pg_stats the correlation for the ne_id column is still VERY low:

select attname, n_distinct, correlation from pg_stats where tablename='testinsert3';

 attname | n_distinct | correlation
---------+------------+-------------
 ne_id   | 2          | 0.111041     <- low value
 t       | 864        | 0.987778
 v       | 1          | 1

This leads the planner to sequential scans of the table as soon as 10% of the table has to be read:

explain select * FROM idtable as g inner join testinsert on id=ne_id
where groupid between 1 and 4
and t between '2009-08-01 00:00:00' and '2009-08-09 00:00:00'

Hash Join  (cost=134.45..2127071.28 rows=614331 width=244)
  Hash Cond: (testinsert3.ne_id = g.id)
  ->  Seq Scan on testinsert  (cost=0.00..2063200.00 rows=15358272 width=236)
        Filter: ((t >= '2009-08-01 00:00:00'::timestamp without time zone) AND (t <= '2009-08-09 00:00:00'::timestamp without time zone))
  ->  Hash  (cost=124.45..124.45 rows=800 width=8)
        ->  Bitmap Heap Scan on idtable g  (cost=24.45..124.45 rows=800 width=8)
              Recheck Cond: ((groupid >= 1) AND (groupid <= 4))
              ->  Bitmap Index Scan on idtable_pk  (cost=0.00..24.25 rows=800 width=0)
                    Index Cond: ((groupid >= 1) AND (groupid <= 4))

Which is a terrible plan! testinsert contains t values between '2009-08-01' and '2009-08-09', and ne_id from 1 to 2. But only 800 out of 2 ne_id have to be read; there's no need for a table scan! I guess this is a reflection of the poor correlation on ne_id; but, as I said, I don't really think ne_id is correlated that badly. In fact, doing a

select ne_id, t from testinsert limit 10

I can see that the data is laid out pretty much by ne_id, t, grouped by day (that is, the same ne_id for one day, then the next ne_id, and so on until the next day). How is the correlation calculated? Can someone explain to me why, after the procedure above, the correlation is so low?
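[Editorial note: ANALYZE's correlation statistic is, roughly speaking, the statistical correlation between a column's value order and the physical order of the sampled rows on disk. A quick way to eyeball the physical layout yourself — a sketch, not what ANALYZE literally runs — is to correlate the column against the heap block number extracted from ctid:]

```sql
-- Rough check of physical ordering for ne_id: correlate the column
-- value with the heap block number taken from each row's ctid.
-- (ctid casts to text as '(block,offset)', which parses as a point.)
SELECT corr(ne_id::double precision,
            (ctid::text::point)[0]) AS ne_id_vs_block
FROM testinsert;
```

A value near 1 means the values climb steadily through the file; a layout that is grouped by day first and ne_id second can still come out with a low overall correlation, which would be consistent with what pg_stats reports here.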
Re: [GENERAL] c++ program to connect to postgre database
ramon rhey serrano wrote:
> Hi Sir John,
> Thanks for the links and reply. I downloaded this libpqxx-3.0 but i really don't know what to do with the files and where to put them. I'm still having a hard time figuring out how to start the C++ program using Dev C++ as my IDE: what files do i need to install, what headers to use, how the basic flow of the program should look (maybe for a simple database connection). I already installed PostgreSQL version 1.8.4 on my PC (windows). I have a basic understanding and background in C++ programming, but am totally new to the PostgreSQL database system.

as I said earlier, I'd probably just use C API calls to libpq, and do my own C++ wrappings. C++ libraries pretty much have to be compiled for the specific compiler environment, so I'd have to assume you'd load the libpqxx project source into your IDE and build it.

with libpq, the OS should already have the .so (or .dll if you're on windows) and .h files, so you'd simply #include "libpq-fe.h" and call the various functions, using :: in front of the calls to indicate they are not part of a class. you might have to tell the IDE where the libpq .h and .so files are if they aren't in standard system library locations like /usr/include and /usr/lib.

I've never used that IDE so I have no idea of the specifics of how you configure it with new libraries or include files.
[GENERAL] pg_dump PostgreSQL 8.4
hi! i am trying to migrate my database from 8.3.7 to 8.4 (both versions running on the same server: windows 2003, hp ml350 quad core xeon) using pg_dump and restore using pgAdmin III R 1.10.0; this i was able to do. but i cannot seem to make a backup on 8.4. i keep getting the following error message:

pg_dump: reading user-defined tables
pg_dump: SQL command failed
pg_dump: Error message from server: ERROR: column reltriggers does not exist
LINE 1: ...oles WHERE oid = relowner) as rolname, relchecks, reltrigger...
                                                             ^
pg_dump: The command was: SELECT c.tableoid, c.oid, relname, relacl, relkind, relnamespace, (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = relowner) as rolname, relchecks, reltriggers, relhasindex, relhasrules, relhasoids, d.refobjid as owning_tab, d.refobjsubid as owning_col, (SELECT spcname FROM pg_tablespace t WHERE t.oid = c.reltablespace) AS reltablespace, array_to_string(c.reloptions, ', ') as reloptions from pg_class c left join pg_depend d on (c.relkind = 'S' and d.classid = c.tableoid and d.objid = c.oid and d.objsubid = 0 and d.refclassid = c.tableoid and d.deptype = 'a') where relkind in ('r', 'S', 'v', 'c') order by c.oid
pg_dump: *** aborted because of error

Process returned exit code 1.

is there any configuration setting i need to do? any help would be appreciated.

regards,
raffy segador
destiny cable inc
ncr, philippines
Re: [GENERAL] pg_dump PostgreSQL 8.4
What version of pg_dump and pgAdmin are you using? In PostgreSQL version 8.3 it's reltriggers; in version 8.4 it's relhastriggers. Just run the next query to see the difference:

SELECT * FROM pg_class;

Using pgAdmin 1.10, I don't have any problems with 8.3 or 8.4.

Regards,
Frank

On 9 Jul 2009, at 10:24, serafin g.segador wrote:
> hi! i am trying to migrate my database from 8.3.7 to 8.4 (both versions running on the same server windows 2003, hp ml350 quad core xeon) using pg_dump and restore using pgAdmin III R 1.10.0, this i was able to do. but i cannot seem to make a backup on 8.4. i keep getting the following error message:
>
> pg_dump: reading user-defined tables
> pg_dump: SQL command failed
> pg_dump: Error message from server: ERROR: column reltriggers does not exist
> [...]
> is there any configuration setting i need to do. any help would be appreciated.
> regards
> raffy segador

Frank Heikens
frankheik...@mac.com
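[Editorial note: instead of selecting all of pg_class, a narrower way to confirm which catalog column a given server has is to look it up in pg_attribute:]

```sql
-- Returns 'reltriggers' on an 8.3 server and 'relhastriggers' on 8.4.
SELECT attname
FROM pg_attribute
WHERE attrelid = 'pg_catalog.pg_class'::regclass
  AND attname IN ('reltriggers', 'relhastriggers');
```

Seeing the error "column reltriggers does not exist" against an 8.4 server is the classic symptom of running an older pg_dump binary against a newer server.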
[GENERAL] Table replication
Hi Members,

Is there any way to sync a single table across databases? We need one particular table to be identical in all DBs of a single Postgres instance. Presently we are doing this with the help of UPDATE, INSERT and DELETE triggers. Is there a better way to do it?

Arvind S
Re: [GENERAL] c++ program to connect to postgre database
Hi Rhey,

If you give Qt4 a try I might be able to help you there ;-)

Regards,
Pedro Doria Meunier
GSM: +351 96 17 20 188
Skype: pdoriam

ramon rhey serrano wrote:
> Hello,
> I'm thinking of making a program in C++ (API) that will connect, read, and write to a PostgreSQL database. I honestly have no idea where to start and how to implement it in C++ using the Dev C++ IDE. Any help would be greatly appreciated. Thanks.
> Sincerely,
> Rhey
Re: [GENERAL] sslv3 alert illegal parameter [SOLVED]
Hi Albe,

Thanks for replying. I did all the necessary mods to postgresql.conf and put as much info as I could in the logs. The 'culprit' was localhost, with no discernible reason whatsoever... :O

Anyway, after exhausting all possibilities, and learning that openssl 0.9.8b has memory leak issues :-(, the only thing left for me was restarting the server. And go figure: it went away on its own! :O

So, as much as I'd love to know the reason, I guess I'll just be happy that the thing resolved itself, albeit with a sort of sword-dangling-over-my-head feeling...

Best regards,
Pedro Doria Meunier
GSM: +351 96 17 20 188
Skype: pdoriam

Albe Laurenz wrote:
> Pedro Doria Meunier wrote:
>> My server started spitting this out every second!
>> LOG: could not accept SSL connection: sslv3 alert illegal parameter
>> PostgreSQL 8.3.7 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33)
>> Server key in place.. What can I do about it?!
>
> Several ideas:
> "Started spitting": that suggests that this is a new phenomenon. What changes have there been since before it started?
> "Every second": try to find out what tries to connect every second. Set log_connections to on and find out the IP address.
> Once you have found the culprit, figure out how it tries to connect.
>
> Yours,
> Laurenz Albe
Re: [GENERAL] pg_dump PostgreSQL 8.4
i found where the error is. thanks for the tip. i have versions 8.3.7 and 8.4 of PostgreSQL running on the same server, as well as versions 1.8 and 1.10 of pgAdmin. although i run the backup routine for pg 8.4 from pgAdmin 1.10, pgAdmin uses PostgreSQL\8.3\bin\pg_dump by default. i now run the backup routine from the command-line utility, pointing at the correct version.

thank you.

regards,
raffy

On Thu, 09 Jul 2009 17:03:04 +0800, Frank Heikens frankheik...@mac.com wrote:
> What version of pg_dump and pgAdmin are you using? In PostgreSQL version 8.3 it's reltriggers, in version 8.4 it's relhastriggers. Just run the next query to see the difference: SELECT * FROM pg_class;
> Using pgAdmin 1.10, I don't have any problems with 8.3 or 8.4.
> [...]
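[Editorial note: the command-line approach raffy describes might look like this on Windows; the paths, port, and database name below are hypothetical and need adjusting to the real installation:]

```
rem Point explicitly at the 8.4 pg_dump instead of the 8.3 one on PATH.
rem -F c writes pg_dump's custom archive format, restorable with pg_restore.
"C:\Program Files\PostgreSQL\8.4\bin\pg_dump" -h localhost -p 5432 -U postgres -F c -f mydb.backup mydb
```

Running pg_dump --version first is a cheap way to confirm which binary you actually invoked.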
Re: [GENERAL] pg_dump PostgreSQL 8.4
On Thu, Jul 9, 2009 at 11:06 AM, serafin segador rsega...@mydestiny.net wrote:
> i found where the error is. thanks for the tip. i have versions 8.3.7 and 8.4 of PostgreSQL running on the same server, as well as versions 1.8 and 1.10 of pgAdmin. although i run the backup routine for pg8.4 from pgAdmin 1.10, pgAdmin uses PostgreSQL\8.3\bin\pg_dump as default. i now run the backup routine from the command line utility pointing to the correct version.

Check the PG bin path option under File -> Options. It also sounds like you disabled the version match check on that dialogue; otherwise pg_dump 8.3 should have complained about being used with 8.4.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com
Re: [GENERAL] Table replication
S Arvind wrote:
> Hi Members, Is there any way to sync a single table across the DBs. We need a single table alone to be equal in all DBs in the single postgres. Presently we are doing this with the help of update, insert, delete triggers. Is there any other best way for that.

Most trigger-based replication systems should work fine with one or more tables: Slony, Londiste etc. That's assuming a single master DB that receives all updates.

--
Richard Huxton
Archonet Ltd
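[Editorial note: within a single database, the hand-rolled trigger approach the original poster describes might look roughly like the sketch below. Table, schema, and column names here are made up; note that pushing rows into a *different* database, rather than a different schema, cannot be done from a plain trigger and needs dblink or one of the replication tools named above.]

```sql
-- Sketch: mirror every change on shared_table into mirror.shared_table,
-- assuming both tables have the same columns and an 'id' primary key.
CREATE OR REPLACE FUNCTION sync_shared_table() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO mirror.shared_table VALUES (NEW.*);
    ELSIF TG_OP = 'UPDATE' THEN
        DELETE FROM mirror.shared_table WHERE id = OLD.id;
        INSERT INTO mirror.shared_table VALUES (NEW.*);
    ELSIF TG_OP = 'DELETE' THEN
        DELETE FROM mirror.shared_table WHERE id = OLD.id;
    END IF;
    RETURN NULL;  -- return value of an AFTER trigger is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER shared_table_sync
AFTER INSERT OR UPDATE OR DELETE ON shared_table
FOR EACH ROW EXECUTE PROCEDURE sync_shared_table();
```

The main drawbacks of doing this by hand are exactly what Slony/Londiste solve: ordering, failure handling, and initial synchronisation of the copies.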
Re: [GENERAL] c++ program to connect to postgre database
John R Pierce wrote:
> ramon rhey serrano wrote:
>> Hi Sir John, Thanks for the links and reply. I downloaded this libpqxx-3.0 but i really don't know what to do with the files and where to put them. [...]
>
> as I said earlier, I'd probably just use C API calls to libpq, and do my own C++ wrappings. C++ libraries pretty much have to be compiled for the specific compiler environment, so I'd have to assume you'd load the libpqxx project source into your IDE and build it.

The best C++ library I came across so far is SOCI: http://soci.sourceforge.net/

Maybe this could be a good starting point for Ramon since he is familiar with C++.

Rainer
Re: [GENERAL] c++ program to connect to postgre database
I'm a big libpqxx user, and can confirm that it's a great wrapper, but good luck getting it to build in Dev-C++, which hasn't had a new release in 4 years and isn't supported by pqxx. I'm not sure of the specifics, but Dev-C++ uses a MinGW port of GCC. libpqxx does support MinGW, but probably not the Dev-C++ flavour.

I suggest you download Visual Studio 2008 Express, and use that to build libpqxx. No one uses Dev-C++ anymore anyway.

Regards,
Peter Geoghegan
Re: [GENERAL] ZFS prefetch considered evil?
On Jul 9, 2009, at 3:53 AM, Yaroslav Tykhiy wrote: On 08/07/2009, at 8:39 PM, Alban Hertroys wrote: On Jul 8, 2009, at 2:50 AM, Yaroslav Tykhiy wrote: IIRC prefetch tries to keep data (disk blocks?) in memory that it fetched recently. What you described is just a disk cache. And a trivial implementation of prefetch would work as follows: An application or other file/disk consumer asks the provider (driver, kernel, whatever) to read, say, 2 disk blocks worth of data. The provider thinks, I know you are short-sighted; I bet you are going to ask for more contiguous blocks very soon, so it schedules a disk read for many more contiguous blocks than requested and caches them in RAM. For bulk data applications such as file serving this trick works as a charm. But other applications do truly random access and they never come back after the prefetched blocks; in this case both disk bandwidth and cache space are wasted. An advanced implementation can try to distinguish sequential and random access patterns, but in reality it appears to be a challenging task. Ah yes, thanks for the correction, I now remember reading about that before. Makes the name 'prefetch' that more fitting, doesn't it? And as you say, it's not that useful a feature with random access (hadn't thought about that); in fact, I can imagine that it might delay moving the disk-heads to the next desired (random) position as the FS is still requesting data that it isn't going to be needing (except for some lucky cases) - unless it manages to detect the randomness of the access patterns. You can't predict randomness from just read requests of course, you don't know about the requests that are still to come. You can however assume something like that is the case if historic requests turned out to be random by nature, but then you'd want to know for which area of the FS this is the case. 
I don't know how you partitioned your zpools, but to me it seems like it'd be preferable to have the PostgreSQL tablespaces (and possibly other data that's likely to be accessed randomly) in a separate zpool from the rest of the system so you can restrict disabling prefetch to just that file-system. You probably already did that... It could be interesting to see how clustering the relevant tables would affect the prefetch performance, I'd expect disk access to be less random that way. It's probably still better to disable prefetch though. ZFS uses quite a bit of memory, so if you distributed all your memory to be used by just postgres and disk cache then you didn't leave enough space for the prefetch data and _something_ will be moved to swap. I hope you know that FreeBSD is exceptionally good at distributing available memory between its consumers. That said, useless prefetch indeed puts extra pressure on disk cache and results in unnecessary cache evictions, thus making things even worse. It is true that ZFS is memory hungry and so rather sensitive to non-optimal memory use patterns. Useless prefetch wastes memory that could be used to speed up other ZFS operations. Yes, I do know that, it's one of the reasons I prefer it over other OSs. The keyword here was 'available memory' though, under the assumption that something was hitting swap. But apparently that wasn't the case. You'll probably want to ask about this on the FreeBSD mailing lists as well, they'll know much better than I do ;) Are you a local FreeBSD expert? ;-) Jokes apart, I don't think this topic has to do with FreeBSD as such; it is mostly about making the advanced technologies of Postgresql and ZFS go well together. Even ZFS developers admit that in database related applications exceptions from general ZFS practices and rules may be called for. I wouldn't call myself an expert, I just use it on a few systems at home and am more a user than an administrator. 
I do read the stable/current mailing lists though (since 2004 according to my mail client) and keep an eye on (among others) the ZFS discussions, as I feel tempted to change my gmirrors into zpools some day. It certainly looks like an interesting FS, very flexible and reliable.

Alban Hertroys

--
If you can't see the forest for the trees, cut the trees and you'll see there is no forest.
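[Editorial note: for reference, on FreeBSD the prefetch switch discussed in this thread is a loader tunable, and the "separate filesystem for the database" suggestion can be combined with a record size matched to PostgreSQL's 8K page. The pool and dataset names below are illustrative only:]

```
# /boot/loader.conf -- disable ZFS file-level prefetch system-wide
vfs.zfs.prefetch_disable=1

# Give the PostgreSQL data directory its own dataset so tuning stays
# local to it, e.g. (run as root):
#   zfs create -o recordsize=8k tank/pgdata
```

Disabling prefetch is global, which is why keeping randomly-accessed database data on its own dataset mainly helps with the per-dataset knobs (recordsize, caching) rather than with prefetch itself.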
Re: [GENERAL] Bug in ecpg lib ?
Hello Laurenz,

Thank you for your very thorough walk through the 'ecpg use' of threads with respect to the sqlca. It was very clear and specific. I reproduced what you did almost exactly as you have done, and I could then also play around with things to see what happens 'if'... I have learned much about threads and ecpg, which I'm sure will be very helpful. Also, I'm afraid I have to agree with you that it must be a mudflap flop ;-) ... unfortunately, because now I'm back to the real problem in the larger program and how to track that error. I'm pleased that it wasn't an ecpg bug, and I know now not to use mudflap for tracking my problem. Thanks for your big effort on this,

Leif

----- Albe Laurenz laurenz.a...@wien.gv.at wrote:

l...@crysberg.dk wrote: I have been trying to figure this thing out myself too, breakpointing and single-stepping my way through some of the ecpg code, but without much clarification. (More that I learned new things about pthread.) I have been trying to figure out whether this is a real thing or more a mudflapth misjudgement. Also, on most (the faster) machines mudflap complains either about an invalid pointer in free() or a double free() or corruption. I haven't been able to verify this yet. Specifically on one (slower) machine, I have only seen this mudflapth complaint once, though I have been both running and debugging it there many times. Are you sure what you suggest is nonsense? In the light of the sqlca struct being local to each thread? I tried to put the open and close connection within the thread, but I was still able to get the mudflap complaint. Theoretically, I guess one could use just 1 connection for all db access in all threads, just having them enclosed within pthread_mutex_[un]lock()s!? (Not what I do, though.)

The sqlca is local to each thread, but that should not be a problem.
On closer scrutiny of the source, it works like this: Whenever a thread performs an SQL operation, it will allocate an sqlca in its thread-specific data area (TSD) in the ECPG function ECPGget_sqlca(). When the thread exits or is cancelled, the sqlca is freed by pthread by calling the ECPG function ecpg_sqlca_key_destructor(). pthread makes sure that each destructor function is only called once per thread. So when several threads use a connection, there will be several sqlca's around, but that should not matter as they get freed when the thread exits. After some experiments, I would say that mudflap's complaint is a mistake. I've compiled your program against a debug-enabled PostgreSQL 8.4.0 with $ ecpg crashex $ gcc -Wall -O0 -g -o crashex crashex.c -I /magwien/postgres-8.4.0/include \ -L/magwien/postgres-8.4.0/lib -lecpg -Wl,-rpath,/magwien/postgres-8.4.0/lib and run a gdb session: $ gdb GNU gdb Red Hat Linux (6.3.0.0-1.138.el3rh) Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type show copying to see the conditions. There is absolutely no warranty for GDB. Type show warranty for details. This GDB was configured as i386-redhat-linux-gnu. Set the program to be debugged: (gdb) file crashex Reading symbols from /home/laurenz/ecpg/crashex...done. Using host libthread_db library /lib/tls/libthread_db.so.1. This is where the source of libecpg is: (gdb) dir /home/laurenz/rpmbuild/BUILD/postgresql-8.4.0/src/interfaces/ecpg/ecpglib Source directories searched: /home/laurenz/rpmbuild/BUILD/postgresql-8.4.0/src/interfaces/ecpg/ecpglib:$cdir:$cwd Start the program (main thread): (gdb) break main Breakpoint 1 at 0x804892c: file crashex.pgc, line 54. 
(gdb) run Starting program: /home/laurenz/ecpg/crashex [Thread debugging using libthread_db enabled] [New Thread -1218572160 (LWP 29290)] [Switching to Thread -1218572160 (LWP 29290)] Breakpoint 1, main (argc=1, argv=0xbfffce44) at crashex.pgc:54 54 PerformTask( 25 ); (gdb) delete Delete all breakpoints? (y or n) y Set breakpoint #2 in the function where sqlca is freed: (gdb) break ecpg_sqlca_key_destructor Breakpoint 2 at 0x457a27: file misc.c, line 124. (gdb) list misc.c:124 119 120 #ifdef ENABLE_THREAD_SAFETY 121 static void 122 ecpg_sqlca_key_destructor(void *arg) 123 { 124 free(arg); /* sqlca structure allocated in ECPGget_sqlca */ 125 } 126 127 static void 128 ecpg_sqlca_key_init(void) Set breakpoint #3 where a new sqlca is allocated in ECPGget_sqlca(): (gdb) break misc.c:147 Breakpoint 3 at 0x457ad2: file misc.c, line 147. (gdb) list misc.c:134,misc.c:149 134 struct sqlca_t * 135 ECPGget_sqlca(void) 136 { 137 #ifdef ENABLE_THREAD_SAFETY 138 struct sqlca_t *sqlca; 139 140
[GENERAL] SELECT DISTINCT very slow
Hi,

Can anybody explain this:

Records: 600,000
Field Width: Approximately 20 UTF8 characters - type is VARCHAR(25)
Field is Indexed.

SELECT DISTINCT field FROM table;

Takes about 6 seconds. There are 111 distinct items. On Sqlite, and another place where I have a B+Tree, this query is faster than my eye can measure.

Is this a well known issue?

Thanks,
Ben
[GENERAL] Asking for assistance in determining storage requirements
Your assistance is appreciated. I have a question regarding disk storage for Postgres servers.

We are thinking long term about scalable storage and performance and would like some advice or feedback about what other people are using. We would like to get as much performance from our file systems as possible.

We use IBM 3650 quad-processor servers with an onboard SAS controller (3GB/sec) and 15,000rpm drives. We use RAID 1 for the CentOS operating system and the WAL archive logs. The Postgres database is on 5 drives configured as RAID 5 with a global hot spare.

We are curious about using a SAN with fibre channel HBAs and whether anyone else uses this technology. We would also like to know if people have a preference for the level of RAID, with or without striping.

Sincerely,
Chris Barnes
Recognia Inc.
Senior DBA
[GENERAL] PostgreSQL 8.4 Window functions
Hi all,

I am trying to calculate an 8-hour moving average using the new window functions, without success. Here is what I am trying to do:

-- create test table
CREATE TABLE temperatures (
    fulldate timestamp NOT NULL PRIMARY KEY,
    value numeric
);

-- inserts
INSERT INTO temperatures
select '2009-07-01 00:00:00'::timestamp + interval '1 hour' * s.a as fulldate,
       round(cast(random() as numeric), 1) as value
from generate_series(0,23) as s(a);

-- selects
select * from temperatures order by fulldate;

-- window function
SELECT fulldate, value, avg(value) OVER () FROM temperatures ORDER BY fulldate;
SELECT fulldate, value, avg(value) OVER (ORDER BY fulldate) FROM temperatures ORDER BY fulldate;
SELECT fulldate, value, round(avg(value) OVER (ORDER BY fulldate RANGE UNBOUNDED PRECEDING), 2) as value FROM temperatures ORDER BY fulldate;

-- not supported
SELECT fulldate, value, round(avg(value) OVER (ORDER BY fulldate RANGE -8 PRECEDING), 2) as value FROM temperatures ORDER BY fulldate;

Is there any way to PARTITION on a subset of rows (in this case 8)?

-- expected result --
date time             value   moving-average
2009-07-01 00:00:00   0,3
2009-07-01 01:00:00   0,1
2009-07-01 02:00:00   0,5
2009-07-01 03:00:00   0,1
2009-07-01 04:00:00   0,2
2009-07-01 05:00:00   0,7
2009-07-01 06:00:00   0,9
2009-07-01 07:00:00   0,7     0,44
2009-07-01 08:00:00   0       0,4
2009-07-01 09:00:00   0,9     0,5
2009-07-01 10:00:00   0,8     0,54
2009-07-01 11:00:00   0,4     0,58
2009-07-01 12:00:00   0,6     0,63
2009-07-01 13:00:00   0,4     0,59
2009-07-01 14:00:00   0,7     0,56
2009-07-01 15:00:00   0,2     0,5
2009-07-01 16:00:00   0,2     0,53
2009-07-01 17:00:00   0,5     0,48
2009-07-01 18:00:00   0,7     0,46
2009-07-01 19:00:00   0       0,41
2009-07-01 20:00:00   0,4     0,39
2009-07-01 21:00:00   0,9     0,45
2009-07-01 22:00:00   0,4     0,41
2009-07-01 23:00:00   0,7     0,48
0,51 0,52 0,48 0,6 0,67 0,55 0,7

Thanks in advance,
Paolo Saudin
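[Editorial note: if I remember correctly, 8.4's window frames only allow UNBOUNDED PRECEDING / CURRENT ROW as endpoints; the "n PRECEDING" form the poster is reaching for arrived later. A workaround that should run on 8.4, assuming exactly one row per hour as in the test data, is a correlated subquery over the trailing 8-hour interval:]

```sql
-- Trailing 8-hour average without a "7 PRECEDING" frame:
-- for each row, average the rows in the preceding 8-hour window.
SELECT t.fulldate,
       t.value,
       round((SELECT avg(t2.value)
              FROM temperatures t2
              WHERE t2.fulldate >  t.fulldate - interval '8 hours'
                AND t2.fulldate <= t.fulldate), 2) AS moving_avg
FROM temperatures t
ORDER BY t.fulldate;
```

The first seven rows average over fewer than eight samples; wrap the subquery in a CASE on row count if you want NULL there instead, as in the expected output above.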
Re: [GENERAL] [Password?]
Hi, Thank you all for your kind responses. Things however aren't falling in place. Will take a short break, rework, and get back. Probably, with a new problem! Thanks again, Regards Swati
Re: [GENERAL] SELECT DISTINCT very slow
In response to Ben Harper rogo...@gmail.com:
> Hi, Can anybody explain this: Records: 600,000 Field Width: Approximately 20 UTF8 characters - type is VARCHAR(25) Field is Indexed. SELECT DISTINCT field FROM table; Takes about 6 seconds. There are 111 distinct items.

What's the output of EXPLAIN ANALYZE SELECT DISTINCT field FROM table;? Does a VACUUM ANALYZE of the table help? Is the query significantly faster the second time you run it?

> Is this a well known issue?

Not that I'm aware of.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/
Re: [GENERAL] SELECT DISTINCT very slow
On Thursday 09 July 2009 17:09:13 Ben Harper wrote:
> Hi, Can anybody explain this: Records: 600,000 Field Width: Approximately 20 UTF8 characters - type is VARCHAR(25) Field is Indexed. SELECT DISTINCT field FROM table; Takes about 6 seconds. There are 111 distinct items. On Sqlite, and another place where I have a B+Tree, this query is faster than my eye can measure. Is this a well known issue?

Yes, I think so. AFAIK the primary cause is that indexes in pg do not store visibility information, which means you need to check for the existence of each tuple on the heap. Possibly because of that, PG has no special-case code for DISTINCT to answer such a query using mostly the index. It would be possible, for each possible value of 'field', to check the index only long enough to prove that there is at least one such entry.

Taking that single field into its own table is not possible?

Andres
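[Editorial note: the "check the index only long enough to prove there is at least one such entry" idea can be emulated by hand on 8.4, which introduced WITH RECURSIVE. The sketch below, a variant of the well-known "loose index scan" trick, does one index probe per distinct value instead of scanning all 600,000 rows; table and column names match the poster's description:]

```sql
-- Emulated loose index scan: walk from each distinct value of "field"
-- to the next-larger one via the index, one probe per distinct value.
WITH RECURSIVE t AS (
    (SELECT field FROM tbl ORDER BY field LIMIT 1)
    UNION ALL
    SELECT (SELECT field FROM tbl
            WHERE field > t.field
            ORDER BY field
            LIMIT 1)
    FROM t
    WHERE t.field IS NOT NULL
)
SELECT field FROM t WHERE field IS NOT NULL;
```

With only 111 distinct values in the index, this should be close to the instant response the poster sees from SQLite's B+Tree.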
Re: [GENERAL] SELECT DISTINCT very slow
Hello, when you use a pg older than 8.4, please use GROUP BY: SELECT field FROM table GROUP BY field. Regards Pavel Stehule 2009/7/9 Ben Harper rogo...@gmail.com: Hi, Can anybody explain this: Records: 600,000 Field Width: Approximately 20 UTF8 characters - type is VARCHAR(25) Field is Indexed. SELECT DISTINCT field FROM table; Takes about 6 seconds. There are 111 distinct items. On Sqlite, and another place where I have a B+Tree, this query is faster than my eye can measure. Is this a well known issue? Thanks, Ben -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Checkpoint Tuning Question
As Greg commented upthread, we seem to be getting forced to the conclusion that the initial buffer scan in BufferSync() is somehow causing this. There are a couple of things it'd be useful to try here: * see how the size of the hiccup varies with shared_buffers; I tried decreasing shared buffers - both 25MB and 50MB were too small for my load - I had slow queries at all times. So then I increased it from what I was using - 100MB - to 500MB - and the hiccup roughly doubles in length. At 100MB, the hiccup is about 2-3 seconds long. At 500MB, the hiccup is about 6 seconds long. * try inserting a delay into that scan loop, as per attached quick-and-dirty patch. (Numbers pulled from the air, but we can worry about tuning after we see if this is really where the problem is.) After finally getting this particular system into a state where I could build postgres (I was using the binary install) I built a 8.3.4, using your patch - but I didn't see any change in the behaviour. I see hiccups that appear to be the same length as I saw on the binary build of 8.3.4. Thanks, Dan -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Table replication
On Jul 9, 7:53 pm, d...@archonet.com (Richard Huxton) wrote: S Arvind wrote: Hi Members, Is there any way to sync a single table across the DBs. We need a single table alone to be equal in all DBs in the single postgres. Presently we are doing this with the help of Update, insert, delete triggers. Is there any better way to do that? Most trigger-based replication systems should work fine with one or more tables: slony, londiste etc. That's assuming a single master DB that receives all updates. -- Richard Huxton Archonet Ltd -- Sent via pgsql-general mailing list (pgsql-gene...@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general And tools like bucardo or rubyrep can do this in a master-master replication setup. Regards, Arndt Lehmann -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] is autovacuum recommended?
Hi, Whenever i start a big action, like inserting millions of recs or doing a large update, the autovacuum fires on top of that. It has some adverse effects on performance when i need it most. More than once a postgres service crashed on me because of it. Sure, it had too little memory, but it wouldn't have happened if it wasn't for the autovacuum. Should I keep autovacuum on, or is it better to run a script like: vacuumdb -az (daily except one day) vacuumdb -azf (once a week) My database is growing, so really freeing up space isn't such an issue. What I want is optimal performance. Of course autovacuum is more generic, but for me i think the daily scheme will be sufficient. A connection told me it would be better to enable the autovacuum, because it does more than the above script. Can anyone verify that? Cheers, WBL -- Patriotism is the conviction that your country is superior to all others because you were born in it. -- George Bernard Shaw
Re: [GENERAL] is autovacuum recommended?
Willy-Bas Loos escribió: Hi, Whenever i start a big action, like inserting millions of recs or doing a large update, the autovacuum fires on top of that. It has some adverse effects on performance when i need it most. More than once a postgres service crashed on me because of it. Sure, it had too little memory, but it wouldn't have happened if it wasn't for the autovacuum. 1. the server should never crash. If it does you have another problem that perhaps is triggered by the high load. Investigate that. 2. what PG version are you running? What operating system? 3. You can turn autovacuum off for particular tables. This would allow you to have it process most tables, and manually vacuum only the specific tables that are causing you grief. Should ik keep autovacuum on, or is it better to run a script like: vacuumdb -az (daily except one day) vacuumdb -azf (once a week) 4. VACUUM FULL is (almost) never needed unless your server is not properly tuned. If you find a need for it, it's a symptom that you need to tweak something somewhere. Need more details about the problem to give you more advice. 2a. Upgrading to 8.4 may give you several benefits in this area. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
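A sketch of point 3 (per-table autovacuum control); big_history is a hypothetical table name, and the mechanism depends on the server version:

```sql
-- 8.4 and later: a relation storage parameter
ALTER TABLE big_history SET (autovacuum_enabled = false);

-- 8.3 and earlier: a row in the pg_autovacuum system catalog
-- (-1 in each threshold/cost column means "use the global default")
INSERT INTO pg_autovacuum
VALUES ('big_history'::regclass, false, -1, -1, -1, -1, -1, -1, -1, -1);
```

Autovacuum then skips that table, and the application can vacuum it on its own schedule.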
Re: [GENERAL] is autovacuum recommended?
Hi, On Thursday 09 July 2009 19:25:15 Willy-Bas Loos wrote: Whenever i start a big action, like inserting millions of recs or doing a large update, the autovacuum fires on top of that. You can configure autovacuum to use fewer resources. http://www.postgresql.org/docs/current/static/runtime-config-autovacuum.html It has some adverse effects on performance when i need it most. More than once a postgres service crashed on me because of it. Sure, it had too little memory, but it wouldn't have happened if it wasn't for the autovacuum. That sounds like misconfiguration. PG got OOM killed? It is also possible that your max_fsm_pages was too small, so autovacuum had to start all over. My database is growing, so really freeing up space isn't such an issue. What I want is optimal performance. Well, not doing vacuum regularly can lead to significant bloat which makes lookups (and thus updates/deletes) much slower. Of course autovacuum is more generic, but for me i think the daily scheme will be sufficient. That may be. Hard to say without knowing more though. How fast are your tables changing? Inserts only or also updates/deletes? A connection told me it would be better to enable the autovacuum, because it does more than the above script. Can anyone verify that? I don't think it really does more. vacuumdb -azf (once a week) Using -f is normally not a good idea. It takes an exclusive lock on the tables - so you cannot do anything in those tables. Which version of PG are you using? Another possibility would be to upgrade to 8.4 - keeping track of free space works quite differently there. Andres -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
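The cost-based throttling Andres points to lives in postgresql.conf; the values below are illustrative only and would have to be tuned against the actual load:

```
autovacuum_vacuum_cost_delay = 50ms  # sleep between I/O bursts (default is 20ms)
autovacuum_vacuum_cost_limit = 100   # smaller work quantum before each sleep
autovacuum_max_workers = 2           # 8.3+: cap concurrent autovacuum workers
```

Raising the delay and lowering the limit makes autovacuum gentler (and slower), which trades vacuum latency for foreground throughput.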
Re: [GENERAL] Performance problem with low correlation data
m_li...@yahoo.it wrote: testinsert contains t values between '2009-08-01' and '2009-08-09', and ne_id from 1 to 2. But only 800 out of 2 ne_id have to be read; there's no need for a table scan! I guess this is a reflection of the poor correlation on ne_id; but, as I said, I don't really think ne_id is so badly correlated. In fact, doing a select ne_id, t from testinsert limit 10 I can see that data is laid out pretty much by ne_id, t, grouped by day (that is, same ne_id for one day, then next ne_id and so on until next day). How is the correlation calculated? Can someone explain to me why, after the procedure above, correlation is so low??? Did you run ANALYZE after the procedure above? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
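For what it's worth, the correlation the planner uses is computed by ANALYZE from a random sample of the table, so statistics gathered before the reorganization will not reflect it. A quick way to check, against the pg_stats view:

```sql
ANALYZE testinsert;
SELECT attname, correlation
FROM pg_stats
WHERE tablename = 'testinsert' AND attname IN ('ne_id', 't');
```

If the correlation for ne_id jumps after a fresh ANALYZE, stale statistics were the whole story.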
Re: [GENERAL] is autovacuum recommended?
In response to Willy-Bas Loos willy...@gmail.com: Whenever i start a big action, like inserting millions of recs or doing a large update, the autovacuum fires on top of that. It has some adverse effects on performance when i need it most. More than once a postgres service crashed on me because of it. Sure, it had too little memory, but it wouldn't have happened if it wasn't for the autovacuum. If autovacuum is interfering with performance, you have two basic choices: disable autovac or add hardware (assuming your system is already optimally tuned). Should I keep autovacuum on, or is it better to run a script like: vacuumdb -az (daily except one day) vacuumdb -azf (once a week) VACUUM FULL is seldom a good choice for a scheduled activity. That being said, there are some corner use cases where a regular VACUUM FULL is a good idea. Also, it's likely that a daily vacuum won't be enough. If it were, autovac wouldn't be kicking off as often as it does. Once you start vacuuming multiple times per day, you're back to interfering with performance again. However, if you're doing it on a controlled schedule, it's possible that you can schedule it in such a way that it's less intrusive. Overall, however, you're probably going to have to solve the problem by adding hardware, or better tuning your system. My database is growing, so really freeing up space isn't such an issue. What I want is optimal performance. If you _only_ do inserts, then vacuum isn't required at all. If you do _ANY_ UPDATE or DELETE operations, then you still need vacuum or your filesystem will fill up with data that's no longer used. If autovacuum is kicking off, then you need to do vacuums, as autovac isn't random, it uses statistics on the usage of tables to determine if it's needed. You can also manually vacuum as part of your application. For example, vacuum a table manually after a bunch of UPDATEs. 
If you do this intelligently, you can leave autovacuum enabled, and it will only vacuum tables that aren't done manually. Of course autovacuum is more generic, but for me i think the daily scheme will be sufficient. Unlikely. Unless your database sees very few updates, daily vacuum probably isn't often enough. However, your usage pattern will dictate that. A connection told me it would be better to enable the autovacuum, because it does more than the above script. Can anyone verify that? Autovacuum is smarter -- it won't vacuum tables that don't need it, whereas the above script vacuums everything, even if it doesn't need it. -- Bill Moran http://www.potentialtech.com http://people.collaborativefusion.com/~wmoran/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
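The "vacuum as part of your application" idea can be as simple as the following sketch (orders and its columns are made-up names):

```sql
-- A bulk UPDATE leaves one dead row version per updated row behind
UPDATE orders SET status = 'archived' WHERE placed_at < '2009-01-01';

-- Reclaim that space immediately and refresh planner statistics,
-- instead of waiting for autovacuum to notice the churn
VACUUM ANALYZE orders;
```

Running the vacuum right after the batch keeps it off the critical path, since you already know the table is quiet at that point.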
[GENERAL] constraint checking on partitions
I have several partitions on a history table that are partitioned by a date range (monthly). However, it's possible for an unexpected but valid date (either very far in the future or very far in the past) to come in the data set and so there is an overflow table. Say table A is parent, B is April data, C is June data, D is July data, and O is overflow data. I set several stored procedures to facilitate the adding of triggers, constraints, etc for partitions. These procs, in addition to adding the constraint to the normal partitions, also add a NOT constraint to the overflow table. i.e., when the July partition is created with alter table D add constraint onlyjuly check (date1 >= '2009-07-01' and date1 < '2009-07-01') Then this is also run alter table O add constraint notjuly check (NOT(date1 >= '2009-07-01' and date1 < '2009-07-01')) The planner excludes correctly except that it always checks O. It doesn't seem to be able to use the multiple constraints on O. Are multiple NOT constraints too much for the planner for excluding partitions? postgres=# select version(); version -- PostgreSQL 8.4.0 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-8.0.1), 64-bit Chris Spotts
Re: [GENERAL] ZFS prefetch considered evil?
Alban Hertroys wrote: I don't know how you partitioned your zpools, but to me it seems like it'd be preferable to have the PostgreSQL tablespaces (and possibly other data that's likely to be accessed randomly) in a separate zpool from the rest of the system so you can restrict disabling prefetch to just that file-system. You probably already did that... It could be interesting to see how clustering the relevant tables would affect the prefetch performance, I'd expect disk access to be less random that way. It's probably still better to disable prefetch though. in fact, somewhere in Sun.com land there's an app-note that suggests creating TWO ZFS mount-points for Postgres, one for the $PGDATA directory, which uses 128k blocks, and another for a tablespace that you put all your regular databases in, this uses 8k blocks. The idea is, the WAL logging is relatively sequential, and takes place in the 128k block zfs, while the actual database table files are far more purely random. These two ZFS can be made in the same zpool, the normal recommendation is to have one large non-root zpool mirror for all your data (and another smaller zpool mirror for your OS, at least assuming you have more than two physical disk drives). -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
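The app-note's layout can be sketched with two filesystems in one pool (pool name, dataset names, and mount points below are illustrative, not from the app-note):

```
# 128k records for the $PGDATA directory (WAL is written sequentially)
zfs create -o recordsize=128k -o mountpoint=/pgdata tank/pgdata

# 8k records, matching Postgres' page size, for the tablespace holding table files
zfs create -o recordsize=8k -o mountpoint=/pgtables tank/pgtables
```

The 8k recordsize avoids read-modify-write amplification on random page updates, while the WAL keeps the larger record size for sequential throughput.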
Re: [GENERAL] Asking for assistance in determining storage requirements
On Thu, Jul 9, 2009 at 11:15 AM, Chris Barnescompuguruchrisbar...@hotmail.com wrote: We are curious about using SAN with fiber channel hba and if anyone else uses this technology. We would also like to know if people have preference to the level of raid with/out striping. I used SurfRAID Triton external RAID units connected to Sun X4100 boxes via LSI Fibre Channel cards. I run them as RAID6 plus hot spare with a total of 16 drives. This is extremely fast and provides for up to 2 disk failure. The key is to have 1 or 2 gigs of cache on the RAID units. I also crank up the RAM on the servers to at least 20Gb. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] constraint checking on partitions
Chris Spotts rfu...@gmail.com writes: Then this is also run alter table O add constraint notjuly check (NOT(date1 >= '2009-07-01' and date1 < '2009-07-01')) The planner excludes correctly except that it always checks O. What are the actual queries you're hoping it will exclude for? regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] constraint checking on partitions
-Original Message- From: Tom Lane [mailto:t...@sss.pgh.pa.us] Sent: Thursday, July 09, 2009 1:52 PM To: Chris Spotts Cc: 'postgres list' Subject: Re: [GENERAL] constraint checking on partitions Chris Spotts rfu...@gmail.com writes: Then this is also run alter table O add constraint notjuly check (NOT(date1 >= '2009-07-01' and date1 < '2009-07-01')) The planner excludes correctly except that it always checks O. What are the actual queries you're hoping it will exclude for? regards, tom lane [Spotts, Christopher] I mistyped, that should be alter table D add constraint onlyjuly check (date1 >= '2009-07-01' and date1 < '2009-08-01') Then this is also run alter table O add constraint notjuly check (NOT(date1 >= '2009-07-01' and date1 < '2009-08-01')) If I ran a select * from A where date1 >= '2009-07-02' and date1 < '2009-07-15' then I would think it wouldn't check O. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] constraint checking on partitions
Chris Spotts rfu...@gmail.com writes: I mistyped, that should be alter table D add constraint onlyjuly check (date1 >= '2009-07-01' and date1 < '2009-08-01') Then this is also run alter table O add constraint notjuly check (NOT(date1 >= '2009-07-01' and date1 < '2009-08-01')) If I ran a select * from A where date1 >= '2009-07-02' and date1 < '2009-07-15' then I would think it wouldn't check O. Works for me ... regression=# create table a (date1 date); CREATE TABLE regression=# create table july() inherits(a); CREATE TABLE regression=# create table other() inherits(a); CREATE TABLE regression=# alter table other add constraint notjuly check (NOT(date1 >= '2009-07-01' and date1 < '2009-08-01')); ALTER TABLE regression=# explain select * from a where date1 >= '2009-07-02' and date1 < '2009-07-15'; QUERY PLAN Result (cost=0.00..92.00 rows=24 width=4) -> Append (cost=0.00..92.00 rows=24 width=4) -> Seq Scan on a (cost=0.00..46.00 rows=12 width=4) Filter: ((date1 >= '2009-07-02'::date) AND (date1 < '2009-07-15'::date)) -> Seq Scan on july a (cost=0.00..46.00 rows=12 width=4) Filter: ((date1 >= '2009-07-02'::date) AND (date1 < '2009-07-15'::date)) (6 rows) regression=# regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] constraint checking on partitions
If I ran a select * from A where date1 >= '2009-07-02' and date1 < '2009-07-15' then I would think it wouldn't check O. [Spotts, Christopher] I oversimplified this too much - but I figured out what was happening. If you added the June table as well and added a separate NOT constraint for June, and then wrote the query SELECT * from A where date1 >= '2009-06-01' and date1 < '2009-07-05' the planner can't match them to individual constraints, so it doesn't exclude. Theoretically the planner could logically AND them together to get better exclusion, but it evidently doesn't. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
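Given that the planner evidently won't AND multiple NOT constraints together, one workaround (an untested sketch; the constraint name notjune and the date bounds are assumptions, only notjuly appears upthread) is to keep a single constraint on O that states its contents directly as one OR of out-of-range conditions, which constraint exclusion can test as a whole:

```sql
-- Replace the stack of per-month NOT constraints with one explicit
-- statement of what O may hold: anything outside the partitioned window.
ALTER TABLE O DROP CONSTRAINT notjune;
ALTER TABLE O DROP CONSTRAINT notjuly;
ALTER TABLE O ADD CONSTRAINT overflow_only
  CHECK (date1 < '2009-04-01' OR date1 >= '2009-08-01');
```

The maintenance procedures would then widen the window in this one constraint each month instead of appending another NOT clause. Worth verifying with EXPLAIN that the planner actually refutes it for the queries in question.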
Re: [GENERAL] Trying to find a low-cost program for Data migration and ETL
How about using an open source ETL tool? You could go with Talend Open Studio, it is an open source program to do data migration, ETL and data synchronization. Check the site: http://www.talend.com/ Scott Mead-3 wrote: On Tue, Jul 7, 2009 at 11:48 AM, Rstat tom.rus...@gmail.com wrote: Hi, I'm building a database for my company. We are a rather small size book company with a lot of references and still growing. We have a Mysql database here and are trying to find some good tools to use it at its best. Basically we are just starting up the database after dealing with Excel: we had a size problem… So I'm trying to find a program that will allow us to do two different things: the migration of our data from the old system to the new one and a specialized software to perform ETL (Extract, transform and load) on our database. About the price of the tools, if we were one year ago, the accounting department would have been pretty relaxed about this. But today, we have some budget restrictions and therefore need a low cost tool. So could you give me some advice on a good data migration and etl tool for a low cost? Thanks for your help. You may have some luck by viewing a similar thread on another mailing list: http://www.theserverside.net/discussions/thread.tss?thread_id=54755 -- Scott -- View this message in context: http://www.nabble.com/Trying-to-find-a-low-cost-program-for-Data-migration-and-ETL-tp24375920p24407304.html Sent from the PostgreSQL - general mailing list archive at Nabble.com. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Postgres 8.4 literal escaping
with Postgres 8.4: SQL 1: update customer set note = ('A' || '\r\n' || 'B') where 1=0; generates: WARNING: nonstandard use of escape in a string literal LINE 1: update customer set note = ('A' || '\r\n' || 'B') where 1=0; ^ HINT: Use the escape string syntax for escapes, e.g., E'\r\n'. (Note is of datatype TEXT) SQL 2: select ('A' || '\r\n' || 'B') as tt from customer limit 1 functions without any warning about escaping the literal. On postgres 8.3.7 both of the above SQLs function without requiring any literals to be escaped. This seems inconsistent to me in that I would have thought that in Postgres 8.4 either both SQLs would have required escaping or neither of them would have required escaping. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
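For reference, two warning-free ways to write SQL 1 (note the semantics differ: only the E'' form turns \r\n into a carriage return and newline):

```sql
-- Escape-string syntax: backslash escapes are interpreted, no warning
UPDATE customer SET note = ('A' || E'\r\n' || 'B') WHERE 1=0;

-- Or adopt the SQL-standard behavior, under which '\r\n' is
-- four ordinary characters (backslash, r, backslash, n)
SET standard_conforming_strings = on;
```

Which of the two is wanted depends on whether the application actually intends a CR/LF in the stored text.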
[GENERAL] Help me please...
Please help me with this. I really need your advice as to how to retrieve the data in my postgresql database. I have postgresql installed in Windows XP platform five months ago. Just yesterday my OS went down, saying NTLDR is missing. What I did was I re-installed my OS. When I finished my installation I found out that I cannot access anymore my postgresql using PGAdmin III. What should I do to access my database and retrieve the important records in it? Unfortunately, I have not created a backup of it. Please help me. Thanks in advance and God bless... Roseller Romanos STI-Pagadian Gillamac Building, Pagadian City Office Nos: (062) 2144-785 Home Nos: (062) 2144-695 Mobile Nos: 09203502636
[GENERAL] Overhead of union versus union all
I am combining query results that I know are disjoint. I'm wondering how much overhead there is in calling union versus union all. (Just curious really; I can't see a reason not to use union all.) (cc me please; not subscribed...) THK -- Timothy H. Keitt http://www.keittlab.org/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Overhead of union versus union all
Tim Keitt wrote: I am combining query results that I know are disjoint. I'm wondering how much overhead there is in calling union versus union all. (Just curious really; I can't see a reason not to use union all.) UNION needs to uniquify the output, for which it plasters an additional sort step, whereas UNION ALL does not need to uniquify its output and thus it can avoid the sort step. Using UNION ALL is recommended wherever possible. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Overhead of union versus union all
Tim Keitt wrote: I am combining query results that I know are disjoint. I'm wondering how much overhead there is in calling union versus union all. (Just curious really; I can't see a reason not to use union all.) (cc me please; not subscribed...) THK I think you can test this one yourself pretty easily. Just run the two queries with explain analyze. Union All should run in about the sum of the times of the separate queries. Plain Union will always be slower, because it takes the same results from union all and runs them through an extra sort/distinct or hash step. In my tests, on a query with 600,000 rows, the Plain Union took about 3x as long to complete. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Overhead of union versus union all
Alvaro Herrera wrote: Tim Keitt wrote: I am combining query results that I know are disjoint. I'm wondering how much overhead there is in calling union versus union all. (Just curious really; I can't see a reason not to use union all.) UNION needs to uniquify the output, for which it plasters an additional sort step, whereas UNION ALL does not need to uniquify its output and thus it can avoid the sort step. Using UNION ALL is recommended wherever possible. Yep, ideally UNION ALL would be the default behavior, but the standard requires otherwise. Many people don't know that UNION has an extra SORT/UNIQUE step. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
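The extra step is easy to see in EXPLAIN; a sketch with made-up tables t1/t2:

```sql
EXPLAIN SELECT id FROM t1 UNION SELECT id FROM t2;
-- plan includes a de-duplication node (HashAggregate, or Sort + Unique)
-- sitting above the Append of the two scans

EXPLAIN SELECT id FROM t1 UNION ALL SELECT id FROM t2;
-- plan is just the Append of the two scans: no de-duplication step
```

When the inputs are known disjoint, that de-duplication node is pure overhead, which is the whole case for UNION ALL here.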
Re: [GENERAL] Help me please...
do the DB folders still exist? if so back them up, reinstall Postgres (reinstalling XP probably wiped out either DLL's or registry entries) and relaunch it. don't have it initialize a DB on install From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Roseller A. Romanos Sent: Wednesday, July 08, 2009 9:37 PM To: pgsql-general@postgresql.org Subject: [GENERAL] Help me please... Please help me with this. I really need your advice as to how to retrieve the data in my postgresql database. I have postgresql installed in Windows XP platform five months ago. Just yesterday my OS bugged down and saying NTDLR is missing. What I did was I re-installed my OS. When I finished my installation I found out that I cannot access anymore my postgresql using PGAdmin III. What should I do to access my database and retrieve the important records in it? Unfortunately, I have not created a backup of it. Please help me. Thanks in advance and God bless... Roseller Romanos STI-Pagadian Gillamac Building, Pagadian City Office Nos: (062) 2144-785 Home Nos: (062) 2144-695 Mobile Nos: 09203502636
Re: [GENERAL] Help me please...
if you used default locations I believe it should be (this is from memory mind) under c:\program files\postgres\version\data\ data is the folder you want. From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of ja...@aers.ca Sent: Thursday, July 09, 2009 1:18 PM To: don2_...@yahoo.com; pgsql-general@postgresql.org Subject: Re: [GENERAL] Help me please... do the DB folders still exist? if so back them up, reinstall Postgres (reinstalling XP probably wiped out either DLL's or registry entries) and relaunch it. don't have it initialize a DB on install From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Roseller A. Romanos Sent: Wednesday, July 08, 2009 9:37 PM To: pgsql-general@postgresql.org Subject: [GENERAL] Help me please... Please help me with this. I really need your advice as to how to retrieve the data in my postgresql database. I have postgresql installed in Windows XP platform five months ago. Just yesterday my OS bugged down and saying NTDLR is missing. What I did was I re-installed my OS. When I finished my installation I found out that I cannot access anymore my postgresql using PGAdmin III. What should I do to access my database and retrieve the important records in it? Unfortunately, I have not created a backup of it. Please help me. Thanks in advance and God bless... Roseller Romanos STI-Pagadian Gillamac Building, Pagadian City Office Nos: (062) 2144-785 Home Nos: (062) 2144-695 Mobile Nos: 09203502636
Re: [GENERAL] Help me please...
ja...@aers.ca wrote: if you used default locations I believe it should be (this is from memory mind) under c:\program files\postgres\version\data\ data is the folder you want. First, verify the location of pgdata... sc qc pgsql-8.3 (I'm assuming this is 8.3, modify for other versions) note the value of the -D parameter on the BINARY_PATH_NAME, like... BINARY_PATH_NAME : D:\postgres\8.3\bin\pg_ctl.exe runservice -w -N pgsql-8.3 -D D:\postgres\8.3\data\ hence, mine is D:\postgres\8.3\data\ If you've reinstalled postgres from scratch, you'll likely have to do a bit of dinking around. First, NET STOP pgsql-8.3 Now, MOVE the current data dir somewhere safe, and COPY/S your backup of the DATA directory to the active location. Then, use the 'security' dialog in file explorer, or the CACLS command line, to grant the 'postgres' user full control over the data directory and all files in it. command line version: cacls \path\to\data /t /e /c /g postgres:f if this postgres user already existed from before, but the reinstalled postgres service is having problems starting, you may need to reset the service account password. Pick a random forgettable password. I'm going to use shattUp373treHodhu (random generator output)... NET USER postgres shattUp373treHodhu SC CONFIG pgsql-8.3 password= shattUp373treHodhu upper case doesn't matter in the commands except for the password itself but the spacing around the password= is critical (none before the =, one space after) then try net start pgsql-8.3 and with any luck, your data is all intact. its absolutely critical if you've reinstalled postgres that you install the same version as you used before. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Postgres 8.4 literal escaping
Niederland niederl...@gmail.com writes: SQL 2: select ('A' || '\r\n' || 'B') as tt from customer limit 1 Functions without escaping literal Really? I get regression=# select ('A' || '\r\n' || 'B') as tt from customer limit 1; WARNING: nonstandard use of escape in a string literal LINE 1: select ('A' || '\r\n' || 'B') as tt from customer limit 1; ^ HINT: Use the escape string syntax for escapes, e.g., E'\r\n'. tt (0 rows) regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Database storage
Hi - I'm running Postgres v8.3 on Fedora 10 (home machine - x86_64 architecture). After realizing that the storage requirements of one of my databases will exceed 4Tb, I wanted to see if anyone had any suggestions as to hardware setup that works well with Postgres running on Linux. I have partitioned most of the database, so older data can go on slower devices with only 400-500Gb needed on faster devices. Redundancy is a requirement. Database usage is generally low, with burst input. For the curious, here's the general profile: a) ~19,000,000 INSERT statements/ day (190 INSERTs/ commit blocks) for a total of ~10 Gb INSERT/day b) INSERTs occur sequentially (daily batch) c) Handful of SELECT statements run per day to generate a few dozen reports, but these are not resource-intensive. d) Only 3-5 database users configured. Initial thought: Use an external multi-bay eSATA case and inserting some 7200RPM+, 32Mb cache 1T+ SATA drives running RAID 5. A few specific questions: 1) Which components generally lead to bottlenecks (controller card, individual HD cache, HD rotational speed, other???) 2) Are there better solutions (broad question in terms of price/ data availability)? Budget: $500 - $750 for the storage medium. Better question: Which setup has worked well with a Postgres implementation running on a similar budget? 3) If using a similar setup, are there any particular implementation issues that are not relatively obvious? Thanks for the input (even if not Postgres-specific, I thought this might be of interest to hobby database admins like myself running Postgres implementations). -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Database storage
On Thu, Jul 9, 2009 at 5:40 PM, nabble.30.miller_2...@spamgourmet.com wrote: Hi - I'm running Postgres v8.3 on Fedora 10 (home machine - x86_64 architecture). After realizing that the storage requirements of one of my databases will exceed 4Tb, I wanted to see if anyone had any suggestions as to hardware setup that works well with Postgres running on Linux. I have partitioned most of the database, so older data can go on slower devices with only 400-500Gb needed on faster devices. Redundancy is a requirement. General database usage is generally low, with burst input. For the curious, here's the general profile: a) ~19,000,000 INSERT statements/ day (190 INSERTs/ commit blocks) for a total of ~10 Gb INSERT/day b) INSERTs occur sequentially (daily batch) c) Handful of SELECT statements run per day to generate a few dozen reports, but these are not resource-intensive. d) Only 3-5 database users configured. Initial thought: Use an external multi-bay eSATA case and inserting some 7200RPM+, 32Mb cache 1T+ SATA drives running RAID 5. While RAID-5 is generally not the best database setup, considering that your inputs will be bulk you're probably ok. Also, given that 2TB server class drives (5400rpm) are now out, and that they have about the same throughput as 1TB 7200 RPM server class drives, you might want to look into them as you can get by on fewer drives for the same storage. A few specific questions: 1) Which components generally lead to bottlenecks (controller card, individual HD cache, HD rotational speed, other???) IO is almost always your bottleneck on queries over large data sets. 2) Are there better solutions (broad question in terms of price/ data availability)? Budget: $500 - $750 for the storage medium. Better question: Which setup was worked well with a Postgres implementation running on a similar budget? $750 is about what a decent RAID controller would cost you, but again it's likely that given your bulk import scenario, you're probably ok without one. 
In this instance, you're probably better off with software RAID than with a cheap RAID card, which will cost extra and probably be slower than Linux software RAID. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
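As a rough sketch of the software-RAID route (device names are placeholders and the mdadm options shown are common defaults, not a tuned setup):

```shell
# Hypothetical sketch: build a 4-drive Linux software RAID-5 array with mdadm.
mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sd[bcde]1

# Watch the initial sync, then put a filesystem on it for the data directory:
cat /proc/mdstat
mkfs.ext3 /dev/md0
mount /dev/md0 /var/lib/pgsql/data
```

The parity calculation runs on the host CPU, which, as noted downthread, is usually a non-issue on modern multi-core machines.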
Re: [GENERAL] SELECT DISTINCT very slow
On Thu, Jul 9, 2009 at 4:47 PM, Andres Freund <and...@anarazel.de> wrote: AFAIK the primary cause is that indexes in pg do not store visibility information. Not really. The OP doesn't say how wide the record rows are, but unless they're very wide it wouldn't pay to use an index for this even if you didn't have to access the heap as well. It's going to be faster to scan the whole heap and either sort or use a hash. Currently there aren't many cases where a btree with 6,000 copies of 111 distinct keys is going to be useful. Arguably the missing feature here is skip-scans, where we scan the index but only pull out one record for each distinct value. I'm not sure there's anything particularly stopping Postgres from being able to do them, but it might be a lot of code for a narrow use case. -- greg http://mit.edu/~gsstark/resume.pdf
Re: [GENERAL] SELECT DISTINCT very slow
Greg Stark <gsst...@mit.edu> writes: Not really. The OP doesn't say how wide the record rows are, but unless they're very wide it wouldn't pay to use an index for this even if you didn't have to access the heap as well. It's going to be faster to scan the whole heap and either sort or use a hash. Currently there aren't many cases where a btree with 6,000 copies of 111 distinct keys is going to be useful. It was 600,000, not 6,000 ... so a skip-scan might be worth the trouble, but as you say we haven't done it. In any case I think the real issue is that the OP is probably using a pre-8.4 release, which will always do SELECT DISTINCT via sort-and-unique. Hash aggregation would be a whole lot faster for these numbers, even if not exactly instantaneous. He could update to 8.4, or go over to using GROUP BY as was recommended upthread. regards, tom lane
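The GROUP BY rewrite recommended upthread is mechanical; a sketch with a hypothetical table and column (not the OP's actual schema):

```sql
-- On a pre-8.4 release, SELECT DISTINCT always sorts and uniquifies:
SELECT DISTINCT category FROM items;

-- ...while GROUP BY can be planned as a hash aggregate, which is much
-- faster when there are few distinct values (here ~111) among many rows:
SELECT category FROM items GROUP BY category;
```

Comparing EXPLAIN output for the two forms will show whether the planner chose a Sort/Unique or a HashAggregate.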
Re: [GENERAL] pg_dump PostgreSQL 8.4
thank you all for your assistance. i tried not to be adventurous tweaking the configuration of a production system, except for system upgrades, which have not failed on me before. burned my fingers once. i need to visit the documentation and study the admin side of the system more. thanks all. raffy segador On Thu, 09 Jul 2009 18:27:35 +0800, Dave Page <dp...@pgadmin.org> wrote: On Thu, Jul 9, 2009 at 11:06 AM, serafin segador <rsega...@mydestiny.net> wrote: i found where the error is. thanks for the tip. i have versions 8.3.7 and 8.4 of PostgreSQL running on the same server, as well as versions 1.8 and 1.10 of pgAdmin. although i run the backup routine for pg8.4 from pgAdmin 1.10, pgAdmin uses PostgreSQL\8.3\bin\pg_dump by default. i now run the backup routine from the command line utility, pointing to the correct version. Check the PG bin path option under File -> Options. It also sounds like you disabled the version match check on that dialogue; otherwise pg_dump 8.3 should have complained about being used with 8.4. -- Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
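With several versions installed, one way to be sure the right binary runs is to invoke it by full path and check --version first; a sketch, assuming the default Windows install paths and port (both may differ on your machine):

```shell
rem Hypothetical paths: verify which pg_dump is about to run...
"C:\Program Files\PostgreSQL\8.4\bin\pg_dump" --version

rem ...then back up the 8.4 cluster with its own pg_dump, writing the
rem custom-format archive with -f rather than shell redirection:
"C:\Program Files\PostgreSQL\8.4\bin\pg_dump" -p 5433 -U postgres -Fc -f mydb.dump mydb
```

pg_dump 8.4 can dump older servers, but an older pg_dump against a newer server is exactly the mismatch the version check exists to catch.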
[GENERAL] ubuntu packages for 8.4
I don't see any Ubuntu packages for 8.4 in the default repositories. Does anybody know if they will be upgrading the postgresql package to 8.4 or creating a new package for it? I'd rather use the packages than compile it myself. If anybody has an argument as to why I should compile it, I am all ears. TIA.
Re: [GENERAL] Database storage
On Fri, Jul 10, 2009 at 1:28 AM, Scott Marlowe <scott.marl...@gmail.com> wrote: $750 is about what a decent RAID controller would cost you, but again it's likely that given your bulk import scenario, you're probably ok without one. In this instance, you're probably better off with software RAID than with a cheap RAID card, which will cost extra and probably be slower than Linux software RAID. FWIW, the main disadvantage of software RAID is NOT speed -- Linux software RAID is very fast. Aside from RAID-5, where it lets you offload the parity calculation, there really isn't much speed benefit to hardware RAID. The main advantage of hardware RAID is the error handling. When you get low-level errors or pull a drive, a lot of consumer-level controllers and their drivers don't respond very well, and have long timeouts or keep retrying, tragically unaware that the software RAID would be able to handle recovery. A good server-class RAID controller should handle those situations without breaking a sweat. -- greg http://mit.edu/~gsstark/resume.pdf
Re: [GENERAL] Database storage
On Thu, Jul 9, 2009 at 7:29 PM, Greg Stark <gsst...@mit.edu> wrote: On Fri, Jul 10, 2009 at 1:28 AM, Scott Marlowe <scott.marl...@gmail.com> wrote: $750 is about what a decent RAID controller would cost you, but again it's likely that given your bulk import scenario, you're probably ok without one. In this instance, you're probably better off with software RAID than with a cheap RAID card, which will cost extra and probably be slower than Linux software RAID. FWIW, the main disadvantage of software RAID is NOT speed -- Linux software RAID is very fast. Aside from RAID-5, where it lets you offload the parity calculation, there really isn't much speed benefit to hardware RAID. The main advantage of hardware RAID is the error handling. When you get low-level errors or pull a drive, a lot of consumer-level controllers and their drivers don't respond very well, and have long timeouts or keep retrying, tragically unaware that the software RAID would be able to handle recovery. A good server-class RAID controller should handle those situations without breaking a sweat. Definitely a big plus of a quality HW controller, and one of the reasons I don't scrimp on the HW controllers I put in our 24/7 servers. OTOH, if you can afford a bit of downtime to handle failures, Linux software RAID works pretty well, and since quad-core CPUs are now pretty much the standard, it's ok if the parity calculation uses up a bit of one core for lower-performing servers like the reporting server the OP was talking about.
Re: [GENERAL] ubuntu packages for 8.4
2009/7/9 Tim Uckun <timuc...@gmail.com>: I don't see any Ubuntu packages for 8.4 in the default repositories. Does anybody know if they will be upgrading the postgresql package to 8.4 or creating a new package for it? I'd rather use the packages than compile it myself. If anybody has an argument as to why I should compile it, I am all ears. I just forwarded your message to Martin Pitt; he's the package maintainer for Postgres for Ubuntu (and Debian, I believe). I don't know if this is planned for Karmic Koala (to be released in October). -- - Bret
Re: [GENERAL] Overhead of union versus union all
Alvaro Herrera wrote: Tim Keitt wrote: I am combining query results that I know are disjoint. I'm wondering how much overhead there is in calling union versus union all. (Just curious really; I can't see a reason not to use union all.) UNION needs to uniquify the output, for which it plasters on an additional sort step, whereas UNION ALL does not need to uniquify its output and thus can avoid the sort step. Using UNION ALL is recommended wherever possible. I think I read somewhere that as of 8.4 it no longer requires the sort step, due to the improvements in hashing. Here it is: http://wiki.postgresql.org/wiki/WhatsNew84#Performance
Re: [GENERAL] Overhead of union versus union all
Scott Bailey wrote: Alvaro Herrera wrote: Tim Keitt wrote: I am combining query results that I know are disjoint. I'm wondering how much overhead there is in calling union versus union all. (Just curious really; I can't see a reason not to use union all.) UNION needs to uniquify the output, for which it plasters on an additional sort step, whereas UNION ALL does not need to uniquify its output and thus can avoid the sort step. Using UNION ALL is recommended wherever possible. I think I read somewhere that as of 8.4 it no longer requires the sort step, due to the improvements in hashing. Here it is: http://wiki.postgresql.org/wiki/WhatsNew84#Performance Oh, yea, hashing is used in some cases rather than a sort. I assume a sort is still used if the hash exceeds work_mem size. -- Bruce Momjian <br...@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
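The difference is easy to see with EXPLAIN; a sketch using two hypothetical tables with disjoint rows:

```sql
-- UNION must deduplicate: a Sort + Unique step (or, on 8.4, possibly
-- a HashAggregate) appears above the Append node in the plan:
EXPLAIN SELECT id FROM a UNION SELECT id FROM b;

-- UNION ALL simply concatenates its inputs -- just an Append node,
-- no deduplication step at all:
EXPLAIN SELECT id FROM a UNION ALL SELECT id FROM b;
```

So when the inputs are known disjoint, UNION ALL skips the deduplication work entirely, regardless of whether it would have been done by sorting or hashing.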
[GENERAL] psql language
Hello, I just installed pg-8.4 on Windows XP but ran into some unexpected problems. I am working on some tools to aid English-speaking learners of Japanese. This of course requires me to regularly display and enter Japanese text on my machine, so I have the Regional setting "Language for non-Unicode programs" set to Japanese although the locale language is English. This allows me to work with both English and Japanese text in the Windows console (cmd.exe) just fine. psql also worked fine until 8.4. With 8.4, though, psql presents messages in Japanese. Since I am still a beginner at Japanese myself, this is a problem. FWIW, I tried doing SET LANGUAGE=en_US, and the same with every other locale-related variable (LC_ALL, LANG, LC_MESSAGES, etc.) I could think of, before running psql, but with no effect. How can I tell psql (and any other command-line tools) to use English messages?
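For messages coming from the server side, the usual knob is the lc_messages setting; a sketch, assuming the server was built with NLS and an English locale is available (the Windows locale name shown is illustrative):

```sql
-- Per-session, from within psql:
SET lc_messages = 'C';   -- or e.g. 'English_United States.1252' on Windows

-- Or cluster-wide in postgresql.conf:
--   lc_messages = 'C'
```

Note this governs server-generated messages; text printed by the psql binary itself follows the client-side message catalog, which is selected from the environment before psql starts.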
Re: [GENERAL] ubuntu packages for 8.4
Tim Uckun wrote: I don't see any Ubuntu packages for 8.4 in the default repositories. Does anybody know if they will be upgrading the postgresql package to 8.4 or creating a new package for it? I'd rather use the packages than compile it myself. If anybody has an argument as to why I should compile it, I am all ears. TIA. I was looking for this the other day. I found it in lenny backports, but I haven't tried it yet. http://packages.debian.org/lenny-backports/postgresql-8.4
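If the backports route works for you (the link above is the Debian lenny package, not an Ubuntu one), installation would look roughly like this; the repository line is illustrative and untested here:

```shell
# Hypothetical sketch: pull postgresql-8.4 from lenny-backports.
echo "deb http://www.backports.org/debian lenny-backports main" | \
    sudo tee -a /etc/apt/sources.list
sudo apt-get update

# Install pinned to the backports suite so the rest of the system
# keeps tracking the default repositories:
sudo apt-get -t lenny-backports install postgresql-8.4
```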
Re: [GENERAL] Overhead of union versus union all
On Thu, Jul 9, 2009 at 7:58 PM, Bruce Momjian <br...@momjian.us> wrote: Scott Bailey wrote: Alvaro Herrera wrote: Tim Keitt wrote: I am combining query results that I know are disjoint. I'm wondering how much overhead there is in calling union versus union all. (Just curious really; I can't see a reason not to use union all.) UNION needs to uniquify the output, for which it plasters on an additional sort step, whereas UNION ALL does not need to uniquify its output and thus can avoid the sort step. Using UNION ALL is recommended wherever possible. I think I read somewhere that as of 8.4 it no longer requires the sort step, due to the improvements in hashing. Here it is: http://wiki.postgresql.org/wiki/WhatsNew84#Performance Oh, yea, hashing is used in some cases rather than a sort. I assume a sort is still used if the hash exceeds work_mem size. The important point being that it's still more expensive than a plain UNION ALL, though, right?
Re: [GENERAL] Overhead of union versus union all
Scott Marlowe wrote: On Thu, Jul 9, 2009 at 7:58 PM, Bruce Momjian <br...@momjian.us> wrote: Scott Bailey wrote: Alvaro Herrera wrote: Tim Keitt wrote: I am combining query results that I know are disjoint. I'm wondering how much overhead there is in calling union versus union all. (Just curious really; I can't see a reason not to use union all.) UNION needs to uniquify the output, for which it plasters on an additional sort step, whereas UNION ALL does not need to uniquify its output and thus can avoid the sort step. Using UNION ALL is recommended wherever possible. I think I read somewhere that as of 8.4 it no longer requires the sort step, due to the improvements in hashing. Here it is: http://wiki.postgresql.org/wiki/WhatsNew84#Performance Oh, yea, hashing is used in some cases rather than a sort. I assume a sort is still used if the hash exceeds work_mem size. The important point being that it's still more expensive than a plain UNION ALL, though, right? Yep. -- Bruce Momjian <br...@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [GENERAL] Multi - table statistics
On Jul 1, 2009, at 4:15 AM, Scara Maccai wrote: My query is using an Index Scan instead of a Bitmap Index Scan because it thinks that the number of rows returned will be low (12). In fact, the number of rows returned is 120. This happens because there are some WHERE conditions that check cross-table values (something like taba.col1 <> tabb.col2). Since PostgreSQL doesn't know (I guess) anything about cross-table value statistics, how are the row estimates calculated? Are you doing this as a JOIN or a subquery? That could make a difference. Also, non-equality predicates (i.e. <>) are hard to estimate for. Is there any way I can force the planner to use a bitmap index scan? I would try pulling the explicit values you need into a temp table, analyze that, and then join. The planner will then know exactly how many rows it's dealing with. But keep in mind that it doesn't actually look at the values it will be getting, so if you have a skewed distribution of values in the join column in the larger table you might be stuck... -- Decibel!, aka Jim C. Nasby, Database Architect deci...@decibel.org Give your computer some brain candy! www.distributed.net Team #1828
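The temp-table suggestion above might look like this; the filter predicate and column names are made up for illustration, only taba/tabb come from the thread:

```sql
-- Hypothetical sketch: materialize the filtered values, analyze, then join.
CREATE TEMP TABLE tmp_vals AS
    SELECT ne_id, col1 FROM taba WHERE col1 > 100;  -- hypothetical filter

-- Give the planner real row counts for the materialized set:
ANALYZE tmp_vals;

SELECT b.*
FROM tmp_vals v
JOIN tabb b ON b.ne_id = v.ne_id
WHERE b.col2 <> v.col1;  -- the cross-table predicate, now against known rows
```

The planner then sees an accurate row count for one side of the join, even though, as noted, it still cannot see the actual values.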
Re: [GENERAL] cache reads vs. disk reads
On Jul 1, 2009, at 4:20 AM, Gerd König wrote: b) pg_statio_user_tables: heap_blks_read is the number of disk blocks read for that table (excluding index access); does heap_blks_hit mean the number of accesses to the cache for that data? ...and is heap_blks_read included in heap_blks_hit, or is that number the additional accesses after reading the data from disk into the buffer? Take a look in the manual; there's a pretty clear explanation in there. -- Decibel!, aka Jim C. Nasby, Database Architect deci...@decibel.org Give your computer some brain candy! www.distributed.net Team #1828
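For reference, the two counters are disjoint (heap_blks_hit counts buffer-cache hits only, not the disk reads), so they combine into a per-table heap cache hit ratio like this:

```sql
-- Heap cache hit ratio per table from pg_statio_user_tables:
SELECT relname,
       heap_blks_read,
       heap_blks_hit,
       round(heap_blks_hit::numeric
             / nullif(heap_blks_hit + heap_blks_read, 0), 3) AS hit_ratio
FROM pg_statio_user_tables
ORDER BY heap_blks_read DESC;
```

The nullif guard avoids a division-by-zero error for tables that have never been read.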
Re: [GENERAL] REINDEX is not a btree
On Jul 4, 2009, at 8:06 AM, Craig Ringer wrote: On Fri, 2009-07-03 at 15:00 +0100, Vanessa Lopez wrote: I don't know much about postgre, I have no clue what else I can do. Please, please, any help is very very much appreciated. I have lots of databases and months of work in postgre (also lots of backups for the data in /data). When you say in /data, do you mean the directory that contains the directories pg_xlog, base, global, pg_clog, etc.? Did you back up and restore the WHOLE data directory at once? Or did you restore only parts of it? And how exactly did you make the backups? You can't simply take a filesystem copy of a running database; that won't work. -- Decibel!, aka Jim C. Nasby, Database Architect deci...@decibel.org Give your computer some brain candy! www.distributed.net Team #1828
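For future backups, the safe routes are a logical dump (valid while the server runs) or a filesystem copy taken only while the server is stopped; a sketch with placeholder paths:

```shell
# Logical backup of every database in the cluster, safe on a running server:
pg_dumpall -U postgres -f all_databases.sql

# A filesystem-level copy of the data directory is only valid with the
# server cleanly shut down first:
pg_ctl -D /var/lib/pgsql/data stop
cp -a /var/lib/pgsql/data /backup/data-copy
pg_ctl -D /var/lib/pgsql/data start
```

Copying parts of a live data directory (base without the matching pg_xlog and pg_clog, for instance) produces exactly the kind of internally inconsistent cluster described above.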