Re: [GENERAL] shared library version problems
Tom Lane wrote:
> If the Fedora 5 RPMs won't install on your FC4 machine, grab the FC5
> SRPM and do a quick rpmbuild --rebuild to make custom RPMs for your
> environment.

After about 5 minutes of compiling I get this:

==========================================================
pg_regress: initdb failed
Examine ./log/initdb.log for the reason.
make: *** [check] Error 2
error: Bad exit status from /var/tmp/rpm-tmp.67109 (%build)

RPM build errors:
    Bad exit status from /var/tmp/rpm-tmp.67109 (%build)
==========================================================

/usr/src/redhat/BUILD/postgresql-8.1.4/src/test/regress/log/initdb.log contains:

Running in noclean mode.  Mistakes will not be cleaned up.
initdb: cannot be run as root
Please log in (using, e.g., "su") as the (unprivileged) user that will
own the server process.

-- Bryan White, ArcaMax Publishing Inc.
Re: [GENERAL] shared library version problems
Tom Lane wrote:
> Don't do the rpmbuild as root. Alternatively, I believe there's a
> %define you can set to skip the regression test ... but that's
> probably not a good idea.

I think I have it solved now. I am not too familiar with the process of building from source RPMs. You said not to do it as root, but that meant I did not have write access to /usr/src/redhat. I tried two options to build from a different location without much luck. In the end I moved /usr/src/redhat to /usr/src/redhat.old, created a new one (including sub-directories), and made myself the owner. It then builds fine. Seems like there has to be an easier way.

Anyway, after installing the new RPMs on my FC4 dev server and rebuilding my programs, the programs do now run on my web server (stock FC4 PostgreSQL). Thanks for your help.

-- Bryan White, ArcaMax Publishing Inc.
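For the record, the easier way appears to be pointing rpmbuild at a per-user build tree instead of /usr/src/redhat. A minimal sketch, using rpmbuild's standard %_topdir macro (the SRPM filename here is illustrative):

    mkdir -p ~/rpmbuild/{BUILD,RPMS,SOURCES,SPECS,SRPMS}
    echo '%_topdir %(echo $HOME)/rpmbuild' > ~/.rpmmacros
    rpmbuild --rebuild postgresql-8.1.4-1.src.rpm

With %_topdir set, rpmbuild never touches /usr/src/redhat, so no ownership games are needed.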
[GENERAL] shared library version problems
I am having problems with my libpq programs crashing. This seems to be a version incompatibility and I want to find out how best to proceed.

My main database is running Fedora Core 5 with the supplied PostgreSQL 8.1.4. My web server is running Fedora Core 4 with the supplied PostgreSQL 8.0.8. My dev server was running the same setup as the web server; the difference is that it acts as its own database server. I was uncomfortable running an older version of the server on my test system than on the live system, so yesterday I removed the OS-supplied PostgreSQL RPMs and installed 8.1.4 from RPMs on the PostgreSQL download site. Today I discovered that programs that I compile on my dev server will segfault when run on the live web server.

On my live web server I have:

/usr/lib/libpq.a
/usr/lib/libpq.so -> libpq.so.4.0
/usr/lib/libpq.so.4 -> libpq.so.4.0
/usr/lib/libpq.so.4.0

On my dev server I have:

/usr/lib/libpq.a
/usr/lib/libpq.so -> libpq.so.4.1
/usr/lib/libpq.so.4 -> libpq.so.4.1
/usr/lib/libpq.so.4.1

My programs are compiled with -lpq. Is there something I can do on my dev server to get it to produce programs that will run on my live server? Note: I would rather change the dev server, because there are about 12 other live servers that would also need to be fixed. Is there a document somewhere that discusses how to handle these types of issues?

Note: In the recent past I had been running 7.1.x on my dev server and had no problems running the produced programs on a live server with 8.0.x libraries.

-- Bryan White, ArcaMax Publishing Inc.
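One way to see the mismatch concretely is to compare the soname recorded in a binary against what the installed libraries call themselves. A quick check, assuming a binary named myprog (the name is illustrative):

    ldd myprog | grep libpq                          # what resolves at runtime
    objdump -p myprog | grep NEEDED                  # what the binary was linked against
    objdump -p /usr/lib/libpq.so.4.1 | grep SONAME   # what the library calls itself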
Re: [GENERAL] shared library version problems
Tom Lane wrote:
> Bryan White [EMAIL PROTECTED] writes:
>> I am having problems with my libpq programs crashing. [...] Today I
>> discovered that programs that I compile on my dev server will
>> segfault when run on the live web server.
>
> Can you get a core dump and provide a gdb backtrace from the segfault?
> Right offhand I see no difference in the claimed API of 8.0 and 8.1
> libpq except that 8.1 adds lo_create(), which I suppose you're not
> using. So while this isn't good practice in general, I don't see why
> it wouldn't work in this particular case.

I have used both gdb and valgrind with full debug builds. The segfault does not seem to occur in Postgres-related code. It occurs before any database connection is established. If it makes a difference, the code is all written in C++.

The reason I suspect the Postgres lib is that there have been no changes to this code. After upgrading the dev server yesterday, I rolled out a small fix this morning and started seeing the segfault. Reverting the change did not fix it. Compiling a clean Subversion checkout on both boxes confirmed that code compiled on the dev box will not run on the web server, but code compiled on the web server runs on either box.

> One thing you should check is whether both libs were built with the
> same options (compare pg_config --configure output from the 8.0 and
> 8.1 installations).

I think that might be the problem. These are the differences in pg_config --configure output:

dev server:
'--host=i686-redhat-linux-gnu' '--build=i686-redhat-linux-gnu'
'--target=i686-redhat-linux' '--with-includes=/usr/include'
'--with-libraries=/usr/lib' 'CFLAGS=-O2 -g -march=i686 -I/usr/include/et'
'CPPFLAGS= -I/usr/include/et' 'build_alias=i686-redhat-linux-gnu'
'host_alias=i686-redhat-linux-gnu' 'target_alias=i686-redhat-linux'

web server:
'--build=i386-redhat-linux' '--host=i386-redhat-linux'
'--target=i386-redhat-linux-gnu' '--with-tcl' '--with-tclconfig=/usr/lib'
'--enable-thread-safety' 'CFLAGS=-O2 -g -pipe -Wp,-D_FORTIFY_SOURCE=2
-fexceptions -m32 -march=i386 -mtune=pentium4 -fasynchronous-unwind-tables'
'build_alias=i386-redhat-linux' 'host_alias=i386-redhat-linux'
'target_alias=i386-redhat-linux-gnu'

I note that Postgres is packaged in the following RPMs: postgresql-devel, postgresql-libs, postgresql-server, and postgresql. Does it work to install the postgresql-server RPM from the 8.1 version and the others from the Fedora 4 included 8.0 version?

-- Bryan White, ArcaMax Publishing Inc.
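For anyone wanting to produce the backtrace Tom asked for, a minimal recipe (myprog is an illustrative name):

    ulimit -c unlimited   # allow core files in this shell
    ./myprog              # reproduce the segfault
    gdb ./myprog core     # then type 'bt' at the (gdb) prompt

Building with -g and without optimization makes the resulting trace far more readable.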
Re: [GENERAL] shared library version problems
Martijn van Oosterhout wrote:
> Can you provide the output of ldd? The libraries are supposed to be
> reasonably compatible.

web server:

linux-gate.so.1 => (0x00ab)
libexpat.so.0 => /usr/lib/libexpat.so.0 (0x00411000)
libpq.so.4 => /usr/lib/libpq.so.4 (0x00324000)
libssl.so.5 => /lib/libssl.so.5 (0x005c2000)
libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x0069c000)
libm.so.6 => /lib/libm.so.6 (0x0049e000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00bd7000)
libc.so.6 => /lib/libc.so.6 (0x00d82000)
libpthread.so.0 => /lib/libpthread.so.0 (0x00ece000)
libcrypto.so.5 => /lib/libcrypto.so.5 (0x00c0)
libkrb5.so.3 => /usr/lib/libkrb5.so.3 (0x00afc000)
libcrypt.so.1 => /lib/libcrypt.so.1 (0x00111000)
libresolv.so.2 => /lib/libresolv.so.2 (0x00f0c000)
libnsl.so.1 => /lib/libnsl.so.1 (0x00f88000)
libgssapi_krb5.so.2 => /usr/lib/libgssapi_krb5.so.2 (0x00b7)
libcom_err.so.2 => /lib/libcom_err.so.2 (0x00a9f000)
libk5crypto.so.3 => /usr/lib/libk5crypto.so.3 (0x00ad6000)
libdl.so.2 => /lib/libdl.so.2 (0x0013f000)
libz.so.1 => /usr/lib/libz.so.1 (0x003fc000)
/lib/ld-linux.so.2 (0x00267000)
libkrb5support.so.0 => /usr/lib/libkrb5support.so.0 (0x00ac3000)

dev server:

linux-gate.so.1 => (0x00497000)
libexpat.so.0 => /usr/lib/libexpat.so.0 (0x0065b000)
libpq.so.4 => /usr/lib/libpq.so.4 (0x00524000)
libssl.so.5 => /lib/libssl.so.5 (0x0021)
libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00775000)
libm.so.6 => /lib/libm.so.6 (0x002b7000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x006de000)
libc.so.6 => /lib/libc.so.6 (0x00bff000)
libcrypto.so.5 => /lib/libcrypto.so.5 (0x00101000)
libkrb5.so.3 => /usr/lib/libkrb5.so.3 (0x00d74000)
libcrypt.so.1 => /lib/libcrypt.so.1 (0x009b6000)
libresolv.so.2 => /lib/libresolv.so.2 (0x00395000)
libnsl.so.1 => /lib/libnsl.so.1 (0x00248000)
libgssapi_krb5.so.2 => /usr/lib/libgssapi_krb5.so.2 (0x0025d000)
libcom_err.so.2 => /lib/libcom_err.so.2 (0x00d44000)
libk5crypto.so.3 => /usr/lib/libk5crypto.so.3 (0x00d49000)
libdl.so.2 => /lib/libdl.so.2 (0x00275000)
libz.so.1 => /usr/lib/libz.so.1 (0x004e1000)
/lib/ld-linux.so.2 (0x00a8b000)
libkrb5support.so.0 => /usr/lib/libkrb5support.so.0 (0x00d6f000)

> In any case, you should try to run both servers against the same set
> of libs and headers. You can have multiple copies of libpq around and
> select it at compile time. The client library doesn't really have to
> match the server version...

As I asked in another thread: Does it work to install the postgresql-server RPM from the 8.1 version and the others from the Fedora 4 included 8.0 version?

-- Bryan White, ArcaMax Publishing Inc.
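Selecting a particular libpq copy at compile time, as Martijn suggests, can be done with explicit include/library paths plus an rpath so the runtime linker finds the same copy. A sketch, assuming an 8.0 tree installed under /opt/pgsql-8.0 (the path is illustrative):

    gcc -o myprog myprog.c \
        -I/opt/pgsql-8.0/include \
        -L/opt/pgsql-8.0/lib \
        -Wl,-rpath,/opt/pgsql-8.0/lib \
        -lpq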
[GENERAL] Resetting privileges on a table
I have a database that has a few tables with privileges granted by a user who no longer works here. I am the owner of these tables and the owner of the database, but any granting or revoking I do on these tables does not seem to affect the privs set by this other user. The privileges I have set show up after the original user's privileges in the \z output. How can I clean this up? Would dropping the user have any effect? This is on 7.4 if that makes a difference.

-- Bryan White, ArcaMax Publishing Inc.
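The raw ACL behind the \z display lives in pg_class.relacl and can be inspected directly, which makes it easier to see exactly who granted what. A quick look (the table name is illustrative):

    psql -d mydb -c "SELECT relname, relacl FROM pg_class WHERE relname = 'mytable';"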
Re: [GENERAL] Resetting privileges on a table
Tom Lane wrote:
> You need to revoke them as that user, likely. REVOKE really means
> "revoke grants I made", not "revoke any grant anybody made".

OK, I tried logging in as that user. Oddly, after the revoke the only grant that disappeared was one I created. Maybe it has something to do with 'grant option' permissions, which seem to have been created here.

Transcript (pconner is the obsolete user, bryan is my account):

ec=# \z bulkuploadcfg
                  Access privileges for database "ec"
 Schema |     Table     |                      Access privileges
--------+---------------+------------------------------------------------------------------
 public | bulkuploadcfg | {pconner=a*r*w*d*R*x*t*/pconner,=arwdRxt/pconner,=arwdRxt/bryan}
(1 row)

ec=# select current_user;
 current_user
--------------
 pconner
(1 row)

ec=# revoke all on bulkuploadcfg from public;
REVOKE
ec=# \z bulkuploadcfg
 public | bulkuploadcfg | {pconner=a*r*w*d*R*x*t*/pconner,=arwdRxt/pconner}
(1 row)

ec=# revoke all on bulkuploadcfg from pconner;
REVOKE
ec=# \z bulkuploadcfg
 public | bulkuploadcfg | {pconner=a*r*w*d*R*x*t*/pconner,=arwdRxt/pconner}
(1 row)

ec=# revoke grant option for all on bulkuploadcfg from pconner;
REVOKE
ec=# \z bulkuploadcfg
 public | bulkuploadcfg | {pconner=a*r*w*d*R*x*t*/pconner,=arwdRxt/pconner}
(1 row)

-- Bryan
Re: [GENERAL] Resetting privileges on a table
Tom Lane wrote:
> Bryan White [EMAIL PROTECTED] writes:
>> ec=# \z bulkuploadcfg
>>  public | bulkuploadcfg | {pconner=a*r*w*d*R*x*t*/pconner,=arwdRxt/pconner,=arwdRxt/bryan}
>
> Hm, this is 7.4.what exactly? The above should be an illegal state
> (assuming pconner is the table owner) because there is no grant option
> to bryan allowing him to grant anything to public.

ec=# select version();
 PostgreSQL 7.4.3 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.2
 20031022 (Red Hat Linux 3.3.2-1)
(1 row)

> There was an old bug that would allow you to get into this state if
> bryan was a superuser (the system would allow him to grant privileges
> anyway), but according to the CVS logs we fixed that in 7.4RC1. This
> table wouldn't happen to be a holdover from a 7.4 beta version, would it?

bryan is a superuser.

> Another possibility is that you did an ALTER TABLE OWNER after
> assigning some initial permissions. 7.4 had that command but it didn't
> do anything about changing the ACL list to match. I think you could
> have gotten to the above state if pconner were the original table
> owner and had done GRANT ALL TO PUBLIC, and then you altered table
> ownership to bryan and he also did GRANT ALL TO PUBLIC.

That would match the history. A while ago I changed the owner of all tables to 'bryan'. I just noticed the permission strangeness today. I had some problems trying to load a dump of this database onto a system running 8.0.7 with no pconner user defined. I decided it was time to clean this stuff up, and to do that I had to go back to the source.

> Best solution might be to forcibly set the table's pg_class.relacl
> field to null (resetting all the permissions to default) and then
> grant what you want.

That seems to fix it. Thanks!!!

-- Bryan
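Tom's fix, spelled out as a sketch — this edits a system catalog directly, so it requires superuser rights and deserves a backup first:

    psql -d ec <<'SQL'
    UPDATE pg_class SET relacl = NULL WHERE relname = 'bulkuploadcfg';
    -- permissions are now back at owner defaults; re-grant as desired
    GRANT ALL ON bulkuploadcfg TO PUBLIC;
    SQL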
Re: [GENERAL] Text data type doesn't accept newlines?
> I have a logging database that logs errors. The error messages contain
> newlines. Pgsql doesn't accept them on insert in a text data field.

I have never had a problem storing newlines in a text field. What interface are you using? The only ASCII character that I have found I have to escape is the single quote character.

- Bryan White
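For instance, a literal newline and a doubled single quote both insert cleanly (a minimal sketch using a hypothetical errlog table):

    psql -d mydb <<'SQL'
    CREATE TABLE errlog (msg text);
    INSERT INTO errlog VALUES ('first line
    second line, and it''s quoted');
    SELECT msg FROM errlog;
    SQL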
Re: [GENERAL] corrupted database?
> My hard disk partition with the postgres data directory got full. I
> tried to shut down postgres so I could clear some space; nothing
> happened, so I did a reboot. On restart (after clearing some
> pg_sorttemp.XX files), I discovered that all my tables appear empty!
> When I check in the data directories of the databases, I see that the
> files for each table have data (they are still the same size as
> before). I've been running some experiments on another machine and
> notice that if I remove the pg_log file, databases seem to disappear
> (or data to become invisible). So I am guessing that postgres is
> looking in one place and deciding there is no data. Now I need to get
> my data of course! Any solutions?? My programming skills are generally
> very good, so if it involves some code I'd have no problem. How do I
> get a dump of the raw data (say, copy-style output) from the table
> files? Please help! I am running v7.0.3 on linux kernel v2.2

You might want to look at pg_check, located at www.arcamax.com/pg_check. Note that the results might contain rows that have been previously deleted, and both new and old copies of rows that have been updated.
Re: [GENERAL] Number of Connections
> Hello, I'm a bit new to postgres. Is there any way to tell the current
> number of connections on a database or server? I'm having a connection
> closing problem and would like to debug it somehow. I know on Sybase
> you can check a sys table to determine this. Not familiar with how to
> do this on Postgres.

I use:

ps ax | grep postgres | wc -l

Note the value is often one too high because it picks up the grep process. Use "ps ax | grep postgres" to look at the processes and see what IPs are connected, what users, and what each backend is doing (IDLE, SELECT, ...).
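A small refinement avoids the off-by-one: the character class keeps grep's own command line from matching the pattern, so only the postgres processes are counted:

    ps ax | grep '[p]ostgres' | wc -l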
[GENERAL] Insert where not duplicate
I need to insert a bunch of records in a transaction. The transaction must not abort if a duplicate is found. I know I have seen the syntax for this before. Can someone jog my memory?

Bryan White, ArcaMax.com, VP of Technology
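One way that needs no special syntax is to insert only the rows that are not already present, so no duplicate-key error is ever raised. A sketch against a hypothetical customer table keyed on email:

    psql -d mydb <<'SQL'
    INSERT INTO customer (custid, email)
    SELECT 12345, 'new@example.com'
    WHERE NOT EXISTS
        (SELECT 1 FROM customer WHERE email = 'new@example.com');
    SQL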
[GENERAL] Create View failing
I am trying to play with views. Every time I try to create one I get a message like this:

ERROR:  pg_atoi: error reading "3642040800": Numerical result out of range

The reported number is incremented by a small amount each time. Here is the create statement:

create view ordertotals as select * from orders;

The original statement was more complex (and useful) but I simplified it to try and narrow down the problem. I always get the same error message. Here is the definition of the 'orders' table:

ec=# \d orders
Table "orders"
   Attribute    |     Type      |           Modifier
----------------+---------------+------------------------------
 orderid        | integer       | not null
 custid         | integer       | not null
 employee       | text          | not null default ''
 date           | date          | not null default date(now())
 leadsource     | text          | not null default ''
 ordersource    | text          | not null default ''
 paymenttype    | text          | not null default ''
 paymentinfo    | text          | not null default ''
 paymentexpdate | text          | not null default ''
 paymentstatus  | text          | not null default ''
 tax            | numeric(9,2)  | not null default 0
 shipping       | numeric(9,2)  | not null default 0
 shipmethod     | text          | not null default ''
 note           | text          | not null default ''
 shipdate       | date          |
 camptail       | text          | not null default ''
 company        | text          | not null default ''
 title          | text          | not null default ''
 lname          | text          | not null default ''
 fname          | text          | not null default ''
 addr1          | text          | not null default ''
 addr2          | text          | not null default ''
 city           | text          | not null default ''
 state          | text          | not null default ''
 zip            | text          | not null default ''
 country        | text          | not null default ''
 phone          | text          | not null default ''
 batchid        | text          | not null default ''
 paydate        | date          |
 shipemp        | text          |
 couponid       | integer       |
 couponamt      | numeric(10,2) | default '0.00'
Indices: iordcus3, iorddate3, iordid3, iordldsrc3

Bryan White, ArcaMax.com, VP of Technology
[GENERAL] pg_check 0.1.3 is now Available
I sent this to the 'announce' list earlier but it did not show up. My guess is that list is moderated, so I will post again here.

pg_check is a command line tool I have written to aid in diagnosing and recovering from PostgreSQL table corruptions. See the ReadMe file for more information. It is available here: http://www.arcamax.com/pg_check/

I am looking for suggestions as to how to make it more useful, so please look it over.

Bryan White, ArcaMax.com, VP of Technology
Re: [GENERAL] I tried to increase the block size
> I tried to increase the block size from 8K to 32K and received an IPC
> error. Now IPC is compiled into the kernel, so why would I get this
> error? I switched it back to 8K and it runs fine.

Did you dump your database(s) before the change and initdb/reload them after? I presume this is needed, and it could wreak all sorts of havoc if you don't. Also there is a script 'ipcclean' that is supposed to reinitialize some stuff to do with IPC (yes, that is vague, but it's all I know).
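The dump/reload cycle sketched out — paths and the config header location are typical for source builds of this era and may differ on your installation:

    pg_dumpall > /tmp/all.sql            # with the old 8K binaries running
    # stop the postmaster, move the old data directory aside,
    # change BLCKSZ in src/include/config.h, recompile, reinstall
    initdb /usr/local/pgsql/data         # fresh cluster at the new block size
    postmaster -D /usr/local/pgsql/data &
    psql -f /tmp/all.sql template1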
Re: [GENERAL] pg_check 0.1.3 is now Available
> Fascinating. Looks like a possible framework for building a standalone
> dumping utility for migration.

It could be turned into that. It already does all the parsing; you would just have to change the output functions for the desired format.
Re: [GENERAL] how good is PostgreSQL
> Whenever a query is executed (not found in cache, etc.), the caching
> system would simply store the query, the results, and a list of tables
> queried. When a new query came in, it would do a quick lookup in the
> query hash to see if it already had the results. If so, whammo.
> Whenever an insert/delete/update was sensed, it would look at the
> tables being affected, and the caching mechanism would clear out the
> entries depending on those tables.

It seems to me that tracking the list of cached queries and watching for queries that might invalidate them adds a lot of complexity to the back end, and the front end still has to establish the connection and wait to transfer the data over the socket. On a more practical level, a backend solution would require someone with fairly detailed knowledge of the internals of the backend; a front-end solution is more likely to be implementable by someone not as knowledgeable. One of the big advantages of your technique is that there is no code change at the application level. This means less database lock-in. Maybe that is a disadvantage too. ;-)
[GENERAL] pg_dump's over 2GB
My current backups made with pg_dump are currently 1.3GB. I am wondering what kind of headaches I will have to deal with once they exceed 2GB. What will happen with pg_dump on a Linux 2.2.14 i386 kernel when the output exceeds 2GB? Currently the dump file is later fed to a 'tar cvfz'. I am thinking that instead I will need to pipe pg_dump's output into gzip, thus avoiding the creation of a file of that size. Does anyone have experience with this sort of thing?

Bryan White, ArcaMax.com, VP of Technology
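The pipe approach works, and split(1) additionally keeps any single output file under the 2GB filesystem limit — essentially the recipe from the PostgreSQL FAQ of the time:

    pg_dump mydb | gzip | split -b 1000m - mydb.dump.gz.

    # to restore:
    cat mydb.dump.gz.* | gunzip | psql mydb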
[GENERAL] Corrupt Table
I have apparently picked up a corrupt record in a table.

What happened: Yesterday at one point the database seemed to hang. There were three backend processes consuming large amounts of CPU time. I stopped the server and rebooted (3 months since last reboot). The database restarted and seemed to be fine. Then last night the nightly backups failed, apparently when reading the 'customer' table. The database restarted itself. There have been a couple of database restarts since then. As far as I can tell it is the customer table that is the problem. Here is what a failure looks like in the log file:

Server process (pid 2864) exited with status 139 at Thu Sep 14 10:13:11 2000
Terminating any active server processes...
000914.10:13:11.425 [5879] NOTICE:  Message from PostgreSQL backend:
        The Postmaster has informed me that some other backend died abnormally ---

The last entry is repeated multiple times.

I have written a small utility program (pganal). It looks for inconsistencies in page layout and tuple layout. My original intent was to parse the tuple internal structure as well, but that proved to be more complex than I was ready to handle at the time. Anyway, I stopped the database, copied the customer file to another directory, and restarted the database. Here is the pganal output from this copy:

Analyzing customer
Page 25878 ERROR: pd_lower is too small
  pd_lower=0 pd_upper=0 pd_special=0 pd_opaque.od_pagesize=0
Page 31395 ERROR: pd_lower is too small
  pd_lower=0 pd_upper=0 pd_special=0 pd_opaque.od_pagesize=0
Page 32950 ERROR: pd_lower is too small
  pd_lower=0 pd_upper=0 pd_special=0 pd_opaque.od_pagesize=0
Tuple 71453.0 Ofs=8029 Len=164 Flags=1 Error: tuple overwrites pd_special
Tuple 71453.4 Ofs=7346 Len=208 Flags=1 Error: tuple overlaps another
Tuple 71453.40 Ofs=1365 Len=160 Flags=1 Error: tuple overlaps another
Page 71958 ERROR: pd_lower has odd value
  pd_lower=11886 pd_upper=24239 pd_special=109 pd_opaque.od_pagesize=0
Page 73622 ERROR: pd_lower is too small
  pd_lower=0 pd_upper=0 pd_special=0 pd_opaque.od_pagesize=0

Page Summary
  Data Pages   = 76555
  Unused Pages = 0
  New Pages    = 0
  Empty Pages  = 0
  Bad Pages    = 5
  Total Pages  = 76560

Tuple Summary
  O/L Error Tuples = 1
  Overlaped Tuples = 2
  Unused Tuples    = 47994
  Used Tuples      = 3698495
  Total Tuples     = 3746492

I suspect the 'pd_lower is too small' errors may be just my misunderstanding of the page layout. The three tuple errors (all on the same page) and the 'pd_lower has odd value' error seem to be real. 'pd_lower has odd value' comes from:

int nitems = (phd->pd_lower - sizeof(*phd)) / sizeof(ItemIdData);
if (nitems * sizeof(ItemIdData) != phd->pd_lower - sizeof(*phd))
    pderr = "pd_lower has odd value";

Basically it means pd_lower did not leave room for an integral number of ItemIdData structures. I seem to have two separate corrupt pages. I can post the full source to pganal if anyone is interested; it's about 300 lines.

My question is how to proceed from here. Going back to the previous day's backup would be very painful in terms of lost data. I suspect the answer is to perform surgery on the bad pages and then rebuild indexes, but this is a scary idea. Has anyone else created tools to deal with this kind of problem?

Bryan White, ArcaMax.com, VP of Technology
Re: [GENERAL] Corrupt Table
Here is a follow-up. I did a hex/ascii dump of the 3 bad tuples. In the dump I could pick out an email address; this is an indexed field. I did a select on each of them in the live database. The 1st and 3rd were not found. The second worked OK if I only selected the customer id (an int4 and the first field in the record), but the custid reported seems to be nonsense. The backend crashed if I selected the whole record.
Re: [GENERAL] Corrupt Table
>> Server process (pid 2864) exited with status 139 at Thu Sep 14 10:13:11 2000
>
> That should produce a coredump --- can you get a backtrace?

I found a core file. I am not all that familiar with gdb, but the backtrace looks useless:

#0  0x8064fb4 in ?? ()
#1  0x809da10 in ?? ()
#2  0x809e538 in ?? ()
#3  0x809e0c6 in ?? ()
#4  0x809e176 in ?? ()
#5  0x809e2dc in ?? ()
#6  0x809e5ab in ?? ()
#7  0x809e666 in ?? ()
#8  0x809eb00 in ?? ()
#9  0x80a1f90 in ?? ()
#10 0x809d3ae in ?? ()
#11 0x80a31a4 in ?? ()
#12 0x809d3b7 in ?? ()
#13 0x809c6b9 in ?? ()
#14 0x809bd0e in ?? ()
#15 0x80ec132 in ?? ()
#16 0x80ec19c in ?? ()
#17 0x80eadc7 in ?? ()
#18 0x80eaca7 in ?? ()
#19 0x80ebba2 in ?? ()
#20 0x80d61f2 in ?? ()
#21 0x80d5dd1 in ?? ()
#22 0x80d518a in ?? ()
#23 0x80d4c14 in ?? ()
#24 0x80ab736 in ?? ()
#25 0x401029cb in ?? ()

> Cool; want to submit it as a contrib item? This sounds like something
> that could be gradually improved into a "fsck" kind of thing...

Right now it is sort of 'hack it up as needed'. I will try and polish it up and add command line options to control it.

> Hmm. The all-zero pages (I expect you'll find that 25878 etc. are *all*
> zeroes, not just their header fields) look like an old bug wherein a
> newly-added page might not get initialized if the transaction that's
> adding the page aborts just after allocating the page. I thought I'd
> fixed that in 7.0 though. You are running 7.0.2 I hope?

Yes, I am running 7.0.2. The 4 pages in question have 0's in the first 16 bytes but other data after that. I see some text that looks like real data.

> A VACUUM should patch up zero pages. I'm guessing that you haven't
> vacuumed this table in a long time...

Vacuum occurs nightly just after the backup. I checked and it ran fine the night before. Last night's vacuum reported:

psql: The Data Base System is in recovery mode

> Page 71958 looks pretty badly corrupted --- you'll have to look at
> that and see if you can clean it up by hand. Something fishy about
> 71453 as well. Worst case, you could set these pages to all-zero by
> hand, and just lose the tuples thereon rather than the whole table.
> How fast does your app add/update tuples in this table? If you are
> lucky, the tuples in pages 71453 and 71958 might be available from
> your last successful backup, in which case trying to patch up the
> page contents by hand is probably a waste of effort. Zero those
> pages, dump out the current contents of the file with COPY, and start
> comparing that to your last backup. The fact that you haven't
> vacuumed will make this pretty easy, because the tuple ordering
> should be the same. If you do choose to recover by zeroing pages,
> it'd be a good idea to drop and recreate the indexes on the table.
> Sooner or later you should do a vacuum to fix the zero pages, but not
> just yet --- you want to leave the tuples in their current ordering
> for comparison to your backup ...

I suspect diff will produce more output than I want to deal with. Customer records are never deleted from this table, so I think the thing to do is copy all customers from the previous good backup that are not in a cleaned-up customer table. I will lose some edits but it should not be too bad.

Ok, here is my plan:

 1) Stop the server.
 2) Back up the physical customer file.
 3) Zero out all the corrupt pages (see the dd sketch after this list).
 4) Restart the database for localhost access only.
 5) Dump the customer table.
 6) Reload the customer table from the dump (so I know it is now clean).
 7) Recreate the customer indexes.
 8) Vacuum the customer table.
 9) Restart the database for normal access.
10) Load the last good backup into a test database.
11) Rename the test database customer table to custbackup.
12) Load the customer dump from above into the test database.
13) Run a program on the test database to produce insert statements for customer records in custbackup but not in the customer table.
14) Apply the above insert statements to the live database.
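Step 3 can be done with dd, taking advantage of the fixed 8K page size. A sketch for one of the pages pganal flagged — run it only with the postmaster stopped, and practice on the backup copy from step 2 first:

    # zero out page 71958 of the customer heap file, in place
    dd if=/dev/zero of=customer bs=8192 seek=71958 count=1 conv=notrunc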
Re: [GENERAL] Great Bridge benchmark results for Postgres, 4 others
> Greetings all, At long last, here are the results of the benchmarking
> tests that Great Bridge conducted in its initial exploration of
> PostgreSQL. We held it up so we could test the shipping release of the
> new Interbase 6.0. This is a news release that went out today. The
> release is also on our website at
> http://www.greatbridge.com/news/p_081420001.html. Graphics of the
> AS3AP and TPC-C test results are at
> http://www.greatbridge.com/img/as3ap.gif and
> http://www.greatbridge.com/img/tpc-c.gif respectively.

This looks great — better than I would have expected. However, I have some concerns:

1) Using only ODBC drivers. I don't know how much of an impact a driver can make, but it would seem that using native drivers would shut down one source of objections.

2) Postgres has the 'vacuum' process, which is typically run nightly and which, if not accounted for in the benchmark, would give Postgres an artificial edge. I don't know how you would account for it, but in fairness I think it should be acknowledged. Do the other big databases have similar maintenance issues?

3) The test system has 512MB RAM. Given the licensing structure and high licensing fees, users have an incentive to use much larger amounts of RAM. Someone who can only afford 512MB probably can't afford the big names anyway.

4) The article does not mention the speed or number of CPUs, or anything about the disks other than size. I can halfway infer that they are SCSI, but how are they laid out?

I am not trying to tear the benchmark down, just wanting to make it more immune to such attempts.
Re: [GENERAL] Corrupted Table
> Status 139 indicates a SEGV trap on most Unixen. There should be a
> core dump left by the crashed backend --- can you get a backtrace from
> it with gdb? I concur that this probably indicates corrupted data in
> the file. We may or may not be able to guess how it got corrupted, but
> a stack trace seems like the place to start.

Here is the backtrace:

#0  0x808b0e1 in CopyTo ()
#1  0x808ae2f in DoCopy ()
#2  0x80ec7c1 in ProcessUtility ()
#3  0x80ead48 in pg_exec_query_dest ()
#4  0x80eaca7 in pg_exec_query ()
#5  0x80ebba2 in PostgresMain ()
#6  0x80d61f2 in DoBackend ()
#7  0x80d5dd1 in BackendStartup ()
#8  0x80d518a in ServerLoop ()
#9  0x80d4c14 in PostmasterMain ()
#10 0x80ab736 in main ()
#11 0x401029cb in __libc_start_main (main=0x80ab6d0 <main>, argc=8,
    argv=0xbb54, init=0x8063fac <_init>, fini=0x812969c <_fini>,
    rtld_fini=0x4000ae60 <_dl_fini>, stack_end=0xbb4c)
    at ../sysdeps/generic/libc-start.c:92

BTW, this is Postgres 7.0.2 on i386/RedHat 6.2. The core file was made when I tried to dump the table. As far as I can tell the corruption occurred on Friday, because that is the date of my last good automated backup.
Re: [GENERAL] Corrupted Table
> Hmm. Assuming that it is a corrupted-data issue, the only likely
> failure spot that I see in CopyTo() is the heap_getattr macro. A
> plausible theory is that the length word of a variable-length field
> (e.g., a text column) has gotten corrupted, so that when the code
> tries to access the next field beyond that, it calculates a pointer
> off the end of memory. You will probably find that plain SELECT will
> die too if it tries to extract data from the corrupted tuple or
> tuples. With judicious use of SELECT last-column ... LIMIT you might
> be able to narrow down which tuples are bad, and then dump out the
> disk block containing them (use the 'tid' pseudo-attribute to see
> which block a tuple is in). I'm not sure if the exercise will lead to
> anything useful or not, but if you want to pursue it...

I am willing to spend some time to track this down. However, I would prefer not to keep crashing my live database. I would like to copy the raw data files to a backup machine. Are there any catches in doing this? This particular table is only updated at predictable times on the live system. I am guessing that as long as it is stable for at least a few minutes before I copy the file, it will work.

How hard would it be to write a utility that would walk a table looking for this kind of corruption? Are the on-disk data formats documented anywhere?
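Tom's narrowing technique, sketched: walk the table in slices until one slice kills the backend, then home in; ctid shows which block each readable tuple lives in (column names are illustrative):

    psql -d mydb <<'SQL'
    -- the slice that crashes brackets the bad tuple
    SELECT ctid, custid FROM customer LIMIT 1000 OFFSET 0;
    SELECT ctid, custid FROM customer LIMIT 1000 OFFSET 1000;
    -- near the bad spot, select the last column to force full tuple decoding
    SELECT email FROM customer LIMIT 10 OFFSET 1990;
    SQL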
Re: [GENERAL] Corrupted Table
> Shut down the postmaster and then copy the entire db (including the
> pg_log file) and it should work. The catch is to make sure pg_log is
> in sync with your table files.

I would rather not leave my database down long enough to copy the entire db (3.5GB). I have control over when changes are applied to this table. If I restart the database and make certain no updates/inserts/deletes occur, will all info be flushed from the pg_log file? If not, how about if I first vacuum the table?
[GENERAL] table constraints and pg_dump
I have been looking at the new syntax in CREATE TABLE, such as unique and primary key constraints (new as in I just noticed it; I don't know when it was added). It seems to me there is a minor gotcha when using pg_dump/psql to reload a database.

When indexes were created separately, pg_dump would put the index creation at the end of its output, after the data was loaded. My understanding is that this is faster than creating the indexes first and then populating the table. When using table constraints, the constraints are created as part of the table creation, so there will be a performance hit when loading a dumped database. This discourages using these new features when the table may become large.

The solution, it would seem, is to expand ALTER TABLE to allow adding the constraints after the COPY, and to change pg_dump accordingly. On the surface these changes to ALTER TABLE don't look too complex. The only issue is what to do if the constraint is not met by pre-existing data. The ALTER statement should probably fail in these cases, or maybe there should be an option to force it to accept the existing data. The issues are the same as in CREATE UNIQUE INDEX.
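The proposed workflow, sketched in the ALTER TABLE ... ADD CONSTRAINT syntax that later PostgreSQL releases did adopt (table, file, and constraint names are illustrative):

    psql -d mydb <<'SQL'
    CREATE TABLE orders (orderid integer, custid integer);
    COPY orders FROM '/tmp/orders.dat';
    -- constraint added after the bulk load, as proposed above
    ALTER TABLE orders ADD CONSTRAINT orders_pkey PRIMARY KEY (orderid);
    SQL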
Re: [GENERAL] Bigger sequences (int8)
> Can I make a sequence use an int8 instead of int4? I have an
> application where, over a few years, it's quite possible to hit the
> ~2 billion limit (~4 billion if I start the sequence at -2 billion).
> There won't be that many records in the table, but there will be that
> many inserts; in other words, there will be many deletes as well. If
> I CYCLE, old records could still be lingering, and I have the
> overhead of checking every NEXTVAL to make sure it's not already
> being used. :-( Any other ideas? I could use two int4's together as a
> primary key and do some crazy math to increment, or jump through
> other hoops (besides CYCLE) to intelligently reuse keys ... but then
> I have some ugly overhead as well. I really want the sequence to just
> be an int8. Are we talking about a heinous amount of work in the
> source code to allow this option? I wouldn't want to mess with the
> "CREATE SEQUENCE" syntax; it would seem more appropriate as a
> compile-time option.

I'm no expert on the backend, but it seems to me you would not even have to change the syntax. The maxval defaults to 2 billion; for an int8 sequence, just specify a maxval greater than that. Actually it may make the most sense to always use 64-bit values for the sequence, and just leave the default ranges in place for compatibility.
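On releases where sequences did become 64-bit internally, the no-new-syntax idea works exactly as described — shown here purely as an illustration, since sequences of this era were still int4:

    psql -d mydb <<'SQL'
    CREATE SEQUENCE bigseq MAXVALUE 9223372036854775807;
    SELECT nextval('bigseq');
    SQL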
Re: [GENERAL] Cannot INDEX an Access97 ODBC export?
> Error return is that it is not able to find the attribute
> "any_column_name" in the table.

This may be obvious, but have you looked at the table layout to see if the column exists? You may have a problem with spaces in the name or upper-case letters in the name. In either case you must quote the name (table or column); unquoted names are converted to all lower case.

In psql do:

\d tablename

or from the shell do:

pg_dump -s -t tablename
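For example, mixed-case or space-containing names coming out of Access must be double-quoted (the names here are illustrative):

    psql -d mydb <<'SQL'
    SELECT "OrderID" FROM "Order Details";  -- works
    SELECT OrderID  FROM "Order Details";   -- fails: folded to orderid
    SQL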
[GENERAL] Vacuum analyze vs just Vacuum
I would just like to check an assumption. I "vacuum analyze" regularly. I have always assumed that this did a plain vacuum in addition to gathering statistics. Is this true? The documentation never states explicitly one way or the other, but it almost suggests that they are independent operations.
Re: [GENERAL] Update Performance from 6.5.0 to 6.5.3 to 7.0
> That's good, but does it mean that 7.0 is slower about adding index
> entries than 6.5 was? Or did you have fewer indexes on the table when
> you were using 6.5?

No, the indexes have been there all along. My impression is the performance loss was between 6.5.0 and 6.5.3. I had just ignored the problem for a while.
Re: [GENERAL] PG 7.0 is 2.5 times slower running a big report
> Well, when you have 2.7 million records in a database, the code might
> be as good as it can be.

I have recovered the performance lost when I moved to Postgres 7.0 by executing SET enable_indexscan = OFF before creating my cursors and turning it back on for the inner loop query. It may even be faster than before, so I am happy.

> So each run of the log takes 2.7 million queries. Is there no way to
> use a where clause to limit the scope of your queries?

No, the point is that the resulting report covers the entire database.

> An explanation of the purpose of these queries would help me think
> about what you are after. Maybe it isn't a code problem; maybe it's a
> business rules problem. At the moment, I don't understand why you
> need to go through all the items in the db on every day. Can't you
> just go through the items that were updated on a given day? So, in
> general, what is the goal of this report?

The report summarizes a lot of different aspects of our database. It breaks down totals by a variety of variables. For example, orders are broken down by source, date, status, and sku (most orders have only one line item). I could come up with quicker methods to do any one of the breakdowns, but to do them all it is more efficient to make a single pass over all the data.

> I'm new to the cursor method (and to PostgreSQL, though I've done a
> lot of work with MySQL), but it is interesting, so I'm certainly
> going to look into it.

Part of the advantage of the cursor is in memory management. In a non-cursor select, the database builds and transfers the entire result set to the front-end application. For large queries this can be quite a chunk of memory. With a sequential scan and sort of the database I think it is still buffering the result set on the back end, but one copy is better than two. Keep in mind I am running this report on a backup database server, so I don't have to worry much about other processes being hurt by the load. It seems that with index scans the cursors start producing data right away (but the overall rate is slower). With sequential scan and sort, the report gets no data for the first 30 minutes and then runs at about 4 times the rate of the index scan.

> What difference do you want to capture in distinguishing the customer
> table from the custlist table?

I am not certain I understand the question. The custlist table contains subscriptions to various services. It is a very simple table containing a listid and a custid.
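The trick described above as a script sketch (table and cursor names are illustrative): index scans are disabled only while the big cursors are planned, then re-enabled so the small inner-loop query can still use an index:

    psql -d mydb <<'SQL'
    BEGIN;
    SET enable_indexscan = OFF;   -- big cursors get seqscan+sort plans
    DECLARE cust_cur CURSOR FOR SELECT * FROM customer ORDER BY custid;
    DECLARE ord_cur  CURSOR FOR SELECT * FROM orders   ORDER BY custid;
    SET enable_indexscan = ON;    -- per-customer lookups may use indexes
    FETCH 100 FROM cust_cur;
    COMMIT;
    SQL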
Re: [GENERAL] PG 7.0 is 2.5 times slower running a big report
> I have set index_scan off for tomorrow morning's run. I will let you
> know what happens.

I think my problem is fixed. By disabling index scan while creating the cursors and then turning it back on again for the small query that occurs in my inner loop, the performance has returned to normal (it may in fact be better than before). Thanks a bunch.

Bryan
[GENERAL] PG 7.0 is 2.5 times slower running a big report
I have a large report that I run once a day. Under 6.5.3 it took just over 3 hours to run. Under 7.0 it is now taking 8 hours to run. No other changes were made.

This is on RedHat Linux 6.2: a PIII 733 with 384MB RAM and 2 IDE 7200 RPM disks. One disk contains the Postgres directory, including the data directory, and the other disk has everything else. The postmaster is started with these options: -i -B 1024 -N 256 -o -F

The report is being run on a backup server just after the database has been loaded from a dump and 'vacuum analyze'd. There is practically nothing else running on the box during the run.

The report creates four separate concurrent cursors. Each of the queries is sorted by a customer number, which is an index. The index is unique in the customer table but not the others. For the other cursors it pops values as needed to process data for the current customer number. There are also other selects that are run for each customer order processed, to retrieve its line items. The report does not update the database at all; it is just accumulating totals that will be written out when the report finishes.

Top tells me the front end process is using 5 to 10 percent of the CPU and the back end is using 10 to 20 percent. The load average is about 1.0 and the CPU is about 80% idle. I am pretty certain that on 6.5.3 the CPU usage was much higher. It's almost like the new version has some sort of throttle to keep one backend from saturating the system. Indeed the box is much more responsive than it used to be while running this report. Suggestions?
Re: [GENERAL] shutdown gracefully single user mode?
> hi, all experts there, greetings! Just minutes ago, my boss found out
> one of the attributes in a table is too short (varchar 64 for url);
> we need to make it wider, to 85, A.S.A.P. It seems that ALTER TABLE
> cannot do it. So I used pg_dump (how to do it gracefully?),
> immediately dropped the table, shut down the postmaster (not
> necessary?), changed the dumped table definition, then restarted the
> postmaster and used psql to reload the data. Our database is not
> large YET, so it took ONLY 10 minutes to reload; the job was done.
> But I feel nervous -- because I expect this will happen again SOON.
> What is the "standard" procedure in PostgreSQL for this kind of
> thing? More specifically (maybe there are some other ways?):
> 1) How do I shut down the postmaster gracefully (I simply used kill
>    on the process ID; I feel lucky that I did not need -9!) so that
>    I'm sure the data is not corrupted?
> 2) How do I "shut down" (drop) just one table gracefully?
> 3) Is there any simpler, alter-table-like way to widen the attribute?
> 4) Single user mode?

I usually create a new table, do an insert of data from a select on the old table, then rename the tables. Don't forget to recreate any indexes. I usually write it all up in a single script and run it on a test database beforehand. For small tables, or tables that don't change under normal usage, this works reasonably well. For larger, more critical tables I would be tempted to pull the ethernet plug during the changeover; of course that depends on how your system is laid out.

NOTE: I don't rely on the ALTER TABLE ADD COLUMN feature. It seems to ignore the new default value entirely (even on data added after the alteration). Also, I suggest using the 'text' data type. It has no length limit other than the 8K tuple size.
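The create/copy/rename procedure as a script sketch (table, column, and index names are illustrative; as suggested above, try it on a test database first):

    psql -d mydb <<'SQL'
    CREATE TABLE links_new (id integer, url varchar(85));
    INSERT INTO links_new SELECT id, url FROM links;
    ALTER TABLE links     RENAME TO links_old;
    ALTER TABLE links_new RENAME TO links;
    -- recreate any indexes on the new table
    CREATE INDEX links_url_idx ON links (url);
    SQL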
[GENERAL] alter table add column is broken in 6.5
I posted this yesterday and got no response, so I will try again.

ALTER TABLE ADD COLUMN does not set the default value for new rows added to a table. I can accept that it does not copy the new default value for existing rows; that is easy to work around. My problem is that the database does not reflect the default value for new rows added after the ALTER statement. I could work around this if someone could tell me how to modify the system tables to specify a default value. There does not seem to be much documentation for the layout of the system tables.

Bryan White ArcaMax Inc. Yorktown VA www.arcamax.com
[GENERAL] Default values not working
If I add a column to a table using:

alter table mytable add column myint int not null default 0;

the default value does not seem to make it into the data dictionary. This is using Postgres 6.5. Is there a way to manually poke a default value into the data dictionary?

Bryan White ArcaMax Inc. Yorktown VA www.arcamax.com
[GENERAL] Regression test core dumps
I am installing 6.4.2 on a new PII 350 with 128MB RAM, plenty of disk space, and Red Hat 5.2. I have been using 6.4.0 on a similar machine for a while now. It seems to be failing when running the regression tests; I seem to be getting a core dump. There are core files left in both /usr/src/pgsql/src/test and /usr/src/pgsql/src/test/bench. The end of the output of 'gmake all runtest' looks like this:

DEBUG:  --Relation bprime--
DEBUG:  Pages 25: Changed 25, Reapped 0, Empty 0, New 0; Tup 1000: Vac 0,...
gmake[1]: *** Deleting file `bench.out'
gmake: *** [runtest] Quit (core dumped)
Quit (core dumped)

Has anybody run into this before?

Bryan White ArcaMax Inc. Yorktown VA www.arcamax.com
[GENERAL] Rename a database
Is there a way to rename a database? I currently have a 'live' and a 'test' database. For reasons I won't delve into, all the programs are currently using 'test'. I want a quick way of getting all the data into 'live'. I am guessing I can shut down the postmaster and then move and/or copy directories in /usr/local/pgsql/data. Can someone confirm this?

Bryan White ArcaMax Inc. Yorktown VA www.arcamax.com
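A route that avoids guessing about the on-disk layout, assuming you can afford a dump/reload (database names from the post):

    psql -c "CREATE DATABASE live" template1
    pg_dump test | psql live
    # once 'live' checks out:
    psql -c "DROP DATABASE test" template1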
[GENERAL] what is WaitOnLock Error
I get this error:

WaitOnLock: error on wakeup - Aborting this transaction

What does it mean?

Background: I am running Postgres 6.4 on Linux. I have a process to handle subscription requests via email. We are running qmail and it directs the mail to this C program I wrote. It parses the message and tries to add the user to the database. This is the sequence that occurs when the process fails:

begin work
select from customer            -- returns 0 rows (not on file)
select nextval('seqcuscustid')  -- returns the next custid
insert into customer ( custid, email ) values ( , '[EMAIL PROTECTED]' );
                                -- this is where the error occurs

The process works all the time when I test it with single messages. It fails 75% of the time when I redirect a bunch of messages (these messages have been accumulating in a Eudora mailbox while I got the process set up). Might this be some sort of resource problem?

Bryan White ArcaMax Inc. Yorktown VA www.arcamax.com
Re: [GENERAL] Other user access to database
> Now that I have a database functional, I need to allow other users to
> have access. Using createuser I can give other users access to the
> postmaster, but I need to give them access to my database as well.
> Could someone enlighten me?

Access to tables is controlled with the GRANT and REVOKE SQL statements. From memory, try:

grant OPERATION on TABLENAME to USER;

where OPERATION is one of SELECT, UPDATE, DELETE, INSERT, or ALL for all operations; TABLENAME is a comma-separated list of tables; and USER is a comma-separated list of users or the keyword PUBLIC.
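A concrete sketch (user and table names are illustrative):

    psql -d mydb <<'SQL'
    GRANT SELECT, INSERT ON customer, orders TO alice;
    GRANT ALL ON orders TO PUBLIC;
    REVOKE DELETE ON orders FROM PUBLIC;
    SQL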