Re: [GENERAL] shared library version problems

2006-08-23 Thread Bryan White

Tom Lane wrote:

If the Fedora 5 RPMs won't install on your FC4 machine, grab the FC5
SRPM and do a quick rpmbuild --rebuild to make custom RPMs for your
environment.

After about 5 minutes of compiling I get this:
==
pg_regress: initdb failed
Examine ./log/initdb.log for the reason.

make: *** [check] Error 2
error: Bad exit status from /var/tmp/rpm-tmp.67109 (%build)


RPM build errors:
Bad exit status from /var/tmp/rpm-tmp.67109 (%build)
== /usr/src/redhat/BUILD/postgresql-8.1.4/
src/test/regress/log/initdb.log
Running in noclean mode.  Mistakes will not be cleaned up.
initdb: cannot be run as root
Please log in (using, e.g., su) as the (unprivileged) user that will
own the server process.

--
Bryan White, ArcaMax Publishing Inc.




Re: [GENERAL] shared library version problems

2006-08-23 Thread Bryan White

Tom Lane wrote:

Don't do the rpmbuild as root.  Alternatively, I believe there's a
%define you can set to skip the regression test ... but that's
probably not a good idea.


I think I have it solved now.  I am not too familiar with the process of 
building from source RPMs.  You said not to do it as root, but that meant 
I did not have write access to /usr/src/redhat.  I tried a couple of options 
for building from a different location without much luck.  In the end I moved 
/usr/src/redhat to /usr/src/redhat.old, created a new one (including 
sub-directories), and made myself the owner.  It then builds fine.  Seems 
like there has to be an easier way.
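
For what it's worth, the usual way to avoid building in /usr/src/redhat is to
point rpmbuild at a tree under your home directory.  A minimal sketch
(directory layout and SRPM file name are illustrative):

mkdir -p ~/rpmbuild/{BUILD,RPMS,SOURCES,SPECS,SRPMS}
echo '%_topdir %(echo $HOME)/rpmbuild' > ~/.rpmmacros
rpmbuild --rebuild postgresql-8.1.4-1.src.rpm   # substitute the SRPM you downloaded

After that, the rebuilt binary RPMs land under ~/rpmbuild/RPMS.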


Anyway, after installing the new RPMs on my FC4 dev server and 
rebuilding my programs, the programs do now run on my web server (stock 
FC4 PostgreSQL).


Thanks for your help

--
Bryan White, ArcaMax Publishing Inc.





[GENERAL] shared library version problems

2006-08-22 Thread Bryan White
I am having problems with my libpq programs crashing.  This seems to be 
a version incompatibility and I want to find out how to best proceed.


My main database is running Fedora Core 5 with the supplied PostgreSQL 
8.1.4.


My web server is running Fedora Core 4 with the supplied PostgreSQL 8.0.8.

My dev server was running the same setup as the web server.  The 
difference is that it acts as its own database server.  I was 
uncomfortable running an older version of the server on my test system 
than on the live system.  So yesterday I removed the OS-supplied 
PostgreSQL RPMs and installed 8.1.4 from RPMs on the PostgreSQL download 
site.


Today I discovered that programs that I compile on my dev server will 
segfault when run on the live web server.


On my live web server I have:
   /usr/lib/libpq.a
   /usr/lib/libpq.so -> libpq.so.4.0
   /usr/lib/libpq.so.4   -> libpq.so.4.0
   /usr/lib/libpq.so.4.0

On my dev server I have:
   /usr/lib/libpq.a
   /usr/lib/libpq.so -> libpq.so.4.1
   /usr/lib/libpq.so.4   -> libpq.so.4.1
   /usr/lib/libpq.so.4.1

My programs are compiled with -lpq

Is there something I can do on my dev server to get it to produce 
programs that will run on my live server?  Note: I would rather change 
the dev server because there are about 12 other live servers that would 
also need to be fixed.


Is there a document somewhere that discusses how to handle these types 
of issues?


Note:  In the recent past I had been running 7.1.x on my dev server and 
had no problems running the produced programs on a live server with 
8.0.x libraries.

--
Bryan White, ArcaMax Publishing Inc.

Bryan is used to being beast of burden to other people's needs.
Very sad life. Probably have very sad death. But, at least there
is symmetry.



Re: [GENERAL] shared library version problems

2006-08-22 Thread Bryan White

Tom Lane wrote:

Bryan White [EMAIL PROTECTED] writes:
I am having problems with my libpq programs crashing.  This seems to be 
a version incompatibility and I want to find out how to best proceed.


My main database is running Fedora Core 5 with the supplied PostgreSQL 
8.1.4.



My web server is running Fedora Core 4 with the supplied PostgreSQL 8.0.8.


My dev server was running the same setup as the web server.  The 
difference is that it acts as its own database server.  I was 
uncomfortable running an older version of the server on my test system 
than on the live system.  So yesterday I removed the OS-supplied 
PostgreSQL RPMs and installed 8.1.4 from RPMs on the PostgreSQL download 
site.


Today I discovered that programs that I compile on my dev server will 
segfault when run on the live web server.


Can you get a core dump and provide a gdb backtrace from the segfault?
Right offhand I see no difference in the claimed API of 8.0 and 8.1
libpq except that 8.1 adds lo_create(), which I suppose you're not
using.  So while this isn't good practice in general, I don't see
why it wouldn't work in this particular case.


I have used both gdb and valgrind with full debug builds.  The segfault 
does not seem to occur in Postgres related code.  It occurs before any 
database connection is established.  If it makes a difference, the code 
is all written in C++.


The reason I suspect the Postgres lib is because there have been no 
changes to this code.  After upgrading the dev server yesterday, I 
rolled out a small fix this morning and started seeing the segfault. 
Reverting the change did not fix it.  Compiling a clean subversion 
checkout on both boxes confirmed that code compiled on the dev box will 
not run on the web server but code compiled on the web server runs on 
either box.



One thing you should check is whether both libs were built with the same
options (compare pg_config --configure output from the 8.0 and 8.1
installations).


I think that might be the problem.  These are the differences in 
pg_config --configure output:


dev server:
 '--host=i686-redhat-linux-gnu'
 '--build=i686-redhat-linux-gnu'
 '--target=i686-redhat-linux'
 '--with-includes=/usr/include'
 '--with-libraries=/usr/lib'
 'CFLAGS=-O2 -g -march=i686 -I/usr/include/et'
 'CPPFLAGS= -I/usr/include/et'
 'build_alias=i686-redhat-linux-gnu'
 'host_alias=i686-redhat-linux-gnu'
 'target_alias=i686-redhat-linux'

web server:
 '--build=i386-redhat-linux'
 '--host=i386-redhat-linux'
 '--target=i386-redhat-linux-gnu'
 '--with-tcl'
 '--with-tclconfig=/usr/lib'
 '--enable-thread-safety'
 'CFLAGS=-O2 -g -pipe -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -m32 
-march=i386 -mtune=pentium4 -fasynchronous-unwind-tables'

 'build_alias=i386-redhat-linux'
 'host_alias=i386-redhat-linux'
 'target_alias=i386-redhat-linux-gnu'

I note that Postgres is packaged in the following RPMS:
postgresql-devel
postgresql-libs
postgresql-server
postgresql

Does it work to install the postgresql-server RPM from the 8.1 version 
and the others from Fedora Core 4's included 8.0 version?


--
Bryan White, ArcaMax Publishing Inc.





Re: [GENERAL] shared library version problems

2006-08-22 Thread Bryan White

Martijn van Oosterhout wrote:

Can you provide the output of ldd? The libraries are supposed to be
reasonably compatible.


web server:
linux-gate.so.1 =>  (0x00ab)
libexpat.so.0 => /usr/lib/libexpat.so.0 (0x00411000)
libpq.so.4 => /usr/lib/libpq.so.4 (0x00324000)
libssl.so.5 => /lib/libssl.so.5 (0x005c2000)
libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x0069c000)
libm.so.6 => /lib/libm.so.6 (0x0049e000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00bd7000)
libc.so.6 => /lib/libc.so.6 (0x00d82000)
libpthread.so.0 => /lib/libpthread.so.0 (0x00ece000)
libcrypto.so.5 => /lib/libcrypto.so.5 (0x00c0)
libkrb5.so.3 => /usr/lib/libkrb5.so.3 (0x00afc000)
libcrypt.so.1 => /lib/libcrypt.so.1 (0x00111000)
libresolv.so.2 => /lib/libresolv.so.2 (0x00f0c000)
libnsl.so.1 => /lib/libnsl.so.1 (0x00f88000)
libgssapi_krb5.so.2 => /usr/lib/libgssapi_krb5.so.2 (0x00b7)
libcom_err.so.2 => /lib/libcom_err.so.2 (0x00a9f000)
libk5crypto.so.3 => /usr/lib/libk5crypto.so.3 (0x00ad6000)
libdl.so.2 => /lib/libdl.so.2 (0x0013f000)
libz.so.1 => /usr/lib/libz.so.1 (0x003fc000)
/lib/ld-linux.so.2 (0x00267000)
libkrb5support.so.0 => /usr/lib/libkrb5support.so.0 (0x00ac3000)

dev server:
linux-gate.so.1 =>  (0x00497000)
libexpat.so.0 => /usr/lib/libexpat.so.0 (0x0065b000)
libpq.so.4 => /usr/lib/libpq.so.4 (0x00524000)
libssl.so.5 => /lib/libssl.so.5 (0x0021)
libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00775000)
libm.so.6 => /lib/libm.so.6 (0x002b7000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x006de000)
libc.so.6 => /lib/libc.so.6 (0x00bff000)
libcrypto.so.5 => /lib/libcrypto.so.5 (0x00101000)
libkrb5.so.3 => /usr/lib/libkrb5.so.3 (0x00d74000)
libcrypt.so.1 => /lib/libcrypt.so.1 (0x009b6000)
libresolv.so.2 => /lib/libresolv.so.2 (0x00395000)
libnsl.so.1 => /lib/libnsl.so.1 (0x00248000)
libgssapi_krb5.so.2 => /usr/lib/libgssapi_krb5.so.2 (0x0025d000)
libcom_err.so.2 => /lib/libcom_err.so.2 (0x00d44000)
libk5crypto.so.3 => /usr/lib/libk5crypto.so.3 (0x00d49000)
libdl.so.2 => /lib/libdl.so.2 (0x00275000)
libz.so.1 => /usr/lib/libz.so.1 (0x004e1000)
/lib/ld-linux.so.2 (0x00a8b000)
libkrb5support.so.0 => /usr/lib/libkrb5support.so.0 (0x00d6f000)



In any case, you should try to run both servers against the same set of
libs and headers. You can have multiple copies of libpq around and
select it at compile time. The client library doesn't really have to
match the server version...


As I asked in another thread: does it work to install the 
postgresql-server RPM from the 8.1 version and the others from 
Fedora Core 4's included 8.0 version?
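
(For reference, selecting a particular libpq at compile time usually just
means pointing the compiler and linker at the copy you want; a sketch with
illustrative paths:

g++ -I/opt/pgsql-8.0/include -c report.cpp
g++ report.o -L/opt/pgsql-8.0/lib -Wl,-rpath,/opt/pgsql-8.0/lib -lpq -o report

The -rpath option bakes the chosen library directory into the binary so it
does not pick up whatever /usr/lib happens to contain at run time.)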



--
Bryan White, ArcaMax Publishing Inc.




[GENERAL] Resetting privileges on a table

2006-03-14 Thread Bryan White
I have a database with a few tables that have privileges granted by 
a user who no longer works here.  I am the owner of these tables and 
the owner of the database.  If I do any granting/revoking on these 
tables, my actions do not seem to affect the privs set by this other 
user.  The privileges I have set show up after the original user's 
privileges in the \z output.


How can I clean this up?  Would dropping the user have any effect?

This is on 7.4 if that makes a difference.


--
Bryan White, ArcaMax Publishing Inc.

The world ends when your dead.
Until then you got more punishment in store.
Stand it like a man... And give some back. -- Al Swearengen



Re: [GENERAL] Resetting privileges on a table

2006-03-14 Thread Bryan White

Tom Lane wrote:

You need to revoke them as that user, likely.  REVOKE really means
revoke grants I made, not revoke any grant anybody made.


Ok, I tried logging in as that user.  Oddly, after the revoke the only 
grant that disappeared was one I created.


Maybe it has something to do with 'grant option' permissions which seem 
to have been created here.


Transcript: (pconner is the obsolete user, bryan is my account)

ec=# \z bulkuploadcfg
Access privileges for database ec
 Schema | Table |Access privileges
+---+--
 public | bulkuploadcfg | 
{pconner=a*r*w*d*R*x*t*/pconner,=arwdRxt/pconner,=arwdRxt/bryan}

(1 row)

ec=# select current_user;
 current_user
--
 pconner
(1 row)

ec=# revoke all on bulkuploadcfg from public;
REVOKE
ec=# \z bulkuploadcfg
Access privileges for database ec
 Schema | Table | Access privileges
+---+---
 public | bulkuploadcfg | {pconner=a*r*w*d*R*x*t*/pconner,=arwdRxt/pconner}
(1 row)

ec=# revoke all on bulkuploadcfg from pconner;
REVOKE
ec=# \z bulkuploadcfg
Access privileges for database ec
 Schema | Table | Access privileges
+---+---
 public | bulkuploadcfg | {pconner=a*r*w*d*R*x*t*/pconner,=arwdRxt/pconner}
(1 row)

ec=# revoke grant option for all on bulkuploadcfg from pconner;
REVOKE
ec=# \z bulkuploadcfg
Access privileges for database ec
 Schema | Table | Access privileges
+---+---
 public | bulkuploadcfg | {pconner=a*r*w*d*R*x*t*/pconner,=arwdRxt/pconner}
(1 row)

--
Bryan



Re: [GENERAL] Resetting privileges on a table

2006-03-14 Thread Bryan White

Tom Lane wrote:

Bryan White [EMAIL PROTECTED] writes:

ec=# \z bulkuploadcfg
 Access privileges for database ec
  Schema | Table |Access privileges
+---+--
  public | bulkuploadcfg | 
{pconner=a*r*w*d*R*x*t*/pconner,=arwdRxt/pconner,=arwdRxt/bryan}

(1 row)


Hm, this is 7.4.what exactly?  The above should be an illegal state
(assuming pconner is the table owner) because there is no grant option
to bryan allowing him to grant anything to public.


ec=# select version();
 version
-
 PostgreSQL 7.4.3 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.2 
20031022 (Red Hat Linux 3.3.2-1)

(1 row)



There was an old bug that would allow you to get into this state if
bryan was a superuser (the system would allow him to grant privileges
anyway), but according to the CVS logs we fixed that in 7.4RC1.  This
table wouldn't happen to be a holdover from a 7.4 beta version would it?


bryan is a superuser.


Another possibility is that you did an ALTER TABLE OWNER after assigning
some initial permissions.  7.4 had that command but it didn't do
anything about changing the ACL list to match.  I think you could have
gotten to the above state if pconner were the original table owner and
had done GRANT ALL TO PUBLIC, and then you altered table ownership to
bryan and he also did GRANT ALL TO PUBLIC.


That would match the history.  A while ago I changed the owner of all 
tables to 'bryan'.  I just noticed the permission strangeness today.  I 
had some problems trying to load a dump of this database onto a system 
running 8.0.7 with no pconner user defined.  I decided it was time to 
clean this stuff up and to do that I had to go back to the source.



Best solution might be to forcibly set the table's pg_class.relacl field
to null (resetting all the permissions to default) and then grant what
you want.


That seems to fix it.  Thanks!!!
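
For the archives, the fix amounts to something like this (a sketch; run it
as a superuser, the table name is taken from the transcript above, and
hand-editing system catalogs is always at your own risk):

psql ec -c "UPDATE pg_class SET relacl = NULL WHERE relname = 'bulkuploadcfg';"

After that, re-issue whatever GRANTs you actually want as the table owner.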

--
Bryan



Re: [GENERAL] Text data type doesn't accept newlines?

2001-06-05 Thread Bryan White

 I have a logging database that logs errors. The error messages contain
 newlines. Pgsql doesn't accept them on insert in a text data field.

I have never had a problem storing newlines in a text field.  What interface
are you using?
The only ASCII character that I have found I have to escape is the single
quote character.
-
Bryan White





Re: [GENERAL] corrupted database?

2001-04-18 Thread Bryan White

  My hard disk partition with the postgres data directory got full. I
 tried to shut down postgres so I could clear some space, nothing
 happened. So I did a reboot. On restart (after clearing some
 pg_sorttemp.XX files), I discovered that all my tables appear empty!
 When I check in the data directories of the databases, I see that the
 files for each table have data (they are still of the size as before).

  I've been running some experiments on another machine and notice that
 if I remove the pg_log file, databases seem to disappear (or data to
 become invisible). So I am guessing that postgres is looking in one
 place and deciding there is no data. Now I need to get my data of
 course! Any solutions?? My programming skills are generally very good,
 so if it involves some code I'd have no problem. How do I get a dump of
 the raw data (say, copy-style output) from the table files? Please help!

  I am running v7.0.3 on linux kernel v2.2

You might want to look at pg_check, located here: www.arcamax.com/pg_check.
Note that the results might contain rows that have been previously deleted,
and both new and old copies of rows that have been updated.





Re: [GENERAL] Number of Connections

2001-02-16 Thread Bryan White



 Hello,

 I'm a bit new to postgres.  Is there any way to tell the current number of
 connections on a database or server?  I'm having a connection-closing
 problem and would like to debug it somehow.  I know on Sybase you can
 check a sys table to determine this.  Not familiar with how to do this on
 Postgres.

I use:
ps ax | grep postgres | wc -l
Note the value is often one too high because it picks up the grep process.

Use
ps ax | grep postgres
to look at the processes and see what IPs are connected, what users, and what
the backend is doing (IDLE, SELECT, ...).
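
On versions that ship the pg_stat_activity view (later releases than some of
the servers discussed here), you can also ask the database itself; a sketch,
with an illustrative database name:

psql mydb -c "SELECT count(*) FROM pg_stat_activity;"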





[GENERAL] Insert where not duplicate

2000-12-19 Thread Bryan White

I need to insert a bunch of records in a transaction.  The transaction must
not abort if a duplicate is found.

I know I have seen the syntax for this before.  Can someone jog my memory?
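
One common idiom for this (a sketch with made-up table and column names; it
sidesteps the duplicate rather than trapping the error, and it is not safe
against two sessions inserting the same key at exactly the same moment):

psql mydb <<'SQL'
BEGIN;
INSERT INTO customer (custid, email)
SELECT 1001, 'someone@example.com'
WHERE NOT EXISTS (SELECT 1 FROM customer WHERE custid = 1001);
COMMIT;
SQL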

Bryan White, ArcaMax.com, VP of Technology
The avalanche has already begun.
It is too late for the pebbles to vote.




[GENERAL] Create View failing

2000-11-28 Thread Bryan White

I am trying to play with views.  Every time I try to create one I get a
message like this:

ERROR:  pg_atoi: error reading "3642040800": Numerical result out of range

The reported number is incremented by a small amount each time.  Here is the
create statement:

create view ordertotals as select * from orders;

The original statement was more complex (and useful), but I simplified it to
try and narrow down the problem.  I always get the same error message.

Here is the definition of the 'orders' table.
ec=# \d orders
Table "orders"
   Attribute| Type  |   Modifier
+---+--
 orderid| integer   | not null
 custid | integer   | not null
 employee   | text  | not null default ''
 date   | date  | not null default date(now())
 leadsource | text  | not null default ''
 ordersource| text  | not null default ''
 paymenttype| text  | not null default ''
 paymentinfo| text  | not null default ''
 paymentexpdate | text  | not null default ''
 paymentstatus  | text  | not null default ''
 tax| numeric(9,2)  | not null default 0
 shipping   | numeric(9,2)  | not null default 0
 shipmethod | text  | not null default ''
 note   | text  | not null default ''
 shipdate   | date  |
 camptail   | text  | not null default ''
 company| text  | not null default ''
 title  | text  | not null default ''
 lname  | text  | not null default ''
 fname  | text  | not null default ''
 addr1  | text  | not null default ''
 addr2  | text  | not null default ''
 city   | text  | not null default ''
 state  | text  | not null default ''
 zip| text  | not null default ''
 country| text  | not null default ''
 phone  | text  | not null default ''
 batchid| text  | not null default ''
 paydate| date  |
 shipemp| text  |
 couponid   | integer   |
 couponamt  | numeric(10,2) | default '0.00'
Indices: iordcus3,
         iorddate3,
         iordid3,
         iordldsrc3

Bryan White, ArcaMax.com, VP of Technology
You can't deny that it is not impossible, can you.




[GENERAL] pg_check 0.1.3 is now Available

2000-10-31 Thread Bryan White

I sent this to the 'announce' list earlier but it did not show up.  My guess
is that the list is moderated, so I will post again here.
--

pg_check is a command-line tool I have written to aid in diagnosing and
recovering from PostgreSQL table corruption.  See the ReadMe file for more
information.

It is available here http://www.arcamax.com/pg_check/

I am looking for suggestions as to how to make it more useful, so please look
it over.

Bryan White, ArcaMax.com, VP of Technology
You can't deny that it is not impossible, can you.






Re: [GENERAL] I tried to increase the block size

2000-10-31 Thread Bryan White

 I tried to increase the block size from 8K to 32K and received an IPC
 error.  Now IPC is compiled into the kernel, so why would I get this error?
 I switched it back to 8K and it runs fine.

Did you dump your database(s) before the change and initdb/reload them
after?  I presume this is needed, and skipping it could wreak all sorts of
havoc.  Also there is a script 'ipcclean' that is supposed to reinitialize
some stuff to do with IPC (yes, that is vague, but it's all I know).
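
A sketch of that sequence (paths are illustrative, and rebuilding/installing
the binaries with the new block size is assumed to happen in between):

pg_dumpall > /tmp/all.sql                        # dump with the old 8K binaries
# ... rebuild and install PostgreSQL with the new block size ...
mv /usr/local/pgsql/data /usr/local/pgsql/data.old
initdb -D /usr/local/pgsql/data
postmaster -D /usr/local/pgsql/data &
psql -f /tmp/all.sql template1                   # reload into the new cluster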




Re: [GENERAL] pg_check 0.1.3 is now Available

2000-10-31 Thread Bryan White

 Fascinating.  Looks like a possible framework for building a standalone
 dumping utility for migration.

It could be turned into that.  It already does all the parsing; you would
just have to change the output functions for the desired format.




Re: [GENERAL] how good is PostgreSQL

2000-10-31 Thread Bryan White

  Whenever a query is executed (not found in cache, etc.), the caching
 system would simply store the query, the results, and a list of tables
 queried.   When a new query came in, it would do a quick lookup in the
query
 hash to see if it already had the results.  If so, whammo. Whenever an
 insert/delete/update was sensed, it would look at the tables being
affected,
 and the caching mechanism would clear out the entries depending on those
 tables.

It seems to me that tracking the list of cached queries and watching for
queries that might invalidate them adds a lot of complexity to the back end,
and the front end still has to establish the connection and wait for the
data to transfer over the socket.

On a more practical level, a backend solution would require someone with
fairly detailed knowledge of the internals of the backend.  A front-end
solution is more likely to be implemented by someone not as knowledgeable.

One of the big advantages of your technique is there is no code change at
the application level.  This means less database lock-in.  Maybe that is a
disadvantage too. ;-)





[GENERAL] pg_dump's over 2GB

2000-09-29 Thread Bryan White

My backups made with pg_dump are currently 1.3GB.  I am wondering
what kind of headaches I will have to deal with once they exceed 2GB.

What will happen with pg_dump on a Linux 2.2.14 i386 kernel when the output
exceeds 2GB?
Currently the dump file is later fed to a 'tar cvfz'.  I am thinking that
instead I will need to pipe pg_dump's output into gzip, thus avoiding the
creation of a file of that size.
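
A sketch of that pipeline (database name is illustrative):

pg_dump mydb | gzip > mydb.sql.gz
# and to restore later:
gunzip -c mydb.sql.gz | psql mydb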

Does anyone have experience with this sort of thing?

Bryan White, ArcaMax.com, VP of Technology
You can't deny that it is not impossible, can you.




[GENERAL] Corrupt Table

2000-09-14 Thread Bryan White

I have apparently picked up a corrupt record in a table.

What happened:
Yesterday at one point the database seemed to hang.  There were three backend
processes consuming large amounts of CPU time.  I stopped the server and
rebooted (3 months since last reboot).  The database restarted and seemed to
be fine.

Then last night the nightly backups failed apparently when reading the
'customer' table.  The database restarted itself.  There have been a couple
of database restarts since then.  As far as I can tell it is the customer
table that is the problem.

Here is what a failure looks like in the log file:

Server process (pid 2864) exited with status 139 at Thu Sep 14 10:13:11 2000
Terminating any active server processes...
000914.10:13:11.425  [5879] NOTICE:  Message from PostgreSQL backend:
The Postmaster has informed me that some other backend died
abnormally
---
The last entry is repeated multiple times.

I have written a small utility program (pganal).  It looks for
inconsistencies in page layout and tuple layout.  My original intent was to
parse the tuple internal structure as well, but that proved to be more
complex than I was ready to handle at the time.

Anyway I stopped the database, copied the customer file to another directory
and restarted the database.  Here is the pganal output from this copy:
--
Analyzing customer
Page 25878 ERROR: pd_lower is too small
pd_lower=0 pd_upper=0 pd_special=0 pd_opaque.od_pagesize=0
Page 31395 ERROR: pd_lower is too small
pd_lower=0 pd_upper=0 pd_special=0 pd_opaque.od_pagesize=0
Page 32950 ERROR: pd_lower is too small
pd_lower=0 pd_upper=0 pd_special=0 pd_opaque.od_pagesize=0
Tuple 71453.0 Ofs=8029 Len=164 Flags=1 Error: tuple overwrites pd_special
Tuple 71453.4 Ofs=7346 Len=208 Flags=1 Error: tuple overlaps another
Tuple 71453.40 Ofs=1365 Len=160 Flags=1 Error: tuple overlaps another
Page 71958 ERROR: pd_lower has odd value
pd_lower=11886 pd_upper=24239 pd_special=109 pd_opaque.od_pagesize=0
Page 73622 ERROR: pd_lower is too small
pd_lower=0 pd_upper=0 pd_special=0 pd_opaque.od_pagesize=0

Page Summary
Data Pages   = 76555
Unused Pages = 0
New Pages= 0
Empty Pages  = 0
Bad Pages= 5
Total Pages  = 76560

Tuple Summary
O/L Error Tuples = 1
Overlaped Tuples = 2
Unused Tuples= 47994
Used Tuples  = 3698495
Total Tuples = 3746492
--

I suspect the 'pd_lower is too small' may be just my misunderstanding of the
page layout.
The three tuple errors (all on the same page) and the 'pd_lower has odd
value' error seem to be real.
'pd_lower has odd value' comes from:
int nitems = (phd->pd_lower - sizeof(*phd)) / sizeof(ItemIdData);
if (nitems * sizeof(ItemIdData) != phd->pd_lower - sizeof(*phd))
    pderr = "pd_lower has odd value";
Basically it means the pd_lower did not leave room for an integral number of
ItemIdData structures.

I seem to have two separate corrupt pages.  I can post the full source to
pganal if anyone is interested.  It's about 300 lines.

My question is how do I proceed from here.  Going back to the previous day's
backup would be very painful in terms of lost data.  I suspect the answer is
to perform surgery on the bad pages and then rebuild indexes but this is a
scary idea.  Has anyone else created tools to deal with this kind of
problem?

Bryan White, ArcaMax.com, VP of Technology
You can't deny that it is not impossible, can you.




Re: [GENERAL] Corrupt Table

2000-09-14 Thread Bryan White

Here is a follow up.  I did a hex/ascii dump of the 3 bad tuples.  In the
dump I could pick out an email address.  This is an indexed field.  I did a
select on each of them in the live database.  The 1st and 3rd were not
found.  The second worked ok if I only selected the customer id (an int4 and
the first field in the record).  The custid reported seems to be nonsense.
The backend crashed if I selected the whole record.




Re: [GENERAL] Corrupt Table

2000-09-14 Thread Bryan White

  Server process (pid 2864) exited with status 139 at Thu Sep 14 10:13:11
2000

 That should produce a coredump --- can you get a backtrace?

I found a core file.  I am not all that familiar with gdb but the backtrace
looks useless:
#0  0x8064fb4 in ?? ()
#1  0x809da10 in ?? ()
#2  0x809e538 in ?? ()
#3  0x809e0c6 in ?? ()
#4  0x809e176 in ?? ()
#5  0x809e2dc in ?? ()
#6  0x809e5ab in ?? ()
#7  0x809e666 in ?? ()
#8  0x809eb00 in ?? ()
#9  0x80a1f90 in ?? ()
#10 0x809d3ae in ?? ()
#11 0x80a31a4 in ?? ()
#12 0x809d3b7 in ?? ()
#13 0x809c6b9 in ?? ()
#14 0x809bd0e in ?? ()
#15 0x80ec132 in ?? ()
#16 0x80ec19c in ?? ()
#17 0x80eadc7 in ?? ()
#18 0x80eaca7 in ?? ()
#19 0x80ebba2 in ?? ()
#20 0x80d61f2 in ?? ()
#21 0x80d5dd1 in ?? ()
#22 0x80d518a in ?? ()
#23 0x80d4c14 in ?? ()
#24 0x80ab736 in ?? ()
#25 0x401029cb in ?? ()

 Cool; want to submit it as a contrib item?  This sounds like something
 that could be gradually improved into a "fsck" kind of thing...

Right now it is sort of 'hack it up as needed'.  I will try and polish it up
and add command line options to control it.

Hmm.  The all-zero pages (I expect you'll find that 25878 etc are *all*
zeroes, not just their header fields) look like an old bug wherein
a newly-added page might not get initialized if the transaction that's
adding the page aborts just after allocating the page.  I thought I'd
fixed that in 7.0 though.  You are running 7.0.2 I hope?

Yes, I am running 7.0.2.  The 4 pages in question have 0's in the first 16
bytes but other data after that.  I see some text that looks like real data.

 A VACUUM should patch up zero pages.  I'm guessing that you haven't
 vacuumed this table in a long time...

Vacuum occurs nightly just after the backup.  I checked and it ran fine the
night before.  Last night's vacuum reported: psql: The Data Base System is in
recovery mode


 Page 71958 looks pretty badly corrupted --- you'll have to look at that
 and see if you can clean it up by hand.  Something fishy about 71453
 as well.  Worst case, you could set these pages to all-zero by hand,
 and just lose the tuples thereon rather than the whole table.

 How fast does your app add/update tuples in this table?  If you are
 lucky, the tuples in pages 71453 and 71958 might be available from your
 last successful backup, in which case trying to patch up the page
 contents by hand is probably a waste of effort.  Zero those pages,
 dump out the current contents of the file with COPY, and start comparing
 that to your last backup.  The fact that you haven't vacuumed will make
 this pretty easy, because the tuple ordering should be the same.

 If you do choose to recover by zeroing pages, it'd be a good idea to
 drop and recreate the indexes on the table.  Sooner or later you should
 do a vacuum to fix the zero pages, but not just yet --- you want to
 leave the tuples in their current ordering for comparison to your
 backup ...

I suspect diff will produce more output than I want to deal with.  Customer
records are never deleted from this table, so I think the thing to do is copy
all customers from the previous good backup that are not in the cleaned-up
customer table.  I will lose some edits but it should not be too bad.

Ok here is my plan:
1) Stop the server
2) Backup the physical customer file
3) Zero out all the corrupt pages (see the dd sketch after this list).
4) Restart the database for localhost access only.
5) Dump the customer table
6) Reload the customer table from the dump (I know it is now clean)
7) Recreate the customer indexes.
8) Vacuum the customer table
9) Restart the database for normal access
10) Load the last good backup into a test database
11) Rename the test database customer table to custbackup.
12) Load the customer dump from above into the test database
13) Run a program on the test database to produce insert customer statements
for records in the custback but not in the customer table.
14) Apply the above insert statements to the live database.
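
For step 3, zeroing a single 8K page in place can be done with dd; a sketch
(file path and block number are illustrative -- take them from the pganal
output, and only do this with the postmaster stopped and a spare copy of the
file saved):

dd if=/dev/zero of=/usr/local/pgsql/data/base/MYDB/customer \
   bs=8192 seek=71958 count=1 conv=notrunc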






Re: [GENERAL] Great Bridge benchmark results for Postgres, 4 others

2000-08-14 Thread Bryan White

 Greetings all,

 At long last, here are the results of the benchmarking tests that
 Great Bridge conducted in its initial exploration of PostgreSQL.  We
 held it up so we could test the shipping release of the new
 Interbase 6.0.  This is a news release that went out today.

 The release is also on our website at
 http://www.greatbridge.com/news/p_081420001.html.  Graphics of the
 AS3AP and TPC-C test results are at
 http:/www.greatbridge.com/img/as3ap.gif and
 http://www.greatbridge.com/img/tpc-c.gif respectively.

This looks great.  Better than I would have expected.  However I have some
concerns.

1) Using only ODBC drivers.  I don't know how much of an impact a driver can
make, but it would seem that using native drivers would shut down one source
of objections.

2) Postgres has the 'vacuum' process, which is typically run nightly and
which, if not accounted for in the benchmark, would give Postgres an
artificial edge.  I don't know how you would account for it, but in fairness
I think it should be acknowledged.  Do the other big databases have similar
maintenance issues?

3) The test system has 512MB RAM.  Given the licensing structure and high
licensing fees, users have an incentive to use much larger amounts of RAM.
Someone who can only afford 512MB probably can't afford the big names
anyway.

4) The article does not mention the speed or number of CPUs, or anything
about the disks other than size.  I can halfway infer that they are SCSI,
but how are they laid out?

I am not trying to tear the benchmark down, just wanting to make it more
immune to such attempts.





Re: [GENERAL] Corrupted Table

2000-07-31 Thread Bryan White


 Status 139 indicates a SEGV trap on most Unixen.  There should be a core
 dump left by the crashed backend --- can you get a backtrace from it
 with gdb?

 I concur that this probably indicates corrupted data in the file.  We
 may or may not be able to guess how it got corrupted, but a stack trace
 seems like the place to start.

Here is the backtrace:
#0  0x808b0e1 in CopyTo ()
#1  0x808ae2f in DoCopy ()
#2  0x80ec7c1 in ProcessUtility ()
#3  0x80ead48 in pg_exec_query_dest ()
#4  0x80eaca7 in pg_exec_query ()
#5  0x80ebba2 in PostgresMain ()
#6  0x80d61f2 in DoBackend ()
#7  0x80d5dd1 in BackendStartup ()
#8  0x80d518a in ServerLoop ()
#9  0x80d4c14 in PostmasterMain ()
#10 0x80ab736 in main ()
#11 0x401029cb in __libc_start_main (main=0x80ab6d0 <main>, argc=8,
argv=0xbb54, init=0x8063fac <_init>, fini=0x812969c <_fini>,
rtld_fini=0x4000ae60 <_dl_fini>, stack_end=0xbb4c)
at ../sysdeps/generic/libc-start.c:92

BTW this is Postgres 7.0.2 on i386/RedHat 6.2.

The core file was made when I tried to dump the table.  As far as I can tell
the corruption occurred on Friday because that is the date of my last good
automated backup.





Re: [GENERAL] Corrupted Table

2000-07-31 Thread Bryan White

 Hmm.  Assuming that it is a corrupted-data issue, the only likely
 failure spot that I see in CopyTo() is the heap_getattr macro.
 A plausible theory is that the length word of a variable-length field
 (eg, text column) has gotten corrupted, so that when the code tries to
 access the next field beyond that, it calculates a pointer off the end
 of memory.

 You will probably find that plain SELECT will die too if it tries to
 extract data from the corrupted tuple or tuples.  With judicious use of
 SELECT last-column ... LIMIT you might be able to narrow down which
 tuples are bad, and then dump out the disk block containing them (use
 the 'tid' pseudo-attribute to see which block a tuple is in).  I'm not
 sure if the exercise will lead to anything useful or not, but if you
 want to pursue it...

I am willing to spend some time to track this down.  However I would prefer
not to keep crashing my live database.  I would like to copy the raw data
files to a backup machine.  Are there any catches in doing this?  This
particular table is only updated at predictable times on the live system.  I
am guessing as long as it is stable for at least a few minutes before I copy
the file it will work.

How hard would it be to write a utility that would walk a table looking for
this kind of corruption?  Are the on-disk data formats documented anywhere?





Re: [GENERAL] Corrupted Table

2000-07-31 Thread Bryan White

 Shut down the postmaster and then copy the entire db (including pg_log
 file) and it should work.  The catch is to make sure pg_log is in sync
 with your table files.

I would rather not leave my database down long enough to copy the entire db
(3.5GB).  I have control over when changes are applied to this table.  If I
restart the database and make certain no updates/inserts/deletes occur then
will all info be flushed from the pg_log file?  If not, how about if I first
vacuum the table?






[GENERAL] table contraints and pg_dump

2000-06-21 Thread Bryan White

I have been looking at the new syntax in create table such as unique and
primary key constraints (new as in I just noticed it, I don't know when it
was added).  It seems to me there is a minor gotcha when using pg_dump/psql
to reload a database.

When indexes were created separately pg_dump would put the index creation at
the end of its output, after the data was loaded.  My understanding is that
this is faster than creating the indexes first and then populating the
table.

When using table constraints the constraints are created as part of the
table creation and there will be a performance hit when loading a dumped
database.  This discourages using these new features when the table may
become large.

The solution, it would seem, is to expand 'alter table' to allow adding the
constraints after the copy, and to change pg_dump accordingly.  On the surface
these changes to 'alter table' don't look too complex.

The only issue is what to do if the constraint is not met by pre-existing
data.  The alter statement should probably fail in these cases, or maybe
offer an option to force it to accept the existing data.  The issues are the
same as in 'create unique index'.




Re: [GENERAL] Bigger sequences (int8)

2000-06-20 Thread Bryan White

 Can I make a sequence use an int8 instead of int4?

 I have an application where, over a few years, it's quite possible to hit
 the ~2 billion limit.  (~4 billion if I start the sequence at -2
 billion.)

 There won't be that many records in the table, but there will be that many
 inserts.  In other words, there will be many deletes, as well.

 If I CYCLE, old record could still be lingering, and I have the overhead
 of checking every NEXTVAL to make sure it's not already being used.  :-(

 Any other ideas?  I could use two int4's together as primary key, and do
 some crazy math to increment, or jump through other hoops (besides CYCLE)
 to intelligently reuse keys ... but then I have some ugly overhead, as
 well.

 I really want the sequence to just be an int8.  Are we talking about a
 heinous amount of work in the source code to allow this option?  I
 wouldn't want to mess with the "CREATE SEQUENCE" syntax; it would seem
 more appropriate as a compile-time option.

I'm no expert on the backend, but it seems to me you would not even have to
change the syntax.  The maxval defaults to 2 billion.  For an int8 sequence
just specify a maxval greater than that.  Actually it may make the most sense
to always use 64-bit values for the sequence.  Just leave the default ranges
in place for compatibility.
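
So, in principle, something like this (a sketch; whether a given release
actually honors a maxvalue beyond 2^31-1 depends on whether its sequences
are int8 internally, and the database name is illustrative):

psql mydb -c "CREATE SEQUENCE bigid_seq MAXVALUE 9223372036854775807;"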




Re: [GENERAL] Cannot INDEX an Access97 ODBC export?

2000-06-09 Thread Bryan White


 Error return is that it is not able to find the attribute
"any_column_name" in the table.

This may be obvious, but have you looked at the table layout to see if the
column exists?  You may have a problem with spaces in the name or upper-case
letters in the name.  In either case you must quote the name (table or
column).  Unquoted names are converted to all lower case.

In psql do: \d tablename
or from the shell do: pg_dump -s -t tablename





[GENERAL] Vacuum analyze vs just Vacuum

2000-05-28 Thread Bryan White

I would just like to check an assumption.  I "vacuum analyze" regularly.  I
have always assumed that this did a plain vacuum in addition to gathering
statistics.  Is this true?  The documentation never states explicitly one
way or the other, but it almost suggests that they are independent
operations.




Re: [GENERAL] Update Performance from 6.5.0 to 6.5.3 to 7.0

2000-05-27 Thread Bryan White

 That's good, but does it mean that 7.0 is slower about adding index
 entries than 6.5 was?  Or did you have fewer indexes on the table
 when you were using 6.5?

No, the indexes have been there all along.  My impression is the performance
loss was between 6.5.0 and 6.5.3.  I had just ignored the problem for a
while.




Re: [GENERAL] PG 7.0 is 2.5 times slower running a big report

2000-05-25 Thread Bryan White

 Well, when you have 2.7 million records in a database, the code might be
 as good as it can be.

I have recovered the performance lost when I moved to Postgres 7.0 by
executing SET enable_indexscan = OFF before creating my cursors and turning
it back on for the inner-loop query.  It may even be faster than before, so
I am happy.

 So each run of the log takes 2.7 million queries.

 Is there no way to use a where clause to limit the scope of your queries?

No, the point is the resulting report covers the entire database.

 An explanation of the purpose of these queries would help me think about
 what you are after.  Maybe it isn't a code problem, maybe it's a business
 rules problem.
 At the moment, I understand why you need to go through all the items in
 the db every day.  Can't you just go through the items that were updated
 on a given day?  So, in general, what is the goal of this report?

The report summarizes a lot of different aspects of our database.  It breaks
down totals in many ways by a variety of variables.  For example, orders are
broken down by source, date, status, and sku (most orders have only one line
item).  I could come up with quicker methods to do any one of the breakdowns,
but to do them all it is more efficient to make a single pass over all the
data.

 I'm new to the cursor method (and to PostgreSQL, though I've done a lot of
 work with MySQL), but it is interesting, so I'm certainly going to look
 into it.

Part of the advantage of the cursor is in memory management.  In a
non-cursor select, the database builds and transfers the entire result set to
the front-end application.  For large queries this can be quite a chunk of
memory.

With a sequential scan and sort of the database, I think it is still
buffering the result set on the back end, but one copy is better than two.
Keep in mind I am running this report on a backup database server, so I don't
have to worry much about other processes being hurt by the load.

It seems that with index scans the cursors start producing data right away
(but the overall rate is slower).  With sequential scan and sort the report
gets no data for the first 30 minutes and then runs at about 4 times the
rate of the index scan.

 What difference do you want to capture in distinguishing the customer
 table from the custlist table?

I am not certain I understand the question.  The custlist table contains
subscriptions to various services.  It is a very simple table containing a
listid and a custid.





Re: [GENERAL] PG 7.0 is 2.5 times slower running a big report

2000-05-25 Thread Bryan White

 I have set index_scan off for tomorrow morning's run.  I will let you know
 what happens.

I think my problem is fixed.  By disabling index scan while creating the
cursors and then turning it back on again for the small query that occurs in
my inner loop, the performance has returned to normal (it may in fact be
better than before).
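
For the archives, the shape of the fix (a sketch with a made-up cursor and a
simplified query; the SET statements are the part that mattered):

psql mydb <<'SQL'
BEGIN;
SET enable_indexscan = off;
DECLARE cust_cur CURSOR FOR
    SELECT * FROM customer ORDER BY custid;
SET enable_indexscan = on;
-- the small per-customer queries issued in the inner loop use indexes again
FETCH 100 FROM cust_cur;
CLOSE cust_cur;
COMMIT;
SQL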

Thanks a bunch.
Bryan




[GENERAL] PG 7.0 is 2.5 times slower running a big report

2000-05-24 Thread Bryan White

I have a large report that I run once a day.  Under 6.5.3 it took just over
3hrs to run.  Under 7.0 it is now taking 8 hours to run.  No other changes
were made.

This is on RedHat Linux 6.2: a PIII 733 with 384MB RAM and 2 IDE 7200 RPM
disks.  One disk contains the Postgres directory, including the data
directory, and the other disk has everything else.

The Postmaster is started with these options: -i -B 1024 -N 256 -o -F

The report is being run on a backup server just after the database has been
loaded from a dump and 'vacuum analyse'd.  There is practically nothing else
running on the box during the run.

The report creates four separate concurrent cursors.  Each of the queries is
sorted by a customer number which is an index.  The index is unique in the
customer table but not the others.  For the other cursors it pops values as
needed to process data for the current customer number.  There are also
other selects that are run for each customer order processed to retrieve its
line items.  The report does not update the database at all, it is just
accumulating totals that will be written out when the report finishes.

Top tells me the front-end process is using 5 to 10 percent of the CPU and
the back end is using 10 to 20 percent.  The load average is about 1.0 and
the CPU is about 80% idle.  I am pretty certain on 6.5.3 that the CPU usage
was much higher.  It's almost like the new version has some sort of throttle
to keep one backend from saturating the system.  Indeed the box is much more
responsive than it used to be while running this report.

Suggestions?







Re: [GENERAL] shutdown gracefully single user mode?

1999-09-14 Thread Bryan White

 hi, all experts there, greetings!

 Just minutes ago, my boss found out one of the attributes in a
 table is too short (varchar 64 for url), we need to make
 it wider to 85 A.S.A.P. Seems that alter table can not do it.
 So, I used pg_dump, (how to do it gracefully?) immediately drop the table,
 shutdown the postmaster (not necessary?) and change the dumped table, and
 then restart the postmaster, use pgsql to reload the data. Our database is
 not large YET. So, it took ONLY 10 minutes to re-load.

 the job was done. But I feel nervous -- because I expect this will
 happen again SOON. What is the "standard" procedure for postgreSQL to
 do such kind of things? more specifically (maybe there are some other
 ways?):

 1) how to shutdown postmaster gracefully (I simply used
 kill proID. I feel lucky that I do not need to use -9 ! ) so
 that I'm sure the data is not corrupted?
 2) how to "shutdown"(drop) just one table gracefully?
 3) any simpler way (alter-table-like) to widen the attribute?
 4) single user mode?

I usually create a new table, do an insert of data from a select on the old
table, then rename the tables.  Don't forget to recreate any indexes.  I
usually write it all up in a single script and run it on a test database
beforehand.  For small tables, or tables that don't change under normal
usage, this works reasonably well.  For larger, more critical tables I would
be tempted to pull the ethernet plug during the changeover.  Of course that
depends on how your system is laid out.
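
A sketch of that recipe for the varchar(64) url case described above (table,
column, and index names are made up):

psql mydb <<'SQL'
BEGIN;
CREATE TABLE pages_new (
    id  integer NOT NULL,
    url varchar(85) NOT NULL DEFAULT ''
);
INSERT INTO pages_new SELECT id, url FROM pages;
ALTER TABLE pages RENAME TO pages_old;
ALTER TABLE pages_new RENAME TO pages;
CREATE INDEX ipages_id ON pages (id);
COMMIT;
SQL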

NOTE:  I don't rely on the alter table add column feature.  It seems to
ignore the new default value entirely (even on data added after the
alteration).

Also I suggest using the 'text' data type. It has no length limit other than
the 8K tuple size.






[GENERAL] alter table add column is broken in 6.5

1999-07-01 Thread Bryan White

I posted this yesterday and got no response, so I will try again.

Alter table add column does not set the default value for new rows added to
a table.  I can accept that it does not copy the new default value for
existing rows; that is easy to work around.  My problem is that the database
does not reflect the default value for new rows added after the alter
statement.

I could work around this if someone could tell me how to modify the system
tables to specify a default value.  There does not seem to be much
documentation for the layout of the system tables.

Bryan White
ArcaMax Inc.
Yorktown VA
www.arcamax.com




[GENERAL] Default values not working

1999-06-30 Thread Bryan White

If I add a column to a table using:
alter table mytable add column myint int not null default 0;
the default value does not seem to make it into the data dictionary.

This is using Postgres 6.5.

Is there a way to manually poke a default value into the data dictionary?

Bryan White
ArcaMax Inc.
Yorktown VA
www.arcamax.com




[GENERAL] Regression test core dumps

1999-04-14 Thread Bryan White

I am installing 6.4.2 on a new PII 350 with 128MB RAM, plenty of disk space
and Red Hat 5.2.  I have been using 6.4.0 on a similar machine for a while
now.  It seems to be failing when running the regression tests.

I seem to be getting a core dump.  There are core files left in both
/usr/src/pgsql/src/test and /usr/src/pgsql/src/test/bench

The end of the output of 'gmake all runtest' looks like this:

-
DEBUG:  --Relation bprime--
DEBUG:  Pages 25: Changed 25, Reapped 0, Empty 0, New 0; Tup 1000: Vac 0,...
gmake[1]: *** Deleting file `bench.out'
gmake: *** [runtest] Quit (core dumped)
Quit (core dumped)
--

Has anybody run into this before?
Bryan White
ArcaMax Inc.
Yorktown VA
www.arcamax.com




[GENERAL] Rename a database

1999-02-18 Thread Bryan White

Is there a way to rename a database?

I currently have a 'live' and a 'test' database.  For reasons I won't delve
into, all the programs are currently using 'test'.  I want a quick way of
getting all the data into 'live'.  I am guessing I can shut down the
postmaster and then move and/or copy directories in /usr/local/pgsql/data.
Can someone confirm this?


Bryan White
ArcaMax Inc.
Yorktown VA
www.arcamax.com




[GENERAL] what is WaitOnLock Error

1998-12-17 Thread Bryan White

I get this error: WaitOnLock: error on wakeup - Aborting this transaction
What does it mean?

Background:
Running Postgres 6.4 on Linux.  I have a process to handle subscription
requests via email.  We are running QMail and it directs the mail to this
'c' program I wrote.  It parses the message and tries to add the user to the
database.

This is the sequence that occurs when the process fails:
begin work
select from customer returns 0 rows (not on file).
select nextval('seqcuscustid') returns next custid
insert into customer ( custid, email ) values ( <next custid>,
'[EMAIL PROTECTED]');
This is where the error occurs

The process works all the time when I test it with single messages.  It
fails 75% of the time when I redirect a bunch of messages (these messages
have been accumulating in a Eudora mailbox while I got the process set up).

Might this be some sort of resource problem?

Bryan White
ArcaMax Inc.
Yorktown VA
www.arcamax.com




Re: [GENERAL] Other user access to database

1998-12-08 Thread Bryan White

Now that I have a database functional, I need to allow other users
to have access.  Using createuser I can give other users access to
the postmaster, but I need to give them access to my database as well.

Could someone enlighten me.


Access to tables is controlled with the GRANT and REVOKE SQL statements.

From memory try:
grant OPERATION on TABLENAME to USER;

where

OPERATION is one of SELECT, UPDATE, DELETE, INSERT or ALL for all
operations.

TABLENAME is a comma separated list of tables.

USER is a comma separated list of users or the keyword PUBLIC.