Re: [GENERAL] pg_dump 8.3.3 ERROR: invalid page header in block 2264419 of relation "pg_largeobject"

2017-05-26 Thread David Wall

On 5/25/17 6:30 AM, Tom Lane wrote:

David Wall  writes:

They do have a slave DB running via WAL shipping.  Would that likely
help us in any way?

Have you tried taking a backup from the slave?  It's possible that
the corruption exists only on the master.
We will give this a try once the customer lets us.  It appears 
we'll need to allot considerable downtime to try our options.


Because the warm standby was resynced in October (it was down because the 
OS had dropped into a read-only filesystem for an unknown length of time, which 
we only noticed when the primary's disk was filling up with WALs), we believe 
we may have 'tar' copied the corrupted data too.  But we will first stop 
the web apps, then 'tar' backup the database, then stop recovery on the 
warm standby and ensure our table counts appear to match production (in 
case it has any issues of its own), and see if the warm DB is any 
better.  If so, we'll restore from there.  If not, we'll try 
zeroing out the bad blocks and see what happens.
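
Roughly, the standby-first part of that plan looks like this (a sketch only; 
paths, database and table names are placeholders, and it assumes the 
pg_standby trigger-file setup described in the warm-standby thread further 
down this page):

touch ~/postgresql/recoveryWALs/STOP_RECOVERY        # tell pg_standby to finish recovery and open the standby
psql dbname -c "SELECT count(*) FROM some_table;"    # spot-check counts against production
pg_dump --format=c dbname > /backups/dbname-from-standby.dump   # if it looks sane, dump from the promoted standby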


Fortunately, this only appears in one of their two DBs.  Once we can 
successfully dump the DBs, we will migrate to the new hardware and OS 
and upgraded PG.


Thanks to you and Karsten Hilbert for your help.




Re: [GENERAL] pg_dump 8.3.3 ERROR: invalid page header in block 2264419 of relation "pg_largeobject"

2017-05-24 Thread David Wall

On 5/24/17 4:18 PM, Tom Lane wrote:

David Wall  writes:

We have not noted any issues, but when I ran a pg_dump on an 8.3.3
database, it failed after an hour or so with the error:

8.3.3?

Yes, it's old.


cat /proc/version
Linux version 2.6.18-92.1.10.el5.xs5.0.0.39xen (root@pondo-2) (gcc
version 4.1.2 20071124 (Red Hat 4.1.2-42)) #1 SMP Thu Aug 7 14:58:14 EDT
2008

Egad.  I take it this server has been entirely unmaintained for ~ 8 years.
Indeed!  We are just the software vendor, hence our surprise too. They 
didn't even know their backups were failing due to this error.




ERROR:  invalid page header in block 2264419 of relation "pg_largeobject"
pg_dump: The command was: FETCH 1000 IN bloboid
As we seem to have some data corruption issue, the question is how can I
either fix this, or have pg_dump ignore it and continue doing the best
dump it can?  That is, I'd like to create a new clean database that has
whatever data I can recover.

Setting zero_damaged_pages would be a brute-force answer, but bear in mind
that that's irreversible and it's hard to predict how much you'll lose.
If possible, I'd take a physical backup (e.g. with tar) of the entire
$PGDATA directory, preferably with the server stopped, before you do
that.  Then you'll at least know you can get back to where you are.

I hope this is being done as part of migration to more up-to-date
software.
Yes, this was discovered as part of a migration to all new servers. We 
had just put PG 9.3.4 into place and were looking to load it when their 
last pg_dump failed to restore because it was an incomplete 
dump.  We then went to run our own pg_dump and found that it crashed 
with the invalid page header.


Good idea on the physical backup first, while the system is stopped.
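
For reference, that sequence might look something like this (a sketch only; 
paths are illustrative, zero_damaged_pages needs a superuser connection, and 
it permanently zeroes the damaged pages as they are read, which is why the 
tar copy comes first):

pg_ctl -D $PGDATA stop -m fast
tar czf /backups/pgdata-before-salvage.tgz $PGDATA   # physical copy of the whole data directory
pg_ctl -D $PGDATA start
PGOPTIONS="-c zero_damaged_pages=on" pg_dump --format=c dbname > /backups/dbname-salvage.dump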

They do have a slave DB running via WAL shipping.  Would that likely 
help us in any way?  Because the DBs are big (they have two at 191GB and 
127GB), it takes a fair bit of time to do backups, transfers and 
restores.  I'm trying to find options as it likely means downtime for 
them that they are not expecting (yet).


Thanks for your help!




[GENERAL] pg_dump 8.3.3 ERROR: invalid page header in block 2264419 of relation "pg_largeobject"

2017-05-24 Thread David Wall
We have not noted any issues, but when I ran a pg_dump on an 8.3.3 
database, it failed after an hour or so with the error:


ERROR:  invalid page header in block 2264419 of relation "pg_largeobject"
pg_dump: The command was: FETCH 1000 IN bloboid

As we seem to have some data corruption issue, the question is how can I 
either fix this, or have pg_dump ignore it and continue doing the best 
dump it can?  That is, I'd like to create a new clean database that has 
whatever data I can recover.


Because the large objects are mostly for storing uploaded files (which 
have been encrypted, so the DB contents will likely be meaningless), if 
we are missing any, it's not too bad; no worse than whatever we 
have now.


Thanks,
David

The OS it is running on shows:

cat /proc/version
Linux version 2.6.18-92.1.10.el5.xs5.0.0.39xen (root@pondo-2) (gcc 
version 4.1.2 20071124 (Red Hat 4.1.2-42)) #1 SMP Thu Aug 7 14:58:14 EDT 
2008


uname -a
Linux example.com 2.6.18-92.1.10.el5.xs5.0.0.39xen #1 SMP Thu Aug 7 
14:58:14 EDT 2008 i686 i686 i386 GNU/Linux






Re: [GENERAL] PDF files: to store in database or not

2016-12-06 Thread David Wall

On 12/6/16 12:33 PM, Tom Lane wrote:

John R Pierce  writes:

On 12/6/2016 12:10 PM, Rich Shepard wrote:

I did not realize that a BLOB is not the same as a bytea (page 217 of the
9.6 PDF manual), and I cannot find BLOB as a postgres data type. Please
point me in the right direction to learn how to store PDFs as BLOBs.

indeed BYTEA is postgres's type for storing arbitrary binary objects
that are called BLOB in certain other databases.

Well, there are also "large objects", which aren't really a data type at
all.  If you're storing stuff large enough that you need to write/read
it in chunks rather than all at once, the large-object APIs are what
you want.

regards, tom lane



Yeah, we've not used much BYTEA, but use PG's large objects.  It also 
has a streaming API and you don't have to encode/decode every byte going 
in and out of the DB.


In a table, you just define the "blob_data" column as an OID.  Since we 
use Java/JDBC, this is handled by ResultSet.getBlob() for a 
java.sql.Blob object.
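
At the psql level the same pattern looks roughly like this (a sketch; the 
documents table and file paths are made up, and it uses the server-side 
lo_import/lo_export functions, which need superuser and server-local paths; 
a JDBC app would go through the Blob API instead):

psql dbname <<'SQL'
-- hypothetical table: one row per stored file, blob_data is just an OID reference
CREATE TABLE documents (doc_name text PRIMARY KEY, blob_data oid);
INSERT INTO documents VALUES ('report.pdf', lo_import('/tmp/report.pdf'));
-- pull it back out to a file on the server
SELECT lo_export(blob_data, '/tmp/report-copy.pdf') FROM documents WHERE doc_name = 'report.pdf';
SQL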


Some complain about DB backups being bigger if the PDFs are inside, 
which is true, but that presumes you don't also count backing up the 
filesystem PDFs separately (and there is no way to guarantee a 
consistent DB backup and PDF filesystem backup if the system is active 
when doing the backups).  You can certainly put the files in a 
filesystem and point to them, but you'll likely need some access control 
or people will be able to download any/all PDFs in a given folder.  In 
the DB, you surely will have access control, as I presume you don't allow 
browser access to the DB.


Either way, you may want to see if your PDFs compress well or not as 
that may save some storage space at the cost of compress/decompress on 
accesses.


David




Re: [GENERAL] PDF files: to store in database or not

2016-12-06 Thread David Wall

On 12/6/16 11:12 AM, Eric Schwarzenbach wrote:

On 12/06/2016 01:34 PM, Joshua D. Drake wrote:

On 12/06/2016 10:30 AM, Rich Shepard wrote:

  My thinking is to not store these documents in the database, but to store
them in subdirectories outside the database.

  Your thoughts?


Due to the widely variable size of a PDF document, I would say no. I 
would store the metadata and file location.



Can you elaborate on this? Why is the variable size an issue? Are you 
assuming the files go into the same table as the rest of the data? 
(They certainly don't have to, and I would assume that not to be the 
smartest design.)



The advantage of storing them in the database is that a DB backup will have 
everything, instead of needing a DB backup plus a file system backup. Using a 
BLOB, you can certainly keep track of variable-length PDFs.


Also, if in the database, it can be part of a transaction so you will 
not have any issues keeping the DB and filesystem in sync.


David




Re: [GENERAL] pg_dump slower than pg_restore

2014-07-06 Thread David Wall


On 7/6/2014 9:06 AM, Tom Lane wrote:

David Wall  writes:

There's one row in pg_largeobject_metadata per large object.  The rows in
pg_largeobject represent 2KB "pages" of large objects (so it looks like
your large objects are averaging only 8KB-10KB apiece).  The "metadata"
table was added in 9.0 to carry ownership and access permission data for
each large object.


Thanks for that insight.
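
A quick way to put numbers on that relationship (one metadata row per object, 
one row per ~2KB page of data; the byte count is approximate since the last 
page of each object is usually partial) is something like:

psql dbname -c "SELECT count(*) AS objects FROM pg_largeobject_metadata;"
psql dbname -c "SELECT count(*) AS pages, pg_size_pretty(count(*) * 2048) AS approx_data FROM pg_largeobject;"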

That metadata table is what first got me when I upgraded from 8.3 to 9.2 
and hit all sorts of LO permission errors.  I have found that I 
get the same sort of issue when I migrate from one system to another, 
presumably because the id of the owner has changed, though I use the 
same name each time.  I've taken to doing the large object permission 
assignment after every restore "just to be safe," but it has the 
drawback that I often have to set max_locks_per_transaction to a very 
high number (4) for the restore, and then I comment it back out once 
it's done and restart.  That number is less than the number of LOs by a 
long shot, so I'm not sure what an optimal number is, but I think at 2 I 
ran out during the re-permissioning of LOs.
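
For what it's worth, the edit amounts to something like this (a sketch; the 
value shown is only a placeholder, sized to comfortably exceed the number of 
objects touched in one transaction):

echo "max_locks_per_transaction = 40000" >> $PGDATA/postgresql.conf   # placeholder value
pg_ctl -D $PGDATA restart
# ... run the restore and the ALTER LARGE OBJECT loop ...
# then remove or comment the line and restart again to go back to the default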


It could be that when I restore, the objects take on the permission of 
the DB admin user (i.e. postgres) since it has full permissions for 
creating everything.  But I'd prefer that the objects all take on the 
ownership of the DB app user, which of course has more limited 
permissions, but otherwise is the user that does all of the 
inserts/updates/deletes/selects.  I'm not sure if I can create users in 
new databases with the same id when I'm using the same name or not.
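
One untested way to avoid the re-owning step (assuming the admin login is 
allowed to SET ROLE to the app role) would be to run the restore under that 
role, so the objects are created owned by it from the start:

pg_restore --role=dbuser -v -d dbname dump.backup   # dbuser, dbname and dump.backup are placeholders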



I think this report confirms something we'd worried about during 9.0
development, which was whether pg_dump wouldn't have issues with
sufficiently many large objects.  At the time we'd taught it to handle LOs
as if they were full-fledged database objects, since that was the easiest
way to piggyback on its existing machinery for handling ownership and
permissions; but that's rather expensive for objects that don't really
need all the trappings of, eg, dependency tracking.  We'd done some
measurements that seemed to indicate that the overhead wasn't awful for
medium-size numbers of large objects, but I'm not sure we tried it for
millions of 'em.

I guess the good news is that it's only being a bit slow for you and not
falling over completely.  Still, it seems like some more work is indicated
in this area.
Yes, it takes 3 hours to do the backup, which is generally okay.  It was 
just surprising that I could restore in 2 hours.


David





Re: [GENERAL] pg_dump slower than pg_restore

2014-07-05 Thread David Wall


On 7/4/2014 9:18 PM, Tom Lane wrote:

There are only 32 tables, no functions, but mostly large objects. Not
sure how to count the LOs, but from a quick check of the table sizes I
estimate only 2GB there, so 16GB could be LOs.  There are 7,528,803 entries
in pg_catalog.pg_largeobject.

Hmm ... how many rows in pg_largeobject_metadata?


pg_largeobject_metadata reports 1,656,417 rows.

By the way, what is pg_largeobject_metadata vs. pg_largeobject since the 
counts are so different?




   7547 esignfor  30  10 1148m 1.0g  852 S  2.3 26.9 14:10.27 pg_dump
--format=c --oids ibc01


I haven't tested it for any side issues, but the --oids can probably be 
removed as we don't cross-reference against OID columns anymore (all 
OIDs are just a field in a table that uses a UUID now for 
cross-referencing).  But removing it seemed to make no difference in the 
overall time for the pg_dump to complete.



That's a pretty large resident size for pg_dump :-( ... you evidently
have a lot of objects of some sort, and I'm betting it's LOs, but
let's make sure.


Is there a postgresql.conf setting that might help?  It's a small 1GB RAM 
Linux VM with a Tomcat web server (we give it 500-700MB) with the PG DB on 
it.  We don't do much but change max_connections to 70, shared_buffers 
to 128MB, maintenance_work_mem to 120MB, and checkpoint_segments to 6.


But in the end, I guess the main question is why the backup takes longer 
than the restore, which just seems counter-intuitive to me.


Thanks for all your help and thinking about it!




Re: [GENERAL] pg_dump slower than pg_restore

2014-07-05 Thread David Wall


On 7/4/2014 11:30 AM, Bosco Rama wrote:

Random thought:  What OS & kernel are you running?  Kernels between
3.2.x and 3.9.x were known to have IO scheduling issues.  This was
highlighted most by the kernel in Ubuntu 12.04 (precise) as shown
here:




I'm on CentOS 6.4 which seems to be Linux version 2.6.32-431.20.3.el6.x86_64

But it is a VM, so disk I/O can be rather random as there are other 
tenants.  While improving performance is nice, I was most interested in 
why a pg_dump takes longer than a pg_restore (nearly 50% longer, as it 
takes about 2.75 hours to dump but 2 hours to restore).  It's 
counter-intuitive, as reading from a DB is usually faster than writing 
into a DB.  I think those LOs are getting me, as our DB is LO-intensive 
(most data is encrypted blobs: encrypted uploaded user files and 
encrypted app-generated XML/HTML).





Re: [GENERAL] pg_dump slower than pg_restore

2014-07-04 Thread David Wall


On 7/4/2014 7:19 AM, Tom Lane wrote:

You haven't given us much info about the contents of this database.
Are there a lot of tables? functions? large objects?  How many is
"a lot", if so?

I'm suspicious that you're paying a penalty associated with pg_dump's
rather inefficient handling of metadata for large objects, but there's
not enough info in this thread to diagnose it.  It'd be very interesting
to see perf or oprofile stats on the pg_dump run, particularly during
the parts where it doesn't seem to be writing anything.


There are only 32 tables, no functions, but mostly large objects. Not 
sure how to count the LOs, but from a quick check of the table sizes I 
estimate only 2GB there, so 16GB could be LOs.  There are 7,528,803 entries 
in pg_catalog.pg_largeobject.


pg_database_size reports 18GB

biggest table sizes:
 relation                          |  size
-----------------------------------+---------
 public.esf_formparty              | 635 MB
 public.esf_activity_log           | 416 MB
 public.esf_form                   | 181 MB
 public.esf_encrypted_blob         | 134 MB
 public.esf_activity_log_ownertime | 73 MB
 public.esf_tranfield              | 72 MB
 public.esf_formpartytranididx     | 70 MB
 public.esf_formparty_pkey         | 65 MB
 public.esf_encrypted_blob_pkey    | 64 MB
 public.esf_formpartyididx         | 63 MB
 public.esf_tranfield_pkey         | 52 MB
 public.esf_formpartypickupidx     | 51 MB
 public.esf_activity_log_typetime  | 47 MB
 public.esf_tran                   | 46 MB
 public.esf_formorderidx           | 46 MB
 public.esf_form_pkey              | 42 MB
 public.esf_tranfieldvalueidx      | 39 MB
 public.esf_traninittimeidx        | 19 MB
 public.esf_tranupdatetimeidx      | 19 MB
 public.esf_tran_pkey              | 13 MB

Basic top stats while running show:

top - 08:53:40 up 27 days, 17:38,  1 user,  load average: 1.03, 1.12, 1.22
Tasks: 156 total,   1 running, 155 sleeping,   0 stopped, 0 zombie
Cpu(s):  1.3%us,  0.6%sy,  0.4%ni, 74.2%id, 23.5%wa,  0.0%hi,  0.0%si,  0.0%st

Mem:   3974112k total,  3954520k used,    19592k free,    46012k buffers
Swap:  4245496k total,    29996k used,  4215500k free,  1123844k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 7549 esignfor  20   0  116m 1372  884 S  3.0  0.0 16:39.69 gpg --batch --symmetric --cipher-algo AES256 --passphrase 3z4ig0Rq]w
 7547 esignfor  30  10 1148m 1.0g  852 S  2.3 26.9 14:10.27 pg_dump --format=c --oids ibc01
 7548 esignfor  20   0  4296  748  372 S  2.3  0.0 13:05.44 gzip
 7551 esignfor  20   0  555m 413m 410m D  1.7 10.6  9:32.03 postgres: esignforms ibc01 [local]
 1978 esignfor  20   0 15032 1372 1004 R  0.7  0.0  0:00.27 top -c
 7550 esignfor  20   0 98.6m  592  472 S  0.3  0.0  0:49.80 split -b 51200 - /home/esignforms/customers/archive/db/dump.20140704.gz.gpg
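
To get the profile Tom asked about, something along these lines against the 
pg_dump and backend PIDs shown above should work (assuming perf is installed; 
the PIDs are of course specific to this run):

perf record -p 7547 -- sleep 60   # sample pg_dump for a minute
perf report
perf record -p 7551 -- sleep 60   # then the backend serving it
perf report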






Re: [GENERAL] pg_dump slower than pg_restore

2014-07-03 Thread David Wall


On 7/3/2014 11:47 AM, Eduardo Morras wrote:
No, there's nothing wrong. All transparently compressed objects stored 
in the database (TOAST, LO, etc.) are transparently decompressed while 
pg_dump accesses them, and then you gzip it again. I don't know why it 
doesn't dump the compressed data directly.


That sounds odd, but if pg_dump decompresses the large objects and then 
I gzip them on backup, doesn't the same more or less happen in reverse 
when I pg_restore them?  I mean, I gunzip the backup and then pg_restore 
must compress the large objects when it writes them back.


It just seems odd that pg_dump is slower than pg_restore to me. Most 
grumblings I read about suggest that pg_restore is too slow.


I have noted that the last split file segment will often appear to be 
done -- no file modifications -- while pg_dump is still running, often 
for another 20 minutes or so, and then some last bit is finally 
written.  It's as if pg_dump is calculating something at the end that is 
quite slow.  At startup, there's a delay before data is written, too, 
but it's generally 1-2 minutes at most.





Re: [GENERAL] pg_dump slower than pg_restore

2014-07-03 Thread David Wall


On 7/3/2014 10:13 PM, Bosco Rama wrote:
Is the issue with S3 or just transfer time? I would expect that 
'rsync' with the '--partial' option (or -P if you want progress info 
too) may help there.


Don't know if rsync and S3 work together or what that would mean, but 
it's not an issue I'm suffering now.  I do think they may now have a 
multipart upload with s3cmd (which I use), though that also wasn't 
available when we first built our scripts.


I suspect nothing is really helping here and I'm mostly limited by disk 
I/O, but not sure why the pg_dump is so much slower than pg_restore as 
they are all on the same disks.  I say this because even with pg_dump 
-Z0 | gpg -z 0 and gzip removed entirely and no --oids on pg_dump, 
there's no effective difference in overall speed.  While I can see all 
of those processes vying for resources via 'top -c', the throughput 
remains much the same.
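
A simple way to confirm that (just a sanity check, not something from this 
thread) is to time the dump with the rest of the pipeline and the output disk 
taken out of the picture entirely:

time pg_dump -Z0 --format=c dbname > /dev/null   # raw dump speed, no gzip/gpg/split/disk writes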





Re: [GENERAL] pg_dump slower than pg_restore

2014-07-03 Thread David Wall


On 7/3/2014 6:26 PM, Bosco Rama wrote:
Hmmm. You are using '--oids' to *include* large objects? IIRC, that's 
not the intent of that option. Large objects are dumped as part of a 
DB-wide dump unless you request that they not be. However, if you 
restrict your dumps to specific schemata and/or tables then the large 
objects are NOT dumped unless you request that they are. Something to 
keep in mind. 


I can get rid of --oids and see what happens.  I used to have 
cross-table references to OID fields before, so this is no doubt a 
holdover, but I think I am now using UUIDs for all such links/references 
and the OID fields are just like any other data field.  It may not be 
needed and I'll see if it speeds up the backup and restores correctly.



Many of the large objects are gzip compressed when stored.  Would I be
better off letting PG do its compression and remove gzip, or turn off
all PG compression and use gzip?  Or perhaps use neither if my large
objects, which take up the bulk of the database, are already compressed?

OK.  Given all the above (and that gpg will ALSO do compression unless
told not to), I'd go with the following (note lowercase 'z' in gpg
command).  Note also that there may be a CPU vs I/O trade-off here that
may make things muddier but the following are 'conceptually' true.

Fast but big

$ pg_dump -Z0 -Fc ... $DB | gpg -z0 ... | split

Less fast but smaller
=
$ pg_dump -Z1 -Fc ... $DB | gpg -z0 ... | split


I'll give that a try now.  I didn't notice any real time savings when I 
changed pg_dump without any -Z param to -Z 0, and oddly, not much of a 
difference removing gzip entirely.



BTW, is there any particular reason to do the 'split'?
Yes, I transfer the files to Amazon S3 and there were too many troubles 
with one really big file.


Thanks again...





Re: [GENERAL] pg_dump slower than pg_restore

2014-07-03 Thread David Wall


On 7/3/2014 5:13 PM, Bosco Rama wrote:
If you use gzip you will be doing the same 'possibly unnecessary' 
compression step. Use a similar approach to the gzip command as you 
would for the pg_dump command. That is, use one of the -[0-9] options, 
like this: $ pg_dump -Z0 -Fc ... | gzip -[0-9] ... 


Bosco, maybe you can recommend a different approach.  I pretty much run 
daily backups that I only have for disaster recovery.  I generally don't 
do partial recoveries, so I doubt I'd ever modify the dump output.  I 
just re-read the docs about formats, and it's not clear what I'd be best 
off with, and "plain" is the default, but it doesn't say it can be used 
with pg_restore.


Maybe the --format=c isn't the fastest option for me, and I'm less sure 
about the compression.  I do want to be able to restore using pg_restore 
(unless plain is the best route, in which case, how do I restore that 
type of backup?), and I need to include large objects (--oids), but 
otherwise, I'm mostly interested in it being as quick as possible.


Many of the large objects are gzip compressed when stored.  Would I be 
better off letting PG do its compression and remove gzip, or turn off 
all PG compression and use gzip?  Or perhaps use neither if my large 
objects, which take up the bulk of the database, are already compressed?






Re: [GENERAL] pg_dump slower than pg_restore

2014-07-03 Thread David Wall


On 7/3/2014 10:38 AM, Tim Clarke wrote:
I'd also check the effect of those other run components; the vacuums 
and other things that are only running with the backup and not during 
the restore. 


The vacuumlo, vacuum and analyze run before the pg_dump.  I am not 
talking about any of the time they spend doing anything.  What I am 
measuring is the last modified timestamps of the 512MB split files only, 
so if anything, I'm under-measuring by the time it takes to do the first 
512MB segment.






Re: [GENERAL] pg_dump slower than pg_restore

2014-07-03 Thread David Wall


On 7/3/2014 10:36 AM, Bosco Rama wrote:

If those large objects are 'files' that are already compressed (e.g.
most image files and pdf's) you are spending a lot of time trying to
compress the compressed data ... and failing.

Try setting the compression factor to an intermediate value, or even
zero (i.e. no dump compression).  For example, to get the 'low hanging
fruit' compressed:
 $ pg_dump -Z1 -Fc ...

IIRC, the default value of '-Z' is 6.

As usual your choice will be a run-time vs file-size trade-off so try
several values for '-Z' and see what works best for you.


That's interesting.  Since I gzip the resulting output, I'll give -Z0 a 
try.  I didn't realize that any compression was on by default.


Thanks for the tip...




[GENERAL] pg_dump slower than pg_restore

2014-07-03 Thread David Wall
I'm running PG 9.3.4 on CentOS 6.4 and noted that backing up my database 
takes much longer than restoring it.


That seems counter-intuitive to me because it seems like reading from a 
database should generally be faster than writing to it.


I have a database that pg_database_size reports as 18GB, and resulting 
dump is about 13GB in 27 files (split creates them as 512MB).


A pg_dump backup -- with most of the data stored as large objects -- 
takes about 5 hours.


But restoring that dump takes about 2 hours.  So it's taking 2.5 times 
longer to back it up than to restore it.


My backup script runs vacuumlo, then vacuum, then analyze, then pg_dump 
--format=c --oids $DB


I actually push pg_dump output through gzip, gpg and split into 512MB 
files, but I figure they shouldn't matter too much as I have to run cat, 
gpg and gunzip before pg_restore.  In fact, my restore should have been 
at a disadvantage because I used '-v' and showed the results to my ssh 
term over the Internet which includes a line for each LOID, and the 
postgresql.conf had 'ddl' logging on (which I suspect I can turn off in 
future restores to speed things up a bit).
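
For concreteness, the backup and restore pipelines described above look 
roughly like this (a sketch; file names, the passphrase handling and the 
database name are placeholders, and gpg option spellings vary by version):

DATE=$(date +%Y%m%d)
pg_dump --format=c --oids dbname \
  | gzip \
  | gpg --batch -z 0 --symmetric --passphrase-file /path/to/pass \
  | split -b 512M - /backups/dump.$DATE.gz.gpg.

cat /backups/dump.$DATE.gz.gpg.* \
  | gpg --batch --decrypt --passphrase-file /path/to/pass \
  | gunzip \
  | pg_restore -v -d dbname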


Is there something that might be wrong about my configuration that the 
backup is slower than the restore?


Thanks,
David





Re: [GENERAL] Warm standby (log shipping) from PG 8.3 to 9.3

2014-06-10 Thread David Wall

On 6/10/2014 11:54 AM, hubert depesz lubaczewski wrote:
On Tue, Jun 10, 2014 at 8:13 PM, David Wall <d.w...@computer.org> wrote:


Is it safe to assume that my working PG 8.3 archive command on the
master and recovery.conf (using contrib's pg_standby) on the
standby will work the same under 9.3?


Yes, it will work just fine. Of course you can't load 9.3 xlogs into 
8.3, or 8.3 xlogs into 9.3, but the commands are the same.


Thanks.  Yes, that makes sense as I'll update both DBs to the same 
version and we'll likely start with a fresh DB snapshot.


Do you know if there's a way to turn that warm standby into a hot 
standby (so I can query the standby during its ongoing recovery) 
easily?  Any docs that show the changes necessary to make that happen, 
or is that a bigger task?


David


[GENERAL] Warm standby (log shipping) from PG 8.3 to 9.3

2014-06-10 Thread David Wall
Is it safe to assume that my working PG 8.3 archive command on the 
master and recovery.conf (using contrib's pg_standby) on the standby 
will work the same under 9.3?


That is, under PG 8.3, my master server uses:

archive_mode = on
archive_command = '~/postgresql/bin/copyWAL "%p" "%f"'
archive_timeout = 300

(and copyWAL does an SCP to the warm standby's recovery WALs directory)

And my warm standby uses this recovery.conf:

restore_command = '~/postgresql/bin/pg_standby -l -d -t 
~/postgresql/recoveryWALs/STOP_RECOVERY ~/postgresql/recoveryWALs %f %p 
%r 2>> ~/postgresql/logs/pg_standby.log'


I'm getting ready to do a migration to 9.3 and wanted to know if I had 
to address this concern or whether it should just work the same as in 
8.3.  We're not ready to try streaming.


I'll read up on what it means to do hot standby instead of warm 
standby.  We don't expect to need to use the standby for running queries 
from our app, but it would sure be nice, from a comfort level, for 
operations to be able to see updates in the standby.  Is it really just 
as easy as our current warm standby but just adding 'wal_level = 
hot_standby' to the master's postgresql.conf ?  Is there anything I can 
read about moving from 8.3 warm standby to 9.3 hot standby?


It's a bit confusing because of the various options for standby mode.  
Would it be better to use the new standby setup instead of pg_standby?
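
For what it's worth, a minimal 9.3-style file-shipping hot standby looks 
roughly like the following (a sketch only; paths and the trigger file are 
placeholders, and it uses the built-in standby_mode machinery rather than 
pg_standby):

# master postgresql.conf
wal_level = hot_standby
archive_mode = on
archive_command = '~/postgresql/bin/copyWAL "%p" "%f"'

# standby postgresql.conf
hot_standby = on

# standby recovery.conf
standby_mode = 'on'
restore_command = 'cp /home/postgres/recoveryWALs/%f "%p"'
trigger_file = '/home/postgres/recoveryWALs/STOP_RECOVERY'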


Thanks,
David




Re: [GENERAL] Upgrading from PG 8.3.3 to 9.3.4 - FATAL: invalid value for parameter "TimeZone": "PST"

2014-06-01 Thread David Wall


On 6/1/2014 9:05 AM, Adrian Klaver wrote:


I ask because a look at the PG JDBC code shows this, which did not 
show up until Sept 22, 2011. Not sure what release, but it looks like 
9.2+:


// Construct and send a startup packet.
String[][] params = {
    { "user", user },
    { "database", database },
    { "client_encoding", "UTF8" },
    { "DateStyle", "ISO" },
    { "extra_float_digits", "2" },
    { "TimeZone", createPostgresTimeZone() },



 /**
 * Convert Java time zone to postgres time zone.
 * All others stay the same except that GMT+nn changes to GMT-nn and
 * vise versa.
 *
 * @return The current JVM time zone in postgresql format.
 */
private String createPostgresTimeZone() {
String tz = TimeZone.getDefault().getID();
if (tz.length() <= 3 || !tz.startsWith("GMT")) {
return tz;
}
char sign = tz.charAt(3);
String start;
if (sign == '+') {
start = "GMT-";
} else if (sign == '-') {
start = "GMT+";
} else {
// unknown type
return tz;
}

return start + tz.substring(4);

Ahh I see the problem.

From here:

A time zone abbreviation, for example PST. Such a specification merely 
defines a particular offset from UTC, in contrast to full time zone 
names which can imply a set of daylight savings transition-date rules 
as well. The recognized abbreviations are listed in the 
pg_timezone_abbrevs view (see Section 47.70). You *cannot* set the 
configuration parameters TimeZone or log_timezone to a time zone 
abbreviation, but you *can* use abbreviations in date/time input values 
and with the AT TIME ZONE operator.


So:

test=> select version();
                               version
------------------------------------------------------------------------
 PostgreSQL 9.3.4 on i686-pc-linux-gnu, compiled by gcc (SUSE Linux) 4.7.2 20130108 [gcc-4_7-branch revision 195012], 32-bit
(1 row)

test=> set TimeZone='PST';
ERROR:  invalid value for parameter "TimeZone": "PST"
test=> set TimeZone='PST8PDT';
SET

The JDBC code above, if I am following correctly, is picking up a 
default timezone of 'PST' and then in the first if returning that as 
the tz value to SET TimeZone in the startup packet.


Two things.

1) Where is it getting PST from ?

2) Should the driver even be returning an abbreviation given that 
Postgres will not accept it as a TimeZone value?





Thanks for the extra help, Adrian.

It led me to investigate the 35 webapps we deploy on Tomcat and I found 
2 rogue apps that set their timezone to "PST".  Once I fixed these two, 
all is working great again.


I guess there's no bug, per se, except in our configuration.  Once we 
changed it to PST8PDT, all was good again.
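
One belt-and-suspenders step (not from this thread, just a common Tomcat 
practice) is to pin the JVM's default zone to a full zone name so a stray 
webapp is less likely to hand the driver a bare abbreviation:

# e.g. in Tomcat's bin/setenv.sh (the usual convention; adjust to your layout)
export CATALINA_OPTS="$CATALINA_OPTS -Duser.timezone=America/Los_Angeles"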


Thanks,
David





Re: [GENERAL] Upgrading from PG 8.3.3 to 9.3.4 - FATAL: invalid value for parameter "TimeZone": "PST"

2014-06-01 Thread David Wall

On 5/31/2014 11:47 PM, John R Pierce wrote:

On 5/31/2014 11:41 PM, David Wall wrote:


 FATAL:  invalid value for parameter "TimeZone": "PST" 


I'd be using "America/Los_Angeles" as the timezone rather than PST, 
as the TLA timezones are ambiguous (CST is both Central Standard Time 
in the USA, and China Standard Time).


but that's probably not the problem; it's hard to guess what is, and it 
sounds like you have a lot going on.




PG itself has the timezone US/Pacific set in postgresql.conf

In Linux, we show:
lrwxrwxrwx 1 root root 39 Apr 23 16:00 /etc/localtime -> 
/usr/share/zoneinfo/America/Los_Angeles


Even when we use PST in our Java code, we use PST8PDT.

The exception occurs when JDBC tries to connect to PG:

2014-05-31 22:14:34,351 ERROR (eSignForms) SQLException: ConnectionPool.makeConnection(esf) to URL: jdbc:postgresql://localhost.localdomain:25432/zingr:
2014-05-31 22:14:34,352 ERROR (eSignForms)   Message:   FATAL: invalid value for parameter "TimeZone": "PST"
org.postgresql.util.PSQLException: FATAL: invalid value for parameter "TimeZone": "PST"
    at org.postgresql.core.v3.ConnectionFactoryImpl.readStartupMessages(ConnectionFactoryImpl.java:574)
    at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:177)
    at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:64)
    at org.postgresql.jdbc2.AbstractJdbc2Connection.<init>(AbstractJdbc2Connection.java:138)
    at org.postgresql.jdbc3.AbstractJdbc3Connection.<init>(AbstractJdbc3Connection.java:29)
    at org.postgresql.jdbc3g.AbstractJdbc3gConnection.<init>(AbstractJdbc3gConnection.java:21)
    at org.postgresql.jdbc4.AbstractJdbc4Connection.<init>(AbstractJdbc4Connection.java:31)
    at org.postgresql.jdbc4.Jdbc4Connection.<init>(Jdbc4Connection.java:24)
    at org.postgresql.Driver.makeConnection(Driver.java:410)
    at org.postgresql.Driver.connect(Driver.java:280)
    at java.sql.DriverManager.getConnection(DriverManager.java:571)
    at java.sql.DriverManager.getConnection(DriverManager.java:215)
    at com.esignforms.db.ConnectionPool.makeConnection(ConnectionPool.java:302)


And of course none of this seems to explain why it works often enough, but then 
fails as more apps are deployed.  Ugh!





[GENERAL] Upgrading from PG 8.3.3 to 9.3.4 - FATAL: invalid value for parameter "TimeZone": "PST"

2014-05-31 Thread David Wall
I am posting to both PG and the JDBC group since I am using JDBC to 
connect, so I'm not sure if this is a PG issue or a PG JDBC issue.


I am upgrading from 8.3.3 to 9.3.4.  I ran pg_dump on my 8.3.3 databases 
(which ran on CentOS 5.2, Java 6) and ran pg_restore on my 9.3.4 
databases (which runs on CentOS 6.4, Java 7.0.55, using JDBC 
postgresql-9.3-1101.jdbc41.jar).  The new databases appear to be correct 
in that I can run psql and query my tables and I don't see any obvious 
issues.


In fact, I seem to be able to run several applications at the same time, 
each connecting to its own database.


But I also see that sometimes I get this error in my postgresql.log file:

 FATAL:  invalid value for parameter "TimeZone": "PST"

I normally run about 35-40 webapps at a time in our dev environment 
without issue, but when I try to start them all, it seems this error 
occurs on nearly every application.  Yet when I only run about 10, I 
don't seem to see the issue at all.  I have tried adding them in one at 
a time to see if there was a "rogue" webapp, but when I find a newly 
added one suffers the TimeZone FATAL error above, if I restart Tomcat, 
it may then be fine.  Apps that run okay at one time will exhibit the 
error when I try to add the others in.


Does anybody have any ideas on what I might try?  I did a 'yum update' 
to get the latest, rebooted, recompiled PG and reloaded the DBs, but 
that made no difference.


Thanks for any tips if anybody has seen this!

David




Re: [GENERAL] Permissions on large objects - db backup and restore

2013-04-04 Thread David Wall


On 4/3/2013 5:57 PM, Tom Lane wrote:
$ pg_restore -? ... -O, --no-owner skip restoration of object 
ownership ... So there you have it. pg_restore just restored all the 
objects (blobs and otherwise) as owned by the user running it. I 
should think you'd have had issues with other things besides the blobs 
by now. regards, tom lane 


Thank you!  Unsure why that -O was being used; when I check 
version control, it's been in there since PG 8.3 days.  I hadn't checked 
on its meaning and just assumed it was something to do with OIDs like 
the pg_dump flags.  No doubt it solved something back then, since TABLE 
GRANTs could be run after a restore, and before 9.0 the loids had no 
permission issue to deal with.


Thanks again, Tom!  Best regards, David






Re: [GENERAL] Permissions on large objects - db backup and restore

2013-04-03 Thread David Wall


On 4/3/2013 3:14 PM, Tom Lane wrote:
A 9.2->9.2 dump and restore certainly should preserve large object 
ownership (and permissions, if you've set any). In a quick check I do 
see "ALTER LARGE OBJECT nnn OWNER TO ..." commands in pg_dump's output 
for such a case. Are you sure this is really the "same" issue? Are you 
doing something strange like using pg_dump's --no-owner option? Did 
you get any errors while doing the pg_dump or pg_restore? (Trying to 
run the restore as non-superuser would mean the ALTER OWNER commands 
would fail, but you should have gotten plenty of bleats about that.) 
regards, tom lane 


We used this to back up on server1 (9.2.2):

pg_dump --format=c --oids dbname

And we restored on server2 (9.2.3) with:

pg_restore -v -O -d dbname

The application user/role has the same name as the dbname but has more 
limited permissions than the superuser role $PGUSER used when running 
pg_dump and pg_restore.


How can I check if ALTER LARGE OBJECT is specified in my backup?  Do I 
need to change the --format option?
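
One way to peek without restoring anything and without changing the dump 
format (the archive name is a placeholder) is to list the archive's table of 
contents, where the blob entries should show up with their owner:

pg_restore -l dbname.backup | grep -i blob | head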


I'm not positive it was the same issue as when I upgraded from 8.4, but 
I did note that my db had a different pg_largeobject_metadata.lomowner 
value before and after I ran that DO script to alter each one, and the 
problem with reading a large object in my code went away.


Thanks, David







[GENERAL] Permissions on large objects - db backup and restore

2013-04-03 Thread David Wall
When we upgraded from PG 8.4 to PG 9.2, we ran into a permissions issue 
with large objects as discussed here: 
http://postgresql.1045698.n5.nabble.com/Large-Object-permissions-lost-in-transfer-td4281604.html


The basic solution was to do an ALTER LARGE OBJECT and set the OWNER TO 
using a script like the following in our bash script:


do \$\$
declare r record;
begin
for r in select distinct loid from pg_catalog.pg_largeobject loop
execute 'ALTER LARGE OBJECT ' || r.loid || ' OWNER TO $DBUSER';
end loop;
end\$\$;
CLOSE ALL;

I thought it had to do with an 8.4 backup and a 9.2 restore, but even 
when I did a backup on a 9.2.2 and restored on a 9.2.3 (we migrated to a 
new server at the same time), the same issue arose. Is there a setting 
for pg_dump and pg_restore so that our large objects don't run into this 
issue?  I suspect I'm missing something easy, or do I just need to do 
this after any restore where I change systems?  I suspect it's because 
my PG is owned with a superuser account like 'dbadmin' which I use to 
run the backups and restores, while the DB itself is owned by a less 
privileged user account like 'dbuser'.  It may be that on restore, the 
large objects are all owned by dbadmin instead of dbuser?


Thanks for any clarifications.  I may just find I'll put that script 
above in my table grants that we use to set all such permissions for tables.


David




Re: [GENERAL] PG 8.4 to 9.2 upgrade issues with ownership of large objects

2012-12-17 Thread David Wall

On 12/16/2012 6:21 PM, Tom Lane wrote:

David Wall  writes:

On 12/16/2012 11:22 AM, Tom Lane wrote:
Large objects didn't have privileges, nor owners, in 8.4.  If you don't
feel like fixing your apps right now, you can return to the previous
behavior by setting the obscurely-named lo_compat_privileges setting in
postgresql.conf.



I'll read up more on permissions as they relate to large objects.



Okay, I cannot find any information on how to view the permissions for a 
large object.  I know that psql's \dp can be used to see the permissions 
on a table, but how do I see the permissions assigned to a large object?


The docs show how to use GRANT/REVOKE ON LARGE OBJECT loid, and a 
script that shows ALTER LARGE OBJECT loid OWNER TO..., but I don't see 
anything about retrieving the loid's permissions.
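
The per-object owner and ACL live in pg_largeobject_metadata (9.0 and later), 
so a plain query gets at them; psql's \lo_list (\dl) shows the owner but not 
the ACL.  A NULL lomacl just means default, owner-only privileges:

psql dbname -c "SELECT oid AS loid, pg_get_userbyid(lomowner) AS owner, lomacl FROM pg_largeobject_metadata LIMIT 10;"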


Thanks, David




Re: [GENERAL] PG 8.4 to 9.2 upgrade issues with ownership of large objects

2012-12-17 Thread David Wall


On 12/16/2012 6:21 PM, Tom Lane wrote:

David Wall  writes:

On 12/16/2012 11:22 AM, Tom Lane wrote:

Large objects didn't have privileges, nor owners, in 8.4.  If you don't
feel like fixing your apps right now, you can return to the previous
behavior by setting the obscurely-named lo_compat_privileges setting in
postgresql.conf.

I am using the latest JDBC driver and have not noted any other issues
with large objects accessed, created or deleted using the blob
interfaces.  What does fixing an app mean or entail?  We've always
accessed large objects as a simple blob stored and referenced in a table
as an OID with both having the same lifetime.

It would only be an issue if you created large objects under one role
and then tried to access them under another, since the default
permissions would forbid that.  I assumed since you were complaining
that you'd run into something of the sort ...

regards, tom lane

Okay, that's good.  I suspect the JDBC library is taking care of this, 
but I'll check with them. I think our issue was just on the restore of 
the backup from 8.4 which had no owner/permissions and the restore into 
9.2 in which they were assigned.  Our apps all use a more limited role 
than what the db admin uses, and I guess that's when it first appeared.  
We do reset all of the GRANTs for tables for the application role, but I 
guess that doesn't make it down to the large objects.  I'll read up more 
on permissions as they relate to large objects.


Thanks again for all your help, Tom.

David




Re: [GENERAL] PG 8.4 to 9.2 upgrade issues with ownership of large objects

2012-12-16 Thread David Wall


On 12/16/2012 11:22 AM, Tom Lane wrote:

David Wall  writes:

In past PG upgrades, we've done a pg_dump on the current version, then a
pg_restore on the new version.  But during the 8.4 to 9.2 upgrade (on
Linux x64), we ran into issues with the permissions associated with the
large objects after the restore.

Large objects didn't have privileges, nor owners, in 8.4.  If you don't
feel like fixing your apps right now, you can return to the previous
behavior by setting the obscurely-named lo_compat_privileges setting in
postgresql.conf.

regards, tom lane



Thanks for the information, Tom and Adrian.

I am using the latest JDBC driver and have not noted any other issues 
with large objects accessed, created or deleted using the blob 
interfaces.  What does fixing an app mean or entail?  We've always 
accessed large objects as a simple blob stored and referenced in a table 
as an OID with both having the same lifetime.







[GENERAL] PG 8.4 to 9.2 upgrade issues with ownership of large objects

2012-12-16 Thread David Wall
In past PG upgrades, we've done a pg_dump on the current version, then a 
pg_restore on the new version.  But during the 8.4 to 9.2 upgrade (on 
Linux x64), we ran into issues with the permissions associated with the 
large objects after the restore.


Is this something new or were we just "lucky" before?

Our postmaster runs many databases, with each database owned by the PG 
admin, but we normally just used a set of GRANT statements to provide 
appropriate access control to the application user. In our case, each 
database has its own application user which accesses the DB for a web 
app (and for convenience, the DBNAME and DBUSER are the same name).


Our pg_dump command is basically: pg_dump --format=c --oids DBNAME

Our pg_restore is basically: pg_restore -v -O -d DBNAME

Should we be doing this differently now as we never found an issue 
before this somewhat significant update from 8.4 to 9.2?


We resolved the issue from a posting we saw online that basically 
suggested this resolution after the restore with psql:


do $$
declare r record;
begin
for r in select loid from pg_catalog.pg_largeobject loop
execute 'ALTER LARGE OBJECT ' || r.loid || ' OWNER TO DBUSER';
end loop;
end$$;
CLOSE ALL;

Thanks for any good pointers or tips on this.

David





Re: [GENERAL] UUID column as pimrary key?

2011-01-04 Thread David Wall
We're using UUID for primary keys in PG 8.4 without any issues.  I have 
no real insights into the details or performance issues, but always 
figured it was stored as a binary 128-bit value, with the added benefit 
of being able to enter and view them using a standard string format.  We 
don't sort them as they have no real meaning for us.
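
For reference, the pattern is nothing special (a sketch; the table is made up, 
and uuid_generate_v4() assumes the contrib uuid-ossp module is installed; in 
our case the UUIDs are actually generated by the application and simply bound 
as strings):

psql dbname -c "CREATE TABLE widget (widget_id uuid PRIMARY KEY, payload text);"
psql dbname -c "INSERT INTO widget VALUES (uuid_generate_v4(), 'example row');"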


On 1/4/2011 11:07 AM, Dennis Gearon wrote:

I haven't been able to find anywhere, easily, in the documentation using google
where a list of allowed data types for primary keys is.

So, UUIDs can be primary keys?
Any issues wtih them on sorting or paging of index tables, etc.?

Also, the documentation says that UUIDs are 128 bit value, but never explicitly
says that's how it's stored. Nor does it use one of the nice, blue headered
tables for UUID (or ENUM) showing storage and other attributes as it does for
numeric, character,boolean, date/time, binary, monetary, geometric, or network
types.




Re: [GENERAL] Storing many big files in database- should I do it?

2010-04-29 Thread David Wall



Huh ???  isn't that point of using bytea or text datatypes.

I could have sworn bytea does not use large object interface it uses
TOAST or have i gone insane

 

You're not insane :)

Put it another way: bytea values are not stored in the pg_largeobject
catalog.
   


I missed the part that BYTEA was being used, since it's generally not a 
good way of storing large binary data: you are right that BYTEA 
requires escaping across the wire (client to backend) in both directions, 
which for true binary data (like compressed/encrypted data, images or 
other non-text files) makes for a lot of expansion in size and related 
memory.


BYTEA and TEXT both can store up to 1GB of data (max field length), 
which means even less "file size" supported if you use TEXT with base64 
coding.  LO supports 2GB of data.  In JDBC, typically BYTEA is used with 
byte[] or a binary stream, while LOs are used with BLOB.  I think LOs allow for 
streaming with the backend, too, but I'm not sure about that, whereas I'm 
pretty sure BYTEA/TEXT move all the data together, so it will be in 
memory all or nothing.


Of course, to support file storage larger than 1GB or 2GB, you'll have 
to create your own "toast"-like capability to split them across multiple rows.
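
A home-grown chunking scheme along those lines might look like this (purely 
illustrative; the names and the roughly-1MB slice size are made up, and the 
application would do the splitting and reassembly):

psql dbname <<'SQL'
CREATE TABLE file_chunk (
    file_id   uuid     NOT NULL,
    chunk_no  integer  NOT NULL,   -- 0,1,2,... each row holding up to ~1MB of the file
    data      bytea    NOT NULL,
    PRIMARY KEY (file_id, chunk_no)
);
SQL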


David



Re: [GENERAL] Storing many big files in database- should I do it?

2010-04-29 Thread David Wall

Things to consider when *not* storing them in the DB:

1) Backups of DB are incomplete without a corresponding backup of the files.

2) No transactional integrity between filesystem and DB, so you will 
have to deal with orphans from both INSERT and DELETE (assuming you 
don't also update the files).


3) No built in ability for replication, such as WAL shipping

The big downside for the DB is that all large objects appear to be stored 
together in pg_catalog.pg_largeobject, which seems axiomatically 
troubling: you know you have lots of big data, yet you store it all 
together, and then worry about running out of 'loids'.


David

On 4/29/2010 2:10 AM, Cédric Villemain wrote:

2010/4/28 Adrian Klaver:

On Tuesday 27 April 2010 5:45:43 pm Anthony wrote:

On Tue, Apr 27, 2010 at 5:17 AM, Cédric Villemain <cedric.villemain.deb...@gmail.com> wrote:

store your files in a filesystem, and keep the path to the file (plus
metadata, acl, etc...) in database.

What type of filesystem is good for this?  A filesystem with support for
storing tens of thousands of files in a single directory, or should one
play the 41/56/34/41563489.ext game?

I'd prefer to go with XFS or ext{3-4}, in both cases with the path game.
Your path game will let you handle the scalability of your uploads (so
the first increment is the first directory), something like
1/2/3/4/foo.file, 2/2/3/4/bar.file, etc.  You might explore a hash
function or something that splits a SHA1 (or other) sum of the file to
get the path.

Are there any open source systems which handle keeping a filesystem and
database in sync for this purpose, or is it a wheel that keeps getting
reinvented?

I know "store your files in a filesystem" is the best long-term solution.
But it's just so much easier to just throw everything in the database.

In the for-what-it-is-worth department check out this Wiki:
http://sourceforge.net/apps/mediawiki/fuse/index.php?title=DatabaseFileSystems

and postgres fuse also :-D

--
Adrian Klaver
adrian.kla...@gmail.com


[GENERAL] VACUUM FULL vs backup/restore

2010-04-27 Thread David Wall
We are copying a production database and then paring it down 
dramatically (perhaps removing 99% of records, most of which are large 
objects).


It seems my options are 1) VACUUM FULL with a reindexdb; or 2) backup, 
then restore.


Is there anything one does better than the other?  Our impression is 
that the backup and restore will run faster (perhaps 2 hours), whereas 
we have a currently running VACUUM FULL that's been running for 4.5 
hours already.


Anybody have any experience on this?  Would a backup/restore essentially 
create a minimally sized database with all fresh indexes?


Thanks,
David



Re: [GENERAL] Where do you store key for encryption

2009-11-17 Thread David Wall
In our open-esignforms project we use a layered approach for keys in 
which we have a boot key for the application that requires dual 
passwords which we then combine into a single password for PBE 
encryption of the boot key.  We then have session keys that are 
encrypted with the boot key, and the session keys are used to encrypt 
one-up keys for encrypted blobs.


In your case, you could encrypt your key using PBE assuming you have a 
way to provide the password to unlock it.  This would allow you to 
protect the key with a password, which is the most basic way to go if 
you don't have a keystore to use.


David



[GENERAL] pg_dump on Linux with pg_restore on Windows?

2009-11-03 Thread David Wall
I know this used to be a problem, but I don't have a Windows server to 
test it on again.


Can I do a pg_dump on Linux and then pg_restore it on Windows?  We're 
using PG 8.3 now, but could certainly upgrade if that made the difference.


Thanks,
David




Re: [GENERAL] pg_dump on Linux with pg_restore on Windows?

2009-11-03 Thread David Wall


I know this used to be a problem, but I don't have a Windows server to 
test it on again.


Can I do a pg_dump on Linux and then pg_restore it on Windows?  We're 
using PG 8.3 now, but could certainly upgrade if that made the 
difference.


Okay, maybe I answered my own question, though if anybody has experience 
one way or another, I'd love to hear it.  It looks like the pg_dump 
option -Fp (or perhaps just no -F option at all) on Linux would write 
the backup in script format, so it may allow it to be processed by psql 
on Windows.


If anybody has done this, let me know since I'd hate to have someone 
order a new server only to find out we can't get it to work.


Thanks again,
David



[GENERAL] Operational performance: one big table versus many smaller tables

2009-10-26 Thread David Wall
If I have various record types that are "one up" records that are 
structurally similar (same columns) and are mostly retrieved one at a 
time by their primary key, is there any performance or operational benefit 
to having millions of such records split across multiple tables (say by 
their application-level purpose) rather than all in one big table? 

I am thinking of PG performance (handling queries against multiple tables 
each with hundreds of thousands of rows, versus queries against a single 
table with millions of rows), and operational performance (number of WAL 
files created, pg_dump, vacuum, etc.).


If anybody has any tips, I'd much appreciate it.

Thanks,
David



Re: [GENERAL] When much of a DB is large objects - PG 8.4

2009-10-20 Thread David Wall



the bytea limit is 1gb (as are all datums in postgres).
pg_largeobject can go up to 2gb, but in either case you'd likely run
into severe performance/scalability issues long before objects began
approaching those sizes because of memory usage and other issues.  With
100kb objects though, you should be all right.
  


Thanks, Merlin.  Yes, I am not worried so much about the size limits of 
a given field or row (not yet anyway). 

But I am concerned that all my data across all my tables really ends up 
in a single pg_largeobject table, which seems like it could be a 
bottleneck.  Since it's blobs and generally big, I figure repeated 
access to the table doesn't really even benefit from caching that much 
like if I had a small table that was being hit a lot.  I am worried 
about the overall table size of pg_largeobject as blobs are inserted in 
my various tables, but they only get an OID stored, whereas 
pg_largeobject gets all of the data across all of those tables.


I am concerned with backups/restores, crash recovery, partitioning 
options, etc. if most of my data is stored in the single pg_largeobject 
table.  Can it be partitioned?  How many blobs can it hold before it 
won't perform well?


And is there any performance gain if I had several pg_largeobject-like 
tables that I built myself using bytea types as it does?  I mean, does 
PG perform any better if my queries are across a series of tables all 
with their own byteas rather than using LOs?



libpq supports a binary protocol mode which allows you to execute
queries sending bytea without escaping.  (I'm not familiar with the
jdbc driver, but I'd imagine it should support it in some fashion).  l
would start researching there: find out if the jdbc driver supports
binary queries and use them if possible.  If you can't or won't be use
jdbc in this way, your options are to stick with large objects or try
and figure out another way to get data into the database.
  
Thanks.  I'll see what I can learn about bytea escaping in JDBC, as I 
don't see anything obvious in the PG-specific JDBC docs.  Perhaps it no 
longer suffers from each byte being converted into escaped characters, 
which really balloons already-large data over the wire protocol, and that 
would be great.  Of course, it's all moot if there is no real performance 
gain to be had by having these objects stored across multiple tables 
rather than all being in pg_largeobject (most such data is not deleted, 
though it is often updated, so vacuumlo running daily generally isn't a 
problem for us).
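
For reference, the sort of thing we'd be testing is a minimal sketch like 
the one below, against a hypothetical bytea table (CREATE TABLE doc_blob 
(id bigint PRIMARY KEY, body bytea)), letting the driver's setBytes/getBytes 
handle the binary encoding rather than escaping it ourselves:

import java.sql.*;

public class ByteaSketch {
    // Hypothetical table: CREATE TABLE doc_blob (id bigint PRIMARY KEY, body bytea)
    public static void store(Connection conn, long id, byte[] data) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO doc_blob (id, body) VALUES (?, ?)")) {
            ps.setLong(1, id);
            ps.setBytes(2, data);            // driver handles the bytea encoding
            ps.executeUpdate();
        }
    }

    public static byte[] load(Connection conn, long id) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                "SELECT body FROM doc_blob WHERE id = ?")) {
            ps.setLong(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getBytes(1) : null;
            }
        }
    }
}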


David

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] When much of a DB is large objects - PG 8.4

2009-10-19 Thread David Wall
We have a system in which a large amount of the data is stored as large 
objects now in PG 8.4.  The relational aspects of these objects are 
stored using traditional tables, but we store a lot of binary data 
(files, images) and XML objects that are compressed and then encrypted.  
The ACID properties of PG are great for this since we can ensure 
multiple updates are committed/rolled-back together.


But, even if I have 30 tables that themselves all have OID types for the 
encrypted or binary data, most ends up in the pg_largeobject table, 
which makes partitioning and disk access complicated, and there's always 
some concern we'll end up maxing out the OIDs in that system table and 
what the performance of banging against one table invites.


So my question is there a way to load balance/partition pg_largeobject 
across disks or otherwise ensure I'm getting the best performance from 
PG?  I know I can use BYTEA (forget its max size off hand, but note that 
it's not exactly JDBC friendly because of all the escaping to be done 
moving a large byte array).  Would I do well, perhaps to clone the idea 
of pg_largeobject for those objects that are say 100KB or less (many 
will be I suspect) and store them in a sequence of BYTEA rows in my own 
tables as a way to allow the storage of these blobs across many tables 
instead of just pg_largeobject?  It probably wouldn't be as efficient as 
the large object code, but I'm sure it could be made to work.
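
For concreteness, a hypothetical sketch of what I mean by cloning that 
idea: fixed-size bytea pages keyed much like pg_largeobject's 
(loid, pageno, data), with all names made up:

import java.sql.*;

public class ChunkedBlobSketch {
    private static final int PAGE_SIZE = 8 * 1024;   // arbitrary chunk size

    // Hypothetical DDL:
    //   CREATE TABLE my_blob_page (
    //       blob_id bigint  NOT NULL,
    //       pageno  integer NOT NULL,
    //       data    bytea   NOT NULL,
    //       PRIMARY KEY (blob_id, pageno)
    //   );
    public static void write(Connection conn, long blobId, byte[] whole) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO my_blob_page (blob_id, pageno, data) VALUES (?, ?, ?)")) {
            for (int page = 0, off = 0; off < whole.length; page++, off += PAGE_SIZE) {
                int len = Math.min(PAGE_SIZE, whole.length - off);
                byte[] chunk = new byte[len];
                System.arraycopy(whole, off, chunk, 0, len);
                ps.setLong(1, blobId);
                ps.setInt(2, page);
                ps.setBytes(3, chunk);
                ps.addBatch();
            }
            ps.executeBatch();   // reassembly is just SELECT data ... ORDER BY pageno
        }
    }
}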


Thanks for any hints or ideas

David

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Handling large number of OR/IN conditions

2009-05-01 Thread David Wall
(quoted from Chris) 
Select field1,field2 FROM table1 inner join relationships on 
table1.creator_user_id = relationships.employee WHERE 
relationships.manager = ?


(quoted from Steve) 
select table1.field1, table2.field2 from table1, reports where 
table1.creator_user_id = reports.peon and reports.overlord = 'bob'


Thanks, Steve and Chris, who both suggested similar things. 

I believe that will work for me in some situations, but currently the 
report allows the manager to select any combination (from one to all) of 
users that report to him.  So one solution would be to run the report as 
you have it and then have the application filter it, but that could be 
painful if the users "de-selected" from his full team happen to have a 
large percentage of the matching rows.  Of course, I may see if this is 
something they really do much of.  I mean, perhaps if they select 
individual users (just a few), I can just use the OR/IN style, and if 
they select "all my users" I could rely on the table joins.
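
For the "just a few users" case, a sketch of what I mean by the OR/IN 
style, built with one placeholder per selected user so the statement stays 
parameterized (table and column names as in the examples above):

import java.sql.*;
import java.util.*;

public class InListSketch {
    public static List<String> field1ForCreators(Connection conn, List<String> userIds)
            throws SQLException {
        if (userIds.isEmpty()) return Collections.emptyList();   // IN () is not valid SQL
        // Build "?, ?, ..., ?" with one placeholder per selected user.
        StringBuilder in = new StringBuilder();
        for (int i = 0; i < userIds.size(); i++) {
            if (i > 0) in.append(", ");
            in.append('?');
        }
        String sql = "SELECT field1 FROM table1 WHERE creator_user_id IN (" + in + ")";
        List<String> results = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (int i = 0; i < userIds.size(); i++) {
                ps.setString(i + 1, userIds.get(i));
            }
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) results.add(rs.getString(1));
            }
        }
        return results;
    }
}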


Does anybody know if PG will perform better with the table join instead 
of evaluating the series of OR/IN?  The OR/IN has to be parsed, but the 
comparisons may be faster than the table join.


Many thanks for your help

David


[GENERAL] Handling large number of OR/IN conditions

2009-05-01 Thread David Wall
We have a database report function that seemed clean when the number of 
users was small, but as the number of users grow, I was wondering if 
anybody had any good ideas about how to handle OR or IN for SELECTs. 

The general scenario is that a manager runs reports that list all 
records that were created by users under his/her oversight.  So, when 
the number of users is small, we had simple queries like:


SELECT field1, field2 FROM table1 WHERE creator_user_id = 'U1' OR 
creator_user_id = 'U2';


But when there are thousands of users, and a manager has oversight of 
100 of them, the OR construct seems out of whack when you read the query:


WHERE creator_user_id = 'U1' OR creator_user_id = 'U2' ... OR 
creator_user_id = 'U99' OR creator_user_id = 'U100'


I know it can be shortened with IN using something like the following, but 
I don't know whether it's any more or less efficient, or a concern:


WHERE creator_user_id IN ('U1', 'U2', ..., 'U99', 'U100')

How do people tend to handle this sort of thing?  I suspect manager 
reports against their people must be pretty common.  Are there any good 
tricks on how to group users like this?  Unfortunately, group membership 
changes over time, and users may report to more than one manager and 
thus belong to more than one group, so we can't just have a 
'creator_group_id' attribute that is set and then query against that.
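
To make the grouping idea concrete, what it would apparently take is a 
membership table rather than a single creator_group_id, since a user can 
appear under several managers.  A hypothetical sketch:

import java.sql.*;
import java.util.*;

public class ManagerReportSketch {
    // Hypothetical membership table (a user may appear under several managers):
    //   CREATE TABLE manager_member (
    //       manager_user_id text NOT NULL,
    //       member_user_id  text NOT NULL,
    //       PRIMARY KEY (manager_user_id, member_user_id)
    //   );
    public static List<String[]> reportFor(Connection conn, String managerId) throws SQLException {
        List<String[]> rows = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(
                "SELECT t.field1, t.field2 " +
                "FROM table1 t " +
                "JOIN manager_member m ON m.member_user_id = t.creator_user_id " +
                "WHERE m.manager_user_id = ?")) {
            ps.setString(1, managerId);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    rows.add(new String[] { rs.getString(1), rs.getString(2) });
                }
            }
        }
        return rows;
    }
}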


Thanks,
David


Re: [GENERAL] Array, bytea and large objects

2009-02-05 Thread David Wall

Thanks, Filip.


If I have an "unlimited" number of name-value pairs that I'd like
to get easy access to for flexible reports, could I store these in
two arrays (one for name, the other for value) in a table so that
if I had 10 name-value pairs or 200 name-value pairs, I could
store these into a single row using arrays so I could retrieve all
name-value pairs in a single SELECT from the db?


yes you could, but what for?
what's wrong with many rows? create table kvstore(key text primary 
key,value text);

what stops you from using single SELECT to get all these kv pairs?


We basically do it that way now, but we were thinking we might run tests to 
see if it's faster.  When we run reports, only 2-3 of the name-value 
pairs are used in search criteria, so those we'd like to keep in such a 
table, but other fields (typically 10-30 name-value pairs) are just 
listed in the report, so we thought it might make sense to keep these in 
a single row for efficiency's sake, since we retrieve them as a group and 
don't need to sort or select based on their values.  "Single SELECT" was 
a poor word choice; we were thinking that retrieving a single row 
with 10-30 name-values stored in an ARRAY would be faster than retrieving 
10-30 rows from a joined table.
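
If we do run those tests, the sketch below is roughly what we'd try 
(hypothetical table name; it assumes the JDBC driver in use implements 
Connection.createArrayOf for text arrays):

import java.sql.*;

public class ArrayPairSketch {
    // Hypothetical table: CREATE TABLE report_extra (report_id bigint PRIMARY KEY,
    //                                                names text[], vals text[]);
    public static void save(Connection conn, long reportId,
                            String[] names, String[] vals) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO report_extra (report_id, names, vals) VALUES (?, ?, ?)")) {
            ps.setLong(1, reportId);
            ps.setArray(2, conn.createArrayOf("text", names));
            ps.setArray(3, conn.createArrayOf("text", vals));
            ps.executeUpdate();
        }
    }

    public static String[][] load(Connection conn, long reportId) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                "SELECT names, vals FROM report_extra WHERE report_id = ?")) {
            ps.setLong(1, reportId);
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) return null;
                String[] names = (String[]) rs.getArray(1).getArray();
                String[] vals  = (String[]) rs.getArray(2).getArray();
                return new String[][] { names, vals };   // one row holds all pairs
            }
        }
    }
}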




At what size does it make more sense to store in LO instead of a
BYTEA (because of all the escaping and such)?
How many BYTEA fields can I have in a database?

no limit (other than limits mentioned in the FAQ)
 


Are the BYTEA fields stored in the same table as the rest of the
data?  



yes - and the TOAST tables if it's larger than 1/3 of a page or so. 
search for TOAST details if you're interested.
Hmm...  So a page is 8192 bytes, and it leaves your regular table and 
goes to TOAST if the BYTEA is more than 2730 bytes.  I thought it only 
went to TOAST when it exceed the page size, not just one-third of its 
size.  I am sure we have lots of encrypted, compressed XML (so it's all 
binary at this point, no longer text) that would be less than that.  So 
perhaps it makes sense to use BYTEA for these smaller binary objects as 
the data is stored with the row, is simpler to deal with, easy to handle 
in memory (unlike really big LOs), and the cost of escaping each byte 
may not be too high.


I have seen a comparison 
(http://zephid.dk/2008/08/09/oid-vs-bytea-in-postgresql/) that show 
BYTEA uses more memory (up to 10x more) and is slower (about 4x slower) 
than LOs, which indicate that most of this is due to escaping the bytes.



I'd avoid LO unless you really need streaming (block-wise) access.
This is interesting only because we've done the opposite.  That is, we 
store all binary data (mostly compressed, encrypted XML name-values) in 
LOs today and it works well.  But we are concerned about the 
pg_largeobject table being a bottleneck, becoming an issue for 
vaccum/vacuumlo/pg_dump as our database grows.


We'd like to do streaming for large files being uploaded, but today 
we're not doing that and have a java.sql.Blob interface class that 
essentially reads/writes using a byte array so we're not getting any 
benefits of streaming for very large objects, though as I said, most of 
our LOs are really not that big and thus not an issue for us.  We'll see 
what it means for us to change this to better support streaming for our 
truly large objects that we store.


Since you'd avoid LOs, what are the main advantages of BYTEA (since we 
use JDBC, we can use both with equal ease as both currently work for us 
using byte arrays in our Java code)?  I'm still thinking we may find 
that based on the size of the binary data, it may be best to choose 
BYTEA for smaller (< 8192 or < 2730 bytes) data and LOs elsewhere.


Thanks,
David



[GENERAL] Array, bytea and large objects

2009-02-04 Thread David Wall
I am trying to assess the db issues surrounding several constructs 
allowed in PG 8.3, including ARRAY, BYTEA and large objects (LO).


We store a lot of data as encrypted XML structures (name-value pairs 
mostly) that can be updated many times during its lifetime (most updates 
occur over several days and then the data tends to change no more), as 
well as storing images and uploaded files (these rarely change and are 
only inserted/deleted).  We currently use LO for all of these.  We 
mostly use the JDBC library for access to PG.


First, LOs seem to allow an OID column to be added to any number of 
tables, but is it true that the actual large object data is stored in a 
single table (pg_largeobject?).  If so, wouldn't this become a 
bottleneck if LOs were used frequently?  Even vacuuming and vacuumlo 
must create a lot of pressure on that one table if LOs are used 
extensively.  And can you backup a table with an OID column and get only 
those LOs referenced in the dump?


Does the JDBC library support LO streaming?  Can I receive data, 
compress, encrypt and stream into the database as well as do the 
opposite when reading it back?
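
From what I can tell, the driver's org.postgresql.largeobject classes do 
expose a streaming API, so something along these lines should be possible.  
This is only a hedged sketch (method names follow the current driver's 
LargeObjectManager; older drivers used create() instead of createLO(); 
autocommit must be off, and the Cipher is assumed to be initialized 
elsewhere):

import java.io.*;
import java.sql.*;
import java.util.zip.GZIPOutputStream;
import javax.crypto.Cipher;
import javax.crypto.CipherOutputStream;
import org.postgresql.PGConnection;
import org.postgresql.largeobject.LargeObject;
import org.postgresql.largeobject.LargeObjectManager;

public class LoStreamSketch {
    // Returns the OID to store in the row that references this blob.
    public static long writeCompressedEncrypted(Connection conn, InputStream src, Cipher cipher)
            throws Exception {
        conn.setAutoCommit(false);                       // the LO API requires a transaction
        LargeObjectManager lom = ((PGConnection) conn).getLargeObjectAPI();
        long oid = lom.createLO(LargeObjectManager.READWRITE);
        LargeObject lo = lom.open(oid, LargeObjectManager.WRITE);
        // Data flows: source -> gzip (compress) -> cipher (encrypt) -> large object.
        OutputStream out = new GZIPOutputStream(
                new CipherOutputStream(lo.getOutputStream(), cipher));
        byte[] buf = new byte[8192];
        for (int n; (n = src.read(buf)) != -1; ) {
            out.write(buf, 0, n);                        // streams out as it is read
        }
        out.close();
        lo.close();
        conn.commit();
        return oid;
    }
}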


If I have an "unlimited" number of name-value pairs that I'd like to get 
easy access to for flexible reports, could I store these in two arrays 
(one for name, the other for value) in a table so that if I had 10 
name-value pairs or 200 name-value pairs, I could store these into a 
single row using arrays so I could retrieve all name-value pairs in a 
single SELECT from the db?  How are these arrays stored -- does it use 
an underlying type like LO or BYTEA?


How big can an LO get?  Is it 2GB?
How many LO fields can I have in a database? 
It seems that the LO may even be implemented as an OID with one or more 
BYTEA storage structure in the pg_largeobject table (loid,pageno,data).  
Is that true?


How big is a "page"?  Maybe an LO is more efficient than a BYTEA if it's 
bigger than one page?


How big can a BYTEA get?  Is it 1GB?
At what size does it make more sense to store in LO instead of a BYTEA 
(because of all the escaping and such)?
How many BYTEA fields can I have in a database? 
Are the BYTEA fields stored in the same table as the rest of the data?  
I believe this is yes, so a backup of that table will include the binary 
data, too, correct?


How big can an ARRAY get?  Is it 1GB?
How many ARRAY fields can I have in a table or database?   Are there 
limits?

Are the ARRAY fields stored in the same table as the rest of the data?

Sorry for all the questions, but I'm trying to research it but the info 
is not always clear (and perhaps some of the stuff I find is not even true).


I am wondering if when my encrypted XML data is small, should I choose 
to store it in a table using BYTEA so that each "record" in my 
application (which uses the encrypted XML name-value storage) is not 
forced to be in a single pg_largeobject table, and use LO when my data 
reaches a threshold size?  Thoughts?


Thanks,
David

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Data Replication

2008-12-10 Thread David Wall
We've done warm standby as you indicate, and we've not needed anything 
special.


On the primary's postgresql.conf we use:

archive_command = '~/postgresql/bin/copyWAL "%p" "%f"'

Our copyWAL script is just a wrapper for 'scp' since we want to copy the 
data encrypted over the network:


#!/bin/bash

WALPATH=$1
WALFILE=$2
WALCOPYLOG=~/postgresql/logs/WALcopy.log

# Log what the archiver handed us.
echo $0 - $(date) - Copy WAL received PATH: $WALPATH and FILE: $WALFILE >> $WALCOPYLOG
echo $0 - $(date) - $(ls -l $PGDATA/$WALPATH) >> $WALCOPYLOG

# Ship the segment to the standby over ssh.
scp -B -C -p "$WALPATH" [EMAIL PROTECTED]:postgresql/recoveryWALs/"$WALFILE"

RET=$?
if [ $RET -ne 0 ]; then
 echo $0 - $(date) - Copy WAL PATH: $WALPATH - failed to copy to backup system, exit code: $RET >> $WALCOPYLOG
 exit $RET
fi

# 0 exit status means successfully copied
exit 0

On the warm standby, our recovery.conf uses pg_standby, which is part of 
the contrib code:


restore_command = '~/postgresql/bin/pg_standby -l -d -t ~/postgresql/recoveryWALs/STOP_RECOVERY ~/postgresql/recoveryWALs %f %p %r 2>> ~/postgresql/logs/pg_standby.log'

We have a script that puts the STOP_RECOVERY file in the specified 
folder when we want it to go into live mode.


Hope this helps

David


Rutherdale, Will wrote:

Thanks, Joshua.

As I mentioned to Steve, warm standby / log shipping seems to be the
main feature I'm looking for.

The PITR solution you mention:  is that an improvement over regular log
shipping?  Or do I misunderstand where that fits into the system?

-Will
  


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Get PG version using JDBC?

2008-10-15 Thread David Wall

Is there a way to get the PG version string from JDBC?  I'm using PG 8.3.
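
For reference, the sort of thing I expect should work is a small sketch 
like this: DatabaseMetaData reports the server and driver versions, and 
SELECT version() returns the full banner string:

import java.sql.*;

public class PgVersionSketch {
    public static void printVersions(Connection conn) throws SQLException {
        DatabaseMetaData md = conn.getMetaData();
        System.out.println("Server: " + md.getDatabaseProductVersion());
        System.out.println("Driver: " + md.getDriverVersion());
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SELECT version()")) {
            if (rs.next()) {
                System.out.println("version(): " + rs.getString(1));   // full server banner
            }
        }
    }
}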

Thanks,
David

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Large objects oids

2008-06-10 Thread David Wall


Tom Lane wrote:

David Wall <[EMAIL PROTECTED]> writes:
  
Since large objects use OIDs, does PG 8.3 have a limit of 4 billion 
large objects across all of my various tables



Yup, and in practice you'd better have a lot less than that or assigning
a new OID might take a long time.
  


What's a rough estimate of "a lot less"?  Are we talking 2 billion, 3 
billion, 1 billion?



(actually, I presume OIDs 
are used elsewhere besides just large objects)? 



They are, but this isn't relevant to large objects.  The uniqueness
requirement is only per-catalog.
  
Isn't there just one catalog per postmaster instance (pg_catalog)?  The 
issue we have is that one postmaster runs a large number of databases 
(let's say 100 for easy calculations), so even with the max 4 billion 
potential OIDs, that would drop each DB to 40 million each.


Part of this is just architectural to us.  We do heavy 
encryption/compression of data (in particular digitally signed XML text) 
and use large objects to store these, but we may need to change these to 
use bytea since they wouldn't use up OIDs and the actual data size tends 
not to be too large (perhaps 10KB compressed and encrypted binary data) 
and can be done in a block.  All that character escaping of binary data, 
though, makes the JDBC-to-postmaster interface a tad ugly.



  
Is there any plan on allowing large objects to support more than 2GB?  



No, it's not on the radar screen really.
  


Too bad, but again, we can always work around it, even if it means a layer 
that bundles large objects sort of like large objects bundle bytea.  We 
prefer not to store it outside the database as the large files can get 
out of sync from the database (ACID properties) and of course need to be 
backed up separately from the database backups and WAL copying for 
replication.


David


[GENERAL] Large objects oids

2008-06-10 Thread David Wall
Since large objects use OIDs, does PG 8.3 have a limit of 4 billion 
large objects across all of my various tables (actually, I presume OIDs 
are used elsewhere besides just large objects)? 

Is there any plan on allowing large objects to support more than 2GB?  
As data gets larger and larger, I can see this being a problem if you'd 
like the ACID properties of a DB to work with video and other large data 
sets.


Since all large objects are stored in the pg_catalog.pg_largeobject 
table, will running 'vacuum full' on it reduce the FSM issues I often 
see with errors like:


WARNING:  relation "pg_catalog.pg_largeobject" contains more than 
"max_fsm_pages" pages with useful free space
HINT:  Consider using VACUUM FULL on this relation or increasing the 
configuration parameter "max_fsm_pages".
NOTICE:  number of page slots needed (1045968) exceeds max_fsm_pages 
(30)


Thanks,
David


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Open Source CRM - Options?

2008-05-28 Thread David Wall

What about SugarCRM?

David

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_standby / WAL archive-restore through system restarts

2008-05-13 Thread David Wall


Thanks, Guillaume.  Can anyone confirm his answer to the question below 
about restarting the backup postmaster? 



IIRC, it works conveniently in 8.2 and up.  Just do pg_ctl stop -m fast.
  


Thanks, Tom.  What about if the postmaster is just killed (kill pid, or 
kill -9 pid) or the server/OS crashes?  Will PG 8.3 in recovery mode be 
able to come back up okay and resume recovery?


David


Re: [GENERAL] pg_standby / WAL archive-restore through system restarts

2008-05-13 Thread David Wall
Thanks, Guillaume.  Can anyone confirm his answer to the question below 
about restarting the backup postmaster? 

It seems counter-intuitive: since the backup server is only waiting for 
new WAL files to restore, it would seem it could be killed and restarted 
at any time and should just resume recovery as it has been all along.  I can 
see that this would not be the case if I end recovery mode, but can't the 
recovery postmaster be killed without having it leave recovery mode?



Can I shutdown Server B (backup/recovery postmaster) simply by 
killing the postmaster and restart it back in recovery mode to 
continue re-syncing where it left off?  Or does stopping Server B 
while in recovery mode require any manual re-sync steps before it can 
resume recovery?


I think you'll need to redo the whole process : restore the full data 
backup, create the recovery.conf file, etc.




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] pg_standby / WAL archive-restore through system restarts

2008-05-12 Thread David Wall
Just wanted to see if others would confirm my impressions about running 
WAL archiving and pg_standby restore.


Server A (Primary): Runs PG 8.3 with WAL archiving enabled.  Each WAL is 
copied over the network to Server B.  (A previous 'tar backup' of the 
database along with the requisite psql commands 
pg_start_backup()/pg_stop_backup() was done and restored on Server B.)


Server B (Backup): Runs PG 8.3 in 'recovery mode' (recovery.conf), with 
pg_standby feeding the WAL archives.


The idea is that if Server A dies, Server B will have a relatively close 
snapshot of the most recent database changes and could take over as the 
new primary database server.


During regular operations, Server A and Server B may independently need 
to be rebooted or postgresql restarted, etc.  Is there any sync-ing 
between Server A and Server B I have to worry about when doing this? 

That is, will Server B need to know anything about the postmaster on 
Server A going up and down?  Will it just run normally, 
waiting for the restarted Server A to send over new WAL archives?


Can I shutdown Server B simply by killing the postmaster and restart it 
back in recovery mode to continue re-syncing where it left off?  Or does 
stopping Server B in recovery mode require any manual re-sync steps 
before it can resume recovery?


If I ever tell Server B to stop recovery (assuming I didn't intend for 
it to become the new primary), I cannot just resume recovery mode unless 
I do the full data backup again from Server A, restore it on Server B, and 
start in recovery mode.  That is, once you tell PG to end recovery, it 
becomes a primary and can no longer resume recovery mode without redoing 
the recovery setup steps.


Are these correct?

Thanks,
David

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] JDBC addBatch more efficient?

2008-04-21 Thread David Wall


Does this means that the two features are independent one from each 
other ?
In other words, can we say that JDBC batch will limit information 
exchange between client and server while Postgres prepared statements 
will optimize their execution ?
I've not used it yet, but my impression is that you use a 
PreparedStatement (typically in a loop), usually wrapped in a JDBC 
transaction so each statement doesn't run in its own 
transaction, call addBatch for each set of parameters, and then submit 
the batch so all of the statements are executed by the server as a 
group.  So in this case, yes, they should be independent of each other.


David

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] JDBC addBatch more efficient?

2008-04-21 Thread David Wall


The JDBC driver's batch processing is more efficient than regular 
execution because it requires fewer network roundtrips so there's less 
waiting.  The JDBC batch is broken into an internal batch size of 256 
statements and all of these are sent over to the server at once.


That's great, Kris.  I  believe our code only does 100 updates/deletes 
before committing the transaction anyway, so this should work well for us.


Thanks,
David

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] JDBC addBatch more efficient?

2008-04-20 Thread David Wall
Just checking if the JDBC library's batch processing code is more 
efficient with respect to the postgresql back end or not.  Does it 
really batch the requests and submit them once over the link, or does it 
just send them to the database to be processed one at a time?


Thanks,
David

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] postgre vs MySQL

2008-03-14 Thread David Wall



I imagine you can get round the second one by building your software
so it supports PostgreSQL as well - that way you don't 'require
customes to install MySQL'.
  
Well, I'm not sure how they'd even know you were doing this, but as a 
commercial company, I'd suggest you not follow that advice since the 
code would not work without installing MySQL.  Yes, they could install PG 
instead, and if they did, MySQL would have no problem.  But if you use 
MySQL, then clearly it's required and a commercial license would be 
needed (though perhaps at least you'd put the legal obligation on the 
end customer).  Of course, all of this is based on reading their high 
level stuff, not the actual legal document, which may be tighter or looser.


The fact that there's so much confusion, and so many instances in which 
commercial licenses would be required, leads me to say they are only open 
source in self-branding, not in reality.


David

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] postgre vs MySQL

2008-03-14 Thread David Wall


My understanding is that's not quite true. The client libraries are 
GPL, so you can't use them directly, but I don't see what would stop 
you using their ODBC/JDBC drivers with your non-GPL application 
(especially if you support other ODBC databases as well). The server 
can't be bundled in your application, but you can still get the user 
to install it and use it with your application.
According to the MySQL license info ( 
http://www.mysql.com/about/legal/licensing/commercial-license.html ):


When your application is not licensed under either the GPL-compatible 
Free Software License as defined by the Free Software Foundation or 
approved by OSI, and you intend to or you may distribute MySQL software, 
you must first obtain a commercial license to the MySQL product.


Typical examples of MySQL distribution include:

  * Selling software that includes MySQL to customers who install the
    software on their own machines.

  * Selling software that requires customers to install MySQL themselves
    on their own machines.

  * Building a hardware system that includes MySQL and selling that
    hardware system to customers for installation at their own locations.


It sure sounds like if your application uses MySQL and you sell your 
software (I presume this would include online services that charge for 
use of the site and that site runs MySQL under the hood), you have to 
buy a commercial license, and you can't get around it just by not 
directly distributing MySQL and having your customer install it separately.


Way off topic for PG, though, which has a great OSS license in BSD.

David






Re: [GENERAL] postgre vs MySQL

2008-03-13 Thread David Wall



What then?  Could it be marketing or the sad results of a avalanche
effect? Geee, there's a thought.
  
What a wide variety of topics.  One big difference for me is that MySQL 
used to be open source, but it no longer is.  It's an odd hybrid of OSS 
that barely makes sense to me: they claim to be open source under 
the GPL, and while you can contribute code to them (I did so in their 
JDBC driver many years ago before switching to PostgreSQL), they then 
own the code (fine!); but if you want to use it in any system that's not 
itself open source, you have to pay for a license.  Pay for GPL software?


But they proudly state they are part of LAMP, yet only the "M" charges 
to use their software.  The real leaders in these open source camps are 
Linux and Apache, neither of which have such absurd pseudo-open 
licensing terms.


David

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] postgre vs MySQL

2008-03-12 Thread David Wall



Well, if you have a crappy system that cannot sustain concurrent load or
even be backed up concurrently with regular operation, one solution is
to write a kick-ass replication system.
  
Still, it would be nice to have a kick-ass replication system for PG, 
too.  We've been toying with WAL archiving and backup db recovery, which 
works pretty well it seems as it appears to support all of our tables 
(not just those with an explicit primary key) and does the DDL stuff for 
creating/altering/dropping tables, columns, etc. 

The downside is that the backup is not operational, so you can't even 
run a SELECT against it, and because it's asynchronous in nature, there's 
always a window of data loss for transactions written to the WAL that 
haven't been archived yet.


David


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] standby questions

2008-02-08 Thread David Wall


That's correct.  You have to do it that way or the system in recovery 
mode can start to consume the new segment file before it has been 
completely copied over yet.


Does pg_standby take care of this by checking file sizes or the like?  
In my testing with scp, we never experienced any problems, but I wonder 
if we were somehow "just lucky."


David


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] standby questions

2008-02-08 Thread David Wall



2) if archive_command is activated and working, the primary
server sends (preferably with rsync or some other "atomic tool") the
NEW WAL files to the standby server. Later, at some point, the primary
server will delete this files when considers that are not necessary
anymore.
  
Is 'scp' by itself considered an "atomic tool" for copying files to the 
standby server?  Does "atomic" mean that the program should copy a file 
over using a temp file name and then rename it at the end, or does it mean 
something else?
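
If "atomic" does mean copy-to-a-temp-name-then-rename, the idea would be 
something like this hedged sketch (Java NIO just to illustrate; paths and 
names are hypothetical; rsync, or scp to a temp name followed by mv, 
accomplishes the same thing):

import java.io.IOException;
import java.nio.file.*;

public class AtomicCopySketch {
    // Copy src into destDir under a temp name, then atomically rename it to finalName.
    public static void atomicCopy(Path src, Path destDir, String finalName) throws IOException {
        Path tmp = destDir.resolve(finalName + ".tmp");
        Files.copy(src, tmp, StandardCopyOption.REPLACE_EXISTING);
        // The rename is the atomic step: a reader sees either nothing or the whole file.
        Files.move(tmp, destDir.resolve(finalName), StandardCopyOption.ATOMIC_MOVE);
    }
}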


David


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] Question for Postgres 8.3

2008-02-04 Thread David Wall



If you want to support multiple encodings, the only safe locale choice
is (and always has been) C.  If you doubt this, troll the archives for
awhile --- for example, searching for locale+encoding in pgsql-bugs
should provide plenty of amusing reading matter.  
This is most interesting.  I think some of us use UTF-8 under the 
impression that it would support unicode/java (and thus US-ascii) 
easily, and of course then allow for foreign language encodings when we 
need to internationalize.  Thank goodness we've only been planning to I18N 
for a decade and never got around to it!


David

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[GENERAL] Autovacuum and stats_row_level

2008-01-28 Thread David Wall
Noted that to use autovacuum we need to turn on stats_row_level (along 
with stats_start_collector that is on by default).  Since 
stats_row_level is off by default, I wonder what sort of overhead is 
incurred since it sounds like it could add up if it's storing additional 
stats information on every row update, though it's not clear to me just 
what it means to have row level stats.


We currently use cron to run vacuum and analyze daily in the early 
morning.  It's  never been an issue, but the databases are getting 
bigger and we wonder if autovacuum is a good addition to our operations.


Are the default values for autovacuum generally good enough for most 
deployments?  The various parameters are a bit complicated to 
understand, so tweaking them is something we're naturally concerned 
about doing.


Will autovacuum running allow regular vacuum and analyze commands to run 
faster?  Can it replace them entirely, or do we still need to run them 
from time to time?


Can autovacuum be configured to run in a backup server that is in 
"recovery" mode handling pg_standby WAL file updates?


Thanks,
David

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Postgresql + digital signature

2008-01-23 Thread David Wall


We're in a stage where I need to implement a mechanism to prevent the 
data modification.


I'm thinking on 'Digital Signatures' (maybe RSA) in each row. If 
there's a modification, the signature doesn't verify.
Like all such solutions, the key (lame pun intended) is how do you 
manage the keys?  Obviously, when the digitally signed data is inserted, 
the private key must be accessible.  If you then do an update and still 
have access to the keys, the modified data can simply be re-signed.


Is there no way for your application to ensure that once data is 
inserted, it cannot be changed? 

You can also grant database access with just SELECT,INSERT permissions 
so that UPDATE and DELETE are not allowed.
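
For example (role and table names are hypothetical), something along 
these lines keeps the table append-only at the database level:

import java.sql.*;

public class AppendOnlyGrantSketch {
    // Run once as a privileged user; 'esign_app' and 'signed_doc' are made-up names.
    public static void restrict(Connection adminConn) throws SQLException {
        try (Statement st = adminConn.createStatement()) {
            st.execute("REVOKE ALL ON signed_doc FROM esign_app");
            st.execute("GRANT SELECT, INSERT ON signed_doc TO esign_app");  // no UPDATE/DELETE
        }
    }
}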


We store lots of digitally signed data as BLOBs in PG, but control this 
at the application level since it's the one that has access to the 
private key, and our application has no UPDATE/DELETE calls.


Good luck,
David

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] Stupid question about WAL archiving

2008-01-18 Thread David Wall


Which is exactly why I pointed out that using pg_standby's -k switch 
was the more reliable option.


And supposedly even that switch is not needed once we can get to 8.3, 
which should be soon.  Even the -k switch can be an issue since you 
don't really know how many you should keep around.


David

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] pg_dumping large objects

2007-09-24 Thread David Wall
Large objects are defined in the DDL as type 'OID'.  Those with various 
drivers may not know the lo_* calls are being used, since in Java this 
takes place if you use the SQL BLOB type in JDBC and the field itself is 
defined as 'OID'.


David

Alvaro Herrera wrote:

Morris Goldstein escribió:
  

pg_dump --help says:

-b, --blobs include large objects in dump

What is the definition of "large object"?



The kind you create with lo_open(), etc.  If you don't do that, you
don't need it.

  


Re: [GENERAL] OFFSET and LIMIT - performance

2007-06-28 Thread David Wall



Network transmission costs alone would make the second way a loser.

Large OFFSETs are pretty inefficient because the backend generates and
discards the rows internally ... but at least it never converts them to
external form or ships them to the client.  Rows beyond the LIMIT are
not generated at all.
  
Some of this would depend on the query, too, I suspect, since an ORDER 
BY would require the entire result set to be determined, sorted and then 
the limit/offset could take place.  Regardless, it's better than 
filtering in the Java/client side to avoid sending it from the database 
backend to the client.


But how would that compare to using a cursor/fetch query?  It seems like 
the JDBC library will automatically use a cursor if you specify some 
params on the PreparedStatement, though the details escape me.  I think 
it's related to setFetchSize() and/or setMaxRows().  Of course, those 
are not guaranteed to do anything special either, and you'd still need 
to retrieve and discard initial rows unless you can adjust your WHERE 
condition to find the "next set".
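
For what it's worth, my reading of the driver documentation is that the 
cursor behavior only kicks in when autocommit is off, the statement is 
forward-only, and a non-zero fetch size is set; a hedged sketch:

import java.sql.*;

public class FetchSizeSketch {
    public static void scan(Connection conn) throws SQLException {
        conn.setAutoCommit(false);                     // cursor-based fetch requires a transaction
        try (Statement st = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            st.setFetchSize(1000);                     // rows pulled from the cursor per round trip
            try (ResultSet rs = st.executeQuery("SELECT field1 FROM table1 ORDER BY field1")) {
                while (rs.next()) {
                    // process rs.getString(1) without holding the whole result set in memory
                }
            }
        }
        conn.commit();
    }
}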

If you have an ORDER BY on a unique field, for example, you could use 
that field to query the next set by remembering the last value in your 
previous query set (or select 1 more row than you need so you have the 
exact value that would be next) and specifying it in the WHERE clause.  
Even this could be an issue if updates would change the grouping.
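
A quick sketch of that "remember the last value" idea on a unique, ordered 
column (names are hypothetical), which avoids generating and discarding 
the skipped rows:

import java.sql.*;

public class KeysetPageSketch {
    // Fetch the next page of rows after lastSeenId; pass 0 (or Long.MIN_VALUE) for the first page.
    public static long fetchPage(Connection conn, long lastSeenId, int pageSize) throws SQLException {
        long newLast = lastSeenId;
        try (PreparedStatement ps = conn.prepareStatement(
                "SELECT id, field1 FROM table1 WHERE id > ? ORDER BY id LIMIT ?")) {
            ps.setLong(1, lastSeenId);
            ps.setInt(2, pageSize);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    newLast = rs.getLong(1);       // remember for the next page's WHERE clause
                    // ... render rs.getString(2) ...
                }
            }
        }
        return newLast;                            // caller passes this back in for the next page
    }
}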


LIMIT/OFFSET are not part of the SQL standard, should that matter 
for DB portability.  I believe MySQL supports it, but it seems like 
Oracle didn't (at least at one time).


David


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] VACUUM ANALYZE extremely slow

2007-06-18 Thread David Wall

On Jun 17, 2007, at 2:15 PM, Sergei Shelukhin wrote:

This is my first (and, by the love of the God, last) project w/pgsql
and everything but the simplest selects is so slow I want to cry.
This is especially bad with vacuum analyze - it takes several hours
for a database of mere 15 Gb on a fast double-core server w/2Gb of RAM
and virtually no workload at the moment. Maintenance work mem is set
to 512 Mb.
I have a  23 GB PG 8.1 db running on a 3 year old dual processor Dell 
running Red Hat Linux, 2GB RAM.  My tests ran on our production database 
while it was running (albeit not under a heavy load) with no obvious 
slowdown for users using our application.  PG was using 1.2% of the CPU, 
and I noted the server had been running for 322 days.  I'm not sure if 
there's a way to make vacuum use more processor time to speed it up or not.


"ANALYZE" took 1 minute 14 seconds.

"VACUUM" took significantly longer at 24 minutes 22 seconds.

I noted that we hadn't set maintenance_work_mem so it presumably is 
using the default of 16384 (16MB).


I'm sure much depends on how many tables, how many deleted rows, etc. 
are in your db, but "several hours" certainly is nothing like what we see.


David


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Replication for PG 8 recommendations

2007-05-10 Thread David Wall


Hannes Dorbath wrote:


Replicate the whole block device, PostgreSQL sits on.
For Linux, are you talking about something like DRBD?  That would be 
nice in that it would also replicate the web app itself.


David

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [GENERAL] Replication for PG 8 recommendations

2007-05-09 Thread David Wall



On Wed, 2007-05-09 at 14:40 -0700, David Wall wrote:
  
Is there a "preferred" replication system for PG 8 db users?  Obviously, 
we're looking for robustness, ease of operations/installation, low 
latency and efficient with system and network resources, with an active 
open source community being preferred.


Jeff Davis wrote:
http://www.postgresql.org/docs/8.2/static/high-availability.html
  
Thanks.  I've seen the options and was hoping for grunt-level 
realities.  Many projects seem to have fallen by the wayside over time. 

My first impression was towards a Slony-I type solution, but I need 
large objects and would prefer schema updates to be automatic.  I was 
hoping to hear back on any pitfalls or preferences or "how I'd do it if 
I could do it again" type stories.  We mostly need it for disaster 
recovery since we're looking to improve upon our current nightly 
backup/syncs in which we pg_dump the database, SCP it to the backup, 
then pg_restore on the backup.  It's possible WAL copying will do it, 
too, but don't know if people find this workable or not.


David


[GENERAL] Replication for PG 8 recommendations

2007-05-09 Thread David Wall
Is there a "preferred" replication system for PG 8 db users?  Obviously, 
we're looking for robustness, ease of operations/installation, low 
latency and efficient with system and network resources, with an active 
open source community being preferred.


Thanks,
David

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] reindexdb program error under PG 8.1.3

2006-05-25 Thread David Wall


It really isn't neccessary. That said, there is *something* going on with your 
db, so it might be prudent to figure out what it is. 
  
Thanks, Robert.  Actually, I noted that each time I ran my backup 
script, the OID mentioned in the error was bigger than the previous number.


Before I run the reindexdb program, I am running the vacuumlo (from 
contrib).  Just by putting a sleep 5 between the two commands, I cannot 
reproduce the error.  If I remove the sleep, then the error happens 
again.  Clearly, there's something "left over" from vacuumlo that 
reindexdb is finding.


Any thoughts on that observation?

Thanks,
David

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] reindexdb program error under PG 8.1.3

2006-05-25 Thread David Wall




Rats!  I found nothing.  But just for grins, I ran the reindexdb
program a second time, and this time it reported a different OID,
higher than the previous.  I know we didn't change the template1
database other than using it when creating new databases, creating
users, tablespaces, etc.

My guess is this is somehow related to the reindexdb program.  Do you
know if there's any advantage to using the reindexdb program versus
running the psql commands REINDEX DATABASE and REINDEX SYSTEM?  

When reading the docs, I'm beginning to wonder if it's even necessary
to do the reindexing.  I think it may be a holdover from earlier
thinking that doesn't apply anymore.  I believe the idea is that as
rows are inserted, updated and deleted, the index can get rather
scrambled, and that reindexing often sorts the values and rewrites that
data so that the indexes take less space and work faster.  Is that not
the case here?

Thanks,
David



Tom Lane wrote:

  David Wall <[EMAIL PROTECTED]> writes:
  
  
Thanks, but I clearly lack that expertise.  How do I find "a table with 
that OID"?

  
  
Try "select * from pg_class where oid = 41675" and
"select * from pg_attribute where attrelid = 41675"
to see if you find any rows.

			regards, tom lane

  





Re: [GENERAL] reindexdb program error under PG 8.1.3

2006-05-25 Thread David Wall





(repost to newsgroup since the other went directly to Mr. Lane's email
address)
I think you've done something to template1, even though you say you
haven't.  Out-of-the-box, an 8.1.3 database won't have any OIDs assigned
above about 11000, so OID 41675 would have to belong to a user-created
object.  Look in template1 and see if you see a table with that OID.

Thanks, but I clearly lack that expertise.  How do I find "a table with
that OID"?  The \dl lists large objects, but there's no OID with that
number in my regular db, and template1 shows no OIDs at all using that
command.  I also checked the database area with all the files that have
such numbers, and there's no entry with the matching number either,
though there are some that are close. 
Is it possible that the 'reindexdb' program (not the command within
psql, but the standalone program) creates some temp tables that might
reflect this oid?


Thanks,

David







[GENERAL] reindexdb program error under PG 8.1.3

2006-05-25 Thread David Wall
When I create a brand new database and then run the reindexdb program 
under PG 8.1.3, I get the follow error:


reindexdb: reindexing of database "" failed: ERROR:  could not open 
relation with OID 41675


I'm not sure what this error is since we have not even added any data 
yet.  Is this something that's misconfigured in the template1 database 
(we have not customized it), something we're doing wrong, or is this 
"normal"?


Note that prior to this error, we get many NOTICE messages about each 
table being reindexed just as we'd like.


Thanks,
David


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] Is difference between PostgreSQL and mySQL licences

2006-04-09 Thread David Wall



That doesn't sound consistent with the arrangements that MySQL AB
expect for commercial users of their products.

They indicate assortedly that:

- If you are developing and distributing open source applications
  under the GPL, or some OSI-approved license, you are free to use
  MySQL(tm) "for free."

- If you are *not* licensing and distributing your source code under
  the GPL, then MySQL AB expects you to use their "OEM Commercial 
  License," which involves negotiating a contract with their 
  sales team.


Your scenario seems to clearly fall into the scenario where MySQL AB
expects you to pay them license fees.

If you don't include MySQL(tm) with your product, then that presumably
imposes the obligation to pay MySQL AB a license fee on the purchasor
that does the download.  That may leave your hands clean, but if you
do not warn your customers of their obligation, and legal problems
arise, they may not be too happy with you...
  


The real key is that MySQL has left the world of traditional open source 
and has instead taken on a commercial business interest in their 
"product."  This is disturbing considering that many people have 
contributed code to their projects, including database drivers and other 
client code that may be part of your product even if they have to 
download the db engine separately.  Now, those same people who 
contributed may want to use MySQL, but unless they are building their 
own GPL system, their use of MySQL with their freely contributed code 
inside is no longer available without paying a fee.


My guess is that many (most?) MySQL implementations today violate the 
provisions of their license.  This is because many applications are 
written for in-house consumption, simple web sites, etc. and those 
applications are not GPL.  It's mostly commercial apps that end up 
getting the license because businesses fear lawsuits more than 
individuals do (something like music "piracy" in which end users copying 
a CD to their disk drives are much less likely to be sued than if you 
produce products that incorporate the music inside).


PostgreSQL is the way to go if you want a truly open source solution for 
your needs, whether commercial, private or open source.


David

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[GENERAL] reindexdb script required in 8.1?

2006-03-07 Thread David Wall
I've upgraded from 8.0 to 8.1 and want to be sure I'm making the changes 
that are required.


For my backups, I have removed the option --blobs because it says this 
is no longer needed.  I guess the backup automatically includes blobs now?


I have also been running the contributed 'reindexdb' script by Shaun 
Thomas in my backups.  Is that still necessary?  It does not appear to 
be part of the contrib area anymore.


Also, my backup scripts still run vacuumlo.  I rebuilt that in 8.1 and 
am using that.  Is that still required or is vacuumlo part of the new 
8.1 vacuum capabilities?


Lastly, do I still need to run vacuum analyze from time to time update 
my stats, or is that done automatically?  I read about how some vacuums 
are automatic, but it's not clear if this auto-vacuum is activated by 
default or not and I'm not sure how I specify that I want this to occur.


Thanks,
David


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] VACUUM is hanging - jdbc?

2001-05-07 Thread David Wall

> Try not to leave open transactions hanging around in your Apache code.
> An idle database connection couldn't block VACUUM like that; it'd have
> to have been in the middle of a BEGIN block, or maybe even an unfinished
> query.

It's possible that the JDBC library always creates a new transaction after a
commit/rollback. It wouldn't have any locks or the like, though.

David


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [GENERAL] Re: Postgresql table reloading

2001-05-05 Thread David Wall

> In the table, you're storing the OID of the large object itself, so if you
> copy all the contents of the table, so far as I can see, yes, you'd still
> have the reference to your large object.

Thanks.  I did see this in a quick test, but I wasn't sure if the data would
be removed when I dropped the original table (does the DB keep a reference
count for rows pointing to the blob?).  Yes, I'll give it a test first!

Thanks,
David


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [GENERAL] VACUUM is hanging

2001-05-04 Thread David Wall

> Try not to leave open transactions hanging around in your Apache code.
> An idle database connection couldn't block VACUUM like that; it'd have
> to have been in the middle of a BEGIN block, or maybe even an unfinished
> query.

It's possible that the JDBC library always creates a new transaction after a
commit/rollback. It wouldn't have any locks or the like, though.

David



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [GENERAL] Re: Getting milliseconds out of TIMESTAMP

2001-04-22 Thread David Wall

> "David Wall" <[EMAIL PROTECTED]> writes:
> > The real question for me is that 7.1 docs say that the resolution of a
> > timestamp is 8 bytes at "1 microsecond / 14 digits", yet I generally see
> > YYYY-MM-DD HH-MM-SS.cc returned in my queries (both with pgsql and with
> > JDBC).
>
> That's just a matter of the default display format not being what you
> want.  The underlying representation is double precision seconds from
> (IIRC) 1/1/2000, so accuracy is 1 microsec or better for ~70 years
> either way from that date, decreasing as you move further out.

That makes sense, but it wasn't clear if there was a clean way to get the
extra info.  Your example below looks interesting, but I'll need to look
further to see if I understand what you are really doing.  The real key for
me will be to come up with a generic mechanism that is easy to plug into
JDBC.

> One way to get the fractional seconds with better precision is
> date_part.  For example,
>
> regression=# create table ts (f1 timestamp);
> CREATE
> regression=# insert into ts values(now());
> INSERT 144944 1
> regression=# insert into ts values(now()  + interval '.0001 sec');
> INSERT 144945 1
> regression=# insert into ts values(now()  + interval '.01 sec');
> INSERT 144946 1
> regression=# insert into ts values(now()  + interval '.001 sec');
> INSERT 144947 1
> regression=# select f1, date_part('epoch', f1), date_part('microseconds',
f1) from ts;
> f1 |date_part | date_part
> ---+--+---
>  2001-04-22 16:04:31-04|987969871 | 0
>  2001-04-22 16:04:39.00-04 |   987969879.0001 |   100.332
>  2001-04-22 16:04:45.00-04 | 987969885.01 | 0.7475243
>  2001-04-22 16:04:51-04|987969891 | 0
> (4 rows)
>
> Not sure why the last example drops out completely --- looks like
> something is rounding off sooner than it needs to.  But certainly there
> are six fractional digits available at the moment.

This solution appears to show that the data is there, but that extracting it
is not that pleasant, especially in an automated way from JDBC.  It seems
like we'd have to modify select calls such that whenever a timestamp field
is being used, we'd get the timestamp, but also get the 'date_part(epoch)',
then use the digits after the decimal point in the epoch value to replace
the fractional digits in the timestamp.

If I have to change my code to make it work, then I'd prefer to simply store
the 64-bit long integer in the database and save all of the date conversions
and parsing.  But when it's a TIMESTAMP, the data looks a lot better when
using psql, and it makes queries by date ranges usable by mortals!

Thanks,
David



---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://www.postgresql.org/search.mpl



Re: [GENERAL] Re: Getting milliseconds out of TIMESTAMP

2001-04-22 Thread David Wall

> Just curious, but what is the point of having times 'acurate' to the
> milisecond in a database?

In my case it's a simple matter that I include a timestamp in a digital
signature, and since my timestamp comes from Java (and most Unixes are the
same), it has millisecond resolution "built in."  The problem is that when I
retrieve the information about that digital signature, it was failing
because the database only went to centiseconds.  I've "fixed" my code by
reducing my timestamp resolution.

As another point, computers are incredibly fast these days, and doing more
than 100 logging operations in a second is commonplace.  If you log records
to the database and do more than 100/second, then you cannot use the
TIMESTAMP as an indicator of the order that messages were emitted, since rows
logged within the same centisecond will carry identical timestamps.

Of course, the question is easy to turn around.  Why not just have
timestamps accurate to the second?  Or perhaps to the minute since many
(most?) computer clocks are not that accurate anyway?

The real question for me is that 7.1 docs say that the resolution of a
timestamp is 8 bytes at "1 microsecond / 14 digits", yet I generally see
YYYY-MM-DD HH-MM-SS.cc returned in my queries (both with pgsql and with
JDBC).  This is unusual since an 8 byte integer is 2^63, which is far more
than 14 digits, and the "ascii digits" of that preceding format is already
16 digits.

David


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



[GENERAL] 7.1 dumps with large objects

2001-04-14 Thread David Wall

Wonderful job on getting 7.1 released.  I've just installed it in place of a
7.1beta4 database, with the great advantage of not even having to migrate
the database.

It seems that 7.1 is able to handle large objects in its dump/restore
natively now and no longer requires the use of the contrib program to dump
them.  Large objects are represented by OIDs in the table schema, and I'm
trying to make sure that I understand the process correctly from what I've
read in the admin guide and comand reference guide.

In my case, the OID does not mean anything to my programs, and they are not
used as keys.  So I presume that I don't really care about preserving OIDs.
Does this just mean that if I restore a blob, it will get a new OID, but
otherwise everything will be okay?

This is my plan of attack:

To backup my database (I have several databases running in a single
postgresql server, and I'd like to be able to back them up separately since
they could move from one machine to another as the loads increase), I'll be
using:

pg_dump -b -Fc dbname > dbname.dump

Then, to restore, I'd use:

pg_restore -d dbname dbname.dump

Is that going to work for me?

I also noted that pg_dump has a -Z level specifier for compression.  When
not specified, the backup showed a compression level of "-1" (using
pg_restore -l).  Is that the highest compression level, or does that mean it
was disabled?  I did note that the -Fc option created a file that was larger
than a plain file, and not anywhere near as small as if I gzip'ed the
output.  In my case, it's a very small test database, so I don't know if
that's the reason, or whether -Fc by itself doesn't really compress unless
the -Z option is used.

And for -Z, is 0 or 9 the highest level compression?  Is there a particular
value that's generally considered the best tradeoff in terms of speed versus
space?

Thanks,
David


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



[GENERAL] LOs and pg_dump, restore, vacuum for 7.1

2001-04-10 Thread David Wall

Does 7.1 "natively" handle large objects in dumps, restores and vaccums?  In
7.0.3 there was a contrib for LOs.

Thanks,
David


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [GENERAL] Data type for storing images?

2001-03-13 Thread David Wall

> I am not sure what your exact purpose is, but in doing a lot of web
> work with pgsql, I usually just store the full path to the image in a
> varchar field and fetch that right into the IMG SRC attribute when I
> need to display the image.

Why would a full path in an IMG tag work?  Doesn't your web server use
relative paths based on its document root? If not, you have a serious
security hole allowing people to access any file on your system. 

> I also have modified a simple web based image uploading script so upon
> uploading an image it detects the images width and height and inserts
> that into the DB as well.  The script is written in PHP, let me know
> if it would be helpful to send it to you.

I'd be interested in this script.  You'll probably need to send it either
"in place" thru the list, or as an attachment directly to me.

Thanks,
David




---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])



Re: [GENERAL] Re: Data type for storing images?

2001-03-13 Thread David Wall

> More to the point, what would be a reason for not using bytea to
> replace BLOBs completely?

Does anybody know if bytea can be accessed by JDBC yet?  I couldn't figure
out how to make it work (setBytes/getBytes) didn't work; only the OID sql
type was working for me with the byte routines.

David


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])



Re: [GENERAL] Data type for storing images?

2001-03-11 Thread David Wall

> bytea is probably your best bet.  The 8k limit is toast in 7.1, btw.

If you're using JDBC, you may not be able to get bytea types to work.  I was
able to get the SQL type OID to work with JDBC's get/setBytes() methods.
You may also want to use 7.1 since I believe it's large object support is
improved.

David


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])



Re: [GENERAL] upper() vs. lower()

2001-03-05 Thread David Wall

I know that if you move towards unicode, you'll probably do better by going
to lowercase since I understand that there are more lowercase letters than
uppercase (for some odd reason!).  I think it's related to the fact that
some languages have multiple lowercase letters that will map to a single
uppercase letter, so there's greater "distinct-ness" with lowercase.

David

> Is there any reason why everyone suggests to use upper() to do a case
> insensitive search rather than lower()?
>
> Will initcap() capitalize each word in a column, or just the first word?



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [GENERAL] how do you run your backups?

2001-01-26 Thread David Wall

> Is there a way to start with yesterday's dump and load each transaction
> log dump in order until you have a complete restore of the database?

Does recovery of pg_dump and the transaction logs work with a backup and
then doing a rollforward on the transaction logs?  What would be the
procedure?  It doesn't seem as though there is any real connection between a
pg_dump and the backup of transaction logs.

Also, does a pg_dump guarantee a consistent view of all of the tables in a
database at a given snapshot in time, or does it simply dump the contents of
each table as they are encountered while dumping?

And what format of pg_dump is required to also dump the large objects so
that they can be automatically loaded back on failure recovery (or just
copying the database to another computer)?

David




Re: [GENERAL] MySQL has transactions

2001-01-24 Thread David Wall

> > Something I'm curious about: does MySQL support user-defined datatypes?
>
> No, it does not.
>
> > If so, how hard is it to make a new datatype?  I think datatype
> > extensibility is one of the cooler features of Postgres ...

Making new datatypes sounds good, but in practice how easy is it to do?  I
don't see old SQL92 types like BLOB being supported by the database, and
large objects appear quite broken in their implementation, so adding new
types can't be all that easy to do.

David





[GENERAL] 7.1 expected features list?

2001-01-24 Thread David Wall

Is there a place where I can see what the new features for 7.1 are expected
to be?  I'm hopeful for native BLOB support 

David