[GENERAL] diagram tools?
Can anyone recommend a PostgreSQL compatible free tool that I can use to generate some schema diagrams of an existing database?

Thanks

Dan
Re: [GENERAL] Checkpoint Tuning Question
On Mon, Jul 13, 2009 at 3:53 PM, Dan Armbrust wrote:
>> So this thought leads to a couple of other things Dan could test.
>> First, see if turning off full_page_writes makes the hiccup go away.
>> If so, we know the problem is in this area (though still not exactly
>> which reason); if not we need another idea. That's not a good permanent
>> fix though, since it reduces crash safety. The other knobs to
>> experiment with are synchronous_commit and wal_sync_method. If the
>> stalls are due to commits waiting for additional xlog to get written,
>> then async commit should stop them. I'm not sure if changing
>> wal_sync_method can help, but it'd be worth experimenting with.
>>
>> regards, tom lane
>
> All of my testing to date has been done with synchronous_commit=off
>
> I just tried setting full_page_writes=off - and like magic, the entire
> hiccup went away.
>
> Thanks,
>
> Dan

I haven't done any other testing for this issue since discovering that setting "full_page_writes=off" makes my performance hiccup go away.

I'm about to move on to some different tests and reset this setup that I was using to test out this issue - is there anything else that anyone would like to have gathered before I axe this system? I can redo it again later too, it will just take a fair bit longer for me to reconfigure the test environment.

Thanks,

Dan
Re: [GENERAL] Checkpoint Tuning Question
> Propose a DTrace probe immediately after the "goto begin" at line 740 of
> xlog.c, so we can start tracing from the first backend following
> checkpoint, and turn off tracing when all backends have completed a
> transaction.

That's Greek to me. But I'm happy to test things if you send me patches or custom code.

Thanks,

Dan
Re: [GENERAL] Checkpoint Tuning Question
> So this thought leads to a couple of other things Dan could test.
> First, see if turning off full_page_writes makes the hiccup go away.
> If so, we know the problem is in this area (though still not exactly
> which reason); if not we need another idea. That's not a good permanent
> fix though, since it reduces crash safety. The other knobs to
> experiment with are synchronous_commit and wal_sync_method. If the
> stalls are due to commits waiting for additional xlog to get written,
> then async commit should stop them. I'm not sure if changing
> wal_sync_method can help, but it'd be worth experimenting with.
>
> regards, tom lane

All of my testing to date has been done with synchronous_commit=off

I just tried setting full_page_writes=off - and like magic, the entire hiccup went away.

Thanks,

Dan
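For reference, the combination under test here, as a minimal postgresql.conf sketch (values are the ones named in this thread; everything else is assumed to be at defaults, and full_page_writes=off is a diagnostic, not a safe permanent setting):

    synchronous_commit = off    # off for all of the runs above
    full_page_writes = off      # makes the hiccup disappear here, but reduces crash safety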
Re: [GENERAL] Checkpoint Tuning Question
> Hm, I'm not sure I believe any of that except the last bit, seeing that
> he's got plenty of excess CPU capability. But the last bit fits with
> the wimpy-I/O problem, and it also offers something we could test.
> Dan, please see what happens when you vary the wal_buffers setting.
> (Note you need a postmaster restart to change that.)
>
> regards, tom lane

Ok, I tried a few different values - 32kB, 64kB, 512kB, 2MB and 10MB.

I'm not seeing any highly noticeable change in behaviour with any setting - it wasn't a scientific test, but I seem to have about the same size hiccup with each setting. The hiccup may be slightly shorter with the 10MB setting, but barely, if it is.

Thanks,

Dan
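One iteration of that experiment, sketched out (the parameter and the restart requirement are as Tom describes; the data directory path is an assumption):

    # in postgresql.conf - tried 32kB, 64kB, 512kB, 2MB, and 10MB in turn
    wal_buffers = 2MB

    # wal_buffers only takes effect after a postmaster restart
    pg_ctl -D /path/to/data restart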
Re: [GENERAL] Checkpoint Tuning Question
> As Greg commented upthread, we seem to be getting forced to the
> conclusion that the initial buffer scan in BufferSync() is somehow
> causing this. There are a couple of things it'd be useful to try
> here:
>
> * see how the size of the hiccup varies with shared_buffers;

I tried decreasing shared_buffers - both 25MB and 50MB were too small for my load - I had slow queries at all times. So then I increased it from what I was using - 100MB - to 500MB - and the hiccup roughly doubles in length. At 100MB, the hiccup is about 2-3 seconds long. At 500MB, the hiccup is about 6 seconds long.

> * try inserting a delay into that scan loop, as per attached
> quick-and-dirty patch. (Numbers pulled from the air, but
> we can worry about tuning after we see if this is really
> where the problem is.)

After finally getting this particular system into a state where I could build Postgres (I was using the binary install), I built 8.3.4 using your patch - but I didn't see any change in the behaviour. I see hiccups that appear to be the same length as I saw on the binary build of 8.3.4.

Thanks,

Dan
Re: [GENERAL] Checkpoint Tuning Question
> However, the latest report says that he
> managed that, and yet there's still a one-or-two-second transient of
> some sort. I'm wondering what's causing that. If it were at the *end*
> of the checkpoint, it might be the disk again (failing to handle a bunch
> of fsyncs, perhaps). But if it really is at the *start* then there's
> something else odd happening.
>
> regards, tom lane

Log output during heavy load - checkpoint logging on, and slow query logging on for queries that take longer than 1 second. Blank space inserted to show the interesting bits. Almost all of the slow query log messages are logged within about 3 seconds of the "checkpoint starting" message.

LOG: checkpoint starting: xlog

LOG: duration: 1101.419 ms execute : select dynamichos0_.ethernetmacaddr as ethernet1_0_, dynamichos0_.ipaddr as ipaddr0_, dynamichos0_.cpemac as cpemac0_, dynamichos0_.last_updated as last3_0_, dynamichos0_.leasetime as leasetime0_, dynamichos0_.regtime as regtime0_ from iphost dynamichos0_, cpe cpe1_ where dynamichos0_.cpemac=cpe1_.cpemac and 1=1 and dynamichos0_.ethernetmacaddr=$1 and dynamichos0_.cpemac=$2 and cpe1_.regBaseId=$3 and dynamichos0_.ipaddr<>$4
DETAIL: parameters: $1 = '01:01:01:56:01:7F', $2 = '00:55:00:82', $3 = '01:01:01:56', $4 = '000.000.000.000'
LOG: duration: 1101.422 ms execute : insert into iphost (cpemac, last_updated, leasetime, regtime, ethernetmacaddr, ipaddr) values ($1, $2, $3, $4, $5, $6)
DETAIL: parameters: $1 = '01:AA:00:A0', $2 = '2009-07-08 15:33:20.673', $3 = '2009-07-08 21:06:40.67', $4 = '2009-07-08 15:33:20.67', $5 = '01:01:02:AB:01:9D', $6 = '2.171.156.0'
LOG: duration: 1501.905 ms execute : select dynamichos0_.ethernetmacaddr as ethernet1_0_, dynamichos0_.ipaddr as ipaddr0_, dynamichos0_.cpemac as cpemac0_, dynamichos0_.last_updated as last3_0_, dynamichos0_.leasetime as leasetime0_, dynamichos0_.regtime as regtime0_ from iphost dynamichos0_, cpe cpe1_ where dynamichos0_.cpemac=cpe1_.cpemac and 1=1 and dynamichos0_.ethernetmacaddr=$1 and dynamichos0_.cpemac=$2 and cpe1_.regBaseId=$3 and dynamichos0_.ipaddr<>$4
DETAIL: parameters: $1 = '01:01:01:C3:01:8B', $2 = '00:C2:00:8E', $3 = '01:01:01:C3', $4 = '000.000.000.000'

LOG: checkpoint complete: wrote 9975 buffers (77.9%); 0 transaction log file(s) added, 0 removed, 15 recycled; write=156.576 s, sync=0.065 s, total=156.662 s

LOG: checkpoint starting: xlog

LOG: duration: 1104.780 ms execute : delete from iphost where ethernetmacaddr=$1 and ipaddr=$2 and last_updated=$3
DETAIL: parameters: $1 = '01:01:01:33:01:AA', $2 = '1.50.169.0', $3 = '2009-07-08 15:32:57.131'
LOG: duration: 1106.499 ms execute : select cpe0_.cpemac as cpemac2_0_, cpe0_.changeTime as changeTime2_0_, cpe0_.comment as comment2_0_, cpe0_.configuration as configur4_2_0_, cpe0_.cpeconfigid as cpeconf17_2_0_, cpe0_.cpefilterid as cpefilt18_2_0_, cpe0_.endTime as endTime2_0_, cpe0_.ispId as ispId2_0_, cpe0_.last_updated as last7_2_0_, cpe0_.reglocationid as regloca19_2_0_, cpe0_.modelId as modelId2_0_, cpe0_.numberOfHosts as numberOf9_2_0_, cpe0_.regBaseId as regBaseId2_0_, cpe0_.regTime as regTime2_0_, cpe0_.roamAllowed as roamAll12_2_0_, cpe0_.serialNumber as serialN13_2_0_, cpe0_.slaid as slaid2_0_, cpe0_.enable as enable2_0_, cpe0_.staticip as staticip2_0_, cpe0_.subscriberid as subscri21_2_0_, cpe0_.swVersion as swVersion2_0_, cpe0_.vlanid as vlanid2_0_, cpe0_.voipid as voipid2_0_ from cpe cpe0_ where cpe0_.cpemac=$1
DETAIL: parameters: $1 = '00:84:00:37'
LOG: duration: 1205.828 ms execute : insert into iphost (cpemac, last_updated, leasetime, regtime, ethernetmacaddr, ipaddr) values ($1, $2, $3, $4, $5, $6)
DETAIL: parameters: $1 = '02:31:00:25', $2 = '2009-07-08 15:39:53.718', $3 = '2009-07-08 21:13:13.715', $4 = '2009-07-08 15:39:53.715', $5 = '01:01:03:32:01:22', $6 = '3.51.33.0'
LOG: duration: 1203.287 ms execute : insert into iphost (cpemac, last_updated, leasetime, regtime, ethernetmacaddr, ipaddr) values ($1, $2, $3, $4, $5, $6)
DETAIL: parameters: $1 = '03:47:00:81', $2 = '2009-07-08 15:39:53.72', $3 = '2009-07-08 21:13:13.717', $4 = '2009-07-08 15:39:53.717', $5 = '01:01:04:48:01:7E', $6 = '4.74.125.0'
LOG: duration: 1201.480 ms execute : insert into iphost (cpemac, last_updated, leasetime, regtime, ethernetmacaddr, ipaddr) values ($1, $2, $3, $4, $5, $6)
DETAIL: parameters: $1 = '01:12:00:7F', $2 = '2009-07-08 15:39:53.725', $3 = '2009-07-08 21:13:13.721', $4 = '2009-07-08 15:39:53.721', $5 = '01:01:02:13:01:7C', $6 = '2.19.123.0'

LOG: checkpoint complete: wrote 9794 buffers (76.5%); 0 transaction log file(s) added, 0 removed, 15 recycled; write=148.084 s, sync=0.062 s, total=148.172 s
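The logging configuration behind this output, as a sketch (these are the real 8.3 parameter names; the values are inferred from the description above):

    log_checkpoints = on
    log_min_duration_statement = 1000    # milliseconds; logs queries taking over 1 second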
Re: [GENERAL] Checkpoint Tuning Question
>> Wouldn't increasing the length between checkpoints result in the
>> checkpoint process taking even longer to complete?
>
> You don't really care how long it takes. What you want is for it not to
> be chewing a bigger fraction of your I/O bandwidth than you can spare.
> Hence, you want it to take longer. Trying to shorten it is just going
> to make the spike worse.
>
> regards, tom lane

I bumped the segments up to 15, the timeout up to 10 minutes, and the completion target to 0.7.

What I observe now is that I get a short (1-2 second) period where I get slow queries - I'm running about 30 queries in parallel at any given time - and it appears that all 30 queries get paused for a couple of seconds at the moment that a checkpoint begins. However, after the initial slowdown, I don't get any additional slow queries logged while the checkpoint process runs.

My takeaway is that starting the checkpoint process is really expensive - so I don't want to start it very frequently. And the only downside to longer intervals between checkpoints is a longer recovery time if the system crashes?

Thanks,

Dan
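The settings described above, collected as a postgresql.conf sketch (these are the real parameter names; this is the configuration under test, not a general recommendation):

    checkpoint_segments = 15
    checkpoint_timeout = 10min
    checkpoint_completion_target = 0.7    # spread checkpoint writes over ~70% of the interval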
Re: [GENERAL] Checkpoint Tuning Question
On Wed, Jul 8, 2009 at 1:23 PM, Tom Lane wrote:
> Dan Armbrust writes:
>> With checkpoint_segments set to 10, the checkpoints appear to be
>> happening due to checkpoint_timeout - which I've left at the default
>> of 5 minutes.
>
> Well, you could increase both those settings so as to put the
> checkpoints further apart, and/or increase checkpoint_completion_target
> to spread the checkpoint I/O over a larger fraction of the cycle.
>
> regards, tom lane

Sorry, didn't mean to get off list.

Wouldn't increasing the length between checkpoints result in the checkpoint process taking even longer to complete?

The way my system processes and buffers incoming data, having infrequent (but long and disruptive) checkpoints is bad, since it causes the throughput to suffer so badly that my buffers can't hold the flood, and I have to drop data. If I can reduce the impact of the checkpoints and have them occur more frequently, then my buffers should be able to hold the incoming data during the short durations that I have slow queries.

I'll go experiment with checkpoint_completion_target.

Thanks,

Dan
Re: [GENERAL] Checkpoint Tuning Question
On Wed, Jul 8, 2009 at 12:50 PM, Tom Lane wrote:
> Dan Armbrust writes:
>> However, once the checkpoint process begins, I get a whole flood of
>> queries that take between 1 and 10 seconds to complete. My throughput
>> crashes to near nothing. The checkpoint takes between 45 seconds and
>> a minute to complete.
>
> You sure this is 8.3? It should spread out checkpoints over a couple of
> minutes by default. [thinks...] Maybe you need to increase
> checkpoint_segments some more. If it's forcing the checkpoint in order
> to hold down the number of WAL segments used up, that would explain a
> fast checkpoint.
>
> regards, tom lane

Just checked - currently running 8.3.4 on the system I'm testing on.

With checkpoint_segments set to 10, the checkpoints appear to be happening due to checkpoint_timeout - which I've left at the default of 5 minutes. If I double my test load, I end up with checkpoints happening about every 4 minutes, with the log message that I read to indicate that it used up all 10 segments.

So not much I can do to keep the checkpoint process from causing a burst of slow queries?

Thanks,

Dan
[GENERAL] Checkpoint Tuning Question
I'm running a steady-state test where I am pushing about 600 queries per second through a Postgres 8.3 system on an 8-CPU Linux system. It's a mix of inserts, updates, and deletes on a few tables - the two biggest ones probably have about 200,000 rows. The hard drive is just a simple, run-of-the-mill desktop drive.

Here are the parameters that I have changed from the defaults:

shared_buffers = 100MB
synchronous_commit = off

And, after noting complaints in the log file about checkpoint intervals, I set:

checkpoint_segments = 10

Then I turned on slow query logging for queries that take more than 1 second, and checkpoint logging.

Typically, I see no slow queries. The system handles the load just fine. Once in a while, I'll see a query that takes 3-5 seconds. However, once the checkpoint process begins, I get a whole flood of queries that take between 1 and 10 seconds to complete. My throughput crashes to near nothing. The checkpoint takes between 45 seconds and a minute to complete. After the checkpoint completes, the system returns to having very few slow queries, and keeps up with the load fine.

Is there anything I can do to prevent the occasional slow query? Is there anything I can do to prevent (or minimize) the performance impact of the checkpoint?

Thanks,

Dan
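A quick way to tell whether checkpoints are being triggered by the timeout or by exhausting checkpoint_segments, besides reading the log (a sketch, using the pg_stat_bgwriter view that 8.3 provides):

    SELECT checkpoints_timed, checkpoints_req FROM pg_stat_bgwriter;

checkpoints_req climbing faster than checkpoints_timed would mean the segment limit, not the timeout, is forcing them.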
Re: [GENERAL] recover corrupt DB?
> These reports seem to come up a bit, with disk full issues resulting in
> the need to pg_resetxlog, dump, and re-initdb, but I wouldn't be too
> shocked if they all turned out to be on xfs or something like that.

My particular disk-full condition was on ext2. Nothing exotic.

Also, the process that filled the disk was postgres - if that makes any difference - I had left a debug level turned up in the postgres config file, and it was logging every single db query. Since it wasn't set up to remove old log files, it filled the disk.

Nothing else unusual occurred that I'm aware of - things went weird for the lab tester, he cleared some space, rebooted the system, and postgres didn't come back online.
Re: [GENERAL] recover corrupt DB?
> In general, pg_resetxlog would be
> the tool to try here. Don't panic yet. ;-)

Yep, that was the command I was looking for. That at least got the DB to a point where it would start, and I was able to do a dump. So, I dumped and reloaded all of the databases.

Things seem fine, but bits and pieces of documentation I've seen for pg_resetxlog also recommend initdb, and starting over. Is that necessary?

Thanks,

Dan
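A rough sketch of the recovery path described here (paths are assumptions; since pg_resetxlog discards WAL, data recovered this way should be dumped and inspected rather than trusted as-is):

    pg_resetxlog /path/to/data               # clear the damaged WAL so the server can start
    pg_ctl -D /path/to/data start
    pg_dumpall -U postgres > recovered.sql   # dump everything while it is up

    # the conservative route the docs suggest: re-initdb and reload
    initdb -D /path/to/newdata
    pg_ctl -D /path/to/newdata start
    psql -U postgres -f recovered.sql postgres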
[GENERAL] recover corrupt DB?
I had a test system (read: not backed up, sigh) which had the disk go full while PostgreSQL was loaded; consequently, PostgreSQL will no longer start.

It is logging an error about detecting an invalid shutdown, trying to replay something, and then an error about not being able to open a file it is looking for.

Is this DB toast? Or is there something I could do to get the DB back into a state where it will start, without losing everything?

Thanks,

Dan
Re: [GENERAL] Slow Vacuum was: vacuum output question
> Well, your throughput on this machine is horrible. It looks like with
> 8.1 all your time is sys + cpu for your cpus, while with 8.3 you've
> got more idle and more i/o wait, which tells me that 8.3 is smarter
> about vacuuming, so it's spending less time working the cpus and more
> time waiting for the i/o subsystem.
>
> Either way, getting only 2 or so megs a second write is pretty bad. I
> can get those numbers from a laptop. An older laptop like a single
> core 1.6GHz Pentium M based T42 or something. My laptop, which is new
> from last year, is about twice as fast as your server in terms of I/O.

This is my problem in a nutshell. As of yet, I have no rational explanation for this performance. The servers in question are not slow. This particular server never shows this problem when running a newer OS - but I have not yet finished isolating which OSes have problems on this hardware. No other software on the system exhibits any sort of IO issue other than PostgreSQL.

I have customers with $20K servers that can't handle the workload that I can handle on an old cruddy laptop. However, much of the appeal of our product is low per-site installation costs, so expensive servers don't fit into the mix.

Random futzing - reindexing, manual full vacuums, rebooting the server - each of these has cleared the error condition on one site or another, and allowed the system to go back to functioning the way it should for months, until the problem randomly crops up again.

I'm still looking into it, but at the same time, we have enough workarounds to the issue now (scheduled reindex, install a newer OS, upgrade to Postgres 8.3) that this is becoming a low-priority mystery, rather than the high-priority one it has been.

Thanks for your thoughts,

Dan
Re: [GENERAL] Slow Vacuum was: vacuum output question
On PostgreSQL 8.1, while a long vacuum is running, the output of vmstat 10 looks like this (sorry, can't format this very well in this e-mail client):

[vmstat 10 output, too badly mangled by the archive to reproduce: roughly 17 sample rows. Throughout the vacuum, CPU runs at roughly 35-65% user and 35-58% system with essentially 0% idle, I/O wait mostly 0 (peaking at 8), and blocks out (bo) mostly in the 700-2200 range.]

Vacuum hasn't stopped yet at this point - it's been running for many minutes.

On PostgreSQL 8.3, we had this:

[vmstat 10 output, likewise mangled: 4 sample rows showing roughly 42-53% user, 8-10% system, 19-48% idle, and 2-18% I/O wait, with bo in the 560-2560 range.]

More random data: on an overnight test of PostgreSQL 8.3 on Fedora Core 6, with auto-vacuum on (using all default settings), the amount of time that it takes to run a manual vacuum on a constant-sized database (only row updates, or deletes followed by a replacement add) has doubled from 2 seconds to 4 seconds. A reindex brings the time back down to 2 seconds.

Dan
Re: [GENERAL] Slow Vacuum was: vacuum output question
I'm no closer to a solution, but here are some additional data points - all taken on Fedora Core 6.

Postgres 8.1 built from source. Auto vacuum disabled.
Create empty database.
Run our load on the system for 2 hours to populate and exercise the database.
Run vacuum. Takes more than a minute.
Run vacuum immediately again. Takes more than a minute.
Reindex the database. Takes 10 seconds.
Run vacuum again. Takes 2 seconds.
Allow load to run on the system for 30 minutes.
Run vacuum again. Takes more than a minute.

Postgres 8.3 built from source. Auto vacuum disabled.
Create empty database.
Run our load on the system for 2 hours to populate and exercise the database.
Run vacuum. Takes more than a minute.
Run vacuum immediately again. Takes 15 seconds.
Run vacuum immediately again. Takes 15 seconds.
Reindex the database. Takes 10 seconds.
Run vacuum again. Takes 2 seconds.

So PostgreSQL 8.3 shows better behaviour, but it is still showing some sort of performance issue which a reindex fixes.

And then of course, the kicker is that we can't recreate any of these issues when running the same exact test, on the same exact hardware, but using a different underlying OS. When we were running under a modern Ubuntu, the vacuum never takes more than 2 seconds.

We will be checking other OSes soon. I guess if we can't figure out what is causing it, we can at least isolate the distros that we need to tell our customers not to use (or to schedule a reindex if they insist on not upgrading their OS).
Re: [GENERAL] Slow Vacuum was: vacuum output question
On Tue, Jan 6, 2009 at 3:36 PM, Tom Lane wrote:
> "Dan Armbrust" writes:
>> INFO: "cpe": found 415925 removable, 50003 nonremovable row versions
>> in 10849 pages
>
>> What on earth could be going on between PostgreSQL 8.1 and Fedora 6
>> that is bloating and/or corrupting the indexes like this?
>
> You're focusing on the indexes when the problem is dead table rows.
>
> It's very hard to believe that there's any OS dependence as such
> involved in that. I wonder whether (a) the Ubuntu and Fedora packages
> you're using are the same 8.1.x point release; (b) if there's any
> interesting non-default behavior built into the Ubuntu packaging
> ... like running autovacuum automatically, for instance.
>
> regards, tom lane

In our testing, Postgres 8.1 was built from source (PostgreSQL website source) on both systems. No distro packages involved.

Believe me, we are as baffled as you. We have been chasing this bug off and on for months on a couple of different customer sites now.

Thanks,

Dan
Re: [GENERAL] Slow Vacuum was: vacuum output question
> Obviously the choice of operating system has no impact on the contents of
> your index.
>
> A better question might be, what did your application or maintenance
> procedures do different in the different tests?
>
> --
> Alan

Our problem for a long time has been assuming the "obvious". But we now have tests that show otherwise.

I'm now thinking something along the lines of an obscure file system or kernel interaction bug - one that was perhaps corrected in newer releases of the OS.

Now that we can finally reproduce the problem in house, we are still doing more tests to figure out specifics - does the problem go away with Postgres 8.3, ext2/ext3/ReiserFS, etc.
Re: [GENERAL] Slow Vacuum was: vacuum output question
> On Tue, Jan 6, 2009 at 1:39 PM, Dan Armbrust
> wrote:
>> Here is an interesting new datapoint.
>>
>> Modern Ubuntu distro - PostgreSQL 8.1. SATA drive. No RAID. Cannot
>> reproduce slow vacuum performance - vacuums take less than a second
>> for the whole database.
>>
>> Reinstall OS - Fedora Core 6 - PostgreSQL 8.1. Push data through
>> PostgreSQL for a couple hours (same as above) and now vacuum reports
>> this:
>
> Are you pushing the same amount of data through the Ubuntu server? If
> not, then the comparison is invalid; if so, then yeah, there's some
> kind of difference between the platforms.
>
> Note that Fedora Core 6 is quite old compared to Ubuntu 8.04 or 8.10.
> Also it's more likely to be installed on older and/or slower
> equipment.

Yep - actually, we pushed much more data through the Ubuntu system and could never reproduce the problem. On the Fedora Core 6 system, the problem happened very quickly.

In our testing here, the Ubuntu test was on the same hardware as the Fedora Core 6 system (not just identical, but the same actual box).

It seems that there is some sort of bad interaction between some part of the older OS and PostgreSQL. We have also seen what appears to be the same issue on a CentOS 4.4 system, which is at a rather similar package level to Fedora Core 6.
Re: [GENERAL] Slow Vacuum was: vacuum output question
On Tue, Jan 6, 2009 at 3:01 PM, Alvaro Herrera wrote:
> Dan Armbrust wrote:
>
>> What on earth could be going on between PostgreSQL 8.1 and Fedora 6
>> that is bloating and/or corrupting the indexes like this?
>
> Postgres 8.1 was slow to vacuum btree indexes. My guess is that your
> indexes are so bloated that it takes a lot of time to scan them.
>
> I think the solution here is to vacuum this table far more often.
>
> --
> Alvaro Herrera    http://www.CommandPrompt.com/
> The PostgreSQL Company - Command Prompt, Inc.

Actually, the customer-reported problem is that when they enable autovacuum, the performance basically tanks, because vacuum runs so slowly that they can't bear to have it run frequently.

Though perhaps they had bloated indexes before they started autovacuum, and it never fixed them. Perhaps it will behave properly if we do a reindex, and then enable autovacuum.
Re: [GENERAL] Slow Vacuum was: vacuum output question
Here is an interesting new datapoint.

Modern Ubuntu distro - PostgreSQL 8.1. SATA drive. No RAID. Cannot reproduce slow vacuum performance - vacuums take less than a second for the whole database.

Reinstall OS - Fedora Core 6 - PostgreSQL 8.1. Push data through PostgreSQL for a couple hours (same as above) and now vacuum reports this:

INFO: vacuuming "public.cpe"
INFO: index "pk_cpe" now contains 50048 row versions in 2328 pages
DETAIL: 415925 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.05s/0.33u sec elapsed 0.94 sec.
INFO: index "ix_cpe_ispid" now contains 50090 row versions in 1338 pages
DETAIL: 415925 index row versions were removed.
953 index pages have been deleted, 0 are currently reusable.
CPU 0.27s/0.22u sec elapsed 8.93 sec.
INFO: index "ix_cpe_enable" now contains 50676 row versions in 1637 pages
DETAIL: 415925 index row versions were removed.
1161 index pages have been deleted, 0 are currently reusable.
CPU 0.45s/0.31u sec elapsed 14.01 sec.
INFO: "cpe": removed 415925 row versions in 10844 pages
DETAIL: CPU 1.48s/0.25u sec elapsed 35.86 sec.
INFO: "cpe": found 415925 removable, 50003 nonremovable row versions in 10849 pages
DETAIL: 6 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 2.40s/1.18u sec elapsed 61.13 sec.

It took 61 seconds to vacuum, and the number of index row versions removed was huge.

We then issued a reindex command for the entire database - and now the vacuum times are back down under a second.

What on earth could be going on between PostgreSQL 8.1 and Fedora 6 that is bloating and/or corrupting the indexes like this?

Thanks,

Dan
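The fix sequence described above, as the commands involved (a sketch, assuming psql connected to the affected database; 'mydb' is a hypothetical name):

    VACUUM VERBOSE cpe;    -- slow: ~61 seconds, with huge removed-row counts as shown
    REINDEX DATABASE mydb; -- rebuilds all of the bloated indexes
    VACUUM VERBOSE cpe;    -- back under a second afterward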
Re: [GENERAL] Slow Vacuum was: vacuum output question
>> Their workaround had been to run a daily autovacuum at the lowest load
>> time of day, to cause the least disruption.
>
> What is a "daily autovacuum"? It sounds like some tables just need
> vacuuming more often. If they find that the system is not responsive
> during that, it tells us that they need more disk bandwidth or that
> they need to integrate vacuuming some tables with their program.

Sorry, I meant a daily manual vacuum.

On paper, their hardware is plenty fast for their workload. Out of hundreds of sites, all running the same software putting load on the database, this is only the second time we have seen this odd behaviour of very slow vacuums.

I guess I was hoping that someone would be able to chime in and say - yes, in such-and-such version, we fixed an obscure bug that sometimes caused huge slowdowns, perhaps when combined with certain Linux kernels. It was a nice dream anyway :)

iozone looks useful. I'll see if I can get on their system and do some proper benchmarks.

Thanks,

Dan
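Before a full iozone run, a rough first-pass number can come from plain dd (a sketch, assuming GNU dd and free space on the database disk; the path is hypothetical):

    # sequential write throughput; conv=fdatasync forces a flush so the
    # reported rate reflects the disk, not the page cache
    dd if=/dev/zero of=/data/ddtest bs=8k count=131072 conv=fdatasync
    rm /data/ddtest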
Re: [GENERAL] Slow Vacuum was: vacuum output question
>> INFO: "cpe": found 95498 removable, 18757 nonremovable row versions
>> in 7 pages
>> DETAIL: 0 dead row versions cannot be removed yet.
>> There were 280173 unused item pointers.
>> 0 pages are entirely empty.
>> CPU 5.35s/0.99u sec elapsed 724.38 sec.
>
> How many idle transactions are there?

Not sure. I don't expect that there were many; their system isn't very highly loaded compared to most of our customers' systems.

The way that they reported the problem to us was that if they enable autovacuum, whenever it runs (about 4 times an hour), it would stop processing the things it needed to process, due to table lock contention, for several minutes.

Their workaround had been to run a daily autovacuum at the lowest load time of day, to cause the least disruption.
Re: [GENERAL] Slow Vacuum was: vacuum output question
>> INFO: "cpe": found 95498 removable, 18757 nonremovable row versions
>> in 7 pages
>> DETAIL: 0 dead row versions cannot be removed yet.
>> There were 280173 unused item pointers.
>> 0 pages are entirely empty.
>> CPU 5.35s/0.99u sec elapsed 724.38 sec.
>>
>> Then, running vacuum again immediately afterward, on a system that was
>> basically idle, would result in nearly the same amount of time to
>> vacuum the table.
>
> You do realize that except for the end of a table, vacuum recovers no
> actual space, just makes it available for new tuples to move in there.
> So it makes sense that the second vacuum would take just as long, or
> nearly so.

Yep. The real issue is that it took 724 seconds, instead of, say, a half second - like it does on my system. I wasn't sure if I should expect vacuum to take longer to run when it finds a large number of tuples that it needs to make available, so I just have them run it twice so I can easily compare the time that it takes with essentially no work to do.

> Hard to say. Have them run
>
> vmstat 1
>
> while vacuuming and see what bi/bo look like.

Haven't looked at that yet on this particular system. Last time this occurred, on different hardware, the vmstat 'wa' column showed very large values while vacuum was running. I don't recall what the bi/bo columns indicated. top also showed very high load averages while vacuum was running - but basically no CPU use.

Are there any common tools that could do a better disk benchmark than hdparm -Tt?

Thanks,

Dan
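The run-it-twice comparison, as the commands the customer would execute (a sketch, assuming psql; cpe is the table from the output above):

    VACUUM VERBOSE cpe;   -- first run does the real cleanup work
    VACUUM VERBOSE cpe;   -- immediate rerun has almost no work to do, so on a
                          -- healthy system the elapsed time should collapse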
[GENERAL] Slow Vacuum was: vacuum output question
To follow up on an old thread that I started: I had a customer who had a system where manual vacuum runs were taking a very long time to run. I was seeing output like this:

INFO: "cpe": found 95498 removable, 18757 nonremovable row versions in 7 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 280173 unused item pointers.
0 pages are entirely empty.
CPU 5.35s/0.99u sec elapsed 724.38 sec.

Then, running vacuum again immediately afterward, on a system that was basically idle, would result in nearly the same amount of time to vacuum the table.

Getting a copy of the database from the customer, loading it onto my Postgres system, and running the vacuum would result in runs that took less than a second (as expected).

General opinion was that it was a disk-I/O problem. We rebooted the system, and magically, the problem went away. As tends to be the case with "magically" fixed problems, this one is back.

Now I have a different customer running Postgres 8.1 on Fedora Core 6, and their system produced the log snippet above. hdparm shows disk-I/O throughput being perfectly normal on their system. Everything else on the system seems to be working correctly. The reboot solution doesn't help; vacuum still runs painfully slow.

I'm still waiting for a copy of their postgresql config file, but I'm guessing that almost everything was left at the default settings.

I don't suppose anyone knows of any bugs that existed between Postgres 8.1 and older Linux kernels that led to behavior like this? I can't really just ask them to upgrade their OS and/or Postgres on the hunch that the problem should go away. And I haven't yet been able to reproduce anything like this on a system that I have control over.

Thanks for any ideas.

Dan
Re: [GENERAL] vacuum output question
> There was concurrent access to the table during VACUUMing, so the long
> delay is explainable as long waits for cleanup lock, plus probably
> thrashing the cache with bloated indexes. The CPU overhead per row seems
> OK. We should instrument the wait time during a VACUUM and report that
> also.
>
> --
> Simon Riggs    www.2ndQuadrant.com
> PostgreSQL Training, Services and Support

Is that a guess? Or something you can tell from the log above? Because there shouldn't have been any concurrent access while the VACUUM was run - the customers had failed over to a different system, so while I can't be sure, I expect that there was no other database activity at the time the command was run.

Thanks,

Dan
Re: [GENERAL] vacuum output question
Thanks everyone. You have helped back up my suspicions.

It is indeed a Linux system, and it has a RAID IO system, but I don't yet know the details of that IO system. Time to put them back to work looking at their hardware, rather than blaming our software :)

Thanks for the extra tips on hunting for excess IO from other processes.

Dan
[GENERAL] vacuum output question
I have a system backed by a PostgreSQL DB at a customer site that mysteriously slowed way down - and couldn't keep up with the load for no apparent reason.

I had them run a vacuum analyze verbose on my database, and had these lines come back which made me suspicious:

INFO: index "ix_cpe_ispid" now contains 41626 row versions in 13727 pages
DETAIL: 5224 index row versions were removed.
1543 index pages have been deleted, 1373 are currently reusable.
CPU 13.09s/3.51u sec elapsed 157.85 sec.
INFO: index "ix_cpe_enable" now contains 41628 row versions in 29417 pages
DETAIL: 5224 index row versions were removed.
3706 index pages have been deleted, 3291 are currently reusable.
CPU 31.27s/8.22u sec elapsed 687.60 sec.
INFO: "cpe": found 5224 removable, 41626 nonremovable row versions in 1303 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 22416 unused item pointers.
0 pages are entirely empty.
CPU 44.46s/11.82u sec elapsed 852.85 sec.

Why did those particular tables and indexes take _so_ long to vacuum? Perhaps we have a disk-level IO problem on this system?

Can someone tell me what 'CPU 44.46s/11.82u sec' means? I have a guess, but I'm not sure.

Thanks,

Dan
Re: [GENERAL] Drop database / database in use question
This is interesting. On Windows, if I attempt to drop a database when I am purposefully holding a connection open to the DB, it fails immediately - in ~100 ms. On Linux, if I do the same thing, the drop database command takes upwards of 4 seconds before it gives up and says it can't drop the DB.

Is this expected? Is there some way I can control how long it will block the drop command, waiting for the DB to not be in use?

Thanks,

Dan
Re: [GENERAL] Drop database / database in use question
It would seem that way. But if you have ever tried programming with the constraints of an InstallAnywhere installer, you would know why :)

On Fri, Oct 17, 2008 at 10:43 AM, Alvaro Herrera <[EMAIL PROTECTED]> wrote:
> Dan Armbrust wrote:
>> I don't suppose that there is any easy way that I can stop and/or
>> disable the autovac temporarily for the database that I want to drop.
>>
>> The only thing that I have seen so far is that I would have to add
>> rows to the pg_autovacuum table for each table in my database, but
>> I'm not confident that that would even stop it from accessing the
>> database, since it says that even when set to disabled, autovacuum
>> will still run to prevent transaction id wraparounds, so it seems that
>> it still would make a quick check into the database that my drop
>> command could collide with.
>
> You seem to want to go to a lot of trouble just to avoid a simple retry
> loop.
>
> --
> Alvaro Herrera    http://www.CommandPrompt.com/
> PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [GENERAL] Drop database / database in use question
I don't suppose that there is any easy way that I can stop and/or disable the autovac temporarily for the database that I want to drop.

The only thing that I have seen so far is that I would have to add rows to the pg_autovacuum table for each table in my database, but I'm not confident that that would even stop it from accessing the database, since it says that even when set to disabled, autovacuum will still run to prevent transaction id wraparounds, so it seems that it still would make a quick check into the database that my drop command could collide with.
Re: [GENERAL] Drop database / database in use question
But there is no user 2. I _know_ I am the only user of this database. So how can user 1 create a race condition by himself? Or is this something PostgreSQL is doing internally (like vacuum)?

Do I really just have to keep trying the DROP command N times in a row, until it decides it wants to work? That really doesn't seem right.

Thanks,

Dan

On Fri, Oct 17, 2008 at 9:24 AM, Scott Marlowe <[EMAIL PROTECTED]> wrote:
> On Fri, Oct 17, 2008 at 8:11 AM, Dan Armbrust
> <[EMAIL PROTECTED]> wrote:
>>> There are obvious race conditions in that assumption. Why don't you
>>> just try the drop and see if it succeeds?
>>>
>>> regards, tom lane
>>
>> I don't follow - why is there a race condition? I'm driving the
>> commands into postgresql via the command line.
>
> User 1: select * from pg_stat_activity where datname='db123';
> User 2: psql db123
> User 1: drop database db123;
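For what it's worth, the retry loop suggested upthread could be as simple as this (a sketch, assuming the installer can shell out to bash; the retry count and sleep are arbitrary):

    # try the drop a few times before giving up; psql exits nonzero on ERROR
    for i in 1 2 3 4 5; do
        psql -U username -d template1 -c "drop database fred" && break
        sleep 2
    done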
Re: [GENERAL] Drop database / database in use question
> There are obvious race conditions in that assumption. Why don't you
> just try the drop and see if it succeeds?
>
> regards, tom lane

I don't follow - why is there a race condition? I'm driving the commands into PostgreSQL via the command line.

The command that does the query on the pg_stat_activity table happens quite a while before my attempt to drop the database - and it's logging into the template1 database, rather than the database I want to drop. The drop attempt comes later, in a subsequent psql command line invocation. The drop command also logs in using the template1 database.

Does the psql command line client connection not get cleaned up immediately, or something like that? No other command or tool will access this database (to my knowledge) in between the two commands. So what is the mystery user that I'm finding using the database? My only guess so far is that it was the autovac daemon - but I don't know enough about how that works to know if that is even a reasonable guess.

Due to the nature of the installer tool I'm driving this from, parsing back the output of the psql commands isn't much fun... and there are cases where a failure is acceptable (the database already doesn't exist, etc.). If I can have a reliable drop command that always works, it would be much easier.

Thanks,

Dan
[GENERAL] Drop database / database in use question
I have some installation tools which in some cases need to replace a postgres database. For this example, assume my database name is 'fred'. Also, assume that no other processes are accessing this database. Any database access will come from my running installer.

First, after validating some passwords, my installer will run this command:

psql -c "select datname from pg_stat_activity where datname='fred'" -U username -d template1

I then scan the output looking for 'fred'. My (perhaps incorrect) assumption is that if I don't find the database name 'fred' in the output, then I can assume that no process is using this database, and I should be able to drop it.

Later, my installer runs this code:

psql -c "drop database fred" -U username -d template1

99% of the time, everything works fine. 1% of the time, the drop fails, and I get this:

ERROR: database "fred" is being accessed by other users

My two-part question is why, and what can I do about it? At a minimum, I need to be able to reliably determine if I can drop the database. Ideally, I would like to be able to drop the database even though it is still in use - force an override of some sort - kick out the offending user.

Thanks for any advice

Dan
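One way to see who the mystery user actually is the next time the drop fails (a sketch; procpid and current_query are the 8.x pg_stat_activity column names):

    psql -U username -d template1 -c "select procpid, usename, current_query, query_start from pg_stat_activity where datname='fred'"

If autovacuum is the culprit, current_query will show something like 'autovacuum: VACUUM ...'.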
Re: [GENERAL] checkpoint_segments warning?
>> That part that confused me was the "-2484". Is that just a hyphen in
>> a strange place? Or did it really calculate a negative value?
>
> Yeah, it really did. AFAICS this could only happen if the value of time(2)
> went backwards. Something fooling with your system clock?
>
> regards, tom lane

Or _someone_. Yes. Precisely. I may even know who did it ;)

Thanks,

Dan
[GENERAL] checkpoint_segments warning?
I got a warning from PostgreSQL 8.3.1 that said something to the effect of:

checkpoints are happening too frequently... (-2484 seconds)

The part that confused me was the "-2484". Is that just a hyphen in a strange place? Or did it really calculate a negative value? Is this expected?

Thanks,

Dan
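For context - an inference, not something stated in this thread - that message is governed by the checkpoint_warning parameter, which defaults to 30 seconds in 8.3:

    checkpoint_warning = 30s    # warn if xlog-driven checkpoints occur closer together than this

The negative number would be the computed seconds-since-last-checkpoint, which is what Tom concludes went backwards in the reply above.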
Re: [GENERAL] Deadlock situation?
On Wed, Apr 30, 2008 at 12:36 PM, <[EMAIL PROTECTED]> wrote:
>> In this case, Postgres had been started in the foreground on a
>> terminal, so I went to that terminal, and did a ctrl-c. Eventually,
>> postgres stopped, but the terminal wouldn't respond either - and I had
>> to close it.
>
> Just out of curiosity, could you maybe have XOFF'd the terminal? I've had
> cases in the past where Postgres got unhappy and froze when it couldn't
> write logging information to stderr. (Though, granted, in the cases I hit
> the server's stderr was redirected to a pipe that was full, so it's not
> 100% analogous)

It's certainly possible that something "interesting" happened to the terminal. We do have an issue with a KVM switch on this machine that sometimes causes all sorts of unpredictable random garbage input to hit the screen when you switch to the system.

If I can't reproduce the problem (which so far I have not been able to), I'll probably let it go, perhaps naively believing that some terminal issue set off the chain of events.

Thanks,

Dan
Re: [GENERAL] Deadlock situation?
> I wonder whether you saw some variant of the deadlock reported in bug
> #3883 --- see discussion here:
> http://archives.postgresql.org/pgsql-hackers/2008-01/msg00873.php
> The only known way of getting into that problem is fixed in 8.3,
> but a potential for silent deadlock is definitely still there.
> I don't however have any idea how such a deadlock could occur without
> at least some processes showing as 'waiting' in pg_stat_activity.
>
> Do you have any idea what the transactions that were blocked at
> COMMIT had been doing?

Should have just been selects and updates, and perhaps some inserts and deletes - nothing too out of the ordinary or complex. If I can get this to happen consistently, I'll set up a logger to capture all of the SQL statements leading up to the freeze.

> Are there any foreign key constraints leading *to* this table?

Nope.

Unfortunately, I had been running for a couple of weeks steady before I got the error - so reproducing it may not be quick. But I'm going to set up a couple of parallel tests on more systems, and see if I can happen into it again sooner.

Oh, and I don't think it should have anything to do with it, but I did discover that fsync was turned off on this postgres instance. Someone else didn't clean up after a different test, and I forgot to change it back.

Thanks,

Dan
Re: [GENERAL] Deadlock situation?
> Well, there went the evidence :-( ... but what exactly did you have
> to do to shut it down? I'm wondering whether the backends responded
> to SIGINT or SIGTERM.

Sorry :(

First, I tried issuing a kill to the pid of the vacuum process. It didn't respond to kill. Then, I tried a kill -9. It responded to that :)

However, postgres got very unhappy at this point - I tried to log in to issue another query against the pg_stat_activity table, but every command I issued at this point simply hung. In this case, Postgres had been started in the foreground on a terminal, so I went to that terminal, and did a ctrl-c. Eventually, postgres stopped, but the terminal wouldn't respond either - and I had to close it.

> Next time, it'd be good to confirm (with top or vmstat or similar)
> whether the backends are actually idle or are eating CPU or I/O.

I didn't notice any high load on the processor - the server seemed idle. But I didn't look real close at the individual postgres processes.

> Also try strace'ing a few of them; the pattern of kernel calls if
> any would be revealing.
>
> The lack of deadlock reports or 't' values in pg_stat_activity.waiting
> says that you weren't blocking on heavyweight locks. It's not
> impossible that there was a deadlock at the LWLock level, though.
>
> What sort of indexes are there on this table?

Here is the DDL for the table where vacuum was trying to run:

CREATE TABLE iphost
(
  ethernetmacaddr char(17) NOT NULL,
  cpemac char(11) NOT NULL,
  ipaddr varchar(15) NOT NULL,
  regtime timestamp NOT NULL,
  leasetime timestamp,
  last_updated timestamp NOT NULL DEFAULT now(),
  CONSTRAINT pk_iphost PRIMARY KEY (ethernetmacaddr, ipaddr),
  CONSTRAINT fk_iphost_cpe FOREIGN KEY (cpemac)
    REFERENCES cpe (cpemac) MATCH SIMPLE
    ON UPDATE CASCADE ON DELETE CASCADE
) WITHOUT OIDS;
ALTER TABLE iphost OWNER TO pslogin;
CREATE INDEX ix_iphost_cpemac ON iphost USING btree (cpemac);
CREATE INDEX ix_iphost_ipaddr ON iphost USING btree (ipaddr);

> Teodor just fixed
> an issue in GIN indexes that involved taking an unreasonable number of
> LWLocks, and if that code wasn't exposing itself to deadlock risks
> I'd be pretty surprised.
>
> regards, tom lane

Thanks. If/when I can recreate this, I'll try to gather more info.

Dan
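Per the strace suggestion above, something like this against one of the stuck backends would capture the kernel-call pattern (a sketch, assuming Linux strace; 12345 stands in for a backend pid taken from ps or pg_stat_activity):

    strace -tt -T -p 12345    # timestamps, plus time spent inside each syscall

A backend parked on a lock should sit in a single blocking call (typically a semop), while a busy-looping one shows a tight stream of syscalls.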
Re: [GENERAL] Deadlock situation?
Semantics of deadlock aside, I seem to be deadlocked, yet Postgres didn't detect any deadlock situation. There are no DDL queries running - just a lot of updates, and some inserts and deletes.

I had to restart Postgres to let things recover - so I can't gather any more info right now - but if/when it happens again, I'd like to know what else to gather.

Looking at the timestamps, the transaction start timestamp of this autovacuum query is the oldest one:

autovacuum: VACUUM public.iphost | 2008-04-24 03:05:13.212436-05

Then, between 03:05:37 and 03:05:38, nearly every other connection came to a halt. A few connections came to a halt several hours later. I'm baffled, because this autovacuum query seems to have locked the entire database.

I also don't know what the "waiting" column means in the output - but they all have the flag of "f". Does that column mean that it is waiting on a lock - t or f?

Thanks,

Dan
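For the record, the waiting column is a boolean: 't' means the backend is blocked waiting on a heavyweight lock, so all-'f' means nothing was blocked at that level. The same information can be cross-checked next time from pg_locks (a sketch, using the 8.3 catalog columns):

    select locktype, relation::regclass, mode, granted, pid
    from pg_locks
    where not granted;

An empty result while everything is hung would point at something below the heavyweight-lock level, which matches the LWLock theory discussed in the replies above.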
[GENERAL] Deadlock situation?
I have an app that we were load testing - it maintains a pool of connections to PostgreSQL 8.3.1.

Suddenly, after running fine for weeks, the app hung - unable to get a free connection from the pool.

"select * from pg_stat_activity;" shows me that most of my connections are in a COMMIT phase:

16385 | ispaaa | 953 | 16384 | pslogin | COMMIT | f | 2008-04-24 03:05:37.73064-05 | 2008-04-24 03:05:38.419796-05 | 2008-04-24 02:11:53.908518-05 | 127.0.0.1 | 53807

While some are in a SELECT:

16385 | ispaaa | 1181 | 16384 | pslogin | select dynamichos0_.ethernetmacaddr as ethernet1_0_, dynamichos0_.ipaddr as ipaddr0_, dynamichos0_.cpemac as cpemac0_, dynamichos0_.regtime as regtime0_, dynamichos0_.leasetime as leasetime0_, dynamichos0_.last_updated as last5_0_ from iphost dynamichos0_, cpe cpe1_ where dynamichos0_.cpemac=cpe1_.cpemac and 1=1 and dynamichos0_.ethernetmacaddr=$1 and dynamichos0_.cpemac=$2 and cpe1_.regBaseId=$3 and dynamichos0_.ipaddr<>$4 | f | 2008-04-24 03:05:37.734041-05 | 2008-04-24 03:05:38.405854-05 | 2008-04-24 02:41:54.413337-05 | 127.0.0.1 | 55363

Perhaps VACUUM had something to do with it?:

16385 | ispaaa | 8956 | 16384 | pslogin | delete from iphost where leasetime<$1 | f | 2008-04-24 18:43:29.920069-05 | 2008-04-24 18:43:30.116435-05 | 2008-04-24 18:41:59.071032-05 | 127.0.0.1 | 49069
16385 | ispaaa | 1618 | 10 | postgres | autovacuum: VACUUM public.iphost | f | 2008-04-24 03:05:13.212436-05 | 2008-04-24 03:05:13.212436-05 | 2008-04-24 03:05:12.526611-05 | |

Where should I begin to look for the source of this problem?

Thanks for any info,

Dan
Re: [GENERAL] errors in pg_restore on windows?
I didn't do anything specific to load those functions into template1.

I was under the impression that that particular debugger wasn't supposed to be on by default on Windows - and that it was an installer oversight - and that you may turn it off by default for future builds of the Windows installer. But that seems to lead pg_restore to report errors - which would certainly scare my users.

Perhaps I didn't disable it properly. If you're not concerned about it, I won't be either. I'll just stop turning off that debug flag. I guess it's all a moot point if no changes were planned to the Windows installer WRT that debug setting.

Thanks,

Dan

On Fri, Feb 29, 2008 at 11:05 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Dan Armbrust" <[EMAIL PROTECTED]> writes:
>> If I disable that debugger library in the config file, create a
>> database, backup the database, and then restore the database, I get
>> many errors - all similar to the one I posted before.
>
> Perhaps you've got those functions loaded into template1?
>
>> Are you saying they are really just warnings?
>
> Well, they're errors in the sense that those functions didn't get
> restored. But if you're trying to remove the plpgsql debugger,
> I think that's what you want.
>
> regards, tom lane
Re: [GENERAL] errors in pg_restore on windows?
Sorry, I don't follow.

If I disable that debugger library in the config file, create a database, back up the database, and then restore the database, I get many errors - all similar to the one I posted before. They are reported at the end of the restore as errors.

Are you saying they are really just warnings?

Thanks,

Dan

On Fri, Feb 29, 2008 at 10:19 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Dan Armbrust" <[EMAIL PROTECTED]> writes:
>> It turns out that commenting out the "shared_preload_libraries" in
>> the postgresql.conf file breaks pg_restore.
>
> It hardly "breaks" anything. The errors you show just indicate that you
> had the plpgsql debugger loaded into your database.
>
> regards, tom lane
Re: [GENERAL] errors in pg_restore on windows?
Ok, it turns out that the fact that my pg_restore won't work on windows is directly related to this (from a couple days ago) "Dan Armbrust" <[EMAIL PROTECTED]> writes: > To follow up on my old thread - > I tested another install of Postgres 8.3.0.1 - and on windows, the > postgresql.conf file has this in it: > shared_preload_libraries = '$libdir/plugins/plugin_debugger.dll' > # > (change requires restart) > I got the impression that that debugger wasn't intended to be enabled, > by default. It looks like the windows installer is enabling it, > however. It turns out, that commenting out the "shared_preload_libraries" in the postgresql.conf file breaks pg_restore. So, perhaps that shouldn't be disabled by default in the installer. Dan On Thu, Feb 28, 2008 at 5:13 PM, Dan Armbrust <[EMAIL PROTECTED]> wrote: > Using PostgreSQL 8.3 on Windows, I make a backup like this: > pg_dump -U postgres -C -d -D -Fc -f ispaaa-pg.bak ispaaa > > Then restore like this: > pg_restore -U postgres -C -d template1 -v -Fc ispaaa-pg.bak > > And I get lots of these errors: > > > pg_restore: creating TABLE voip > pg_restore: creating FUNCTION pldbg_abort_target(integer) > pg_restore: [archiver (db)] Error while PROCESSING TOC: > pg_restore: [archiver (db)] Error from TOC entry 21; 1255 16419 > FUNCTION pldbg_abort_target(integer) postgres > pg_restore: [archiver (db)] could not execute query: ERROR: could not > load library "C:/Program Files (x86)/PostgreSQL/8.3/lib/pldbgapi.dll": > The specified module could not be found. > > Command was: CREATE FUNCTION pldbg_abort_target(session integer) RETURNS > SET > OF boolean > AS '$libdir/pldbgapi', 'pldbg_abort_target' > ... > pg_restore: [archiver (db)] could not execute query: ERROR: function > public.pld > bg_abort_target(integer) does not exist > Command was: ALTER FUNCTION public.pldbg_abort_target(session integer) > OWNER > TO postgres; > > > It looks like a packaging issue with the windows installer - these > commands work fine for me on Linux. > > Thanks, > > Dan > ---(end of broadcast)--- TIP 6: explain analyze is your friend
[GENERAL] errors in pg_restore on windows?
Using PostgreSQL 8.3 on Windows, I make a backup like this: pg_dump -U postgres -C -d -D -Fc -f ispaaa-pg.bak ispaaa Then restore like this: pg_restore -U postgres -C -d template1 -v -Fc ispaaa-pg.bak And I get lots of these errors: pg_restore: creating TABLE voip pg_restore: creating FUNCTION pldbg_abort_target(integer) pg_restore: [archiver (db)] Error while PROCESSING TOC: pg_restore: [archiver (db)] Error from TOC entry 21; 1255 16419 FUNCTION pldbg_abort_target(integer) postgres pg_restore: [archiver (db)] could not execute query: ERROR: could not load library "C:/Program Files (x86)/PostgreSQL/8.3/lib/pldbgapi.dll": The specified module could not be found. Command was: CREATE FUNCTION pldbg_abort_target(session integer) RETURNS SET OF boolean AS '$libdir/pldbgapi', 'pldbg_abort_target' ... pg_restore: [archiver (db)] could not execute query: ERROR: function public.pld bg_abort_target(integer) does not exist Command was: ALTER FUNCTION public.pldbg_abort_target(session integer) OWNER TO postgres; It looks like a packaging issue with the windows installer - these commands work fine for me on Linux. Thanks, Dan ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
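A workaround that avoids the errors entirely, for anyone else who trips over this: pg_restore can restore from an edited table-of-contents list, so the pldbg entries can simply be skipped. A sketch, using the file names from above (grep becomes "findstr /v" on Windows):

  pg_restore -l ispaaa-pg.bak | grep -v pldbg > restore.list
  pg_restore -U postgres -C -d template1 -L restore.list ispaaa-pg.bak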
[GENERAL] Windows 8.3 installer leaves a debug flag on?
To follow up on my old thread - I tested another install of Postgres 8.3.0.1 - and on windows, the postgresql.conf file has this in it: shared_preload_libraries = '$libdir/plugins/plugin_debugger.dll' # (change requires restart) I got the impression that that debugger wasn't intended to be enabled, by default. It looks like the windows installer is enabling it, however. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] performance issues on windows with 8.3.0?
Thanks for all the help. Performance is back where I thought it should be, after I fixed our pooling bug. I didn't think that postgres would be released with performance issues like that - it's just too good :) Thanks, Dan ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] performance issues on windows with 8.3.0?
So, my ill Postgres 8.3 database is filling up log files in the pg_log directory with the following:

2008-02-13 18:29:31 CST LOG: loaded library "$libdir/plugins/plugin_debugger.dll"
2008-02-13 18:29:31 CST LOG: loaded library "$libdir/plugins/plugin_debugger.dll"
2008-02-13 18:29:31 CST LOG: loaded library "$libdir/plugins/plugin_debugger.dll"
2008-02-13 18:29:31 CST LOG: loaded library "$libdir/plugins/plugin_debugger.dll"
2008-02-13 18:29:31 CST LOG: loaded library "$libdir/plugins/plugin_debugger.dll"
2008-02-13 18:29:31 CST LOG: loaded library "$libdir/plugins/plugin_debugger.dll"
2008-02-13 18:29:31 CST LOG: loaded library "$libdir/plugins/plugin_debugger.dll"
2008-02-13 18:29:31 CST LOG: loaded library "$libdir/plugins/plugin_debugger.dll"
2008-02-13 18:29:31 CST LOG: loaded library "$libdir/plugins/plugin_debugger.dll"
2008-02-13 18:29:31 CST LOG: loaded library "$libdir/plugins/plugin_debugger.dll"
2008-02-13 18:29:31 CST LOG: loaded library "$libdir/plugins/plugin_debugger.dll"

Does this mean anything to anyone? I can hardly believe that I'm the first one to notice that the windows build of 8.3 has serious issues. Thanks, Dan ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[GENERAL] performance issues on windows with 8.3.0?
Are there any known regression issues WRT performance on the 8.3.0.1 binary build for windows? And I mean serious -multiple orders of magnitude- performance issues running simple queries on a small database... A little more background. I built 8.3.0 on Cent OS 5 today. Started using it with a server application that I have to do some benchmarking. Java application connecting via jdbc to Postgres on localhost. Everything looked great. Perhaps even faster than 8.2 - but I don't have the benchmarks to say for sure yet. Then, I wanted to test what is usually our fastest performing setup - with the database on a second system. So, I installed the new 8.3 on a Windows 2003 system, started it up, created my database, pointed my java app to it and everything seemed good. Then I put a load on it - and my server thru-put went from about 1500 messages per second on the linux-localhost combination to 30 on the remote windows postgres installation. I will be investigating this much more completely first thing tomorrow - but I figured I'd ask to see if there was something obvious first... Thanks, Dan ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[GENERAL] Data conversion tools?
I'm trying to convert a database from either MS Access or MySQL into Postgres. I have found a couple of tools that will almost do what I want - but not quite. To make things match up with code that is already written - I need to have all of the table names and column names converted to lower case during the conversion process.

I have found a couple of (free or demo) tools that will do it - navicat has a tool that will convert from MSAccess to Postgres - however, it doesn't offer a feature to lowercase all of the table names and column names. DB Tools Manager Professional will do it as well - but again, no way to tell it to lowercase things in the process. PGAdmin II had a tool that would do this - but alas - that part of PGAdmin was never brought into PGAdmin III. And the last version of PGAdmin II that I was able to find wouldn't run against my Postgres DB.

Short of installing an older Postgres DB that PGAdmin II will work with - does anyone else know of a tool that can do what I want? Is there a script of some sort that I can run that would go through a set of tables and lowercase everything?

Thanks, Dan -- Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
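One script-ish way to do the lowercasing after the data is across is to generate ALTER statements from the system catalogs and feed them back into psql. A sketch that assumes everything lives in the public schema - run (and apply) the column statements before the table statements, since both are generated against the current names:

  select 'ALTER TABLE "' || c.relname || '" RENAME COLUMN "' || a.attname
         || '" TO "' || lower(a.attname) || '";'
  from pg_attribute a
  join pg_class c on c.oid = a.attrelid
  join pg_namespace n on n.oid = c.relnamespace
  where n.nspname = 'public' and c.relkind = 'r'
    and a.attnum > 0 and not a.attisdropped
    and a.attname <> lower(a.attname);

  select 'ALTER TABLE "' || c.relname || '" RENAME TO "'
         || lower(c.relname) || '";'
  from pg_class c
  join pg_namespace n on n.oid = c.relnamespace
  where n.nspname = 'public' and c.relkind = 'r'
    and c.relname <> lower(c.relname);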
Re: [GENERAL] GUI Interface
Tony Caduto wrote: Jim Wilson wrote: Those Windows developer tools must be working really well How right you are Jim, far superior and more productive than anything available on Linux at this time, though the netbeans IDE comes close, too bad Java is slow for almost all desktop applications. One example is LimeWire, while it works well, it takes forever to load and the UI seems a little sluggish compared to a native compiled application.

You live in an interesting world... meanwhile, I'm here in the real world, using Eclipse - the best IDE I've ever used to develop Java applications. Oh, wait, Eclipse is written in Java? I didn't think it was possible to write good apps in Java? Certainly better than visual studio (and yes, I have to use both - eclipse is a far better IDE in my opinion). Oh, and I can run eclipse on my linux desktop as well as my windows desktop? That's just cool. Oh, and my next machine? Definitely a Mac. And it works there too - just like all of my SWT apps do.

I will admit, it is rather easy to write slow Java swing applications. There are a lot of poor ones out there. It's a shame that Sun botched swing so badly, and has never repaired it properly. It's not impossible to write fast, responsive apps in swing, it just takes skilled developers. And it's a pain. However, now with the emergence of SWT and modern JVMs - there is no reason for your Java GUI to be any slower than anything else. The only excuse for a slow Java app these days is the quality of the code that it is built with. And you can write a bad, slow app in any language.

Oh, and to get back on topic a bit more - DBVisualizer is also a nice database GUI (written in Java, by the way) that is very fast, responsive, and cross platform. http://www.minq.se/products/dbvis/ It's also rather handy if you have to deal with 5 different types of databases on a daily basis, since it handles them all.

Dan -- Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[GENERAL] Problem dropping a table
I have a java application that is trying to dynamically drop a set of tables. Problem is, when it gets to a particular table and I execute the "drop table foo cascade" command from a prepared statement, the query never returns. It just hangs indefinitely. I presume that it is waiting on a lock for this table, but there is nothing that I know of that should be locking this table.

So, I have two issues - the first, how do I prevent myself from hanging indefinitely? I tried 'setQueryTimeout' on the prepared statement, but it doesn't seem to have any effect. Still hangs indefinitely. The second, how can I track down what is locking this table? I presume that it is some of my code somewhere... maybe a prepared statement that I didn't get closed - but I can't find it. I've checked my code twice for any reference to this table, and every use of it is properly closing the result sets and the prepared statement.

Any ideas? Thanks, Dan -- Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
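Two things that usually help in this situation - both sketches, assuming an 8.x-era server (procpid and current_query are the old column names) and a table literally named foo:

  -- who currently holds or wants locks on the table?
  select l.pid, l.mode, l.granted, a.current_query
  from pg_locks l
  join pg_stat_activity a on a.procpid = l.pid
  where l.relation = 'foo'::regclass;

  -- make the drop fail fast instead of hanging forever
  set statement_timeout = 5000;   -- milliseconds
  drop table foo cascade;

An idle connection still holding an open transaction (for example, an unclosed or uncommitted statement) will show up in the first query as the pid holding a granted lock on the table.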
Re: [GENERAL] Create a new database from JDBC?
Dave Page wrote: On 8/2/06 17:26, "Joshua D. Drake" <[EMAIL PROTECTED]> wrote: Certainly not what I want, since that database doesn't exist. Is there a system database I could always count on being available that I could connect to? template1 but only if you allow it from pg_hba.conf. The postgres database would be more appropriate on 8.1+ (it is specifically intended for that kind of thing), however like template1, it can be dropped, though that isn't advisable without good reason. Regards, Dave.

So there is no guaranteed way to do this, unless there is a known, existing database? Seems like a shortcoming, to me. Would this be considered as a feature request, or does the current design make it too difficult to implement?

Dan -- Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[GENERAL] Create a new database from JDBC?
How do I make a connection to a Postgresql database that (potentially) has 0 databases from jdbc, so that I can issue a create database command? In that _other_ database, I can make a jdbc connection to an address like this: jdbc:mysql://servername/ And then issue a "Create database foo" command. Then I can reconnect to jdbc:mysql://servername/foo And everything is good. If I try to connect to a postgresql database like this: jdbc:postgresql://servername/ Postgres attempts to connect to a database that is named the same as the username I provided. Certainly not what I want, since that database doesn't exist. Is there a system database I could always count on being available that I could connect to? Thanks, Dan -- Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
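For the archives, a sketch of the JDBC side, assuming the maintenance database is named postgres (8.1+; template1 works the same way) and placeholder host/credentials. CREATE DATABASE can't run inside a transaction block, so the driver's default auto-commit mode is exactly what is needed:

  import java.sql.Connection;
  import java.sql.DriverManager;
  import java.sql.Statement;

  public class CreateDbSketch {
      public static void main(String[] args) throws Exception {
          // connect to a database that is expected to always exist
          Connection admin = DriverManager.getConnection(
                  "jdbc:postgresql://servername/postgres", "postgres", "secret");
          Statement st = admin.createStatement();
          st.execute("CREATE DATABASE foo");  // runs in auto-commit, as required
          st.close();
          admin.close();

          // now reconnect to the newly created database
          Connection c = DriverManager.getConnection(
                  "jdbc:postgresql://servername/foo", "postgres", "secret");
          c.close();
      }
  }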
[GENERAL] Specify schema in jdbc connection string?
Is there any way to specify which schema I want to use (or have at the front of the search path) within the jdbc connection string? I've seen several people asking for the feature in the archives, one person saying they wrote it but the patch was rejected, but no authoritative answers... Thanks, Dan -- Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
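Short of a driver parameter, two workarounds exist (a sketch; "me" and "myschema" are placeholders). The first pins the default at the role level so every login gets it; the second can be issued through a plain JDBC Statement right after connecting (or from a pool's connection-init hook):

  ALTER USER me SET search_path = myschema, public;

  SET search_path TO myschema, public;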
Re: [GENERAL] fine tuned database dump/reload?
Jim C. Nasby wrote: > pg_dump handles table ordering properly. > Maybe I missed something then, because it didn't last time I tried to move some data. I had to drop my foreign keys before I could reload it. Dan -- Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] fine tuned database dump/reload?
Jim C. Nasby wrote: Since no one else has mentioned it, there has been discussion this week on one of the lists (probably -general or -hackers) about expanding the capabilities of pg_dump. I've advocated for allowing a file that specifies what objects to dump and what kind of filtering to apply to the name. Allowing for only dumping certain rows in a table (applying a WHERE clause) would be a neat addition to that. That would be exactly what I need - the ability to apply a where clause. That - and the ability to specify multiple tables to dump, instead of just one. Because currently, I still get into foreign key trouble in this scenario: I drop data (and only data) from database A. This data has foreign key constraints. I want to load this data into database B - where the tables already exist, they already have foreign keys, and there is some existing data. My dump from database A won't load - because the tables were dumped in the wrong order to satisfy the foreign keys on the reload - so the only way I can load this data back into an existing database is to dump each table individually, and reload in the proper order, or manually drop my foreign keys while I load the data. Both ways are a pain - and it makes it overly difficult to export/import chunks of data from a database. Dan -- Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] fine tuned database dump/reload?
My use case is not so much for database backup purposes as it is for fine grained export and import. Our database schema consists of 15 tables that represent a terminology. Each database can hold multiple terminologies - every table has a terminologyId column which distinguishes one terminology from another. I now realize that I don't need to do anything special with individual tables since PostgreSQL handles foreign keys properly - so that question is void - I'm fine with working with a whole database at a time.

However, quite often I will load up a terminology on our development or test server (which also contains many other very large terminologies). When I am satisfied that it is ready, I want to put this terminology onto our production servers. But I don't want to have to export millions of rows that I don't need. With MySQL, the ability to do a dump of data which satisfies a where clause (for example: terminologyId='foo') gives me the ability to dump the data out to a file very quickly - move the file to the server it needs to be loaded on, and then re-load that data into the production database.

In PostgreSQL, the only way that I see to do this is to have my desired data in a database all by itself, so that I can use pg_dump to backup the entire database. Then I can load that into the existing production database. Is there a better way to do this? Is there a flag I could specify for psql that would cause it to output INSERT or COPY statements as a result of a query - select * from foo where terminologyId='foo'? Then I could just have 15 select statements batched up in a file, and pipe the output into a new file.

I suppose this is kind of an obscure use case - but a flag on pg_dump where I could specify a where condition would certainly be handy. Thanks, Dan -- Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
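The nearest thing I've found to a WHERE-clause dump is staging the rows into a scratch table and dumping that - a sketch with made-up names, one scratch table per real table, applied on the production side in FK-safe order:

  -- on the dev server
  CREATE TABLE staging_concepts AS
      SELECT * FROM concepts WHERE terminologyid = 'foo';

then from the shell:

  pg_dump -t staging_concepts devdb > staging_concepts.sql
  psql proddb -f staging_concepts.sql

and finally, on the production side:

  INSERT INTO concepts SELECT * FROM staging_concepts;
  DROP TABLE staging_concepts;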
Re: [GENERAL] fine tuned database dump/reload?
Dan Armbrust wrote: Bricklen Anderson wrote: pg_dump --help ... -n, --schema=SCHEMA dump the named schema only -s, --schema-only dump only the schema, no data -t, --table=TABLE dump the named table only ... Just after I sent the e-mail, I realized that I forgot that I saw that option - but on my first reading of the description (and being an annoying user who didn't try it first) it didn't appear that it would allow me to specify more than one table. But I tested it, and it does let me specify the -t command more than once. So that gets me past that issue. Dan

Now I'm just filling the mailing list with misinformation. It actually ignores all but the last -t flag - so this only allows me to specify one table at a time, rather than several tables. I need to write up my use case so the maintainers can see why I want to be able to dump things in such a specific way - it's not for backup purposes - so I'm kind of misusing the intent of the tool. More info in a bit.

Dan -- Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
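Since only the last -t wins, the workaround is one pg_dump invocation per table, concatenated in an FK-safe order - a sketch with placeholder names:

  pg_dump -s mydb > dump.sql                # schema once
  for t in parenttable childtable grandchildtable; do
      pg_dump -a -t $t mydb >> dump.sql     # data, parent tables first
  done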
Re: [GENERAL] fine tuned database dump/reload?
Bricklen Anderson wrote: pg_dump --help ... -n, --schema=SCHEMA dump the named schema only -s, --schema-only dump only the schema, no data -t, --table=TABLE dump the named table only ... Just after I sent the e-mail, I realized that I forgot that I saw that option - but on my first reading of the description (and being an annoying user who didn't try it first) it didn't appear that it would allow me to specify more than one table. But I tested it, and it does let me specify the -t command more than once. So that gets me past that issue. Dan -- Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[GENERAL] fine tuned database dump/reload?
Does postgresql have any facility to dump anything more fine grained than a database to a text file? For example, to mention a bad word, MySQL's dump command allows you to specify individual tables to dump - and not only that - you can also specify a SQL query that must be satisfied for a row to make it into the dump file - this is extremely useful for the way that we use our database. Also - how does postgresql handle foreign keys if you dump an entire database along with the create table statements? Because my foreign keys require that the tables be created in a certain order - MySQL fails on this front - but since it allows me to specify the tables to dump, and it dumps them in that order, I can specify the correct order. PostgreSQL's pg_dump command seems rather limited in its abilities. Maybe I'm missing the command I'm looking for. Thanks, Dan -- Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] Oracle buys Innobase
Greg Sabino Mullane wrote: The other answer may be the license: plugging PG into the MySQL system (which is about as technically feasible as trying to breed a porpoise and an elephant) keeps MySQL GPL, which is another reason many people like it.

The fact that PostgreSQL is NOT released under GPL is the reason that people like me are here - MySQL's license drove us away from them. Their change of the driver licensing prevents us from shipping new drivers with our applications. GPL is a poison pill when it comes to groups like us that are trying to develop standards (and shared code bases) that can be used by both opensource and corporate types alike. So keep up the good work!

Dan -- Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] PostgreSQL 8.1 vs. MySQL 5.0?
IBM have previously bought Informix (who bought Illustra, RedBrick, Cloudscape) None of those takeovers has led to a product actually surviving.

That's not exactly true - Cloudscape was just given to Apache, and is now opensourced under the name "Derby" http://db.apache.org/derby/ Suddenly, Hypersonic SQL http://www.hsqldb.org/ (which also works wonderfully for small databases - nobody would claim that these can scale like PostgreSQL) has a bunch of competition.

Dan -- Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[GENERAL] [Off Topic] Swing vs SWT
Michael Schmidt wrote: Mr. Armbrust, Hopefully, this isn't off topic. Just wondering why SWT instead of swing? Michael Schmidt

I'm pretty sure this is completely off topic, but I'll indulge it - I've done development using Swing and SWT - and I find SWT's APIs to be much cleaner, intuitive, and easy to use. Manual GUI development is much quicker with SWT and results in much less code that is far more readable. Also, SWT layout managers are _much_ easier to use than the crud that comes with Swing from Sun.

Side note - if you do have to do Swing development, I highly recommend using the JGoodies forms package - it is a layout manager for Swing that is far and away better than trying to do things with Swing layout managers - http://www.jgoodies.com/freeware/forms/index.html As an example, I rewrote some forms that were 800 lines of code using GridBagLayout into about 150 lines of code using JGoodies. And the JGoodies code is far more readable.

Back to SWT - Swing still just looks like crud when compared side-by-side with native Windows applications. Especially in windows XP - Sun still doesn't have the Windows XP Look and Feel implemented properly. And if a user applies a custom XP skin, then it really looks strange. The file choosers don't look or work like native Windows XP file choosers. And when windows "super duper 5.0" comes out next year, I'm sure it will only take Sun another 3 years or so to write the new Look and Feel. Meanwhile, your apps just look weird. SWT apps _always_ look exactly like the OS look and feel - because they use the native OS widgets. You can also embed Swing parts into a SWT application if you need to.

When I write a SWT application, most of my users don't even know it is a Java application. It just looks like a normal "windows" application. You can also launch SWT apps using Java WebStart, or, if you don't want to use webstart, you combine your app, SWT, and http://jsmooth.sourceforge.net/ and suddenly you have a native windows executable for your users. All of Java's ugly bits are safely hidden away from your end user.

If you want to compare a Swing application (written using JGoodies looks and JGoodies look and feel - which does a better job than Sun's look and feel at looking like Windows XP) side by side with a similar SWT application - take a look at these two apps: (scroll down to the webstart links) (ps - these apps aren't supposed to be masterpieces of user interface design - they are just utilities for techie types using LexGrid) http://informatics.mayo.edu/LexGrid/index.php?page=convert http://informatics.mayo.edu/LexGrid/index.php?page=indexer Especially pay attention to the speed and appearance of the file choosers (click the "Browse..." button) - to enable the second file chooser in the Indexer Creator app - go to the "Options" menu and choose "Build Normalized Index"

Another awesome gui that is written in SWT (and most end users don't even know it's written in Java) http://azureus.sourceforge.net/

And a couple more - to finish with a little bit of relevance to PostgreSQL: A database designer plugin for Eclipse (which is all written in SWT) - supports PostgreSQL: http://www.azzurri.jp/en/software/clay/index.jsp Another database plugin for Eclipse that lets you work with a PostgreSQL database. http://quantum.sourceforge.net/screen_shots.html

Dan Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/
Re: [GENERAL] What Front-End you suggest ?
Gerson - Yahoo wrote: Dear friends, I need to develop a system natively on Linux - language and database. Today I use PostgreSQL with Delphi 2005 on Windows 2005 Servers, and I need to know what language to use to do this. What do you suggest? Which Linux OS does this language work well on? Does this language have a development IDE (like Delphi 2005)? Where can I download this compiler to test? Thanks for the help. Gerson

Java... And Eclipse for your IDE. And using SWT rather than Swing for your GUI. Dan -- Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/
Re: [GENERAL] 5 new entries for FAQ
Bruce Momjian wrote: I have updated the FAQ to handle three of the items you mentioned. In one case, I added a new FAQ entry (double-quoting identifiers with a link to our docs), and in two other cases (indexing long columns, case-insensitive columns) I added to existing FAQ items where appropriate. Patch attached and applied. For the batch job and single-CPU issues, they did not fit into existing FAQ entries, and I am not sure they are asked enough to be added as FAQs. I am interested to hear other's opinions on this.

*** 613,618 ****
--- 617,628 ----
  The maximum table size and maximum number of columns can be quadrupled by increasing the default block size to 32k.
+ One limitation is that indexes can not be created on columns longer than about 2,000 characters. Fortunately, such indexes are rarely needed. Uniqueness is best guaranteed using another column that is an MD5 hash of the long column, and full text indexing allows for searching of words within the column.
  4.5) How much database disk space is required to store data from a typical text file?

I liked the FAQ entry written up by the original submitter much better, WRT the long index issue. This write up does not contain the actual error message encountered, so it's not going to be found by someone looking for a solution to the problem. It doesn't contain the optional workaround, and it doesn't have a link into the full text searching features as the recommended fix. I was one of the more recent people to encounter this problem, and I don't think that this FAQ would have helped me (if I even saw it, because google certainly won't pick it up in a way that most people will search when they encounter the issue - they are going to put in the error message)

Dan -- Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/
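To make the FAQ's MD5 suggestion concrete, here is a sketch using the table from my earlier thread (the index name is made up; expression indexes of this form need 7.4 or later):

  CREATE UNIQUE INDEX conceptproperty_value_md5
      ON conceptproperty (md5(propertyvalue));

Every index entry is then a fixed 32 characters, so uniqueness can be enforced no matter how long the underlying value is.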
Re: [GENERAL] psqsl -> remote db
[EMAIL PROTECTED] wrote: First, make sure the PostgreSQL database is listening for TCP/IP connections on the computer you're trying to connect to. On Mac OS X, run this command in a terminal, i.e. Terminal.app: netstat -l You should see a line like this somewhere in the output: tcp6 0 0 *.5432 *.* LISTEN Second, the database server's pg_hba.conf likely needs to be configured to accept connections from remote users. This can be done to restrict connections from specific hosts (IP addresses), which isn't terribly practical, or by user and encrypted password. Search the PostgreSQL document site for "pg_hba.conf" for info. Third, if the database server is listening, and it is configured to allow you to connect, try connecting with psql: psql -h myhost.com -U me mydb "myhost.com" can be an IP address. Type "man psql" to see what the options mean. I hope this helps. Jim

I'm sorry, I'm sure I should be asking this someplace more general, but apparently the word 'port' is used in so many different contexts that a google search turns up fruitless ;) I'm just trying to access a remote postgresql database on port 5432, which does not appear to be open on the remote computer. I'm on a MacOSX and cannot figure out for the life of me how to open up this port on the computer. This is probably a unix question if anything, but any help would be appreciated. -Andrew

I'm not a Mac user, but don't they have a firewall by default? Maybe useful? http://cfm.gs.washington.edu/security/firewall/osx-10.1-ipfw/ -- Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
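For the archives, the two server-side settings usually involved, sketched with placeholder names and addresses (listen_addresses is the 8.0+ name and needs a restart; pg_hba.conf changes just need a reload):

  # postgresql.conf - listen on more than localhost
  listen_addresses = '*'

  # pg_hba.conf - let user "me" reach database "mydb" from one subnet
  host    mydb    me    192.168.1.0/24    md5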
Re: [GENERAL] [BUGS] BUG #1552 followup
Jaime Casanova wrote: On 8/2/05, Dan Armbrust <[EMAIL PROTECTED]> wrote: I've been trying to track down some performance problems that I am having doing large inserts on tables with foreign keys. I'm 99% sure that the issue I am having is BUG 1552 - http://archives.postgresql.org/pgsql-bugs/2005-03/msg00183.php When I read the whole thread, I got the impression that a fix (or maybe a hack) had been put into the 8.0 branch. I'm running 8.0.3, and I'm seeing the issue. I can go into more details if desired... So my questions are: 1) Would that fix have made it into the 8.0.3 release - alternatively, how can I figure that out myself? 2) Is there a bug tracker somewhere, or do you just track bugs on the mailing list? I haven't been able to find anything on your site. Thanks, Dan

What part of the discussion gives you that impression... I understood that Tom will (in the future) see a fix when Neil finishes the framework to replan prepared statements (or is it rebuilding cached plans where the table size has grown significantly?) PD: This is not the list for this discussion... I am forwarding you to another (randomly chosen) list.

This posting seems to indicate that there was a change made: http://archives.postgresql.org/pgsql-bugs/2005-03/msg00183.php What is the proper way to discuss bugs? The only thing I have found on the website is the link for reporting new bugs, (which just sends an e-mail to the bugs mailing list?) Since I didn't see any bug tracker software, I figured (wrongly) that you must just discuss bugs on the bugs list...

Dan -- Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/
Force PostgreSQL to use indexes on foreign key lookups - Was: [GENERAL] Slow Inserts on 1 table?
An aha moment: http://archives.postgresql.org/pgsql-bugs/2005-03/msg00183.php Some of the ensuing conversation seemed to indicate that a change was made in the 8.0 branch in March, that was intended to fix this issue. Any idea if that fix would have made it into the 8.0.3 release? Or maybe the fix didn't fix the issue. Dan -- Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Slow Inserts on 1 table?
Alvaro Herrera wrote: On Tue, Aug 02, 2005 at 10:01:50AM -0500, Dan Armbrust wrote: I shouldn't have to manually run Analyze to make the DB be capable of handling inserts involving tables with foreign keys correctly. My code that is doing the inserts is a java application that works across multiple DBS - MySQL, PostgreSQL, DB2, MS Access, Sybase, etc. [etc, rant removed] You don't _have_ to be rude. This is a known limitation, and people have suggested the usual workarounds. This is an open source project -- if you think you can make it better, please by all means post a patch. HAND. My apologies, I don't intend to be rude. But it is rather easy to get into rant mode when the prevailing opinion is that not being able to insert rows into a table with a foreign key without running Analyze after X rows is a misuse of the DB, rather than a bug. I did not know that this is a known limitation, I have not been able to find any documentation that talks about how foreign keys, indexes, and the query planner relate. My first assumption was that since foreign key creation implicitly creates the necessary indexes, that these indexes would always be used for foreign key checks. I wouldn't have even guessed that the query planner was involved in this portion. But, these are all (apparently wrong) guesses - I don't know the internals. The performance probably is better on small tables to not use these indexes. But it seems to me, that if you know that the statistics are out of date (which I would think that you should know, if analyze hasn't been run since the tables were created) that the safer choice would be to use the indexes, rather than not using the indexes. Dan -- Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/
Re: [GENERAL] Slow Inserts on 1 table?
My loading is done programmatically, from another format, so COPY is not an option.

Why not? A lot of my bulk-loads are generated from other systems and I go through a temporary-file/pipe via COPY when I can. When I don't I block inserts into groups of e.g. 1000 and stick in an analyse/etc as required.

I guess I should clarify - my inserts are done by a Java application running on a client machine. This isn't bulk load in the normal definition. I don't have any problem with the speed of the inserts when they are working correctly. The only problem is that the query analyzer is making a really poor decision when it is executing insert statements on tables that have foreign keys.

So run ANALYSE in parallel with your load, or break the bulk-load into blocks and analyse in-line. I'm not sure ripping out PG's cost-based query analyser will be a popular solution just to address bulk-loads.

I never suggested that it needed to be ripped out. It just seems that when it is looking to check foreign keys, and the statistics are not up to date (or have not yet been created) it should default to using the indexes, rather than not using the indexes. The time savings of using indexes when things are big is FAR bigger than the time savings of not using indexes when things are small.

Dan -- Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
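For completeness, the mechanics of the drop/restore suggestion, using the constraint and table names from the schema posted elsewhere in this thread - re-adding the constraint validates all rows in one pass, which is usually far cheaper than a per-row check:

  ALTER TABLE conceptpropertymultiattributes DROP CONSTRAINT f;

  -- ... bulk inserts ...

  ALTER TABLE conceptpropertymultiattributes
      ADD CONSTRAINT f FOREIGN KEY (codingschemename, conceptcode, propertyid)
      REFERENCES conceptproperty (codingschemename, conceptcode, propertyid);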
Re: [GENERAL] Slow Inserts on 1 table?
Joshua D. Drake wrote: Why can't postgres compile some rough statistics on tables without running analyze? Why can't you just run analyze? You don't have to empty the tables to do so and you can alter the statistics on the fly. Heck you can even run analyze while doing the inserts. I shouldn't have to manually run Analyze to make the DB be capable of handling inserts involving tables with foreign keys correctly. My code that is doing the inserts is a java application that works across multiple DBS - MySQL, PostgreSQL, DB2, MS Access, Sybase, etc. I shouldn't have to put custom code into it just to make postgres deal with inserts properly. No other database that I insert data into has problems like this. This will look really nice in the instructions for my data loader - * - If you are using PostgreSQL for your database server, it has a bug that causes its performance to become abysmal unless you manually run this "Analyze" command a little while after you start the load process. I will have users that don't even know what a database is, much less have to go out and run manual sysadmin level commands on it to make the thing work. I have already had to disable sequential scans, since the planner is almost _always_ wrong in deciding whether or not to use an index. Sounds again like you are not setting up your statistics correctly or running analyze as and when it should. Doesn't matter if the statistics are perfectly up to date. It still doesn't use the indexes. If the default statistics are so poorly set up that the planner thinks a 5 million row table scan will be quicker for a query that is doing exact matches on indexed columns, I would say it is poorly implemented. So I just disabled that "feature". And it works fine with sequential scans disabled - I have no problem with it in this respect, since I can turn it off. Is there any way that I can disable sequential scans for foreign key checks? Dan -- Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] Slow Inserts on 1 table?
What, ALWAYS faster, even for the first FK check when there's only one row in the target table and that's cached? If you're really in a hurry doing your bulk loads: 1. Use COPY. 2. Drop/restore the foreign-key constraints before/after. That will be hugely faster than INSERTs, although it's not always an applicable solution. -- Richard Huxton Archonet Ltd

It seems like the query planner goes to great lengths to avoid using indexes because it might take 5 ms longer to execute an index lookup on a table with one row. But then, when the table has 1 million rows, and a full scan takes 3 minutes, and the index scan takes 3 seconds, it has no problem picking the 3 minute route. I'll gladly give up the 5 ms in turn for not having to wait 3 minutes, which is why I disabled the sequential scans. If I have a small table, where indexes won't speed things up, I won't build an index on it. The other factor is that most of my tables have at least thousands, and usually millions of rows. Sequential scans will never be faster for the queries that I am doing - like I said, that is why I created the indexes.

My loading is done programmatically, from another format, so COPY is not an option. Neither is removing foreign keys, as they are required to guarantee valid data. I don't really have a problem with the insert speed when it is working properly - it is on par with other DBs that I have on the same hardware. The problem is when it stops using the indexes, for no good reason. Example, last night, I kicked off a load process - this morning, it had only managed to make it through about 600,000 rows (split across several tables). After restarting it this morning, it made it through the same data in 30 minutes. If that's not bad and buggy behavior, I don't know what is.

Dan -- Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Slow Inserts on 1 table?
Dan Armbrust wrote: I have one particular insert query that is running orders of magnitude slower than other insert queries, and I cannot understand why. For example, Inserts into "conceptProperty" (detailed below) are at least 5 times faster than inserts into "conceptPropertyMultiAttributes". When I am running the inserts, postmaster shows as pegging one CPU on the Fedora Core 3 server it is running on at nearly 100%. Any advice is appreciated. Here is a lot of info that may shed light on the issue to someone with more experience than me:

Example Insert Query with data:

INSERT INTO conceptPropertyMultiAttributes (codingSchemeName, conceptCode, propertyId, attributeName, attributeValue) VALUES ('NCI MetaThesaurus', 'C005', 'T-2', 'Source', 'MSH2005_2004_10_12')

EXPLAIN ANALYZE output:

QUERY PLAN
Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.005..0.008 rows=1 loops=1)
Total runtime: 4.032 ms

Table Structure:

CREATE TABLE conceptpropertymultiattributes (
    codingschemename character varying(70) NOT NULL,
    conceptcode character varying(100) NOT NULL,
    propertyid character varying(50) NOT NULL,
    attributename character varying(50) NOT NULL,
    attributevalue character varying(250) NOT NULL
);

Primary Key:

ALTER TABLE ONLY conceptpropertymultiattributes ADD CONSTRAINT conceptpropertymultiattributes_pkey PRIMARY KEY (codingschemename, conceptcode, propertyid, attributename, attributevalue);

Foreign Key:

ALTER TABLE ONLY conceptpropertymultiattributes ADD CONSTRAINT f FOREIGN KEY (codingschemename, conceptcode, propertyid) REFERENCES conceptproperty(codingschemename, conceptcode, propertyid);

Structure of Table Referenced by Foreign Key:

CREATE TABLE conceptproperty (
    codingschemename character varying(70) NOT NULL,
    conceptcode character varying(100) NOT NULL,
    propertyid character varying(50) NOT NULL,
    property character varying(250) NOT NULL,
    "language" character varying(32),
    presentationformat character varying(50),
    datatype character varying(50),
    ispreferred boolean,
    degreeoffidelity character varying(50),
    matchifnocontext boolean,
    representationalform character varying(50),
    propertyvalue text NOT NULL
);

Primary Key:

ALTER TABLE ONLY conceptproperty ADD CONSTRAINT conceptproperty_pkey PRIMARY KEY (codingschemename, conceptcode, propertyid);

Thanks, Dan

Well, I now have a further hunch on why the inserts are so slow on 1 table. Most of the time, when I am doing bulk inserts, I am starting with an empty database. My insertion program creates the tables, indexes and foreign keys. The problem seems to be the foreign key - PostgreSQL is apparently being too stupid to use the indexes while loading and checking the foreign key between two large tables - my guess is because analyze has not been run yet, so it thinks all of the tables are size 0. If I let it run for a while, then kill the load process, run Analyze, empty the tables, and then restart, things perform fine. But that is kind of a ridiculous sequence to have to use to load a database.

Why can't postgres compile some rough statistics on tables without running analyze? Seems that it would be pretty easy to keep track of the number of inserts/deletions that have taken place since the last Analyze execution... It may not be the exact right number, but it would certainly be smarter than continuing to assume that the tables are size 0, even though it has been doing constant inserts on the tables in question. I have already had to disable sequential scans, since the planner is almost _always_ wrong in deciding whether or not to use an index.
I put the indexes on the columns I choose for a reason - it is because I KNOW the index read will ALWAYS be faster since I designed the indexes for the queries I am running. But it still must be doing a sequential scan on these inserts... -- Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
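A stopgap that follows from the behavior described above (a sketch; the table names are from the schema in this post): once the load has put a few thousand rows in place, run, from a second session,

  ANALYZE conceptproperty;
  ANALYZE conceptpropertymultiattributes;

and then have the loader disconnect and reconnect. The FK checks are planned once per connection, so it is the fresh session that actually picks up the new statistics.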
Re: [GENERAL] index row size exceeds btree maximum, 2713 - Solutions?
Dan Armbrust wrote: I'm trying to load some data into PostgreSQL 8.0.3, and I got the error message "index row size 2904 exceeds btree maximum, 2713". After a bunch of searching, I believe that I am getting this error because a value that I am indexing is longer than ~ 1/3 of the block size - or the BLCKSZ variable in the src/include/pg_config_manual.h file. Am I correct so far? I need to fix this problem. I cannot change the indexed columns. I cannot shorten the data value. And I cannot MD5 it, or any of those hashing types of solutions that I saw a lot while searching. Is there a variable I can set somewhere, so that postgresql would just truncate the value to the max length that the index can handle when it goes to enter it into the index, instead of failing with an error? I would be fine with not having this particular row fully indexed, so long as I could still retrieve the full data value. The other solution that I saw was to modify the BLCKSZ variable. From what I saw, it appears that to change that variable, I would need to dump my databases out, recompile everything, and then reload them from scratch. Is this correct? Currently the BLCKSZ variable is set to 8192. What are the performance/disk usage/other? implications of doubling this value, to 16384? Any other suggestions in dealing with this problem? Thanks, Dan Thanks for all the information and ideas WRT this issue. I ended up just having to remove the index from this particular column that was having the issue - in my particular case, I didn't lose anything by doing this anyway, because the index wasn't being used for its intended purpose anyway, due to case sensitivity issues. Could I suggest adding this error, its causes, and possible solutions from this thread (http://archives.postgresql.org/pgsql-general/2005-07/msg00731.php) to the FAQ? It took me a long time to connect all the dots through a lot of different e-mail threads. Also, maybe the max index size should be documented in the manual as well? Dan -- Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[GENERAL] Slow Inserts on 1 table?
I have one particular insert query that is running orders of magnitude slower than other insert queries, and I cannot understand why. For example, Inserts into "conceptProperty" (detailed below) are at least 5 times faster than inserts into "conceptPropertyMultiAttributes". When I am running the inserts, postmaster shows as pegging one CPU on the Fedora Core 3 server it is running on at nearly 100%. Any advice is appreciated. Here is a lot of info that may shed light on the issue to someone with more experience than me: Example Insert Query with data: INSERT INTO conceptPropertyMultiAttributes (codingSchemeName, conceptCode, propertyId, attributeName, attributeValue) VALUES ('NCI MetaThesaurus', 'C005', 'T-2', 'Source', 'MSH2005_2004_10_12') EXPLAIN ANALYZE output: QUERY PLAN Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.005..0.008 rows=1 loops=1) Total runtime: 4.032 ms Table Structure: CREATE TABLE conceptpropertymultiattributes ( codingschemename character varying(70) NOT NULL, conceptcode character varying(100) NOT NULL, propertyid character varying(50) NOT NULL, attributename character varying(50) NOT NULL, attributevalue character varying(250) NOT NULL ); Primary Key: ALTER TABLE ONLY conceptpropertymultiattributes ADD CONSTRAINT conceptpropertymultiattributes_pkey PRIMARY KEY (codingschemename, conceptcode, propertyid, attributename, attributevalue); Foreign Key: ALTER TABLE ONLY conceptpropertymultiattributes ADD CONSTRAINT f FOREIGN KEY (codingschemename, conceptcode, propertyid) REFERENCES conceptproperty(codingschemename, conceptcode, propertyid); Structure of Table Referenced by Foreign Key: CREATE TABLE conceptproperty ( codingschemename character varying(70) NOT NULL, conceptcode character varying(100) NOT NULL, propertyid character varying(50) NOT NULL, property character varying(250) NOT NULL, "language" character varying(32), presentationformat character varying(50), datatype character varying(50), ispreferred boolean, degreeoffidelity character varying(50), matchifnocontext boolean, representationalform character varying(50), propertyvalue text NOT NULL ); Primary Key: ALTER TABLE ONLY conceptproperty ADD CONSTRAINT conceptproperty_pkey PRIMARY KEY (codingschemename, conceptcode, propertyid); Thanks, Dan ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[GENERAL] index row size exceeds btree maximum, 2713 - Solutions?
I'm trying to load some data into PostgreSQL 8.0.3, and I got the error message "index row size 2904 exceeds btree maximum, 2713". After a bunch of searching, I believe that I am getting this error because a value that I am indexing is longer than ~ 1/3 of the block size - or the BLCKSZ variable in the src/include/pg_config_manual.h file. Am I correct so far? I need to fix this problem. I cannot change the indexed columns. I cannot shorten the data value. And I cannot MD5 it, or any of those hashing types of solutions that I saw a lot while searching. Is there a variable I can set somewhere, so that postgresql would just truncate the value to the max length that the index can handle when it goes to enter it into the index, instead of failing with an error? I would be fine with not having this particular row fully indexed, so long as I could still retrieve the full data value. The other solution that I saw was to modify the BLCKSZ variable. From what I saw, it appears that to change that variable, I would need to dump my databases out, recompile everything, and then reload them from scratch. Is this correct? Currently the BLCKSZ variable is set to 8192. What are the performance/disk usage/other? implications of doubling this value, to 16384? Any other suggestions in dealing with this problem? Thanks, Dan ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
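If some index on the long values is still wanted later, one dodge is to index only a prefix of the value - a sketch (the 1000 is an arbitrary prefix length, and 8.0 accepts expressions alongside plain columns in a multicolumn index):

  CREATE INDEX conceptproperty_prefix_idx
      ON conceptproperty (codingschemename, property, substr(propertyvalue, 1, 1000));

The two leading equality columns get normal index service; the third is only used when the query compares against the identical substr(propertyvalue, 1, 1000) expression.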
Re: [GENERAL] index row size exceeds btree maximum, 2713 - Solutions?
Hmm, well, I don't know if it is actually building an index properly on this column, I just assumed that it was. It doesn't fail on every insert, only on the one that has a really long text value. I know it doesn't use the index when I do "ILIKE" queries, resulting in poor performance... but I assumed that was because I was trying to do a case insensitive search on a case sensitive column index. I didn't want to go down the road of writing even more database implementation specific code. I will usually be using Lucene for the full text searches anyway.

Where is the documentation on tsearch2? I haven't seen it mentioned anywhere except a couple of mailing list postings.

All of my other limitations on changing things aside - given a query like this:

Select * from conceptproperty where codingSchemeName='foo' AND property='anotherfoo' and propertyValue ILIKE 'valu%'

What index(es) would be recommended? My current 3 column index (that works on other DB's) doesn't perform well due to case sensitivity issues, and now fails, due to data length issues.

Dan

Jaime Casanova wrote: The index that is failing is CREATE INDEX i1 ON conceptproperty USING btree (codingschemename, property, propertyvalue). I don't think you could create indexes on text fields... there are other type of indexes for that... tsearch2 for example
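For that particular prefix query, one combination that can use an index is a functional index on lower() - a sketch (needs 7.4+ for text_pattern_ops, the query must spell the expression the same way, and a sufficiently long propertyvalue will still hit the btree row-size ceiling):

  CREATE INDEX conceptproperty_lower_idx
      ON conceptproperty (codingschemename, property, lower(propertyvalue) text_pattern_ops);

  SELECT * FROM conceptproperty
   WHERE codingschemename = 'foo'
     AND property = 'anotherfoo'
     AND lower(propertyvalue) LIKE 'valu%';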
Re: [GENERAL] index row size exceeds btree maximum, 2713 - Solutions?
We have built a Model for terminologies that we call The Lexical Grid (more info http://informatics.mayo.edu/LexGrid/index.php) LexGrid has multiple backend data storage mechanisms, including LDAP and SQL. We do our best to remain implementation independent - our SQL implementations, for example, can run against MS Access, DB2, MySQL and PostgreSQL. I'm currently trying to load a new terminology into a PostgreSQL backend, and arrived at this error because it happens to have a couple of very large data values that get mapped into the 'propertyvalue' field.

The structure of the table that I am (currently) having problems with is:

codingschemename character varying(70) NOT NULL
conceptcode character varying(100) NOT NULL
propertyid character varying(50) NOT NULL
property character varying(250) NOT NULL
language character varying(32)
presentationformat character varying(50)
datatype character varying(50)
ispreferred boolean
degreeoffidelity character varying(50)
matchifnocontext boolean
representationalform character varying(50)
propertyvalue text NOT NULL

The structure of the table is not easily changed, as it closely follows our model. The index that is failing is CREATE INDEX i1 ON conceptproperty USING btree (codingschemename, property, propertyvalue). Usually, the 'propertyValue' field is fairly short - 100 chars or less. And in those cases, I need to be able to do an indexed search on it. In this particular case, this terminology has a propertyValue that is very long. I can't just toss it. I don't really care if it makes it into the index - for a terminology with as much data as this has, I'll be using Lucene to do text searches anyway - but I do need to be able to return the full propertyValue in response to a propertyId query. I also don't want to negatively affect the performance of the rest of the DB just to accommodate this instance - hence the questions about the implications of changing the BLCKSZ variable.

Dan

Bruno Wolff III wrote: On Mon, Jul 18, 2005 at 14:44:26 -0500, Dan Armbrust <[EMAIL PROTECTED]> wrote: I'm trying to load some data into PostgreSQL 8.0.3, and I got the error message "index row size 2904 exceeds btree maximum, 2713". After a bunch of searching, I believe that I am getting this error because a value that I am indexing is longer than ~ 1/3 of the block size - or the BLCKSZ variable in the src/include/pg_config_manual.h file. Am I correct so far? I need to fix this problem. I cannot change the indexed columns. I cannot shorten the data value. And I cannot MD5 it, or any of those hashing types of solutions that I saw a lot while searching. Can you explain how you are using the index now and what you are trying to accomplish? It is hard to suggest alternatives without knowing what you are really trying to do.
Re: [GENERAL] index row size exceeds btree maximum, 2713 - Solutions?
Nevermind this question... > Where is the documentation on tsearch2? Google first, ask second, I remind myself again... I knew I hadn't seen it mentioned in the official postgresql manual.. didn't think about it being an extension. Dan

Dan Armbrust wrote: Hmm, well, I don't know if it is actually building an index properly on this column, I just assumed that it was. It doesn't fail on every insert, only on the one that has a really long text value. I know it doesn't use the index when I do "ILIKE" queries, resulting in poor performance... but I assumed that was because I was trying to do a case insensitive search on a case sensitive column index. I didn't want to go down the road of writing even more database implementation specific code. I will usually be using Lucene for the full text searches anyway. I haven't seen it mentioned anywhere except a couple of mailing list postings. All of my other limitations on changing things aside - given a query like this: Select * from conceptproperty where codingSchemeName='foo' AND property='anotherfoo' and propertyValue ILIKE 'valu%' What index(es) would be recommended? My current 3 column index (that works on other DB's) doesn't perform well due to case sensitivity issues, and now fails, due to data length issues. Dan

Jaime Casanova wrote: The index that is failing is CREATE INDEX i1 ON conceptproperty USING btree (codingschemename, property, propertyvalue). I don't think you could create indexes on text fields... there are other type of indexes for that... tsearch2 for example