[GENERAL] diagram tools?

2010-11-24 Thread Dan Armbrust
Can anyone recommend a PostgreSQL compatible free tool that I can use
to generate some schema diagrams of an existing database?

Thanks

Dan



Re: [GENERAL] Checkpoint Tuning Question

2009-07-20 Thread Dan Armbrust
On Mon, Jul 13, 2009 at 3:53 PM, Dan
Armbrust wrote:
>> So this thought leads to a couple of other things Dan could test.
>> First, see if turning off full_page_writes makes the hiccup go away.
>> If so, we know the problem is in this area (though still not exactly
>> which reason); if not we need another idea.  That's not a good permanent
>> fix though, since it reduces crash safety.  The other knobs to
>> experiment with are synchronous_commit and wal_sync_method.  If the
>> stalls are due to commits waiting for additional xlog to get written,
>> then async commit should stop them.  I'm not sure if changing
>> wal_sync_method can help, but it'd be worth experimenting with.
>>
>>                        regards, tom lane
>>
>
>
> All of my testing to date has been done with synchronous_commit=off
>
> I just tried setting full_page_writes=off - and like magic, the entire
> hiccup went away.
>
> Thanks,
>
> Dan

I haven't done any other testing for this issue since discovering that
setting "full_page_writes=off" makes my performance hiccup go away.
I'm about to move on to some different tests and reset this setup that
I was using to test out this issue - is there anything else that
anyone would like to have gathered before I axe this system?

I can redo it again later too, it will just take a fair bit longer for
me to reconfigure the test environment.

Thanks,

Dan



Re: [GENERAL] Checkpoint Tuning Question

2009-07-14 Thread Dan Armbrust
>
> Propose a DTrace probe immediately after the "goto begin" at line 740 of
> xlog.c, so we can start tracing from the first backend following
> checkpoint, and turn off tracing when all backends have completed a
> transaction.
>

That's greek to me.  But I'm happy to test things if you send me
patches or custom code.

Thanks,

Dan



Re: [GENERAL] Checkpoint Tuning Question

2009-07-13 Thread Dan Armbrust
> So this thought leads to a couple of other things Dan could test.
> First, see if turning off full_page_writes makes the hiccup go away.
> If so, we know the problem is in this area (though still not exactly
> which reason); if not we need another idea.  That's not a good permanent
> fix though, since it reduces crash safety.  The other knobs to
> experiment with are synchronous_commit and wal_sync_method.  If the
> stalls are due to commits waiting for additional xlog to get written,
> then async commit should stop them.  I'm not sure if changing
> wal_sync_method can help, but it'd be worth experimenting with.
>
>                        regards, tom lane
>


All of my testing to date has been done with synchronous_commit=off

I just tried setting full_page_writes=off - and like magic, the entire
hiccup went away.
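
For anyone following along, the change amounted to one line in
postgresql.conf, followed by a reload (I believe full_page_writes
doesn't need a full restart) - and, as noted above, it isn't a safe
permanent setting:

full_page_writes = off    # default is on; test setting only, reduces crash safety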

Thanks,

Dan



Re: [GENERAL] Checkpoint Tuning Question

2009-07-10 Thread Dan Armbrust
> Hm, I'm not sure I believe any of that except the last bit, seeing that
> he's got plenty of excess CPU capability.  But the last bit fits with
> the wimpy-I/O problem, and it also offers something we could test.
> Dan, please see what happens when you vary the wal_buffers setting.
> (Note you need a postmaster restart to change that.)
>
>                        regards, tom lane
>

Ok, I tried a few different values - 32kB, 64kB, 512kB, 2MB, and 10MB.

I'm not seeing any highly noticeable change in behaviour with any
setting - it wasn't a scientific test, but I seem to have about the
same size hiccup with each setting.  The hiccup may be slightly
shorter with the 10MB setting, but barely, if it is.
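
For reference, these are roughly the postgresql.conf values I cycled
through, restarting the postmaster between runs as Tom noted (64kB is
the 8.3 default, I believe):

#wal_buffers = 32kB
#wal_buffers = 64kB
#wal_buffers = 512kB
#wal_buffers = 2MB
wal_buffers = 10MB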

Thanks,

Dan



Re: [GENERAL] Checkpoint Tuning Question

2009-07-09 Thread Dan Armbrust
> As Greg commented upthread, we seem to be getting forced to the
> conclusion that the initial buffer scan in BufferSync() is somehow
> causing this.  There are a couple of things it'd be useful to try
> here:
>
> * see how the size of the hiccup varies with shared_buffers;

I tried decreasing shared buffers - both 25MB and 50MB were too small
for my load - I had slow queries at all times.

So then I increased it from what I was using - 100MB - to 500MB - and
the hiccup roughly doubles in length.

At 100MB, the hiccup is about 2-3 seconds long.
At 500MB, the hiccup is about 6 seconds long.
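
In postgresql.conf terms, the values I tried were roughly as follows
(shared_buffers needs a postmaster restart to change):

#shared_buffers = 25MB    # too small for my load - slow queries at all times
#shared_buffers = 50MB    # also too small
shared_buffers = 100MB    # hiccup of about 2-3 seconds
#shared_buffers = 500MB   # hiccup of about 6 seconds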

>
> * try inserting a delay into that scan loop, as per attached
>  quick-and-dirty patch.  (Numbers pulled from the air, but
>  we can worry about tuning after we see if this is really
>  where the problem is.)
>

After finally getting this particular system into a state where I
could build Postgres (I was using the binary install), I built an 8.3.4
using your patch - but I didn't see any change in the behaviour.  I
see hiccups that appear to be the same length as I saw on the binary
build of 8.3.4.

Thanks,

Dan



Re: [GENERAL] Checkpoint Tuning Question

2009-07-08 Thread Dan Armbrust
> However, the latest report says that he
> managed that, and yet there's still a one-or-two-second transient of
> some sort.  I'm wondering what's causing that.  If it were at the *end*
> of the checkpoint, it might be the disk again (failing to handle a bunch
> of fsyncs, perhaps).  But if it really is at the *start* then there's
> something else odd happening.
>
>                        regards, tom lane
>

Log output during heavy load - checkpoint logging on, and slow query
logging on for queries that take longer than 1 second.
Blank space inserted to show interesting bits

Almost all of the slow query log messages are logged within about 3
seconds of the checkpoint starting message.


LOG:  checkpoint starting: xlog

LOG:  duration: 1101.419 ms  execute : select
dynamichos0_.ethernetmacaddr as ethernet1_0_, dynamichos0_.ipaddr as
ipaddr0_, dynamichos0_.cpemac as cpemac0_, dynamichos0_.last_updated
as last3_0_, dynamichos0_.leasetime as leasetime0_,
dynamichos0_.regtime as regtime0_ from iphost dynamichos0_, cpe cpe1_
where dynamichos0_.cpemac=cpe1_.cpemac and 1=1 and
dynamichos0_.ethernetmacaddr=$1 and dynamichos0_.cpemac=$2 and
cpe1_.regBaseId=$3 and dynamichos0_.ipaddr<>$4
DETAIL:  parameters: $1 = '01:01:01:56:01:7F', $2 = '00:55:00:82', $3
= '01:01:01:56', $4 = '000.000.000.000'
LOG:  duration: 1101.422 ms  execute : insert into iphost
(cpemac, last_updated, leasetime, regtime, ethernetmacaddr, ipaddr)
values ($1, $2, $3, $4, $5, $6)
DETAIL:  parameters: $1 = '01:AA:00:A0', $2 = '2009-07-08
15:33:20.673', $3 = '2009-07-08 21:06:40.67', $4 = '2009-07-08
15:33:20.67', $5 = '01:01:02:AB:01:9D', $6 = '2.171.156.0'



LOG:  duration: 1501.905 ms  execute : select
dynamichos0_.ethernetmacaddr as ethernet1_0_, dynamichos0_.ipaddr as
ipaddr0_, dynamichos0_.cpemac as cpemac0_, dynamichos0_.last_updated
as last3_0_, dynamichos0_.leasetime as leasetime0_,
dynamichos0_.regtime as regtime0_ from iphost dynamichos0_, cpe cpe1_
where dynamichos0_.cpemac=cpe1_.cpemac and 1=1 and
dynamichos0_.ethernetmacaddr=$1 and dynamichos0_.cpemac=$2 and
cpe1_.regBaseId=$3 and dynamichos0_.ipaddr<>$4
DETAIL:  parameters: $1 = '01:01:01:C3:01:8B', $2 = '00:C2:00:8E', $3
= '01:01:01:C3', $4 = '000.000.000.000'

LOG:  checkpoint complete: wrote 9975 buffers (77.9%); 0 transaction
log file(s) added, 0 removed, 15 recycled; write=156.576 s, sync=0.065
s, total=156.662 s
LOG:  checkpoint starting: xlog

LOG:  duration: 1104.780 ms  execute : delete from iphost
where ethernetmacaddr=$1 and ipaddr=$2 and last_updated=$3
DETAIL:  parameters: $1 = '01:01:01:33:01:AA', $2 = '1.50.169.0', $3 =
'2009-07-08 15:32:57.131'
LOG:  duration: 1106.499 ms  execute : select cpe0_.cpemac as
cpemac2_0_, cpe0_.changeTime as changeTime2_0_, cpe0_.comment as
comment2_0_, cpe0_.configuration as configur4_2_0_, cpe0_.cpeconfigid
as cpeconf17_2_0_, cpe0_.cpefilterid as cpefilt18_2_0_, cpe0_.endTime
as endTime2_0_, cpe0_.ispId as ispId2_0_, cpe0_.last_updated as
last7_2_0_, cpe0_.reglocationid as regloca19_2_0_, cpe0_.modelId as
modelId2_0_, cpe0_.numberOfHosts as numberOf9_2_0_, cpe0_.regBaseId as
regBaseId2_0_, cpe0_.regTime as regTime2_0_, cpe0_.roamAllowed as
roamAll12_2_0_, cpe0_.serialNumber as serialN13_2_0_, cpe0_.slaid as
slaid2_0_, cpe0_.enable as enable2_0_, cpe0_.staticip as staticip2_0_,
cpe0_.subscriberid as subscri21_2_0_, cpe0_.swVersion as
swVersion2_0_, cpe0_.vlanid as vlanid2_0_, cpe0_.voipid as voipid2_0_
from cpe cpe0_ where cpe0_.cpemac=$1
DETAIL:  parameters: $1 = '00:84:00:37'



LOG:  duration: 1205.828 ms  execute : insert into iphost
(cpemac, last_updated, leasetime, regtime, ethernetmacaddr, ipaddr)
values ($1, $2, $3, $4, $5, $6)
DETAIL:  parameters: $1 = '02:31:00:25', $2 = '2009-07-08
15:39:53.718', $3 = '2009-07-08 21:13:13.715', $4 = '2009-07-08
15:39:53.715', $5 = '01:01:03:32:01:22', $6 = '3.51.33.0'
LOG:  duration: 1203.287 ms  execute : insert into iphost
(cpemac, last_updated, leasetime, regtime, ethernetmacaddr, ipaddr)
values ($1, $2, $3, $4, $5, $6)
DETAIL:  parameters: $1 = '03:47:00:81', $2 = '2009-07-08
15:39:53.72', $3 = '2009-07-08 21:13:13.717', $4 = '2009-07-08
15:39:53.717', $5 = '01:01:04:48:01:7E', $6 = '4.74.125.0'
LOG:  duration: 1201.480 ms  execute : insert into iphost
(cpemac, last_updated, leasetime, regtime, ethernetmacaddr, ipaddr)
values ($1, $2, $3, $4, $5, $6)
DETAIL:  parameters: $1 = '01:12:00:7F', $2 = '2009-07-08
15:39:53.725', $3 = '2009-07-08 21:13:13.721', $4 = '2009-07-08
15:39:53.721', $5 = '01:01:02:13:01:7C', $6 = '2.19.123.0'

LOG:  checkpoint complete: wrote 9794 buffers (76.5%); 0 transaction
log file(s) added, 0 removed, 15 recycled; write=148.084 s, sync=0.062
s, total=148.172 s



Re: [GENERAL] Checkpoint Tuning Question

2009-07-08 Thread Dan Armbrust
>> Wouldn't increasing the length between checkpoints result in the
>> checkpoint process taking even longer to complete?
>
> You don't really care how long it takes.  What you want is for it not to
> be chewing a bigger fraction of your I/O bandwidth than you can spare.
> Hence, you want it to take longer.  Trying to shorten it is just going
> to make the spike worse.
>
>                        regards, tom lane
>

I bumped checkpoint_segments up to 15, the timeout up to 10 minutes, and
changed the completion target to 0.7.
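
In postgresql.conf terms, that is roughly:

checkpoint_segments = 15
checkpoint_timeout = 10min
checkpoint_completion_target = 0.7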

What I observe now is that I get a short (1-2 second) period where I
get slow queries - I'm running about 30 queries in parallel at any
given time - it appears that all 30 queries get paused for a couple of
seconds at the moment that a checkpoint begins.  However, after the
initial slowdown, I don't get any additional slow queries logged while
the checkpoint process runs.

My takeaway is that starting the checkpoint process is really
expensive - so I don't want to start it very frequently.  And the only
downside to longer intervals between checkpoints is a longer recovery
time if the system crashes?

Thanks,

Dan



Re: [GENERAL] Checkpoint Tuning Question

2009-07-08 Thread Dan Armbrust
On Wed, Jul 8, 2009 at 1:23 PM, Tom Lane wrote:
> Dan Armbrust  writes:
>> With checkpoint_segments set to 10, the checkpoints appear to be
>> happening due to checkpoint_timeout - which I've left at the default
>> of 5 minutes.
>
> Well, you could increase both those settings so as to put the
> checkpoints further apart, and/or increase checkpoint_completion_target
> to spread the checkpoint I/O over a larger fraction of the cycle.
>
>                        regards, tom lane
>

Sorry, didn't mean to get off list.

Wouldn't increasing the length between checkpoints result in the
checkpoint process taking even longer to complete?

The way my system processes and buffers incoming data, having
infrequent (but long and disruptive) checkpoints is bad, since it
causes throughput to suffer so badly that my buffers can't hold the
flood, and I have to drop data.  If I can reduce the impact of the
checkpoints and have them occur more frequently, then my buffers
should be able to hold the incoming data during the short durations
when I have slow queries.

I'll go experiment with checkpoint_completion_target.

Thanks,

Dan



Re: [GENERAL] Checkpoint Tuning Question

2009-07-08 Thread Dan Armbrust
On Wed, Jul 8, 2009 at 12:50 PM, Tom Lane wrote:
> Dan Armbrust  writes:
>> However, once the checkpoint process begins, I get a whole flood of
>> queries that take between 1 and 10 seconds to complete.  My throughput
>> crashes to near nothing.  The checkpoint takes between 45 seconds and
>> a minute to complete.
>
> You sure this is 8.3?  It should spread out checkpoints over a couple of
> minutes by default.  [thinks...]  Maybe you need to increase
> checkpoint_segments some more.  If it's forcing the checkpoint in order
> to hold down the number of WAL segments used up, that would explain a
> fast checkpoint.
>
>                        regards, tom lane
>

Just checked - currently running 8.3.4 on the system I'm testing on.

With checkpoint_segments set to 10, the checkpoints appear to be
happening due to checkpoint_timeout - which I've left at the default
of 5 minutes.

If I double my test load, I end up with checkpoints happening about
every 4 minutes, with a log message that I read as indicating it
used up all 10 segments.

So not much I can do to keep the checkpoint process from causing a
burst of slow queries?

Thanks,

Dan



[GENERAL] Checkpoint Tuning Question

2009-07-08 Thread Dan Armbrust
I'm running a steady state test where I am pushing about 600 queries
per second through a Postgres 8.3 system on an 8 CPU Linux system.
It's a mix of inserts, updates, and deletes on a few tables - the two
biggest ones probably have about 200,000 rows.

The hard drive is just a simple, run-of-the-mill desktop drive.

Here are parameters that I have changed from defaults:

shared_buffers=100MB
synchronous_commit=off

And, after noting complaints in the log file about checkpoint intervals, I set:

checkpoint_segments=10

Then I turned on slow query logging for queries that take more than 1
second, and checkpoint logging.
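
For reference, the logging knobs I used in postgresql.conf were along
these lines (values as I recall them):

log_min_duration_statement = 1000    # milliseconds; log anything slower than 1 second
log_checkpoints = on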

Typically, I see no slow queries.  The system handles the load just fine.
Once in a while, I'll see a query that takes 3 - 5 seconds.

However, once the checkpoint process begins, I get a whole flood of
queries that take between 1 and 10 seconds to complete.  My throughput
crashes to near nothing.  The checkpoint takes between 45 seconds and
a minute to complete.

After the checkpoint completes, the system returns to having very few
slow queries, and keeps up with the load fine.

Is there anything I can do to prevent the occasional slow query?

Is there anything I can do to prevent (or minimize) the performance
impact of the checkpoint?

Thanks,

Dan



Re: [GENERAL] recover corrupt DB?

2009-05-04 Thread Dan Armbrust
> These reports seem to come up a bit, with disk full issues resulting in
> the need to pg_resetxlog, dump, and re-initdb, but I wouldn't be too
> shocked if they all turned out to be on xfs or something like that.
>

My particular disk-full condition was on ext2.  Nothing exotic.  Also,
the process that filled the disk was postgres - if that makes any
difference - I had left a debug level turned up in the postgres config
file, and it was logging every single db query.  Since it wasn't set
up to remove old log files - it filled the disk.

Nothing else unusual occurred that I'm aware of - things went weird
for the lab tester; he cleared some space, rebooted the system, and
postgres didn't come back online.



Re: [GENERAL] recover corrupt DB?

2009-04-23 Thread Dan Armbrust
>   In general, pg_resetxlog would be
> the tool to try here.  Don't panic yet. ;-)
>
>

Yep, that was the command I was looking for.  That at least got the DB
to a point where it would start, and I was able to do a dump.

So, I dumped and reloaded all of the databases.  Things seem fine, but
bits and pieces of documentation I've seen for pg_resetxlog also
recommend initdb, and starting over.  Is that necessary?
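
For the archives, the sequence I followed (and what a redo with a
fresh initdb would look like) was roughly this - the paths and user
name are placeholders for my setup:

# with the server stopped:
pg_resetxlog /var/lib/pgsql/data
pg_ctl start -D /var/lib/pgsql/data
pg_dumpall -U postgres > all_databases.sql
pg_ctl stop -D /var/lib/pgsql/data

# if starting over from a fresh initdb instead:
initdb -D /var/lib/pgsql/data.new
pg_ctl start -D /var/lib/pgsql/data.new
psql -U postgres -d postgres -f all_databases.sql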

Thanks,

Dan



[GENERAL] recover corrupt DB?

2009-04-23 Thread Dan Armbrust
I had a test system (read: not backed up, sigh) where the disk went
full while PostgreSQL was loaded; consequently, PostgreSQL will no
longer start.

It is logging an error about detecting an invalid shutdown, trying to
replay something, and then an error about not being able to open a
file it is looking for.

Is this DB toast?  Or is there something I could do to get the DB back
into a state where it will start, without losing everything?

Thanks,

Dan



Re: [GENERAL] Slow Vacuum was: vacuum output question

2009-01-12 Thread Dan Armbrust
> Well, your throughput on this machine is horrible.  It looks like with
> 8.1 all your time is sys + cpu for your cpus, while with 8.3 you've
> got more idle and more i/o wait, which tells me that 8.3 is smarter
> about vacuuming, so it's spending less time working the cpus and more
> time waiting for the i/o subsystem.
>
> Either way, getting only 2 or so megs a second write is pretty bad.  I
> can get those numbers from a laptop.  An older laptop like a single
> core 1.6GHz pentium M based T42 or something.  My laptop, which is new
> from last year, is about twice as fast as your server in terms of I/O.

This is my problem in a nutshell.  As of yet, I have no rational
explanation for this performance.

The servers in question are not slow.  This particular server never
shows this problem when running a newer OS - but I have not yet
finished isolating which OS's have problems on this hardware.  No
other software on the system exhibits any sort of IO issue other than
PostgreSQL.

I have customers with $20K servers that can't handle the workload that
I can handle on an old cruddy laptop.  However, much of the appeal of
our product is low per-site installation costs, so expensive servers
don't fit into the mix.

Random futzing - reindexing, manual full vacuums, rebooting the server
- each of these has cleared the error condition on one site or
another, and allowed the system to go back to functioning the way it
should for months, until the problem randomly crops up again.

I'm still looking into it, but, at the same time, we have enough
workarounds to the issue now (scheduled reindex, install a newer OS,
upgrade to Postgres 8.3) that this is becoming a low-priority
mystery, rather than the high-priority one it has been.

Thanks for your thoughts,

Dan



Re: [GENERAL] Slow Vacuum was: vacuum output question

2009-01-08 Thread Dan Armbrust
On PostgreSQL 8.1, while a long vacuum is running, the output of
vmstat 10 looks like this (sorry, can't format this very well in this
e-mail client):

r   b   swpd   free   buff    cache   si   so    bi    bo    in     cs   us  sy  id  wa  st
5  2112  53732   4388 116340400 13524  1322  941 75053 45
55  0  0  0
3  2112  51596   2336 116003200 20731   732  999 196116 35
58  0  7  0
3  2112  53784   2448 115538800  5729  1680  716 114442 46
52  0  2  0
7  1112  52088   2568 115679600   205  1919  546 28929 56
44  0  0  0
3  2112  51772   2652 115702800   162  2010  534 2845 56 44  0  0  0
3  2112  52296   2736 115633200   174  1910  517 2648 53 47  0  0  0
2  2112  52736   2832 115538800   173  1960  521 2698 57 43  0  0  0
6  1112  51172   2908 115740000   173  1825  511 2621 52 48  0  0  0
2  1112  52156   2984 115587600   152  1766  495 2669 53 47  0  0  0
5  2112  53828   3064 115402800   124  1762  498 2673 54 46  0  0  0
5  0112  53396   3140 115468400   122  1803  532 2871 52 48  0  0  0
6  1112  52004   3240 115589200   150  2004  522 3712 56 44  0  0  0
5  2112  53100   3348 115236400   128  2142  543 2711 54 38  0  8  0
1  2112  56704   3428 114813200   102  2229  560 2925 56 44  0  0  0
2  1112  54876   3500 11500800091  2129  598 3068 65 35  0  0  0
3  1112  53196   3588 11521640090  2091  542 2803 57 43  0  0  0
8  2112  53528   3664 115186800   106  2180  553 2947 55 39  0  6  0

Vacuum hasn't stopped yet at this point - its been running for many minutes.


On PostgreSQL 8.3, we had this:

r   b   swpd   free   buff    cache   si   so    bi    bo    in     cs   us  sy  id  wa  st
 6  0   9604  51932  14276 113899600 3  1808  523 5115 45
9 42  4  0
 2  0   9580  53284  14124 1138092   100   564  2561  585 5126 53
10 19 18  0
 7  0   9564  53412  14144 113769600 0  1682  497 4985 42
9 48  2  0
 3  0   9532  53320  13880 11379406037  1757  538 4979 48  8 41  3

More random data:

On an overnight test of PostgreSQL 8.3 on Fedora Core 6, with
auto-vacuum on (using all default settings), the amount of time that it
takes to run a manual vacuum on a constant-sized database (only row
updates, or deletes followed by a replacement add) has doubled from 2
seconds to 4 seconds.  A reindex brings the time back down to 2
seconds.
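
The measurement itself is nothing fancy - roughly this in psql, with
the database name being a placeholder for ours:

\timing
VACUUM VERBOSE;           -- ~4 seconds after the overnight run
REINDEX DATABASE mydb;
VACUUM VERBOSE;           -- back down to ~2 seconds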

Dan



Re: [GENERAL] Slow Vacuum was: vacuum output question

2009-01-07 Thread Dan Armbrust
I'm no closer to a solution, but here are some additional data points
- all taken on Fedora Core 6.

Postgres 8.1 built from source.  Auto vacuum disabled.

Create Empty Database.
Run our load on the system for 2 hours to  populate and exercise the database.
Run Vacuum.  Takes more than a minute.
Run Vacuum immediately again.  Takes more than a minute.
Reindex the database.  Takes 10 seconds.
Run Vacuum again.  Takes 2 seconds.
Allow load to run on the system for 30 minutes.
Run Vacuum again.  Takes more than a minute.


Postgres 8.3 built from source.  Auto vacuum disabled.

Create Empty Database.
Run our load on the system for 2 hours to  populate and exercise the database.
Run Vacuum.  Takes more than a minute.
Run Vacuum immediately again.  Takes 15 seconds.
Run Vacuum immediately again.  Takes 15 seconds.
Reindex the database.  Takes 10 seconds.
Run Vacuum again.  Takes 2 seconds.

So, PostgreSQL 8.3 shows better behaviour, but it is still showing
some sort of performance issue which a reindex fixes.
And then of course, the kicker is that we can't recreate any of these
issues when running the same exact test, on the same exact hardware -
but using a different underlying OS.  When we were running under a
modern Ubuntu, the vacuum never took more than 2 seconds.  We will be
checking other OSes soon.  I guess if we can't figure out what is
causing it, we can at least isolate the distros that we need to tell
our customers not to use (or to schedule a reindex if they insist on
not upgrading their OS).



Re: [GENERAL] Slow Vacuum was: vacuum output question

2009-01-06 Thread Dan Armbrust
On Tue, Jan 6, 2009 at 3:36 PM, Tom Lane  wrote:
> "Dan Armbrust"  writes:
>> INFO:  "cpe": found 415925 removable, 50003 nonremovable row versions
>> in 10849 pages
>
>> What on earth could be going on between PostgreSQL 8.1 and Fedora 6
>> that is bloating and/or corrupting the indexes like this?
>
> You're focusing on the indexes when the problem is dead table rows.
>
> It's very hard to believe that there's any OS dependence as such
> involved in that.  I wonder whether (a) the Ubuntu and Fedora packages
> you're using are the same 8.1.x point release; (b) if there's any
> interesting non-default behavior built into the Ubuntu packaging
> ... like running autovacuum automatically, for instance.
>
>regards, tom lane
>

In our testing, Postgres 8.1 was built from source (PostgreSQL website
source) on both systems.  No distro packages involved.

Believe me, we are as baffled as you.  We have been chasing this bug
off and on for months on a couple of different customer sites now.

Thanks,

Dan



Re: [GENERAL] Slow Vacuum was: vacuum output question

2009-01-06 Thread Dan Armbrust
>
> Obviously the choice of operating system has no impact on the contents of
> your index.
>
> A better question might be, what did your application or maintenance
> procedures do different in the different tests?
>
>
> --
> Alan

Our problem for a long time has been assuming the "obvious".  But we
now have tests that show otherwise.

I'm now thinking something along the lines of an obscure file system
or kernel interaction bug now - that was perhaps corrected in newer
releases of the OS.

Now that we can finally reproduce the problem in house, we are still
doing more tests to figure out specifics - does the problem go away
with Postgres 8.3, ext2/ext3/reiserFS, etc.



Re: [GENERAL] Slow Vacuum was: vacuum output question

2009-01-06 Thread Dan Armbrust
> On Tue, Jan 6, 2009 at 1:39 PM, Dan Armbrust
>  wrote:
>> Here is an interesting new datapoint.
>>
>> Modern Ubuntu distro - PostgreSQL 8.1.  SATA drive.  No Raid.  Cannot
>> reproduce slow vacuum performance - vacuums take less than a second
>> for the whole database.
>>
>> Reinstall OS - Fedora Core 6 - PostgreSQL 8.1.  Push data through
>> PostgreSQL for a couple hours (same as above) and now vacuum reports
>> this:
>
> Are you pushing the same amount of data through the ubuntu server?  if
> not, then the comparison is invalid, if so, then yeah, there's some
> kind of difference between the platforms.
>
> Note that Fedora Core 6 is quite old compared to ubuntu 8.04 or 8.10.
> Also it's more likely to be installed on older and / or slower
> equipment.
>

Yep - actually, we pushed much more data through the Ubuntu system and
could never reproduce the problem.  On the Fedora Core 6 system, the
problem happened very quickly.

In our testing here, the Ubuntu test was on the same hardware as the
fedora core 6 system (not just identical, but the same actual box)

It seems that there is some sort of bad interaction between some part
of the older OS and PostgreSQL.  We have also seen what appears to be
the same issue on a CentOS 4.4 system, which is at a rather similar
package level to Fedora Core 6.



Re: [GENERAL] Slow Vacuum was: vacuum output question

2009-01-06 Thread Dan Armbrust
On Tue, Jan 6, 2009 at 3:01 PM, Alvaro Herrera
 wrote:
> Dan Armbrust escribió:
>
>> What on earth could be going on between PostgreSQL 8.1 and Fedora 6
>> that is bloating and/or corrupting the indexes like this?
>
> Postgres 8.1 was slow to vacuum btree indexes.  My guess is that your
> indexes are so bloated that it takes a lot of time to scan them.
>
> I think the solution here is to vacuum this table far more often.
>
> --
> Alvaro Herrera    http://www.CommandPrompt.com/
> The PostgreSQL Company - Command Prompt, Inc.
>

Actually, the customer-reported problem is that when they enable
autovacuum, the performance basically tanks because vacuum runs so
slowly that they can't bear to have it run frequently.

Though, perhaps they had bloated indexes before they started
autovacuum, and it never fixed them.  Perhaps it will behave properly
if we do a reindex, and then enable autovacuum.
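
Concretely, something along these lines on the bloated table (table
name from their schema), and then re-enabling autovacuum in
postgresql.conf:

REINDEX TABLE cpe;
-- then in postgresql.conf: autovacuum = on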



Re: [GENERAL] Slow Vacuum was: vacuum output question

2009-01-06 Thread Dan Armbrust
Here is an interesting new datapoint.

Modern Ubuntu distro - PostgreSQL 8.1.  SATA drive.  No Raid.  Cannot
reproduce slow vacuum performance - vacuums take less than a second
for the whole database.

Reinstall OS - Fedora Core 6 - PostgreSQL 8.1.  Push data through
PostgreSQL for a couple hours (same as above) and now vacuum reports
this:

INFO:  vacuuming "public.cpe"
INFO:  index "pk_cpe" now contains 50048 row versions in 2328 pages
DETAIL:  415925 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.05s/0.33u sec elapsed 0.94 sec.
INFO:  index "ix_cpe_ispid" now contains 50090 row versions in 1338 pages
DETAIL:  415925 index row versions were removed.
953 index pages have been deleted, 0 are currently reusable.
CPU 0.27s/0.22u sec elapsed 8.93 sec.
INFO:  index "ix_cpe_enable" now contains 50676 row versions in 1637 pages
DETAIL:  415925 index row versions were removed.
1161 index pages have been deleted, 0 are currently reusable.
CPU 0.45s/0.31u sec elapsed 14.01 sec.
INFO:  "cpe": removed 415925 row versions in 10844 pages
DETAIL:  CPU 1.48s/0.25u sec elapsed 35.86 sec.
INFO:  "cpe": found 415925 removable, 50003 nonremovable row versions
in 10849 pages
DETAIL:  6 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 2.40s/1.18u sec elapsed 61.13 sec.



It took 61 seconds to vacuum, and the number of index row versions
removed was huge.
We then issued a reindex command for the entire database - and now the
vacuum times are back down under a second.

What on earth could be going on between PostgreSQL 8.1 and Fedora 6
that is bloating and/or corrupting the indexes like this?

Thanks,

Dan



Re: [GENERAL] Slow Vacuum was: vacuum output question

2008-12-30 Thread Dan Armbrust
>
>> Their workaround had been to run a daily autovacuum at the lowest load
>> time of day, to cause the least disruption.
>
> What is a "daily autovacuum"?  It sounds like some tables just need
> vacuuming more often.  If they find that the system is not responsive
> during that, it tells us that they need more disk bandwidth or that
> they need to integrate vacuuming some tables with their program.
>

Sorry, I meant a daily manual vacuum.

On paper, their hardware is plenty fast for their workload.  Out of
hundreds of sites, all running the same software putting load on the
database, this is only the second time where we have seen this odd
behaviour of very slow vacuums.

I guess I was hoping that someone would be able to chime in and say -
yes, in so and so version, we fixed an obscure bug that sometimes
caused huge slowdowns, perhaps when combined with certain linux
kernels.  It was a nice dream anyway :)

iozone looks useful.  I'll see if I can get on their system and do
some proper benchmarks.

Thanks,

Dan



Re: [GENERAL] Slow Vacuum was: vacuum output question

2008-12-30 Thread Dan Armbrust
>> INFO:  "cpe": found 95498 removable, 18757 nonremovable row versions
>> in 7 pages
>> DETAIL:  0 dead row versions cannot be removed yet.
>> There were 280173 unused item pointers.
>> 0 pages are entirely empty.
>> CPU 5.35s/0.99u sec elapsed 724.38 sec.
>
> How many idle transactions are there?
>

Not sure.  I don't expect that there were many; their system isn't
very highly loaded compared to most of our customers' systems.

The way that they reported the problem to us was that when they enabled
autovacuum, whenever it ran (about 4 times an hour) their system would
stop processing the things it needed to process for several minutes,
due to table lock contention.

Their workaround had been to run a daily autovacuum at the lowest load
time of day, to cause the least disruption.



Re: [GENERAL] Slow Vacuum was: vacuum output question

2008-12-30 Thread Dan Armbrust
>> INFO:  "cpe": found 95498 removable, 18757 nonremovable row versions
>> in 7 pages
>> DETAIL:  0 dead row versions cannot be removed yet.
>> There were 280173 unused item pointers.
>> 0 pages are entirely empty.
>> CPU 5.35s/0.99u sec elapsed 724.38 sec.
>>
>> Then, running vacuum again immediately afterword, on a system that was
>> basically idle, would result in nearly the same amount of time to
>> vacuum the table.
>
> You do realize that except for the end of a table, vacuum recovers no
> actual space, just makes it available for new tuples to move in there.
>  So it makes sense that the second vacuum would take just as long, or
> nearly so.
>

Yep.  The real issue is that it took 724 seconds, instead of say, a
half second - like it does on my system.  I wasn't sure if I should
expect vacuum to take longer to run when it finds a large number of
tuples that it needs to make available, so I just have them run it
twice so I can easily compare the time that it takes with essentially
no work to do.

>
> Hard to say.  Have them run
>
> vmstat 1
>
> while vacuuming and see what bi/bo look like.
>

Haven't looked at that yet on this particular system.  Last time, on
different hardware, when this occurred the vmstat 'wa' column showed
very large values while vacuum was running.  I don't recall what the
bi/bo columns indicated.

top also showed very high load averages while vacuum was running - but
basically no CPU use.

Are there any common tools that could do a better disk benchmark than
hdparm -Tt?
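
Even something crude might be more telling than hdparm - e.g. a dd
write test that forces the data out to disk, or a tool like iozone
(sizes below are arbitrary):

dd if=/dev/zero of=/tmp/ddtest bs=8k count=131072 conv=fdatasync
iozone -a -g 2G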

Thanks,

Dan



[GENERAL] Slow Vacuum was: vacuum output question

2008-12-30 Thread Dan Armbrust
To follow up on an old thread that I started - I had a customer who
had a system where manual vacuum runs were taking a very long time to
run.  I was seeing output like this:

INFO:  "cpe": found 95498 removable, 18757 nonremovable row versions
in 7 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 280173 unused item pointers.
0 pages are entirely empty.
CPU 5.35s/0.99u sec elapsed 724.38 sec.

Then, running vacuum again immediately afterword, on a system that was
basically idle, would result in nearly the same amount of time to
vacuum the table.

Getting a copy of the database from the customer, and loading it onto
my Postgres System and running the vacuum would result in runs that
took less than a second (as expected).

General opinion was that it was a disk-io problem.  We rebooted the
system, and magically, the problem went away.

As tends to be the case with "magically" fixed problems, this one is
back.  Now I have a different customer running Postgres 8.1 on Fedora
Core 6, and their system produced that log snippit above.

hdparm shows disk I/O throughput being perfectly normal on their system.
Everything else on the system seems to be working correctly.  The
reboot solution doesn't help.  Vacuum still runs painfully slow.

I'm still waiting for a copy of their postgresql config file, but I'm
guessing that almost everything was left on the default settings.

I don't suppose anyone knows of any bugs that existed between Postgres
8.1 and older Linux kernels that led to behavior like this?  I can't
really just ask them to upgrade their OS and/or Postgres on the hunch
that the problem should go away.  And I haven't yet been able to
reproduce anything like this on a system that I have control over.

Thanks for any ideas.

Dan



Re: [GENERAL] vacuum output question

2008-11-14 Thread Dan Armbrust
>
> There was concurrent access to the table during VACUUMing, so the long
> delay is explainable as long waits for cleanup lock, plus probably
> thrashing the cache with bloated indexes. The CPU overhead per row seems
> OK. We should instrument the wait time during a VACUUM and report that
> also.
>
> --
>  Simon Riggs   www.2ndQuadrant.com
>  PostgreSQL Training, Services and Support


Is that a guess?  Or something you can tell from the log above?
Because there shouldn't have been any concurrent access while the
VACUUM was run - the customers had failed over to a different system,
so while I can't be sure, I expect that there was no other database
activity at the time the command was run.

Thanks,

Dan



Re: [GENERAL] vacuum output question

2008-11-14 Thread Dan Armbrust
Thanks everyone.  You have helped back up my suspicions.  It is indeed
a Linux system, and it has a RAID IO system, but I don't yet know the
details of that IO system.  Time to put them back to work looking at
their hardware, rather than blaming our software :)

Thanks for the extra tips on hunting for excess IO from other processes.

Dan



[GENERAL] vacuum output question

2008-11-13 Thread Dan Armbrust
I have a system backed by a PostgreSQL DB at a customer site that
mysteriously slowed way down - and couldn't keep up with the load for
no apparent reason.

I had them run a vacuum analyze verbose on my database, and had these
lines come back which made me suspicious:

INFO:  index "ix_cpe_ispid" now contains 41626 row versions in 13727 pages
DETAIL:  5224 index row versions were removed.
1543 index pages have been deleted, 1373 are currently reusable.
CPU 13.09s/3.51u sec elapsed 157.85 sec.

INFO:  index "ix_cpe_enable" now contains 41628 row versions in 29417 pages
DETAIL:  5224 index row versions were removed.
3706 index pages have been deleted, 3291 are currently reusable.
CPU 31.27s/8.22u sec elapsed 687.60 sec.

INFO:  "cpe": found 5224 removable, 41626 nonremovable row versions in
1303 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 22416 unused item pointers.
0 pages are entirely empty.
CPU 44.46s/11.82u sec elapsed 852.85 sec.


Why did those particular tables and indexes take _so_ long to vacuum?
Perhaps we have a disk level IO problem on this system?

Can someone tell me what 'CPU 44.46s/11.82u sec' means?  I have a
guess, but I'm not sure.

Thanks,

Dan



Re: [GENERAL] Drop database / database in use question

2008-10-17 Thread Dan Armbrust
This is interesting.

On Windows, if I attempt to drop a database when I am purposefully
holding a connection open to the DB, it fails immediately - ~100 ms.

On Linux, if I do the same thing, the drop database command takes
upwards of 4 seconds before it gives up and says it can't drop the
DB.

Is this expected?  Is there some way I can control how long it will
block the drop command, waiting for the DB to not be in use?

Thanks,

Dan



Re: [GENERAL] Drop database / database in use question

2008-10-17 Thread Dan Armbrust
It would seem that way.  But if you have ever tried programming with
the constraints of an InstallAnywhere installer, you would know why :)



On Fri, Oct 17, 2008 at 10:43 AM, Alvaro Herrera
<[EMAIL PROTECTED]> wrote:
> Dan Armbrust escribió:
>> I don't suppose that there is any easy way way that I can stop and/or
>> disable the Autovac temporarily for the database that I want to drop.
>>
>> The only thing that I have seen so far, is that I would have to add
>> rows to the pg_autovacuum  table for each table in my database, but
>> I'mnot confident that that would even stop it from accessing the
>> database, since it says that even when set to disabled, autovacuum
>> will still run to prevent transaction id wraparounds, so it seems that
>> it still would make make a quick check into the database that my drop
>> command could collide with.
>
> You seem to want to go to a lot of trouble just to a void a simple retry
> loop.
>
> --
> Alvaro Herrera    http://www.CommandPrompt.com/
> PostgreSQL Replication, Consulting, Custom Development, 24x7 support
>



Re: [GENERAL] Drop database / database in use question

2008-10-17 Thread Dan Armbrust
I don't suppose that there is any easy way that I can stop and/or
disable the autovacuum daemon temporarily for the database that I want to drop.

The only thing that I have seen so far is that I would have to add
rows to the pg_autovacuum table for each table in my database, but
I'm not confident that that would even stop it from accessing the
database, since the documentation says that even when set to disabled,
autovacuum will still run to prevent transaction ID wraparound, so it
seems that it still would make a quick check into the database that my
drop command could collide with.



Re: [GENERAL] Drop database / database in use question

2008-10-17 Thread Dan Armbrust
But there is no user2.  I _know_ I am the only user of this database.

So how can User 1 create a race condition by himself?

Or is this something PostgreSQL is doing internally (like vacuum) ?

Do I really just have to keep trying the DROP command N times in a
row, until it decides it wants to work?  That really doesn't seem
right.

Thanks,

Dan

On Fri, Oct 17, 2008 at 9:24 AM, Scott Marlowe <[EMAIL PROTECTED]> wrote:
> On Fri, Oct 17, 2008 at 8:11 AM, Dan Armbrust
> <[EMAIL PROTECTED]> wrote:
>>> There are obvious race conditions in that assumption.  Why don't you
>>> just try the drop and see if it succeeds?
>>>
>>>regards, tom lane
>>>
>>
>> I don't follow - why is there a race condition?  I'm driving the
>> commands into postgresql via the command line.
>
> User 1: select * from pg_stat_activity where datname='db123';
> User 2: psql db123
> User 1: drop database db123;
>



Re: [GENERAL] Drop database / database in use question

2008-10-17 Thread Dan Armbrust
> There are obvious race conditions in that assumption.  Why don't you
> just try the drop and see if it succeeds?
>
>regards, tom lane
>

I don't follow - why is there a race condition?  I'm driving the
commands into postgresql via the command line.

The command that does the query on the pg_stat_activity table happens
quite a while before my attempt to drop the database - and it's logging
into the template1 database, rather than the database I want to drop.
The drop attempt comes later, in a subsequent psql command line
invocation.  The drop command also logs in using the template1
database.  Does the psql command line client connection not get
cleaned up immediately, or something like that?

No other command or tool will access this database (to my knowledge)
in between the two commands.  So what is the mystery user that I'm
finding using the table?  My only guess so far is that it was the
autovac daemon - but I don't know enough about how that works to know
if that is even a reasonable guess.

Due to the nature of the installer tool I'm driving this from, parsing
back the output of the psql commands isn't much fun... and there are
cases where a failure is acceptable (the database already doesn't
exist, etc.).

If I can have a reliable drop command that always works, it would be
much easier.

Thanks,

Dan



[GENERAL] Drop database / database in use question

2008-10-16 Thread Dan Armbrust
I have some installation tools which in some cases need to replace a
postgres database.

For this example, assume my database name is 'fred'.

Also, assume that no other processes are accessing this database.  Any
database access will come from my running installer.


First,  after validating some passwords, my installer will run this command:

psql -c "select datname from pg_stat_activity where datname='fred'" -U
username -d template1

I then scan the output looking for 'fred'.  My (perhaps incorrect)
assumption is that if I don't find the database name 'fred' in the
output, then I can assume that no process is using this database, and
I should be able to drop it.

Later, my installer runs this code:

psql -c "drop database fred" -U username -d template1

99% of the time, everything works fine.

1% of the time, the  drop fails, and I get this:

ERROR:  database "fred" is being accessed by other users

My two part question is why, and what can I do about it?  At a
minimum, I need to be able to reliably determine if I can drop the
database.  Ideally, I would like to be able to drop the database even
though it is still in use - force an override of some sort - kick out
the offending user.
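
A retry loop would be a rough fallback on my end - a minimal sketch,
with the retry count and sleep being arbitrary, and assuming psql
exits non-zero when the DROP fails (worth verifying):

for i in 1 2 3 4 5; do
  psql -c "drop database fred" -U username -d template1 && break
  sleep 2
done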

Thanks for any advice

Dan



Re: [GENERAL] checkpoint_segments warning?

2008-05-05 Thread Dan Armbrust
>  > That part that confused me was the "-2484".  Is that just a hyphen in
>  > a strange place?  Or did it really calculate a negative value?
>
>  Yeah, it really did.  AFAICS this could only happen if the value of time(2)
>  went backwards.  Something fooling with your system clock?
>
> regards, tom lane
>

Or _someone_.  Yes.  Precisely.  I may even know who did it ;)

Thanks,

Dan



[GENERAL] checkpoint_segments warning?

2008-05-05 Thread Dan Armbrust
I got a warning from PostgreSQL 8.3.1 that said something to the effect of

checkpoints are happening too frequently... (-2484 seconds)

The part that confused me was the "-2484".  Is that just a hyphen in
a strange place?  Or did it really calculate a negative value?

Is this expected?

Thanks,

Dan



Re: [GENERAL] Deadlock situation?

2008-05-01 Thread Dan Armbrust
On Wed, Apr 30, 2008 at 12:36 PM,  <[EMAIL PROTECTED]> wrote:
> > In this case, Postgres had been started in the foreground on a
>  > terminal, so I went to that terminal, and did a ctrl-c.  Eventually,
>  > postgres stopped, but the terminal wouldn't respond either - and I had
>  > to close it.
>
>  Just out of curiosity, could you maybe have XOFF'd the terminal? I've had
>  cases in the past where Postgres got unhappy and froze when it couldn't
>  write logging information to stderr. (Though, granted, in the cases I hit
>  the server's stderr was redirected to a pipe that was full, so it's not
>  100% analogous)

It's certainly possible that something "interesting" happened to the
terminal.  We do have an issue with a KVM switch on this machine that
sometimes causes all sorts of unpredictable random garbage input to
hit the screen when you switch to the system.

If I can't reproduce the problem (which so far I have not been able
to), I'll probably let it go, perhaps naively believing that some
terminal issue set off the chain of events.

Thanks,

Dan



Re: [GENERAL] Deadlock situation?

2008-04-30 Thread Dan Armbrust
>  I wonder whether you saw some variant of the deadlock reported in bug
>  #3883 --- see discussion here:
>  http://archives.postgresql.org/pgsql-hackers/2008-01/msg00873.php
>  The only known way of getting into that problem is fixed in 8.3,
>  but a potential for silent deadlock is definitely still there.
>  I don't however have any idea how such a deadlock could occur without
>  at least some processes showing as 'waiting' in pg_stat_activity.
>
>  Do you have any idea what the transactions that were blocked at
>  COMMIT had been doing?

Should have just been selects and updates, and perhaps some inserts
and deletes - nothing too out of the ordinary or complex.

If I can get this to happen consistently, I'll set up a logger to
capture all of the SQL statements leading up to the freeze.

>
>  Are there any foreign key constraints leading *to* this table?

Nope.

Unfortunately, I had been running for a couple of weeks steady before
I got the error - so reproducing it may not be quick.  But I'm going
to set up a couple of parallel tests on more systems, and see if I can
happen into it again sooner.

Oh, and I don't think it should have anything to do with it, but I
did discover that fsync was turned off on this postgres instance.
Someone else didn't clean up after a different test, and I forgot to
change it back.

Thanks,

Dan



Re: [GENERAL] Deadlock situation?

2008-04-30 Thread Dan Armbrust
>  Well, there went the evidence :-( ... but what exactly did you have
>  to do to shut it down?  I'm wondering whether the backends responded
>  to SIGINT or SIGTERM.

Sorry :(

First, I tried issuing a kill to the pid of the vacuum process.  It
didn't respond to kill.
Then, I tried a kill -9.  It responded to that :)

However, postgres got very unhappy at this point - I tried to log in
to issue another query against the pg_stat_activity table, but every
command I issued at this point simply hung.

In this case, Postgres had been started in the foreground on a
terminal, so I went to that terminal, and did a ctrl-c.  Eventually,
postgres stopped, but the terminal wouldn't respond either - and I had
to close it.

>
>  Next time, it'd be good to confirm (with top or vmstat or similar)
>  whether the backends are actually idle or are eating CPU or I/O.

I didn't notice any high load on the processor - the server seemed
idle.  But I didn't look real close a the individual postgres
processes.

>  Also try strace'ing a few of them; the pattern of kernel calls if
>  any would be revealing.
>
>  The lack of deadlock reports or 't' values in pg_stat_activity.waiting
>  says that you weren't blocking on heavyweight locks.  It's not
>  impossible that there was a deadlock at the LWLock level, though.
>
>  What sort of indexes are there on this table?

Here is the DDL for the table where vacuum was trying to run:

CREATE TABLE iphost
(
  ethernetmacaddr char(17) NOT NULL,
  cpemac char(11) NOT NULL,
  ipaddr varchar(15) NOT NULL,
  regtime timestamp NOT NULL,
  leasetime timestamp,
  last_updated timestamp NOT NULL DEFAULT now(),
  CONSTRAINT pk_iphost PRIMARY KEY (ethernetmacaddr, ipaddr),
  CONSTRAINT fk_iphost_cpe FOREIGN KEY (cpemac)
  REFERENCES cpe (cpemac) MATCH SIMPLE
  ON UPDATE CASCADE ON DELETE CASCADE
)
WITHOUT OIDS;
ALTER TABLE iphost OWNER TO pslogin;

CREATE INDEX ix_iphost_cpemac
  ON iphost
  USING btree
  (cpemac);

CREATE INDEX ix_iphost_ipaddr
  ON iphost
  USING btree
  (ipaddr);


> Teodor just fixed
>  an issue in GIN indexes that involved taking an unreasonable number of
>  LWLocks, and if that code wasn't exposing itself to deadlock risks
>  I'd be pretty surprised.
>
> regards, tom lane


Thanks.  If/when I can recreate this, I'll try to gather more info.

Dan



Re: [GENERAL] Deadlock situation?

2008-04-29 Thread Dan Armbrust
Semantics of deadlock aside, I seem to be deadlocked, yet postgres
didn't detect any deadlock situation.

There are no DDL queries running.  Just a lot of updates, and some
inserts and deletes.

I had to restart postgres to let things recover - so I can't gather
any more info right now - but if/when it happens again, I'd like to
know what else to gather.

Looking at the time stamps, the transaction start timestamp of this
autovacuum query is the oldest one:

 autovacuum: VACUUM public.iphost   2008-04-24 03:05:13.212436-05 |

Then, between 03:05:37 and 03:05:38, nearly every other connection
came to a halt.

A few connections came to a halt several hours later.

I'm baffled, because this autovacuum query seems to have locked the
entire database.

I also don't know what the "waiting" column means in the output - but
they all have the flag of "f".  Does that column mean that it is
waiting on a lock - t or f?
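
Next time it happens, I'll also grab the lock view - something along
these lines against pg_locks (a standard system view; I haven't
double-checked the exact column set on 8.3):

select l.pid, l.locktype, l.relation::regclass, l.mode, l.granted
from pg_locks l
where not l.granted;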

Thanks,

Dan



[GENERAL] Deadlock situation?

2008-04-29 Thread Dan Armbrust
I have an app that we were load testing - it maintains a pool of
connections to PostgreSQL - 8.3.1


Suddenly, after running fine for weeks, the app hung - unable to get a
free connection from the pool.


select * from pg_stat_activity;" shows me that most of my connections
are in a COMMIT phase:

03:05:37.73064-05  | 2008-04-24 03:05:38.419796-05 | 2008-04-24
02:11:53.908518-05 | 127.0.0.1   |   53807
 16385 | ispaaa  | 953 |16384 | pslogin  | COMMIT   | f
   | 2008-04-24


While some are in a SELECT:

 16385 | ispaaa  |1181 |16384 | pslogin  | select
dynamichos0_.ethernetmacaddr as ethernet1_0_, dynamichos0_.ipaddr as
ipaddr0_, dynamichos0_.cpemac as cpemac0_, dynamichos0_.regtime as
regtime0_, dynamichos0_.leasetime as leasetime0_,
dynamichos0_.last_updated as last5_0_ from iphost dynamichos0_, cpe
cpe1_ where  dynamichos0_.cpemac=cpe1_.cpemac and 1=1 and
dynamichos0_.ethernetmacaddr=$1 and dynamichos0_.cpemac=$2 and
cpe1_.regBaseId=$3 and dynamichos0_.ipaddr<>$4| f   |
2008-04-24 03:05:37.734041-05 | 2008-04-24 03:05:38.405854-05 |
2008-04-24 02:41:54.413337-05 | 127.0.0.1   |   55363


Perhaps VACUUM had something to do with it?:

 16385 | ispaaa  |8956 |16384 | pslogin  | delete from iphost
where leasetime<$1  | f   | 2008-04-24 18:43:29.920069-05 |
2008-04-24 18:43:30.116435-05 | 2008-04-24 18:41:59.071032-05 |
127.0.0.1   |

  49069  16385 | ispaaa  |1618 |   10 | postgres | autovacuum:
VACUUM public.iphost  | f   | 2008-04-24 03:05:13.212436-05 |
2008-04-24 03:05:13.212436-05 | 2008-04-24 03:05:12.526611-05 |
 |


Where should I begin to look for the source of this problem?

Thanks for any info,

Dan

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] errors in pg_restore on windows?

2008-02-29 Thread Dan Armbrust
I didn't do anything specific to load those functions into template1.

I was under the impression that that particular debugger wasn't
supposed to be on by default on windows - and that it was an installer
oversight - and that you may turn it off by default for future builds
of the windows installer.

But that seems to lead the pg_restore function to report errors -
which would certainly scare my users.

Perhaps I didn't disable it properly.
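
For reference, one quick way to check whether the debugger's functions
actually ended up in the database being dumped (or in template1, which new
databases are cloned from) would be something like this - just a sketch:

-- look for the plpgsql debugger's functions in the current database
SELECT proname FROM pg_proc WHERE proname LIKE 'pldbg%';

-- if they are present and unwanted, dropping them before pg_dump should
-- avoid the restore errors, e.g. (signature taken from the error output):
-- DROP FUNCTION pldbg_abort_target(integer);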

If you're not concerned about it, I won't be either.  I'll just stop
turning off that debug flag.

I guess it's all a moot point if no changes were planned to the
windows installer WRT that debug setting.

Thanks,

Dan

On Fri, Feb 29, 2008 at 11:05 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Dan Armbrust" <[EMAIL PROTECTED]> writes:
>
> > If I disable that debugger library in the config file, create a
>  > database, backup the database, and then restore the database, I get
>  > many errors - all similar to the one I posted before.
>
>  Perhaps you've got those functions loaded into template1?
>
>
>  > Are you saying they are really just warnings?
>
>  Well, they're errors in the sense that those functions didn't get
>  restored.  But if you're trying to remove the plpgsql debugger,
>  I think that's what you want.
>
> regards, tom lane
>

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] errors in pg_restore on windows?

2008-02-29 Thread Dan Armbrust
Sorry, I don't follow.

If I disable that debugger library in the config file, create a
database, backup the database, and then restore the database, I get
many errors - all similar to the one I posted before.

They are reported at the end of the restore as errors.

Are you saying they are really just warnings?

Thanks,

Dan

On Fri, Feb 29, 2008 at 10:19 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Dan Armbrust" <[EMAIL PROTECTED]> writes:
>
> > It turns out, that commenting out the "shared_preload_libraries" in
>  > the postgresql.conf file breaks pg_restore.
>
>  It hardly "breaks" anything.  The errors you show just indicate that you
>  had the plpgsql debugger loaded into your database.
>
> regards, tom lane
>

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] errors in pg_restore on windows?

2008-02-29 Thread Dan Armbrust
Ok, it turns out that the fact that my pg_restore won't work on
windows is directly related to this (from a couple days ago)

"Dan Armbrust" <[EMAIL PROTECTED]> writes:
> To follow up on my old thread -
> I tested another install of Postgres 8.3.0.1 - and on windows, the
> postgresql.conf file has this in it:

> shared_preload_libraries = '$libdir/plugins/plugin_debugger.dll'  
> #
> (change requires restart)

> I got the impression that that debugger wasn't intended to be enabled,
> by default.  It looks like the windows installer is enabling it,
> however.

It turns out that commenting out the "shared_preload_libraries" in
the postgresql.conf file breaks pg_restore.

So, perhaps that shouldn't be disabled by default in the installer.

Dan


On Thu, Feb 28, 2008 at 5:13 PM, Dan Armbrust
<[EMAIL PROTECTED]> wrote:
> Using PostgreSQL 8.3 on Windows, I make a backup like this:
>  pg_dump -U postgres -C -d -D -Fc -f ispaaa-pg.bak ispaaa
>
>  Then restore like this:
>  pg_restore -U postgres -C -d template1 -v -Fc ispaaa-pg.bak
>
>  And I get lots of these errors:
>
>
>  pg_restore: creating TABLE voip
>  pg_restore: creating FUNCTION pldbg_abort_target(integer)
>  pg_restore: [archiver (db)] Error while PROCESSING TOC:
>  pg_restore: [archiver (db)] Error from TOC entry 21; 1255 16419
>  FUNCTION pldbg_abort_target(integer) postgres
>  pg_restore: [archiver (db)] could not execute query: ERROR:  could not
>  load library "C:/Program Files (x86)/PostgreSQL/8.3/lib/pldbgapi.dll":
>  The specified module could not be found.
>
> Command was: CREATE FUNCTION pldbg_abort_target(session integer) RETURNS 
> SET
>  OF boolean
> AS '$libdir/pldbgapi', 'pldbg_abort_target'
>   ...
>  pg_restore: [archiver (db)] could not execute query: ERROR:  function 
> public.pld
>  bg_abort_target(integer) does not exist
> Command was: ALTER FUNCTION public.pldbg_abort_target(session integer) 
> OWNER
>   TO postgres;
>
>
>  It looks like a packaging issue with the windows installer - these
>  commands work fine for me on Linux.
>
>  Thanks,
>
>  Dan
>

---(end of broadcast)---
TIP 6: explain analyze is your friend


[GENERAL] errors in pg_restore on windows?

2008-02-28 Thread Dan Armbrust
Using PostgreSQL 8.3 on Windows, I make a backup like this:
pg_dump -U postgres -C -d -D -Fc -f ispaaa-pg.bak ispaaa

Then restore like this:
pg_restore -U postgres -C -d template1 -v -Fc ispaaa-pg.bak

And I get lots of these errors:


pg_restore: creating TABLE voip
pg_restore: creating FUNCTION pldbg_abort_target(integer)
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 21; 1255 16419
FUNCTION pldbg_abort_target(integer) postgres
pg_restore: [archiver (db)] could not execute query: ERROR:  could not
load library "C:/Program Files (x86)/PostgreSQL/8.3/lib/pldbgapi.dll":
The specified module could not be found.

Command was: CREATE FUNCTION pldbg_abort_target(session integer) RETURNS SET
OF boolean
AS '$libdir/pldbgapi', 'pldbg_abort_target'
  ...
pg_restore: [archiver (db)] could not execute query: ERROR:  function public.pld
bg_abort_target(integer) does not exist
Command was: ALTER FUNCTION public.pldbg_abort_target(session integer) OWNER
 TO postgres;


It looks like a packaging issue with the windows installer - these
commands work fine for me on Linux.

Thanks,

Dan

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[GENERAL] Windows 8.3 installer leaves a debug flag on?

2008-02-25 Thread Dan Armbrust
To follow up on my old thread -

I tested another install of Postgres 8.3.0.1 - and on windows, the
postgresql.conf file has this in it:

shared_preload_libraries = '$libdir/plugins/plugin_debugger.dll'
#
(change requires restart)

I got the impression that that debugger wasn't intended to be enabled
by default.  It looks like the windows installer is enabling it,
however.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] performance issues on windows with 8.3.0?

2008-02-15 Thread Dan Armbrust
Thanks for all the help.  Performance is back where I thought it
should be, after I fixed our pooling bug.

I didn't think that postgres would be released with performance issues
like that - it's just too good :)

Thanks,

Dan

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] performance issues on windows with 8.3.0?

2008-02-14 Thread Dan Armbrust
So, my ill Postgres 8.3 database is filling up log files in the pg_log
directory with the following:

2008-02-13 18:29:31 CST LOG:  loaded library
"$libdir/plugins/plugin_debugger.dll"
2008-02-13 18:29:31 CST LOG:  loaded library
"$libdir/plugins/plugin_debugger.dll"
2008-02-13 18:29:31 CST LOG:  loaded library
"$libdir/plugins/plugin_debugger.dll"
2008-02-13 18:29:31 CST LOG:  loaded library
"$libdir/plugins/plugin_debugger.dll"
2008-02-13 18:29:31 CST LOG:  loaded library
"$libdir/plugins/plugin_debugger.dll"
2008-02-13 18:29:31 CST LOG:  loaded library
"$libdir/plugins/plugin_debugger.dll"
2008-02-13 18:29:31 CST LOG:  loaded library
"$libdir/plugins/plugin_debugger.dll"
2008-02-13 18:29:31 CST LOG:  loaded library
"$libdir/plugins/plugin_debugger.dll"
2008-02-13 18:29:31 CST LOG:  loaded library
"$libdir/plugins/plugin_debugger.dll"
2008-02-13 18:29:31 CST LOG:  loaded library
"$libdir/plugins/plugin_debugger.dll"
2008-02-13 18:29:31 CST LOG:  loaded library
"$libdir/plugins/plugin_debugger.dll"


Does this mean anything to anyone?

I can hardly believe that I'm the first one to notice that the
windows build of 8.3 has serious issues.

Thanks,

Dan

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[GENERAL] performance issues on windows with 8.3.0?

2008-02-13 Thread Dan Armbrust
Are there any known regression issues WRT performance on the 8.3.0.1
binary build for windows?

And I mean serious -multiple orders of magnitude- performance issues
running simple queries on a small database...

A little more background.  I built 8.3.0 on Cent OS 5 today.  Started
using it with a server application that I have to do some
benchmarking.  Java application connecting via jdbc to Postgres on
localhost.  Everything looked great.  Perhaps even faster than 8.2 -
but I don't have the benchmarks to say for sure yet.

Then, I wanted to test what is usually our fastest performing setup -
with the database on a second system.  So, I installed the new 8.3 on
a Windows 2003 system, started it up, created my database, pointed my
java app to it and everything seemed good.

Then I put a load on it - and my server thru-put went from about 1500
messages per second on the linux-localhost combination to 30 on the
remote windows postgres installation.

I will be investigating this much more completely first thing tomorrow
- but I figured I'd ask to see if there was something obvious first...

Thanks,

Dan

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[GENERAL] Data conversion tools?

2006-11-15 Thread Dan Armbrust
I'm trying to convert a database from either MS Access or MySQL into 
Postgres.  I have found a couple of tools that will almost do what I 
want - but not quite.  To make things match up with code that is already 
written - I need to have all of the table names and column names 
converted to lower case during the conversion process.


I have found a couple of (free or demo) tools that will do it - navicat 
has a tool that will convert from MSAccess to Postgres - however, it 
doesn't offer a feature to lowercase all of the table names and column 
names.


DB Tools Manager Professional will do it as well - but again, no way to 
tell it to lowercase things in the process.


PGAdmin II had a tool that would do this - but alas - that part of 
PGAdmin was never brought into PGAdmin III.  And the last version of 
PGAdmin II that I was able to find wouldn't run against my Postgres DB.


Short of installing an older Postgres DB that PGAdmin II will work with 
- does anyone else know of a tool that can do what I want?


Is there a script of some sort that I can run that would go through a 
set of tables and lowercase everything?
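
If no existing tool turns up, a rough script sketch along these lines (run
in psql after the import; it assumes everything landed in the public schema)
generates the rename statements, which can then be reviewed and executed -
run the column renames before the table renames, since they refer to the
current table names:

-- generate RENAME COLUMN statements for mixed-case column names
SELECT 'ALTER TABLE ' || quote_ident(c.relname) || ' RENAME COLUMN "'
       || a.attname || '" TO ' || lower(a.attname) || ';'
  FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
  JOIN pg_attribute a ON a.attrelid = c.oid
 WHERE n.nspname = 'public' AND c.relkind = 'r'
   AND a.attnum > 0 AND NOT a.attisdropped
   AND a.attname <> lower(a.attname);

-- generate RENAME statements for mixed-case table names
SELECT 'ALTER TABLE "' || tablename || '" RENAME TO ' || lower(tablename) || ';'
  FROM pg_tables
 WHERE schemaname = 'public' AND tablename <> lower(tablename);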


Thanks,

Dan


--

Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] GUI Interface

2006-05-12 Thread Dan Armbrust

Tony Caduto wrote:

Jim Wilson wrote:


Those Windows developer tools must be working really well   
How right you are Jim, far superior and more productive than anything
available on Linux at this time, though the netbeans IDE comes close; too
bad Java is slow for almost all desktop applications.
One example is LimeWire - while it works well, it takes forever to load and
the UI seems a little sluggish compared to a native compiled application.



You live in an interesting world... meanwhile, I'm here in the real 
world, using Eclipse - the best IDE I've ever used to develop java 
applications.  Oh, wait, Eclipse is written in Java?  I didn't think it 
was possible to write good apps in java?  Certainly better than visual 
studio (and yes, I have to use both - eclipse is a far better IDE in my 
opinion).  Oh, and I can run eclipse on my linux desktop as well as my 
windows desktop?  That's just cool.  Oh, and my next machine?  Definitely 
a Mac.  And it works there too - just like all of my SWT apps do.


I will admit, it is rather easy to write slow java swing applications. 
There are a lot of poor ones out there.   It's a shame that Sun botched 
swing so badly, and has never repaired it properly.  It's not impossible 
to write fast, responsive apps in swing, it just takes skilled 
developers.  And it's a pain.


However, now with the emergence of SWT and modern JVM's - there is no 
reason for your java GUI to be any slower than anything else.  The only 
excuse for a slow java app these days is the quality of the code that it 
is built with.  And you can write a bad, slow app in any language.


Oh, and to get back on topic a bit more - DBVisualizer is also a nice 
database GUI (written in Java, by the way) that is very fast, 
responsive, and cross platform.


http://www.minq.se/products/dbvis/

It's also rather handy if you have to deal with 5 different types of 
databases on a daily basis, since it handles them all.


Dan


--

Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


[GENERAL] Problem dropping a table

2006-05-09 Thread Dan Armbrust
I have a java application that is trying to dynamically drop a set of 
tables.  Problem is, when it gets to a particular table and I execute 
the "drop table foo cascade" command from a prepared statement, the 
query never returns.  It just hangs indefinitely.  I presume that it is 
waiting on a lock for this table, but there is nothing that I know of 
that should be locking this table.


So, I have two issues - the first, how do I prevent myself from hanging 
indefinitely?  I tried 'setQueryTimeout' on the prepared statement, but 
it doesn't seem to have any effect.  Still hangs indefinitely.


The second, how can I track down what is locking this table?  I presume 
that it is some of my code somewhere... maybe a prepared statement that 
I didn't get closed - but I can't find it.  I've checked my code twice for 
any reference to this table, and every use of it is properly closing the 
result sets and the prepared statement.
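
For the record, two rough sketches of what I can try next (the table name
here is just illustrative): a server-side timeout so the drop errors out
instead of hanging forever, and a peek at pg_locks from a second connection
to see who holds locks on the table:

-- 1) give up instead of hanging: statement_timeout is in milliseconds
SET statement_timeout = 30000;
DROP TABLE foo CASCADE;

-- 2) from another session, list backends holding or waiting for locks on it
SELECT pid, mode, granted
  FROM pg_locks
 WHERE relation = 'foo'::regclass;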


Any ideas?

Thanks,

Dan


--

Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Create a new database from JDBC?

2006-02-09 Thread Dan Armbrust

Dave Page wrote:



On 8/2/06 17:26, "Joshua D. Drake" <[EMAIL PROTECTED]> wrote:


Certainly not what I want, since that database doesn't exist.  Is
there a system database I could always count on being available that I
could connect to?

template1 but only if you allow it from pg_hba.conf.


The postgres database would be more appropriate on 8.1+ (it is specifically
inteded for that kind of thing), however like template1, it can be dropped,
though that isn't advisable without good reason.

Regards, Dave.




So there is no guaranteed way to do this, unless there is a known, 
existing database?  Seems like a shortcoming, to me.  Would this be 
considered as a feature request, or does the current design make it too 
difficult to implement?
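
For anyone else who hits this, the suggested approach boils down to
something like the following sketch (the JDBC URLs are shown as comments,
since the create itself is plain SQL):

-- connect with e.g.  jdbc:postgresql://servername/template1
--              (or   jdbc:postgresql://servername/postgres  on 8.1 and later)
CREATE DATABASE foo;
-- then reconnect with  jdbc:postgresql://servername/foo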


Dan

--

Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[GENERAL] Create a new database from JDBC?

2006-02-08 Thread Dan Armbrust
How do I make a connection from JDBC to a PostgreSQL server that 
(potentially) has 0 databases, so that I can issue a create database command?


In that _other_ database, I can make a jdbc connection to an address 
like this:


jdbc:mysql://servername/

And then issue a "Create database foo" command.

Then I can reconnect to
jdbc:mysql://servername/foo

And everything is good.

If I try to connect to a postgresql database like this:

jdbc:postgresql://servername/

Postgres attempts to connect to a database that is named the same as the 
username I provided.


Certainly not what I want, since that database doesn't exist.  Is there 
a system database I could always count on being available that I could 
connect to?


Thanks,

Dan

--

Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


[GENERAL] Specify schema in jdbc connection string?

2005-12-08 Thread Dan Armbrust
Is there any way to specify which schema I want to use (or have at the 
front of the search path) within the jdbc connection string?


I've seen several people asking for the feature in the archives, and one 
person saying they wrote it but the patch was rejected, but no 
authoritative answers...
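
Lacking a URL parameter, the workarounds I am aware of are to set the
search path right after the connection is opened, or to attach it to the
application's login role - rough sketches, with illustrative names:

-- per connection, immediately after it is opened:
SET search_path TO myschema, public;

-- or persistently, for the role the application logs in as:
ALTER USER appuser SET search_path TO myschema, public;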


Thanks,

Dan

--

Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] fine tuned database dump/reload?

2005-10-17 Thread Dan Armbrust

Jim C. Nasby wrote:
> pg_dump handles table ordering properly.
>

Maybe I missed something then, because it didn't last time I tried to 
move some data.  I had to drop my foreign keys before I could reload it.


Dan


--

Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] fine tuned database dump/reload?

2005-10-14 Thread Dan Armbrust

Jim C. Nasby wrote:

Since no one else has mentioned it, there has been discussion this week
on one of the lists (probably -general or -hackers) about expanding the
capabilities of pg_dump. I've advocated for allowing a file that
specifies what objects to dump and what kind of filtering to apply to
the name. Allowing for only dumping certain rows in a table (applying a
WHERE clause) would be a neat addition to that.


That would be exactly what I need - the ability to apply a where clause. 
 That - and the ability to specify multiple tables to dump, instead of 
just one.


Because currently, I still get into foreign key trouble in this scenario:

I drop data (and only data) from database A.  This data has foreign key 
constraints.  I want to load this data into database B - where the 
tables already exist, they already have foreign keys, and there is some 
existing data.


My dump from database A won't load - because the tables were dumped in 
the wrong order to satisfy the foreign keys on the reload - so the only 
way I can load this data back into an existing database is to dump each 
table individually, and reload in the proper order, or manually drop my 
foreign keys while I load the data.


Both ways are a pain - and it makes it overly difficult to export/import 
chunks of data from a database.


Dan


--

Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] fine tuned database dump/reload?

2005-10-11 Thread Dan Armbrust
My use case is not so much for database backup purposes as it is for 
fine grained export and import.


Our database schema consists of 15 tables that represent a terminology. 
   Each database can hold multiple terminologies - every table has a 
terminologyId column which distinguishes one terminology from another.


I now realize that I don't need to do anything special with individual 
tables since PostgreSQL handles foreign keys properly - so that question 
is void - I'm fine with working with a whole database at a time.


However, quite often I will load up a terminology on our development or 
test server (which also contains many other very large terminologies). 
When I am satisfied that it is ready, I want to put this terminology 
onto our production servers.  But I don't want to have to export 
millions of rows that I don't need.


With MySQL, the ability to do a dump of data which satisfies a where 
clause (for example: terminologyId='foo') gives me the ability to dump 
the data out to a file very quickly - move the file to the server it 
needs to be loaded on, and then re-load that data into the production 
database.


In PostgreSQL, the only way that I see to do this is to have my desired 
data in a database all by itself, so that I can use pg_dump to backup 
the entire database.  Then I can load that into the existing production 
database.


Is there a better way to do this?  Is there a flag I could specify for 
psql that would cause it to output INSERT or COPY statements as a result 
of a query - select * from foo where terminologyId=foo?  Then I could 
just have 15 select statements batched up in a file, and pipe the output 
into a new file.


I suppose this is kind of an obscure use case - but a flag on pg_dump 
where I could specify a where condition would certainly be handy.
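
In the meantime, the closest workaround I can see is to stage the filtered
rows into a scratch table and dump that instead - a sketch (the table and
column names here are illustrative; "terminologyId" is our discriminator
column):

-- on the source: stage only the rows for one terminology
CREATE TABLE export_concept AS
  SELECT * FROM concept WHERE terminologyid = 'foo';
-- shell:  pg_dump -t export_concept sourcedb | psql targetdb

-- on the target: move the rows into the real table, then drop the scratch copy
INSERT INTO concept SELECT * FROM export_concept;
DROP TABLE export_concept;   -- on both source and target

Repeating that per table keeps the load order under my control, which also
sidesteps the foreign key ordering problem.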


Thanks,

Dan

--

Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [GENERAL] fine tuned database dump/reload?

2005-10-11 Thread Dan Armbrust

Dan Armbrust wrote:

Bricklen Anderson wrote:



pg_dump --help
...
  -n, --schema=SCHEMA  dump the named schema only
  -s, --schema-onlydump only the schema, no data
  -t, --table=TABLEdump the named table only
...



Just after I sent the e-mail, I realized that I forgot that I saw that 
option - but on my first reading of the description (and being an 
annoying user who didn't try it first) it didn't appear that it would 
allow me to specify more than one table.


But I tested it, and it does let me specify the -t command more than 
once.  So that gets me past that issue.


Dan


Now I'm just filling the mailing list with mis-information.  It actually 
ignores all but the last -t flag - so this only allows me to specify one 
table at a time, rather than several tables.


I need to write up my use case so the maintainers can see why I want to 
be able to dump things in such a specific way - it's not for backup 
purposes - so I'm kind of misusing the intent of the tool.  More info in 
a bit.


Dan



--

Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] fine tuned database dump/reload?

2005-10-11 Thread Dan Armbrust

Bricklen Anderson wrote:



pg_dump --help
...
  -n, --schema=SCHEMA  dump the named schema only
  -s, --schema-onlydump only the schema, no data
  -t, --table=TABLEdump the named table only
...



Just after I sent the e-mail, I realized that I forgot that I saw that 
option - but on my first reading of the description (and being an 
annoying user who didn't try it first) it didn't appear that it would 
allow me to specify more than one table.


But I tested it, and it does let me specify the -t command more than 
once.  So that gets me past that issue.


Dan


--

Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[GENERAL] fine tuned database dump/reload?

2005-10-11 Thread Dan Armbrust
Does postgresql have any facility to dump anything more fine grained 
than a database to a text file?


For example, to mention a bad word, MySQL's dump command allows you to 
specify individual tables to dump - and not only that - you can also 
specify a SQL query that must be satisfied for a row to make it into the 
dump file - this is extremely useful for the way that we use our database.


Also - how does postgresql handle foreign keys if you dump an entire 
database along with the create table statements?  Because my foreign 
keys require that the tables be created in a certain order - MySQL fails 
on this front - but since it allows me to specify the tables to dump, 
and it dumps them in that order, I can specify the correct order.


PostgreSQL's pg_dump command seems rather limited in its abilities. 
Maybe I'm missing the command I'm looking for.


Thanks,

Dan

--

Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Oracle buys Innobase

2005-10-10 Thread Dan Armbrust

Greg Sabino Mullane wrote:


The other answer may be the license: plugging PG into the MySQL system
(which is about as technically feasible trying to breed a porpoise
and an elephant) keeps MySQL GPL, which is another reason many people
like it.



The fact that PostgreSQL is NOT released under GPL is the reason that 
people like me are here - MySQL's license drove us away from them. 
Their change of the driver licensing prevents us from shipping new 
drivers with our applications.


GPL is a poison pill when it comes to groups like us that are trying to 
develop standards (and shared code bases) that can be used by both 
opensource and corporate types alike.


So keep up the good work!

Dan


--

Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] PostgreSQL 8.1 vs. MySQL 5.0?

2005-10-07 Thread Dan Armbrust



IBM have previously bought Informix (who bought Illustra, RedBrick,
Cloudscape).  None of those take-overs has led to a product actually
surviving.



That's not exactly true - Cloudscape was just given to Apache, and is now 
open-sourced under the name "Derby":

http://db.apache.org/derby/

Suddenly, Hypersonic SQL http://www.hsqldb.org/ (which also works 
wonderfully for small databases - nobody would claim that these can 
scale like PostgreSQL) has a bunch of competition.

Dan


--

Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[GENERAL] [Off Topic] Swing vs SWT

2005-10-07 Thread Dan Armbrust




Michael Schmidt wrote:
> Mr. Armbrust,
> Hopefully, this isn't off topic.  Just wondering why SWT instead
> of swing?
>
> Michael Schmidt

I'm pretty sure this is completely off topic, but I'll indulge it -


I've done development using Swing and SWT - and I find SWT's APIs to be
much cleaner, intuitive, and easy to use.  Manual GUI development is
much quicker with SWT  and results in much less code that is far more
readable.  Also, SWT layout managers are _much_ easier to use than the
crud that comes with Swing from Sun.

Side note - if you do have to do Swing development, I highly recommend
using the JGoodies forms package - it is a layout manager for Swing that
is far and away better than trying to do things with Swing layout
managers -  http://www.jgoodies.com/freeware/forms/index.html
As an example, I rewrote some forms that were 800 lines of code using
GridBagLayout into about 150 lines of code using JGoodies.  And the
JGoodies code is far more readable.

Back to SWT - Swing still just looks like crud when compared
side-by-side with native Windows applications.  Especially in windows
XP - Sun still doesn't have the Windows XP Look and Feel implemented
properly.  And if a user applies a custom XP skin, then it really looks
strange.  The file choosers don't look or work like native Windows XP
file choosers.   And when windows "super duper 5.0" comes out next
year, I'm sure it will only take sun another 3 years or so to write the
new look and Feel.  Meanwhile, your apps just look weird.  SWT Apps
_always_ look exactly like the OS look and feel - because it uses the
native OS widgets.    You can also embed Swing parts into a SWT
application if you need to.

When I write a SWT application, most of my users don't even know it is
a java application.  It just looks like a normal "windows"
application.  You can also launch SWT apps using Java WebStart, or, if
you don't want to use webstart, you combine your app, SWT, and
http://jsmooth.sourceforge.net/ and suddenly you have a native windows
executable for your users.  All of Java's ugly bits are safely hidden
away from your end user.

If you want to compare a Swing application (written using JGoodies
looks and JGoodies look and feel - which does a better job than Suns
look and feel at looking like Windows XP) side by side with a similar
SWT application - take a look at these two apps:

(scroll down to the webstart links)
(ps - these apps aren't supposed to be masterpieces of user interface
design - they are just utilities for techie types using LexGrid)
http://informatics.mayo.edu/LexGrid/index.php?page=convert
http://informatics.mayo.edu/LexGrid/index.php?page=indexer

Especially pay attention to the speed and appearance of the file
choosers (click the "Browse..". button) - to enable the second file
chooser in the Indexer Creator app - go to the "Options" menu and
choose "Build Normalized Index"

Another awesome gui that is written in SWT (and most end users don't
even know its written in Java)
http://azureus.sourceforge.net/

And a couple more - to finish with a little bit of relevance to
PostgreSQL:
A database designer plugin for Eclipse (which is all written in SWT) 
- supports PostgreSQL:
http://www.azzurri.jp/en/software/clay/index.jsp

Another database plugin for Eclipse that lets you work with a
PostgreSQL database.
http://quantum.sourceforge.net/screen_shots.html 

Dan


Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/





Re: [GENERAL] What Front-End you suggest ?

2005-10-07 Thread Dan Armbrust




Gerson - Yahoo wrote:
> Dear friends,
>
> I need to develop one system in native Linux - language and database.
> Today I use PostgreSQL with Delphi 2005 on Windows 2005 servers, and I
> need to know what language to use to do this.  What do you suggest?
> Which Linux OS does this language work well on?  Does this language
> have a development IDE (like Delphi 2005)?  Where may I download this
> compiler to test?
>
> Thanks for help.
>
> Gerson

Java...   And Eclipse for your IDE.
And using SWT rather than Swing for your GUI.

Dan

-- 

Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/





Re: [GENERAL] 5 new entries for FAQ

2005-08-10 Thread Dan Armbrust




Bruce Momjian wrote:

  
I have updated the FAQ to handle three of the items you mentioned.  In
one case, I added a new FAQ entry (double-quoting identifiers with a
link to our docs), and in two other cases (indexing long columns,
case-insensitive columns) I added to existing FAQ items where
appropriate.  Patch attached and applied.

For the batch job and single-CPU issues, they did not fit into existing
FAQ entries, and I am not sure they are asked enough to be added as
FAQs.  I am interested to hear other's opinions on this.
  


  ***
*** 613,618 
--- 617,628 
  The maximum table size and maximum number of columns can be
  quadrupled by increasing the default block size to 32k.
  
+ One limitation is that indexes can not be created on columns
+ longer than about 2,000 characters. Fortunately, such indexes are
+ rarely needed. Uniqueness is best guaranteed using another column
+ that is an MD5 hash of the long column, and full text indexing
+ allows for searching of words within the column.
+ 
  4.5) How much database disk space is required
  to store data from a typical text file?


  
  

I liked the FAQ entry written up by the original submitter much better,
WRT the long index issue.  This write-up does not contain the actual
error message encountered, so it's not going to be found by someone
looking for a solution to the problem.  It doesn't contain the optional
workaround, and it doesn't have a link into the full text searching
features as the recommended fix.  I was one of the more recent people
to encounter this problem, and I don't think that this FAQ would have
helped me (if I even saw it, because google certainly won't pick it up
in a way that most people will search when they encounter the issue -
they are going to put in the error message).
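
For anyone who lands here from a search, the hash-based workaround the
entry alludes to looks roughly like this (illustrative table and column
names; a functional index avoids the extra hash column):

-- enforce uniqueness of a very long text column via its MD5 digest
CREATE UNIQUE INDEX big_doc_body_md5_key ON big_doc (md5(body));
-- equality lookups can use the same expression:
SELECT * FROM big_doc WHERE md5(body) = md5('the full long value');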

Dan
-- 

Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/





Re: [GENERAL] psqsl -> remote db

2005-08-04 Thread Dan Armbrust

[EMAIL PROTECTED] wrote:


First, makes sure the PostgreSQL database is listening for TCP/IP
connections on the computer you're trying to connect to. On Mac OS X, run
this command in the a terminal, i.e. Terminal.app:

netstat -l

You should see a line like this somewhere in the output:

tcp6 0 0 *.5432 *.* LISTEN

Second, the database server's pg_hba.conf likely needs to be configured to
accept connections from remote users. This can be done to restrict
connections from specific hosts (IP addresses), which isn't terribly
practical, or by user and encrypted password. Search the PostgreSQL
document site for "pg_hba.conf" for info.

Third, if the database server is listening, and it is configured to allow
you to connect, try connecting with psql:

psql -h myhost.com -U me mydb

"myhost.com" can be an IP address. Type "man psql" to see what the options
mean.

I hope this helps.

Jim

 


I'm sorry, I'm sure I should be asking this someplace more general, but
apparently the word 'port' is used in so many different contexts that a
google search turns up fruitless ;)

I'm just trying to access a remote postgresql database on port 5432,
which does not appear to be open on the remote computer.  I'm on a
MacOSX and  cannot figure out for the life of me how to open up this
port on the computer.

This is probably a unix question if anything, but any help would be
apprecaited.

-Andrew

   


I'm not a Mac user, but don't they have a firewall by default?

Maybe useful?

http://cfm.gs.washington.edu/security/firewall/osx-10.1-ipfw/



--

Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] [BUGS] BUG #1552 followup

2005-08-02 Thread Dan Armbrust




Jaime Casanova wrote:

  On 8/2/05, Dan Armbrust <[EMAIL PROTECTED]> wrote:
  
  
I've been trying to track down some performance problems that I am
having doing large inserts on tables with foreign keys.

I'm 99% sure that the issue I am having is BUG 1552 -
http://archives.postgresql.org/pgsql-bugs/2005-03/msg00183.php

When I read the whole thread, I got the impression that a fix (or maybe
a hack) had been put into the 8.0 branch.  I'm running 8.0.3, and I'm
seeing the issue.  I can go into more details if desired...

So my questions are:
1) Would that fix have made it into the 8.0.3 release - alternatively,
how can I figure that out myself?
2) Is there a bug tracker somewhere, or do you just track bugs on the
mailing list?  I haven't been able to find anything on your site.

Thanks,

Dan


  
  
What part of the discussion gives you that impression... I understood
that Tom will (in the future) see a fix when Neil finishes the framework
to replan prepared statements (or is it rebuilding cached plans where the
table size has grown significantly?)

PD: This is not the list for this discussion... I am forwarding you to
another (randomly chosen) list.

  

This posting seems to indicate that there was a change made:
http://archives.postgresql.org/pgsql-bugs/2005-03/msg00183.php


What is the proper way to discuss bugs?  The only thing I have found on
the website is the link for reporting new bugs, (which just sends an
e-mail to the bugs mailing list?)

Since I didn't see any bug tracker software, I figured (wrongly) that
you must just discuss bugs on the bugs list...

Dan

-- 

Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/





Force PostgreSQL to use indexes on foreign key lookups - Was: [GENERAL] Slow Inserts on 1 table?

2005-08-02 Thread Dan Armbrust

An aha moment:

http://archives.postgresql.org/pgsql-bugs/2005-03/msg00183.php

Some of the ensuing conversation seemed to indicate that a change was 
made in the 8.0 branch in March, that was intended to fix this issue.  
Any idea if that fix would have made it into the 8.0.3 release? 


Or maybe the fix didn't fix the issue.

Dan


--

Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] Slow Inserts on 1 table?

2005-08-02 Thread Dan Armbrust




Alvaro Herrera wrote:

  On Tue, Aug 02, 2005 at 10:01:50AM -0500, Dan Armbrust wrote:

  
  
I shouldn't have to manually run Analyze to make the DB be capable of 
handling inserts involving tables with foreign keys correctly.  My code 
that is doing the inserts is a java application that works across 
multiple DBS - MySQL, PostgreSQL, DB2, MS Access, Sybase, etc. 

  
  
[etc, rant removed]

You don't _have_ to be rude.  This is a known limitation, and people
have suggested the usual workarounds.  This is an open source project --
if you think you can make it better, please by all means post a patch.

HAND.

  

My apologies, I don't intend to be rude.  

But it is rather easy to get into rant mode when the prevailing opinion
is that needing to run Analyze after X rows just to keep inserting into a
table with a foreign key is a misuse of the DB, rather than a bug.

I did not know that this is a known limitation, I have not been able to
find any documentation that talks about how foreign keys, indexes, and
the query planner relate.

My first assumption was that since foreign key creation implicitly
creates the necessary indexes, that these indexes would always be used
for foreign key checks.  I wouldn't have even guessed that the query
planner was involved in this portion.  But, these are all (apparently
wrong) guesses - I don't know the internals.  The performance probably
is better on small tables to not use these indexes.  But it seems to
me, that if you know that the statistics are out of date (which I would
think that you should know, if analyze hasn't been run since the tables
were created) that the safer choice would be to use the indexes, rather
than not using the indexes.


Dan

-- 

Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/





Re: [GENERAL] Slow Inserts on 1 table?

2005-08-02 Thread Dan Armbrust


My loading is done programmatically, from another format, so COPY is 
not an option. 



Why not? A lot of my bulk-loads are generated from other systems and I 
go through a temporary-file/pipe via COPY when I can. When I don't I 
block inserts into groups of e.g. 1000 and stick in an analyse/etc as 
required.



I guess I should clarify - my inserts are done by a Java application 
running on a client machine. This isn't bulk load in the normal definition.


I don't have any problem with the speed of the inserts when they are 
working correctly.  The only problem is that the query analyzer is 
making a really poor decision when it is executing insert statements on 
tables that have foreign keys.





So run ANALYSE in parallel with your load, or break the bulk-load into 
blocks and analyse in-line. I'm not sure ripping out PG's cost-based 
query analyser will be a popular solution just to address bulk-loads.



I never suggested that it needed to be ripped out.  It just seems that 
when it is looking to check foreign keys, and the statistics are not up 
to date (or have not yet been created) it should default to using the 
indexes, rather than not using the indexes.  The time savings of using 
indexes when things are big is FAR bigger than the time savings of not 
using indexes when things are small.



Dan



--

Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] Slow Inserts on 1 table?

2005-08-02 Thread Dan Armbrust

Joshua D. Drake wrote:

Why can't postgres compile some rough statistics on tables without 
running analyze?



Why can't you just run analyze? You don't have to empty the tables to 
do so and you can alter the statistics on the fly. Heck you can even 
run analyze while doing the inserts.


I shouldn't have to manually run Analyze to make the DB be capable of 
handling inserts involving tables with foreign keys correctly.  My code 
that is doing the inserts is a java application that works across 
multiple DBS - MySQL, PostgreSQL, DB2, MS Access, Sybase, etc. 

I shouldn't have to put custom code into it just to make postgres deal 
with inserts properly.  No other database that I insert data into has 
problems like this.


This will look really nice in the instructions for my data loader -

* - If you are using PostgreSQL for your database server, it has a bug 
that causes its performance to become abysmal unless you manually run 
this "Analyze" command a little while after you start the load process.


I will have users that don't even know what a database is, much less 
have to go out and run manual sysadmin level commands on it to make the 
thing work.




I have already had to disable sequential scans, since the planner is 
almost _always_ wrong in deciding whether or not to use an index.



Sounds again like you are not setting up your statistics correctly or 
running analyze as and when it should.



Doesn't matter if the statistics are perfectly up to date.  It still 
doesn't use the indexes.  If the default statistics are so poorly set up 
that the planner thinks a 5 million row table scan will be quicker for a 
query that is doing exact matches on indexed columns, I would say it is 
poorly implemented.  So I just disabled that "feature".  And it works 
fine with sequential scans disabled - I have no problem with it in this 
respect, since I can turn it off.


Is there any way that I can disable sequential scans for foreign key checks?

Dan


--

Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [GENERAL] Slow Inserts on 1 table?

2005-08-02 Thread Dan Armbrust




What, ALWAYS faster, even for the first FK check when there's only one 
row in the target table and that's cached?


If you're really in a hurry doing your bulk loads:
 1. Use COPY.
 2. Drop/restore the foreign-key constraints before/after.
That will be hugely faster than INSERTs, although it's not always an 
applicable solution.

--
  Richard Huxton
  Archonet Ltd

It seems like the query planner goes to great lengths to avoid using 
indexes because it might take 5 ms longer to execute an index lookup on 
a table with one row. 

But then, when the table has 1 million rows, and a full scan takes 3 
minutes, and the index scan takes 3 seconds, it has no problem picking 
the 3 minute route.
I'll gladly give up the 5 ms in turn for not having to wait 3 minutes, 
which is why I disabled the sequential scans.  If I have a small table, 
where indexes won't speed things up, I won't build an index on it.


The other factor, is that most of my tables have at least thousands, and 
usually millions of rows.  Sequential scans will never be faster for the 
queries that I am doing - like I said, that is why I created the indexes.


My loading is done programmatically, from another format, so COPY is not 
an option.  Neither is removing foreign keys, as they are required to 
guarantee valid data.  I don't really have a problem with the insert 
speed when it is working properly - it is on par with other DBs that I 
have on the same hardware.  The problem is when it stops using the 
indexes, for no good reason.


Example, last night, I kicked off a load process - this morning, it had 
only managed to make it through about 600,000 rows (split across several 
tables).  After restarting it this morning, it made it through the same 
data in 30 minutes. 


If that's not bad and buggy behavior, I don't know what is.

Dan

--

Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] Slow Inserts on 1 table?

2005-08-02 Thread Dan Armbrust

Dan Armbrust wrote:

I have one particular insert query that is running orders of magnitude 
slower than other insert queries, and I cannot understand why.
For example, Inserts into "conceptProperty" (detailed below) are at 
least 5 times faster than inserts into "conceptPropertyMultiAttributes".


When I am running the inserts, postmaster shows as pegging one CPU on 
the Fedora Core 3 server it is running on at nearly 100%.


Any advice is appreciated.  Here is a lot of info that may shed light 
on the issue to someone with more experience than me:


Example Insert Query with data:
INSERT INTO conceptPropertyMultiAttributes (codingSchemeName, 
conceptCode, propertyId, attributeName, attributeValue) VALUES ('NCI 
MetaThesaurus', 'C005', 'T-2', 'Source', 'MSH2005_2004_10_12')


EXPLAIN ANALYZE output:
QUERY PLAN
Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.005..0.008 
rows=1 loops=1)

Total runtime: 4.032 ms

Table Structure:
CREATE TABLE conceptpropertymultiattributes (
   codingschemename character varying(70) NOT NULL,
   conceptcode character varying(100) NOT NULL,
   propertyid character varying(50) NOT NULL,
   attributename character varying(50) NOT NULL,
   attributevalue character varying(250) NOT NULL
);

Primary Key:
ALTER TABLE ONLY conceptpropertymultiattributes
   ADD CONSTRAINT conceptpropertymultiattributes_pkey PRIMARY KEY 
(codingschemename, conceptcode, propertyid, attributename, 
attributevalue);


Foreign Key:
ALTER TABLE ONLY conceptpropertymultiattributes
   ADD CONSTRAINT f FOREIGN KEY (codingschemename, conceptcode, 
propertyid) REFERENCES conceptproperty(codingschemename, conceptcode, 
propertyid);



Structure of Table Referenced by Foreign Key:
CREATE TABLE conceptproperty (
   codingschemename character varying(70) NOT NULL,
   conceptcode character varying(100) NOT NULL,
   propertyid character varying(50) NOT NULL,
   property character varying(250) NOT NULL,
   "language" character varying(32),
   presentationformat character varying(50),
   datatype character varying(50),
   ispreferred boolean,
   degreeoffidelity character varying(50),
   matchifnocontext boolean,
   representationalform character varying(50),
   propertyvalue text NOT NULL
);

Primary Key:
ALTER TABLE ONLY conceptproperty
   ADD CONSTRAINT conceptproperty_pkey PRIMARY KEY (codingschemename, 
conceptcode, propertyid);


Thanks,

Dan

Well, I now have a further hunch on why the inserts are so slow on 1 
table.  Most of the time, when I am doing bulk inserts, I am starting 
with an empty database.  My insertion program creates the tables, 
indexes and foreign keys. 

The problem seems to be the foreign key - PostgreSQL is apparently being 
too stupid to use the indexes while loading and checking the foreign key 
between two large tables - my guess is because analyze has not been run 
yet, so it thinks all of the tables are size 0.  If I let it run for a 
while, then kill the load process, run Analyze, empty the tables, and 
then restart, things perform fine.  But that is kind of a ridiculous 
sequence to have to use to load a database. 
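
For reference, that sequence boils down to something like this, run from a
second session part-way into the load (values taken from the example insert
above; just a sketch):

-- refresh statistics on the referenced table; in my case a fresh connection
-- was also needed before the foreign key checks picked up a better plan
ANALYZE conceptproperty;

-- and check what plan an equivalent foreign key lookup would get
EXPLAIN SELECT 1 FROM conceptproperty
 WHERE codingschemename = 'NCI MetaThesaurus'
   AND conceptcode = 'C005'
   AND propertyid = 'T-2';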

Why can't postgres compile some rough statistics on tables without 
running analyze?  Seems that it would be pretty easy to keep track of 
the number of inserts/deletions that have taken place since the last 
Analyze execution...  It may not be the exact right number, but it would 
certainly be smarter than continuing to assume that the tables are size 
0, even though it has been doing constant inserts on the tables in 
question.


I have already had to disable sequential scans, since the planner is 
almost _always_ wrong in deciding whether or not to use an index.  I put 
the indexes on the columns I choose for a reason - it is because I KNOW 
the index read will ALWAYS be faster since I designed the indexes for 
the queries I am running.  But it still must be doing a sequential scan 
on these inserts...




--

Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] index row size exceeds btree maximum, 2713 - Solutions?

2005-07-20 Thread Dan Armbrust

Dan Armbrust wrote:

I'm trying to load some data into PostgreSQL 8.0.3, and I got the 
error message "index row size 2904 exceeds btree maximum, 2713".  
After a bunch of searching, I believe that I am getting this error 
because a value that I am indexing is longer than ~ 1/3 of the block 
size - or the BLCKSZ variable in the src/include/pg_config_manual.h file.


Am I correct so far?

I need to fix this problem.  I cannot change the indexed columns.  I 
cannot shorten the data value.  And I cannot MD5 it, or any of those 
hashing types of solutions that I saw a lot while searching.


Is there a variable I can set somewhere, so that postgresql would just 
truncate the value to the max length that the index can handle when it 
goes to enter it into the index, instead of failing with an error?  I 
would be fine with not having this particular row fully indexed, so 
long as I could still retrieve the full data value.


The other solution that I saw was to modify the BLCKSZ variable.  From 
what I saw, it appears that to change that variable, I would need to 
dump my databases out,  recompile everything, and then reload them 
from scratch.  Is this correct?


Currently the BLCKSZ variable is set to 8192.  What are the 
performance/disk usage/other? implications of doubling this value, to 
16384?


Any other suggestions in dealing with this problem?

Thanks,

Dan



Thanks for all the information and ideas WRT this issue.

I ended up just having to remove the index from this particular column 
that was having the issue - in my particular case, I didn't lose 
anything by doing this, because the index wasn't being used for 
its intended purpose anyway, due to case sensitivity issues.


Could I suggest adding this error, its causes, and possible solutions 
from this thread 
(http://archives.postgresql.org/pgsql-general/2005-07/msg00731.php) to 
the FAQ?  It took me a long time to connect all the dots through a lot 
of different e-mail threads.


Also, maybe the max index size should be documented in the manual as well?

Dan

--

Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[GENERAL] Slow Inserts on 1 table?

2005-07-20 Thread Dan Armbrust
I have one particular insert query that is running orders of magnitude 
slower than other insert queries, and I cannot understand why.
For example, Inserts into "conceptProperty" (detailed below) are at 
least 5 times faster than inserts into "conceptPropertyMultiAttributes".


When I am running the inserts, postmaster shows as pegging one CPU on 
the Fedora Core 3 server it is running on at nearly 100%.


Any advice is appreciated.  Here is a lot of info that may shed light on 
the issue to someone with more experience than me:


Example Insert Query with data:
INSERT INTO conceptPropertyMultiAttributes (codingSchemeName, 
conceptCode, propertyId, attributeName, attributeValue) VALUES ('NCI 
MetaThesaurus', 'C005', 'T-2', 'Source', 'MSH2005_2004_10_12')


EXPLAIN ANALYZE output:
QUERY PLAN
Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.005..0.008 
rows=1 loops=1)

Total runtime: 4.032 ms

Table Structure:
CREATE TABLE conceptpropertymultiattributes (
   codingschemename character varying(70) NOT NULL,
   conceptcode character varying(100) NOT NULL,
   propertyid character varying(50) NOT NULL,
   attributename character varying(50) NOT NULL,
   attributevalue character varying(250) NOT NULL
);

Primary Key:
ALTER TABLE ONLY conceptpropertymultiattributes
   ADD CONSTRAINT conceptpropertymultiattributes_pkey PRIMARY KEY 
(codingschemename, conceptcode, propertyid, attributename, attributevalue);


Foreign Key:
ALTER TABLE ONLY conceptpropertymultiattributes
   ADD CONSTRAINT f FOREIGN KEY (codingschemename, conceptcode, 
propertyid) REFERENCES conceptproperty(codingschemename, conceptcode, 
propertyid);



Structure of Table Referenced by Foreign Key:
CREATE TABLE conceptproperty (
   codingschemename character varying(70) NOT NULL,
   conceptcode character varying(100) NOT NULL,
   propertyid character varying(50) NOT NULL,
   property character varying(250) NOT NULL,
   "language" character varying(32),
   presentationformat character varying(50),
   datatype character varying(50),
   ispreferred boolean,
   degreeoffidelity character varying(50),
   matchifnocontext boolean,
   representationalform character varying(50),
   propertyvalue text NOT NULL
);

Primary Key:
ALTER TABLE ONLY conceptproperty
   ADD CONSTRAINT conceptproperty_pkey PRIMARY KEY (codingschemename, 
conceptcode, propertyid);


Thanks,

Dan

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


[GENERAL] index row size exceeds btree maximum, 2713 - Solutions?

2005-07-18 Thread Dan Armbrust
I'm trying to load some data into PostgreSQL 8.0.3, and I got the error 
message "index row size 2904 exceeds btree maximum, 2713".  After a 
bunch of searching, I believe that I am getting this error because a 
value that I am indexing is longer than ~ 1/3 of the block size - or the 
BLCKSZ variable in the src/include/pg_config_manual.h file.


Am I correct so far?

I need to fix this problem.  I cannot change the indexed columns.  I 
cannot shorten the data value.  And I cannot MD5 it, or any of those 
hashing types of solutions that I saw a lot while searching.


Is there a variable I can set somewhere, so that postgresql would just 
truncate the value to the max length that the index can handle when it 
goes to enter it into the index, instead of failing with an error?  I 
would be fine with not having this particular row fully indexed, so long 
as I could still retrieve the full data value.
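
Short of that, the nearest thing to truncate-in-the-index that I can see
would be an expression index on a prefix of the value - a sketch (the names
and the prefix length are illustrative, and queries would have to repeat the
same expression for the index to be usable, so it only helps where that is
acceptable):

-- index only the first 500 characters, keeping the entry comfortably
-- under the ~2700 byte index row limit
CREATE INDEX conceptproperty_value_prefix_idx
    ON conceptproperty (substr(propertyvalue, 1, 500));

-- exact-match lookups must use the same expression
SELECT * FROM conceptproperty
 WHERE substr(propertyvalue, 1, 500) = substr('the search value', 1, 500);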


The other solution that I saw was to modify the BLCKSZ variable.  From 
what I saw, it appears that to change that variable, I would need to 
dump my databases out,  recompile everything, and then reload them from 
scratch.  Is this correct?


Currently the BLCKSZ variable is set to 8192.  What are the 
performance/disk usage/other? implications of doubling this value, to 16384?


Any other suggestions in dealing with this problem?

Thanks,

Dan

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] index row size exceeds btree maximum, 2713 - Solutions?

2005-07-18 Thread Dan Armbrust




Hmm, well, I don't know if it is actually building an index properly on
this column, I just assumed that it was.  It doesn't fail on every
insert, only on the one that has a really long text value.  I know it
doesn't use the index when I do "ILIKE" queries, resulting in poor
performance...  but I assumed that was because I was trying to do a
case insensitive search on a case sensitive column index.  I didn't want
to go down the road of writing even more database implementation
specific code.  I will usually be using Lucene for the full text
searches anyway.  

Where is the documentation on tsearch2?  I haven't seen it mentioned
anywhere except a couple of mailing list postings.

All of my other limitations on changing things aside - given a query
like this:

Select * from conceptproperty where codingSchemeName='foo' AND
property='anotherfoo' and propertyValue ILIKE 'valu%'

What index(es) would be recommended?

My current 3 column index (that works on other DB's) doesn't perform
well due to case sensitivity issues, and now fails, due to data length
issues.
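
(A sketch of one possibility, again for illustration only: index
lower(propertyvalue) and rewrite the ILIKE as a LIKE on the lowered value.
For a left-anchored pattern like 'valu%' to use a btree in a non-C locale,
the text_pattern_ops operator class is typically needed.  The index name is
hypothetical, and very long propertyvalue entries would still hit the btree
row size limit discussed in this thread.)

CREATE INDEX i1_lower ON conceptproperty
    (codingschemename, property, lower(propertyvalue) text_pattern_ops);

SELECT * FROM conceptproperty
 WHERE codingschemename = 'foo'
   AND property = 'anotherfoo'
   AND lower(propertyvalue) LIKE 'valu%';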

Dan




Jaime Casanova wrote:
>> The index that is failing is CREATE INDEX i1 ON conceptproperty USING btree
>> (codingschemename, property, propertyvalue).
>
> I don't think you could create indexes on text fields... there are
> other type of indexes for that... tsearch2 for example





Re: [GENERAL] index row size exceeds btree maximum, 2713 - Solutions?

2005-07-18 Thread Dan Armbrust




We have built a Model for terminologies that we call The Lexical Grid
(more info http://informatics.mayo.edu/LexGrid/index.php)

LexGrid has multiple backend data storage mechanisms, including LDAP
and SQL.  We do our best to remain implementation independent - our SQL
implementations, for example can run against MS Access, DB2, MySQL and
PostgreSQL.

I'm currently trying to load a new terminology into a PostgreSQL
backend, and arrived at this error because it happens to have a couple
of very large data values that get mapped into the 'propertyvalue'
field.

The structure of the table that I am (currently) having problems with
is:

   codingschemename        character varying(70)    NOT NULL
   conceptcode             character varying(100)   NOT NULL
   propertyid              character varying(50)    NOT NULL
   property                character varying(250)   NOT NULL
   language                character varying(32)
   presentationformat      character varying(50)
   datatype                character varying(50)
   ispreferred             boolean
   degreeoffidelity        character varying(50)
   matchifnocontext        boolean
   representationalform    character varying(50)
   propertyvalue           text                     NOT NULL


The structure of the table is not easily changed, as it closely follows
our model.

The index that is failing is CREATE INDEX i1 ON conceptproperty USING
btree (codingschemename, property, propertyvalue).

Usually, the 'propertyValue' field is fairly short - 100 chars or
less.  And in those cases, I need to be able to do an indexed search on
it.  In this particular case, this terminology has a propertyValue that
is very long.  I can't just toss it.  I don't really care if it makes
it into the index - for a terminology with as much data as this has,
I'll be using Lucene to do text searches anyway - but I do need to be
able to return the full propertyValue in response to a propertyId query.

I also don't want to negatively affect the performance of the rest of
the DB just to accommodate this instance - hence the questions about
the implications of changing the BLCKSZ variable.
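
(One more sketch, for illustration only: a partial index that simply skips
the oversized rows, so the short values stay searchable and the long ones are
still stored and returned normally - they just aren't in the index.  The
2000-byte cutoff is an arbitrary assumption chosen to stay under the ~2713
byte limit from the error message, and the planner will only consider this
index for queries that repeat the WHERE condition.)

CREATE INDEX i1_short ON conceptproperty
    (codingschemename, property, propertyvalue)
    WHERE octet_length(propertyvalue) < 2000;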

Dan



Bruno Wolff III wrote:
> On Mon, Jul 18, 2005 at 14:44:26 -0500,
>   Dan Armbrust <[EMAIL PROTECTED]> wrote:
>> I'm trying to load some data into PostgreSQL 8.0.3, and I got the error
>> message "index row size 2904 exceeds btree maximum, 2713".  After a
>> bunch of searching, I believe that I am getting this error because a
>> value that I am indexing is longer than ~ 1/3 of the block size - or the
>> BLCKSZ variable in the src/include/pg_config_manual.h file.
>>
>> Am I correct so far?
>>
>> I need to fix this problem.  I cannot change the indexed columns.  I
>> cannot shorten the data value.  And I cannot MD5 it, or any of those
>> hashing types of solutions that I saw a lot while searching.
>
> Can you explain how you are using the index now and what you are trying
> to accomplish? It is hard to suggest alternatives without knowing what
> you are really trying to do.





Re: [GENERAL] index row size exceeds btree maximum, 2713 - Solutions?

2005-07-18 Thread Dan Armbrust




Nevermind this question...

> Where is the documentation on tsearch2?

Google first, ask second, I remind myself again...  I knew I hadn't
seen it mentioned in the official postgresql manual... didn't think
about it being an extension.

Dan

Dan Armbrust wrote:
> Hmm, well, I don't know if it is actually building an index properly on
> this column, I just assumed that it was.  It doesn't fail on every
> insert, only on the one that has a really long text value.  I know it
> doesn't use the index when I do "ILIKE" queries, resulting in poor
> performance...  but I assumed that was because I was trying to do a
> case insensitive search on a case sensitive column index.  I didn't want
> to go down the road of writing even more database implementation
> specific code.  I will usually be using Lucene for the full text
> searches anyway.
>
> Where is the documentation on tsearch2?  I haven't seen it mentioned
> anywhere except a couple of mailing list postings.
>
> All of my other limitations on changing things aside - given a query
> like this:
>
> Select * from conceptproperty where codingSchemeName='foo' AND
> property='anotherfoo' and propertyValue ILIKE 'valu%'
>
> What index(es) would be recommended?
>
> My current 3 column index (that works on other DB's) doesn't perform
> well due to case sensitivity issues, and now fails, due to data length
> issues.
>
> Dan
>
> Jaime Casanova wrote:
>>> The index that is failing is CREATE INDEX i1 ON conceptproperty USING btree
>>> (codingschemename, property, propertyvalue).
>>
>> I don't think you could create indexes on text fields... there are
>> other type of indexes for that... tsearch2 for example