Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-29 Thread Lew

On 02/27/2012 07:14 AM, Shaun Thomas wrote:

On 02/27/2012 08:59 AM, Reuven M. Lerner wrote:


From what I understand, the issue isn't one of current disk space,
but rather of how quickly the disk space is being used up.


Noted. Just keep in mind that dead rows are not free. In the case of sequential
scans, the rows still have to be read from disk and then ignored by the
engine. Vacuums also act as sequential scans, so the more data they're reading,
the longer they take. This is especially true on an overloaded system.


I wouldn't be surprised if we end up doing a CLUSTER at some point.
The problem is basically that this machine is in 24/7 operation at
high-speed manufacturing plants, and the best-case scenario is for a
4-hour maintenance window.


The best case scenario is for them to buy a second server. If operation of
this app stack really is critical to business, they need to spend the money to
keep it working, or they'll end up paying much more for it when it fails. You
also said that server has other stuff running on it, and it already has very
little memory. That tells me they have no DR node. I'm afraid to even ask how
they're doing backups. That one machine is a giant, red, flashing single point
of failure. I really hope they understand that.


I've suggested that we might be able to help the situation somewhat
by attaching a portable USB-based hard disk, and adding a new
tablespace that'll let us keep running while we divide up the work
that the disk is doing, but they've made it clear that the current
hardware configuration cannot and will not change. Period.


And that's it, then. You have yourself a bad client. If it were me, I'd get
through this contract and never do business with them again. They have a
system that's basically 100% guaranteed to fail some time in the future (and
yet is critical for operation!) and are putting Band-Aids on it. I think
there's a parable somewhere about eggs and baskets, but I can't recall it at
this moment. ;)


There is more than one parable here.

For the client - don't be a damn fool. When you go to a doctor for a broken 
arm, you don't refuse the splint and insist on using just aspirin to manage 
the problem.


For the consultant/employee - stop buying into the bullshit. This is a common 
situation, where you tell your client, "You need X," and they refuse the 
advice. You need to be crystal clear with them that they are therefore NOT 
solving their problem.


I stopped giving in to the client's bullshit in this regard years ago, when a 
customer tried to withhold over eight thousand dollars because I agreed to my 
manager's refusal to normalize a database and thus didn't fix a performance 
problem.  I got paid when their programmer, whom I'd secretly informed of the 
problem and how to fix it, took over as the project manager after using my 
advice to become the hero. The lesson I took is not to gloss over real 
problems because the client is recalcitrant. They don't win, you don't win, 
nobody wins. (Unless you use a workaround as I did, but politics is the court 
of last resort for an engineer.)


I'd rather have my bosses think I'm a little snarky (as long as I'm not fired 
for it), than have them hate me and try not to pay me. I am just loud about 
what is correct and what the consequences of incorrect are; then when they get 
those consequences I make sure to draw the connection.


I'm not there to make friends, I'm there to make solutions. It is fiduciary 
irresponsibility to let your clients go down in flames without at least 
informing them of the alternative.


--
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-29 Thread Gavin Flower

On 29/02/12 06:06, David Kerr wrote:

On 02/27/2012 12:08 AM, Reuven M. Lerner wrote:

Hi, everyone. I wanted to thank you again for your help on the huge
delete problem that I was experiencing.

After a lot of trial and error, we finally came to the conclusion that
deleting this much data in the time frame that they need, on
underpowered hardware that is shared with an application, with each test
iteration taking 5-9 hours to run (but needing to run in 2-3), is just
not going to happen. We tried many of the options that people helpfully
suggested here, but none of them gave us the performance that we needed.

(One of the developers kept asking me how it can possibly take so long
to delete 200 GB, when he can delete files of that size in much less
time. I had to explain to him that deleting rows from a database is a
far more complicated task, and can't really be compared to deleting a
few files.)

In the end, it was agreed that we could execute the deletes over time,
deleting items in the background, or in parallel with the application's
work. After all, if the disk is filling up at the rate of 2 GB/day, then
so long as we delete 4 GB/day (which is pretty easy to do), we should be
fine. Adding RAM or another disk are simply out of the question, which
is really a shame for a database of this size.



Howdy,

I'm coming a little late to the thread, but I didn't see anyone propose 
some tricks I've used in the past to overcome the slow delete problem.


First - if you can drop your FKs, delete, and re-create your FKs, you'll 
find that you can delete an amazing amount of data very quickly.


second - if you can't do that - you can try a function that loops and 
deletes a small amount at a time; this gets around the "deleting more 
data than you can fit into memory" problem. It's still slow, but just 
not as slow.


third - don't delete; instead:
create new_table as select * from old_table where the records are not the 
ones you want to delete;

drop old_table (or rename it out of the way first);
rename new_table to old_table;
create indexes and constraints

fourth - I think some folks mentioned this, but just for completeness, 
partition the table and make sure that your partition key is such that 
you can just drop an entire partition.


Hope that helps and wasn't redundant.

Dave


 Hi,

I think your first and third points are very obvious - but only after I 
had read them!  :-)


Your third point is not bad either!

Brilliant simplicity! I hope I can remember them if I run into a similar 
situation.



Thanks,
Gavin




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-28 Thread David Kerr

On 02/27/2012 12:08 AM, Reuven M. Lerner wrote:

Hi, everyone. I wanted to thank you again for your help on the huge
delete problem that I was experiencing.

After a lot of trial and error, we finally came to the conclusion that
deleting this much data in the time frame that they need, on
underpowered hardware that is shared with an application, with each test
iteration taking 5-9 hours to run (but needing to run in 2-3), is just
not going to happen. We tried many of the options that people helpfully
suggested here, but none of them gave us the performance that we needed.

(One of the developers kept asking me how it can possibly take so long
to delete 200 GB, when he can delete files of that size in much less
time. I had to explain to him that deleting rows from a database is a
far more complicated task, and can't really be compared to deleting a
few files.)

In the end, it was agreed that we could execute the deletes over time,
deleting items in the background, or in parallel with the application's
work. After all, if the disk is filling up at the rate of 2 GB/day, then
so long as we delete 4 GB/day (which is pretty easy to do), we should be
fine. Adding RAM or another disk are simply out of the question, which
is really a shame for a database of this size.



Howdy,

I'm coming a little late to the thread, but I didn't see anyone propose 
some tricks I've used in the past to overcome the slow delete problem.


First - if you can drop your FKs, delete, and re-create your FKs, you'll find 
that you can delete an amazing amount of data very quickly.


second - if you can't do that - you can try a function that loops and 
deletes a small amount at a time; this gets around the "deleting more 
data than you can fit into memory" problem. It's still slow, but just not 
as slow.


third - don't delete; instead:
create new_table as select * from old_table where the records are not the 
ones you want to delete;

drop old_table (or rename it out of the way first);
rename new_table to old_table;
create indexes and constraints
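
Spelled out as SQL, the swap looks roughly like this (table and column names
are hypothetical):

CREATE TABLE new_table AS
    SELECT * FROM old_table
     WHERE created_at >= now() - interval '30 days';   -- rows to keep

DROP TABLE old_table;                       -- or rename it aside first as a fallback
ALTER TABLE new_table RENAME TO old_table;

ALTER TABLE old_table ADD PRIMARY KEY (id);
CREATE INDEX old_table_created_at_idx ON old_table (created_at);

Wrapping the whole thing in a single transaction keeps readers from ever
seeing the table missing, at the cost of holding an exclusive lock for the
duration.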

fourth - I think some folks mentioned this, but just for completeness, 
partition the table and make sure that your partition key is such that 
you can just drop an entire partition.


Hope that helps and wasn't redundant.

Dave

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-27 Thread Reuven M. Lerner
Hi, everyone.  I wanted to thank you again for your help on the huge 
delete problem that I was experiencing.


After a lot of trial and error, we finally came to the conclusion that 
deleting this much data in the time frame that they need, on 
underpowered hardware that is shared with an application, with each test 
iteration taking 5-9 hours to run (but needing to run in 2-3), is just 
not going to happen.  We tried many of the options that people helpfully 
suggested here, but none of them gave us the performance that we needed.


(One of the developers kept asking me how it can possibly take so long 
to delete 200 GB, when he can delete files of that size in much less 
time.  I had to explain to him that deleting rows from a database is a 
far more complicated task, and can't really be compared to deleting a 
few files.)


In the end, it was agreed that we could execute the deletes over time, 
deleting items in the background, or in parallel with the application's 
work.  After all, if the disk is filling up at the rate of 2 GB/day, 
then so long as we delete 4 GB/day (which is pretty easy to do), we 
should be fine.  Adding RAM or another disk are simply out of the 
question, which is really a shame for a database of this size.


I should add that it was interesting/amusing to see the difference 
between the Unix and Windows philosophies.  Each time I would update my 
pl/pgsql functions, the Windows guys would wrap it into a string, inside 
of a .NET program, which then needed to be compiled, installed, and run. 
 (Adding enormous overhead to our already long testing procedure.)  I 
finally managed to show them that we could get equivalent functionality, 
with way less overhead, by just running psql -f FILENAME.  This version 
doesn't have fancy GUI output, but it works just fine...


I always tell people that PostgreSQL is not just a great database, but a 
fantastic, helpful community.  Thanks to everyone for their suggestions 
and advice.


Reuven

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-27 Thread lephongvu
I have a question.

Your data is growing by 1 GB per day.

Can we use another disk or partition to continue archiving data?
I mean, does PostgreSQL support a layering system for archived data?

--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Very-long-deletion-time-on-a-200-GB-database-tp5507359p5517941.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-27 Thread Shaun Thomas

On 02/27/2012 02:08 AM, Reuven M. Lerner wrote:


In the end, it was agreed that we could execute the deletes over
time, deleting items in the background, or in parallel with the
application's work. After all, if the disk is filling up at the rate
of 2 GB/day, then so long as we delete 4 GB/day (which is pretty easy
to do), we should be fine.


Please tell me you understand deleting rows from a PostgreSQL database 
doesn't work like this. :) The MVCC storage system means you'll 
basically just be marking all those deleted rows as reusable, so your 
database will stop growing, but you'll eventually want to purge all the 
accumulated dead rows.


One way to see how many there are is to use the pgstattuple contrib 
module. You can just call it on the table name in question:


SELECT * FROM pgstattuple('my_table');

You may find that after your deletes are done, you'll have a free_percent of 
80% or more. In order to get rid of all that, you'll need to either run CLUSTER 
on your table(s) or use the select-truncate-insert method anyway.


--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@peak6.com

__

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to 
this email

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-27 Thread Andrew Dunstan



On 02/27/2012 09:45 AM, Shaun Thomas wrote:

On 02/27/2012 02:08 AM, Reuven M. Lerner wrote:


In the end, it was agreed that we could execute the deletes over
time, deleting items in the background, or in parallel with the
application's work. After all, if the disk is filling up at the rate
of 2 GB/day, then so long as we delete 4 GB/day (which is pretty easy
to do), we should be fine.


Please tell me you understand deleting rows from a PostgreSQL database 
doesn't work like this. :) The MVCC storage system means you'll 
basically just be marking all those deleted rows as reusable, so your 
database will stop growing, but you'll eventually want to purge all 
the accumulated dead rows.


One way to see how many there are is to use the pgstattuple contrib 
module. You can just call it on the table name in question:


SELECT * FROM pgstattuple('my_table');

You may find that after your deletes are done, you'll have a free_percent 
of 80% or more. In order to get rid of all that, you'll need to either run 
CLUSTER on your table(s) or use the select-truncate-insert method 
anyway.




If he has autovacuum on he could well be just fine with his proposed 
strategy. Or he could have tables partitioned by time and do the delete 
by just dropping partitions. There are numerous ways he could get this to 
work.


cheers

andrew

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-27 Thread Reuven M. Lerner

Hi, Shaun.  You wrote:



In the end, it was agreed that we could execute the deletes over
time, deleting items in the background, or in parallel with the
application's work. After all, if the disk is filling up at the rate
of 2 GB/day, then so long as we delete 4 GB/day (which is pretty easy
to do), we should be fine.


Please tell me you understand deleting rows from a PostgreSQL database
doesn't work like this. :) The MVCC storage system means you'll
basically just be marking all those deleted rows as reusable, so your
database will stop growing, but you'll eventually want to purge all the
accumulated dead rows.


Oh, I understand that all right.  I've had many, *many* conversations 
with this company explaining MVCC.  It doesn't seem to work; when they 
refer to "vacuuming the database," I remind them that we have autovacuum 
working, to which they respond, "Oh, we mean VACUUM FULL."  At which 
point I remind them that VACUUM FULL is almost certainly not what they 
want to do, and then they say, "Yes, we know, but we still like to do it 
every so often."


From what I understand, the issue isn't one of current disk space, but 
rather of how quickly the disk space is being used up.  Maybe they want 
to reclaim disk space, but it's more crucial to stop the rate at which 
disk space is being taken.  If we were to delete all of the existing 
rows, and let vacuum mark them as dead and available for reuse, then 
that would probably be just fine.


I wouldn't be surprised if we end up doing a CLUSTER at some point.  The 
problem is basically that this machine is in 24/7 operation at 
high-speed manufacturing plants, and the best-case scenario is for a 
4-hour maintenance window.  I've suggested that we might be able to help 
the situation somewhat by attaching a portable USB-based hard disk, and 
adding a new tablespace that'll let us keep running while we divide up 
the work that the disk is doing, but they've made it clear that the 
current hardware configuration cannot and will not change.  Period.


So for now, we'll just try to DELETE faster than we INSERT, and combined 
with autovacuum, I'm hoping that this crisis will be averted.  That 
said, the current state of affairs with these machines is pretty 
fragile, and I think that we might want to head off such problems in the 
future, rather than be surprised by them.


Reuven

--
Reuven M. Lerner -- Web development, consulting, and training
Mobile: +972-54-496-8405 * US phone: 847-230-9795
Skype/AIM: reuvenlerner

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-27 Thread Shaun Thomas

On 02/27/2012 08:53 AM, Andrew Dunstan wrote:


If he has autovacuum on he could well be just fine with his proposed
strategy. Or he could have tables partitioned by time and do the delete
by just dropping partitions. There are numerous ways he could get this to
work.


He isn't using partitions though. That's the whole reason for this 
thread. Having autovacuum turned on (which should be the case for 8.4 
and above anyway) will not magically remove the old rows. VACUUM marks 
rows as dead/reusable, so INSERT and UPDATE statements will take the 
dead spots instead of creating new extents.


Like I said, this will stop his tables from growing further so long as 
he keeps his maintenance functions running regularly from now on, but 
the existing rows he's trying to delete will never go away until he runs 
a CLUSTER or some other system of actually purging the dead rows.


Notice how I don't suggest using VACUUM FULL. :)

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@peak6.com

__

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to 
this email

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-27 Thread Shaun Thomas

On 02/27/2012 08:59 AM, Reuven M. Lerner wrote:


From what I understand, the issue isn't one of current disk space,
but rather of how quickly the disk space is being used up.


Noted. Just keep in mind that dead rows are not free. In the case of 
sequential scans, the rows still have to be read from disk and then 
ignored by the engine. Vacuums also act as sequential scans, so the more 
data they're reading, the longer they take. This is especially true on 
an overloaded system.



I wouldn't be surprised if we end up doing a CLUSTER at some point.
The problem is basically that this machine is in 24/7 operation at
high-speed manufacturing plants, and the best-case scenario is for a
4-hour maintenance window.


The best case scenario is for them to buy a second server. If operation 
of this app stack really is critical to business, they need to spend the 
money to keep it working, or they'll end up paying much more for it when 
it fails. You also said that server has other stuff running on it, and 
it already has very little memory. That tells me they have no DR node. 
I'm afraid to even ask how they're doing backups. That one machine is a 
giant, red, flashing single point of failure. I really hope they 
understand that.



I've suggested that we might be able to help the situation somewhat
by attaching a portable USB-based hard disk, and adding a new
tablespace that'll let us keep running while we divide up the work
that the disk is doing, but they've made it clear that the current
hardware configuration cannot and will not change. Period.


And that's it, then. You have yourself a bad client. If it were me, I'd 
get through this contract and never do business with them again. They 
have a system that's basically 100% guaranteed to fail some time in the 
future (and yet is critical for operation!) and are putting Band-Aids on 
it. I think there's a parable somewhere about eggs and baskets, but I 
can't recall it at this moment. ;)


--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@peak6.com

__

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to 
this email

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-27 Thread Claudio Freire
On Mon, Feb 27, 2012 at 12:01 PM, Shaun Thomas stho...@peak6.com wrote:

 Like I said, this will stop his tables from growing further so long as he
 keeps his maintenance functions running regularly from now on, but the
 existing rows he's trying to delete will never go away until he runs a
 CLUSTER or some other system of actually purging the dead rows.

Actually, given the usage and deletion pattern, it's quite probable
that by doing only regular vacuuming disk space will be returned to
the OS within 30 days. Assuming the free space map can contain all
that free space (where progressive deletion would help in comparison
to full deletion at once), new rows will be assigned to reusable
pages, and eventually trailing pages will become free and be purged.

I'd expect that process to take around 30 days, 60 at worst. Though,
clearly, the best option is to cluster. Cluster is a lot faster than
vacuum full in 8.3, so it's worth considering, but it does require a
lot of free disk space which the system may not have.
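
A quick way to watch whether plain vacuuming is keeping up is the stats
collector; this is just a monitoring sketch, nothing specific to the OP's
schema:

SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
  FROM pg_stat_user_tables
 ORDER BY n_dead_tup DESC
 LIMIT 10;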

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-27 Thread Samuel Gendler
On Mon, Feb 27, 2012 at 6:59 AM, Reuven M. Lerner reu...@lerner.co.il wrote:


 So for now, we'll just try to DELETE faster than we INSERT, and combined
 with autovacuum, I'm hoping that this crisis will be averted.  That said,
 the current state of affairs with these machines is pretty fragile, and I
 think that we might want to head off such problems in the future, rather
 than be surprised by them.



For the record, one very effective long term solution for doing this and
continuing to be able to do this no matter how many rows have accumulated
is to partition the data tables over time so that you can just drop older
partitions.  It does require code changes since relying on a trigger on the
parent table to distribute the inserts to the correct partition is much
slower than simply modifying your code to insert/copy into the correct
partition directly.  But it is well worth doing if you are accumulating
large volumes of data.  You can even leave old partitions around if you
don't need the disk space, since well-constructed queries will simply
ignore their existence, anyway, if you are only ever going back 30 days or
less.  Indexes are on individual partitions, so you needn't worry about
indexes getting too large, either.
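
As a rough sketch of that layout, using the inheritance-style partitioning
available in the 8.x era (all names below are made up): the application
inserts directly into the current child, and retiring a month of data is a
cheap DROP.

CREATE TABLE measurements (
    id        bigint,
    logged_at timestamptz NOT NULL,
    payload   text
);

CREATE TABLE measurements_2012_02 (
    CHECK (logged_at >= DATE '2012-02-01' AND logged_at < DATE '2012-03-01')
) INHERITS (measurements);

-- the application writes straight to the child partition
INSERT INTO measurements_2012_02 (id, logged_at, payload)
VALUES (1, TIMESTAMPTZ '2012-02-15 12:00', 'sample row');

-- dropping old history is a metadata-only operation
DROP TABLE IF EXISTS measurements_2012_01;

With constraint_exclusion enabled, queries against the parent that filter on
logged_at will skip children whose CHECK constraints rule them out.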


Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-27 Thread Scott Marlowe
On Mon, Feb 27, 2012 at 2:13 PM, Samuel Gendler
sgend...@ideasculptor.com wrote:


 On Mon, Feb 27, 2012 at 6:59 AM, Reuven M. Lerner reu...@lerner.co.il
 wrote:


 So for now, we'll just try to DELETE faster than we INSERT, and combined
 with autovacuum, I'm hoping that this crisis will be averted.  That said,
 the current state of affairs with these machines is pretty fragile, and I
 think that we might want to head off such problems in the future, rather
 than be surprised by them.



 For the record, one very effective long term solution for doing this and
 continuing to be able to do this no matter how many rows have accumulated is
 to partition the data tables over time so that you can just drop older
 partitions.  It does require code changes since relying on a trigger on the
 parent table to distribute the inserts to the correct partition is much
 slower than simply modifying your code to insert/copy into the correct
 partition directly.  But it is well worth doing if you are accumulating
 large volumes of data.  You can even leave old partitions around if you
 don't need the disk space, since well-constructed queries will simply ignore
 their existence, anyway, if you are only ever going back 30 days or less.
  Indexes are on individual partitions, so you needn't worry about indexes
 getting too large, either.

If they're only inserting ~1 or 2G a day, a trigger is likely plenty
fast.  I've had stats dbs that grew by 10s or 20s of gigs a day and
the triggers were never a performance problem there.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-26 Thread Reuven M. Lerner

Hi again, everyone.

Wow, I can't get over how helpful everyone has been.

Shaun wrote:


The main problem you're going to run into is that your table is larger than the 
memory in that server. 4GB is really pretty small for a server hosting a 200+GB 
database. That they didn't mean it to get that big doesn't really help you 
clean it up.


Yep!  And as you pointed out later in your note, PostgreSQL isn't the 
only thing running on this computer.  There's also a full-fledged 
Windows application normally running on it.  And the nature of the 
manufacturing, black-box context means that maintenance is supposed to 
be rare, and that anything which gets us off of a 24/7 work schedule is 
enormously expensive.


This has been a fun problem to fix, for sure...  We're not there yet, 
but I feel like we're really close.


I'm currently trying a hybrid approach, based on several suggestions 
that were posted to this list:


Given that during this maintenance operation, nothing else should be 
running, I'm going to bump up shared_buffers.  Even after we run our 
maintenance, the fact that shared_buffers was so ridiculously low 
couldn't be helping anything, and I'll push it up.


I finally remembered why I had such a tortured set of subselects in my 
original query: If you're going to do a query with LIMIT in it, you had 
better be sure that you know what you're doing, because without an ORDER 
BY clause, you might be in for surprises.  And sure enough, in our 
testing, I realized that when we asked the database for up to 5 rows, we 
were getting the same rows again and again, thus stopping after it 
deleted a few bunches of rows.


So I changed tactics somewhat, and it appears to be working much, much 
faster: I first created a table (not a temp table, simply because my 
functions are getting invoked by the .NET application in a new 
connection each time, and I obviously don't want my table to go away) 
with the IDs of the R-table rows that are older than n days.  This 
table has about 200,000 rows in it, but each column is an int, so it's 
pretty small.


I then have a separate function that takes a parameter, the chunk size. 
 I loop through the table created in the first function 
(old_report_ids), deleting all of the records in the B table that 
references the R table.  I then remove the row from the old_report_ids 
table, and then loop again, until I've reached the chunk size.  There 
are undoubtedly more elegant ways to do this, but we just gotta get it 
working at this point. :-)
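
Roughly, that approach looks like the sketch below; the table and column
names (r, b, r_id, end_date) are my guesses, not the real schema:

CREATE TABLE old_report_ids AS
    SELECT id FROM r
     WHERE end_date < now() - interval '30 days';

CREATE OR REPLACE FUNCTION delete_chunk(chunk_size integer)
RETURNS integer AS $$
DECLARE
    rid  integer;
    done integer := 0;
BEGIN
    FOR rid IN SELECT id FROM old_report_ids LIMIT chunk_size LOOP
        DELETE FROM b WHERE r_id = rid;              -- per-row cleanup (e.g. the large-object rule) fires here
        DELETE FROM old_report_ids WHERE id = rid;   -- cross it off the work list
        done := done + 1;
    END LOOP;
    RETURN done;   -- 0 means nothing left to do
END;
$$ LANGUAGE plpgsql;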


We're about to test this, but from my small tests on my computer, it ran 
much, much faster than other options.  We'll see what happens when we 
try it now on the 200 GB monster...


Reuven

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-25 Thread Reuven M. Lerner

Hi, everyone.  Jeff wrote:


Is this 9 hours run time for deleting one day worth of data, or for
deleting the entire accumulation of cruft that filled up the hard
drive in the first place (which would be 170 days, if you have 200GB
that accumulated at 1GB per day and you only need 30 days) ?


Unfortunately, it took 9 hours to delete all of the rows associated with 
the older-than-30-days records.


Reuven
--
Reuven M. Lerner -- Web development, consulting, and training
Mobile: +972-54-496-8405 * US phone: 847-230-9795
Skype/AIM: reuvenlerner

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-24 Thread Lew

On 02/23/2012 07:28 AM, k...@rice.edu wrote:

On Thu, Feb 23, 2012 at 05:25:46PM +0200, Reuven M. Lerner wrote:


What is the distribution of end_dates? It might be worth running this in
several steps, deleting records older than, say, 90 days, 60 days, 30 days.


I've suggested something similar, but was told that we have limited
time to execute the DELETE, and that doing it in stages might not be
possible.


In cases like this, I have often found that doing the delete in smaller
pieces goes faster, sometimes much faster, than the bigger delete.


For some reason it is common for a conversation with a software manager to go 
like this:


Programmer: "Let's go with option A; it'll be much faster than what we're 
doing."
Manager: "We don't have time to do that."

We don't have time to be faster? When I've had this conversation, the payback 
was usually immediate: it's Wednesday, it'll be faster by Thursday (the next 
day), and we'll get more done by Friday of the same week doing it the new way. 
But we "don't have time."


I have had this conversation dozens of times over the years. (I was always 
Programmer.)


--
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-24 Thread Samuel Gendler
On Thu, Feb 23, 2012 at 10:39 PM, Reuven M. Lerner reu...@lerner.co.il wrote:

 Hi, everyone.

 So it turns out that we're not using 25 GB of virtual memory.  (That's
 what I had been shown yesterday, and it was a bit surprising, to say the
 least...)

 A few statistics that I managed to get from the Windows developers/system
 administrators:

 - The machine has a total of 3.5 GB of RAM
 - shared_buffers was set to 256 MB (yes, MB!)
 - Virtual memory usage by our process is 3 MB (yes, MB)
 - CPU is virtually idle when running the deletes, using about 1% of CPU
 - No other processes are accessing the database when we're running the
 maintenance; there are a total of three server processes, but two are idle.


What is work_mem set to?  If all the other values were set so low, I'd
expect work_mem to also be small, which could be causing all kinds of disk
activity when steps don't fit into a work_mem segment.


Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-24 Thread Reuven M. Lerner

Hi, everyone.  Samuel wrote:


What is work_mem set to?  If all the other values were set so low, I'd
expect work_mem to also be small, which could be causing all kinds of
disk activity when steps don't fit into a work_mem segment.


I just checked, and work_mem is set to 30 MB.  That seems a bit low to 
me, given the size of the database and the fact that we're doing so much 
sorting and subselecting.  Am I right that we should push that up some more?


Reuven
--
Reuven M. Lerner -- Web development, consulting, and training
Mobile: +972-54-496-8405 * US phone: 847-230-9795
Skype/AIM: reuvenlerner

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-24 Thread Claudio Freire
On Fri, Feb 24, 2012 at 9:37 AM, Reuven M. Lerner reu...@lerner.co.il wrote:
 I just checked, and work_mem is set to 30 MB.  That seems a bit low to me,
 given the size of the database and the fact that we're doing so much sorting
 and subselecting.  Am I right that we should push that up some more?

You can certainly increase work_mem **for the delete** (which you can
do by issuing set work_mem='something' just before the delete - it
will only apply to that connection), but bear in mind that work_mem is
the amount of memory each connection can use for each operation. Total
usage can go way higher than max_connections * work_mem, depending on
the kind of queries you have.
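
For example, something like this, with the value chosen to fit the machine
(256MB is just a placeholder) and the DELETE shape borrowed from earlier in
the thread:

SET work_mem = '256MB';   -- affects only this connection
DELETE FROM b
 WHERE r_id IN (SELECT id FROM r
                 WHERE end_date < now() - interval '30 days');
RESET work_mem;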

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-24 Thread Shaun Thomas

On 02/24/2012 12:39 AM, Reuven M. Lerner wrote:


- CPU is virtually idle when running the deletes, using about 1% of
CPU


I think you found your problem.

See if you can get the Windows admins to give you some info on how busy 
the disks are (percent utilization, IOPS, something) the next time you 
try this. Increasing your memory settings may help, but a 1% CPU usage 
usually suggests it's waiting for disk blocks to be read before it can 
actually do something.


--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@peak6.com

__

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to 
this email

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-24 Thread Reuven M. Lerner

Hi, everyone.  Shaun said:

On 02/24/2012 12:39 AM, Reuven M. Lerner wrote:


- CPU is virtually idle when running the deletes, using about 1% of
CPU


I think you found your problem.

See if you can get the Windows admins to give you some info on how busy
the disks are (percent utilization, IOPS, something) the next time you
try this. Increasing your memory settings may help, but a 1% CPU usage
usually suggests it's waiting for disk blocks to be read before it can
actually do something.


I asked them for disk readings, but I'm not sure how to contextualize 
the numbers I got:


I/O writes: process 1: 820,000; process 2: 1 million; process 3: 33,000

Any suggestions for what I can do to improve performance with such a 
slow disk, and a lack of additional RAM?


Reuven

--
Reuven M. Lerner -- Web development, consulting, and training
Mobile: +972-54-496-8405 * US phone: 847-230-9795
Skype/AIM: reuvenlerner

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-24 Thread Shaun Thomas

On 02/24/2012 08:54 AM, Reuven M. Lerner wrote:


I/O writes: process 1: 820,000; process 2: 1 million; process 3: 33,000


That's not especially helpful, unfortunately. That doesn't really tell 
us how saturated the controller is. However I suspect it's being 
effectively slammed based simply on your CPU usage.


The main problem you're going to run into is that your table is larger 
than the memory in that server. 4GB is really pretty small for a server 
hosting a 200+GB database. That they didn't mean it to get that big 
doesn't really help you clean it up.


But as a consequence, deleting from that table, or creating a temp table 
with 30 days of data, truncating the table, and re-inserting, it's still 
going to cause a lot of disk activity. Especially since the database is 
constantly writing out transaction logs. But you do have a few things on 
your side.


You say you're deleting from table B, which has no foreign keys 
referencing it. That's good. You need to go back to your truncate 
approach, and do this:


CREATE TABLE keep_b_data AS
SELECT *
  FROM B
 WHERE some_date >= CURRENT_DATE - INTERVAL '30 days';

TRUNCATE TABLE B;

DROP INDEX idx_something_on_b_1;
DROP INDEX idx_something_on_b_2;
DROP INDEX idx_something_on_b_3;

ALTER TABLE B DROP CONSTRAINT whatever_pk;

INSERT INTO B
SELECT *
  FROM keep_b_data;

ALTER TABLE B ADD CONSTRAINT whatever_pk PRIMARY KEY (some_col);

CREATE INDEX idx_something_on_b_1 ON B (col_a);
CREATE INDEX idx_something_on_b_2 ON B (col_b);
CREATE INDEX idx_something_on_b_3 ON B (col_c);

You need to make sure nothing is reading from the table while you're 
doing this, because the missing indexes will make selects increase your 
disk utilization, which you definitely don't want. Get a window to work in.


But this should be much faster than your original attempts. Inserting 
the 30-day window into table B should be just as fast as creating the 
holding table, and creating the primary key and recreating the indexes 
should take about the same amount of time each.


So to get a *rough* idea of how long it will take, do the first step, 
and create the holding table. Multiply that by the number of indexes and 
the primary key, plus 1. So if it takes 20 minutes, and you have three 
indexes, and the primary key, multiply by five.


I guess the other question is: Is PostgreSQL the only thing running on 
this server? If not, that may be the source of your disk IO, and it's 
choking the database and your ability to clean it up. Try to get them to 
temporarily disable all non-essential services while you do the cleanup. 
I'm wondering if they're also running the app on the Windows machine, 
based on your original story. That in itself isn't a very good design, 
but they're also running a PostgreSQL server on Windows, so who knows 
what they're thinking over there. :)


--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@peak6.com

__

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to 
this email

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-24 Thread Steve Crawford

On 02/23/2012 12:39 AM, Reuven M. Lerner wrote:

Hi, everyone. ...
...at one installation, we're quickly running out of disk space.  The 
database is already taking up about 200 GB of space, and is growing by 
1 GB or so a day.


I've been following the discussion of approaches and tuning for bulk 
deletes and suddenly wondered if you have checked a couple other basics.


Do you know the source of the increases in DB size? Is it due strictly 
to inserted data or are there lots of updates as well?


Is autovacuum running properly?

Could you, due to bulk deletes and other issues, be suffering from 
table- or index-bloat? Heavily bloated tables/indexes will exacerbate 
both your disk-usage and performance problems.


If possible you might try clustering your tables and see what happens to 
disk usage and bulk-delete performance. Clusters are generally 
reasonably fast - way faster than VACUUM FULL, though they could still 
take a while on your very large tables.


As a bonus, cluster gives you shiny, new non-bloated indexes. They do 
require an exclusive lock and they do require sufficient disk-space to 
build the new, albeit smaller, table/indexes so it may not be an option 
if you are short on disk-space.  You may be able to start by clustering 
your smaller tables and move toward the larger ones as you free 
disk-space. Be sure to run ANALYZE on any table that you have CLUSTERed.
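
For a single table that boils down to something like the following (table and
index names are hypothetical):

CLUSTER my_big_table USING my_big_table_pkey;  -- exclusive lock; needs room for the rewrite
ANALYZE my_big_table;                          -- refresh planner statistics afterwards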


You might find it useful to make CLUSTER part of your regular maintenance.

Cheers,
Steve


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Very long deletion time on a 200 GB database

2012-02-23 Thread Reuven M. Lerner
Hi, everyone.  I'm maintaining an application that exists as a black 
box in manufacturing plants.  The system is based on Windows, .NET, and 
PostgreSQL 8.3.  I'm a Unix kind of guy myself, but the application 
layer and system administration are being handled by other people; I'm 
just the PostgreSQL guy.


Because of the nature of the application, we don't have direct control 
over what happens.  And it turns out that at one installation, we're 
quickly running out of disk space.  The database is already taking up 
about 200 GB of space, and is growing by 1 GB or so a day.  Switching 
disks, either to a larger/faster traditional drive, or even to a SSD, is 
not an option.  (And yes, I know that SSDs have their own risks, but I'm 
just throwing that out as one option.)


Right now, the best solution to the space problem is to delete 
information associated with old records, where old is from at least 30 
days ago.  The old records are spread across a few tables, including 
many large objects.  (The application was written by people who were new 
to PostgreSQL, and didn't realize that they could use BYTEA.) 
Basically, given a foreign key B.a_id that points to table A, I want to 
DELETE all in B where A's creation date is at least 30 days ago.


Unfortunately, when we implemented this simple delete, it executed 
slower than molasses, taking about 9 hours to do its thing.   Not only 
does this seem like a really, really long time to do such deleting, but 
we have only a 4-hour window in which to run this maintenance activity, 
before the factory starts to use our black box again.


I've tried a few approaches so far, none of which have been hugely 
successful.  The fact that it takes several hours to test each theory is 
obviously a bit of a pain, and so I'm curious to hear suggestions from 
people here.


I should note that my primary concern is available RAM.  The database, 
as I wrote, is about 200 GB in size, and PostgreSQL is reporting 
(according to Windows) use of about 5 GB RAM, plus another 25 GB of 
virtual memory.  I've told the Windows folks on this project that 
virtual memory kills a database, and that it shouldn't surprise us to 
have horrible performance if the database and operating system are both 
transferring massive amounts of data back and forth.  But there doesn't 
seem to be a good way to handle this


This is basically what I'm trying to execute:

DELETE FROM  B
WHERE r_id IN (SELECT R.id
 FROM R, B
WHERE r.end_date < (NOW() - (interval '1 day' * 30))
  AND r.id = b.r_id

(1) I tried to write this as a join, rather than a subselect.  But B has 
an oid column that points to large objects, and on which we have a rule 
that removes the associated large object when a row in B is removed. 
Doing the delete as a join resulted in "no such large object with an oid 
of xxx" errors.  (I'm not sure why, although it might have to do with 
the rule.)


(2) I tried to grab the rows that *do* interest me, put them into a 
temporary table, TRUNCATE the existing table, and then copy the rows 
back.   I only tested that with a 1 GB subset of the data, but that took 
longer than other options.


(3) There are some foreign-key constraints on the B table.  I thought 
that perhaps doing a mass DELETE was queueing up all of those 
constraints, and possibly using up lots of memory and/or taking a long 
time to execute.  I thus rewrote my queries such that they first removed 
the constraints, then executed the DELETE, and then restored the 
constraints.  That didn't seem to improve things much either, and took a 
long time (30 minutes) just to remove the constraints.  I expected 
re-adding the constraints to take a while, but shouldn't removing them 
be relatively quick?


(4) I tried chunking the deletes, such that instead of trying to 
delete all of the records from the B table, I would instead delete just 
those associated with 100 or 200 rows from the R table.  On a 1 GB 
subset of the data, this seemed to work just fine.  But on the actual 
database, it was still far too slow.


I've been surprised by the time it takes to delete the records in 
question.  I keep trying to tell the others on this project that 
PostgreSQL isn't inherently slow, but that a 200 GB database running on 
a non-dedicated machine, with an old version (8.3), and while it's 
swapping RAM, will be slow regardless of the database software we're 
using.  But even so, 9 hours to delete 100 GB of data strikes me as a 
very long process.


Again, I continue to believe that given our hard time deadlines, and the 
fact that we're using a large amount of virtual memory, that there isn't 
really a solution that will work quickly and easily.  But I'd be 
delighted to be wrong, and welcome any and all comments and suggestions 
for how to deal with this.


Reuven

--
Reuven M. Lerner -- Web development, consulting, and training
Mobile: +972-54-496-8405 * US phone: 847-230-9795
Skype/AIM: reuvenlerner

--
Sent via pgsql-performance 

Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-23 Thread Glyn Astill
Do you have any more detailed information about the hardware? What sort of disk 
configuration does it have?

Can you get onto the machine to look at what is using those resources?  You 
mention the 25gb of virtual memory; is that being used?  If so is it being used 
by postgres or something else?  If it's being used by postgres you should 
change postgresql.conf to work within your 5gb, otherwise can you stop the 
other applications to do your delete?  A snapshot from task manager or process 
monitor of process resource usage would be useful, even better somelogging from 
perfmon including physical disk usage.


What would be even more useful is the table definitions; you mention trying to 
drop constraints to speed it up but is there anything else at play, e.g. 
triggers?




 From: Reuven M. Lerner reu...@lerner.co.il
To: pgsql-performance@postgresql.org 
Sent: Thursday, 23 February 2012, 8:39
Subject: [PERFORM] Very long deletion time on a 200 GB database
 


Hi, everyone.  I'm maintaining an application that exists as a black box in 
manufacturing plants.  The system is based on Windows, .NET, and PostgreSQL 
8.3.  I'm a Unix kind of guy myself, but the application layer and system 
administration are being handled by other people; I'm just the PostgreSQL guy.

Because of the nature of the application, we don't have direct control over 
what happens.  And it turns out that at one installation, we're quickly 
running out of disk space.  The database is already taking up about 200 GB of 
space, and is growing by 1 GB or so a day.  Switching disks, either to a 
larger/faster traditional drive, or even to a SSD, is not an option.  (And 
yes, I know that SSDs have their own risks, but I'm just throwing that out as 
one option.)

Right now, the best solution to the space problem is to delete information 
associated with old records, where old is from at least 30 days ago.  The 
old records are spread across a few tables, including many large objects.  
(The application was written by people who were new to PostgreSQL, and didn't 
realize that they could use BYTEA.) Basically, given a foreign key B.a_id that 
points to table A, I want to DELETE all in B where A's creation date is at 
least 30 days ago.

Unfortunately, when we implemented this simple delete, it executed slower than 
molasses, taking about 9 hours to do its thing.   Not only does this seem like 
a really, really long time to do such deleting, but we have only a 4-hour 
window in which to run this maintenance activity, before the factory starts to 
use our black box again.

I've tried a few approaches so far, none of which have been hugely 
successful.  The fact that it takes several hours to test each theory is 
obviously a bit of a pain, and so I'm curious to hear suggestions from people 
here.

I should note that my primary concern is available RAM.  The database, as I 
wrote, is about 200 GB in size, and PostgreSQL is reporting (according to 
Windows) use of about 5 GB RAM, plus another 25 GB of virtual memory.  I've 
told the Windows folks on this project that virtual memory kills a database, 
and that it shouldn't surprise us to have horrible performance if the database 
and operating system are both transferring massive amounts of data back and 
forth.  But there doesn't seem to be a good way to handle this

This is basically what I'm trying to execute:

DELETE FROM  B
WHERE r_id IN (SELECT R.id
     FROM R, B
    WHERE r.end_date < (NOW() - (interval '1 day' * 30))
      AND r.id = b.r_id

(1) I tried to write this as a join, rather than a subselect.  But B has an 
oid column that points to large objects, and on which we have a rule that 
removes the associated large object when a row in B is removed. Doing the 
delete as a join resulted in "no such large object with an oid of xxx" 
errors.  (I'm not sure why, although it might have to do with the rule.)

(2) I tried to grab the rows that *do* interest me, put them into a temporary 
table, TRUNCATE the existing table, and then copy the rows back.   I only 
tested that with a 1 GB subset of the data, but that took longer than other 
options.

(3) There are some foreign-key constraints on the B table.  I thought that 
perhaps doing a mass DELETE was queueing up all of those constraints, and 
possibly using up lots of memory and/or taking a long time to execute.  I thus 
rewrote my queries such that they first removed the constraints, then executed 
the DELETE, and then restored the constraints.  That didn't seem to improve 
things much either, and took a long time (30 minutes) just to remove the 
constraints.  I expected re-adding the constraints to take a while, but 
shouldn't removing them be relatively quick?

(4) I tried chunking the deletes, such that instead of trying to delete all 
of the records from the B table, I would instead delete just those associated 
with 100 or 200 rows from the R table.  On a 1 GB subset of the data, this 
seemed to work

Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-23 Thread Marcin Mańk
 DELETE FROM  B
 WHERE r_id IN (SELECT R.id
     FROM R, B
    WHERE r.end_date < (NOW() - (interval '1 day' * 30))
      AND r.id = b.r_id


How about:

 DELETE FROM  B
 WHERE r_id IN (SELECT distinct R.id
     FROM R  WHERE r.end_date < (NOW() - (interval '1 day' * 30))

?

Greetings
Marcin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-23 Thread Richard Huxton

On 23/02/12 08:39, Reuven M. Lerner wrote:
(4) I tried chunking the deletes, such that instead of trying to 
delete all of the records from the B table, I would instead delete 
just those associated with 100 or 200 rows from the R table.  On a 1 
GB subset of the data, this seemed to work just fine.  But on the 
actual database, it was still far too slow.


This is the approach I'd take. You don't have enough control / access to 
come up with a better solution. Build a temp table with 100 ids to 
delete. Time that, and then next night you can increase to 200 etc until 
it takes around 3 hours.


Oh - and get the Windows admins to take a look at disk activity - the 
standard performance monitor can tell you more than enough. If it is 
swapping constantly, performance will be atrocious but even if the disks 
are just constantly busy then updates and deletes can be very slow.


--
  Richard Huxton
  Archonet Ltd


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-23 Thread Claudio Freire
On Thu, Feb 23, 2012 at 5:39 AM, Reuven M. Lerner reu...@lerner.co.il wrote:
 Unfortunately, when we implemented this simple delete, it executed slower
 than molasses, taking about 9 hours to do its thing.   Not only does this
 seem like a really, really long time to do such deleting, but we have only a
 4-hour window in which to run this maintenance activity, before the factory
 starts to use our black box again.

PG 8.3 had horrible hash joins for big tables, so you might try set
enable_hashjoin=false prior to your query. I suspect this is not your
biggest problem, though...
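
For instance, checking the plan before running the real thing (the query shape
is taken from the original post; this is only a sketch):

SET enable_hashjoin = off;   -- session-only planner setting
EXPLAIN
DELETE FROM b
 WHERE r_id IN (SELECT id FROM r
                 WHERE end_date < now() - interval '30 days');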

 I should note that my primary concern is available RAM.  The database, as I
 wrote, is about 200 GB in size, and PostgreSQL is reporting (according to
 Windows) use of about 5 GB RAM, plus another 25 GB of virtual memory.

You really *have* to look into that situation. 25GB of *active*
virtual memory? That would mean a thrashing server, and an utterly
unresponsive one from my experience. Your data is probably wrong,
because if I had a server *using* 30G of RAM only 5 of which are
physical, I wouldn't even be able to open a remote desktop to it.

I bet your numbers are wrong. In any case, you have to look into the
matter. Any amount of swapping will kill performance for postgres,
throwing plans out of whack, turning sequential I/O into random I/O, a
mess.

 told the Windows folks on this project that virtual memory kills a database,
 and that it shouldn't surprise us to have horrible performance if the
 database and operating system are both transferring massive amounts of data
 back and forth.  But there doesn't seem to be a good way to handle this

There is, tune postgresql.conf to use less memory.

 This is basically what I'm trying to execute:

 DELETE FROM  B
 WHERE r_id IN (SELECT R.id
     FROM R, B
    WHERE r.end_date < (NOW() - (interval '1 day' * 30))
      AND r.id = b.r_id

DELETE is way faster when you have no constraints, no primary key, no indices.

If you could stop all database use in that 4-hour period, it's
possible dropping indices, FKs and PK, delete, and recreating the
indices/FKs/PK will run fast enough.
You'll have to test that on some test server though...

 (3) There are some foreign-key constraints on the B table.  I thought that
 perhaps doing a mass DELETE was queueing up all of those constraints, and
 possibly using up lots of memory and/or taking a long time to execute.  I
 thus rewrote my queries such that they first removed the constraints, then
 executed the DELETE, and then restored the constraints.

That's not enough. As I said, you have to drop the PK and all indices too.

 That didn't seem to
 improve things much either, and took a long time (30 minutes) just to remove
 the constraints.  I expected re-adding the constraints to take a while, but
 shouldn't removing them be relatively quick?

That means your database is locked (a lot of concurrent access), or
thrashing, because dropping a constraint is a very quick task.
If you stop your application, I'd expect dropping constraints and
indices to take almost no time.

 (4) I tried chunking the deletes, such that instead of trying to delete
 all of the records from the B table, I would instead delete just those
 associated with 100 or 200 rows from the R table.  On a 1 GB subset of the
 data, this seemed to work just fine.  But on the actual database, it was
 still far too slow.

Check the hash join thing.
Check/post an explain of the delete query, to see if it uses hash
joins, and which tables are hash-joined. If they're big ones, 8.3 will
perform horribly.
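
For example (a sketch; EXPLAIN without ANALYZE only plans the statement
and does not actually run the delete):

SET enable_hashjoin = off;

EXPLAIN
DELETE FROM b
 WHERE r_id IN (SELECT r.id
                  FROM r
                 WHERE r.end_date < NOW() - INTERVAL '30 days');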

 I've been surprised by the time it takes to delete the records in question.
  I keep trying to tell the others on this project that PostgreSQL isn't
 inherently slow, but that a 200 GB database running on a non-dedicated
 machine, with an old version (8.3), and while it's swapping RAM, will be
 slow regardless of the database software we're using.  But even so, 9 hours
 to delete 100 GB of data strikes me as a very long process.

A DELETE under MVCC is more like an update. Making it transactional is a
complex procedure, which is why TRUNCATE is so much faster.



Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-23 Thread Andrew Dunstan



On 02/23/2012 05:07 AM, Marcin Mańk wrote:

DELETE FROM  B
WHERE r_id IN (SELECT R.id
 FROM R, B
WHERE r.end_date < (NOW() - (interval '1 day' * 30))
  AND r.id = b.r_id);


How about:

  DELETE FROM  B
  WHERE r_id IN (SELECT distinct R.id
  FROM R  WHERE r.end_date < (NOW() - (interval '1 day' * 30)))

?



Or possibly without the DISTINCT.  But I agree that the original query 
shouldn't have B in the subquery - that alone could well make it crawl.


What is the distribution of end_dates? It might be worth running this in 
several steps, deleting records older than, say, 90 days, 60 days, 30 days.
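
Sketched out, that's just a series of separate statements, each one
committing on its own and each pass working on a smaller slice than the
last (the cutoffs are only illustrative):

DELETE FROM b
 WHERE r_id IN (SELECT id FROM r WHERE end_date < NOW() - INTERVAL '90 days');

DELETE FROM b
 WHERE r_id IN (SELECT id FROM r WHERE end_date < NOW() - INTERVAL '60 days');

DELETE FROM b
 WHERE r_id IN (SELECT id FROM r WHERE end_date < NOW() - INTERVAL '30 days');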


cheers

andrew



Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-23 Thread Reuven M. Lerner
Hi, everyone.  Thanks for all of the help and suggestions so far; I'll 
try to respond to some of them soon.  Andrew wrote:



How about:

DELETE FROM B
WHERE r_id IN (SELECT distinct R.id
FROM R WHERE r.end_date < (NOW() - (interval '1 day' * 30)))

?



Or possibly without the DISTINCT. But I agree that the original query
shouldn't have B in the subquery - that alone could well make it crawl.


I put B in the subquery so as to reduce the number of rows that would be 
returned, but maybe that was indeed backfiring on me.  Now that I think 
about it, B is a huge table, and R is a less-huge one, so including B in 
the subselect was probably a mistake.




What is the distribution of end_dates? It might be worth running this in
several steps, deleting records older than, say, 90 days, 60 days, 30 days.


I've suggested something similar, but was told that we have limited time 
to execute the DELETE, and that doing it in stages might not be possible.


Reuven


--
Reuven M. Lerner -- Web development, consulting, and training
Mobile: +972-54-496-8405 * US phone: 847-230-9795
Skype/AIM: reuvenlerner



Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-23 Thread k...@rice.edu
On Thu, Feb 23, 2012 at 05:25:46PM +0200, Reuven M. Lerner wrote:
 
 What is the distribution of end_dates? It might be worth running this in
 several steps, deleting records older than, say, 90 days, 60 days, 30 days.
 
 I've suggested something similar, but was told that we have limited
 time to execute the DELETE, and that doing it in stages might not be
 possible.
 
 Reuven
 

In cases like this, I have often found that doing the delete in smaller
pieces goes faster, sometimes much faster, than the bigger delete.
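
One hedged way to slice it, assuming R has an integer primary key: drive
the chunks from the client so each DELETE commits on its own, walking the
key range in steps (the bounds below are made up):

DELETE FROM b
 WHERE r_id IN (SELECT id FROM r
                 WHERE end_date < NOW() - INTERVAL '30 days'
                   AND id >= 0 AND id < 100000);
-- next pass: AND id >= 100000 AND id < 200000, and so on.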

Regards,
Ken



Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-23 Thread Steve Crawford

On 02/23/2012 12:39 AM, Reuven M. Lerner wrote:

Hi, everyone...
This is basically what I'm trying to execute:

DELETE FROM  B
WHERE r_id IN (SELECT R.id
 FROM R, B
WHERE r.end_date < (NOW() - (interval '1 day' * 30))
  AND r.id = b.r_id);


I don't recall which versions like which approach, but have you tried 
...WHERE EXISTS (SELECT... instead of WHERE IN? Depending on the version 
of PostgreSQL, one or the other may yield a superior result.
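
As a sketch, the EXISTS form of the same delete would be:

DELETE FROM b
 WHERE EXISTS (SELECT 1
                 FROM r
                WHERE r.id = b.r_id
                  AND r.end_date < NOW() - INTERVAL '30 days');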



(2) I tried to grab the rows that *do* interest me, put them into a 
temporary table, TRUNCATE the existing table, and then copy the rows 
back.   I only tested that with a 1 GB subset of the data, but that 
took longer than other options.




Was the 1GB subset the part you were keeping or the part you were 
deleting? Which part was slow (creating the temp table or copying it back)?


Try running EXPLAIN on the SELECT query that creates the temporary table 
and try to optimize that. Also, when copying the data back, you are 
probably having to deal with index and foreign-key maintenance. It will 
probably be faster to drop those, copy the data back, and then recreate them.


I know you are a *nix-guy in a Windows org so your options are limited, 
but word-on-the-street is that for high-performance production use, 
install PostgreSQL on *nix.


Cheers,
Steve




Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-23 Thread Shaun Thomas

On 02/23/2012 02:39 AM, Reuven M. Lerner wrote:


I should note that my primary concern is available RAM. The database, as
I wrote, is about 200 GB in size, and PostgreSQL is reporting (according
to Windows) use of about 5 GB RAM, plus another 25 GB of virtual memory.


O_o

That... that would probably swap just constantly. No end. Just swap all 
day long. But maybe not. Please tell us the values for these settings:


* shared_buffers
* work_mem
* maintenance_work_mem
* checkpoint_segments
* checkpoint_timeout

It also wouldn't be a bad idea to see how many concurrent connections 
there are, because that may determine how much memory all the backends 
are consuming. In any case, if it's actually using 25 GB of virtual 
memory, any command you run that doesn't happen to be in cache will 
just immediately join a giant logjam.
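
One way to pull all of that from a live session (a sketch; psql's SHOW
command works for the individual settings too):

SELECT name, setting, unit
  FROM pg_settings
 WHERE name IN ('shared_buffers', 'work_mem', 'maintenance_work_mem',
                'checkpoint_segments', 'checkpoint_timeout');

SELECT count(*) AS connections FROM pg_stat_activity;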



I've told the Windows folks on this project that virtual memory kills a
database, and that it shouldn't surprise us to have horrible performance
if the database and operating system are both transferring massive
amounts of data back and forth. But there doesn't seem to be a good way
to handle this


You kinda can, by checking those settings and sanitizing them. If 
they're out of line, or too large, they'll create the need for more 
virtual memory. Having the virtual memory there isn't necessarily bad, 
but using it is.



DELETE FROM B
WHERE r_id IN (SELECT R.id
FROM R, B
WHERE r.end_date < (NOW() - (interval '1 day' * 30))
AND r.id = b.r_id);


Just to kinda help you out syntactically, have you ever tried a DELETE 
FROM ... USING? You can also collapse your interval notation.


DELETE FROM B
 USING R
 WHERE R.id = B.r_id
   AND R.end_date < CURRENT_DATE - INTERVAL '30 days';

But besides that, the other advice you've received is sound. Since your 
select-truncate-insert attempt was also slow, I suspect you're having 
problems with foreign-key checks and with updating the index trees. 
Maintaining an existing index can be many times slower than filling an 
empty table and creating the indexes afterwards.


As for your foreign keys: if any of the child tables don't have an 
index on the referring column, your delete performance will be 
atrocious. You also need to make sure the types of the columns are 
identical. Even a numeric/integer difference will be enough to render an 
index unusable.
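
As a sketch, with a hypothetical child table c that references b(id)
through a column c.b_id, that means something like:

CREATE INDEX c_b_id_idx ON c (b_id);   -- index the referencing column

-- and if the types ever diverged (e.g. numeric vs. integer), align them:
-- ALTER TABLE c ALTER COLUMN b_id TYPE integer;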


We have a 100GB *table* with almost 200M rows and even deleting from 
that in many of our archive tests doesn't take anywhere near 9 hours. 
But I *have* seen a delete take that long when we had a numeric primary 
key, and an integer foreign key. Even a handful of records can cause a 
nested loop sequence scan, which will vastly inflate delete time.


--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@peak6.com




Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-23 Thread Greg Spiegelberg
On Thu, Feb 23, 2012 at 8:25 AM, Reuven M. Lerner reu...@lerner.co.il wrote:


 I've suggested something similar, but was told that we have limited time
 to execute the DELETE, and that doing it in stages might not be possible.


Just so happens I had this exact problem last week on a rather large table.
 * DELETE FROM x WHERE id IN (SELECT id FROM y ...) was horrible at best.
 * DELETE FROM x USING y WHERE ... was nearly as bad
Both of the above were taking hours and looked as though they would
stretch into days.

What I discovered, however, was this little technique that I'm sure all the
Pg gods will smite me for using; however, it worked for me.

BEGIN;
LOCK TABLE x IN SHARE UPDATE EXCLUSIVE MODE;  -- to prevent VACUUMs
SELECT x.ctid INTO TEMPORARY TABLE recs_to_delete FROM x,y WHERE x.id=y.id;
DELETE FROM x USING recs_to_delete r WHERE r.ctid=x.ctid;
COMMIT;

I know there are perils in using ctid but with the LOCK it should be safe.
 This transaction took perhaps 30 minutes and removed 100k rows and once
the table was VACUUM'd afterward it freed up close to 20 GB on the file
system.

HTH
-Greg


Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-23 Thread Shaun Thomas

On 02/23/2012 11:56 AM, Greg Spiegelberg wrote:


I know there are perils in using ctid but with the LOCK it should be
safe. This transaction took perhaps 30 minutes and removed 100k rows
and once the table was VACUUM'd afterward it freed up close to 20 GB
on the file system.


It took *30 minutes* to delete 100k rows? And 100k rows were using 20GB? 
Is that off by an order of magnitude?


Using the ctid is a cute trick, though. :)

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@peak6.com




Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-23 Thread Andy Colson

On 2/23/2012 12:05 PM, Shaun Thomas wrote:

On 02/23/2012 11:56 AM, Greg Spiegelberg wrote:


I know there are perils in using ctid but with the LOCK it should be
safe. This transaction took perhaps 30 minutes and removed 100k rows
and once the table was VACUUM'd afterward it freed up close to 20 GB
on the file system.


It took *30 minutes* to delete 100k rows? And 100k rows were using 20GB?
Is that off by an order of magnitude?

Using the ctid is a cute trick, though. :)



And I'm not sure the LOCK is necessary; while googling for "delete from 
table limit 10" I ran across this thread:


http://archives.postgresql.org/pgsql-hackers/2010-11/msg02028.php

They use it without locks.

-Andy



Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-23 Thread Greg Spiegelberg
On Thu, Feb 23, 2012 at 11:11 AM, Andy Colson a...@squeakycode.net wrote:

 On 2/23/2012 12:05 PM, Shaun Thomas wrote:

 On 02/23/2012 11:56 AM, Greg Spiegelberg wrote:

  I know there are perils in using ctid but with the LOCK it should be
 safe. This transaction took perhaps 30 minutes and removed 100k rows
 and once the table was VACUUM'd afterward it freed up close to 20 GB
 on the file system.


 It took *30 minutes* to delete 100k rows? And 100k rows were using 20GB?
 Is that off by an order of magnitude?

 Using the ctid is a cute trick, though. :)


 And I'm not sure the LOCK is necessary, while googling for delete from
 table limit 10 I ran across this thread:

 http://archives.postgresql.org/pgsql-hackers/2010-11/msg02028.php

 They use it without locks.


I used LOCK simply because if a VACUUM FULL x; slipped in between the
SELECT and the DELETE the ctid's could conceivably change.

-Greg


Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-23 Thread Tom Lane
Greg Spiegelberg gspiegelb...@gmail.com writes:
 I used LOCK simply because if a VACUUM FULL x; slipped in between the
 SELECT and the DELETE the ctid's could conceivably change.

VACUUM FULL can't slip in there, because you'd have AccessShareLock
just from the SELECT.  The real problem goes like this:

1. You SELECT some ctid and save it in the other table.
2. Somebody else updates or deletes that row.
3. Plain VACUUM comes along and frees the dead TID.
4. Somebody else (maybe not same somebody as #2) inserts a new
   row at that TID position.
5. You DELETE that TID.  Ooops.

So you might say okay, the point of the lock is to block plain vacuum,
not vacuum full.  I'm still a bit worried about whether the technique
is entirely safe, though, because of page pruning which can happen
anyway.  What this really boils down to is: how sure are you that no
other userland activity is going to update or delete any of the targeted
rows between the SELECT INTO and the DELETE?  If you're sure, then this
is safe without the extra lock.  Otherwise, I wouldn't trust it.

It might be worth having the SELECT that creates the temp table be a
SELECT FOR UPDATE on the target table, so as to ensure you've locked
down the targeted rows against anybody else.  This is not free though,
as it'll mean extra writes of all the modified tuples.

regards, tom lane



Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-23 Thread Tom Lane
I just reread the original post and noted this:

Reuven M. Lerner reu...@lerner.co.il writes:
 (1) I tried to write this as a join, rather than a subselect.  But B has 
 an oid column that points to large objects, and on which we have a rule 
 that removes the associated large object when a row in B is removed. 

A rule?  Really?  That's probably bad enough in itself, but when you
write an overcomplicated join delete query, I bet the resulting plan
is spectacularly bad.  Have you looked at the EXPLAIN output for this?

I'd strongly recommend getting rid of the rule in favor of a trigger.
Also, as already noted, the extra join inside the IN sub-select is
probably hurting far more than it helps.
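
A minimal sketch of the trigger replacement, assuming the OID column on B
is called blob_oid (the real column name, and any error handling for
already-missing large objects, is left out):

CREATE OR REPLACE FUNCTION b_unlink_lo() RETURNS trigger AS $$
BEGIN
    PERFORM lo_unlink(OLD.blob_oid);   -- drop the large object of the deleted row
    RETURN OLD;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER b_unlink_lo_trg
    AFTER DELETE ON b
    FOR EACH ROW EXECUTE PROCEDURE b_unlink_lo();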

 (3) There are some foreign-key constraints on the B table.

If those are FK references *to* the B table, make sure the other end
(the referencing column) is indexed.  Postgres doesn't require an index
on a referencing column, but deletes in the referenced table will suck
if you haven't got one.

I don't think any of the fancy stuff being discussed in the thread is
worth worrying about until you've got these basic issues dealt with.

regards, tom lane



Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-23 Thread Jesper Krogh

On 23/02/12 09:39, Reuven M. Lerner wrote:
Hi, everyone.  I'm maintaining an application that exists as a black 
box in manufacturing plants.  The system is based on Windows, .NET, 
and PostgreSQL 8.3.  I'm a Unix kind of guy myself, but the 
application layer and system administration are being handled by other 
people; I'm just the PostgreSQL guy.


Just thinking out loud. It looks like (just guessing)
the application needs to store data going one month back, and
it was put into production under the assumption that it would
never fill up, or that deletion could easily be done during
maintenance windows. That now turns out not to be the case.

I would stuff in a trigger function on the table that automatically
does the cleanup. It could be a BEFORE INSERT OR UPDATE
trigger that just tries to prune 2-3 rows of the table if they
have exceeded the keep-back time. Just installing that during the
maintenance window would allow the system to self-heal over time.

If the maintenance window allows for more cleanup, then manually
do some deletions. Now the black box is self-healing.
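
A hedged sketch of that idea, here as a statement-level AFTER INSERT
trigger on B that trims the rows belonging to one expired R record per
insert batch (all names are assumed; tune the LIMIT to whatever the
insert path can absorb):

CREATE OR REPLACE FUNCTION b_prune_expired() RETURNS trigger AS $$
BEGIN
    DELETE FROM b
     WHERE r_id IN (SELECT id FROM r
                     WHERE end_date < NOW() - INTERVAL '30 days'
                     ORDER BY end_date
                     LIMIT 1);
    RETURN NULL;   -- statement-level trigger, the return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER b_prune_expired_trg
    AFTER INSERT ON b
    FOR EACH STATEMENT EXECUTE PROCEDURE b_prune_expired();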

--
Jesper



Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-23 Thread Reuven M. Lerner

Hi, everyone.

So it turns out that we're not using 25 GB of virtual memory.  (That's 
what I had been shown yesterday, and it was a bit surprising, to say the 
least...)


A few statistics that I managed to get from the Windows 
developers/system administrators:


- The machine has a total of 3.5 GB of RAM
- shared_buffers was set to 256 MB (yes, MB!)
- Virtual memory usage by our process is 3 MB (yes, MB)
- CPU is virtually idle when running the deletes, using about 1% of CPU
- No other processes are accessing the database when we're running the 
maintenance; there are a total of three server processes, but two are idle.


(I was a bit surprised, to say the least, by the low number on 
shared_buffers, given that I believe it's one of the first things I told 
them to increase about 18 months ago.)


As for Tom's point about rules, I know that rules are bad, and I'm not 
sure why the system is using a rule rather than a trigger. I'll see 
if I can change that to a trigger, but I have very indirect control over 
the machines, and every change requires (believe it or not) writing a 
.NET program that runs my changes, rather than just a textual script 
that deploys them.


The only foreign keys are from the B table (i.e., the table whose 
records I want to remove) to other tables.  There are no REFERENCES 
pointing to the B table.  That said, I hadn't realized that primary keys 
and indexes can also delay the DELETE.


For the latest round of testing, I quadrupled shared_buffers to 1 GB, 
turned off hash joins (as suggested by someone), and also simplified the 
query (based on everyone's suggestions).  In the tests on my own 
computer (with a somewhat random 1 GB snapshot of the 200 GB database), 
the simplified query was indeed much faster, so I'm optimistic.


Several people suggested that chunking the deletes might indeed help, 
which makes me feel a bit better.  Unfortunately, given the time that it 
takes to run the queries, it's hard to figure out the right chunk size. 
 Whoever suggested doing it in time slices had an interesting idea, but 
I'm not sure if it'll be implementable given our constraints.


Thanks again to everyone for your help.  I'll let you know what happens...

Reuven
