Hello,
I have Postgres 8.1 on a Linux box: 2.6 GHz P4, 1.5 GB RAM, 320 GB hard drive. I'm performing an update between two large tables, and so far it's been running for 24+ hours.

I have two tables:
Master:
x int4
y int4
val1 int2
val2 int2

Import:
x int4
y int4
val int2

Each table has about 100 million rows. I want to populate val2 in Master with val from Import where the two tables match on x and y.
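For reference, the tables were created with something like this (paraphrasing from memory, so this may not be the exact DDL I used):

    CREATE TABLE Master (
        x    int4,
        y    int4,
        val1 int2,
        val2 int2
    );

    CREATE TABLE Import (
        x   int4,
        y   int4,
        val int2
    );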
So, my query looks like:
UPDATE Master SET val2=Import.val FROM Import WHERE Master.x=Import.x AND Master.y=Import.y;

Both tables have indexes on the x and y columns.  Will that help?
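(I don't have the exact statements handy, but if I remember right they are plain single-column indexes, created along these lines; the names here are just guesses:

    CREATE INDEX master_x_idx ON Master (x);
    CREATE INDEX master_y_idx ON Master (y);
    CREATE INDEX import_x_idx ON Import (x);
    CREATE INDEX import_y_idx ON Import (y);
)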

Is there a better way to do this? In each table (x, y) is unique; does that make a difference? That is, would it be faster to run some kind of query, or loop, that goes through each row in Import and updates Master (val2 = val) where the x and y values match?
If that approach would be better, how would I construct such a SQL statement?
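For example, is this roughly the right shape for the row-at-a-time version? I'm just guessing at the syntax and haven't run it:

    UPDATE Master
       SET val2 = (SELECT Import.val
                     FROM Import
                    WHERE Import.x = Master.x
                      AND Import.y = Master.y)
     WHERE EXISTS (SELECT 1
                     FROM Import
                    WHERE Import.x = Master.x
                      AND Import.y = Master.y);

My thinking with the WHERE EXISTS was to avoid setting val2 to NULL for any Master rows that have no match in Import.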

The other weird thing is that when I monitor the system with xload it shows two bars of load, and the hard drive is going nuts; so far my database directory has grown by 25GB. However, when I run "top" the system shows 98% idle, and the postmaster process is usually only at 1-2% CPU, although it is using 50% (750MB) of RAM. Also, the process shows up with a "D" status in the "S" column. I'm not sure what is going on: whether the size of the tables makes what I'm trying to do insane, whether I just have a bad SQL approach, or whether something is wrong with my Postgres configuration.
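If it would help, I can post the query plan; I assume running something like this (EXPLAIN alone, so it shouldn't actually execute the update) would show it:

    EXPLAIN UPDATE Master SET val2 = Import.val
      FROM Import
     WHERE Master.x = Import.x AND Master.y = Import.y;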

Really appreciate any help!
Thanks!
Ken



