Re: [GENERAL] Does writing new records while massive update will generate lock ?

2014-08-21 Thread Alban Hertroys
On 21 August 2014 15:41, Victor d'Agostino wrote:

>  I'm updating this column (for more than 48 hours now) on a RAID5 server.
>

RAID5? For a write-heavy workload like this one, that's probably the worst
performing RAID configuration you can have, and it is usually advised against
on this list.
You would be better off with RAID 10 (also written RAID 1+0) or even RAID 1,
but you would be using more disk space.

That said, if your IO is not being saturated by that query, you could split
the update across multiple CPUs by dividing the email_id values over
multiple queries, each run from its own session.
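
As a rough sketch of what I mean, assuming email_id is an integer and that
the range boundaries below are made up (pick real ones from min(email_id)
and max(email_id)), each session would get its own slice. The running
UPDATE would have to be cancelled first, otherwise the new sessions would
just wait on the rows it has already touched. Without an index on
MYBIGTABLE.email_id each slice still has to scan the whole table.

```sql
-- Session 1: first slice (boundaries are hypothetical)
UPDATE MYBIGTABLE SET date = (SELECT date FROM INDEXEDTABLE WHERE
    INDEXEDTABLE.email_id = MYBIGTABLE.email_id)
 WHERE date IS NULL
   AND email_id >= 0 AND email_id < 10000000;

-- Session 2: next slice, run at the same time from a second connection
UPDATE MYBIGTABLE SET date = (SELECT date FROM INDEXEDTABLE WHERE
    INDEXEDTABLE.email_id = MYBIGTABLE.email_id)
 WHERE date IS NULL
   AND email_id >= 10000000 AND email_id < 20000000;
```

Because the slices do not overlap, the sessions never update the same row,
and each one commits on its own, so a failure only loses one slice of work.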

-- 
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.


Re: [GENERAL] Does writing new records while massive update will generate lock ?

2014-08-21 Thread Raymond O'Donnell
On 21/08/2014 14:41, Victor d'Agostino wrote:
> Hi everybody,
> 
> I added a datetime column to a table with 51.10^6 entries.
> 
> ALTER TABLE MYBIGTABLE ADD COLUMN date timestamp without time zone;
> 
> I'm updating this column (for more than 48 hours now) on a RAID5
> server.
> 
> UPDATE MYBIGTABLE SET date = (SELECT date FROM INDEXEDTABLE WHERE
> INDEXEDTABLE.email_id=MYBIGTABLE.email_id) WHERE date is null;
> 
> This transaction is still running and will end in several days. It
> only uses 1 core.
> 
> My question is : Can I add new records in the table or will it
> generate locks ?
> 
> I am using postgresql *8.4*


Not that this helps your issue, but you may not be aware that 8.4 is now
end-of-life and so is no longer supported:

  http://www.postgresql.org/support/versioning

Ray.


-- 
Raymond O'Donnell :: Galway :: Ireland
r...@iol.ie


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Does writing new records while massive update will generate lock ?

2014-08-21 Thread Shaun Thomas

On 08/21/2014 08:41 AM, Victor d'Agostino wrote:


> UPDATE MYBIGTABLE SET date = (SELECT date FROM INDEXEDTABLE WHERE
> INDEXEDTABLE.email_id=MYBIGTABLE.email_id) WHERE date is null;


I may be wrong here, but wouldn't this style of query force a nested 
loop? Over several million rows, that would take an extremely long time. 
You might want to try this syntax instead:


UPDATE MYBIGTABLE big
   SET date = idx.date
  FROM INDEXEDTABLE idx
 WHERE idx.email_id = big.email_id
   AND big.date IS NULL;
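
If you want to check the plan before committing to another multi-day run,
something like the following should show what the planner intends to do.
This is only a sketch; I can't say which plan 8.4 will pick on your data.

```sql
-- EXPLAIN only plans the statement; it does not execute the UPDATE.
EXPLAIN
UPDATE MYBIGTABLE big
   SET date = idx.date
  FROM INDEXEDTABLE idx
 WHERE idx.email_id = big.email_id
   AND big.date IS NULL;
```

A Hash Join or Merge Join node in the output would mean both tables are
read in bulk instead of one index probe per row. Note that this form skips
rows with no match in INDEXEDTABLE, whereas the subquery form rewrites them
with NULL.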


> This transaction is still running and will end in several days. It only
> uses 1 core.


That's not your problem. I suspect that if you checked your RAID IO, you'd 
see 100% IO utilization because, instead of a sequential scan, it's 
performing a random seek for every update.



> My question is : Can I add new records in the table or will it generate
> locks ?


Your update statement will only lock the rows being updated. You should 
be able to add new rows, but with the IO consuming your RAID, you'll 
probably see significant write delays that resemble lock waits.
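
If you want to tell a real lock wait from an IO stall, a query along these
lines against pg_locks should do it. It's a sketch that assumes the table
was created unquoted, so its name is stored as lower-case 'mybigtable'.

```sql
-- List the locks held or awaited on the table in the current database.
SELECT l.pid, l.locktype, l.mode, l.granted
  FROM pg_locks l
  JOIN pg_class c ON c.oid = l.relation
 WHERE c.relname = 'mybigtable'
   AND l.database = (SELECT oid FROM pg_database
                      WHERE datname = current_database());
```

Both UPDATE and INSERT take a RowExclusiveLock on the table, and that mode
does not conflict with itself. A row with granted = false would be a
genuine lock wait; if every row shows granted = true, the delay is
elsewhere, most likely IO.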


--
Shaun Thomas
OptionsHouse, LLC | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
stho...@optionshouse.com
