Re: Batch update million records in prd DB

2021-03-02 Thread Yi Sun
Hi Kristjan, Thank you for this information. Regarding the memory issue you described ("in postgres the memory is slowly eaten away when doing updates within a plsql loop"), I want to check whether it exists in our current PostgreSQL version. Also, to get the developers to change to a Python for loop, I need to show them proof of how …
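
A minimal, hypothetical way to check this on a given PostgreSQL version might be the sketch below; scratch_t, its columns, and the row counts are made up for illustration and are not the production table from this thread. Run it in one session and watch that backend's memory (RSS) from the OS while the block runs.

    -- note the backend pid, then watch it with e.g. "top -p <pid>"
    SELECT pg_backend_pid();

    CREATE TABLE IF NOT EXISTS scratch_t (id int PRIMARY KEY, val int);
    INSERT INTO scratch_t
    SELECT g, 0 FROM generate_series(1, 1000) AS g
    ON CONFLICT (id) DO NOTHING;

    DO $$
    BEGIN
        FOR i IN 1..200000 LOOP
            -- many UPDATEs inside a single plpgsql block; memory held by the
            -- block is only released when the whole block finishes
            UPDATE scratch_t SET val = val + 1 WHERE id = (i % 1000) + 1;
        END LOOP;
    END
    $$;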

Re: Batch update million records in prd DB

2021-03-02 Thread Yi Sun
Hi Michael, Thank you. After creating an index on the temp table column, the time cost became smaller. Michael Lewis wrote on Tue, Mar 2, 2021 at 12:08 AM: > 1) Don't pretend it is a left join when your where clause will turn it > into an INNER join. > LEFT JOIN pol gp ON gab.policy_id = gp.id > WHERE …
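
For reference, the kind of change being described would look roughly like this; tmp_ids and policy_id are placeholder names, not the actual objects from the thread:

    CREATE TEMP TABLE tmp_ids (policy_id bigint);
    -- ... populate tmp_ids ...

    -- index the join/filter column and refresh statistics so the planner
    -- can use an index scan instead of repeatedly seq-scanning the temp table
    CREATE INDEX ON tmp_ids (policy_id);
    ANALYZE tmp_ids;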

Re: Batch update million records in prd DB

2021-03-02 Thread Kristjan Mustkivi
Hi Yi, I found that in postgres the memory is slowly eaten away when doing updates within a plsql loop. It only gets released once the whole block completes. While this is OK for small tables, you will eventually run out of memory for really big ones. The working approach was to do the loop in e.g. a py…
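
To illustrate the shape being recommended (driving the loop from a client script rather than from inside one long plpgsql block), each iteration of the client loop would issue one bounded UPDATE like the sketch below and commit before the next one; app_big_table, status, and id are placeholder names, not the thread's actual objects:

    -- one batch, issued repeatedly by the client-side loop,
    -- with a COMMIT after each execution; stop when 0 rows are updated
    UPDATE app_big_table t
    SET    status = 'migrated'
    WHERE  t.id IN (
        SELECT id
        FROM   app_big_table
        WHERE  status <> 'migrated'
        ORDER  BY id
        LIMIT  1000
    );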

Re: Batch update million records in prd DB

2021-03-01 Thread Michael Lewis
1) Don't pretend it is a left join when your where clause will turn it into an INNER join: LEFT JOIN pol gp ON gab.policy_id = gp.id WHERE ... AND gp.name LIKE 'Mobile backup%' AND gp.deleted_at IS NOT NULL; 2) It is interesting to me that the ro…
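
To spell out point 1 with the table names from the quoted query (column lists omitted, so this is a sketch rather than the poster's actual statement): any WHERE predicate on the outer-joined table that rejects NULLs discards the NULL-extended rows, so the LEFT JOIN behaves as an INNER join and is clearer written as one.

    -- filtering on gp.* in WHERE discards the NULL-extended rows,
    -- so this behaves as an INNER join anyway:
    SELECT count(*)
    FROM   gab
    LEFT JOIN pol gp ON gab.policy_id = gp.id
    WHERE  gp.name LIKE 'Mobile backup%'
      AND  gp.deleted_at IS NOT NULL;

    -- equivalent, and more honest about the intent:
    SELECT count(*)
    FROM   gab
    JOIN   pol gp ON gab.policy_id = gp.id
    WHERE  gp.name LIKE 'Mobile backup%'
      AND  gp.deleted_at IS NOT NULL;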

Re: Batch update million records in prd DB

2021-03-01 Thread Yi Sun
Hi Michael, This is the script and the explain plan info, please check; it seems the Filter that removes more records took more time. DO $MAIN$ DECLARE affect_count integer := 1000; processed_row_count integer := 0; BEGIN LOOP EXIT WHEN affect_count = 0; UPDATE app g…
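
One way to quantify how much work that filter does per batch is to wrap a single batch in EXPLAIN (ANALYZE, BUFFERS) and look at the "Rows Removed by Filter" lines; the UPDATE below is only a placeholder for the real statement from the script:

    BEGIN;
    EXPLAIN (ANALYZE, BUFFERS)
    UPDATE app_big_table t
    SET    status = 'migrated'
    WHERE  t.id IN (
        SELECT id FROM app_big_table
        WHERE  status <> 'migrated'
        ORDER  BY id
        LIMIT  1000
    );
    ROLLBACK;  -- EXPLAIN ANALYZE really executes the UPDATE, so roll it back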

Re: Batch update million records in prd DB

2021-02-26 Thread Michael Lewis
It might be a concern, but generally that should be a row-level lock and only block other update/delete operations on those rows. It might be helpful to look at the explain analyze output early on vs later in the process. It might be that you are getting very few HOT updates and indexes are being upda…
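
The HOT-update ratio mentioned here can be checked from pg_stat_user_tables; if n_tup_hot_upd stays a small fraction of n_tup_upd, every index has to be maintained on each update, which tends to make later batches slower. 'app_big_table' below is a placeholder name:

    SELECT relname, n_tup_upd, n_tup_hot_upd, n_dead_tup, last_autovacuum
    FROM   pg_stat_user_tables
    WHERE  relname = 'app_big_table';   -- placeholder table name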

Re: Batch update million records in prd DB

2021-02-25 Thread Yi Sun
Hi Michael, Thank you for your reply. We found that the time each loop takes is different; it becomes slower and slower. As our table is a big table and joins another table, even using an index the last 1000 records take around 15 seconds. Will that be a problem? Will other concurrent updates have to wait for …
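
Whether other sessions are actually waiting on the batch's row locks can be observed while the loop runs with a generic query like the one below (no thread-specific names involved):

    -- sessions currently blocked on a lock, and who is blocking them
    SELECT pid,
           pg_blocking_pids(pid) AS blocked_by,
           wait_event_type,
           wait_event,
           state,
           query
    FROM   pg_stat_activity
    WHERE  wait_event_type = 'Lock';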

Re: Batch update million records in prd DB

2021-02-24 Thread Michael Lewis
Of course it will impact a system using that table, but not significantly I expect, and the production system should handle it. If you are committing like this, then you can kill the script at any time and not lose any work. The query to find the next IDs to update is probably the slowest part of this …
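
A common way to keep that "find the next IDs" step cheap is keyset pagination on the primary key, so each batch resumes where the previous one stopped instead of rescanning from the start; the names and the psql-style :last_id variable below are illustrative only:

    -- :last_id = highest id handled so far (remembered by the driving script)
    SELECT id
    FROM   app_big_table
    WHERE  id > :last_id          -- advances each batch; uses the pkey index
    ORDER  BY id
    LIMIT  1000;
    -- update the rows for these ids, commit, set :last_id to the max id returned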

Batch update million records in prd DB

2021-02-24 Thread Yi Sun
Hello, We now need to update several million records in a table in the prd DB. If we batch update 1000 records and commit each time, as in the sample script below, will it affect the prd application? Sample script: DO $MAIN$ DECLARE affect_count integer; chunk_size CONSTANT integer := 1000; s…
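
The sample script is cut off in the archive preview; a self-contained sketch of the same batch-and-commit pattern (valid on PostgreSQL 11+, where COMMIT is allowed inside a DO block executed outside an explicit transaction) could look like the following, with app_big_table, status, and id as placeholder names rather than the poster's actual objects:

    DO $MAIN$
    DECLARE
        affect_count integer;
        chunk_size   CONSTANT integer := 1000;
    BEGIN
        LOOP
            UPDATE app_big_table t
            SET    status = 'migrated'
            WHERE  t.id IN (
                SELECT id FROM app_big_table
                WHERE  status <> 'migrated'
                ORDER  BY id
                LIMIT  chunk_size
            );
            GET DIAGNOSTICS affect_count = ROW_COUNT;
            EXIT WHEN affect_count = 0;
            COMMIT;   -- end this batch's transaction so locks and memory are released
        END LOOP;
    END
    $MAIN$;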