On 4/6/24 08:47, yudhi s wrote:
Thank you Adrian, Greg and Veem.

I tried writing a small routine to see how the performance differs in these four approaches i.e. Upsert VS traditional update+insert VS Merge vs Truncate+load.

Initially I was thinking Upsert will perform the same as Merge as the logic looks similar but it seems it's the worst performing among all, not sure why , yet to know the reason though. Truncate+ load seems to be the best performing among all. Hope i am doing it correctly. Please correct me if I'm wrong.

Your original problem description was:

"Then subsequently these rows will be inserted/updated based on the delta number of rows that got inserted/updated in the source database. In some cases these changed data can flow multiple times per day to the downstream i.e. postgres database and in other cases once daily."

If the above is not a hard rule, then yes up to some point just replacing the data in mass would be the simplest/fastest method. You could cut a step out by doing something like TRUNCATE target_tab and then COPY target_tab FROM 'source.csv' bypassing the INSERT INTO source_tab.



--
Adrian Klaver
adrian.kla...@aklaver.com



Reply via email to