GitHub user dmainou edited a comment on the discussion: How to improve the 
Snowflake upsert speed

Perhaps an architectural one?

The insert/update step has to read the existing data first in order to upsert it.

A couple of solutions:

1- Load the data into Snowflake then run a statement that accomplishes what you 
need.
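
For illustration only, a minimal Python sketch of what such a statement could look like, assuming the data was first loaded into a staging table and that a Snowflake MERGE is what you need. The helper `build_merge_sql` and the table and column names (`customers`, `customers_stage`, `id`, `name`, `email`) are hypothetical, not something from this discussion:

```python
def build_merge_sql(target, staging, key_cols, value_cols):
    """Build a MERGE statement that upserts rows from staging into target.

    All table and column names are hypothetical placeholders. They are
    interpolated as-is, so only pass trusted identifiers.
    """
    # Join condition on the key columns
    on = " AND ".join(f"t.{c} = s.{c}" for c in key_cols)
    # Columns to overwrite when the key already exists
    sets = ", ".join(f"t.{c} = s.{c}" for c in value_cols)
    all_cols = list(key_cols) + list(value_cols)
    insert_cols = ", ".join(all_cols)
    insert_vals = ", ".join(f"s.{c}" for c in all_cols)
    return (
        f"MERGE INTO {target} t USING {staging} s ON {on} "
        f"WHEN MATCHED THEN UPDATE SET {sets} "
        f"WHEN NOT MATCHED THEN INSERT ({insert_cols}) VALUES ({insert_vals})"
    )


sql = build_merge_sql("customers", "customers_stage", ["id"], ["name", "email"])
print(sql)
```

The generated statement would then be run once against Snowflake (for example from a SQL action after the bulk load), so the comparison happens inside the warehouse instead of row by row in the pipeline.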

or

1. Read your source data and Snowflake data as inputs on two streams
2. Sort them so they are in the same order
3. Use the Merge rows (diff) transform to flag rows as new, deleted, changed or 
identical
4. Toss the identical rows aside
5. Then bulk insert what's needed, update as required, etc.
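
To make the idea behind these steps concrete, here is a rough Python sketch of a sorted two-stream comparison. It is not Hop's implementation, only an illustration of the technique; the function name `merge_rows_diff` and the sample rows are made up, and each row is assumed to be a `(key, values)` pair already sorted by key:

```python
def merge_rows_diff(reference, compare):
    """Walk two key-sorted row lists once and flag every row.

    reference: rows already in the target (e.g. read from Snowflake)
    compare:   rows coming from the source
    Yields (flag, key, values) with flag in
    "identical", "changed", "new", "deleted".
    """
    i = j = 0
    while i < len(reference) and j < len(compare):
        ref_key, ref_vals = reference[i]
        cmp_key, cmp_vals = compare[j]
        if ref_key == cmp_key:
            # Same key on both sides: compare the payload
            flag = "identical" if ref_vals == cmp_vals else "changed"
            yield (flag, cmp_key, cmp_vals)
            i += 1
            j += 1
        elif ref_key < cmp_key:
            # Key exists only in the target
            yield ("deleted", ref_key, ref_vals)
            i += 1
        else:
            # Key exists only in the source
            yield ("new", cmp_key, cmp_vals)
            j += 1
    # Whatever is left on either side has no counterpart
    for ref_key, ref_vals in reference[i:]:
        yield ("deleted", ref_key, ref_vals)
    for cmp_key, cmp_vals in compare[j:]:
        yield ("new", cmp_key, cmp_vals)


snowflake_rows = [(1, "a"), (2, "b"), (4, "d")]  # hypothetical target rows
source_rows = [(1, "a"), (2, "B"), (3, "c")]     # hypothetical source rows

# Step 4: drop the identical rows, keep only the work that has to be done
work = [r for r in merge_rows_diff(snowflake_rows, source_rows)
        if r[0] != "identical"]
print(work)
```

Because both inputs are sorted, each stream is read only once, and only the rows flagged new, changed or deleted need to be sent on to Snowflake.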


PS: I've previously cut the runtime of a job from 6+ hours to 15 minutes.


GitHub link: 
https://github.com/apache/hop/discussions/5310#discussioncomment-13138748
