>> Does anyone have any positive feedback on the Hive MERGE statement?

FYI:

https://issues.apache.org/jira/browse/HIVE-19286
https://issues.apache.org/jira/browse/HIVE-19295

On Mon, May 7, 2018 at 12:35 AM, Nicolas Paris <[email protected]> wrote:

> Hi,
>
> Does anyone have any positive feedback on the Hive MERGE statement?
> There is some information in [1] and [2].
>
> In my experience, merging a source table of 300M rows and 100 columns
> into a target of 1.5B rows is 100 times slower than doing an UPDATE
> and an INSERT. It is also slower than a third approach: building the
> new table from scratch and renaming it to replace the old one.
>
> Second bad point: right now Spark is not able to read an ACID table
> without a major compaction, meaning the table needs to be rebuilt
> from scratch behind the scenes.
>
> So I am wondering whether the MERGE statement is impractical because
> I am misusing it or because the feature is just not mature enough.
>
> [1]: https://thisdataguy.com/2018/01/29/why-is-my-hive-merge-statement-slow/
> [2]: https://fr.hortonworks.com/blog/apache-hive-moving-beyond-analytics-offload-with-sql-merge/

--
Oleksiy
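
For anyone comparing the alternatives mentioned in the quoted message, here is a minimal sketch of the two non-MERGE strategies (UPDATE followed by INSERT, and rebuild-then-rename). It uses Python's standard-library sqlite3 module purely as a stand-in, not Hive. The table names `target`, `source`, `target_new` and the columns `id`, `val` are hypothetical. It only illustrates that both strategies give the same upsert result, and says nothing about Hive performance or ACID behaviour.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a small target and source table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, val TEXT)")
    conn.execute("CREATE TABLE source (id INTEGER PRIMARY KEY, val TEXT)")
    conn.executemany("INSERT INTO target VALUES (?, ?)",
                     [(1, "a"), (2, "b"), (3, "c")])
    conn.executemany("INSERT INTO source VALUES (?, ?)",
                     [(2, "B"), (4, "d")])
    return conn


def update_then_insert(conn):
    """Strategy 1: UPDATE the matching rows, then INSERT the new ones."""
    # Overwrite target rows whose key also exists in source.
    conn.execute("""
        UPDATE target
        SET val = (SELECT s.val FROM source s WHERE s.id = target.id)
        WHERE id IN (SELECT id FROM source)
    """)
    # Add source rows whose key is not yet in target.
    conn.execute("""
        INSERT INTO target (id, val)
        SELECT s.id, s.val FROM source s
        WHERE s.id NOT IN (SELECT id FROM target)
    """)


def rebuild_and_rename(conn):
    """Strategy 2: build a new table from scratch, then swap it in."""
    # Keep target rows that source does not replace, plus all source rows.
    conn.execute("""
        CREATE TABLE target_new AS
        SELECT t.id, t.val FROM target t
        WHERE t.id NOT IN (SELECT id FROM source)
        UNION ALL
        SELECT s.id, s.val FROM source s
    """)
    conn.execute("DROP TABLE target")
    conn.execute("ALTER TABLE target_new RENAME TO target")


def rows(conn):
    """Return the target table contents ordered by key."""
    return conn.execute("SELECT id, val FROM target ORDER BY id").fetchall()


if __name__ == "__main__":
    a = make_db()
    update_then_insert(a)
    b = make_db()
    rebuild_and_rename(b)
    print(rows(a))
    print(rows(b))
```

Both calls leave `target` with the same four rows: keys 1 and 3 unchanged, key 2 overwritten from `source`, and key 4 newly added. In Hive the relative cost of these strategies depends on table format, compaction state, and cluster settings, so the sketch is only a logical reference.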
