Hi Boris, those are good ideas. Currently Kudu does not have atomic bulk load or staging capabilities. In theory, renaming a partition atomically shouldn't be that hard to implement, since it's just a master metadata operation, but it's not yet implemented.
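To illustrate why a swap done as a single metadata operation would avoid the gap that the drop-and-rename approach from your message leaves, here is a toy sketch in Python. It is not Kudu code and does not reflect how the master is actually implemented: `Catalog`, its methods (including `swap`), and the table ids are all hypothetical names made up for this example, and a single in-process lock stands in for whatever the master would really use.

```python
import threading


class Catalog:
    """Toy stand-in for table-name metadata (hypothetical, not Kudu code)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._tables = {}  # table name -> opaque table id

    def create(self, name, table_id):
        with self._lock:
            if name in self._tables:
                raise KeyError(f"table already exists: {name}")
            self._tables[name] = table_id

    def lookup(self, name):
        with self._lock:
            return self._tables.get(name)

    def drop(self, name):
        with self._lock:
            del self._tables[name]

    def rename(self, old, new):
        with self._lock:
            if new in self._tables:
                raise KeyError(f"table already exists: {new}")
            self._tables[new] = self._tables.pop(old)

    def swap(self, a, b):
        """Exchange the ids behind two names inside one critical section."""
        with self._lock:
            self._tables[a], self._tables[b] = self._tables[b], self._tables[a]


catalog = Catalog()
catalog.create("prod", "old-data")
catalog.create("staging", "new-data")

# Drop-and-rename: two separate metadata operations with a gap between them.
catalog.drop("prod")
gap = catalog.lookup("prod")  # None: no production table exists at this point
catalog.rename("staging", "prod")

# Hypothetical swap: both names stay resolvable before and after the call.
catalog.create("staging", "newer-data")
catalog.swap("prod", "staging")
```

In this model a reader that looks up `prod` between `drop` and `rename` finds nothing, while with `swap` the name always resolves to either the old or the new data. The old data also remains reachable under `staging`, so a bad reload could be swapped back.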

There is a JIRA to track a generic bulk load API here:
https://issues.apache.org/jira/browse/KUDU-1370

Since I couldn't find anything to track the specific features you mentioned, I just filed the following improvement JIRAs so we can track it:

- KUDU-2326: Support atomic bulk load operation <https://issues.apache.org/jira/browse/KUDU-2326>
- KUDU-2327: Support atomic swap of tables or partitions <https://issues.apache.org/jira/browse/KUDU-2327>

Mike

On Thu, Feb 22, 2018 at 6:39 AM, Boris Tyukin <bo...@boristyukin.com> wrote:

> Hello,
>
> I am trying to figure out the best and safest way to swap data in a
> production Kudu table with data from a staging table.
>
> Basically, once in a while we need to perform a full reload of some tables
> (once in a few months). These tables are pretty large with billions of rows
> and we want to minimize the risk and downtime for users if something bad
> happens in the middle of that process.
>
> With Hive and Impala on HDFS, we can use a very cool handy command LOAD
> DATA INPATH. We can prepare data for reload in a staging table upfront and
> this process might take many hours. Once staging table is ready, we can
> issue LOAD DATA INPATH command which will move underlying HDFS files to a
> production table - this operation is almost instant and the very last step
> in our pipeline.
>
> Alternatively, we can swap partitions using ALTER TABLE EXCHANGE PARTITION
> command.
>
> Now with Kudu, I cannot seem to find a good strategy. The only thing came
> to my mind is to drop the production table and rename a staging table to
> production table as the last step of the job, but in this case we are going
> to lose statistics and security permissions.
>
> Any other ideas?
>
> Thanks!
> Boris