Thank you all for the help. I'm preparing the patch for review.

秦凯捷
Tel: +86-13810485829
E-mail: daniel...@gmail.com

On Tue, Dec 19, 2017 at 12:49 AM, Eugene Koifman <ekoif...@hortonworks.com> wrote:

> +1 to Alex's comment
>
> On 12/14/17, 3:27 PM, "Alexander Kolbasov" <ak...@cloudera.com> wrote:
>
> Kaijie,
>
> Can you describe in more detail why you would need such functionality?
> What problem does it actually solve?
>
> I do not think that HMS should do more "atomic" compound operations than it
> does now - IMO it should do fewer instead. This is especially the case when
> operations involve a mix of metadata operations and filesystem operations,
> which cannot always be reverted correctly. Such things make the semantics of
> HMS calls more and more complex and difficult to maintain. Existing bulk
> APIs are not a good example that we should follow.
>
> - Alex
>
> On Wed, Dec 13, 2017 at 6:54 PM, 秦凯捷 <daniel...@gmail.com> wrote:
>
> > Hi Andrew,
> >
> > Thanks for your response. For your comments:
> >
> > - Functionality:
> > Support adding and altering multiple partitions for multiple tables in one
> > SQL and API request as one transaction.
> >
> > - What happens in the case of a failure when part way through the
> > operations:
> > For altering and adding partitions, all the objectstore changes for
> > partitions will be made in one transaction, so the transaction will be
> > rolled back in case of failure.
> > For adding partitions, there may be additional steps to add directories on
> > the filesystem for newly added partitions. They will be deleted in case of
> > failure, just as AddPartitions does now.
> >
> > - What impact on the system there will be if an operation takes a long
> > time:
> > Altering partitions for multiple tables is not very different from the
> > current altering of partitions for one table. Both will take a long time
> > if someone tries to alter too many partitions or too many tables.
> > The transaction timeout will abort the operation.
> > We are running performance tests on our system to see how long it takes
> > for multiple scenarios, but after all, this should not be a blocker.
> >
> > Thanks,
> > Kaijie
> >
> > 秦凯捷
> > Tel: +86-13810485829
> > E-mail: daniel...@gmail.com
> >
> > On Thu, Dec 14, 2017 at 3:38 AM, Andrew Sherman <asher...@cloudera.com>
> > wrote:
> >
> > > Hi Kaijie,
> > >
> > > I think this is an area that others in the Hive community are
> > > interested in, so please do go ahead and describe your functionality.
> > > I think that it is important to describe
> > > - what happens in the case of a failure when part way through the
> > > operations.
> > > - what impact on the system there will be if an operation takes a long
> > > time
> > >
> > > Thanks
> > >
> > > -Andrew
> > >
> > > On Tue, Dec 12, 2017 at 1:31 AM, 秦凯捷 <daniel...@gmail.com> wrote:
> > >
> > > > Hi dev,
> > > >
> > > > I'm wondering if the Hive community has ever considered supporting
> > > > adding and altering multiple partitions for multiple tables?
> > > >
> > > > I'm using Hive Metastore to manage the metadata for Presto querying.
> > > > Our business requires that we publish some partitions of data for
> > > > multiple tables at the same time in an atomic transaction to keep the
> > > > data consistent. Currently Hive Metastore only supports adding and
> > > > altering multiple partitions for one table.
> > > >
> > > > I drafted AddPartitionsForTables and AlterPartitionsForTables
> > > > functions to achieve this, based on the existing AddPartition and
> > > > AlterPartition logic, and we are testing them on our system.
> > > > I'm wondering if the community has considered this functionality. I
> > > > would like to contribute it if you are interested.
> > > >
> > > > Thank you!
> > > > -Kaijie
> > > >
> > > > Tel: +86-13810485829
> > > > E-mail: daniel...@gmail.com
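
The failure handling described in the thread (metadata changes for all tables applied in one transaction, with newly created partition directories removed if anything fails) can be sketched roughly as below. This is only an illustrative toy in Python, not Hive Metastore code: `ToyMetastore`, `add_partitions_for_tables`, and `PartitionExistsError` are hypothetical names, the metadata lives in a plain dict, and the "transaction" is simulated by staging a copy and swapping it in on success.

```python
import os
import shutil
import tempfile


class PartitionExistsError(Exception):
    """Raised when a partition is already registered for a table."""


class ToyMetastore:
    """Hypothetical in-memory stand-in for a metastore; not Hive's API."""

    def __init__(self, warehouse_root):
        self.warehouse_root = warehouse_root
        self.partitions = {}  # table name -> set of partition names

    def add_partitions_for_tables(self, requests):
        """Add partitions for several tables, all or nothing.

        `requests` maps a table name to a list of partition names.
        """
        # Work on a copy so the live metadata is untouched until "commit".
        staged = {t: set(p) for t, p in self.partitions.items()}
        created_dirs = []
        try:
            for table, names in requests.items():
                for name in names:
                    if name in staged.setdefault(table, set()):
                        raise PartitionExistsError(f"{table}/{name}")
                    path = os.path.join(self.warehouse_root, table, name)
                    os.makedirs(path)  # fails if the directory already exists
                    created_dirs.append(path)
                    staged[table].add(name)
        except Exception:
            # Metadata rollback is implicit: the staged copy is discarded.
            # Filesystem cleanup: remove the partition directories this call made.
            for path in reversed(created_dirs):
                shutil.rmtree(path, ignore_errors=True)
            raise
        self.partitions = staged  # "commit"


# Usage: the second call fails on the duplicate "orders" partition, so the
# "users" partition it had already staged is rolled back as well.
root = tempfile.mkdtemp()
store = ToyMetastore(root)
store.add_partitions_for_tables({"orders": ["ds=2017-12-12"]})
try:
    store.add_partitions_for_tables(
        {"users": ["ds=2017-12-13"], "orders": ["ds=2017-12-12"]}
    )
except PartitionExistsError:
    pass
```

A real metastore would rely on its backing database for the transaction. As Alex points out, the filesystem half cannot always be reverted cleanly, which the `ignore_errors=True` cleanup in this sketch glosses over.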