+1 to Alex's comment

On 12/14/17, 3:27 PM, "Alexander Kolbasov" <ak...@cloudera.com> wrote:

    Kaijie,
    
    can you describe in more detail why you would need such functionality?
    What problem does it actually solve?
    
    I do not think that HMS should do more "atomic" compound operations than it
    does now - IMO it should do fewer instead. This is especially the case when
    operations involve a mix of metadata operations and filesystem operations
    which cannot always be reverted correctly. Such things make the semantics of
    HMS calls more and more complex and difficult to maintain. Existing bulk
    APIs are not a good example that we should follow.
    
    
    - Alex
    
    On Wed, Dec 13, 2017 at 6:54 PM, 秦凯捷 <daniel...@gmail.com> wrote:
    
    > Hi Andrew,
    >
    > Thanks for your response. For your comments:
    >
    > - Functionality:
    > Support adding and altering multiple partitions for multiple tables in one
    > SQL statement or API request as one transaction.
    >
    > - what happens in the case of a failure when part way through the
    > operations.
    > For altering and adding partitions, all the objectstore changes for
    > partitions will be performed in one transaction, so the transaction will be
    > rolled back in case of failure.
    > For adding partitions, there may be additional steps to create directories
    > on the filesystem for the newly added partitions. They will be deleted in
    > case of failure, just as AddPartitions does now.
    >
    > - what impact on the system there will be if an operation takes a long time
    > Altering partitions for multiple tables is not very different from the
    > current altering of partitions for one table. Both will take a long time
    > if someone tries to alter too many partitions or too many tables. The
    > transaction timeout will abort the operation.
    > We are running performance tests on our system to see how long it takes in
    > several scenarios, but in any case this should not be a blocker.
    >
    > Thanks,
    > Kaijie
    >
    > 秦凯捷
    > Tel: +86-13810485829
    > E-mail: daniel...@gmail.com
    >
    >
    >
    > On Thu, Dec 14, 2017 at 3:38 AM, Andrew Sherman <asher...@cloudera.com>
    > wrote:
    >
    > > Hi Kaijie,
    > >
    > > I think this is an area that others in the Hive community are interested
    > > in, so please do go ahead and describe your functionality.
    > > I think that it is important to describe
    > > - what happens in the case of a failure when part way through the
    > > operations.
    > > - what impact on the system there will be if an operation takes a long
    > > time
    > >
    > > Thanks
    > >
    > > -Andrew
    > >
    > > On Tue, Dec 12, 2017 at 1:31 AM, 秦凯捷 <daniel...@gmail.com> wrote:
    > >
    > > > Hi dev,
    > > >
    > > > I'm wondering if the Hive community has ever considered supporting adding
    > > > and altering multiple partitions for multiple tables?
    > > >
    > > > I'm using Hive Metastore to manage the metadata for Presto querying. Our
    > > > business requires that we publish some partitions of data for multiple
    > > > tables at the same time in an atomic transaction to keep the data
    > > > consistent. Currently Hive Metastore only supports adding and altering
    > > > multiple partitions for one table.
    > > >
    > > > I drafted AddPartitionsForTables and AlterPartitionsForTables functions
    > > > to achieve this, based on the existing AddPartition and AlterPartition
    > > > logic, and we are testing them on our system.
    > > > I'm wondering if the community has considered this functionality. I would
    > > > like to contribute it if there is interest.
    > > >
    > > > Thank you!
    > > > -Kaijie
    > > >
    > > >
    > > > Tel: +86-13810485829
    > > > E-mail: daniel...@gmail.com
    > > >
    > >
    >
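
For readers following the thread, the failure handling Kaijie describes (stage all metadata changes in one transaction, roll back on error, and delete any directories created for new partitions) can be sketched roughly as below. This is only an illustration under stated assumptions, not Hive Metastore code: the function name add_partitions_for_tables, the dict-based catalog, and PartitionPublishError are all hypothetical stand-ins, and the "transaction" is simply a staged copy that is committed at the end.

```python
import os
import shutil
import tempfile


class PartitionPublishError(Exception):
    """Raised when a multi-table partition publish is rolled back."""


def add_partitions_for_tables(catalog, requests, warehouse_root):
    """Add partitions to several tables as one all-or-nothing unit.

    catalog:  dict of table name -> set of partition names (hypothetical
              stand-in for the metadata store).
    requests: dict of table name -> list of partition names to add.
    """
    # Stage changes on a copy so the live catalog is untouched until commit.
    staged = {table: set(parts) for table, parts in catalog.items()}
    created_dirs = []
    try:
        for table, partitions in requests.items():
            if table not in staged:
                raise KeyError(f"unknown table: {table}")
            for part in partitions:
                if part in staged[table]:
                    raise ValueError(f"partition exists: {table}/{part}")
                path = os.path.join(warehouse_root, table, part)
                os.makedirs(path)  # raises if the directory already exists
                created_dirs.append(path)
                staged[table].add(part)
    except Exception as exc:
        # Roll back: discard staged metadata and remove only the directories
        # this call created. Cleanup is best-effort.
        for path in reversed(created_dirs):
            shutil.rmtree(path, ignore_errors=True)
        raise PartitionPublishError(str(exc)) from exc
    # Commit: publish all staged metadata changes at once.
    catalog.clear()
    catalog.update(staged)
```

Note that the directory cleanup is best-effort (ignore_errors=True), so a failed delete would leave orphaned directories behind. That is the kind of filesystem step that cannot always be reverted correctly, which is the concern Alex raises above.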
    
