Hi,

Thank you for the email,

please find some replies inline,

On Mon, May 17, 2021 at 11:12 PM David <dam6...@gmail.com> wrote:

> Hello Gang,
>
> I just wanted to put out a few thoughts for anyone interested in the
> Metastore, and in particular, the connection handling.
>
> As I understand it, client requests from the Thrift server come into Hive
> via the HMSHandler class.  This class lists all of the services (RPCs) that
> the Hive Metastore provides.
>
> This class's methods do some amount of validation, listener notification,
> but it ultimately calls one or more RawStore/ObjectStore methods to
> interact with the database.
>

I guess you mean one of the RawStore implementations to interact with the
database,

for example it could be CachedStore also.


>
> This entire orchestration needs some work to make this code more easy to
> work with and to improve error handling.
>
> What I propose is:
>
> 1// Remove Thrift Errors from RawStore
>
> Remove all references to
> NoSuchObjectException/InvalidOperationException/MetaException from the
> method signature of RawStore.  These Exceptions are generated by Thrift and
> are used to communicate error conditions across the wire.  They are not
> designed for use as part of the underlying stack, yet in Hive, they have
> been pushed down into these data access operators.  The RawStore should not
> have to be this tightly coupled to the transport layer.  My preference here
> would be to remove all checked Exceptions from RawStore in favor of runtime
> exceptions.


I guess we will have to look at what to do with InvalidOperationExceptions
such as the following,

if (hasNsChange) {
      throw new InvalidOperationException("Cannot change ns; from " +
getNsOrDefault(plan.getNs())
          + " to " + changes.getNs());
 }

So do we retain these?




> This is a popular format and is used (and therefore dovetails
> nicely) with the underlying database access library DataNucleaus.  All of
> the logging of un-checked Exceptions, and transforming them into Thrift
> exceptions, should happen at the HMSHandler code.
>

I am OK with this, but you might end up amalgamating a lot of RawStore
implementation-specific exception handling code in the HMSHandler, an
already cluttered class. For example, the CachedStore implementation may be
throwing an exception different from the ObjectStore and we will probably
end up wrapping both of these in the HMSHandler.

Similarly, each RawStore implementation may need to perform actions in the
event of an exception from the underlying database,
e.g. rollbackAndCleanup. I guess this is where the moving of transaction
management kicks in?



>
> 2// Move Transaction Management
>
> The ObjectStore has a pretty crazy path of handling transactions.  There
> seems to be a lot of extra code around transaction tracking that was put in
> probably because it's so hard to track transaction management within Hive.
>

Agreed!


> All of the boiler-plate transaction management code should be removed from
> ObjectStore and instead brought up into HMS handler as well.


This will need to be thought through. I agree that refactoring transaction
management is a good idea, but moving it into HMSHandler is just going to
further clutter HMSHandler. HMSHandler is already a huge class with a lot
of clutter that needs to be refactored before we add more into it.

This allows
> the handler to create a single transaction per-request and call the
> necessary ObjectStore methods.  This is not currently possible because each
> ObjectStore handles transactions in its own special way. When you include
> all of the open/commit/roll-back, and "transactional listeners," I'm not
> certain all code paths are correct.  For example, I suspect some listeners
> are being alerted outside of a transaction.  I also suspect some actions
> are occurring in multiple transactions that should really be occurring
> within a single transaction.
>

I think a CachedStore also passes down various calls down to the
ObjectStore. Starting a transaction in HMSHandler will keep the transaction
open for a more than necessary duration. I agree that the refactoring needs
to happen, but I am not convinced that moving all transaction handling to
HMSHandler is the way forward. Ofcourse I don't have a better idea as of
now.


>
> I have locally created some helper-code (TransactionOperations) to do this
> from HMSHandler:
>
>       TransactionOperations.newOperation(rawstore).execute(new
> TransactionCallback<RawStore>() {
>
>         // This method is called after openTransaction is called on the
> RawStore
>         // Runtime Exceptions are caught and cause the transaction to roll
> back
>         // The RawStore method commitTransaction is called if method
> completes OK
>         @Override
>         public void doInTransaction(RawStore rawstore) throws MetaException
> {
>
>           // These RawStore method happen in one transaction
>           rawstore.methodABC();
>           rawstore.methodXXX();
>           rawstore.methodXYZ();
>
>           if (!transactionalListeners.isEmpty()) {
>             transactionalListenersResponses =
>
> MetaStoreListenerNotifier.notifyEvent(transactionalListeners,
>                     EventType.CREATE_XXX,
>                     new CreateXxxEvent(true, HMSHandler.this, xxx));
>           }
>         }
>       });
>
>
> Re-architecting the method signatures to remove the MetaExceptions is a
> large-ish task, but trying to unwind all this transaction code is going to
> be a bear, it's what prompted me to write this email.
>

Sure, I understand.


>
> Thanks.
>

Thank you,
Narayanan

Reply via email to