Hello Gang,

I just wanted to put out a few thoughts for anyone interested in the
Metastore, and in particular, the connection handling.

As I understand it, client requests from the Thrift server come into Hive
via the HMSHandler class.  This class lists all of the services (RPCs) that
the Hive Metastore provides.

This class's methods do some amount of validation, listener notification,
but it ultimately calls one or more RawStore/ObjectStore methods to
interact with the database.

This entire orchestration needs some work to make this code more easy to
work with and to improve error handling.

What I propose is:

1// Remove Thrift Errors from RawStore

Remove all references to
NoSuchObjectException/InvalidOperationException/MetaException from the
method signature of RawStore.  These Exceptions are generated by Thrift and
are used to communicate error conditions across the wire.  They are not
designed for use as part of the underlying stack, yet in Hive, they have
been pushed down into these data access operators.  The RawStore should not
have to be this tightly coupled to the transport layer.  My preference here
would be to remove all checked Exceptions from RawStore in favor of runtime
exceptions.  This is a popular format and is used (and therefore dovetails
nicely) with the underlying database access library DataNucleaus.  All of
the logging of un-checked Exceptions, and transforming them into Thrift
exceptions, should happen at the HMSHandler code.


2// Move Transaction Management

The ObjectStore has a pretty crazy path of handling transactions.  There
seems to be a lot of extra code around transaction tracking that was put in
probably because it's so hard to track transaction management within Hive.
All of the boiler-plate transaction management code should be removed from
ObjectStore and instead brought up into HMS handler as well.  This allows
the handler to create a single transaction per-request and call the
necessary ObjectStore methods.  This is not currently possible because each
ObjectStore handles transactions in its own special way. When you include
all of the open/commit/roll-back, and "transactional listeners," I'm not
certain all code paths are correct.  For example, I suspect some listeners
are being alerted outside of a transaction.  I also suspect some actions
are occurring in multiple transactions that should really be occurring
within a single transaction.

I have locally created some helper-code (TransactionOperations) to do this
from HMSHandler:

      TransactionOperations.newOperation(rawstore).execute(new
TransactionCallback<RawStore>() {

        // This method is called after openTransaction is called on the
RawStore
        // Runtime Exceptions are caught and cause the transaction to roll
back
        // The RawStore method commitTransaction is called if method
completes OK
        @Override
        public void doInTransaction(RawStore rawstore) throws MetaException
{

          // These RawStore method happen in one transaction
          rawstore.methodABC();
          rawstore.methodXXX();
          rawstore.methodXYZ();

          if (!transactionalListeners.isEmpty()) {
            transactionalListenersResponses =

MetaStoreListenerNotifier.notifyEvent(transactionalListeners,
                    EventType.CREATE_XXX,
                    new CreateXxxEvent(true, HMSHandler.this, xxx));
          }
        }
      });


Re-architecting the method signatures to remove the MetaExceptions is a
large-ish task, but trying to unwind all this transaction code is going to
be a bear, it's what prompted me to write this email.

Thanks.

Reply via email to