#1 is preferable, IMHO.

On Wed, Jan 24, 2018 at 12:16 PM Alan Gates <alanfga...@gmail.com> wrote:

> In HIVE-17983 I have been working on packaging and start/stop scripts for
> the standalone metastore.  One question this brings up is how Hive will be
> released now, with or without the metastore.  I can see two options:
>
> 1) We continue to ship the metastore with Hive.  Not only does this mean
> the metastore code is in the Hive source code release and the metastore
> jars are in the Hive binary distribution, but scripts like metastore.sh are
> still included in Hive's bin directory, so that Hive admins can still do
> 'hive --service metastore' to start the metastore.  I see the following
> advantages of this:
> a) it is completely backwards compatible;
> b) it is what users would expect (I have installed many databases and never
> been asked to first install a separate package for its data catalog or any
> other essential piece);
> c) this will still be the metastore's most frequent use case for at least
> the near future.
>
> The disadvantage is it is error prone when Hive is set up to connect to a
> separate metastore.  An operator could easily start the metastore in the
> Hive package, not realizing Hive is configured to connect to a different
> one.
>
> 2) We remove the metastore from the packaging completely, as we do with
> Hadoop, and require the user to install it separately.  The advantages and
> disadvantages of this exactly mirror those of option 1.
>
> Based on both the 80/20 rule (most metastore users will still be single
> system Hive users) and the law of least astonishment (people expect a
> database to have a data catalog) I vote for option 1.
>
> Anyone strongly feel we should do 2 instead?
>
> Any other options I haven't considered?
>
> Alan.
>
