Alan,

While continuing to ship HMS with Hive makes sense (at least for a while), what do you think about separating the lib/bin directories created in the distro, so that Hive and the metastore each have their own set of bin/lib dirs?
- Alex

On Wed, Jan 24, 2018 at 12:16 PM, Alan Gates <alanfga...@gmail.com> wrote:

> In HIVE-17983 I have been working on packaging and start/stop scripts for
> the standalone metastore. One question this brings up is how Hive will be
> released now, with or without the metastore. I can see two options:
>
> 1) We continue to ship the metastore with Hive. Not only does this mean
> the metastore code is in the Hive source code release and the metastore
> jars are in the Hive binary distribution, but scripts like metastore.sh are
> still included in Hive's bin directory, so that Hive admins can still do
> 'hive --service metastore' to start the metastore. I see the following
> advantages of this:
> a) it is completely backwards compatible;
> b) it is what users would expect (I have installed many databases and never
> been asked to first install a separate package for its data catalog or any
> other essential piece);
> c) this will still be the metastore's most frequent use case for at least
> the near future.
>
> The disadvantage is it is error prone when Hive is set up to connect to a
> separate metastore. An operator could easily start the metastore in the
> Hive package, not realizing Hive is configured to connect to a different
> one.
>
> 2) We remove the metastore from the packaging completely like we do Hadoop
> and require the user to install it separately. The advantages and
> disadvantages of this exactly mirror those of option 1.
>
> Based on both the 80/20 rule (most metastore users will still be single
> system Hive users) and the law of least astonishment (people expect a
> database to have a data catalog) I vote for option 1.
>
> Anyone strongly feel we should do 2 instead?
>
> Any other options I haven't considered?
>
> Alan.