Hi,

+1

I am sure Hive Metastore is very competitive. The standard Thrift protocol
gives it broad connectivity; even new catalogs are willing to federate
with HMS. Our openness lets HMS support new features and protocols in a
timely fashion; for example, HMS will likely be one of the most robust
Iceberg REST API implementations in a couple of months. HMS is easy to
deploy, requiring only a single RDBMS and a single Java process at a
minimum. HMS is also highly sustainable; we will still be maintaining it in
2030, with its enterprise-grade schema migration tool.

However, we are not highlighting these strengths effectively, and one
persistent myth is that HMS is difficult to deploy. I have had several
opportunities to talk with people who recently deployed a metadata
catalog, and I often hear that HMS is overkill. Yet the alternative they
chose is not necessarily easier to run than HMS.

I expect the HMS tarball and Docker image to attract both new data
platform users and existing HMS users. With a Docker image, HMS can be the
easiest option for the users mentioned above. I have also observed that
users in other communities, such as the Trino Slack, are willing to use an
official Docker image.

Therefore, I agree with the initiative. I am aware that preparing the
release involves some difficulties. However, it can be the most impactful
measure from a marketing perspective, and it will make actual users happy.

Best,
Okumin


On Wed, Jul 2, 2025 at 11:29 AM Butao Zhang <zhangbu...@apache.org> wrote:

> +1
> First of all, thank you very much to Stamatis for pointing out the work
> [1] the Hive community has done in decoupling HMS from Hive, such as ticket
> HIVE-17159 [2]. Personally, I believe that maintaining the decoupling of
> HMS and Hive code is absolutely the right approach. Therefore, without
> considering other community factors, directly bundling hive-exec.jar and
> hive-iceberg-handler.jar into the HMS package would seem entirely
> unreasonable—I might even give it a -1.
> The main reason for my +1 is the current state of community
> development.
> As we all know, next-generation table formats like Iceberg have brought
> significant challenges to traditional Hive table formats. Meanwhile, HMS,
> as the de facto standard catalog component, is also facing major
> competition from new systems like Unity Catalog and various Iceberg REST
> catalogs. Fortunately, the Hive community has recognized this. We have
> begun fully embracing the Iceberg table format, reconsidering how HMS can
> interface with different metadata services compatible with the HMS API
> (HIVE-27473)[3], integrating the Iceberg REST catalog into HMS
> (HIVE-28059)[4], and even planning to implement cool features like
> Federated Catalog in HMS (HIVE-28879)[5]. Personally, I believe there are
> many valuable things we can do around HMS in the future.
> Since many current and future efforts will focus on HMS, now is an
> excellent time for the community to release Standalone HMS. *This would
> send a positive signal to both the Hive community and beyond: the Hive
> community is committed to strengthening HMS Catalog as the de facto
> standard for metadata management*. I believe this will attract more
> community users to participate and contribute to the growth of the Hive
> community, creating a win-win situation for both users and the Hive
> ecosystem.
> If we wait until the code decoupling work is thoroughly completed before
> releasing Standalone HMS, it would be difficult for me to estimate the
> timeline. I’m also unsure whether the community has sufficient resources in
> the short term to undertake such significant decoupling efforts. If this
> refactoring takes a year or even two, I fear we might miss the best window
> for the release of Standalone HMS.
> In summary, I believe now is the right time to release Standalone HMS.
> When weighing the future development of the Hive community against
> code-level decoupling, I would prioritize the former. But I also think we
> can continue the code decoupling work in parallel after releasing
> Standalone HMS—the two efforts can, to some extent, proceed simultaneously.
> [1]
> https://issues.apache.org/jira/browse/HIVE-29052?focusedCommentId=17987181&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17987181
> [2] https://issues.apache.org/jira/browse/HIVE-17159
> [3] https://issues.apache.org/jira/browse/HIVE-27473
> [4] https://issues.apache.org/jira/browse/HIVE-28059
> [5] https://issues.apache.org/jira/browse/HIVE-28879
>
> Thanks,
> Butao Zhang
> ---- Replied Message ----
> From lisoda <lis...@yeah.net>
> Date 7/1/2025 23:36
> To dev <dev@hive.apache.org>
> Subject Re: [VOTE] Release HMS tarball and Docker image in Hive 4.1
> +1
>
>
> ---- Replied Message ----
> From Denys Kuzmenko <dkuzme...@apache.org>
> Date 07/01/2025 22:33
> To dev@hive.apache.org
> Cc
> Subject [VOTE] Release HMS tarball and Docker image in Hive 4.1
> Hi All,
>
> Please vote on whether we should proceed with releasing a tarball and a
> Docker image for the Hive Metastore (HMS) as part of the Hive 4.1 release.
>
> *Context*:
>
> There was a concern raised in HIVE-29052 [1], suggesting that the current
> packaging approach is flawed and proposing that the HMS tarball should not
> be released in 4.1.
>
> To ensure complete functionality, the HMS tarball includes
> hive-exec-[core] and hive-iceberg-handler jars. While HMS is not a
> standalone project and likely won’t be in the foreseeable future, I don’t
> believe this should block the release.
>
> Making HMS a truly standalone component would require a major refactor and
> substantial reorganization of modules and class dependencies, work that has
> been stalled for several years.
>
> Moreover, we need to release the HMS IcebergCatalog now to prevent users
> from shifting to alternative catalog implementations, which risks rendering
> HMS obsolete.
>
> Offering users a 458MB Hive tarball instead of a 169MB HMS parcel isn’t
> ideal. Many are reluctant to download the full Hive bundle just to access
> HMS binaries.
>
> While improvements can be made in the future, releasing a dedicated HMS
> package now provides a solid foundation and immediate value to users.
>
> *Please vote*:
>
> +1 - Proceed with releasing the HMS tarball and Docker image in 4.1
>   0 - No strong opinion
> -1 - Do not release the HMS tarball and Docker image in 4.1 (please
> explain why)
>
> [1]
> https://issues.apache.org/jira/browse/HIVE-29052?focusedCommentId=17987183&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17987183
>
