[jira] [Comment Edited] (HIVE-29370) Catalog Metadata MCP Server

Denys Kuzmenko (Jira) Fri, 10 Apr 2026 16:28:08 -0700


    [ 
https://issues.apache.org/jira/browse/HIVE-29370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18072755#comment-18072755
 ]


Denys Kuzmenko edited comment on HIVE-29370 at 4/10/26 11:27 PM:
-----------------------------------------------------------------

Hi [~Aggarwal_Raghav], apologies for missing your earlier message.

On your points:

1. Ranger access control
MCP acts as a thin abstraction layer that exposes tools for interacting with 
HMS metadata. Under the hood, it relies on existing HMS APIs, so authorization 
can (and should) be enforced through the mechanisms already in place, such as 
Ranger policies. As long as the HMS layer is properly integrated with Ranger, 
the same access checks apply to requests coming via MCP.
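
To illustrate the "thin layer" idea, here is a minimal sketch rather than the actual implementation. It assumes the caller's identity is forwarded to the metastore unchanged. MetastoreClient, ListTablesTool, FakeMetastore and AccessDenied are hypothetical names, not Hive or Ranger APIs, and the allow-list only stands in for real Ranger policies.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


class AccessDenied(Exception):
    """Raised by the (hypothetical) metastore client when a policy check fails."""


class MetastoreClient(Protocol):
    """Hypothetical stand-in for an HMS client; not a real Hive API."""

    def get_table_names(self, user: str, database: str) -> list[str]: ...


@dataclass
class ListTablesTool:
    """MCP-style tool that only forwards the request to the metastore."""

    client: MetastoreClient

    def call(self, user: str, database: str) -> dict:
        try:
            # The caller's identity is passed through unchanged, so the policy
            # decision is made by the metastore layer, not by the tool.
            tables = self.client.get_table_names(user, database)
        except AccessDenied as exc:
            return {"ok": False, "error": str(exc)}
        return {"ok": True, "tables": tables}


class FakeMetastore:
    """In-memory fake; the allow-list is a toy stand-in for Ranger policies."""

    def __init__(self, tables: dict[str, list[str]], allowed: set[tuple[str, str]]):
        self._tables = tables
        self._allowed = allowed  # set of (user, database) pairs

    def get_table_names(self, user: str, database: str) -> list[str]:
        if (user, database) not in self._allowed:
            raise AccessDenied(f"{user} may not read {database}")
        return sorted(self._tables.get(database, []))
```

The tool holds no authorization logic of its own: a denied request surfaces as an error result, and an allowed one returns whatever the metastore returns.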

2. Complex metadata queries and HMS load
Queries like "list all tables having columns > x and partitions < y" are 
inherently metadata-intensive and may require scanning large portions of the 
catalog. This is not unique to AI agents: any client issuing such queries would 
incur similar costs. To mitigate the potential impact on HMS performance, we 
could consider:
* Introducing guardrails (e.g., query limits, pagination, or timeouts)
* Encouraging more targeted queries through the interface
* Leveraging caching or pre-aggregated metadata where feasible
* Monitoring and possibly rate-limiting heavy or repetitive requests
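
As a rough sketch of what some of these guardrails could look like (page-size cap, time budget, and rate limiting), assuming a hypothetical fetch_page(offset, size) callable in front of HMS. MAX_PAGE_SIZE, paginate and TokenBucket are illustrative names and values, not part of any existing API:

```python
import time
from typing import Callable, Iterator, Sequence

MAX_PAGE_SIZE = 100  # hypothetical cap; a real limit would be configurable


def clamp_page_size(requested: int, maximum: int = MAX_PAGE_SIZE) -> int:
    """Keep the client-requested page size within [1, maximum]."""
    return max(1, min(requested, maximum))


def paginate(
    fetch_page: Callable[[int, int], Sequence[str]],
    page_size: int,
    deadline_s: float,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[Sequence[str]]:
    """Yield pages until the source is exhausted or the time budget is spent."""
    size = clamp_page_size(page_size)
    start = clock()
    offset = 0
    while True:
        # Check the budget before each backend call.
        if clock() - start > deadline_s:
            raise TimeoutError("metadata query exceeded its time budget")
        page = fetch_page(offset, size)
        if not page:
            return
        yield page
        offset += len(page)


class TokenBucket:
    """Simple token-bucket rate limiter for heavy or repetitive requests."""

    def __init__(
        self,
        capacity: int,
        refill_per_s: float,
        clock: Callable[[], float] = time.monotonic,
    ):
        self.capacity = capacity
        self.refill_per_s = refill_per_s
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self) -> bool:
        """Refill based on elapsed time, then spend one token if available."""
        now = self._clock()
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_s)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False
```

With this shape, a request asking for 1000 rows per page is quietly reduced to the cap, a scan that runs too long fails with an explicit error instead of holding HMS resources, and a client that exhausts its bucket is rejected until tokens refill.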

Overall, MCP should be treated as another client of HMS, meaning existing 
scalability, governance, and optimization considerations still apply.


was (Author: dkuzmenko):
Hi [~Aggarwal_Raghav], apologies for missing your earlier message.

On your points:

# Ranger access control
MCP acts as a thin abstraction layer that exposes tools for interacting with 
HMS metadata. Under the hood, it relies on existing HMS APIs, so authorization 
can (and should) be enforced through the same mechanisms already in place—such 
as Ranger policies. As long as the HMS layer is properly integrated with 
Ranger, access checks will naturally apply to requests coming via MCP as well. 

# Complex metadata queries and HMS load
Queries like "list all tables having columns > x and partitions < y" are 
inherently metadata-intensive and may require scanning large portions of the 
catalog. This is not unique to AI agents—any client issuing such queries would 
incur similar costs. To mitigate potential impact on HMS performance, we could 
consider:
* Introducing guardrails (e.g., query limits, pagination, or timeouts)
* Encouraging more targeted queries through the interface
* Leveraging caching or pre-aggregated metadata where feasible
* Monitoring and possibly rate-limiting heavy or repetitive requests

Overall, MCP should be treated as another client of HMS, meaning existing 
scalability, governance, and optimization considerations still apply.

> Catalog Metadata MCP Server
> ---------------------------
>
>                 Key: HIVE-29370
>                 URL: https://issues.apache.org/jira/browse/HIVE-29370
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Denys Kuzmenko
>            Assignee: Denys Kuzmenko
>            Priority: Major
>              Labels: hive-ai, pull-request-available
>
> Implement MCP server that exposes metadata from HMS to AI agents, LLMs, or 
> other MCP clients. This server will act as an adapter layer between the 
> metadata backend and the MCP protocol, enabling structured tool calls without 
> introducing session state or modifying the HMS.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
