Mingmin Xu created HIVE-27677:
---------------------------------

             Summary: print out HiveMetaStore.audit to json format
                 Key: HIVE-27677
                 URL: https://issues.apache.org/jira/browse/HIVE-27677
             Project: Hive
          Issue Type: Improvement
          Components: Standalone Metastore
            Reporter: Mingmin Xu
            Assignee: Mingmin Xu

This task aims to print a new[1] line of the HiveMetaStore audit log in JSON format, similar to [https://github.com/apache/hive/pull/1582] but extended to cover the `cmd` details as well.

# existing audit log
```
HiveMetaStore.audit: ugi=xxx ip=xx.xx.xx.xx cmd=source:xx.xx.xx.xx get_table : db=xxx tbl=xxx
HiveMetaStore.audit: ugi=xxx ip=xx.xx.xx.xx cmd=source:xx.xx.xx.xx get_partition_with_auth : db=xx tbl=xx[xxx]
```
# The new audit log
```
HiveMetaStore.audit: {"ugi": "xxx", "ip": "xx.xx.xx.xx", "cmd": {"source": "xx.xx.xx.xx", "api": "get_table", "params": {"db": "xxx", "tbl": "xxx"}}}
HiveMetaStore.audit: {"ugi": "xxx", "ip": "xx.xx.xx.xx", "cmd": {"source": "xx.xx.xx.xx", "api": "get_partition_with_auth", "params": {"db": "xxx", "tbl": "xxx", "key": ["xxx"]}}}
```

----------------

For some context, we're tracking the usage of the shared Hive Metastore Service. The HiveMetaStore audit log is the raw data we rely on to understand the traffic along different dimensions: source (IP), API, database, table, etc.

Currently the audit log is a raw string without a standard format, especially the extraLogInfo part (code pointer [here|https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java#L182-L200]), which makes it harder to analyze.

[1] Should we print another line instead of replacing the existing one, to avoid a breaking change?
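
To illustrate the analysis problem from the log consumer's side, here is a minimal Python sketch, not the proposed Metastore change itself. It assumes the legacy line always has the shape `ugi=... ip=... cmd=source:... <api> : <details>` as in the samples above, and that the new line carries strict JSON after the `HiveMetaStore.audit: ` prefix. The names `legacy_to_record` and `to_json_line` are hypothetical helpers introduced only for this example.

```python
import json
import re

# Prefix as it appears in the log samples of this issue.
PREFIX = "HiveMetaStore.audit: "

# Assumed shape of a legacy line (inferred from the samples, not from HMS code):
#   ugi=<user> ip=<addr> cmd=source:<addr> <api> : <free-form details>
LEGACY_RE = re.compile(
    r"ugi=(?P<ugi>\S+)\s+ip=(?P<ip>\S+)\s+cmd=source:(?P<source>\S+)"
    r"\s+(?P<api>\S+)\s*(?::\s*(?P<details>.*))?$"
)


def legacy_to_record(line):
    """Convert one legacy audit line into a nested dict, or None if it does not match."""
    body = line[len(PREFIX):] if line.startswith(PREFIX) else line
    match = LEGACY_RE.match(body.strip())
    if match is None:
        return None
    params = {}
    # The details are free-form text; only simple key=value tokens are handled.
    # A value such as "xx[xxx]" stays a single opaque string here, which is
    # exactly the kind of ambiguity a structured format would remove.
    for token in (match.group("details") or "").split():
        key, sep, value = token.partition("=")
        if sep:
            params[key] = value
    return {
        "ugi": match.group("ugi"),
        "ip": match.group("ip"),
        "cmd": {
            "source": match.group("source"),
            "api": match.group("api"),
            "params": params,
        },
    }


def to_json_line(record):
    """Render a record as an audit line with a strict-JSON payload."""
    return PREFIX + json.dumps(record)
```

With a JSON payload, a consumer can drop the regular expression entirely and read a dimension directly, for example `json.loads(line[len(PREFIX):])["cmd"]["api"]`.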