Copilot commented on code in PR #6521:
URL: https://github.com/apache/hive/pull/6521#discussion_r3376961962
##########
standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-4.2.0-to-4.3.0.mssql.sql:
##########
@@ -5,6 +5,36 @@ ALTER TABLE MATERIALIZATION_REBUILD_LOCKS ADD MRL_CAT_NAME
nvarchar(128) NOT NUL
CREATE INDEX MIN_HISTORY_WRITE_ID_IDX ON MIN_HISTORY_WRITE_ID (MH_DATABASE,
MH_TABLE, MH_WRITEID);
+ALTER TABLE TXN_COMPONENTS ADD TC_ID bigint IDENTITY(1,1) NOT NULL;
+ALTER TABLE TXN_COMPONENTS ADD CONSTRAINT TXN_COMPONENTS_PK PRIMARY KEY
CLUSTERED (TC_ID);
+
+ALTER TABLE COMPLETED_TXN_COMPONENTS ADD CTC_ID bigint IDENTITY(1,1) NOT NULL;
+ALTER TABLE COMPLETED_TXN_COMPONENTS ADD CONSTRAINT
COMPLETED_TXN_COMPONENTS_PK PRIMARY KEY CLUSTERED (CTC_ID);
+
+ALTER TABLE COMPACTION_METRICS_CACHE ADD CMC_ID bigint IDENTITY(1,1) NOT NULL;
+ALTER TABLE COMPACTION_METRICS_CACHE ADD CONSTRAINT
COMPACTION_METRICS_CACHE_PK PRIMARY KEY CLUSTERED (CMC_ID);
+
+ALTER TABLE WRITE_SET ADD WS_ID bigint IDENTITY(1,1) NOT NULL;
+ALTER TABLE WRITE_SET ADD CONSTRAINT WRITE_SET_PK PRIMARY KEY CLUSTERED
(WS_ID);
Review Comment:
SQL Server `ALTER TABLE ... ADD ... IDENTITY` + adding clustered PK
constraints can be long-running and will take schema modification locks. For
large metastore tables (TXN_COMPONENTS/WRITE_SET/etc.) this can introduce
significant upgrade downtime; consider documenting expected impact or using a
staged migration (add column, populate, then add PK) if downtime is a concern.
##########
standalone-metastore/metastore-server/src/main/sql/postgres/upgrade-4.2.0-to-4.3.0.postgres.sql:
##########
@@ -5,6 +5,25 @@ ALTER TABLE "MATERIALIZATION_REBUILD_LOCKS" ADD COLUMN
"MRL_CAT_NAME" varchar(12
CREATE INDEX "MIN_HISTORY_WRITE_ID_IDX" ON "MIN_HISTORY_WRITE_ID"
("MH_DATABASE", "MH_TABLE", "MH_WRITEID");
+ALTER TABLE "TXN_COMPONENTS" ADD COLUMN "TC_ID" bigserial PRIMARY KEY;
+ALTER TABLE "COMPLETED_TXN_COMPONENTS" ADD COLUMN "CTC_ID" bigserial PRIMARY
KEY;
+ALTER TABLE "COMPACTION_METRICS_CACHE" ADD COLUMN "CMC_ID" bigserial PRIMARY
KEY;
+ALTER TABLE "WRITE_SET" ADD COLUMN "WS_ID" bigserial PRIMARY KEY;
Review Comment:
Adding `bigserial PRIMARY KEY` columns via `ALTER TABLE ... ADD COLUMN ...`
forces an `ACCESS EXCLUSIVE` lock and rewrites/scans large transactional tables
(e.g., TXN_COMPONENTS/WRITE_SET). On busy metastores this can cause a long
upgrade outage; consider a lower-lock migration (add nullable column, backfill
IDs in batches, then add NOT NULL + PK) or document expected downtime/size
assumptions.
##########
standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-4.2.0-to-4.3.0.mysql.sql:
##########
@@ -5,6 +5,19 @@ ALTER TABLE MATERIALIZATION_REBUILD_LOCKS ADD MRL_CAT_NAME
varchar(128) NOT NUL
CREATE INDEX MIN_HISTORY_WRITE_ID_IDX ON MIN_HISTORY_WRITE_ID (MH_DATABASE,
MH_TABLE, MH_WRITEID);
+ALTER TABLE TXN_COMPONENTS ADD TC_ID bigint NOT NULL AUTO_INCREMENT PRIMARY
KEY FIRST;
+ALTER TABLE COMPLETED_TXN_COMPONENTS ADD CTC_ID bigint NOT NULL AUTO_INCREMENT
PRIMARY KEY FIRST;
+ALTER TABLE COMPACTION_METRICS_CACHE ADD CMC_ID bigint NOT NULL AUTO_INCREMENT
PRIMARY KEY FIRST;
+ALTER TABLE WRITE_SET ADD WS_ID bigint NOT NULL AUTO_INCREMENT PRIMARY KEY
FIRST;
+
Review Comment:
This upgrade adds `AUTO_INCREMENT PRIMARY KEY` columns to existing large txn
tables (TXN_COMPONENTS/WRITE_SET/etc.). In MySQL these `ALTER TABLE` operations
are typically table-rebuilding and can be very slow + blocking for large
installations; consider using an online schema change approach (if supported in
your deployment) or at least calling out the expected downtime/impact in
release notes.
##########
standalone-metastore/metastore-server/src/main/sql/oracle/upgrade-4.2.0-to-4.3.0.oracle.sql:
##########
@@ -5,6 +5,46 @@ ALTER TABLE MATERIALIZATION_REBUILD_LOCKS ADD (MRL_CAT_NAME
VARCHAR2(128) DEFAUL
CREATE INDEX MIN_HISTORY_WRITE_ID_IDX ON MIN_HISTORY_WRITE_ID (MH_DATABASE,
MH_TABLE, MH_WRITEID);
+ALTER TABLE TXN_COMPONENTS ADD (TC_ID NUMBER(19) GENERATED BY DEFAULT AS
IDENTITY PRIMARY KEY);
+ALTER TABLE COMPLETED_TXN_COMPONENTS ADD (CTC_ID NUMBER(19) GENERATED BY
DEFAULT AS IDENTITY PRIMARY KEY);
+ALTER TABLE COMPACTION_METRICS_CACHE ADD (CMC_ID NUMBER(19) GENERATED BY
DEFAULT AS IDENTITY PRIMARY KEY);
+ALTER TABLE WRITE_SET ADD (WS_ID NUMBER(19) GENERATED BY DEFAULT AS IDENTITY
PRIMARY KEY);
Review Comment:
For Oracle, the upgrade adds identity primary keys to several existing
transactional tables (TXN_COMPONENTS/WRITE_SET/etc.). These DDLs can be
long-running/blocking on large tables; since you already use a TMP-table copy
approach later in the script, consider whether a similar staged approach is
needed here as well, or explicitly document expected downtime/size assumptions
for Oracle upgrades.
##########
standalone-metastore/metastore-server/src/main/sql/derby/upgrade-4.2.0-to-4.3.0.derby.sql:
##########
@@ -3,5 +3,22 @@ ALTER TABLE MATERIALIZATION_REBUILD_LOCKS ADD COLUMN
MRL_CAT_NAME varchar(128) N
CREATE INDEX MIN_HISTORY_WRITE_ID_IDX ON MIN_HISTORY_WRITE_ID (MH_DATABASE,
MH_TABLE, MH_WRITEID);
+ALTER TABLE TXN_COMPONENTS ADD COLUMN TC_ID BIGINT PRIMARY KEY GENERATED BY
DEFAULT AS IDENTITY;
+ALTER TABLE COMPLETED_TXN_COMPONENTS ADD COLUMN CTC_ID BIGINT PRIMARY KEY
GENERATED BY DEFAULT AS IDENTITY;
+ALTER TABLE COMPACTION_METRICS_CACHE ADD COLUMN CMC_ID BIGINT PRIMARY KEY
GENERATED BY DEFAULT AS IDENTITY;
+ALTER TABLE WRITE_SET ADD COLUMN WS_ID BIGINT PRIMARY KEY GENERATED BY DEFAULT
AS IDENTITY;
+
Review Comment:
Derby `ALTER TABLE` operations that add identity primary keys and new PK
constraints can be expensive and locking (often requiring a full table
scan/rewrite). If this upgrade is expected to run on non-trivial embedded Derby
metastores, consider documenting potential upgrade time/downtime or applying a
staged backfill to reduce lock time.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]