harshal-16 commented on code in PR #5617:
URL: https://github.com/apache/hive/pull/5617#discussion_r1939286256
##########
hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java:
##########
@@ -1205,7 +1205,7 @@ private void
addWriteNotificationLog(List<NotificationEvent> eventBatch, List<Ac
String select = sqlGenerator.addForUpdateClause("select \"WNL_ID\",
\"WNL_FILES\" from" +
" \"TXN_WRITE_NOTIFICATION_LOG\" " +
"where \"WNL_DATABASE\" = ? " +
- "and \"WNL_TABLE\" = ? " + " and \"WNL_PARTITION\" = ? " +
+ "and \"WNL_TABLE\" = ? " + " and (\"WNL_PARTITION\" = ? OR
\"WNL_PARTITION\" IS NULL) " +
Review Comment:
@deniskuzZ
for 1. you can simply create CREATE TABLE employee (id int, name string,
salary int)
STORED AS ORC TBLPROPERTIES ('transactional' = 'true');
This will not have any partition information and in the table we are adding
'' for partition
For 2. I am not changing any logic here. I am just covering the way oracle
handles '' string
Consider below example in oracle database
```
CREATE TABLE t1 (name VARCHAR2(100));
INSERT INTO t1 (name) VALUES ('harshal');
INSERT INTO t1 (name) VALUES ('');
INSERT INTO t1 (name) VALUES (NULL);
SELECT * FROM t1; -- RETURNS all 3 rows
SELECT * FROM t1 WHERE name = ''; -- RETURNS 0 rows
SELECT * FROM t1 WHERE name IS NULL; --returns 2 rows
```
Because of this quirky behavior of Oracle, if the backend database is
Oracle, then the MERGE operation adds 2 rows into TXN_WRITE_NOTIFICATION_LOG
instead of 1.
Merge statement ran in Hive shell:
`MERGE INTO employee AS a
USING employee_update AS b ON a.id = b.id
WHEN MATCHED THEN UPDATE SET salary = b.salary
WHEN NOT MATCHED THEN INSERT VALUES (b.id, b.name, b.salary);
`
Rows added in oracle database in TXN_WRITE_NOTIFICATION_LOG table
`"WNL_ID","WNL_TXNID","WNL_WRITEID","WNL_DATABASE","WNL_TABLE","WNL_PARTITION","WNL_TABLE_OBJ","WNL_PARTITION_OBJ","WNL_FILES","WNL_EVENT_TIME"
7,9,3,default,employee,,"{""1"":{""str"":""employee""},""2"":{""str"":""default""},""3"":{""str"":""hive""},""4"":{""i32"":1738583942},""5"":{""i32"":0},""6"":{""i32"":0},""7"":{""rec"":{""1"":{""lst"":[""rec"",3,{""1"":{""str"":""id""},""2"":{""str"":""int""}},{""1"":{""str"":""name""},""2"":{""str"":""string""}},{""1"":{""str"":""salary""},""2"":{""str"":""int""}}]},""2"":{""str"":""hdfs://ccycloud-1.harshal1.root.comops.site:8020/warehouse/tablespace/managed/hive/employee""},""3"":{""str"":""org.apache.hadoop.hive.ql.io.orc.OrcInputFormat""},""4"":{""str"":""org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat""},""5"":{""tf"":0},""6"":{""i32"":-1},""7"":{""rec"":{""2"":{""str"":""org.apache.hadoop.hive.ql.io.orc.OrcSerde""},""3"":{""map"":[""str"",""str"",0,{}]}}},""8"":{""lst"":[""str"",0]},""9"":{""lst"":[""rec"",0]},""10"":{""map"":[""str"",""str"",0,{}]},""11"":{""rec"":{""1"":{""lst"":[""str"",0]},""2"":{""lst"":[""lst"",0]},""3"":{""map"":[""lst"",""str"",0,{}]}}},""12"":{""
tf"":0}}},""8"":{""lst"":[""rec"",0]},""9"":{""map"":[""str"",""str"",10,{""totalSize"":""2391"",""numRows"":""3"",""rawDataSize"":""0"",""transactional_properties"":""default"",""COLUMN_STATS_ACCURATE"":""{\""BASIC_STATS\"":\""true\""}"",""numFiles"":""3"",""transient_lastDdlTime"":""1738584068"",""bucketing_version"":""2"",""numFilesErasureCoded"":""0"",""transactional"":""true""}]},""12"":{""str"":""MANAGED_TABLE""},""15"":{""tf"":0},""17"":{""str"":""hive""},""18"":{""i32"":1},""19"":{""i64"":3},""26"":{""i64"":131}}",[NULL],"hdfs://ccycloud-1.harshal1.root.comops.site:8020/warehouse/tablespace/managed/hive/employee/delta_0000003_0000003_0001/bucket_00000_0###delta_0000003_0000003_0001",1738584069
8,9,3,default,employee,,"{""1"":{""str"":""employee""},""2"":{""str"":""default""},""3"":{""str"":""hive""},""4"":{""i32"":1738583942},""5"":{""i32"":0},""6"":{""i32"":0},""7"":{""rec"":{""1"":{""lst"":[""rec"",3,{""1"":{""str"":""id""},""2"":{""str"":""int""}},{""1"":{""str"":""name""},""2"":{""str"":""string""}},{""1"":{""str"":""salary""},""2"":{""str"":""int""}}]},""2"":{""str"":""hdfs://ccycloud-1.harshal1.root.comops.site:8020/warehouse/tablespace/managed/hive/employee""},""3"":{""str"":""org.apache.hadoop.hive.ql.io.orc.OrcInputFormat""},""4"":{""str"":""org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat""},""5"":{""tf"":0},""6"":{""i32"":-1},""7"":{""rec"":{""2"":{""str"":""org.apache.hadoop.hive.ql.io.orc.OrcSerde""},""3"":{""map"":[""str"",""str"",0,{}]}}},""8"":{""lst"":[""str"",0]},""9"":{""lst"":[""rec"",0]},""10"":{""map"":[""str"",""str"",0,{}]},""11"":{""rec"":{""1"":{""lst"":[""str"",0]},""2"":{""lst"":[""lst"",0]},""3"":{""map"":[""lst"",""str"",0,{}]}}},""12"":{""
tf"":0}}},""8"":{""lst"":[""rec"",0]},""9"":{""map"":[""str"",""str"",10,{""totalSize"":""2391"",""numRows"":""3"",""rawDataSize"":""0"",""transactional_properties"":""default"",""COLUMN_STATS_ACCURATE"":""{\""BASIC_STATS\"":\""true\""}"",""numFiles"":""3"",""transient_lastDdlTime"":""1738584069"",""bucketing_version"":""2"",""numFilesErasureCoded"":""0"",""transactional"":""true""}]},""12"":{""str"":""MANAGED_TABLE""},""15"":{""tf"":0},""17"":{""str"":""hive""},""18"":{""i32"":1},""19"":{""i64"":3},""26"":{""i64"":131}}",[NULL],"hdfs://ccycloud-1.harshal1.root.comops.site:8020/warehouse/tablespace/managed/hive/employee/delta_0000003_0000003_0002/bucket_00000_0###delta_0000003_0000003_0002",1738584070
`
and if you see all other database's DDL statements for table creation, it
has primary key on WNL_TXNID, WNL_DATABASE, WNL_TABLE, WNL_PARTITION -So, above
thing will not even happen in case of other databases but this is a different
topic for discussion
Snippet from hive-schema-4.0.0-beta-1.mssql.sql
`ALTER TABLE TXN_WRITE_NOTIFICATION_LOG ADD CONSTRAINT
TXN_WRITE_NOTIFICATION_LOG_PK PRIMARY KEY (WNL_TXNID, WNL_DATABASE, WNL_TABLE,
WNL_PARTITION);
`
Apart from this, Similar logic is added other places but missing here
Ref:
1.
https://github.com/apache/hive/blob/dee65466b8dcb0a624e65f689e9580425deb34ab/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/jdbc/commands/RemoveCompactionMetricsDataCommand.java#L35
2.
https://github.com/apache/hive/blob/dee65466b8dcb0a624e65f689e9580425deb34ab/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/jdbc/commands/RemoveDuplicateCompleteTxnComponentsCommand.java#L56
Let me know if you still have some doubts / suggestion then we can surely
discuss in detail
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]