Cameron Moberg created HIVE-29216:
-------------------------------------
Summary: DirectSQL disables strict checking in MySQL, allowing
corrupted partition data
Key: HIVE-29216
URL: https://issues.apache.org/jira/browse/HIVE-29216
Project: Hive
Issue Type: Bug
Components: Standalone Metastore
Affects Versions: 4.1.0, 3.1.3
Environment: Observed on Hive 3.1.3 with Spark 3.5.3. I do not see any
additional checks in the master branch that would prevent it there
either.
Reporter: Cameron Moberg
[https://github.com/apache/hive/blob/77d0d8d92c3257fb056337e5757f0f9bd8c34f02/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/DatabaseProduct.java#L314]
Executing `SET @@session.sql_mode=ANSI_QUOTES` replaces the session's entire
`sql_mode`, dropping the other MySQL defaults, notably `STRICT_TRANS_TABLES`.
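
One way to avoid clearing the defaults is to append `ANSI_QUOTES` to whatever the session already has rather than overwriting it. The following is a minimal sketch, not Hive code: the class `SqlModes` and its methods are hypothetical, and it assumes the caller has already read the current value (for example with `SELECT @@session.sql_mode`).

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Hive: merges a required mode into an
// existing sql_mode string instead of replacing the whole value.
final class SqlModes {
    private SqlModes() {}

    /** Returns currentModes with requiredMode added, keeping order and dropping duplicates. */
    static String withMode(String currentModes, String requiredMode) {
        Set<String> modes = new LinkedHashSet<>();
        if (currentModes != null) {
            for (String mode : currentModes.split(",")) {
                String normalized = mode.trim().toUpperCase(Locale.ROOT);
                if (!normalized.isEmpty()) {
                    modes.add(normalized);
                }
            }
        }
        modes.add(requiredMode.trim().toUpperCase(Locale.ROOT));
        return String.join(",", modes);
    }

    /** Builds the SET statement that keeps the existing modes (e.g. STRICT_TRANS_TABLES). */
    static String setSessionModeSql(String currentModes, String requiredMode) {
        return "SET @@session.sql_mode='" + withMode(currentModes, requiredMode) + "'";
    }
}
```

With a current value of `STRICT_TRANS_TABLES`, `setSessionModeSql` would produce a statement that sets `STRICT_TRANS_TABLES,ANSI_QUOTES`, so strict checking stays enabled.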
With strict mode off, inserting a partition with a value longer than 256
characters succeeds when MySQL should reject it. The result is a partition whose
`PART_NAME` in the `PARTITIONS` table holds the full value (> 256 chars), while
the `PART_KEY_VALS` entry is silently truncated to 256 chars.
As a result, dropping the table or cleaning up its partitions never succeeds:
the partition returned to the client is the truncated one, so the attempted
deletion (correctly) fails with a not-found error.
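
The mismatch can be illustrated with a toy in-memory model. This is only a sketch of the failure mode described above, not the metastore's actual lookup logic: the class `TruncationMismatchDemo`, its map-based "table", and the `key=value` name format are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model, not Hive code: the full partition name is stored intact, while
// the partition value is truncated the way non-strict MySQL would store it.
final class TruncationMismatchDemo {
    // Limit cited in the report for the partition value column.
    static final int VALUE_COLUMN_CHARS = 256;

    private TruncationMismatchDemo() {}

    /** Mimics non-strict mode: overlong values are cut instead of rejected. */
    static String storeNonStrict(String value) {
        return value.length() > VALUE_COLUMN_CHARS
                ? value.substring(0, VALUE_COLUMN_CHARS)
                : value;
    }

    /** Returns true if a drop keyed on the value read back finds the partition. */
    static boolean dropSucceeds(String partitionKey, String value) {
        Map<String, String> valuesByName = new HashMap<>();
        String storedName = partitionKey + "=" + value;   // full name is kept
        String storedValue = storeNonStrict(value);       // value may be truncated
        valuesByName.put(storedName, storedValue);

        // The client only sees the stored (possibly truncated) value and
        // rebuilds the name from it before asking for the drop.
        String nameSeenByClient = partitionKey + "=" + storedValue;
        return valuesByName.remove(nameSeenByClient) != null;
    }
}
```

In this model a value of 256 characters or fewer round-trips and the drop finds its row, while a longer value yields a rebuilt name that no longer matches the stored one.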
Ideally, either partition values are validated to ensure they will fit in the
DB, or we don't clear the default session modes (unless required).
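
The validation option could look roughly like the sketch below. It is not an existing Hive API: the class `PartitionValueCheck` is hypothetical, the 256-character limit is taken from the description above rather than read from the schema, and counting Unicode code points is an assumption about how the column length is measured.

```java
import java.util.List;

// Hypothetical pre-insert check, not part of Hive: rejects partition values
// that would not fit in the backing column instead of letting the DB truncate.
final class PartitionValueCheck {
    // Assumption: limit cited in the report; a real check should derive it
    // from the metastore schema.
    static final int MAX_PART_VALUE_CHARS = 256;

    private PartitionValueCheck() {}

    /** Throws IllegalArgumentException if any value exceeds the column width. */
    static void validate(List<String> partitionValues) {
        for (int i = 0; i < partitionValues.size(); i++) {
            String value = partitionValues.get(i);
            if (value == null) {
                continue; // null handling is left to the caller in this sketch
            }
            int chars = value.codePointCount(0, value.length());
            if (chars > MAX_PART_VALUE_CHARS) {
                throw new IllegalArgumentException(
                        "Partition value at index " + i + " has " + chars
                        + " characters; maximum is " + MAX_PART_VALUE_CHARS);
            }
        }
    }
}
```

Failing fast on the client-facing path would keep the stored name and value consistent regardless of the session's `sql_mode`.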
--
This message was sent by Atlassian Jira
(v8.20.10#820010)