Cameron Moberg created HIVE-29216:
-------------------------------------

             Summary: DirectSQL disables strict checking in MySQL, allowing 
corrupted partition data
                 Key: HIVE-29216
                 URL: https://issues.apache.org/jira/browse/HIVE-29216
             Project: Hive
          Issue Type: Bug
          Components: Standalone Metastore
    Affects Versions: 4.1.0, 3.1.3
         Environment: This happened on Hive 3.1.3 and Spark 3.5.3. However, I 
do not see any additional checks in the master branch that would stop this 
from happening.
            Reporter: Cameron Moberg


[https://github.com/apache/hive/blob/77d0d8d92c3257fb056337e5757f0f9bd8c34f02/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/DatabaseProduct.java#L314]

 

Executing `SET @@session.sql_mode=ANSI_QUOTES` replaces the session's entire 
`sql_mode`, clearing all other MySQL defaults, notably `STRICT_TRANS_TABLES`.
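
One possible way to keep strict checking while still enabling `ANSI_QUOTES` is to add the mode to whatever the session already has instead of overwriting it. The sketch below is only an illustration: `SqlModeHelper` and `withAnsiQuotes` are hypothetical names, not existing Hive classes, and it assumes the caller has already read the current `@@session.sql_mode` value from the server.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of Hive: builds a SET statement that adds
// ANSI_QUOTES to the session's existing modes instead of replacing them.
final class SqlModeHelper {
    static String withAnsiQuotes(String currentSqlMode) {
        // LinkedHashSet keeps the server's original order and drops duplicates.
        Set<String> modes = new LinkedHashSet<>();
        if (currentSqlMode != null) {
            for (String mode : currentSqlMode.split(",")) {
                String trimmed = mode.trim();
                if (!trimmed.isEmpty()) {
                    modes.add(trimmed);
                }
            }
        }
        modes.add("ANSI_QUOTES");
        // The mode list is quoted because it may contain commas.
        return "SET @@session.sql_mode='" + String.join(",", modes) + "'";
    }
}
```

With a current mode of `STRICT_TRANS_TABLES`, this yields a statement that keeps `STRICT_TRANS_TABLES` and appends `ANSI_QUOTES`, so over-long values would be rejected rather than truncated.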

 

As a result, a partition with a value longer than 256 characters is inserted 
successfully when MySQL should reject it. The `PARTITIONS` table then has a 
`PART_NAME` with more than 256 characters, while the value in `PART_KEY_VALS` 
is silently truncated to 256 characters.

 

Any later attempt to drop the table or clean up its partitions never succeeds: 
the partition returned to the client is the truncated one, so the deletion 
attempt (correctly) fails because that partition is not found.

 

Ideally, either a validation check on partition values ensures that they will 
fit in the DB, or we don't clear the default session mode (unless required).
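
The validation option could look roughly like the sketch below. It is an assumption-laden illustration, not Hive code: `PartitionValueValidator` is a hypothetical class, and the 256-character limit is taken from the truncation described in this issue rather than read from the metastore schema.

```java
import java.util.List;

// Hypothetical pre-insert check, not part of Hive. The limit of 256 is an
// assumption based on the truncation reported in this issue.
final class PartitionValueValidator {
    static final int MAX_PART_VALUE_CHARS = 256;

    static void validate(List<String> partitionValues) {
        for (int i = 0; i < partitionValues.size(); i++) {
            String value = partitionValues.get(i);
            if (value == null) {
                continue;
            }
            // Count code points rather than UTF-16 units, since the column
            // length is expressed in characters.
            int chars = value.codePointCount(0, value.length());
            if (chars > MAX_PART_VALUE_CHARS) {
                throw new IllegalArgumentException(
                    "Partition value at index " + i + " has " + chars
                        + " characters; maximum is " + MAX_PART_VALUE_CHARS);
            }
        }
    }
}
```

Rejecting the value before it reaches the database would keep the partition name and its stored key values consistent regardless of the session's `sql_mode`.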



