[ 
https://issues.apache.org/jira/browse/HIVE-28145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17956846#comment-17956846
 ] 

Denys Kuzmenko commented on HIVE-28145:
---------------------------------------

[~VenuReddy], [~dengzh] is this a regression? Which HMS client was used, 
SessionHiveMetaStoreClient?

> getPartitionsByNames API returns partition objects with empty values in many 
> fields when it is executed concurrently with dropPartition API 
> --------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-28145
>                 URL: https://issues.apache.org/jira/browse/HIVE-28145
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Venugopal Reddy K
>            Priority: Major
>              Labels: hive-4.1.0-must
>
> *Description:*
> The getPartitionsByNames API returns partition objects with empty values in 
> many fields when it is executed concurrently with the dropPartition API.
> The org.apache.hadoop.hive.metastore.MetaStoreDirectSql#getPartitionsViaPartNames 
> method issues multiple queries to the backend database to populate the 
> various fields of the partition object. First it queries for the partition 
> IDs by partition name, then joins the PARTITIONS, SDS, and SERDES tables for 
> those IDs and creates the partition objects. A further query against the 
> PARTITION_KEY_VALS table then fetches the partition values for those IDs and 
> populates them into the already created partition objects.
> So if a partition is deleted just before the PARTITION_KEY_VALS query, the 
> partition object can end up with empty values. The same can happen for the 
> other fields of the partition object that need their own queries (partition 
> params, storage descriptor params, serde params, sort cols, bucket cols, 
> skewed cols, etc.).
> *Note: The issue can be observed with both directsql and JDO-based queries. 
> All APIs that issue multiple queries to the backend database within a 
> transaction need to be checked.*
> *Root Cause:*
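> The transaction is opened with the default isolation level. The default in 
> DataNucleus is read-committed, so each query in the transaction can see 
> changes that other transactions committed after the earlier queries ran.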
> *Steps to reproduce:*
>  # Create a partitioned table and add 500 to 1000 dynamic partitions 
> (optionally with a dummy partition param, sd param, and serde param).
>  # Create a thread pool of size 2 and submit 2 tasks: one that calls 
> getPartitionsByNames and another that calls dropPartition in a loop.
>  # Verify the fields in the partition objects returned from 
> getPartitionsByNames().
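
The race in the quoted description can be illustrated without a metastore. The sketch below is only an in-memory model under stated assumptions: the class PartitionRaceSketch, its two maps standing in for the PARTITIONS and PARTITION_KEY_VALS tables, and the betweenQueries hook are all hypothetical and are not Hive APIs. It uses a thread pool of size 2, as in the reproduction steps, and forces the drop to land between the reader's two lookups so that the outcome is deterministic.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical in-memory model of the race; none of these names are Hive APIs.
class PartitionRaceSketch {
    // Stand-ins for the PARTITIONS and PARTITION_KEY_VALS tables, keyed by part id.
    static final Map<Long, String> PARTITIONS = new ConcurrentHashMap<>();
    static final Map<Long, List<String>> PARTITION_KEY_VALS = new ConcurrentHashMap<>();

    static final class Partition {
        final String name;
        List<String> values = new ArrayList<>();
        Partition(String name) { this.name = name; }
    }

    static void addPartition(long id, String name, String value) {
        PARTITIONS.put(id, name);
        PARTITION_KEY_VALS.put(id, List.of(value));
    }

    static void dropPartition(long id) {
        PARTITION_KEY_VALS.remove(id);
        PARTITIONS.remove(id);
    }

    // Two-step read: build the objects first, then fill in the values with a
    // second lookup. betweenQueries runs in the gap between the two lookups.
    static List<Partition> getPartitions(List<Long> ids, Runnable betweenQueries) {
        Map<Long, Partition> byId = new LinkedHashMap<>();
        for (Long id : ids) {
            String name = PARTITIONS.get(id);
            if (name != null) {
                byId.put(id, new Partition(name));
            }
        }
        betweenQueries.run();
        for (Map.Entry<Long, Partition> e : byId.entrySet()) {
            List<String> vals = PARTITION_KEY_VALS.get(e.getKey());
            if (vals != null) {
                e.getValue().values = vals;
            }
        }
        return new ArrayList<>(byId.values());
    }

    static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Runs one reader and one dropper on a pool of size 2; the latches force
    // the drop to happen after the first lookup and before the second.
    static List<Partition> runScenario() {
        PARTITIONS.clear();
        PARTITION_KEY_VALS.clear();
        addPartition(1L, "dt=2024-01-01", "2024-01-01");
        addPartition(2L, "dt=2024-01-02", "2024-01-02");

        CountDownLatch firstQueryDone = new CountDownLatch(1);
        CountDownLatch dropDone = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<List<Partition>> reader = pool.submit(() -> getPartitions(List.of(1L, 2L), () -> {
                firstQueryDone.countDown();
                awaitQuietly(dropDone);
            }));
            pool.submit(() -> {
                awaitQuietly(firstQueryDone);
                dropPartition(1L);
                dropDone.countDown();
            });
            return reader.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        for (Partition p : runScenario()) {
            System.out.println(p.name + " values=" + p.values);
        }
    }
}
```

In this model the dropped partition is still returned, but with an empty values list, while the untouched partition is complete. A real reproduction would replace the latches with the unsynchronized loop from the steps above and run against an actual metastore.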



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
