[ 
https://issues.apache.org/jira/browse/HIVE-27944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27944:
----------------------------------
    Labels: pull-request-available  (was: )

> When HIVE-LLAP reads the ICEBERG table, a deadlock may occur.
> -------------------------------------------------------------
>
>                 Key: HIVE-27944
>                 URL: https://issues.apache.org/jira/browse/HIVE-27944
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 3.1.3, 4.0.0, 4.0.0-beta-1
>            Reporter: yongzhi.shao
>            Priority: Critical
>              Labels: pull-request-available
>         Attachments: image-2023-12-08-14-17-53-822.png, 
> image-2023-12-08-14-22-18-998.png, image-2023-12-10-16-24-34-351.png
>
>
> We found that org.apache.hadoop.hive.ql.plan.PartitionDesc.equals() may 
> deadlock in a multithreaded environment.
> Here's the deadlock information we've gathered: 
> {code:java}
> "DAG44-Input-4-16" Id=161 BLOCKED on org.apache.hadoop.hive.common.CopyOnFirstWriteProperties@44196d35 owned by "DAG44-Input-4-15" Id=160
>     at org.apache.hadoop.hive.common.CopyOnFirstWriteProperties.size(CopyOnFirstWriteProperties.java:315)
>     -  blocked on org.apache.hadoop.hive.common.CopyOnFirstWriteProperties@44196d35
>     at java.util.Hashtable.equals(Hashtable.java:801)
>     -  locked java.util.Properties@77a541be <---- but blocks 3 other threads!
>     at org.apache.hadoop.hive.common.CopyOnFirstWriteProperties.equals(CopyOnFirstWriteProperties.java:213)
>     -  locked org.apache.hadoop.hive.common.CopyOnFirstWriteProperties@2d973aa3
>     at org.apache.hadoop.hive.ql.plan.PartitionDesc.equals(PartitionDesc.java:327)
>     at java.util.AbstractMap.equals(AbstractMap.java:495)
>     at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:940)
>     at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getFromPathRecursively(HiveFileFormatUtils.java:374)
>     at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getFromPathRecursively(HiveFileFormatUtils.java:359)
>     at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getFromPathRecursively(HiveFileFormatUtils.java:354)
>     at org.apache.hadoop.hive.ql.exec.tez.SplitGrouper.schemaEvolved(SplitGrouper.java:278)
>     at org.apache.hadoop.hive.ql.exec.tez.SplitGrouper.generateGroupedSplits(SplitGrouper.java:183)
>     at org.apache.hadoop.hive.ql.exec.tez.SplitGrouper.generateGroupedSplits(SplitGrouper.java:160)
>     at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:287)
>  {code}
> Because java.util.Properties extends java.util.Hashtable, its methods 
> (including equals() and size()) are synchronized. As the trace shows, 
> Hashtable.equals() holds the lock on its own object while it calls size() 
> on the other object. In a multi-threaded environment, a deadlock can 
> therefore occur when propA.equals(propB) and propB.equals(propA) run at 
> the same time.
>  
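
The lock ordering visible in the trace can be observed in isolation with a plain java.util.Hashtable. The following is a minimal sketch, not Hive code: ProbeTable and equalsLocksLeftThenRight are hypothetical names introduced only for illustration. It uses Thread.holdsLock to check whether size() on one table is entered while the calling thread still owns the other table's monitor.

```java
import java.util.Hashtable;

class LockOrderDemo {

    /** Hypothetical probe: a Hashtable that notes when size() runs under its peer's lock. */
    static class ProbeTable extends Hashtable<String, String> {
        private Object peer;
        private boolean sizeCalledUnderPeerLock;

        void setPeer(Object peer) { this.peer = peer; }

        boolean sizeCalledUnderPeerLock() { return sizeCalledUnderPeerLock; }

        @Override
        public synchronized int size() {
            // Thread.holdsLock reports whether the caller already owns the peer's monitor.
            if (peer != null && Thread.holdsLock(peer)) {
                sizeCalledUnderPeerLock = true;
            }
            return super.size();
        }
    }

    /** Runs left.equals(right); reports whether right.size() was entered while left was locked. */
    static boolean equalsLocksLeftThenRight(ProbeTable left, ProbeTable right) {
        left.setPeer(right);
        right.setPeer(left);
        left.equals(right); // Hashtable.equals is synchronized on 'left'
        return right.sizeCalledUnderPeerLock();
    }

    public static void main(String[] args) {
        ProbeTable a = new ProbeTable();
        ProbeTable b = new ProbeTable();
        a.put("k", "v");
        b.put("k", "v");
        System.out.println("a.equals(b) takes a, then b: " + equalsLocksLeftThenRight(a, b));
        System.out.println("b.equals(a) takes b, then a: " + equalsLocksLeftThenRight(b, a));
    }
}
```

The two calls acquire the same pair of monitors in opposite order. When they run on different threads at the same moment, each thread can end up holding one lock while waiting for the other, which matches the BLOCKED state reported above.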
> I have an idea for a fix: when CopyOnFirstWriteProperties.equals() is 
> called, we can make a copy of the object inside that method and compare 
> against the copy. If there are no objections to this approach, I will 
> submit a PR.
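
One way to read the copy-then-compare idea is sketched below. This is an assumption about the approach, not the submitted patch: SnapshotEquals, snapshot and equalsViaCopies are hypothetical names, and plain java.util.Properties stands in for CopyOnFirstWriteProperties. Each side is copied while holding only its own lock, and the comparison then runs on thread-private maps, so no thread ever holds both locks at once.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

class SnapshotEquals {

    /** Copies the entries while holding only p's own monitor. */
    static Map<Object, Object> snapshot(Properties p) {
        synchronized (p) {
            return new HashMap<>(p);
        }
    }

    /** Hypothetical helper: compares two Properties without ever holding both locks at once. */
    static boolean equalsViaCopies(Properties left, Properties right) {
        if (left == right) return true;
        if (left == null || right == null) return false;
        Map<Object, Object> leftCopy = snapshot(left);   // lock on left taken, then released
        Map<Object, Object> rightCopy = snapshot(right); // lock on right taken, then released
        return leftCopy.equals(rightCopy);               // thread-private maps, no shared lock
    }

    public static void main(String[] args) throws InterruptedException {
        Properties a = new Properties();
        Properties b = new Properties();
        a.setProperty("serialization.format", "1");
        b.setProperty("serialization.format", "1");

        // Compare in both directions concurrently; neither thread nests one lock inside another.
        Thread t1 = new Thread(() -> { for (int i = 0; i < 100_000; i++) equalsViaCopies(a, b); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 100_000; i++) equalsViaCopies(b, a); });
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println("equal: " + equalsViaCopies(a, b));
    }
}
```

The trade-off in this sketch is an extra copy per comparison, and the two snapshots are not taken atomically with respect to each other, so a concurrent write between them can affect the result.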



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
