BsoBird commented on code in PR #5934:
URL: https://github.com/apache/hive/pull/5934#discussion_r2189531594


##########
ql/src/java/org/apache/hadoop/hive/ql/exec/tez/SplitGrouper.java:
##########
@@ -71,8 +70,8 @@ public class SplitGrouper {
 
   // TODO This needs to be looked at. Map of Map to Map... Made concurrent for now since split generation
   // can happen in parallel.
-  private static final Map<Map<Path, PartitionDesc>, Map<Path, PartitionDesc>> cache =
-      new ConcurrentHashMap<>();
+  private final Map<Map<Path, PartitionDesc>, Map<Path, PartitionDesc>> cache =

Review Comment:
   > how did you reproduce this in first place? Could you please share the query and tables definitions.
   > I'm not trying to test for concurrency issues, I just want to trigger SplitGrouper in a way that uses the cache.
   
   @deniskuzZ 
   My memory of this is a bit hazy, but the conditions needed to trigger the issue are as follows:
   
   1. Use LLAP or a persistent Tez DAG (reuse on YARN) to read a partitioned Iceberg table.
   2. The Iceberg table is just a regular Hadoop-catalog partitioned table. (We may still need to test the behavior with bucketing_version=1 or bucketing_version=2 set on the Iceberg external table.)
   
   However, the issue is not consistently reproducible. We had been using IcebergStorageHandler for a long time without encountering it.
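
One plausible reading of why the conditions above involve a long-lived process (LLAP or a reused Tez container) is that a `static` cache outlives any single query, while an instance field does not. This is not confirmed in this thread. The sketch below only illustrates that lifetime difference. The class names, the `lookup` method, and the `String` keys and values are hypothetical stand-ins, not the actual SplitGrouper code or its `Map<Path, PartitionDesc>` types.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CacheLifetimeSketch {

  // Hypothetical stand-in for the pre-change layout: one cache per JVM,
  // shared by every grouper instance created in that process.
  static class StaticCacheGrouper {
    private static final Map<String, String> CACHE = new ConcurrentHashMap<>();

    String lookup(String key, String freshValue) {
      // Returns the previously cached value if the key was seen before.
      return CACHE.computeIfAbsent(key, k -> freshValue);
    }
  }

  // Hypothetical stand-in for the post-change layout: one cache per instance.
  static class InstanceCacheGrouper {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    String lookup(String key, String freshValue) {
      return cache.computeIfAbsent(key, k -> freshValue);
    }
  }

  public static void main(String[] args) {
    // Two "queries" in the same JVM, each creating its own grouper.
    String s1 = new StaticCacheGrouper().lookup("tbl/part=1", "desc-from-query-1");
    String s2 = new StaticCacheGrouper().lookup("tbl/part=1", "desc-from-query-2");
    System.out.println("static:   " + s1 + " / " + s2);

    String i1 = new InstanceCacheGrouper().lookup("tbl/part=1", "desc-from-query-1");
    String i2 = new InstanceCacheGrouper().lookup("tbl/part=1", "desc-from-query-2");
    System.out.println("instance: " + i1 + " / " + i2);
  }
}
```

With the static field, the second grouper gets the entry cached by the first. With the instance field, each grouper starts from an empty cache. In a JVM that is torn down after every query the two layouts would behave the same, which would be consistent with the issue only appearing under LLAP or container reuse.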



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

