[ https://issues.apache.org/jira/browse/IMPALA-14657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18057638#comment-18057638 ]
ASF subversion and git services commented on IMPALA-14657:
----------------------------------------------------------
Commit a073dd22b3e4da451efbf6cf27d38d743d5df1d3 in impala's branch
refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=a073dd22b ]
IMPALA-14657: Use thread-safe set for droppedPartitions_
droppedPartitions_ of HdfsTable keeps the partition instances that were
recently dropped but haven't yet been sent in the catalog updates. The
catalog update thread clears the set after collecting these deletions.
However, that thread only acquires the read lock on the table, so the
clear can run while other readers are iterating the set. We therefore
need a thread-safe set to avoid breaking readers like
toMinimalTCatalogObject().
This changes droppedPartitions_ to use a thread-safe set.
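As a rough illustration of why a thread-safe set helps here, the sketch
below contrasts a plain HashSet with a set created by
ConcurrentHashMap.newKeySet(). The commit message does not say which
implementation was chosen, so ConcurrentHashMap.newKeySet(), the class
name, the method name, and the string partition names are all assumptions
made for the example. The clear() call happens on the iterating thread
only to keep the demo deterministic; in the reported bug it comes from
the catalog update thread.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; not part of Impala.
class DroppedPartitionsSketch {
  // Fills 'dropped' with three placeholder partition names, then iterates
  // over it and clears it after visiting the first element. This mimics a
  // reader overlapping with a reset of the dropped-partition set.
  // Returns true if the iteration fails with ConcurrentModificationException.
  static boolean failsWhenClearedDuringIteration(Set<String> dropped) {
    dropped.addAll(Arrays.asList("p=1", "p=2", "p=3"));
    try {
      boolean cleared = false;
      for (String part : dropped) {
        if (!cleared) {
          dropped.clear();
          cleared = true;
        }
      }
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    // HashSet iterators are fail-fast: a structural change during iteration
    // is detected on the next call to next().
    System.out.println("HashSet fails: "
        + failsWhenClearedDuringIteration(new HashSet<>()));
    // Iterators of ConcurrentHashMap key sets are weakly consistent and do
    // not throw ConcurrentModificationException.
    System.out.println("Concurrent set fails: "
        + failsWhenClearedDuringIteration(ConcurrentHashMap.<String>newKeySet()));
  }
}
```

A weakly consistent iterator may or may not see elements removed during
the iteration, so a reader could still report a partition that is being
cleared at the same moment. The sketch only shows that the reader no
longer fails.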
Tests
- Ran TestPartitionDeletion::test_local_catalog_no_event_processing 40 times.
Change-Id: I12ff7a57c269ee387c1e41048a9e0a6679a586c3
Reviewed-on: http://gerrit.cloudera.org:8080/23957
Reviewed-by: Csaba Ringhofer <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> ConcurrentModificationException in HdfsTable.toMinimalTCatalogObject() when
> iterating droppedPartitions_
> --------------------------------------------------------------------------------------------------------
>
> Key: IMPALA-14657
> URL: https://issues.apache.org/jira/browse/IMPALA-14657
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Critical
>
> Saw a failure in test_local_catalog_no_event_processing due to
> ConcurrentModificationException in HdfsTable.toMinimalTCatalogObject() when
> iterating droppedPartitions_:
> {code:python}
> custom_cluster/test_partition.py:115: in test_local_catalog_no_event_processing
>     self._test_partition_deletion(unique_database)
> custom_cluster/test_partition.py:178: in _test_partition_deletion
>     self.client.execute("invalidate metadata " + tbl)
> common/impala_connection.py:692: in execute
>     cursor.execute(sql_stmt, configuration=self.__query_options)
> ../infra/python/env-gcc10.4.0-py3/lib/python3.8/site-packages/impala/hiveserver2.py:394: in execute
>     self._wait_to_finish() # make execute synchronous
> ../infra/python/env-gcc10.4.0-py3/lib/python3.8/site-packages/impala/hiveserver2.py:484: in _wait_to_finish
>     raise OperationalError(resp.errorMessage)
> E   impala.error.OperationalError: Query b247acecf48556b1:e9cc935d00000000 failed:
> E   ConcurrentModificationException: null{code}
> The exception:
> {noformat}
> I20260103 00:55:15.206120 4188041 jni-util.cc:321] b247acecf48556b1:e9cc935d00000000] java.util.ConcurrentModificationException
>     at java.util.HashMap$HashIterator.nextNode(HashMap.java:1469)
>     at java.util.HashMap$KeyIterator.next(HashMap.java:1493)
>     at org.apache.impala.catalog.HdfsTable.toMinimalTCatalogObject(HdfsTable.java:2238)
>     at org.apache.impala.catalog.CatalogServiceCatalog.addIncompleteTable(CatalogServiceCatalog.java:2778)
>     at org.apache.impala.catalog.CatalogServiceCatalog.invalidateTable(CatalogServiceCatalog.java:3421)
>     at org.apache.impala.service.CatalogOpExecutor.execResetMetadataImpl(CatalogOpExecutor.java:7374)
>     at org.apache.impala.service.CatalogOpExecutor.execResetMetadata(CatalogOpExecutor.java:7303)
>     at org.apache.impala.service.JniCatalog.lambda$resetMetadata$4(JniCatalog.java:331)
>     at org.apache.impala.service.JniCatalogOp.lambda$execAndSerialize$1(JniCatalogOp.java:90)
>     at org.apache.impala.service.JniCatalogOp.execOp(JniCatalogOp.java:58)
>     at org.apache.impala.service.JniCatalogOp.execAndSerialize(JniCatalogOp.java:89)
>     at org.apache.impala.service.JniCatalogOp.execAndSerialize(JniCatalogOp.java:100)
>     at org.apache.impala.service.JniCatalog.execAndSerialize(JniCatalog.java:243)
>     at org.apache.impala.service.JniCatalog.execAndSerialize(JniCatalog.java:257)
>     at org.apache.impala.service.JniCatalog.resetMetadata(JniCatalog.java:330){noformat}
> The code is
> {code:java}
> 2236     // Adds the recently dropped partitions that are not yet synced to the catalog
> 2237     // topic.
> 2238     for (HdfsPartition part : droppedPartitions_) {
> 2239       hdfsTable.addToDropped_partitions(part.toMinimalTHdfsPartition());
> 2240     }
> {code}
> [https://github.com/apache/impala/blob/1970cc709/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L2238]
> The set could be cleared concurrently by the catalog topic update thread in
> {code:java}
> private void addHdfsPartitionsToCatalogDelta(HdfsTable hdfsTable,
> GetCatalogDeltaContext ctx) throws TException {
> ...
> hdfsTable.resetDroppedPartitions();
> {code}
> [https://github.com/apache/impala/blob/1970cc709/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L2032]
> Both threads hold only the read lock of the table, so neither blocks the other.
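To make the last point concrete, here is a minimal sketch of two threads
holding the same read lock at the same time. It assumes the table lock
behaves like a java.util.concurrent.locks.ReentrantReadWriteLock; the
class name, method name, and thread names are hypothetical and only echo
the two code paths described above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class; not part of Impala.
class SharedReadLockSketch {
  // Starts two reader threads and returns the number of read holds observed
  // while both of them are inside the read-locked section.
  static int readHoldsWithTwoReaders() {
    ReentrantReadWriteLock tableLock = new ReentrantReadWriteLock();
    CountDownLatch bothHolding = new CountDownLatch(2);
    CountDownLatch release = new CountDownLatch(1);
    Runnable reader = () -> {
      tableLock.readLock().lock();
      try {
        bothHolding.countDown();  // signal that this thread holds the lock
        release.await();          // keep holding until the check is done
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        tableLock.readLock().unlock();
      }
    };
    Thread invalidate = new Thread(reader, "invalidate-metadata");
    Thread topicUpdate = new Thread(reader, "catalog-topic-update");
    invalidate.start();
    topicUpdate.start();
    try {
      bothHolding.await();
      // Both readers are inside the locked section at this point.
      int holds = tableLock.getReadLockCount();
      release.countDown();
      invalidate.join();
      topicUpdate.join();
      return holds;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for readers", e);
    }
  }

  public static void main(String[] args) {
    System.out.println("concurrent read holds: " + readHoldsWithTwoReaders());
  }
}
```

Because a read lock is shared, it does not serialize a reader that
iterates the set with another reader that clears it. Any mutation done
under the read lock therefore needs its own protection, such as a
thread-safe collection.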
--
This message was sent by Atlassian Jira
(v8.20.10#820010)