[
https://issues.apache.org/jira/browse/HIVE-29689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HIVE-29689:
----------------------------------
Labels: pull-request-available (was: )
> TestHiveSplitGenerator.testExceptionIsPropagatedFromSplitSerializer is flaky
> ----------------------------------------------------------------------------
>
> Key: HIVE-29689
> URL: https://issues.apache.org/jira/browse/HIVE-29689
> Project: Hive
> Issue Type: Bug
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Major
> Labels: pull-request-available
> Attachments: Screenshot 2026-06-28 at 20.44.20.png
>
>
> {code}
> Error
> Already running future in not supposed to be cancelled with the current implementation
> Stacktrace
> java.lang.AssertionError: Already running future in not supposed to be cancelled with the current implementation
> at org.junit.Assert.fail(Assert.java:89)
> at org.junit.Assert.assertTrue(Assert.java:42)
> at org.apache.hadoop.hive.ql.exec.tez.TestHiveSplitGenerator.testExceptionIsPropagatedFromSplitSerializer(TestHiveSplitGenerator.java:148)
> at java.base/java.lang.reflect.Method.invoke(Method.java:580)
> Standard Error
> 2026-06-27T01:24:09,955 WARN [main] conf.HiveConf: HiveConf of name hive.dummyparam.test.server.specific.config.override does not exist
> 2026-06-27T01:24:09,956 WARN [main] conf.HiveConf: HiveConf of name hive.dummyparam.test.server.specific.config.hivesite does not exist
> 2026-06-27T01:24:09,957 WARN [main] conf.HiveConf: HiveConf of name hive.dummyparam.test.server.specific.config.metastoresite does not exist
> 2026-06-27T01:24:09,980 INFO [main] tez.HiveSplitGenerator: SplitGenerator using llap affinitized locations: false locationProviderClass: org.apache.hadoop.hive.ql.exec.tez.HostAffinitySplitLocationProvider
> 2026-06-27T01:24:09,980 INFO [main] tez.HiveSplitGenerator: SplitLocationProvider: org.apache.hadoop.hive.ql.exec.tez.Utils$1@10c07b8d
> 2026-06-27T01:24:09,984 INFO [HiveSplitGenerator.SplitSerializer Thread - #1] tez.TestHiveSplitGenerator: Write split #1
> 2026-06-27T01:24:09,984 INFO [HiveSplitGenerator.SplitSerializer Thread - #1] tez.TestHiveSplitGenerator: Split #1 is about to throw exception
> 2026-06-27T01:24:10,986 ERROR [main] tez.HiveSplitGenerator: Exception while generating splits
> java.lang.RuntimeException: java.io.IOException: Cannot write file to path: file:/tmp/jenkins/tez/staging/.tez/application_1000_0200/events/hive_1782548648445/0_MRInput_InputDataInformationEvent_1
> at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator$SplitSerializer.lambda$write$0(HiveSplitGenerator.java:229)
> at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804)
> at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
> at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
> at java.base/java.lang.Thread.run(Thread.java:1583)
> Caused by: java.io.IOException: Cannot write file to path: file:/tmp/jenkins/tez/staging/.tez/application_1000_0200/events/hive_1782548648445/0_MRInput_InputDataInformationEvent_1
> at org.apache.hadoop.hive.ql.exec.tez.TestHiveSplitGenerator$HiveSplitGeneratorSerializerException$SplitSerializerWithException.writeSplit(TestHiveSplitGenerator.java:244)
> at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator$SplitSerializer.lambda$write$0(HiveSplitGenerator.java:226)
> ... 4 more
> {code}
> The test asserts the contract documented in
> HiveSplitGenerator.SplitSerializer: a write task that is already running when
> another task fails must not be cancelled, while a task that has not yet
> passed the failure guard is skipped. In the test this means
> split0Finished == true and split2Finished == false.
> It tries to set this up with three splits on an 8-thread executor:
> - Split #0: sleeps 1s, sets split0Finished = true on completion.
> - Split #1: throws IOException, which sets anyTaskFailed = true.
> - Split #2: enters write() after a 1s delay, so the runnable's
> !anyTaskFailed.get() guard short-circuits it and split2Finished stays false.
> The problem: every task body in SplitSerializer.write() is wrapped in if
> (!anyTaskFailed.get()) { writeSplit(...) }. There is no happens-before
> relation guaranteeing that split #0's runnable evaluates that guard before
> split #1's runnable runs to completion. With 8 threads available, the
> executor can schedule split #1 first; it sets anyTaskFailed = true before
> split #0's thread even reaches the guard. Split #0 then short-circuits —
> writeSplit is never called, Thread.sleep/split0Finished.set(true) never run —
> and the assertion fails.
> The CI log confirms this ordering: only Write split #1 is logged before the
> failure; the expected Write split #0 line is absent.
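
The ordering problem described above can be reproduced deterministically outside Hive. The following is a minimal sketch, not the actual HiveSplitGenerator or test code: the class GuardRaceSketch, its run method, and the two latches are hypothetical names introduced for illustration, and split #1's IOException is simplified to setting the flag directly. It assumes Java 9+ for CompletableFuture.orTimeout. With the latch forcing split #0 past the guard before split #1 fails, split #0 finishes; with the opposite ordering, split #0 is skipped, which matches the CI failure.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the guarded write tasks; not Hive code.
class GuardRaceSketch {

  // Returns true if the "split #0" body ran to completion.
  static boolean run(boolean split0PassesGuardFirst) {
    ExecutorService pool = Executors.newFixedThreadPool(8);
    AtomicBoolean anyTaskFailed = new AtomicBoolean(false);
    AtomicBoolean split0Finished = new AtomicBoolean(false);
    CountDownLatch split0PastGuard = new CountDownLatch(1);
    CountDownLatch split1Failed = new CountDownLatch(1);
    try {
      CompletableFuture<Void> split0 = CompletableFuture.runAsync(() -> {
        if (!split0PassesGuardFirst) {
          await(split1Failed); // force the flaky ordering: failure recorded first
        }
        if (!anyTaskFailed.get()) { // same shape of guard as described in the issue
          split0PastGuard.countDown();
          await(split1Failed);      // still running while split #1 fails
          split0Finished.set(true);
        }
      }, pool);
      CompletableFuture<Void> split1 = CompletableFuture.runAsync(() -> {
        if (split0PassesGuardFirst) {
          await(split0PastGuard);   // fail only once split #0 is past the guard
        }
        anyTaskFailed.set(true);    // simplification of "throws IOException"
        split1Failed.countDown();
      }, pool);
      CompletableFuture.allOf(split0, split1).orTimeout(10, TimeUnit.SECONDS).join();
      return split0Finished.get();
    } finally {
      pool.shutdownNow();
    }
  }

  // Waits on a latch with a timeout so a broken ordering cannot hang the run.
  private static void await(CountDownLatch latch) {
    try {
      if (!latch.await(5, TimeUnit.SECONDS)) {
        throw new IllegalStateException("latch timed out");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("split #0 past guard first: split0Finished=" + run(true));
    System.out.println("split #1 fails first:      split0Finished=" + run(false));
  }
}
```

One possible way to stabilize the test along these lines would be to have the failing split wait on a latch that split #0 releases from inside writeSplit, replacing the sleep-based timing. Whether the linked pull request takes this approach is not stated in the issue.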