[ 
https://issues.apache.org/jira/browse/HIVE-29689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor resolved HIVE-29689.
---------------------------------
    Resolution: Fixed

> TestHiveSplitGenerator.testExceptionIsPropagatedFromSplitSerializer is flaky
> ----------------------------------------------------------------------------
>
>                 Key: HIVE-29689
>                 URL: https://issues.apache.org/jira/browse/HIVE-29689
>             Project: Hive
>          Issue Type: Bug
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.3.0
>
>         Attachments: Screenshot 2026-06-28 at 20.44.20.png
>
>
> [https://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-6566/3/tests/]
> {code:java}
> Error
> Already running future in not supposed to be cancelled with the current 
> implementation
> Stacktrace
> java.lang.AssertionError: Already running future in not supposed to be 
> cancelled with the current implementation
>       at org.junit.Assert.fail(Assert.java:89)
>       at org.junit.Assert.assertTrue(Assert.java:42)
>       at 
> org.apache.hadoop.hive.ql.exec.tez.TestHiveSplitGenerator.testExceptionIsPropagatedFromSplitSerializer(TestHiveSplitGenerator.java:148)
>       at java.base/java.lang.reflect.Method.invoke(Method.java:580)
> Standard Error
> 2026-06-27T01:24:09,955  WARN [main] conf.HiveConf: HiveConf of name 
> hive.dummyparam.test.server.specific.config.override does not exist
> 2026-06-27T01:24:09,956  WARN [main] conf.HiveConf: HiveConf of name 
> hive.dummyparam.test.server.specific.config.hivesite does not exist
> 2026-06-27T01:24:09,957  WARN [main] conf.HiveConf: HiveConf of name 
> hive.dummyparam.test.server.specific.config.metastoresite does not exist
> 2026-06-27T01:24:09,980  INFO [main] tez.HiveSplitGenerator: SplitGenerator 
> using llap affinitized locations: false locationProviderClass: 
> org.apache.hadoop.hive.ql.exec.tez.HostAffinitySplitLocationProvider
> 2026-06-27T01:24:09,980  INFO [main] tez.HiveSplitGenerator: 
> SplitLocationProvider: org.apache.hadoop.hive.ql.exec.tez.Utils$1@10c07b8d
> 2026-06-27T01:24:09,984  INFO [HiveSplitGenerator.SplitSerializer Thread - 
> #1] tez.TestHiveSplitGenerator: Write split #1
> 2026-06-27T01:24:09,984  INFO [HiveSplitGenerator.SplitSerializer Thread - 
> #1] tez.TestHiveSplitGenerator: Split #1 is about to throw exception
> 2026-06-27T01:24:10,986 ERROR [main] tez.HiveSplitGenerator: Exception while 
> generating splits
> java.lang.RuntimeException: java.io.IOException: Cannot write file to path: 
> file:/tmp/jenkins/tez/staging/.tez/application_1000_0200/events/hive_1782548648445/0_MRInput_InputDataInformationEvent_1
>       at 
> org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator$SplitSerializer.lambda$write$0(HiveSplitGenerator.java:229)
>       at 
> java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804)
>       at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
>       at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
>       at java.base/java.lang.Thread.run(Thread.java:1583)
> Caused by: java.io.IOException: Cannot write file to path: 
> file:/tmp/jenkins/tez/staging/.tez/application_1000_0200/events/hive_1782548648445/0_MRInput_InputDataInformationEvent_1
>       at 
> org.apache.hadoop.hive.ql.exec.tez.TestHiveSplitGenerator$HiveSplitGeneratorSerializerException$SplitSerializerWithException.writeSplit(TestHiveSplitGenerator.java:244)
>       at 
> org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator$SplitSerializer.lambda$write$0(HiveSplitGenerator.java:226)
>       ... 4 more
> {code}
> The test asserts the contract documented in 
> HiveSplitGenerator.SplitSerializer: a write task that is already running when 
> another task fails must not be cancelled — i.e., split0Finished == true and 
> split2Finished == false.
> It tries to set this up with three splits on an 8-thread executor:
>  - Split #0: sleeps 1s, sets split0Finished = true on completion.
>  - Split #1: throws IOException, which sets anyTaskFailed = true.
>  - Split #2: enters write() after a 1s delay, so the runnable's 
> !anyTaskFailed.get() guard short-circuits it and split2Finished stays false.
> The problem: every task body in SplitSerializer.write() is wrapped in
> if (!anyTaskFailed.get()) { writeSplit(...) }.
> *There is no happens-before relation guaranteeing that split #0's runnable 
> evaluates that guard before split #1's runnable runs to completion.* With 8 
> threads available, the executor can schedule split #1 first; it sets 
> anyTaskFailed = true before split #0's thread even reaches the guard. Split 
> #0 then short-circuits: writeSplit is never called, neither Thread.sleep nor 
> split0Finished.set(true) runs, and the assertion fails.
> The CI log confirms this ordering: only Write split #1 is logged before the 
> failure; the expected Write split #0 line is absent.
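
The ordering problem described above can be reproduced deterministically in a small standalone model. The sketch below is not the real SplitSerializer and not necessarily the fix that was merged. It only assumes the guard shape quoted in the description (a task body skipped when anyTaskFailed is already set). The class name GuardOrderingSketch, the method split0Finishes, and the two latches are hypothetical names introduced for illustration. It shows that split #0 is skipped when split #1 fails first, and that one possible way to remove the race in a test is to hold back the failing task with a CountDownLatch until split #0 has passed the guard.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;

// Simplified model of the guard described in the issue; NOT the real SplitSerializer.
class GuardOrderingSketch {

  /** Returns true if the slow task (standing in for split #0) ran to completion. */
  static boolean split0Finishes(boolean forceSplit1First) {
    ExecutorService executor = Executors.newFixedThreadPool(8);
    AtomicBoolean anyTaskFailed = new AtomicBoolean(false);
    AtomicBoolean split0Finished = new AtomicBoolean(false);
    // Hypothetical latches used only to pin down the interleaving.
    CountDownLatch split0PastGuard = new CountDownLatch(1);
    CountDownLatch split1Failed = new CountDownLatch(1);
    try {
      CompletableFuture<Void> split0 = CompletableFuture.runAsync(() -> {
        if (forceSplit1First) {
          await(split1Failed); // reproduce the flaky ordering on purpose
        }
        if (!anyTaskFailed.get()) { // same guard shape as in the description
          split0PastGuard.countDown();
          sleepMillis(100); // stands in for the slow write
          split0Finished.set(true);
        }
      }, executor);

      CompletableFuture<Void> split1 = CompletableFuture.runAsync(() -> {
        if (!forceSplit1First) {
          await(split0PastGuard); // fail only after split #0 is already running
        }
        anyTaskFailed.set(true);
        split1Failed.countDown();
        throw new RuntimeException("simulated write failure for split #1");
      }, executor);

      split0.join();
      split1.exceptionally(t -> null).join(); // swallow the simulated failure
      return split0Finished.get();
    } finally {
      executor.shutdownNow();
    }
  }

  private static void await(CountDownLatch latch) {
    try {
      latch.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }

  private static void sleepMillis(long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // Split #1 fails before split #0 reaches the guard: split #0 is skipped.
    System.out.println("split #1 first -> split0Finished = " + split0Finishes(true));
    // Split #1 is held back until split #0 is past the guard: split #0 finishes.
    System.out.println("split #0 first -> split0Finished = " + split0Finishes(false));
  }
}
```

With the latch in place the outcome no longer depends on which thread the executor happens to run first, which is the property the original test setup relied on without enforcing.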



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
