Sourabh Badhya created HIVE-26871:
-
Summary: TestCrudCompactorOnTez is flaky after HIVE-26479
Key: HIVE-26871
URL: https://issues.apache.org/jira/browse/HIVE-26871
Project: Hive
Issue Type: Bug
Components: Test
Reporter: Sourabh Badhya
Assignee: Sourabh Badhya
The 3 tests in TestCrudCompactorOnTez which use the ProtoLoggingHook run at
different times. Unfortunately, the 3 tests are run at the following times as
described in the logs -
Test 1 -
{code:java}
2022-12-15T16:57:44,294 INFO [main] compactor.TestCrudCompactorOnTez: Current
time: 2022-12-15T23:57:44 {code}
Test 2 -
{code:java}
2022-12-15T17:00:32,452 INFO [main] compactor.TestCrudCompactorOnTez: Current
time: 2022-12-16T00:00:32 {code}
Test 3 -
{code:java}
2022-12-15T17:04:12,895 INFO [main] compactor.TestCrudCompactorOnTez: Current
time: 2022-12-16T00:04:12 {code}
As we can see, the tests are run on 2 different dates. Therefore,
HiveProtoLoggingHook generates a unique event logs for every unique date. This
is the behaviour of HiveProtoLoggingHook.
[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/hooks/HiveProtoLoggingHook.java#L296-L310]
However the expectation from the test side, while generating the log readers is
that there must be a single file in the log folder defined.
[https://github.com/apache/hive/blob/master/ql/src/test/org/apache/hadoop/hive/ql/hooks/TestHiveProtoLoggingHook.java#L310]
Unfortunately, since there are 2 files which are generated (as mentioned in the
logs as well), the following tests fail -
{code:java}
2022-12-15T17:04:14,837 INFO [main] hooks.TestHiveProtoLoggingHook: List of
paths:
2022-12-15T17:04:14,837 INFO [main] hooks.TestHiveProtoLoggingHook:
file:/home/jenkins/agent/workspace/internal-hive-flaky-check/itests/hive-unit/target/tmp/junit441259831997042392/junit3438435196942546140/date=2022-12-15
2022-12-15T17:04:14,837 INFO [main] hooks.TestHiveProtoLoggingHook:
file:/home/jenkins/agent/workspace/internal-hive-flaky-check/itests/hive-unit/target/tmp/junit441259831997042392/junit3438435196942546140/date=2022-12-16
{code}
The solution is to make _getTestReader()_ in _TestHiveProtoLoggingHook_ more
compatible with multiple event log file scenario and be able to generate
multiple readers for all files present in the folder instead of fixating on a
single file clause.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)