[
https://issues.apache.org/jira/browse/ATLAS-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18085060#comment-18085060
]
Paresh Devalia edited comment on ATLAS-5002 at 6/1/26 4:09 AM:
---------------------------------------------------------------
Team
The key point is that those 11 disabled HiveHookIT tests are not simple unit
tests—they execute real Hive operations (CTAS, INSERT ... SELECT,
lineage-producing queries, export/import workflows, etc.) that require Hive to
generate and execute a physical plan. During that process, Hive 3.1.3
serializes execution plans using Kryo.
On JDK 9+ (including JDK 17), Hive 3.1.3 can fail during plan serialization
because its Kryo serializer assumes the existence of the internal field:
{code:java}
java.util.ArrayList$SubList.parentOffset{code}
That field no longer exists in newer JDK implementations, resulting in:
{code:java}
MapRedTask
-> NoSuchFieldException: parentOffset {code}
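To confirm this on a given JDK without running Hive at all, a small reflection probe can check whether the field is present. This is only a sketch: the class name {{SubListFieldProbe}} and the helper {{hasDeclaredField}} are hypothetical and are not part of Atlas or Hive. It only looks the field up and does not reproduce Hive's Kryo serializer.

```java
public class SubListFieldProbe {

    // Hypothetical helper: returns true if the named class declares a field
    // with the given name. Only getDeclaredField is used (no setAccessible),
    // so the lookup itself does not depend on --add-opens.
    public static boolean hasDeclaredField(String className, String fieldName) {
        try {
            Class.forName(className).getDeclaredField(fieldName);
            return true;
        } catch (ClassNotFoundException | NoSuchFieldException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String subList = "java.util.ArrayList$SubList";
        System.out.println("java.specification.version = "
                + System.getProperty("java.specification.version"));
        // Per the discussion above, this is expected to be false on JDK 9+.
        System.out.println(subList + ".parentOffset present: "
                + hasDeclaredField(subList, "parentOffset"));
    }
}
```

Running it under both the JDK 8 and the JDK 17 toolchains should show whether the field the serializer expects is available in each environment.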
As you noted, this occurs before Atlas Hive hook validation is reached.
Therefore:
* The failing tests do *not* indicate a defect in the Atlas Hive hook.
* Additional JVM options such as {{--add-opens}} can resolve reflective-access
problems but cannot fix a serializer that depends on a field that was removed
from the JDK.
* Re-enabling the tests while keeping Hive 3.1.3 + JDK 17 will likely produce
infrastructure failures rather than meaningful Atlas validation failures.
h3. Practical options
# *Keep the tests disabled on JDK 17*
** Most conservative approach.
** Document that the failures are caused by Hive runtime incompatibility, not
Atlas functionality.
# *Run those integration tests on an older JDK*
** Typically JDK 8 is the safest environment for Hive 3.1.3.
** Allows the tests to exercise real execution and Atlas lineage registration.
# *Use a Hive build that contains the HIVE-22097 fix*
** Requires building Hive from the maintained 3.1 branch (if the fix was
backported there) or moving to a newer Hive line where the fix exists.
** Maven Central's published 3.1.3 artifacts do not include the fix.
# *Upgrade Atlas to a Hive version that officially supports modern JDKs*
** Larger effort, but removes the underlying compatibility issue instead of
working around it.
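Options 1 and 2 both come down to running the execution-dependent tests only on a JDK where Hive 3.1.3 can serialize its plan. A minimal sketch of such a gate is shown here. {{HiveExecutionGate}}, {{majorVersion}} and {{hiveExecutionSupported}} are hypothetical names, and the "JDK 8 or older" threshold is an assumption taken from this comment, not a rule published by Hive. How the result is wired into the test framework (for example, skipping the affected methods) is left out.

```java
public class HiveExecutionGate {

    // Converts a java.specification.version value to a major version:
    // "1.8" -> 8, "9" -> 9, "17" -> 17.
    public static int majorVersion(String specVersion) {
        String v = specVersion.trim();
        if (v.startsWith("1.")) {
            v = v.substring(2);          // legacy "1.x" scheme used up to JDK 8
        }
        int dot = v.indexOf('.');
        if (dot >= 0) {
            v = v.substring(0, dot);     // drop any minor component
        }
        return Integer.parseInt(v);
    }

    // Assumption for illustration: only JDK 8 and older are treated as safe
    // for Hive 3.1.3 plan serialization.
    public static boolean hiveExecutionSupported(int major) {
        return major <= 8;
    }

    public static void main(String[] args) {
        int major = majorVersion(System.getProperty("java.specification.version"));
        System.out.println("JDK major version: " + major);
        System.out.println("Run Hive execution tests: " + hiveExecutionSupported(major));
    }
}
```

The version is parsed from {{java.specification.version}} rather than read through {{Runtime.version()}} because the latter does not exist on JDK 8, which is exactly the environment where the tests are expected to run.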
So if your goal is to re-enable {{testCTAS}}, {{testInsertIntoTable}},
{{testColumnLevelLineage}}, {{testLineage}}, and the other
distributed-execution tests, the prerequisite is not an Atlas code change; it is
a Hive runtime whose plan serialization is compatible with the JDK being used.
was (Author: pareshd):
Team
We have skipped some test cases in the hive-bridge addons module.
The 11 disabled HiveHookIT methods (e.g. testCTAS, testInsertIntoTable,
testColumnLevelLineage, testLineage) run real Hive SQL that triggers MapReduce
(or equivalent distributed execution): CTAS, INSERT … SELECT, export/import
with data movement, etc. They assert that the Atlas Hive hook registers
processes, lineage, and entities after execution.
They fail on the current stack (Hive 3.1.3 + JDK 17):
Atlas today uses Hive 3.1.3 (pom.xml). On JDK 9+, local MR still serializes the
query plan with Kryo. Hive 3.1.3’s serializer expects
java.util.ArrayList$SubList.parentOffset, which does not exist on modern JDKs
(HIVE-22097). Typical failure:
MapRedTask → NoSuchFieldException: parentOffset
That is a Hive + JVM compatibility bug, not an Atlas hook bug. Extra
--add-opens fixes other JDK issues (e.g. CopyOnFirstWriteProperties) but cannot
restore a removed field.
Maven Central has no newer 3.1.x release that includes this fix; the patch
exists on Hive’s 3.1 branch and in 4.x lineages, not in published 3.1.3
artifacts.
If you re-enable those tests, you need a Hive runtime where plan serialization
works on your JDK.
> Support Java 17 for build and runtime
> -------------------------------------
>
> Key: ATLAS-5002
> URL: https://issues.apache.org/jira/browse/ATLAS-5002
> Project: Atlas
> Issue Type: Bug
> Components: atlas-core
> Reporter: Paresh Devalia
> Assignee: Paresh Devalia
> Priority: Major
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> Currently only Java 8 is supported. Java 17 is a major LTS version of Java
> and adding support would modernize our Java version support.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)