[ 
https://issues.apache.org/jira/browse/IMPALA-10906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang-Yu Rao updated IMPALA-10906:
---------------------------------
    Description: 
HIVE-24852 added a dependency on the artifact {{hadoop-hdfs}} to the 
{{hive-service}} module, which is a dependency of {{hive-jdbc}}, on which 
Impala relies. Because {{hadoop-hdfs}} transitively pulls in the artifact 
{{jersey-server}}, which is a banned dependency in Impala, we had to explicitly 
exclude {{jersey-server}} when adding {{hive-jdbc}} as a dependency so that 
the Impala frontend can still be compiled the next time we bump 
{{CDP_BUILD_NUMBER}} to a build that includes this Hive patch.

Moreover, after HIVE-24852, creating a partitioned Iceberg table requires the 
class {{org.apache.hadoop.hdfs.protocol.SnapshotException}} to be on Impala's 
classpath at runtime, so we had to explicitly add a dependency on the artifact 
{{hadoop-hdfs}} to keep such an operation from failing with a 
{{NoClassDefFoundError}}. We also had to explicitly exclude some banned 
artifacts that {{hadoop-hdfs}} transitively pulls in so that the Impala 
frontend can be compiled.
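
As a quick way to confirm whether the class is visible at runtime, a minimal sketch of a classpath probe follows. The class name {{ClasspathProbe}} and the method {{isOnClasspath}} are hypothetical helpers introduced only for illustration and are not part of Impala or Hive. The probe only reports whether the class can be found by the probe's own class loader, so it is meaningful only when run with the same classpath as the Impala frontend.

```java
public class ClasspathProbe {

    /**
     * Returns true if the named class can be located by this class's
     * class loader. The class is looked up without being initialized.
     */
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Missing class, or a class whose own dependencies cannot be linked.
            return false;
        }
    }

    public static void main(String[] args) {
        // The class that partitioned Iceberg table creation needs after HIVE-24852.
        String name = "org.apache.hadoop.hdfs.protocol.SnapshotException";
        System.out.println(name + (isOnClasspath(name) ? " found" : " MISSING"));
    }
}
```

If the probe reports the class as missing on the frontend's classpath, the {{hadoop-hdfs}} artifact is most likely not being pulled in.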

For easy reference, a query that reproduces the {{NoClassDefFoundError}} is 
provided below. A very similar query is executed while loading Impala's test 
data, so once the dependencies are properly revised, we expect Impala to be 
able to load the test data correctly.
{code:sql}
[localhost:21050] default> CREATE TABLE IF NOT EXISTS default.iceberg_int_partitioned_tbl (i INT, j INT, k INT)
                         > PARTITION BY SPEC (i identity, j identity)
                         > STORED AS ICEBERG;
Query: CREATE TABLE IF NOT EXISTS default.iceberg_int_partitioned_tbl (i INT, j INT, k INT)
PARTITION BY SPEC (i identity, j identity)
STORED AS ICEBERG
ERROR: NoClassDefFoundError: org/apache/hadoop/hdfs/protocol/SnapshotException
CAUSED BY: ClassNotFoundException: org.apache.hadoop.hdfs.protocol.SnapshotException
{code}

  was:
HIVE-24852 added the dependency on the artifact {{hadoop-hdfs}} under the 
module of {{hive-service}}, which is a dependency of {{hive-jdbc}} that Impala 
relies on. However, since {{hadoop-hdfs}} transitively pulls in the artifact 
{{jersey-server}}, which is a banned dependency by Impala, we had to explicitly 
exclude {{jersey-server}} when adding {{hive-jdbc}} as a dependency so that 
Impala frontend could be compiled next time when we bump up 
{{CDP_BUILD_NUMBER}} that includes this Hive patch.

Moreover, due to the fact that after HIVE-24852, the creation of a partitioned 
iceberg table requires the existence of the class 
{{org.apache.hadoop.hdfs.protocol.SnapshotException}} on Impala's classpath at 
runtime, we had to explicitly add the dependency on the artifact 
{{hadoop-hdfs}} so that such an operation will not result in a 
{{NoClassDefFoundError}}. We also need to  explicitly excluded some banned 
artifacts that were transitively pulled in by {{hadoop-hdfs}} so that Impala 
frontend could be compiled.

For easy reference, a query to reproduce the {{NoClassDefFoundError}} issue is 
provided in the following. A very similar query has been executed during the 
loading of Impala's test data so that once the dependencies are properly 
revised, we expect Impala to be able to load test data correctly.
{code:sql}
[localhost:21050] default> CREATE TABLE IF NOT EXISTS 
default.iceberg_int_partitioned_tbl (i INT, j INT, k INT)
                         > PARTITION BY SPEC (i identity, j identity)
                         > STORED AS ICEBERG;
Query: CREATE TABLE IF NOT EXISTS default.iceberg_int_partitioned_tbl (i INT, j 
INT, k INT)
PARTITION BY SPEC (i identity, j identity)
STORED AS ICEBERG
ERROR: NoClassDefFoundError: org/apache/hadoop/hdfs/protocol/SnapshotException
CAUSED BY: ClassNotFoundException: 
org.apache.hadoop.hdfs.protocol.SnapshotException
{code}



> Adjust dependencies after HIVE-24852
> ------------------------------------
>
>                 Key: IMPALA-10906
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10906
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: Fang-Yu Rao
>            Assignee: Fang-Yu Rao
>            Priority: Major


