[ 
https://issues.apache.org/jira/browse/SPARK-47881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jungho.choi updated SPARK-47881:
--------------------------------
    Description: 
I am trying to use Hive Metastore version 3.1.3 with Spark version 3.4.2, but 
I am encountering an error when specifying the path to the metastore JARs on HDFS.

Following the official documentation, I specified the path using an HDFS URI:
{code:java}
spark.sql.hive.metastore.version     3.1.3
spark.sql.hive.metastore.jars        path
spark.sql.hive.metastore.jars.path   hdfs://namespace/spark/hive3_lib/* {code}
However, when I tested it, I encountered an error from HiveClientImpl.scala 
stating that the URI scheme is not "file".
{code:java}
Caused by: java.lang.ExceptionInInitializerError: java.lang.IllegalArgumentException: URI scheme is not "file"
  at org.apache.spark.sql.hive.client.HiveClientImpl$.newHiveConf(HiveClientImpl.scala:1296)
  at org.apache.spark.sql.hive.client.HiveClientImpl.newState(HiveClientImpl.scala:174)
  at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:139)
  at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
  at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:315)
  at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:517)
  at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:377)
  at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:70)
  at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:69)
  at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$databaseExists$1(HiveExternalCatalog.scala:223)
  at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
  at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:101)
  ... 143 more {code}
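For illustration, the "URI scheme is not \"file\"" message matches the check in the {{java.io.File(URI)}} constructor, which accepts only file-scheme URIs, so the failure appears to happen when the jar URI is converted to a local {{File}}. A minimal Java sketch (the jar name below is a made-up placeholder) reproduces the same exception:
{code:java}
```java
import java.io.File;
import java.net.URI;

public class UriSchemeDemo {
    public static void main(String[] args) {
        // java.io.File(URI) accepts only the "file" scheme, so an
        // hdfs:// URI fails the same way as in the stack trace above.
        try {
            new File(URI.create("hdfs://namespace/spark/hive3_lib/some.jar"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // URI scheme is not "file"
        }

        // The equivalent local file URI is accepted.
        File ok = new File(URI.create("file:///spark/hive3_lib/some.jar"));
        System.out.println(ok.getPath());
    }
}
```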
To resolve this, I changed spark.sql.hive.metastore.jars.path to a local file 
path instead of an HDFS path, and it worked fine. I believe I followed the 
instructions correctly, so is there any specific configuration required to use 
HDFS paths?
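
For reference, the local-path variant that worked for me looked like the 
following (the directory below is a placeholder, not my actual path):
{code:java}
spark.sql.hive.metastore.version     3.1.3
spark.sql.hive.metastore.jars        path
spark.sql.hive.metastore.jars.path   file:///opt/spark/hive3_lib/*.jar {code}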



> Not working HDFS path for hive.metastore.jars.path
> --------------------------------------------------
>
>                 Key: SPARK-47881
>                 URL: https://issues.apache.org/jira/browse/SPARK-47881
>             Project: Spark
>          Issue Type: Question
>          Components: SQL
>    Affects Versions: 3.4.2
>            Reporter: jungho.choi
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
