[jira] [Assigned] (SPARK-34558) warehouse path should be resolved ahead of populating and use
[ https://issues.apache.org/jira/browse/SPARK-34558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan reassigned SPARK-34558:
-----------------------------------

    Assignee: Kent Yao

> warehouse path should be resolved ahead of populating and use
> -------------------------------------------------------------
>
>                 Key: SPARK-34558
>                 URL: https://issues.apache.org/jira/browse/SPARK-34558
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.2, 3.1.2
>            Reporter: Kent Yao
>            Assignee: Kent Yao
>            Priority: Major
>
> Currently, the warehouse path gets fully qualified on the caller side when
> creating a database, table, partition, etc. An unqualified path is populated
> into the Spark and Hadoop confs, which leads to inconsistent API behaviors.
> We should qualify it ahead of time, before it is populated and used.
> For example, when the value is a relative path such as
> `spark.sql.warehouse.dir=lakehouse`:
> If the default database is absent at runtime, the app fails with
> {code:java}
> Caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: file:./datalake
>       at org.apache.hadoop.fs.Path.initialize(Path.java:263)
>       at org.apache.hadoop.fs.Path.<init>(Path.java:254)
>       at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:133)
>       at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:137)
>       at org.apache.hadoop.hive.metastore.Warehouse.getWhRoot(Warehouse.java:150)
>       at org.apache.hadoop.hive.metastore.Warehouse.getDefaultDatabasePath(Warehouse.java:163)
>       at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB_core(HiveMetaStore.java:636)
>       at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:655)
>       at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:431)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
>       at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
>       at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:79)
>       ... 73 more
> {code}
> If the default database is present at runtime, the app can work with it, and
> if we create a database, its location gets fully qualified, for example
> {code:sql}
> spark-sql> create database test2 location 'datalake';
> 21/02/26 21:52:57 WARN ObjectStore: Failed to get database test2, returning NoSuchObjectException
> Time taken: 0.052 seconds
> spark-sql> desc database test;
> Database Name   test
> Comment
> Location        file:/Users/kentyao/Downloads/spark/spark-3.2.0-SNAPSHOT-bin-20210226/datalake/test.db
> Owner           kentyao
> Time taken: 0.023 seconds, Fetched 4 row(s)
> {code}
> Another issue is that the log becomes unclear, for example:
> {code:java}
> 21/02/27 13:54:17 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('datalake').
> 21/02/27 13:54:17 INFO SharedState: Warehouse path is 'datalake'.
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
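
The failure mode in the stack trace above can be reproduced with the JDK alone, which may help when reasoning about the fix. The sketch below is not Spark's or Hadoop's code: the class `WarehousePathSketch` and the helpers `tryBuildUri` and `qualify` are hypothetical names, and resolving against the current working directory with `java.nio.file` is only an assumed stand-in for however Spark actually qualifies the path. It shows that `java.net.URI` rejects a relative path once a scheme is attached, and that the same value is accepted after it has been made absolute.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.file.Paths;

class WarehousePathSketch {

    // Hypothetical helper: assemble a URI from a scheme and a path.
    // java.net.URI refuses a relative path when a scheme is present.
    public static String tryBuildUri(String scheme, String path) {
        try {
            return new URI(scheme, null, path, null, null).toString();
        } catch (URISyntaxException e) {
            return "rejected: " + e.getMessage();
        }
    }

    // Hypothetical helper: resolve the configured value against the current
    // working directory before it is stored or passed on (an assumption made
    // for illustration, using only the JDK).
    public static URI qualify(String warehouseDir) {
        return Paths.get(warehouseDir).toAbsolutePath().normalize().toUri();
    }

    public static void main(String[] args) {
        // Unqualified value: the scheme is added to a relative path.
        // prints: rejected: Relative path in absolute URI: file:./datalake
        System.out.println(tryBuildUri("file", "./datalake"));

        // Qualified ahead of use: the path is absolute, so the URI is valid.
        URI qualified = qualify("datalake");
        System.out.println(qualified);
        System.out.println(tryBuildUri(qualified.getScheme(), qualified.getPath()));
    }
}
```

Qualifying once, before the value is copied into any configuration, means every later consumer sees the same absolute location instead of each caller resolving it separately.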
[jira] [Assigned] (SPARK-34558) warehouse path should be resolved ahead of populating and use
[ https://issues.apache.org/jira/browse/SPARK-34558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-34558:
------------------------------------

    Assignee:     (was: Apache Spark)

> warehouse path should be resolved ahead of populating and use
> -------------------------------------------------------------
>
>                 Key: SPARK-34558
>                 URL: https://issues.apache.org/jira/browse/SPARK-34558
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.2, 3.1.2
>            Reporter: Kent Yao
>            Priority: Major
[jira] [Assigned] (SPARK-34558) warehouse path should be resolved ahead of populating and use
[ https://issues.apache.org/jira/browse/SPARK-34558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-34558:
------------------------------------

    Assignee: Apache Spark

> warehouse path should be resolved ahead of populating and use
> -------------------------------------------------------------
>
>                 Key: SPARK-34558
>                 URL: https://issues.apache.org/jira/browse/SPARK-34558
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.2, 3.1.2
>            Reporter: Kent Yao
>            Assignee: Apache Spark
>            Priority: Major