[jira] [Assigned] (SPARK-31709) Proper base path for location when it is a relative path
     [ https://issues.apache.org/jira/browse/SPARK-31709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan reassigned SPARK-31709:
-----------------------------------

    Assignee: Kent Yao

> Proper base path for location when it is a relative path
> --------------------------------------------------------
>
>                 Key: SPARK-31709
>                 URL: https://issues.apache.org/jira/browse/SPARK-31709
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.5, 3.0.0, 3.1.0
>            Reporter: Kent Yao
>            Assignee: Kent Yao
>            Priority: Major
>
> Currently, the user home directory is used as the base path for the database
> and table locations when their location is specified with a relative path,
> e.g.
> {code:sql}
> > set spark.sql.warehouse.dir;
> spark.sql.warehouse.dir  file:/Users/kentyao/Downloads/spark/spark-3.1.0-SNAPSHOT-bin-20200512/spark-warehouse/
> spark-sql> create database loctest location 'loctestdbdir';
> spark-sql> desc database loctest;
> Database Name  loctest
> Comment
> Location       file:/Users/kentyao/Downloads/spark/spark-3.1.0-SNAPSHOT-bin-20200512/loctestdbdir
> Owner          kentyao
> spark-sql> create table loctest(id int) location 'loctestdbdir';
> spark-sql> desc formatted loctest;
> id  int  NULL
> # Detailed Table Information
> Database      default
> Table         loctest
> Owner         kentyao
> Created Time  Thu May 14 16:29:05 CST 2020
> Last Access   UNKNOWN
> Created By    Spark 3.1.0-SNAPSHOT
> Type          EXTERNAL
> Provider      parquet
> Location      file:/Users/kentyao/Downloads/spark/spark-3.1.0-SNAPSHOT-bin-20200512/loctestdbdir
> Serde Library org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
> InputFormat   org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
> OutputFormat  org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
> {code}
> The user home directory is not always related to the warehouse, cannot be
> changed at runtime, and is shared by both databases and tables as the parent
> directory. Meanwhile, we use the table path as the parent directory for
> relative partition locations.
> The config `spark.sql.warehouse.dir` represents the default location for
> managed databases and tables. For databases, the case above does not seem to
> follow its semantics. For tables it is right, but I suggest enriching its
> meaning so that it also applies to external tables with relative paths for
> locations.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
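
To make the suggested semantics concrete, here is a minimal sketch in Python of qualifying a relative location against a configurable base such as the value of `spark.sql.warehouse.dir`, instead of a fixed directory. The helper name `qualify_location` is hypothetical and this is not Spark's implementation; it only illustrates the intended resolution rule, assuming POSIX-style paths.

```python
import posixpath
from urllib.parse import urlsplit


def qualify_location(location: str, base_uri: str) -> str:
    """Resolve `location` against `base_uri` when it is a relative path.

    Hypothetical helper for illustration only (not a Spark API).
    Locations that already carry a scheme or are absolute are returned as-is.
    """
    if urlsplit(location).scheme or posixpath.isabs(location):
        return location

    base = urlsplit(base_uri)
    # Join the relative location onto the base path and normalize it.
    path = posixpath.normpath(posixpath.join(base.path, location))

    # Rebuild the prefix by hand so that "file:/..." keeps its original form.
    prefix = f"{base.scheme}:" if base.scheme else ""
    if base.netloc:
        prefix += f"//{base.netloc}"
    return prefix + path


if __name__ == "__main__":
    warehouse = (
        "file:/Users/kentyao/Downloads/spark/"
        "spark-3.1.0-SNAPSHOT-bin-20200512/spark-warehouse/"
    )
    # A relative database location lands under the warehouse directory.
    print(qualify_location("loctestdbdir", warehouse))
```

With the warehouse directory from the example above, the relative location `loctestdbdir` would resolve to `file:/Users/kentyao/Downloads/spark/spark-3.1.0-SNAPSHOT-bin-20200512/spark-warehouse/loctestdbdir`. The same rule mirrors how a relative partition location is resolved against its table path: pass the table location as the base.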
[jira] [Assigned] (SPARK-31709) Proper base path for location when it is a relative path
     [ https://issues.apache.org/jira/browse/SPARK-31709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-31709:
------------------------------------

    Assignee: Apache Spark

> Proper base path for location when it is a relative path
> --------------------------------------------------------
>
>                 Key: SPARK-31709
>                 URL: https://issues.apache.org/jira/browse/SPARK-31709
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.5, 3.0.0, 3.1.0
>            Reporter: Kent Yao
>            Assignee: Apache Spark
>            Priority: Major
[jira] [Assigned] (SPARK-31709) Proper base path for location when it is a relative path
     [ https://issues.apache.org/jira/browse/SPARK-31709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-31709:
------------------------------------

    Assignee:     (was: Apache Spark)

> Proper base path for location when it is a relative path
> --------------------------------------------------------
>
>                 Key: SPARK-31709
>                 URL: https://issues.apache.org/jira/browse/SPARK-31709
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.5, 3.0.0, 3.1.0
>            Reporter: Kent Yao
>            Priority: Major