[ https://issues.apache.org/jira/browse/SPARK-30411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon updated SPARK-30411:
---------------------------------
    Description: 
{code}
-bash-4.2$ hdfs dfs -ls /tmp | grep my_databases
drwxr-x--T - redsanket users 0 2019-12-04 20:15 /tmp/my_databases
{code}
{code}
>>> spark.sql("CREATE TABLE redsanket_db.example(bcookie string, ip int) STORED AS orc");
{code}
{code}
-bash-4.2$ hdfs dfs -ls /tmp/my_databases | grep example
drwxr-x--T - redsanket users 0 2019-12-04 20:20 /tmp/my_databases/example
{code}
Now, after {{saveAsTable}}:
{code}
>>> data = [('First', 1), ('Second', 2), ('Third', 3), ('Fourth', 4), ('Fifth', 5)]
>>> df = spark.createDataFrame(data)
>>> df.write.format("orc").mode('overwrite').saveAsTable('redsanket_db.example')
{code}
{code}
-bash-4.2$ hdfs dfs -ls /tmp/my_databases | grep example
drwx------ - redsanket users 0 2019-12-04 20:23 /tmp/my_databases/example
{code}
{{saveAsTable}} overwrites the permissions of the table directory.
{{insertInto}}, in contrast, preserves the permissions inherited from the parent directory:
{code}
>>> spark.sql("DROP table redsanket_db.example");
DataFrame[]
>>> spark.sql("CREATE TABLE redsanket_db.example(bcookie string, ip int) STORED AS orc");
DataFrame[]
>>> df.write.format("orc").insertInto('redsanket_db.example')
{code}
{code}
-bash-4.2$ hdfs dfs -ls /tmp/my_databases | grep example
drwxr-x--T - redsanket users 0 2019-12-04 20:43 /tmp/my_databases/example
{code}
Either this is a limitation of the API that depends on the write mode, in which case the behavior has to be documented, or it is a bug that needs to be fixed.

> saveAsTable does not honor spark.hadoop.hive.warehouse.subdir.inherit.perms
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-30411
>                 URL: https://issues.apache.org/jira/browse/SPARK-30411
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.4
>            Reporter: Sanket Reddy
>            Priority: Minor
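
The permission change reported above can be made explicit by converting the mode strings from the {{hdfs dfs -ls}} output to octal. The following is a minimal sketch in plain Python, not part of Spark or Hadoop; the helper names {{parse_ls_line}}, {{symbolic_to_octal}} and {{permissions_changed}} are hypothetical, and the parser assumes the eight-field listing format shown in this issue with a path that contains no spaces.

```python
def parse_ls_line(line):
    """Return (mode_string, path) from one `hdfs dfs -ls` entry.

    Assumes the layout seen in the issue:
    mode, replication, owner, group, size, date, time, path.
    """
    fields = line.split()
    if len(fields) < 8:
        raise ValueError("unexpected ls line: %r" % line)
    return fields[0], fields[-1]


def symbolic_to_octal(mode):
    """Convert a mode string such as 'drwxr-x--T' to an integer.

    Only the sticky bit ('t' or 'T' in the last position) is handled
    as a special bit; setuid/setgid markers are not considered here.
    """
    perms = mode[1:]  # drop the file-type character ('d', '-')
    if len(perms) != 9:
        raise ValueError("unexpected mode string: %r" % mode)
    bits = 0
    for i, ch in enumerate(perms):
        shift = 8 - i  # the first character maps to the highest bit
        if i % 3 == 2:
            # Execute slot: 'x' and lowercase 't' both imply execute.
            if ch in "xt":
                bits |= 1 << shift
        elif ch != "-":
            # Read or write slot.
            bits |= 1 << shift
    if perms[8] in "tT":
        bits |= 0o1000  # sticky bit
    return bits


def permissions_changed(before_line, after_line):
    """Return (changed, before_octal, after_octal) for two ls entries."""
    before = symbolic_to_octal(parse_ls_line(before_line)[0])
    after = symbolic_to_octal(parse_ls_line(after_line)[0])
    return before != after, format(before, "04o"), format(after, "04o")


# Listings copied from the issue: before and after saveAsTable.
before = "drwxr-x--T - redsanket users 0 2019-12-04 20:20 /tmp/my_databases/example"
after = "drwx------ - redsanket users 0 2019-12-04 20:23 /tmp/my_databases/example"

changed, old_mode, new_mode = permissions_changed(before, after)
print(changed, old_mode, "->", new_mode)  # prints: True 1750 -> 0700
```

Read this way, the table directory goes from 1750 (group read and execute, sticky bit set) to 0700 after {{saveAsTable}} in overwrite mode, while the {{insertInto}} listing stays at 1750.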