[ https://issues.apache.org/jira/browse/SPARK-30411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sanket Reddy updated SPARK-30411:
---------------------------------
    Description:
-bash-4.2$ hdfs dfs -ls /tmp | grep my_databases
drwxr-x--T   - redsanket users          0 2019-12-04 20:15 /tmp/my_databases

>>> spark.sql("CREATE TABLE redsanket_db.example(bcookie string, ip int) STORED AS orc")

-bash-4.2$ hdfs dfs -ls /tmp/my_databases | grep example
drwxr-x--T   - redsanket users          0 2019-12-04 20:20 /tmp/my_databases/example

Now, after saveAsTable:

>>> data = [('First', 1), ('Second', 2), ('Third', 3), ('Fourth', 4), ('Fifth', 5)]
>>> df = spark.createDataFrame(data)
>>> df.write.format("orc").mode('overwrite').saveAsTable('redsanket_db.example')

-bash-4.2$ hdfs dfs -ls /tmp/my_databases | grep example
drwx------   - redsanket users          0 2019-12-04 20:23 /tmp/my_databases/example

saveAsTable overwrites the permissions.

insertInto, in contrast, preserves the parent directory permissions:

>>> spark.sql("DROP table redsanket_db.example")
DataFrame[]
>>> spark.sql("CREATE TABLE redsanket_db.example(bcookie string, ip int) STORED AS orc")
DataFrame[]
>>> df.write.format("orc").insertInto('redsanket_db.example')

-bash-4.2$ hdfs dfs -ls /tmp/my_databases | grep example
drwxr-x--T   - redsanket users          0 2019-12-04 20:43 /tmp/my_databases/example

This is either a limitation of the API that depends on the write mode, in which case the behavior has to be documented, or a bug that needs to be fixed.

  was:
-bash-4.2$ hdfs dfs -ls /tmp | grep my_databases
drwxr-x--T   - redsanket users          0 2019-12-04 20:15 /tmp/my_databases

>>> spark.sql("CREATE TABLE redsanket_db.example(bcookie string, ip int) STORED AS orc")

-bash-4.2$ hdfs dfs -ls /tmp/my_databases | grep example
drwxr-x--T   - redsanket users          0 2019-12-04 20:20 /tmp/my_databases/example

Now, after saveAsTable:

>>> data = [('First', 1), ('Second', 2), ('Third', 3), ('Fourth', 4), ('Fifth', 5)]
>>> df = spark.createDataFrame(data)
>>> df.write.format("orc").mode('overwrite').saveAsTable('redsanket_db.example')

-bash-4.2$ hdfs dfs -ls /tmp/my_databases | grep example
drwx------   - redsanket users          0 2019-12-04 20:23 /tmp/my_databases/example

saveAsTable overwrites the permissions.

insertInto, in contrast, preserves the parent directory permissions:

>>> spark.sql("DROP table redsanket_db.example")
DataFrame[]
>>> spark.sql("CREATE TABLE redsanket_db.example(bcookie string, ip int) STORED AS orc")
DataFrame[]
>>> df.write.format("orc").insertInto('redsanket_db.example')

-bash-4.2$ hdfs dfs -ls /tmp/my_databases | grep example
drwxr-x--T   - schintap users          0 2019-12-04 20:43 /tmp/my_databases/example

This is either a limitation of the API that depends on the write mode, in which case the behavior has to be documented, or a bug that needs to be fixed.


> saveAsTable does not honor spark.hadoop.hive.warehouse.subdir.inherit.perms
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-30411
>                 URL: https://issues.apache.org/jira/browse/SPARK-30411
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.4
>            Reporter: Sanket Reddy
>            Priority: Minor
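The permission strings in the listings above are easier to compare as numeric modes. The following is a minimal sketch in plain Python, not part of Spark or Hadoop: mode_from_ls and parse_ls_line are hypothetical helper names introduced here for illustration, and the parser assumes the eight-column layout shown in the report's `hdfs dfs -ls` output. It only understands the r, w, x, - and sticky-bit (t/T) characters.

```python
def mode_from_ls(perm: str) -> int:
    """Convert a 10-character ls-style string such as 'drwxr-x--T' to a numeric mode.

    Hypothetical helper for illustration; handles r/w/x/- and the sticky bit only.
    """
    if len(perm) != 10:
        raise ValueError(f"expected 10 characters, got {perm!r}")
    mode = 0
    # Skip the leading file-type character and walk the nine permission slots.
    for i, ch in enumerate(perm[1:]):
        bit = 1 << (8 - i)
        if ch == "rwx"[i % 3]:
            mode |= bit
        elif i == 8 and ch == "t":
            # Sticky bit set and execute-for-others set.
            mode |= bit | 0o1000
        elif i == 8 and ch == "T":
            # Sticky bit set, execute-for-others not set.
            mode |= 0o1000
        elif ch != "-":
            raise ValueError(f"unsupported character {ch!r} in {perm!r}")
    return mode


def parse_ls_line(line: str) -> dict:
    """Split one listing line into the fields relevant to this report.

    Assumes the column order: perms, replication, owner, group, size, date, time, path.
    """
    fields = line.split()
    return {"mode": mode_from_ls(fields[0]), "owner": fields[2], "path": fields[-1]}


if __name__ == "__main__":
    # Listing lines copied from the report: before and after saveAsTable.
    before = parse_ls_line(
        "drwxr-x--T - redsanket users 0 2019-12-04 20:20 /tmp/my_databases/example"
    )
    after = parse_ls_line(
        "drwx------ - redsanket users 0 2019-12-04 20:23 /tmp/my_databases/example"
    )
    print(oct(before["mode"]), oct(after["mode"]))
    print("permissions changed:", before["mode"] != after["mode"])
```

Read this way, the table directory goes from mode 1750 (sticky bit set, group read and execute) to 700 after the overwrite with saveAsTable, so both the group access and the sticky bit are lost.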