[ 
https://issues.apache.org/jira/browse/SPARK-30411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009381#comment-17009381
 ] 

Sanket Reddy commented on SPARK-30411:
--------------------------------------

[~yumwang] 
[PR-22078|https://github.com/apache/spark/pull/22078#issuecomment-458851287] 
makes sense; however, it fixes Hive 3.0.0 and, as far as I know, it is not a 
backward-compatible change.

It would be useful if users did not have to manually change permissions on 
the file system, or use umask, as a workaround.
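
For reference, the manual workaround could be sketched roughly as follows: 
record the permission string that {{hdfs dfs -ls}} prints before the write, 
then build an {{hdfs dfs -chmod}} command to restore it afterwards. The helper 
names {{ls_mode_to_octal}} and {{chmod_command}} are hypothetical, made up for 
this illustration; they are not part of Spark or Hadoop.

```python
def ls_mode_to_octal(perm):
    """Convert an ls-style string such as 'drwxr-x--T' to a 4-digit octal mode.

    Hypothetical helper for illustration only; it handles the sticky bit
    ('t' / 'T' in the last position) but not setuid or setgid.
    """
    perm = perm.rstrip("+")  # a trailing '+' only marks the presence of ACLs
    if len(perm) != 10:
        raise ValueError("expected a 10-character permission string: %r" % perm)
    mode = 0
    for i, ch in enumerate(perm[1:]):
        bit = 1 << (8 - i)           # 0o400 for the first position, 0o001 for the last
        expected = "rwxrwxrwx"[i]
        if ch == expected:
            mode |= bit
        elif i == 8 and ch == "t":   # sticky bit set, others-execute set
            mode |= bit | 0o1000
        elif i == 8 and ch == "T":   # sticky bit set, others-execute not set
            mode |= 0o1000
        elif ch != "-":
            raise ValueError("unsupported permission character: %r" % ch)
    return format(mode, "04o")


def chmod_command(perm, path):
    """Build the argument list for restoring `perm` on `path` via hdfs dfs -chmod."""
    return ["hdfs", "dfs", "-chmod", ls_mode_to_octal(perm), path]


# Permissions from the listing taken before saveAsTable in this report.
cmd = chmod_command("drwxr-x--T", "/tmp/my_databases/example")
print(" ".join(cmd))
```

The resulting list could be passed to {{subprocess.run(cmd, check=True)}} after 
the {{saveAsTable}} call; this only papers over the behavior and has to be 
repeated after every overwrite.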

[~hyukjin.kwon] Sure, I will try the Hive implementation and get back to you, 
but I doubt it will work. I will give it a try; thanks for the quick reply.

> saveAsTable does not honor spark.hadoop.hive.warehouse.subdir.inherit.perms
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-30411
>                 URL: https://issues.apache.org/jira/browse/SPARK-30411
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.4
>            Reporter: Sanket Reddy
>            Priority: Minor
>
> {code}
> -bash-4.2$ hdfs dfs -ls /tmp | grep my_databases
>  drwxr-x--T - redsanket users 0 2019-12-04 20:15 /tmp/my_databases
> {code}
> {code}
> >>> spark.sql("CREATE TABLE redsanket_db.example(bcookie string, ip int) STORED AS orc");
> {code}
> {code}
> -bash-4.2$ hdfs dfs -ls /tmp/my_databases | grep example
>  drwxr-x--T - redsanket users 0 2019-12-04 20:20 /tmp/my_databases/example
> {code}
> Now after {{saveAsTable}}
> {code}
>  >>> data = [('First', 1), ('Second', 2), ('Third', 3), ('Fourth', 4), ('Fifth', 5)]
>  >>> df = spark.createDataFrame(data)
>  >>> df.write.format("orc").mode('overwrite').saveAsTable('redsanket_db.example')
> {code}
> {code}
> -bash-4.2$ hdfs dfs -ls /tmp/my_databases | grep example
>  drwx------ - redsanket users 0 2019-12-04 20:23 /tmp/my_databases/example
> {code}
>  {{saveAsTable}} overwrites the permissions.
> {{insertInto}}, by contrast, preserves the parent directory permissions:
> {code}
>  >>> spark.sql("DROP table redsanket_db.example");
>  DataFrame[]
>  >>> spark.sql("CREATE TABLE redsanket_db.example(bcookie string, ip int) STORED AS orc");
>  DataFrame[]
>  >>> df.write.format("orc").insertInto('redsanket_db.example')
> {code}
> {code}
> -bash-4.2$ hdfs dfs -ls /tmp/my_databases | grep example
>  drwxr-x--T - redsanket users 0 2019-12-04 20:43 /tmp/my_databases/example
> {code}
>  Either this is a limitation of the API that depends on the write mode, in 
> which case the behavior should be documented, or it needs to be fixed.


