[ 
https://issues.apache.org/jira/browse/SPARK-30411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009381#comment-17009381
 ] 

Sanket Reddy edited comment on SPARK-30411 at 1/7/20 5:39 AM:
--------------------------------------------------------------

[~yumwang]  
[PR-22078|https://github.com/apache/spark/pull/22078#issuecomment-458851287] 
makes sense; however, it fixes this for Hive 3.0.0 and, afaik, it is not a 
backward-compatible change.

My concern is the inconsistency between the APIs (saveAsTable/insertInto): 
they should either all preserve permissions or all not preserve them, and the 
behavior needs to be documented for DDL and DML operations, imho.

It would be useful for users not to have to manually change permissions on 
the file system or use umask as a workaround.

[~hyukjin.kwon] Sure, I will try the Hive implementation and get back to you, 
though I doubt it will work. I'll give it a try. Thanks for the quick reply.
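
For reference, the umask workaround mentioned above comes down to simple bit masking. The following is a minimal sketch in plain Python, assuming the usual rule that a newly created directory receives {{0o777 & ~umask}} (HDFS applies the client-side {{fs.permissions.umask-mode}} in this way). The helper name {{dir_mode_with_umask}} is hypothetical, and the cluster's actual umask is not stated in this issue.

```python
def dir_mode_with_umask(umask: int, requested: int = 0o777) -> int:
    """Return the permission bits a new directory ends up with.

    Assumption: the effective mode is the requested mode with every
    bit that is set in the umask cleared (POSIX-style masking).
    """
    return requested & ~umask & 0o777


def as_octal(mode: int) -> str:
    """Format permission bits as a three-digit octal string."""
    return format(mode, "03o")


if __name__ == "__main__":
    # Show the resulting directory mode for a few common umask values.
    for umask in (0o022, 0o027, 0o077):
        print(as_octal(umask), "->", as_octal(dir_mode_with_umask(umask)))
```

Under that assumption, a umask of 077 yields {{rwx------}}, which is the mode seen after {{saveAsTable}}, while 027 yields {{rwxr-x---}}. A umask can only clear bits, so it cannot restore the sticky bit shown on the parent directory.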



> saveAsTable does not honor spark.hadoop.hive.warehouse.subdir.inherit.perms
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-30411
>                 URL: https://issues.apache.org/jira/browse/SPARK-30411
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.4
>            Reporter: Sanket Reddy
>            Priority: Minor
>
> {code}
> -bash-4.2$ hdfs dfs -ls /tmp | grep my_databases
>  drwxr-x--T - redsanket users 0 2019-12-04 20:15 /tmp/my_databases
> {code}
> {code}
> >>> spark.sql("CREATE TABLE redsanket_db.example(bcookie string, ip int) STORED AS orc");
> {code}
> {code}
> -bash-4.2$ hdfs dfs -ls /tmp/my_databases | grep example
>  drwxr-x--T - redsanket users 0 2019-12-04 20:20 /tmp/my_databases/example
> {code}
> Now after {{saveAsTable}}
> {code}
>  >>> data = [('First', 1), ('Second', 2), ('Third', 3), ('Fourth', 4), ('Fifth', 5)]
>  >>> df = spark.createDataFrame(data)
>  >>> df.write.format("orc").mode('overwrite').saveAsTable('redsanket_db.example')
> {code}
> {code}
> -bash-4.2$ hdfs dfs -ls /tmp/my_databases | grep example
>  drwx------ - redsanket users 0 2019-12-04 20:23 /tmp/my_databases/example
> {code}
> {{saveAsTable}} overwrites the permissions ({{drwxr-x--T}} becomes {{drwx------}}).
> {{insertInto}}, by contrast, preserves the parent directory permissions:
> {code}
>  >>> spark.sql("DROP table redsanket_db.example");
>  DataFrame[]
>  >>> spark.sql("CREATE TABLE redsanket_db.example(bcookie string, ip int) STORED AS orc");
>  DataFrame[]
>  >>> df.write.format("orc").insertInto('redsanket_db.example')
> {code}
> {code}
> -bash-4.2$ hdfs dfs -ls /tmp/my_databases | grep example
>  drwxr-x--T - redsanket users 0 2019-12-04 20:43 /tmp/my_databases/example
> {code}
> Either this is a limitation of the API that depends on the write mode, in 
> which case the behavior has to be documented, or it is a bug that needs to 
> be fixed.
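
To compare the listings above programmatically, the permission column of {{hdfs dfs -ls}} can be converted to an octal mode. This is a rough sketch in plain Python that needs neither Spark nor HDFS. The function name {{ls_mode_to_octal}} is hypothetical, and the sketch handles only the {{r}}, {{w}}, {{x}}, {{-}} characters and the sticky-bit markers {{t}}/{{T}} that appear in this report.

```python
def ls_mode_to_octal(perm: str) -> int:
    """Convert an ls-style permission string such as 'drwxr-x--T' to an int mode.

    The first character (file type) is ignored. In the last position,
    't' means sticky bit plus execute and 'T' means sticky bit without execute.
    """
    if len(perm) != 10:
        raise ValueError(f"expected 10 characters, got {len(perm)}: {perm!r}")

    mode = 0
    for i, ch in enumerate(perm[1:]):
        weight = 1 << (8 - i)  # 0o400 for the first bit, down to 0o001 for the last
        if i == 8 and ch in "tT":
            mode |= 0o1000  # sticky bit
            if ch == "t":
                mode |= weight  # lowercase also implies other-execute
        elif ch in "rwx":
            mode |= weight
        elif ch != "-":
            raise ValueError(f"unsupported permission character {ch!r} in {perm!r}")
    return mode


if __name__ == "__main__":
    before = ls_mode_to_octal("drwxr-x--T")  # directory before saveAsTable
    after = ls_mode_to_octal("drwx------")   # directory after saveAsTable
    print(oct(before), oct(after), "changed" if before != after else "preserved")
```

A check like this could be run after each write in a test to flag when the table directory's mode no longer matches the mode it had when the table was created.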



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
