[ 
https://issues.apache.org/jira/browse/SPARK-28558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16936504#comment-16936504
 ] 

Nicolas Laduguie commented on SPARK-28558:
------------------------------------------

FYI here is a screenshot showing the permissions of partition folders names 
"dt=*" which are 766 and the table folder permissions which are 755.

!image-2019-09-24-09-20-07-225.png!

> DatasetWriter partitionBy is changing the group file permissions in 2.4 for 
> parquets
> ------------------------------------------------------------------------------------
>
>                 Key: SPARK-28558
>                 URL: https://issues.apache.org/jira/browse/SPARK-28558
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.4.3
>         Environment: Hadoop 2.7
> Scala 2.11
> Tested:
>  * Spark 2.3.3 - Works
>  * Spark 2.4.x - All have the same issue
>            Reporter: Stephen Pearson
>            Priority: Minor
>         Attachments: image-2019-09-24-09-20-07-225.png
>
>
> When writing a parquet using partitionBy the group file permissions are being 
> changed as shown below. This causes members of the group to get 
> "org.apache.hadoop.security.AccessControlException: Open failed for file.... 
> error: Permission denied (13)"
> This worked in 2.3. I found a workaround which was to set 
> "spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=2" which gives 
> the correct behaviour
>  
> Code I used to reproduce issue:
> {quote}Seq(("H", 1), ("I", 2))
>  .toDF("Letter", "Number")
>  .write
>  .partitionBy("Letter")
>  .parquet(...){quote}
>  
> {quote}sparktesting$ tree -dp
> ├── [drwxrws---]  letter_testing2.3-defaults
> │   ├── [drwxrws---]  Letter=H
> │   └── [drwxrws---]  Letter=I
> ├── [drwxrws---]  letter_testing2.4-defaults
> │   ├── [drwxrwS---]  Letter=H
> │   └── [drwxrwS---]  Letter=I
> └── [drwxrws---]  letter_testing2.4-file-writer2
>     ├── [drwxrws---]  Letter=H
>     └── [drwxrws---]  Letter=I
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to