[ https://issues.apache.org/jira/browse/SPARK-28558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16936504#comment-16936504 ]
Nicolas Laduguie commented on SPARK-28558:
------------------------------------------

FYI, here is a screenshot showing the permissions of the partition folders named "dt=*", which are 766, while the table folder's permissions are 755.

!image-2019-09-24-09-20-07-225.png!

> DatasetWriter partitionBy is changing the group file permissions in 2.4 for
> parquets
> ------------------------------------------------------------------------------------
>
>                 Key: SPARK-28558
>                 URL: https://issues.apache.org/jira/browse/SPARK-28558
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.4.3
>        Environment: Hadoop 2.7
> Scala 2.11
> Tested:
>  * Spark 2.3.3 - works
>  * Spark 2.4.x - all versions have the same issue
>            Reporter: Stephen Pearson
>            Priority: Minor
>        Attachments: image-2019-09-24-09-20-07-225.png
>
> When writing a parquet file using partitionBy, the group file permissions
> are changed as shown below. This causes members of the group to get
> "org.apache.hadoop.security.AccessControlException: Open failed for file....
> error: Permission denied (13)".
>
> This worked in 2.3. I found a workaround, which was to set
> "spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=2"; this gives
> the correct behaviour (see the workaround sketch after the tree output below).
>
> Code I used to reproduce the issue:
> {quote}Seq(("H", 1), ("I", 2))
>   .toDF("Letter", "Number")
>   .write
>   .partitionBy("Letter")
>   .parquet(...){quote}
>
> {quote}sparktesting$ tree -dp
> ├── [drwxrws---] letter_testing2.3-defaults
> │   ├── [drwxrws---] Letter=H
> │   └── [drwxrws---] Letter=I
> ├── [drwxrws---] letter_testing2.4-defaults
> │   ├── [drwxrwS---] Letter=H
> │   └── [drwxrwS---] Letter=I
> └── [drwxrws---] letter_testing2.4-file-writer2
>     ├── [drwxrws---] Letter=H
>     └── [drwxrws---] Letter=I
> {quote}
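For anyone wanting to try the reporter's workaround in application code rather than via submit-time flags, here is a minimal sketch of setting it when building the SparkSession. This is an illustration under stated assumptions, not a confirmed fix for every setup: the object name and the output path "/tmp/letter_testing" are hypothetical.

{code:scala}
import org.apache.spark.sql.SparkSession

object Spark28558Workaround {
  def main(args: Array[String]): Unit = {
    // The v2 file output committer was reported above to restore the
    // Spark 2.3 permission behaviour on partition directories.
    val spark = SparkSession.builder()
      .appName("SPARK-28558-workaround")
      .config("spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version", "2")
      .getOrCreate()
    import spark.implicits._

    // Same reproduction as in the description; the output path is made up.
    Seq(("H", 1), ("I", 2))
      .toDF("Letter", "Number")
      .write
      .partitionBy("Letter")
      .parquet("/tmp/letter_testing")

    spark.stop()
  }
}
{code}

The same setting can also be passed on the command line, e.g. spark-submit --conf spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=2 ...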
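And a small sketch for checking the resulting partition-directory permissions programmatically rather than with tree, using Hadoop's FileSystem API (the table path is again hypothetical). Note that whether the setgid bit ("s"/"S" in the tree output above) is surfaced depends on the FileSystem implementation, so treat this as a rough check only.

{code:scala}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object CheckPartitionPermissions {
  def main(args: Array[String]): Unit = {
    // Connect to whatever filesystem is configured (HDFS, local, ...).
    val fs = FileSystem.get(new Configuration())

    // Print each partition directory's permissions as Hadoop reports them,
    // e.g. "rwxrws---" for the directories shown in the description.
    val table = new Path("/tmp/letter_testing") // hypothetical path
    fs.listStatus(table)
      .filter(_.isDirectory)
      .foreach(status =>
        println(s"${status.getPath.getName}: ${status.getPermission}"))
  }
}
{code}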