Tagar commented on pull request #29120: URL: https://github.com/apache/spark/pull/29120#issuecomment-719114594
I've seen Spark users do this all the time to save output to a single file. Great improvement. Would this change cover `.repartition(1)` too, and not just `.coalesce(1)`? Thanks.

P.S. Fun fact: some time back I wrote a small tool to work around this very issue, https://github.com/Tagar/abalon/blob/v2.3.3/abalon/spark/sparkutils.py#L444-L445, which used HDFS API calls to coalesce the output files together without affecting Spark join parallelism: https://github.com/Tagar/abalon/blob/v2.3.3/abalon/spark/sparkutils.py#L340
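
To illustrate the merge-after-write idea from the comment above, here is a minimal sketch. It is not the abalon code: it uses the local filesystem as a stand-in for the HDFS API calls, and the names `merge_part_files` and `demo` are hypothetical. It assumes Spark has already written its usual `part-*` files at full parallelism, and only afterwards concatenates them into one file.

```python
import shutil
import tempfile
from pathlib import Path


def merge_part_files(src_dir, dst_file):
    """Concatenate Spark-style part files from src_dir into dst_file.

    Assumption: part files are named "part-*" and sorting by name gives
    the intended order. Marker files such as _SUCCESS are skipped.
    dst_file should live outside src_dir.
    """
    parts = sorted(
        p for p in Path(src_dir).iterdir()
        if p.is_file() and p.name.startswith("part-")
    )
    with open(dst_file, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream the bytes so large part files are never fully in memory.
                shutil.copyfileobj(src, out)
    return len(parts)


def demo():
    """Build a fake Spark output directory and merge it (illustration only)."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "output"
        src.mkdir()
        (src / "part-00000").write_bytes(b"a,1\n")
        (src / "part-00001").write_bytes(b"b,2\n")
        (src / "_SUCCESS").write_bytes(b"")
        merged = Path(tmp) / "merged.csv"
        count = merge_part_files(src, merged)
        return count, merged.read_bytes()
```

Because the merge happens after the job finishes, the write itself keeps whatever partitioning the job already had. Note that plain concatenation only suits formats where this is valid, such as headerless CSV or line-delimited JSON; it would not produce a valid Parquet or ORC file.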