Re: [PR] [SPARK-48896][ML][MLLIB] Avoid repartition when writing out the metadata [spark]

via GitHub Mon, 15 Jul 2024 23:45:00 -0700


dongjoon-hyun commented on PR #47347:
URL: https://github.com/apache/spark/pull/47347#issuecomment-2230142749


   Thank you for the explanation. It makes sense.
   
   > This is because of consistency. We're already using SparkSession to write Parquet.
   
   For the following, if you don't mind, please split it out into a separate PR, because it's not related to avoiding repartition.
   
   > I can open a separate PR for replacing RDD with DataFrame when writing the text out (https://github.com/apache/spark/pull/47347#discussion_r1678140089). I piggybacked it there because I thought it was trivial. What the code does is virtually the same except for the differences above.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


