Hi, I would like to build a process that merges Parquet files. My first idea was to write it in PySpark and use coalesce(1) so that the output is a single file. The process will run over a very large number of files. What is the best way to implement this? PySpark isn't a must.
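For context, a minimal sketch of the coalesce(1) idea might look like the following. The function name `merge_parquet` and its path arguments are placeholders for illustration, not part of an existing job, and `spark` is assumed to be an already-created SparkSession.

```python
def merge_parquet(spark, input_path, output_path):
    """Read all Parquet files under input_path and write them back as one part file.

    `spark` is assumed to be an existing pyspark.sql.SparkSession;
    both paths are hypothetical placeholders.
    """
    df = spark.read.parquet(input_path)
    # coalesce(1) reduces the data to a single partition, so one task writes
    # one part file. All rows pass through that single task, which is the
    # scaling concern when the input is very large.
    df.coalesce(1).write.mode("overwrite").parquet(output_path)
```

A call would look something like `merge_parquet(spark, "/data/in", "/data/out")`. Note that Spark writes a directory at the output path containing one part file, not a bare file.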
Thanks, Tzahi