Hi,

I would like to develop a process that merges Parquet files.
My first idea was to write it in PySpark and use coalesce(1) so that the
output is a single file.
The process will have to run over a very large number of files.
What would you recommend as the best way to implement this? PySpark isn't
a requirement.
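
For reference, here is a minimal sketch of the coalesce(1) approach I had in
mind. The function name, the paths, and the app name are placeholders I made
up for illustration, and I have not run this at scale.

```python
def merge_parquet(spark, input_path, output_path):
    """Read every Parquet file under input_path and write it back as one file.

    `spark` is an existing SparkSession. The paths are hypothetical.
    """
    df = spark.read.parquet(input_path)
    # coalesce(1) reduces the DataFrame to a single partition, so Spark
    # writes one part file inside the output_path directory.
    df.coalesce(1).write.mode("overwrite").parquet(output_path)


def main():
    # Imported here so that merge_parquet itself has no hard dependency.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("merge-parquet").getOrCreate()
    merge_parquet(spark, "/data/input", "/data/merged")  # placeholder paths
    spark.stop()
```

As far as I understand, the output path ends up as a directory holding a
single part file, and the final write runs in a single task, which is my
concern for a large input.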


Thanks,
Tzahi
