Hi James.
--num-executors is used to control the number of executors (and hence how many
tasks can run in parallel, since tasks run per executor) for your application.
For reading and writing data in parallel, data partitioning is employed. You can
look at the Spark Parquet docs for a quick intro to how data partitioning works:
(https://spark.apache.org/docs/latest/sql-data-sources-parquet.html).

It's not mentioned there how to read Parquet files in parallel with
SparkSession. Would --num-executors just work? Do any additional parameters need
to be set on the SparkSession as well?
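As far as I know, no extra SparkSession parameter is needed for a parallel
Parquet read: Spark splits the dataset into partitions (roughly one per file
split) and schedules one task per partition across the available executor
slots, so --num-executors/--executor-cores just size the pool of slots. A rough
plain-Python analogy of that scheduling (not Spark internals; the part-file
names are hypothetical):

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical Parquet part files; in Spark each file split becomes
# one partition, and one task reads one partition.
partitions = [f"part-{i:05d}.parquet" for i in range(8)]

def read_partition(path):
    # Stand-in for a per-partition read task.
    return f"read {path}"

# max_workers plays the role of the total task slots
# (num-executors * executor-cores): it caps how many partitions
# are read at the same time, while all partitions still get read.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(read_partition, partitions))

print(len(results))
```

The point of the sketch: parallelism comes from the number of partitions and
the number of slots, not from any per-read parameter.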
Also if I want to parallel