ways
to do it.
Any help is appreciated.
Thanks,
Nezih
Hey everyone,
I have a Hive table made up of many small Parquet files, and I am creating
a DataFrame from it to do some processing. Because of the large number of
splits/files, my job creates a lot of tasks, which I don't want.
Basically what I want is the same functionality that Hive provid