Well, it turns out that the extra part-* file goes away when I limit
--num-executors to 1 or 2. Leaving it at the default maxes it out, which in
turn produces an extra, empty part file.
I guess the test data I'm using only needs that many executors.
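
For what it's worth, here is a rough plain-Python sketch (not Spark code) of one way an empty part file can come about. It is only a guess at the mechanism, not a confirmed diagnosis of this job. It assumes records are assigned to partitions by hashing a key, and that every partition gets its own part file even when it holds no records (which, as far as I know, is how the RDD save path behaves). The function names, keys, and file names are made up for illustration.

```python
# Plain-Python simulation (NOT Spark code): hash-partition a few records and
# report how many records each hypothetical part file would hold.

def partition_records(records, num_partitions):
    """Assign each (key, value) record to partition hash(key) % num_partitions."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        partitions[hash(key) % num_partitions].append((key, value))
    return partitions

def part_file_sizes(partitions):
    """Map a hypothetical part-file name to the record count of its partition."""
    return {f"part-{i:05d}": len(p) for i, p in enumerate(partitions)}

# Hypothetical test data. Small non-negative integer keys hash to themselves
# in CPython, so the partition assignment here is deterministic.
records = [(0, "a"), (1, "b"), (3, "c"), (4, "d")]

sizes = part_file_sizes(partition_records(records, 3))
print(sizes)  # {'part-00000': 2, 'part-00001': 2, 'part-00002': 0}
```

With three partitions, no key lands in the last one, so its part file would be empty. With two partitions, the same data fills both.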
--
The Spark job succeeds (and with correct output), except there is always an
extra part-* file, and it is empty.
I even set the number of partitions to only 2 via spark-submit, but a third,
empty part file still shows up.
Why does it do that? How can I fix it?
Thank you