Hello,

I have about 100 folders, each containing 5 files. I have an executable
that processes one folder. The executable is a black box, so it cannot be
modified. I would like to process the 100 folders in parallel using Apache
Spark, spawning one map task per folder. Can anyone give me an idea of how
to do this? I have come across similar questions for Hadoop, where the
answer was to use CombineFileInputFormat and PathFilter. However, as I
said, I want to use Apache Spark. Any ideas?
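
To make the question concrete, here is a rough PySpark sketch of the kind of
thing I have in mind. It is only an illustration under several assumptions:
the names process_folder and run_on_spark are hypothetical, the executable is
assumed to take the folder path as its last command-line argument, and every
worker is assumed to see the folders and the executable at the same paths
(for example on a shared filesystem).

```python
import subprocess
import sys


def process_folder(folder, command):
    """Run the black-box executable on one folder; return (folder, exit code)."""
    # `command` is a list such as ["/path/to/executable"] (hypothetical path).
    # Assumption: the executable accepts the folder path as its last argument.
    result = subprocess.run(command + [folder], capture_output=True, text=True)
    return folder, result.returncode


def run_on_spark(folders, command):
    """Distribute the folder paths, aiming for one Spark task per folder."""
    # Imported here so that process_folder can be tried without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext.getOrCreate()
    # Asking for as many slices as there are folders should put one folder
    # path in each partition, and Spark runs one task per partition.
    rdd = sc.parallelize(folders, numSlices=len(folders))
    return rdd.map(lambda folder: process_folder(folder, command)).collect()
```

With this, a call such as run_on_spark(list_of_folder_paths, ["/path/to/executable"])
would return a list of (folder, exit code) pairs, so failed folders could be
spotted and rerun. Only the folder paths travel through Spark here; the files
themselves are read by the executable on whichever worker runs the task.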

Regards
Bala
