sc.textFile(URI) can read multiple files in parallel but, as far as I can 
tell, only when given a wildcard. I need to walk a directory tree, match 
entries against a regex to build a list of files, and then read those files 
into a single RDD in parallel. I understand that each file could go into its 
own RDD and that a union RDD could then be created from them. Is there a way 
to create a single RDD directly from a list of URIs?
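
To make the workflow concrete, here is a minimal PySpark sketch of what I have in mind. It rests on several assumptions: the files are on a local filesystem that os.walk can see (HDFS or S3 would need a different listing step), the regex is applied to file names rather than full paths, and no path contains a comma. The helper names find_matching_files and read_as_single_rdd are hypothetical. The one Spark-specific point is that textFile passes its argument to Hadoop's input format, which treats a comma-separated string as a list of input paths, so this may avoid the explicit union.

```python
import os
import re


def find_matching_files(root, pattern):
    """Walk `root` recursively; return sorted paths whose file name matches `pattern`."""
    regex = re.compile(pattern)
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Assumption: the regex is tested against the file name only.
            if regex.search(name):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


def read_as_single_rdd(sc, paths):
    """Read every path in `paths` into one RDD of lines.

    `sc` is an existing SparkContext. The paths are joined with commas
    because textFile accepts a comma-separated list of input paths.
    """
    if not paths:
        raise ValueError("no input paths matched")
    if any("," in p for p in paths):
        raise ValueError("paths containing commas cannot be joined this way")
    return sc.textFile(",".join(paths))
```

Usage would look something like read_as_single_rdd(sc, find_matching_files("/data/logs", r"\.log$")), where the directory and pattern are placeholders. The alternative is sc.union([sc.textFile(p) for p in paths]), which also yields a single RDD but builds one child RDD per file.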
