Re: Strategies for reading large numbers of files

2014-10-21 Thread Landon Kuhn

Strategies for reading large numbers of files

2014-10-02 Thread Landon Kuhn
Hello, I'm trying to use Spark to process a large number of files in S3. I'm running into an issue that I believe is related to the high number of files and the resources required to build the file listing within the driver program. If anyone in the Spark community can provide insight or guidance, it