The FilesInDirectoryCollectionReader builds an ArrayList of java.io.File objects when it is initialized. For large datasets (~50k files) this adds substantial time overhead, and probably memory overhead as well. It seems like it would be more efficient to store the paths as Strings there and only create the File object when getNext() is called. This is pretty easy to implement. Is there any downside to making this switch?
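To make the idea concrete, here is a rough sketch of what I have in mind. It is not the actual reader code: LazyFileList and its methods are hypothetical names for illustration, and it assumes a flat directory listing is enough. Note that it does not filter out subdirectories at initialization, so a real reader would still need to decide where that check goes.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical helper, not part of the existing reader: keeps path Strings
// and only constructs a java.io.File when the next entry is requested.
public class LazyFileList {
    private final List<String> paths = new ArrayList<>();
    private int index = 0;

    public LazyFileList(File directory) {
        // File.list() returns bare entry names, or null if the path is not
        // a readable directory. No File object is created per entry here.
        String[] names = directory.list();
        if (names != null) {
            String prefix = directory.getPath() + File.separator;
            for (String name : names) {
                paths.add(prefix + name);
            }
        }
    }

    public int size() {
        return paths.size();
    }

    public boolean hasNext() {
        return index < paths.size();
    }

    // The File object is created here, one at a time, instead of up front.
    public File getNext() {
        if (!hasNext()) {
            throw new NoSuchElementException("no more entries");
        }
        return new File(paths.get(index++));
    }
}
```

With this shape, initialization cost is one directory listing plus one String per entry, and each File lives only as long as the caller keeps it.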
Tim
