binaryFiles() for 1 million files, too much memory required

2015-07-02 Thread Kostas Kougios
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/binaryFiles-for-1-million-files-too-much-memory-required-tp23590.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

binaryFiles() for 1 million files, too much memory required

2015-07-01 Thread Konstantinos Kougios
Once again I am trying to read a directory tree using binaryFiles(). My directory tree has a root dir ROOTDIR and subdirs where the files are located, i.e. ROOTDIR/1, ROOTDIR/2, ..., ROOTDIR/100. A total of 1 million files split into 100 subdirs. Using binaryFiles requires too much memory on
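For reference, the kind of call being described looks roughly like the sketch below. This is a minimal illustration, not the poster's actual code; ROOTDIR, the glob pattern, and the app name are placeholders. `SparkContext.binaryFiles` returns an `RDD[(String, PortableDataStream)]` with one (path, stream) pair per matched file, and listing the input paths happens on the driver, which is where a very large number of small files can become expensive.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Minimal sketch, assuming a working Spark installation.
// "ROOTDIR" is a placeholder for the poster's root directory.
object ReadBinaryTree {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("binaryFiles-demo"))

    // One (path, PortableDataStream) pair per file under the subdirs.
    // The glob expands to ROOTDIR/1, ROOTDIR/2, ... ROOTDIR/100.
    val files = sc.binaryFiles("ROOTDIR/*")

    // Example action: count the files that were matched.
    println(files.count())

    sc.stop()
  }
}
```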