With fileStream you can pass a filter parameter to avoid picking up .tmp files/directories.
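
Something along these lines is a minimal, untested sketch of the filter idea. It uses the three-argument StreamingContext.fileStream overload (directory, filter, newFilesOnly). The object name, the isFinished helper, the input directory and the text input format are all made up for illustration, and the rule that in-flight files end in ".tmp" or start with "." is an assumption about how your writer names them:

```scala
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object FilteredFileStream {
  // Hypothetical helper: decides from the file name alone whether a file
  // should be read. Assumes in-flight files end in ".tmp" or start with ".".
  def isFinished(name: String): Boolean =
    !name.endsWith(".tmp") && !name.startsWith(".")

  def main(args: Array[String]): Unit = {
    // The master URL is expected to come from spark-submit.
    val conf = new SparkConf().setAppName("FilteredFileStream")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Hypothetical input directory; replace with your own.
    val inputDir = "hdfs:///data/incoming"

    // The filter runs on every candidate path; paths for which it
    // returns false are never handed to the input format.
    val lines = ssc
      .fileStream[LongWritable, Text, TextInputFormat](
        inputDir,
        (path: Path) => isFinished(path.getName),
        newFilesOnly = true)
      .map { case (_, text) => text.toString }

    lines.print()
    ssc.start()
    ssc.awaitTermination()
  }
}
```

Keeping the naming rule in a plain function makes it easy to check without a cluster, and you can tighten it if your copy tool uses a different temporary suffix.
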
Also, when you move or rename a file, its modification time doesn't change, so I believe Spark won't detect it.

Thanks
Best Regards

On Sat, May 2, 2015 at 9:37 PM, Evo Eftimov <evo.efti...@isecc.com> wrote:

> It seems that on Spark Streaming 1.2 the fileStream API may have a bug:
> it doesn't detect new files when moving or renaming them on HDFS, only
> when copying them. But that leads to a well-known problem with .tmp files,
> which get removed and make Spark Streaming fileStream throw an exception.