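The wildcards in that path are Hadoop glob patterns rather than regular expressions. They are described in "Hadoop: The Definitive Guide", chapter 3, under "File patterns":
https://www.safaribooksonline.com/library/view/hadoop-the-definitive/9781449328917/ch03.html#FilePatterns

Yong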
From: pradeep1...@gmail.com
Date: Wed, 13 Apr 2016 18:56:58 +0000
Subject: how does sc.textFile translate regex in the input.
To: user@spark.apache.org

I am trying to understand how Spark's sc.textFile() works. Specifically, I have a question about how it translates paths that contain a regex. For example:

    files = sc.textFile("hdfs://<server>:<port>/file1/*/*/*/*.txt")

How does it find all the sub-directories and recurse to all the leaf files? Is there any documentation on how this happens?

Pradeep
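
For anyone finding this thread later, here is a rough sketch of how a pattern like the one in the question gets expanded. It is only a local-filesystem analogy written with the Python standard library, not Hadoop's actual code (Hadoop does this in FileSystem.globStatus). The helper names expand_glob, build_demo_tree and demo, and the directory layout, are made up for illustration. The point it tries to show is that the pattern is matched one path component at a time, so each * stands for exactly one directory level and nothing recurses below the depth the pattern spells out. Hadoop globs also support forms such as {a,b} that fnmatch does not, so treat this as an approximation.

```python
import fnmatch
import os
import tempfile


def expand_glob(root, pattern):
    """Expand a '/'-separated glob one path component at a time.

    Hypothetical helper: it mimics level-by-level glob expansion on a
    local directory tree, it is not Hadoop's implementation.
    """
    candidates = [root]
    for component in pattern.strip("/").split("/"):
        matched = []
        for parent in candidates:
            if not os.path.isdir(parent):
                continue  # a plain file has no children to match
            for name in sorted(os.listdir(parent)):
                # '*' is tested against a single name, so it never crosses '/'
                if fnmatch.fnmatchcase(name, component):
                    matched.append(os.path.join(parent, name))
        candidates = matched
    return candidates


def build_demo_tree(root):
    """Create a small, made-up directory tree under root."""
    relative_paths = [
        "file1/a/b/c/x.txt",      # three directories below file1: matches
        "file1/a/b/y.txt",        # only two directories deep: no match
        "file1/a/b/c/d/z.txt",    # four directories deep: no match
        "file1/a/b/c/notes.log",  # right depth, wrong extension: no match
    ]
    for rel in relative_paths:
        path = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as handle:
            handle.write("demo\n")


def demo():
    """Build the tree, expand the pattern, return root-relative matches."""
    with tempfile.TemporaryDirectory() as root:
        build_demo_tree(root)
        hits = expand_glob(root, "file1/*/*/*/*.txt")
        return [os.path.relpath(h, root).replace(os.sep, "/") for h in hits]


if __name__ == "__main__":
    print(demo())
```

In this toy tree only file1/a/b/c/x.txt is returned: the shallower and deeper .txt files are skipped because the pattern names exactly three directory levels below file1.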