Hello All, I have written a simple program to get data from
    JavaDStream<String> textStream = jssc.textFileStream(<Streaming Folder>);
    JavaDStream<String> ceRDD = textStream.map(
        new Function<String, String>() {
            public String call(String ceData) throws Exception {
                System.out.println(ceData);
                return ceData; // call() is declared to return a String
            }
        });

My code works fine when we pass the complete path of the input directory as <Streaming Folder>:

    <Streaming Folder> = hdfs://quickstart.cloudera:8020//user/cloudera/CE/Output/OUTPUTYarnClusterCEQ/2016-04-01/4489867359541/

WORKS fine. But

    <Streaming Folder> = hdfs://quickstart.cloudera:8020/user/cloudera/CE/Output/OUTPUTYarnClusterCEQ/2016-04-01/*/

DOES NOT WORK. When we pass the folder name using a wildcard (*), I get the exception below.

Exception:

    16/04/01 13:48:40 WARN FileInputDStream: Error finding new files
    java.io.FileNotFoundException: File hdfs://quickstart.cloudera:8020/user/cloudera/CE/Output/OUTPUTYarnClusterCEQ/2016-04-01/* does not exist.
        at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:704)
        at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:105)
        at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:762)
        at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:758)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:758)
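
The trace shows listStatus being called on a path that ends in /*, which suggests the wildcard was looked up as a literal directory name rather than expanded. One possible workaround is to expand the pattern yourself and hand each matching directory to textFileStream separately. The sketch below only illustrates the expansion step, and it uses java.nio on a local temporary directory as a stand-in for HDFS. The class name GlobExpansion, the helper expand, and the second job directory 4489867359542 are made up for illustration. On HDFS the counterpart would presumably be Hadoop's FileSystem.globStatus(Path), which is not exercised here.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: expands a wildcard into concrete directories.
final class GlobExpansion {

    // Returns the sub-directories of `parent` whose names match `glob`
    // (for example "*"), sorted by path. Plain files are skipped.
    static List<Path> expand(Path parent, String glob) throws IOException {
        List<Path> matches = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(parent, glob)) {
            for (Path candidate : stream) {
                if (Files.isDirectory(candidate)) {
                    matches.add(candidate);
                }
            }
        }
        Collections.sort(matches);
        return matches;
    }

    public static void main(String[] args) throws IOException {
        // Local stand-in for .../OUTPUTYarnClusterCEQ/2016-04-01/
        Path day = Files.createTempDirectory("2016-04-01");
        Files.createDirectory(day.resolve("4489867359541"));
        Files.createDirectory(day.resolve("4489867359542")); // made-up second job

        // Each printed directory is a concrete path that could be passed
        // to textFileStream on its own.
        for (Path dir : expand(day, "*")) {
            System.out.println(dir);
        }
    }
}
```

With the directories resolved up front, each one could get its own textFileStream call and the resulting streams could be combined. Note that this only covers directories that exist when the job starts; sub-directories created later would not be picked up without re-expanding the pattern.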