Hi, I ran some experiments reading an HDFS file while it is being appended to. It seems that the reader takes a snapshot of the file at the moment it is opened and does not see any data appended afterwards. This is different from what happens when reading a local file that is still being written. My code is as follows:

    import java.io.IOException;
    import java.io.InputStream;
    import java.net.URI;
    import java.util.Scanner;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    Configuration conf = new Configuration();
    InputStream in = null;
    try {
        FileSystem fs = FileSystem.get(URI.create("hdfs://MyCluster/"), conf);
        in = fs.open(new Path("/tmp/test.txt"));
        Scanner scanner = new Scanner(in);
        // Print every line until the stream reports end of file.
        while (scanner.hasNextLine()) {
            System.out.println("+++++++++++++++++++++++++++++++ read " + scanner.nextLine());
        }
        System.out.println("+++++++++++++++++++++++++++++++ reader finished ");
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        // Null-safe close of the HDFS input stream.
        IOUtils.closeStream(in);
    }

I'm wondering whether this is the designed HDFS read behavior, or whether it can be changed by using a different API or configuration. What I expect is the same behavior as reading a local file: when a reader reads a file while another writer is writing to it, the reader receives all the data written by the writer.

Thanks,
Jiang
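
P.S. To make the comparison concrete, here is a minimal sketch of the local-file behavior meant above. It is only an illustration and not part of the HDFS test: the class name LocalTailDemo, the helper drain, and the temporary file are all made up for this example. It reads a local file to end of file, appends to the same file through a second handle, and then reads again from the stream that was already open. It uses a raw FileInputStream rather than a Scanner so that no wrapper remembers the earlier end-of-file state.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names here are illustrative only.
class LocalTailDemo {

    // Reads from the stream until read() reports end of file and
    // returns everything that was read in this call.
    static String drain(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            sb.append(new String(buf, 0, n, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    // Returns what the reader saw before and after the append.
    static List<String> demo() {
        List<String> seen = new ArrayList<>();
        try {
            Path file = Files.createTempFile("tail-demo", ".txt");
            try {
                Files.write(file, "first\n".getBytes(StandardCharsets.UTF_8));
                try (InputStream in = new FileInputStream(file.toFile())) {
                    // First pass: read up to the current end of file.
                    seen.add(drain(in));
                    // A separate writer appends while the reader stays open.
                    Files.write(file, "second\n".getBytes(StandardCharsets.UTF_8),
                            StandardOpenOption.APPEND);
                    // Second pass on the SAME stream, without reopening.
                    seen.add(drain(in));
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

On a local file system, the second drain call is expected to return the appended bytes even though the stream was opened before they were written. That is the behavior I was hoping to get from the HDFS stream as well.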