Hi,

I ran some experiments reading an HDFS file while it was being appended to. 
The reader seems to take a snapshot of the file at the moment it is opened and 
never sees data appended afterwards. This differs from what happens when 
reading a changing local file. My code is as follows:

    Configuration conf = new Configuration();
    InputStream in = null;
    try {
        FileSystem fs = FileSystem.get(URI.create("hdfs://MyCluster/"), conf);
        in = fs.open(new Path("/tmp/test.txt"));
        Scanner scanner = new Scanner(in);
        while (scanner.hasNextLine()) {
            System.out.println("+++++++++++++++++++++++++++++++ read " + scanner.nextLine());
        }
        System.out.println("+++++++++++++++++++++++++++++++ reader finished ");
    } catch (IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    } finally {
        IOUtils.closeStream(in);
    }

Is this the intended HDFS read behavior, or can it be changed through a 
different API or configuration? What I expect is the same behavior as reading 
a local file: when a reader reads a file while another writer is appending to 
it, the reader receives all the data written by the writer.
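
To make the local-file behavior I am comparing against concrete, here is a minimal sketch that uses only java.io and java.nio, with no Hadoop involved. The class name LocalAppendDemo, the helper drain, and the temporary file are made up for illustration. It assumes a local file system on which an already-open stream can keep reading after it has reported end of file once.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class: shows a local reader picking up data appended after open.
class LocalAppendDemo {

    // Reads from the stream until it reports end of file and returns the bytes as text.
    static String drain(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Returns {text read before the append, text read after the append},
    // both read through the same stream that was opened before the append.
    static String[] readAroundAppend() throws IOException {
        Path file = Files.createTempFile("append-demo", ".txt");
        try {
            Files.write(file, "line 1\n".getBytes(StandardCharsets.UTF_8));
            try (InputStream in = new FileInputStream(file.toFile())) {
                String before = drain(in);
                // A separate writer appends while the reader's stream stays open.
                Files.write(file, "line 2\n".getBytes(StandardCharsets.UTF_8),
                        StandardOpenOption.APPEND);
                String after = drain(in);
                return new String[] { before, after };
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        String[] parts = readAroundAppend();
        System.out.println("before append: " + parts[0].trim());
        System.out.println("after append:  " + parts[1].trim());
    }
}
```

The second call to drain goes through the stream that was opened before the append, and that is the behavior I was hoping to get from the HDFS stream as well.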

Thanks,
Jiang

