Hi guys,

I am trying to implement a simple, experimental program (not for production). It calls FileSystem.listFiles() to get the list of files under an HDFS folder, and then FileSystem.getFileBlockLocations() to get the replica locations of each file's blocks.
Since it is a controlled environment, I can make sure the files are static, and I don't need to worry about datanode crashes, fail-over, etc. Assume that within a small time window (say, 1 minute) I have 100 to 1000s of clients invoking the same program to look up the same folder. Will the above two APIs guarantee the *same result in the same order* for all clients?

To elaborate a bit more, say there is a folder called /dfs/dn/user/data that contains three files: file1, file2, and file3. Suppose client1 gets:

listFiles() : file1, file2, file3
getFileBlockLocations(file1) -> datanode1, datanode3, datanode6

Will all other clients get the same information (I think so), and in the same order? Or do I have to sort on each client to guarantee the order?

Many thanks for your input,
Demai
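
P.S. In case sorting on each client turns out to be necessary, this is roughly the fallback I have in mind. It is only a sketch in plain Java: ReplicaOrder and its two methods are names I made up for illustration (they are not part of Hadoop), and the hard-coded values in main() stand in for what listFiles() and BlockLocation.getHosts() would return.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Hadoop: puts results into one
// canonical order so that every client sees the same sequence.
final class ReplicaOrder {

    // Returns a sorted copy of the file names; the input list is not modified.
    static List<String> sortedNames(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        Collections.sort(copy);
        return copy;
    }

    // Returns the replica hosts of one block as a sorted list;
    // the input array is not modified.
    static List<String> sortedHosts(String[] hosts) {
        String[] copy = hosts.clone();
        Arrays.sort(copy);
        return Arrays.asList(copy);
    }

    public static void main(String[] args) {
        // Stand-in values: two clients receive the same data in different orders.
        List<String> client1Files = Arrays.asList("file1", "file2", "file3");
        List<String> client2Files = Arrays.asList("file3", "file1", "file2");
        String[] client1Hosts = {"datanode1", "datanode3", "datanode6"};
        String[] client2Hosts = {"datanode6", "datanode1", "datanode3"};

        // After sorting, both clients agree on the order.
        System.out.println(sortedNames(client1Files).equals(sortedNames(client2Files)));
        System.out.println(sortedHosts(client1Hosts).equals(sortedHosts(client2Hosts)));
    }
}
```

The sort uses plain String ordering, so "datanode10" would come before "datanode3". That is fine for my purpose, since all I need is for every client to agree on one order.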