Re: TestDFSIO
On Fri, May 14, 2010 at 5:57 PM, Konstantin Shvachko wrote:
> On second thought, there should not be any racing.
> You probably restart the HDFS cluster between the runs.
> When you shut down the cluster after the first run, some files
> may still remain unclosed.

No, we are not restarting the cluster. It is up and running.
Re: TestDFSIO
On second thought, there should not be any racing. You probably restart the HDFS cluster between the runs. When you shut down the cluster after the first run, some files may still remain unclosed. Then, after restarting the cluster, all of their leases are renewed, and if somebody tries to recreate an unclosed file he will fail with AlreadyBeingCreatedException.

If my guess is correct, then you should keep the cluster running between the consecutive DFSIO runs. Cleaning up will still help keep the benchmark data consistent: if a bunch of files is recreated, HDFS will start removing the old file blocks, which increases the internal load and skews the performance results.

--Konstantin

On 5/14/2010 2:26 PM, Konstantin Shvachko wrote:

Hi Lavanya,

On 5/14/2010 10:51 AM, Lavanya Ramakrishnan wrote:
> Hello,
>
> I am running org.apache.hadoop.fs.TestDFSIO to benchmark our HDFS
> installation and had a couple of questions regarding the same.
>
> a) If I run the benchmark back to back in the same directory, I start seeing
> strange errors such as NotReplicatedYetException or
> AlreadyBeingCreatedException (failed to create file on client 5,
> because this file is already being created by DFSClient_ on ...). It
> seems like there might be some kind of race condition between the
> replication from a previous run and subsequent runs. Is there any way to
> avoid this?

Yes, this looks like a race with the previous run. You can just wait, or run TestDFSIO -clean before the second run.

> b) I have been testing with concurrent writers and see a significant drop in
> throughput. I get about 60 MB/s for 1 writer and about 8 MB/s for 50
> concurrent writers. Is this a known scalability limit for HDFS? Is there
> any way to configure this to perform better?

It depends on the size and the configuration of your cluster. In general, for consistent results with DFSIO, it is better to set up 1 or 2 tasks per node and specify as many files for DFSIO as you have map slots. The idea is that all maps finish in one wave; then you should get optimal performance.

Thanks,
--Konstantin
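A minimal sketch of that workflow, assuming a Hadoop 0.20-era layout (the test jar name, file count, and file size below are illustrative and will differ per installation):

    # Clean up the previous run's files while the cluster stays up, so that
    # old block deletions do not overlap with the next run.
    hadoop jar $HADOOP_HOME/hadoop-*-test.jar TestDFSIO -clean

    # Start the next write run only after the cleanup completes.
    hadoop jar $HADOOP_HOME/hadoop-*-test.jar TestDFSIO -write -nrFiles 50 -fileSize 360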
Re: TestDFSIO
Hi Lavanya,

On 5/14/2010 10:51 AM, Lavanya Ramakrishnan wrote:
> Hello,
>
> I am running org.apache.hadoop.fs.TestDFSIO to benchmark our HDFS
> installation and had a couple of questions regarding the same.
>
> a) If I run the benchmark back to back in the same directory, I start seeing
> strange errors such as NotReplicatedYetException or
> AlreadyBeingCreatedException (failed to create file on client 5,
> because this file is already being created by DFSClient_ on ...). It
> seems like there might be some kind of race condition between the
> replication from a previous run and subsequent runs. Is there any way to
> avoid this?

Yes, this looks like a race with the previous run. You can just wait, or run TestDFSIO -clean before the second run.

> b) I have been testing with concurrent writers and see a significant drop in
> throughput. I get about 60 MB/s for 1 writer and about 8 MB/s for 50
> concurrent writers. Is this a known scalability limit for HDFS? Is there
> any way to configure this to perform better?

It depends on the size and the configuration of your cluster. In general, for consistent results with DFSIO, it is better to set up 1 or 2 tasks per node and specify as many files for DFSIO as you have map slots. The idea is that all maps finish in one wave; then you should get optimal performance.

Thanks,
--Konstantin
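To make the one-wave advice concrete, here is a sketch assuming a hypothetical 20-node cluster with 2 map slots per node (set via mapred.tasktracker.map.tasks.maximum in mapred-site.xml); the numbers are examples only:

    # 20 nodes x 2 map slots per node = 40 map slots total.
    # Each DFSIO file is written by one map task, so -nrFiles 40
    # lets all maps run in a single wave.
    hadoop jar $HADOOP_HOME/hadoop-*-test.jar TestDFSIO -write -nrFiles 40 -fileSize 1000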
TestDFSIO
Hello,

I am running org.apache.hadoop.fs.TestDFSIO to benchmark our HDFS installation and had a couple of questions regarding the same.

a) If I run the benchmark back to back in the same directory, I start seeing strange errors such as NotReplicatedYetException or AlreadyBeingCreatedException (failed to create file on client 5, because this file is already being created by DFSClient_ on ...). It seems like there might be some kind of race condition between the replication from a previous run and subsequent runs. Is there any way to avoid this?

b) I have been testing with concurrent writers and see a significant drop in throughput. I get about 60 MB/s for 1 writer and about 8 MB/s for 50 concurrent writers. Is this a known scalability limit for HDFS? Is there any way to configure this to perform better?

thanks
LR
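For context, the back-to-back scenario described above would look roughly like this (jar name and sizes are illustrative). TestDFSIO writes to a fixed default output directory, so the second run recreates the first run's file names:

    # First run writes files under TestDFSIO's default output directory.
    hadoop jar $HADOOP_HOME/hadoop-*-test.jar TestDFSIO -write -nrFiles 50 -fileSize 360

    # A second run started immediately recreates the same files and can hit
    # NotReplicatedYetException or AlreadyBeingCreatedException while the
    # previous run's blocks are still being replicated or cleaned up.
    hadoop jar $HADOOP_HOME/hadoop-*-test.jar TestDFSIO -write -nrFiles 50 -fileSize 360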