I've got a small cluster running Ubuntu 10.10, hadoop-0.20.203.0, and Ceph version 0.29.1. I'm trying to use the Ceph file system as the storage for HDFS (instead of an ordinary local directory). Below I've posted system stats (1), the output of a TestDFSIO job attempt (2), and my configurations (3). Can anyone help me understand how HDFS queries the OS/Ceph for the available space? Is this a special case since I'm pointing Hadoop at Ceph? Are there any settings I can specify so the configured capacity reflects what Ceph actually has available? *Please see my note below regarding dfs.data.dir.* Thanks!
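My (possibly wrong) understanding so far is that each DataNode sizes itself by asking the OS about the partition holding its dfs.data.dir, essentially the same thing as running `df` against that directory. Here's the quick check I've been using to see what the OS would tell Hadoop (the path is just my own data dir; substitute yours):

```shell
#!/bin/sh
# Assumption (unverified): the DataNode's reported capacity is whatever the
# OS reports for the filesystem containing dfs.data.dir, i.e. what df shows.
# Usage: ./check_capacity.sh /mnt/ceph/hdfs0   (defaults to the current dir)
DATA_DIR=${1:-.}
# -P forces POSIX single-line output so awk can rely on the column layout
df -kP "$DATA_DIR" | awk 'NR==2 {printf "capacity_kb=%s available_kb=%s\n", $2, $4}'
```

If that assumption is right, the numbers above should line up with what `dfsadmin -report` shows per datanode, but for me they don't, which is the heart of my question.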
(In case you were wondering, I have tested this setup's functionality and it works. It's just that I'm unable to use all of my capacity.)

*(1) Here's what I'm seeing:*

```
local_account@server1:~/ceph-hadoop/hadoop-0.20.203.0$ bin/hadoop dfsadmin -report
Configured Capacity: 3628498944 (3.38 GB)
Present Capacity: 3313673216 (3.09 GB)
DFS Remaining: 3313668096 (3.09 GB)
DFS Used: 5120 (5 KB)
DFS Used%: 0%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
```

```
local_account@server1:/mnt/ceph$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1              97G  3.1G   89G   4% /
none                  2.0G  164K  2.0G   1% /dev
none                  2.0G     0  2.0G   0% /dev/shm
none                  2.0G   64K  2.0G   1% /var/run
none                  2.0G     0  2.0G   0% /var/lock
none                   97G  3.1G   89G   4% /var/lib/ureadahead/debugfs
/dev/sda3              97G  191M   92G   1% /data-storage
10.10.207.156:/       289G   11G  264G   4% /mnt/ceph
```

(Note the last line: the Ceph mount has 289G total and 264G available, yet HDFS only reports a configured capacity of 3.38 GB.)

```
local_account@server1:/mnt/ceph$ l
hdfs0/  hdfs1/  hdfs2/
local_account@server1:/mnt/ceph$ du -sh
4.5K    .
local_account@server2:/mnt/ceph$ du -sh
4.5K    .
```

*(2) A TestDFSIO job output (failed)*

I've only posted the beginning of the output. Also notice that I told the job to write a total of 6000 MB, which is more than the configured capacity. TestDFSIO works properly when writing less than the configured capacity.
```
local_account@server1:~/ceph-hadoop/hadoop-0.20.203.0$ bin/hadoop jar hadoop-test-0.20.203.0.jar TestDFSIO -write -nrFiles 3 -fileSize 2000
TestDFSIO.0.0.4
11/06/26 17:22:17 INFO fs.TestDFSIO: nrFiles = 3
11/06/26 17:22:17 INFO fs.TestDFSIO: fileSize (MB) = 2000
11/06/26 17:22:17 INFO fs.TestDFSIO: bufferSize = 1000000
11/06/26 17:22:18 INFO fs.TestDFSIO: creating control file: 2000 mega bytes, 3 files
11/06/26 17:22:18 INFO fs.TestDFSIO: created control files for: 3 files
11/06/26 17:22:18 INFO mapred.FileInputFormat: Total input paths to process : 3
11/06/26 17:22:19 INFO mapred.JobClient: Running job: job_201106261629_0001
11/06/26 17:22:20 INFO mapred.JobClient:  map 0% reduce 0%
11/06/26 17:31:02 INFO mapred.JobClient: Task Id : attempt_201106261629_0001_m_000000_0, Status : FAILED
java.io.IOException: All datanodes 10.10.207.44:50010 are bad. Aborting...
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2711)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2255)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2423)
```

*(3) Here are my configurations:*

conf/core-site.xml

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://server1:9000</value>
  </property>
  <property>
    <name>webinterface.private.actions</name>
    <value>true</value>
  </property>
</configuration>
```

conf/hdfs-site.xml (for server2)

Note: dfs.data.dir must be unique per datanode since I'm using Ceph (a global/networked file system). Let me know if this is confusing. The list below gives the directory for each datanode (where HDFS should write its data):
- *server2 - /mnt/ceph/hdfs0*
- *server3 - /mnt/ceph/hdfs1*
- *server4 - /mnt/ceph/hdfs2*

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/local_account/ceph-hadoop/namedir-0.20.203.0</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/mnt/ceph/hdfs0</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```

conf/mapred-site.xml

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>server1:9001</value>
  </property>
</configuration>
```

master: server1
slaves: server2, server3, server4

Thanks again,
Adam
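P.S. One more data point I can gather: since all three dfs.data.dir directories live on the same Ceph mount, `df` should report the identical 289G filesystem for each of them. This quick loop (the paths match my layout above; adjust for your own cluster) is how I confirm that:

```shell
#!/bin/sh
# All three data dirs sit under the shared /mnt/ceph mount, so if each
# DataNode sizes itself from df on its data dir, every one of them should
# see the same 289G filesystem. Paths below match my cluster's layout.
for d in /mnt/ceph/hdfs0 /mnt/ceph/hdfs1 /mnt/ceph/hdfs2; do
  df -kP "$d" | awk -v dir="$d" 'NR==2 {print dir, "->", $1, "size_kb=" $2, "avail_kb=" $4}'
done
```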