Ryan: actually, I really like this decoupling of HBase and the filesystem :)

Thanks again, all, for your replies.

Cheers,
TR

Ryan Rawson wrote:
Hey,

When you start HBase in a fresh installation, it will use the local fs in /tmp. The Hadoop filesystem libraries we use allow the use of at least 3 filesystems (local, hdfs, kfs). Right now you are seeing the single ZK process and the combined HBase master/regionserver process.

HBase needs the following things out of its filesystem:
- global view - every single regionserver & master MUST see every file from everyone at all times. A 1 hour rsync won't cut it.
- high bandwidth - once you get 3+ servers doing high IO (compaction, etc), you won't want to rely on a 1 disk NFS.

In theory you can use something like NFS and a common mount dir on all regionservers/masters. This won't scale, of course. It should _in theory_ work... You can specify the rootdir with something like "file:///nfs_mount_path/hbase". Normally we'd say hdfs://namenode:port/hbase

The hbase scripts don't boot up or control hadoop at all. You must provide a working hadoop, then hbase can use it. It may seem a little "annoying" to have a 2 step process, but the decoupled control makes our control scripts more generic and suitable for all.

Good luck out there!
-ryan
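
For reference, the rootdir setting mentioned above lives in conf/hbase-site.xml under the hbase.rootdir property. A minimal sketch, not a tested setup: "namenode" and "port" are the placeholders from the message and need to be replaced with your own cluster's values, and the NFS path is the illustrative one quoted above.

```xml
<!-- conf/hbase-site.xml (sketch): hbase.rootdir takes exactly one value. -->
<!-- Usual case: HDFS. "namenode" and "port" are placeholders, not real values. -->
<!-- Shared-mount alternative (should work in theory, will not scale): -->
<!--   file:///nfs_mount_path/hbase -->
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://namenode:port/hbase</value>
  </property>
</configuration>
```

Since the hbase scripts do not start hadoop, the filesystem named in this value has to be up and reachable from every master and regionserver before HBase is started.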
