On Tue, Apr 6, 2010 at 12:02 AM, Patrick Hunt <[email protected]> wrote:
> The ZK servers are sensitive to disk (io) latency. I just troubleshot an
> issue last week where a user was seeing 80-second (second!) latencies.
> Turns out they were running zk server, namenode, tasktracker, and hbase
> region server all on the same box; that box had a single spindle for all
> io activity and was at 100% utilization for long periods of time. If
> you want decent ZK API latencies (<100ms) you really need to ensure that
> there's at least a separate spindle available for the ZK transaction logs.
>
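The disk-latency point above can be checked roughly before settling on a disk layout. ZooKeeper syncs its transaction log to disk before acknowledging a write, so the time a small write plus fsync takes on the candidate disk is a reasonable first proxy. What follows is only a sketch in Python, not a ZooKeeper tool: the function names `fsync_latencies_ms` and `summarize`, the 512-byte payload, and the 50-write sample are all assumptions chosen for illustration.

```python
import os
import tempfile
import time


def fsync_latencies_ms(directory, writes=50, payload=b"x" * 512):
    """Append small records to a scratch file in `directory`, fsync after
    each one, and return the per-write latencies in milliseconds."""
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(writes):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # force the write to the device, as a log sync would
            latencies.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
        os.remove(path)  # clean up the scratch file
    return latencies


def summarize(latencies):
    """Return the median (upper median for even counts) and the worst case."""
    ordered = sorted(latencies)
    return {
        "median_ms": ordered[len(ordered) // 2],
        "max_ms": ordered[-1],
    }


if __name__ == "__main__":
    # Assumption: the system temp directory stands in for the real target.
    # Point this at the disk intended for the ZK transaction log instead.
    print(summarize(fsync_latencies_ms(tempfile.gettempdir())))
```

Running it once against the disk meant for the transaction log (the `dataLogDir` setting in zoo.cfg) and once against the shared disk while the other daemons are busy gives a rough feel for the difference. It does not reproduce ZooKeeper's actual write pattern, so treat the numbers as indicative only.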
Great insight and info! So that means that if ZK uses a separate spindle, the rest can reside on the same spindle, at least in your case. Nice! Thanks for the info, noting it.

Imran

> Patrick
>
> On 04/05/2010 11:11 AM, Jonathan Gray wrote:
>>
>> Imran,
>>
>> It's impossible to give good advice on cluster size and hardware
>> configuration without some idea of the requirements.
>>
>> How much data? How will the data be queried? What kind of load do
>> you expect? You are going to be doing offline batch/MapReduce,
>> online random access, as well as search all from the same nodes?
>> This can be dangerous.
>>
>> I would strongly recommend against putting Hadoop+HBase on the same
>> nodes as something like Solr, unless you have dedicated disks for
>> each. Also, don't forget about ZooKeeper which you definitely will
>> need separate nodes/disks for if you will be co-locating so many
>> other things.
>>
>> JG
>>
>>> -----Original Message-----
>>> From: Imran M Yousuf [mailto:[email protected]]
>>> Sent: Monday, April 05, 2010 9:52 AM
>>> To: [email protected]
>>> Subject: About test/production server configuration
>>>
>>> Hi,
>>>
>>> We are a startup who have decided to use HBase purely because we
>>> want to take advantage of HDFS based reliability, redundancy,
>>> MapReduce and BigTable. For that we are thinking to go for a test
>>> environment with 5 servers and production environment with 10
>>> servers; in both cases the Hadoop cluster will be used for HBase +
>>> MapReduce + Solr Index.
>>>
>>> Firstly, I would like some opinion on whether 10 servers is a good
>>> number for all 3 purposes or not. Secondly what kind of test
>>> environment is currently in use in different organizations.
>>> Thirdly, I would like to learn some server configuration and
>>> purchase price (with purchase location if possible).
>>>
>>> Waiting eagerly for some feedback.
>>>
>>> Thank you,
>>>
>>> -- Imran M Yousuf Entrepreneur & Software Engineer Smart IT
>>> Engineering Dhaka, Bangladesh Email: [email protected]
>>> Blog: http://imyousuf-tech.blogs.smartitengineering.com/ Mobile:
>>> +880-1711402557
>

--
Imran M Yousuf
Entrepreneur & Software Engineer
Smart IT Engineering
Dhaka, Bangladesh
Email: [email protected]
Blog: http://imyousuf-tech.blogs.smartitengineering.com/
Mobile: +880-1711402557
