Hi Rakhi,

No, that's not what I meant.
You can run HDFS daemons on one set of instances, and TaskTrackers on
another set of instances. No need to write up access lists. Yes, the
mapreduce subsystem will also use the HDFS volume along with HBase;
this is not a problem. The potential benefit of splitting function this
way is that mapreduce tasks would not contend with the various
functions of the storage cluster.

However, I think you have more basic problems at this time. As Amandeep
has followed up with you already, your instances do NOT have enough
RAM. You must use large or x-large instances. Start there. Another
benefit of larger instances is more local instance storage in addition
to the additional RAM.

Also, please follow Amandeep's advice to raise the maximum number of
allowable open files at the OS level and the maximum number of xceivers
allowable at the HDFS level (see the sample snippet at the end of this
message).

Have you looked at the troubleshooting page on the wiki?
http://wiki.apache.org/hadoop/Hbase/Troubleshooting

Given your current small data set you do not need 20 nodes for it.
Also, try standing up a smaller cluster of large or x-large instances.

   - Andy

> From: Rakhi Khatwani <[email protected]>
> Subject: Re: Region Servers going down frequently
> To: [email protected], [email protected]
> Cc: "Ninad" <[email protected]>
> Date: Tuesday, April 7, 2009, 11:04 PM
>
> Hi Andy,
> I think i figured it out.
> We will have to set the mapred.hosts and dfs.hosts properties in
> hadoop-site.xml as follows:
>
> <property>
>   <name>dfs.hosts</name>
>   <value>filename1</value>
>   <description>Names a file that contains a list of hosts that are
>   permitted to connect to the namenode. The full pathname of the file
>   must be specified. If the value is empty, all hosts are
>   permitted.</description>
> </property>
>
> [where filename1 will contain the list of instances to be considered
> for storage]
>
> <property>
>   <name>mapred.hosts</name>
>   <value>filename2</value>
>   <description>Names a file that contains the list of nodes that may
>   connect to the jobtracker. If the value is empty, all hosts are
>   permitted.</description>
> </property>
>
> [where filename2 will contain the list of instances which will carry
> out computation tasks]
>
> Correct me if i am wrong.
>
> Thanks once again,
> Rakhi.
>
> On Wed, Apr 8, 2009 at 10:45 AM, Rakhi Khatwani
> <[email protected]> wrote:
>
> > Hi Andy,
> > Thanks for your suggestion.
> > But i was wondering how we could separate HDFS storage from mapred
> > computations, as mapred uses the same master/slave configuration as
> > HDFS.
> >
> > Did you mean using a set of instances as slaves and another set of
> > instances as regionservers?
> >
> > Thanks in advance,
> > Rakhi
> >
> >
> > On Tue, Apr 7, 2009 at 11:06 PM, Andrew Purtell
> > <[email protected]> wrote:
> >
> >> Hi Rakhi,
> >>
> >> The "cannot obtain block" error is actually an HDFS problem. Most
> >> likely this block was lost by HDFS during a period of excessive
> >> load. Usually the first sign you are using insufficient
> >> resources for your load is filesystem issues such as these. To
> >> address the problems I recommend you do two things at once.
> >>
> >> 1) The minimum usable instance type for HBase (and Hadoop) is
> >> large in my opinion. The basic rule of thumb for HBase and
> >> Hadoop daemons is you must allocate 1GB of heap/RAM and one
> >> CPU (or vcpu) thread for each daemon. You can search the
> >> hbase-user@ archives for previous discussion on this topic.
> >>
> >> 2) Allocate more instances to spread the load on DFS.
> >>
> >> On EC2 I recommend running storage such as HDFS/HBase on one set
> >> of instances and mapreduce computations on another set. Hadoop
> >> and HBase daemons are sensitive to thread starvation problems.
> >>
> >> Hope this helps,
> >>
> >>    - Andy
> >>
> >> > From: Rakhi Khatwani
> >> > Subject: Region Servers going down frequently
> >> > Date: Tuesday, April 7, 2009, 2:45 AM
> >> > Hi,
> >> > I have a 20 node cluster on ec2 (small instances). I have a set
> >> > of tables which store huge amounts of data (tried with 10,000
> >> > rows, more to be added), but during my map reduce jobs some of
> >> > the region servers shut down, thereby causing data loss and
> >> > stopping my program execution; in fact one of my tables got
> >> > damaged. Whenever i scan the table, i get the "could not obtain
> >> > block" error.
> >>
> >
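P.S. Regarding the open file and xceiver limits mentioned above: the
right values depend on your workload, so treat the numbers here as
illustrative starting points only. At the OS level, raise the nofile
limit for the user running the Hadoop/HBase daemons (for example via
/etc/security/limits.conf, to something like 32768). At the HDFS level,
raise the datanode xceiver ceiling in hadoop-site.xml on each datanode,
for example:

  <property>
    <name>dfs.datanode.max.xcievers</name>
    <!-- yes, the property name really is spelled "xcievers";
         2048 is only a starting point, raise it further if the
         datanodes still complain about exceeding the xceiver limit -->
    <value>2048</value>
    <description>Upper bound on the number of threads a datanode will
    use to serve concurrent block read/write requests.</description>
  </property>

Restart the datanodes after changing this setting.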
