On a small cluster, 8000 isn't a lot. Grepping the datanode logs for
"Too many open files" will tell you whether you're hitting that limit.
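For example, something like this on each datanode (the log path is just
a guess, it depends on where your install puts the logs):

  grep "Too many open files" /var/log/hadoop/*datanode*.log*

If that matches anything, you're blowing the file descriptor limit.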
Also make sure you set the xcievers right (there's a sample snippet at
the bottom of this mail). And yeah, I'd love to see some logs ;)

If you want to do more "interactive" debugging, do come see us on
#hbase @ freenode. Lots of Y!s there too.

J-D

On Wed, May 26, 2010 at 10:09 AM, Vidhyashankar Venkataraman
<[email protected]> wrote:
> No OOME or HDFS errors that I can see in the logs..
> I turned major compaction on and restarted HBase: now the RSs aren't
> shutting down: compactions are happening..
>
> I had set the ulimit to 8000 a while back.. Should I increase it more,
> then? (With the current setting, each region can have a max of around
> 4 open files if there are 2000 regions per node)...
>
> Let me also check the logs a little more carefully and get back to the
> forum..
>
> Thank you
> Vidhya
>
>
> On 5/26/10 9:38 AM, "Jean-Daniel Cryans" <[email protected]> wrote:
>
> I'm pretty sure something else is going on.
>
> 1) What does it log when it shuts down? ZooKeeper session timeout?
> OOME? HDFS errors?
>
> 2) Is your cluster meeting all the requirements? Especially the last
> bullet point? See
> http://hadoop.apache.org/hbase/docs/r0.20.4/api/overview-summary.html#requirements
>
> J-D
>
> On Wed, May 26, 2010 at 9:07 AM, Vidhyashankar Venkataraman
> <[email protected]> wrote:
>> Are there any side effects to turning major compactions off, other
>> than just a hit in the read performance?
>>
>> I was trying to merge a 120 GB update (modify/insert/delete
>> operations) into a 2 TB fully compacted HBase table with 5 region
>> servers using a map reduce job.. Each RS was serving around 2000
>> regions (256 MB max size)... Major compactions were turned off before
>> the job started (by setting the compaction period very high, to
>> around 4 or 5 days)..
>>
>> As the job was going on, the region servers just shut down after the
>> table reached near-100% fragmentation (as shown in the web
>> interface).. On looking at the RS logs, I saw that there were
>> compaction checks for each region which obviously didn't clear, and
>> the RSs shut down soon after the checks.. I tried restarting the
>> database after killing the map reduce job (still with major
>> compactions turned off).. The RSs shut down soon after booting up..
>>
>> Is this expected? Even if the update files (the additional
>> StoreFiles) per region get huge, won't the region get split on its
>> own?
>>
>> Thank you
>> Vidhya
>>
>
>
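P.S. Here's roughly what I mean, in case it helps (the values and file
paths below are just examples, tune them for your own setup):

  # In conf/hdfs-site.xml on every datanode (yes, the property name
  # really is misspelled like that), then restart the datanodes:
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>4096</value>
  </property>

  # Raise the open-files ulimit for the user that runs the datanodes
  # and region servers, e.g. in /etc/security/limits.conf:
  hadoop  -  nofile  32768

With 2000 regions per node, 8000 file descriptors get eaten up very
fast, so don't be shy with that last number.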
