Yes, we use Phoenix on a lot of tables, so it won't be possible to remove it. We are already migrating to a newer cluster, but we need to keep operating this cluster for a while during the migration. Although we are running HDP, IMO this seems to be related to vanilla (Apache) Hadoop/HBase, so I was hoping to get some pointers. Anyway, I will post it on the vendor forum too.

Thanks,
Anil

On Thu, Feb 8, 2018 at 3:56 PM, Ted Yu <[email protected]> wrote:

> Do you use Phoenix functionality ?
>
> If not, you can try disabling the Phoenix side altogether (removing Phoenix
> coprocessors).
>
> 2.3.4 is really old - please upgrade to 2.6.3
>
> You should consider asking on the vendor's community forum.
>
> Cheers
>
> On Thu, Feb 8, 2018 at 3:06 PM, anil gupta <[email protected]> wrote:
>
> > Hi Folks,
> >
> > We are running a 60 Node MapReduce/HBase HDP cluster. HBase 1.1.2 , HDP:
> > 2.3.4.0-3485. Phoenix is enabled on this cluster.
> > Each slave has ~120gb ram. RS has 20 Gb heap, 12 disk of 2Tb each and 24
> > cores. This cluster has been running OK for last 2 years but recently with
> > few disk failures(we unmounted those disks) it hasnt been running fine. I
> > have checked hbck and hdfs fsck. Both of them report no inconsistency.
> >
> > Some our RegionServers keeps on aborting with following error:
> >
> > 1 ==>
> > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
> > No lease on
> > /apps/hbase/data/data/default/DE.TABLE_NAME/35aa0de96715c33e1f0664aa4d9292ba/recovered.edits/0000000003948161445.temp
> > (inode 420864666): File does not exist. [Lease. Holder:
> > DFSClient_NONMAPREDUCE_-64710857_1, pendingcreates: 1]
> >
> > 2 ==>
> > 2018-02-08 03:09:51,653 ERROR
> > [regionserver/hdpslave26.bigdataprod1.com/1.16.6.56:16020]
> > regionserver.HRegionServer: Shutdown / close of WAL failed:
> > org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on
> > /apps/hbase/data/oldWALs/hdpslave26.bigdataprod1.com%2C16020%2C1518027416930.default.1518085177903
> > (inode 420996935): File is not open for writing. Holder
> > DFSClient_NONMAPREDUCE_649736540_1 does not have any open files.
> >
> > All the LeaseExpiredException are happening for recovered.edits and
> > oldWALs.
> >
> > HDFS is around 48% full. Most of the DN's have 30-40% space left on them.
> > NN heap is at 60% use. I have tried googling around but cant find anything
> > concrete to fix this problem. Currently, 15/60 nodes are already down in
> > last 2 days.
> > Can someone please point out what might be causing these RegionServer
> > failures?
> >
> > --
> > Thanks & Regards,
> > Anil Gupta
>

--
Thanks & Regards,
Anil Gupta
