Ralph, one thing you could try is to disable the Phoenix side altogether 
(disable the configs for coprocessors). Then restart hbase. That should help to 
bring the regions online (assuming that the Phoenix coprocessor invocations are 
causing the servers to go down).
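
To find which site-wide settings to disable, it may help to list every property in hbase-site.xml whose value names a Phoenix class. Below is a minimal Python sketch, assuming the usual Hadoop configuration/property/name/value layout. The function name and the default match strings ("phoenix", "IndexedWALEditCodec", "LocalIndex") are my assumptions, not something confirmed for this cluster, so adjust them as needed.

```python
import xml.etree.ElementTree as ET

def phoenix_properties(path, needles=("phoenix", "IndexedWALEditCodec", "LocalIndex")):
    """Return (name, value) pairs from a Hadoop-style site file whose value
    contains any of the needles (case-insensitive).

    The default needles are illustrative guesses at Phoenix-related class
    names; they are not an authoritative list.
    """
    root = ET.parse(path).getroot()
    lowered = [n.lower() for n in needles]
    hits = []
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if any(n in value.lower() for n in lowered):
            hits.append((name, value))
    return hits
```

Calling phoenix_properties on the region server's hbase-site.xml returns the candidate properties to comment out before the restart. Coprocessors attached to individual tables live in the table descriptors rather than in this file, so this only covers the site-wide settings.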


________________________________
From: Perko, Ralph J <[email protected]>
Sent: Wednesday, April 08, 2015 9:04 AM
To: [email protected]
Subject: Re: hbase / phoenix errors

Hi – thanks everyone for the help.  I could use some guidance as my system is 
currently not usable.

I see how the bug impacted the system and I’m glad it showed up now.  But how 
do I move forward?
Options I see:
Apply the patches from issue #1634 to phoenix 4.3
Downgrade phoenix to 4.2.2
Something else you may suggest?

Regarding hbase – is there any recovery from the state it's in (see previous 
messages)?


Thanks,
Ralph


From: Perko, Ralph J <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, April 7, 2015 at 1:50 PM
To: "[email protected]" <[email protected]>
Subject: RE: hbase / phoenix errors

That is unfortunate.  Do you know if there is any way to recover the data?

I’ve tried the following:
I ran “hbase hbck” and learned all the regions are inconsistent and have holes 
to repair.  I attempted to run “hbase hbck -repairHoles” and got stuck in a 
loop with a message that a region is still in transition.  Based on something I 
read on the hbase mailing list I tried clearing out the zk node and repeating 
the repair, but none of this has worked.


From: Samarth Jain [mailto:[email protected]]
Sent: Tuesday, April 07, 2015 1:32 PM
To: [email protected]
Subject: Re: hbase / phoenix errors

I think that page needs to be updated. Sorry about that, Ralph. We ran into 
problems with HBase 0.98.4 and local indexes where a similar (but not the same) 
error was thrown:

Coprocessor.CoprocessorHost: the coprocessor …LocalIndexSplitter threw an 
exception
NoSuchMethodError hbase.regionserver.RegionServerService.getCatalogTracker

See https://issues.apache.org/jira/browse/PHOENIX-1634.

Rajeshbabu - would be interesting to get your opinion on this too.

On Tue, Apr 7, 2015 at 1:19 PM, Perko, Ralph J 
<[email protected]> wrote:
Based on the Phoenix compatibility chart at the download page I did not expect 
there to be issues with Phoenix 4.3 and Hbase 0.98.4.
http://phoenix.apache.org/download.html


From: Devaraj Das [mailto:[email protected]]
Sent: Tuesday, April 07, 2015 12:58 PM
To: [email protected]
Subject: Re: hbase / phoenix errors


What is the major driver to not use the HDP bundled Phoenix?

It seems to me that the Phoenix version you have is not compatible with the 
underlying HBase version, leading to all these issues. In particular, the 
method getCatalogTracker in HDP-2.2 takes 1 argument, but open-source Phoenix 
calls it with 0 arguments, which is why the NoSuchMethodError is thrown. This 
has been taken care of in the yet-to-be-released HDP-2.3 (the HBase and 
Phoenix code both support and use the 0-argument getCatalogTracker).

________________________________
From: Perko, Ralph J <[email protected]>
Sent: Tuesday, April 07, 2015 10:28 AM
To: [email protected]
Subject: RE: hbase / phoenix errors

Thank you for the response

I am using Phoenix 4.3 as a separate installation.

Unfortunately I have no way to copy the actual log files so I will need to 
transcribe as much as I can.

There are a lot of things going on – I’ll try to provide the highlights

Right now:
Using ambari – everything on the cluster is green – there are no apparent 
issues (but there are many)

On the hbase master web site it shows a table split hung up (all red – “regions 
in transition”) since yesterday evening.

All my phoenix tables are setup as follows:
Salted
100GB hregion max file size
Constant split size policy

If I attempt to connect to Phoenix using sqlline I get the exception:
NotServingRegionException:Region SYSTEM.CATALOG is not online

If I run hbase shell I can list the tables but cannot scan any of them

RS Log Messages:
Aside from the messages I provided earlier some errors and exceptions have come 
up as well on the RS:

In order I believe:

ERROR StatsScanner failed to update stats table
ERROR largeCompaction Compaction Failed

ERROR largeCompaction Failed after attempt 350 – ConnectionRefused – this 
server is in the failed servers list

Coprocessor.CoprocessorHost: the coprocessor …LocalIndexSplitter threw an 
exception
NoSuchMethodError hbase.regionserver.RegionServerService.getCatalogTracker

HRegion: compaction interrupted InterruptedIOException
RuntimeException: HRegionServer aborted

Restart

ERROR RS_LOG_REPLAY wal.HLogSplitter  OutOfMemory

Restart

Many of these: RemoteException (LeaseExpiredException) Holder: 
DFSClient…recovered.edits…: File does not exist

Many java.net.ConnectException: Connection refused

java.net.ConnectException SocketTimeoutException … row ‘’ on table 
‘hbase.meta’

This is where we are today

I will provide whatever info you need

Thanks!
Ralph



From: Nick Dimiduk [mailto:[email protected]]
Sent: Tuesday, April 07, 2015 9:05 AM
To: [email protected]
Subject: Re: hbase / phoenix errors

Also, beside each region server log file (.log) there is an output file 
(.out). Check the output files as well, as some serious crash scenarios 
bypass the logs and go directly to the .out files.
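
If there are many region servers, a small script can pair each .log with its sibling .out and show the tail of any .out file that is not empty. This is only a sketch: the function name is made up, and it assumes the .out file has the same base name as the .log file and sits in the same directory.

```python
from pathlib import Path

def tail_out_files(log_dir, lines=20):
    """Map each non-empty .out file that sits beside a .log file
    to its last `lines` lines."""
    result = {}
    for log_file in sorted(Path(log_dir).glob("*.log")):
        # Assumption: foo.log is paired with foo.out in the same directory.
        out_file = log_file.with_suffix(".out")
        if out_file.is_file() and out_file.stat().st_size > 0:
            text = out_file.read_text(errors="replace")
            result[out_file.name] = text.splitlines()[-lines:]
    return result
```

Point it at the region server log directory on each host; anything it returns is worth reading before the regular log.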

-n

On Tuesday, April 7, 2015, Devaraj Das 
<[email protected]> wrote:
Hi Ralph, were you using the Phoenix bundled with HDP-2.2 or was that a 
separate installation? Could you please copy/paste some log lines around the 
time of a regionserver's crash (look for exceptions etc around that time in the 
regionserver logs).
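
For pulling out the lines around a crash, something like the following Python sketch may be enough. It does plain substring matching on lines already read into a list; the function name and the search string in the usage note are illustrative assumptions, and nothing here depends on a particular log format.

```python
def lines_around(log_lines, needle, before=3, after=3):
    """Return one block of surrounding lines for every line containing needle."""
    blocks = []
    for i, line in enumerate(log_lines):
        if needle in line:
            start = max(0, i - before)  # do not run past the top of the file
            blocks.append(log_lines[start:i + after + 1])
    return blocks
```

For example, reading a regionserver log with splitlines() and searching for "aborted" would return each abort message together with the few lines before and after it.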
Thanks
Devaraj

On Apr 6, 2015, at 3:00 PM, Perko, Ralph J 
<[email protected]> wrote:
Hi, we recently upgraded to Phoenix 4.3 and Hortonworks 2.2 (HBase 0.98.4) and 
we are running into some issues.  I am wondering if I am missing something easy 
and hoping you can help.  I have 34 region servers and many keep crashing but 
without much in the way of error messages.

Here are the things that stand out:

ClientAsync.Process – waiting for some tasks to finish
smallCompaction RPCRetryingCaller: Call exception …. ‘msg row 
‘SOME_PHOENIX_TABLE_NAME_IDX:<some long key>’ on table: SYSTEM.STATS attempt 
225/350

Similar ones for largeCompaction as well.

The other issue is the Pig loader hangs with these messages in the mapper logs:
[phoenix-1-thread-0] RPCRetryingCaller: Call exception msg row ‘’ on table 
‘SYSTEM.CATALOG’

Eventually the mappers time out – no errors

Region servers come up and down.  There are lots of connection refused errors 
as well.

Restarting hbase does not help.  The region servers will come up then go down 
again.

Zookeeper is up.  I’ve restarted just in case but it did not help

I cannot connect to Phoenix from the command line

Any help is appreciated.

Thanks!
Ralph

