Re: sqlline thin client upsert successfully, but got timeout problem when try to connect to the phoenix again

ivanybma Mon, 09 Apr 2018 20:58:04 -0700

Hi, sorry for the late reply.
That was just a part of my hbase-site.xml.

Below is the full content:
************************************hbase/conf/hbase-site.xml**************************
<configuration>
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://broker.xxx-xxx.local:9000/hbase</value>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>broker.xxx-xxx.local</value>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
<property>
  <name>zookeeper.znode.parent</name>
  <value>/hbase</value>
</property>
<property>
  <name>hbase.zookeeper.property.dataDir</name>
  <value>broker.xxx-xxx.local</value>
</property>
<property>
  <name>hbase.regionserver.wal.codec</name>
  <value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
</property>
<property>
  <name>phoenix.transactions.enabled</name>
  <value>true</value>
</property>
<property>
  <name>data.tx.snapshot.dir</name>
  <value>/tmp/tephra/snapshots</value>
</property>
<property>
  <name>data.tx.timeout</name>
  <value>120</value>
</property>
<property>
  <name>phoenix.query.timeoutMs</name>
  <value>2800000</value>
</property>
<property>
  <name>hbase.regionserver.lease.period</name>
  <value>2200000</value>
</property>
<property>
  <name>hbase.rpc.timeout</name>
  <value>2200000</value>
</property>
<property>
  <name>hbase.client.scanner.caching</name>
  <value>2000</value>
</property>
<property>
  <name>hbase.client.scanner.timeout.period</name>
  <value>2200000</value>
</property>
</configuration>


*************************client: phoenix/bin/hbase-site.xml*************************
<configuration>
<property>
  <name>hbase.regionserver.wal.codec</name>
  <value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
</property>
<property>
  <name>phoenix.transactions.enabled</name>
  <value>true</value>
</property>
<property>
  <name>data.tx.snapshot.dir</name>
  <value>/tmp/tephra/snapshots</value>
</property>
<property>
  <name>data.tx.timeout</name>
  <value>120</value>
</property>
<property>
  <name>phoenix.query.timeoutMs</name>
  <value>2800000</value>
</property>
<property>
  <name>hbase.regionserver.lease.period</name>
  <value>2200000</value>
</property>
<property>
  <name>hbase.rpc.timeout</name>
  <value>2200000</value>
</property>
<property>
  <name>hbase.client.scanner.caching</name>
  <value>2000</value>
</property>
<property>
  <name>hbase.client.scanner.timeout.period</name>
  <value>2200000</value>
</property>
</configuration>
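
For comparison with the secondary indexing page mentioned in the reply quoted below: neither file above sets the index RPC properties or a handler count. A rough sketch of what those server-side entries might look like in hbase/conf/hbase-site.xml follows. The two class names are the ones I understand that page to list for mutable indexes, so please check them against the page for your Phoenix version. The handler count of 60 is only an illustrative value, not a recommendation.

```xml
<!-- Sketch only: verify names and values against the Phoenix and HBase docs for your versions -->
<property>
  <name>hbase.region.server.rpc.scheduler.factory.class</name>
  <value>org.apache.hadoop.hbase.ipc.PhoenixRpcSchedulerFactory</value>
</property>
<property>
  <name>hbase.rpc.controllerfactory.class</name>
  <value>org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory</value>
</property>
<property>
  <!-- Illustrative value (assumption); tune to the cluster's cores and workload -->
  <name>hbase.regionserver.handler.count</name>
  <value>60</value>
</property>
```

These would go inside the existing <configuration> element on the region servers, followed by a restart of HBase.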


On 2018/04/09 18:01:14, Josh Elser <[email protected]> wrote: 
> The hbase-site.xml elements you shared earlier, were those your entire 
> hbase-site contents or just part of it?
> 
> Make sure you have the required properties set as described on 
> https://phoenix.apache.org/secondary_indexing.html for your indexes. If 
> you're still seeing problems, you may need to increase the number of 
> handlers you configured HBase to use.
> 
> While in the stuck state, you may benefit from getting a thread-dump or 
> two from the client and your regionserver(s). This would help in 
> figuring out exactly where things are stuck (like the DEBUG logs would do).
> 
> On 4/9/18 1:30 PM, [email protected] wrote:
> > Thanks for your suggestion. I found something interesting, though I am not 
> > sure whether it is the cause: the indexes I created on my tables.
> > I created a lot of indexes. After I removed all of them, things seemed 
> > to go better (no more hanging like that). So I suspect there is 
> > some incompatibility or other issue in the way I set up my cluster.
> > 
> > Some special options I used when creating the table:
> > )c.DATA_BLOCK_ENCODING='FAST_DIFF', SALT_BUCKETS=3, 
> > COMPRESSION='GZ',TRANSACTIONAL=true ;
> > and some indexes I created like this:
> > CREATE INDEX testing_IDX_2  ON xxx.xxx (field1, field2) INCLUDE (field3, 
> > field4)
> > 
> > 
> > 
> > On 2018/04/09 17:04:03, Josh Elser <[email protected]> wrote:
> >> Have you looked at DEBUG logging client and server(HBase) side?
> >>
> >> The "Call exception" log messages imply that the client is repeatedly
> >> trying to issue an RPC to a RegionServer and failing. This should be
> >> where you focus your attention. It may be something trivial to fix
> >> related to configuration/security setup.
> >>
> >> On 4/8/18 2:04 AM, [email protected] wrote:
> >>> Hi, I ran into the tricky problem below.
> >>> Situation:
> >>> I successfully did an upsert into multiple tables with transactions 
> >>> enabled (and there are many indexes created on these tables).
> >>> Problem:
> >>> After the first upsert completed successfully, I tried to run the same 
> >>> upsert a 2nd, 3rd, ... time. Sometimes the 2nd works, then the 3rd 
> >>> upsert gets a timeout exception. At that point the whole of Phoenix seems 
> >>> to hang and keeps retrying. I tried stopping the whole HBase cluster, 
> >>> including the Phoenix Query Server and Tephra, and restarting it, but when 
> >>> I then tried to connect with sqlline.py, it hung again.
> >>>
> >>> hbase-site.xml setting:
> >>>     <property>
> >>>        <name>hbase.regionserver.wal.codec</name>
> >>>        <value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
> >>> </property>
> >>>     <property>
> >>>     <name>phoenix.transactions.enabled</name>
> >>>     <value>true</value>
> >>> </property>
> >>> <property>
> >>>     <name>data.tx.snapshot.dir</name>
> >>>     <value>/tmp/tephra/snapshots</value>
> >>> </property>
> >>>     <property>
> >>>     <name>data.tx.timeout</name>
> >>>     <value>120</value>
> >>> </property>
> >>> <property>
> >>>     <name>phoenix.query.timeoutMs</name>
> >>>     <value>1800000</value>
> >>> </property>
> >>> <property>
> >>>     <name>hbase.regionserver.lease.period</name>
> >>>     <value>1200000</value>
> >>> </property>
> >>> <property>
> >>>     <name>hbase.rpc.timeout</name>
> >>>     <value>1200000</value>
> >>> </property>
> >>> <property>
> >>>     <name>hbase.client.scanner.caching</name>
> >>>     <value>1000</value>
> >>> </property>
> >>> <property>
> >>>     <name>hbase.client.scanner.timeout.period</name>
> >>>     <value>1200000</value>
> >>> </property>
> >>>
> >>>
> >>>
> >>> Below is some queryserver log:
> >>> 18/04/08 05:47:12 INFO zookeeper.ZooKeeper: Initiating client connection, 
> >>> connectString=xxxx.xxxx.local:2181 sessionTimeout=90000 
> >>> watcher=org.apache.tephra.zookeeper.TephraZKClientService$5@6700104f
> >>> 18/04/08 05:47:12 INFO zookeeper.ClientCnxn: Opening socket connection to 
> >>> server xxxx.xxxx.local/127.0.0.1:2181. Will not attempt to authenticate 
> >>> using SASL (unknown error)
> >>> 18/04/08 05:47:12 INFO zookeeper.ClientCnxn: Socket connection 
> >>> established to xxxx.xxxx.local/127.0.0.1:2181, initiating session
> >>> 18/04/08 05:47:12 INFO zookeeper.ClientCnxn: Session establishment 
> >>> complete on server xxxx.xxxx.local/127.0.0.1:2181, sessionid = 
> >>> 0x162a3c72c9c0012, negotiated timeout = 90000
> >>> 18/04/08 05:57:39 INFO client.RpcRetryingCaller: Call exception, 
> >>> tries=10, retries=35, started=38310 ms ago, cancelled=false, msg=row 
> >>> 'SYSTEM.CATALOG,xxxLOAD_*N**_DIM,99999999999999' on table 'hbase:meta' at 
> >>> region=hbase:meta,,1.1588230740, 
> >>> hostname=xxxx.xxxx.local,16201,1523166165622, seqNum=0
> >>> 18/04/08 05:57:49 INFO client.RpcRetryingCaller: Call exception, 
> >>> tries=11, retries=35, started=48335 ms ago, cancelled=false, msg=row 
> >>> 'SYSTEM.CATALOG,xxxxxLOAD_*N**_DIM,99999999999999' on table 'hbase:meta' 
> >>> at region=hbase:meta,,1.1588230740, 
> >>> hostname=xxx.xxx.local,16201,1523166165622, seqNum=0
> >>>
> >>
>
