[jira] [Commented] (PHOENIX-3977) Region is not closed when moving region or balancing while writing to table with local index
[ https://issues.apache.org/jira/browse/PHOENIX-3977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16400044#comment-16400044 ]

sanket patel commented on PHOENIX-3977:
---

I hit a similar issue recently with 4.10. I found regions in the PENDING_CLOSE state, and the same regions showed up under the `/hbase-unsecure/regions-in-transition` znode. Also, in the HMaster UI, the Procedures tab showed pending DisableTable tasks (probably because the region was stuck).

As a workaround, I removed the znode and restarted HMaster. After that, the regions under those tables appeared as offline regions. Later I dropped the table.

Currently I am not able to reproduce the issue. I will post more symptoms and observations if I happen to encounter it again.

> Region is not closed when moving region or balancing while writing to table with local index
> ---
>
>                 Key: PHOENIX-3977
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3977
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.10.0
>            Reporter: JeongMin Ju
>            Priority: Major
>         Attachments: screenshot-1.png
>
>
> A region is not closed when it is moved, or when the balancer runs, while writes are in progress on a table with a local index.
> If a regionserver is forcibly killed and restarted, and balancing is then performed while a YCSB write workload is running, the region is not moved properly and becomes stuck.
> The same happens when a specific region is moved manually.
> {panel:title=RegionServer1}
> 2017-06-26 17:18:49,096 INFO org.apache.phoenix.hbase.index.util.IndexManagementUtil: Rethrowing org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2008 (INT10): ERROR 2008 (INT10): Unable to find cached index metadata. key=-4998016164816556219 region=PHOENIX_TABLE,&,1498464285648.df3f3da32306361ce5e38e5193f5e5d6.host=juke-cdh-36f531b4.s2.krane.9rum.cc,60020,1498464965204 Index update failed
> 2017-06-26 17:18:49,531 INFO org.apache.phoenix.hbase.index.util.IndexManagementUtil: Rethrowing org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2008 (INT10): ERROR 2008 (INT10): Unable to find cached index metadata. key=6036765365681157624 region=PHOENIX_TABLE,&,1498464285648.df3f3da32306361ce5e38e5193f5e5d6.host=juke-cdh-36f531b4.s2.krane.9rum.cc,60020,1498464965204 Index update failed
> 2017-06-26 17:18:49,536 INFO org.apache.phoenix.hbase.index.util.IndexManagementUtil: Rethrowing org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2008 (INT10): ERROR 2008 (INT10): Unable to find cached index metadata. key=-428088766346522675 region=PHOENIX_TABLE,&,1498464285648.df3f3da32306361ce5e38e5193f5e5d6.host=juke-cdh-36f531b4.s2.krane.9rum.cc,60020,1498464965204 Index update failed
> {panel}
> On the other regionserver:
> {panel:title=RegionServer2}
> 2017-06-26 17:19:49,755 INFO org.apache.hadoop.hbase.client.AsyncProcess: #1476, waiting for 62 actions to finish on table: PHOENIX_TABLE
> 2017-06-26 17:19:49,755 INFO org.apache.hadoop.hbase.client.AsyncProcess: #1476, waiting for 27 actions to finish on table: PHOENIX_TABLE
> 2017-06-26 17:19:49,755 INFO org.apache.hadoop.hbase.client.AsyncProcess: #1476, waiting for 52 actions to finish on table: PHOENIX_TABLE
> {panel}
> Adjusting phoenix.coprocessor.maxServerCacheTimeToLiveMs and phoenix.coprocessor.maxMetaDataCacheSize had no effect.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
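
When triaging logs like the RegionServer1 excerpt quoted in the issue, it can help to count how many "Unable to find cached index metadata" failures each region reports. The sketch below is not part of Phoenix or HBase. The function name `count_index_failures` and the regular expression are assumptions derived only from the three log lines in the report, so other log layouts may not match.

```python
import re
from collections import Counter

# Assumed layout, taken from the quoted log lines only:
#   ... Unable to find cached index metadata. key=<n> region=<table>,<start>,<ts>.<encoded>.host=...
ERROR_2008 = re.compile(
    r"Unable to find cached index metadata\. "
    r"key=(?P<key>-?\d+) "
    r"region=(?P<table>[^,\s]+),\S*?\.(?P<encoded>[0-9a-f]{32})\."
)


def count_index_failures(log_text):
    """Count ERROR 2008 index failures per encoded region name."""
    # Collapse all whitespace so entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    return Counter(m.group("encoded") for m in ERROR_2008.finditer(flat))


# The three RegionServer1 entries from the report, with wrapping left in on purpose.
SAMPLE = """
2017-06-26 17:18:49,096 INFO org.apache.phoenix.hbase.index.util.IndexManagementUtil: Rethrowing
org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2008 (INT10): ERROR 2008
(INT10): Unable to find cached index metadata. key=-4998016164816556219
region=PHOENIX_TABLE,&,1498464285648.df3f3da32306361ce5e38e5193f5e5d6.host=juke-cdh-36f531b4.s2.krane.9rum.cc,60020,1498464965204
Index update failed
2017-06-26 17:18:49,531 INFO org.apache.phoenix.hbase.index.util.IndexManagementUtil: Rethrowing
org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2008 (INT10): ERROR 2008
(INT10): Unable to find cached index metadata. key=6036765365681157624
region=PHOENIX_TABLE,&,1498464285648.df3f3da32306361ce5e38e5193f5e5d6.host=juke-cdh-36f531b4.s2.krane.9rum.cc,60020,1498464965204
Index update failed
2017-06-26 17:18:49,536 INFO org.apache.phoenix.hbase.index.util.IndexManagementUtil: Rethrowing
org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2008 (INT10): ERROR 2008
(INT10): Unable to find cached index metadata. key=-428088766346522675
region=PHOENIX_TABLE,&,1498464285648.df3f3da32306361ce5e38e5193f5e5d6.host=juke-cdh-36f531b4.s2.krane.9rum.cc,60020,1498464965204
Index update failed
"""

if __name__ == "__main__":
    for encoded, hits in count_index_failures(SAMPLE).most_common():
        print(encoded, hits)
```

The script does not strip `> ` email quote markers, so it is meant for the raw regionserver log rather than the quoted text. A region that dominates the counts is a reasonable first candidate to check against the regions-in-transition list.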
[jira] [Commented] (PHOENIX-3977) Region is not closed when moving region or balancing while writing to table with local index
[ https://issues.apache.org/jira/browse/PHOENIX-3977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16065726#comment-16065726 ]

JeongMin Ju commented on PHOENIX-3977:
---

Hi. This problem is related only to the local index. The global index problem can be solved by increasing hbase.regionserver.executor.openregion.threads, but that does not help here. You can reproduce the problem by writing to a table with a local index and moving any region. With the global index problem the region is stuck in the pending open state, whereas here the region is stuck in the pending close state.

!screenshot-1.png!
[jira] [Commented] (PHOENIX-3977) Region is not closed when moving region or balancing while writing to table with local index
[ https://issues.apache.org/jira/browse/PHOENIX-3977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16065594#comment-16065594 ]

Sergey Soldatov commented on PHOENIX-3977:
---

[~mini666] Could you give some more details about the configuration you are using? Does the table have only a local index, or a global index as well? Such behavior usually happens when a global index is involved, and that problem is discussed in PHOENIX-3970 and related issues. It would be helpful if you could provide several jstacks for the stuck region server.