[ https://issues.apache.org/jira/browse/PHOENIX-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414384#comment-15414384 ]
Devaraj Das commented on PHOENIX-3159:
--------------------------------------
Thanks [[email protected]]. Do check whether this race is possible:
1. At time t1, the LRU cache evictor thread calls getReferenceCount, which
returns 0. The context switches to thread-1.
2. At time t1+1, thread-1 does getTable on the same table, but the context
switches away from this thread before it can call incrementReferenceCount.
3. At time t1+2, the LRU cache evictor resumes execution and closes the
CachedHTableWrapper instance. Since getReferenceCount still returns 0, this
ultimately invokes close() on the real table instance.
We then end up working with a "closed" HTable instance (wrapped in
CachedHTableWrapper), since that is what getTable is going to return. If the
above can happen, it should be prevented; one way to close the window is
sketched below.
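A minimal sketch of one way to close that window (the identifiers here are
illustrative, not Phoenix's actual classes): the reference-count check in the
evictor and the increment in getTable must happen under the same lock, so the
evictor can never observe a zero count inside a writer's check-then-increment
window.
{code}
import java.io.Closeable;
import java.io.IOException;

public class RefCountedTable {
    private final Closeable delegate; // stands in for the real HTable
    private int refCount;
    private boolean closed;

    public RefCountedTable(Closeable delegate) {
        this.delegate = delegate;
    }

    // getTable() path: takes a reference atomically with the closed check,
    // so a successful acquire guarantees the table has not been closed.
    public synchronized boolean tryAcquire() {
        if (closed) {
            return false;
        }
        refCount++;
        return true;
    }

    // Writer is done with the table.
    public synchronized void release() {
        refCount--;
    }

    // Evictor path: closes only when no reference is held, and marks the
    // wrapper closed in the same critical section, so a concurrent
    // tryAcquire() either sees closed == true or keeps the table alive.
    public synchronized boolean closeIfUnused() throws IOException {
        if (refCount > 0) {
            return false;
        }
        closed = true;
        delegate.close();
        return true;
    }
}
{code}
On a failed tryAcquire, getTable would have to re-open the table instead of
handing out the evicted wrapper.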
Also, remove the reference to workingTables from the patch.
> CachingHTableFactory may close HTable during eviction even if it is getting
> used for writing by another thread.
> ---------------------------------------------------------------------------------------------------------------
>
> Key: PHOENIX-3159
> URL: https://issues.apache.org/jira/browse/PHOENIX-3159
> Project: Phoenix
> Issue Type: Bug
> Reporter: Ankit Singhal
> Assignee: Ankit Singhal
> Fix For: 4.8.1
>
> Attachments: PHOENIX-3159.patch, PHOENIX-3159_v1.patch
>
>
> CachingHTableFactory may close an HTable during eviction even while it is
> being used for writing by another thread, which causes the writing thread
> to fail and the index to be disabled.
> LRU eviction closes the HTable (and the underlying connection) when the
> cache is full and a new HTable is requested:
> {code}
> 2016-08-04 13:45:21,109 DEBUG [nat-s11-4-ioss-phoenix-1-5.openstacklocal,16020,1470297472814-index-writer--pool11-t35] client.ConnectionManager$HConnectionImplementation: Closing HConnection (debugging purposes only)
> java.lang.Exception
>     at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.internalClose(ConnectionManager.java:2423)
>     at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.close(ConnectionManager.java:2447)
>     at org.apache.hadoop.hbase.client.CoprocessorHConnection.close(CoprocessorHConnection.java:41)
>     at org.apache.hadoop.hbase.client.HTableWrapper.internalClose(HTableWrapper.java:91)
>     at org.apache.hadoop.hbase.client.HTableWrapper.close(HTableWrapper.java:107)
>     at org.apache.phoenix.hbase.index.table.CachingHTableFactory$HTableInterfaceLRUMap.removeLRU(CachingHTableFactory.java:61)
>     at org.apache.commons.collections.map.LRUMap.addMapping(LRUMap.java:256)
>     at org.apache.commons.collections.map.AbstractHashedMap.put(AbstractHashedMap.java:284)
>     at org.apache.phoenix.hbase.index.table.CachingHTableFactory.getTable(CachingHTableFactory.java:100)
>     at org.apache.phoenix.hbase.index.write.ParallelWriterIndexCommitter$1.call(ParallelWriterIndexCommitter.java:160)
>     at org.apache.phoenix.hbase.index.write.ParallelWriterIndexCommitter$1.call(ParallelWriterIndexCommitter.java:136)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}
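> The eviction path in the trace above is the crux: removeLRU is called
> synchronously from inside getTable's put() once the cache is full, and it
> closes the evicted table unconditionally. A minimal, self-contained sketch
> of that pattern (illustrative, not Phoenix's exact code):
> {code}
> import java.io.Closeable;
> import java.io.IOException;
> import org.apache.commons.collections.map.AbstractLinkedMap;
> import org.apache.commons.collections.map.LRUMap;
>
> // When the map is full, put() evicts the least-recently-used entry by
> // calling removeLRU(), which here closes the evicted value without
> // checking whether another thread is still writing through it.
> public class ClosingLRUMap extends LRUMap {
>     public ClosingLRUMap(int maxSize) {
>         super(maxSize);
>     }
>
>     @Override
>     protected boolean removeLRU(AbstractLinkedMap.LinkEntry entry) {
>         try {
>             // A writer that fetched this table earlier now holds a
>             // closed instance, and its next write fails.
>             ((Closeable) entry.getValue()).close();
>         } catch (IOException e) {
>             // Swallowed for brevity; the real factory logs this.
>         }
>         return true; // let the eviction proceed
>     }
> }
> {code}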
> But the IndexWriter was still using this old connection to write to the
> table, and the connection had already been closed during LRU eviction:
> {code}
> 2016-08-04 13:44:59,553 ERROR [htable-pool659-t1] client.AsyncProcess: Cannot get replica 0 location for {"totalColumns":1,"row":"\\xC7\\x03\\x04\\x06X\\x1C)\\x00\\x80\\x07\\xB0X","families":{"0":[{"qualifier":"_0","vlen":2,"tag":[],"timestamp":1470318296425}]}}
> java.io.IOException: hconnection-0x21f468be closed
>     at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1153)
>     at org.apache.hadoop.hbase.client.CoprocessorHConnection.locateRegion(CoprocessorHConnection.java:41)
>     at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.findAllLocationsOrFail(AsyncProcess.java:949)
>     at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.groupAndSendMultiAction(AsyncProcess.java:866)
>     at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.resubmit(AsyncProcess.java:1195)
>     at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.receiveGlobalFailure(AsyncProcess.java:1162)
>     at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.access$1100(AsyncProcess.java:584)
>     at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncProcess.java:727)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}
> Although the workaround is to increase the cache size
> (index.tablefactory.cache.size), we should still handle the closing of
> in-use HTables to avoid index write failures (which in turn disable the
> index).
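> For reference, a sketch of the workaround as it would appear in the region
> servers' hbase-site.xml (the value shown is illustrative, not a recommended
> default):
> {code}
> <property>
>   <name>index.tablefactory.cache.size</name>
>   <value>100</value>
> </property>
> {code}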