[ https://issues.apache.org/jira/browse/HBASE-20322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thiruvel Thirumoolan updated HBASE-20322:
-----------------------------------------
    Attachment: HBASE-20322.branch-1.3.002.patch

> CME in StoreScanner causes region server crash
> ----------------------------------------------
>
>                 Key: HBASE-20322
>                 URL: https://issues.apache.org/jira/browse/HBASE-20322
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.3.2
>            Reporter: Thiruvel Thirumoolan
>            Assignee: Thiruvel Thirumoolan
>            Priority: Major
>             Fix For: 1.5.0, 1.3.3, 1.4.4
>
>         Attachments: HBASE-20322.branch-1.3.001.patch,
>                      HBASE-20322.branch-1.3.002.patch,
>                      HBASE-20322.branch-1.4.001.patch
>
>
> An RS crashed with a ConcurrentModificationException (CME) on our 1.3
> cluster; the stack trace is below. [~toffer] and I checked, and there is a
> race condition between flush and scanner close. While
> StoreScanner.updateReaders() is updating the scanners after a file is newly
> flushed (in the trace below, the flush comes from a region close during a
> split), the client's scanner could be closing at the same time, which
> causes the CME. It's rare, but since it crashes the region server, it needs
> to be fixed.
>
> FATAL regionserver.HRegionServer [regionserver/<rs>] : ABORTING region server <rs>: Replay of WAL required. Forcing server shutdown
> org.apache.hadoop.hbase.DroppedSnapshotException: region: <regionname>
>     at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2579)
>     at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2255)
>     at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2217)
>     at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2207)
>     at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1501)
>     at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1420)
>     at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsBeforePONR(SplitTransactionImpl.java:398)
>     at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.createDaughters(SplitTransactionImpl.java:278)
>     at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(SplitTransactionImpl.java:566)
>     at org.apache.hadoop.hbase.regionserver.SplitRequest.doSplitting(SplitRequest.java:82)
>     at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:154)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.ConcurrentModificationException
>     at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901)
>     at java.util.ArrayList$Itr.next(ArrayList.java:851)
>     at org.apache.hadoop.hbase.regionserver.StoreScanner.clearAndClose(StoreScanner.java:797)
>     at org.apache.hadoop.hbase.regionserver.StoreScanner.updateReaders(StoreScanner.java:825)
>     at org.apache.hadoop.hbase.regionserver.HStore.notifyChangedReadersObservers(HStore.java:1155)
>
> PS: Ignore the line numbers in the above stack trace; the method calls
> should be enough to understand what's happening.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
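
The race described in the report can be reproduced in miniature with a plain ArrayList. The sketch below is not HBase code and is not the attached patch: the class `ScannerRaceSketch`, its methods, and the string "scanners" are hypothetical stand-ins. It replays the bad interleaving on a single thread by running the close path in the middle of the update path's iteration, which is what `ArrayList$Itr.checkForComodification` detects in the trace. The guarded variant shows one possible mitigation (a lock plus a snapshot copy), given here only as an assumption about how such a race could be avoided.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for a store scanner; not HBase code. */
final class ScannerRaceSketch {
    // Stand-in for the list of per-file scanners that both paths touch.
    private final List<String> scanners = new ArrayList<>();
    private final ReentrantLock lock = new ReentrantLock();
    private boolean closed = false;

    ScannerRaceSketch(List<String> initial) {
        scanners.addAll(initial);
    }

    /** Unguarded close path: empties the shared list with no coordination. */
    void closeUnguarded() {
        scanners.clear();
    }

    /**
     * Unguarded update path. The callback runs mid-iteration to replay the
     * problematic interleaving deterministically on one thread.
     */
    int updateUnguarded(Runnable interleaved) {
        int visited = 0;
        for (String ignored : scanners) {
            visited++;
            interleaved.run(); // the "other thread" modifies the list here
        }
        return visited;
    }

    /** Guarded close path: mutates the list only while holding the lock. */
    void closeGuarded() {
        lock.lock();
        try {
            closed = true;
            scanners.clear();
        } finally {
            lock.unlock();
        }
    }

    /** Guarded update path: copies the list under the lock, iterates the copy. */
    int updateGuarded(Runnable interleaved) {
        List<String> snapshot;
        lock.lock();
        try {
            if (closed) {
                return 0; // nothing left to update once the scanner is closed
            }
            snapshot = new ArrayList<>(scanners);
        } finally {
            lock.unlock();
        }
        int visited = 0;
        for (String ignored : snapshot) {
            visited++;
            interleaved.run(); // a concurrent close no longer affects this loop
        }
        return visited;
    }

    /** Returns true if the unguarded interleaving raises a CME. */
    static boolean unguardedThrows() {
        ScannerRaceSketch s = new ScannerRaceSketch(Arrays.asList("a", "b", "c"));
        try {
            s.updateUnguarded(s::closeUnguarded);
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    /** Returns how many entries the guarded update visits despite a close. */
    static int guardedVisits() {
        ScannerRaceSketch s = new ScannerRaceSketch(Arrays.asList("a", "b", "c"));
        return s.updateGuarded(s::closeGuarded);
    }

    public static void main(String[] args) {
        System.out.println("unguarded throws CME: " + unguardedThrows());
        System.out.println("guarded entries visited: " + guardedVisits());
    }
}
```

In a real multi-threaded setting the failure depends on timing, which matches the report that the crash is rare. The snapshot approach trades a small copy for an iteration that cannot be invalidated; the attached patches should be consulted for the fix actually applied to StoreScanner.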