Hi All,

We are running HBase (*Version 1.0.0-cdh5.4.2, rUnknown, Tue May 19 17:04:41 PDT 2015*) and are hitting the problem described in HBASE-12074 (https://issues.apache.org/jira/browse/HBASE-12074), where the regionserver crashes due to a concurrent roll of the WAL file.

Below are the failure logs from one of the instances in our environment:

2015-10-25 22:09:41,885 INFO [regionserver/localhost/127.0.0.1:60020.logRoller] wal.FSHLog: Rolled WAL /var/lib/hbase/data/WALs/localhost,60020,1445796437179/localhost%2C60020%2C1445796437179.default.1445810949648 with entries=11826, filesize=30.40 MB; new WAL /var/lib/hbase/data/WALs/localhost,60020,1445796437179/localhost%2C60020%2C1445796437179.default.1445810981882
2015-10-25 22:10:09,177 INFO [regionserver/localhost/127.0.0.1:60020.logRoller] wal.FSHLog: Rolled WAL /var/lib/hbase/data/WALs/localhost,60020,1445796437179/localhost%2C60020%2C1445796437179.default.1445810981882 with entries=7796, filesize=30.41 MB; new WAL /var/lib/hbase/data/WALs/localhost,60020,1445796437179/localhost%2C60020%2C1445796437179.default.1445811009174
2015-10-25 22:10:09,189 ERROR [sync.2] wal.FSHLog: Error syncing, request close of wal
java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:176)
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog$SyncRunner.run(FSHLog.java:1334)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:173)
    ... 2 more
2015-10-25 22:10:09,226 FATAL [regionserver/localhost/127.0.0.1:60020.logRoller] regionserver.HRegionServer: ABORTING region server localhost,60020,1445796437179: IOE in log roller
java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:176)
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog$SyncRunner.run(FSHLog.java:1334)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:173)
    ... 2 more
2015-10-25 22:10:09,226 FATAL [regionserver/localhost/127.0.0.1:60020.logRoller] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint]

Does anyone know of a workaround for this problem, or whether a patch is available? If there is no workaround or patch, I can help create a patch, but I would need some general guidance on what could be going on here.

--cheers,
gaurav