[jira] [Updated] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-22 Thread Enis Soztutar (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-5623:
-

Attachment: HBASE-5623_v0.patch

Attaching a simple patch, which seems to solve the problem. This does wrap the 
call  logSyncerThread.hlogFlush(this.writer) to obtain the updateLock. 
Otherwise, I guess we can also lose wal edits, since the writer cannot append 
the pendingWrites. 

> Race condition when rolling the HLog and hlogFlush
> --
>
> Key: HBASE-5623
> URL: https://issues.apache.org/jira/browse/HBASE-5623
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Attachments: HBASE-5623_v0.patch
>
>
> When doing a ycsb test with a large number of handlers 
> (regionserver.handler.count=60), I get the following exceptions:
> {code}
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
>   at $Proxy1.multi(Unknown Source)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
>   at 
> org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
> {code}
> and 
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> {code}
> It seems the root cause of the issue is that we open a new log writer and 
> close the old one at HLog#rollWriter() holding the updateLock, but the other 
> threads doing syncer() calls
> {code} 
> logSyncerThread.hlogFlush(this.writer);
> {code}
> without holding the updateLock. LogSyncer only synchronizes against 
> concurrent appends and flush(), but not on the passed writer, which can be 
> closed already by rollWriter(). In this case, since 
> SequenceFile#Writer.close() sets it's out field as null, we get the NPE. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIR

[jira] [Updated] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-22 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5623:
--

Hadoop Flags: Reviewed
  Status: Patch Available  (was: Open)

Patch makes sense.

> Race condition when rolling the HLog and hlogFlush
> --
>
> Key: HBASE-5623
> URL: https://issues.apache.org/jira/browse/HBASE-5623
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Attachments: HBASE-5623_v0.patch
>
>
> When doing a ycsb test with a large number of handlers 
> (regionserver.handler.count=60), I get the following exceptions:
> {code}
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
>   at $Proxy1.multi(Unknown Source)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
>   at 
> org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
> {code}
> and 
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> {code}
> It seems the root cause of the issue is that we open a new log writer and 
> close the old one at HLog#rollWriter() holding the updateLock, but the other 
> threads doing syncer() calls
> {code} 
> logSyncerThread.hlogFlush(this.writer);
> {code}
> without holding the updateLock. LogSyncer only synchronizes against 
> concurrent appends and flush(), but not on the passed writer, which can be 
> closed already by rollWriter(). In this case, since 
> SequenceFile#Writer.close() sets it's out field as null, we get the NPE. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-22 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5623:
-

Priority: Critical  (was: Major)

> Race condition when rolling the HLog and hlogFlush
> --
>
> Key: HBASE-5623
> URL: https://issues.apache.org/jira/browse/HBASE-5623
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
>Priority: Critical
> Attachments: HBASE-5623_v0.patch
>
>
> When doing a ycsb test with a large number of handlers 
> (regionserver.handler.count=60), I get the following exceptions:
> {code}
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
>   at $Proxy1.multi(Unknown Source)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
>   at 
> org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
> {code}
> and 
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> {code}
> It seems the root cause of the issue is that we open a new log writer and 
> close the old one at HLog#rollWriter() holding the updateLock, but the other 
> threads doing syncer() calls
> {code} 
> logSyncerThread.hlogFlush(this.writer);
> {code}
> without holding the updateLock. LogSyncer only synchronizes against 
> concurrent appends and flush(), but not on the passed writer, which can be 
> closed already by rollWriter(). In this case, since 
> SequenceFile#Writer.close() sets it's out field as null, we get the NPE. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-22 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5623:
-

Attachment: 5623v2.txt

This patch is Enis's patch plus a test.  It doesn't seem to fix the issue (it 
makes it a little better as far as I can tell but the test NPEs easy).

This is a good find Enis.  Looks like its easy to NPE in here.

> Race condition when rolling the HLog and hlogFlush
> --
>
> Key: HBASE-5623
> URL: https://issues.apache.org/jira/browse/HBASE-5623
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
>Priority: Critical
> Attachments: 5623v2.txt, HBASE-5623_v0.patch
>
>
> When doing a ycsb test with a large number of handlers 
> (regionserver.handler.count=60), I get the following exceptions:
> {code}
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
>   at $Proxy1.multi(Unknown Source)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
>   at 
> org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
> {code}
> and 
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> {code}
> It seems the root cause of the issue is that we open a new log writer and 
> close the old one at HLog#rollWriter() holding the updateLock, but the other 
> threads doing syncer() calls
> {code} 
> logSyncerThread.hlogFlush(this.writer);
> {code}
> without holding the updateLock. LogSyncer only synchronizes against 
> concurrent appends and flush(), but not on the passed writer, which can be 
> closed already by rollWriter(). In this case, since 
> SequenceFile#Writer.close() sets it's out field as null, we get the NPE. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 

[jira] [Updated] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-23 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5623:
-

Fix Version/s: 0.94.0

I think this needs to be in 0.94.0.

> Race condition when rolling the HLog and hlogFlush
> --
>
> Key: HBASE-5623
> URL: https://issues.apache.org/jira/browse/HBASE-5623
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 5623v2.txt, HBASE-5623_v0.patch
>
>
> When doing a ycsb test with a large number of handlers 
> (regionserver.handler.count=60), I get the following exceptions:
> {code}
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
>   at $Proxy1.multi(Unknown Source)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
>   at 
> org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
> {code}
> and 
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> {code}
> It seems the root cause of the issue is that we open a new log writer and 
> close the old one at HLog#rollWriter() holding the updateLock, but the other 
> threads doing syncer() calls
> {code} 
> logSyncerThread.hlogFlush(this.writer);
> {code}
> without holding the updateLock. LogSyncer only synchronizes against 
> concurrent appends and flush(), but not on the passed writer, which can be 
> closed already by rollWriter(). In this case, since 
> SequenceFile#Writer.close() sets it's out field as null, we get the NPE. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.c

[jira] [Updated] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-23 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5623:
-

Attachment: 5623.txt

How about this?
Could possibly generally synchronize in hLogFlush, since we're just appending 
and not sync'ing.

> Race condition when rolling the HLog and hlogFlush
> --
>
> Key: HBASE-5623
> URL: https://issues.apache.org/jira/browse/HBASE-5623
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 5623.txt, 5623v2.txt, HBASE-5623_v0.patch
>
>
> When doing a ycsb test with a large number of handlers 
> (regionserver.handler.count=60), I get the following exceptions:
> {code}
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
>   at $Proxy1.multi(Unknown Source)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
>   at 
> org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
> {code}
> and 
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> {code}
> It seems the root cause of the issue is that we open a new log writer and 
> close the old one at HLog#rollWriter() holding the updateLock, but the other 
> threads doing syncer() calls
> {code} 
> logSyncerThread.hlogFlush(this.writer);
> {code}
> without holding the updateLock. LogSyncer only synchronizes against 
> concurrent appends and flush(), but not on the passed writer, which can be 
> closed already by rollWriter(). In this case, since 
> SequenceFile#Writer.close() sets it's out field as null, we get the NPE. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/Contac

[jira] [Updated] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-23 Thread Enis Soztutar (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-5623:
-

Attachment: HBASE-5623_v4.patch

While running the unit test, I also noticed that in some cases, the syncer() 
still holds the old writer reference, and holding the updateLock does not 
guarantee that the other thread's writer pointer is updated to the nextWriter. 
I had changed Writer to be volatile but that did not help either. So, I use 
AtomicReference for managing the Writer pointer. 

Lars, I have merged my patch with yours for the second-tries, and modified 
Stack's unit test so that it fails nearly everytime on trunk, and once in 10 
runs with patch v2. 

> Race condition when rolling the HLog and hlogFlush
> --
>
> Key: HBASE-5623
> URL: https://issues.apache.org/jira/browse/HBASE-5623
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 5623.txt, 5623v2.txt, HBASE-5623_v0.patch, 
> HBASE-5623_v4.patch
>
>
> When doing a ycsb test with a large number of handlers 
> (regionserver.handler.count=60), I get the following exceptions:
> {code}
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
>   at $Proxy1.multi(Unknown Source)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
>   at 
> org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
> {code}
> and 
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> {code}
> It seems the root cause of the issue is that we open a new log writer and 
> close the old one at HLog#rollWriter() holding the updateLock, but the other 
> threads doing syncer() calls
> {code} 
> logSyncerThread.hlogFlush(this.writer);
> {code}

[jira] [Updated] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-23 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5623:
--

Comment: was deleted

(was: For patch v4:
{code}
+if (this.writerRef.get() != null) {
+  this.writerRef.get().close();
{code}
Shall we save the first writerRef.get() in a variable and use it to call 
close() ?

Stack's test isn't in patch v4.)

> Race condition when rolling the HLog and hlogFlush
> --
>
> Key: HBASE-5623
> URL: https://issues.apache.org/jira/browse/HBASE-5623
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 5623.txt, 5623v2.txt, HBASE-5623_v0.patch, 
> HBASE-5623_v4.patch
>
>
> When doing a ycsb test with a large number of handlers 
> (regionserver.handler.count=60), I get the following exceptions:
> {code}
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
>   at $Proxy1.multi(Unknown Source)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
>   at 
> org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
> {code}
> and 
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> {code}
> It seems the root cause of the issue is that we open a new log writer and 
> close the old one at HLog#rollWriter() holding the updateLock, but the other 
> threads doing syncer() calls
> {code} 
> logSyncerThread.hlogFlush(this.writer);
> {code}
> without holding the updateLock. LogSyncer only synchronizes against 
> concurrent appends and flush(), but not on the passed writer, which can be 
> closed already by rollWriter(). In this case, since 
> SequenceFile#Writer.close() sets it's out field as null, we get the NPE. 

--
This message is automatically generated by

[jira] [Updated] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-23 Thread Enis Soztutar (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-5623:
-

Attachment: HBASE-5623_v5.patch

@Ted, 
Forgot to git add. Also applied your suggestion. 

> Race condition when rolling the HLog and hlogFlush
> --
>
> Key: HBASE-5623
> URL: https://issues.apache.org/jira/browse/HBASE-5623
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 5623.txt, 5623v2.txt, HBASE-5623_v0.patch, 
> HBASE-5623_v4.patch, HBASE-5623_v5.patch
>
>
> When doing a ycsb test with a large number of handlers 
> (regionserver.handler.count=60), I get the following exceptions:
> {code}
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
>   at $Proxy1.multi(Unknown Source)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
>   at 
> org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
> {code}
> and 
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> {code}
> It seems the root cause of the issue is that we open a new log writer and 
> close the old one at HLog#rollWriter() holding the updateLock, but the other 
> threads doing syncer() calls
> {code} 
> logSyncerThread.hlogFlush(this.writer);
> {code}
> without holding the updateLock. LogSyncer only synchronizes against 
> concurrent appends and flush(), but not on the passed writer, which can be 
> closed already by rollWriter(). In this case, since 
> SequenceFile#Writer.close() sets it's out field as null, we get the NPE. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/Contac

[jira] [Updated] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-23 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5623:
-

Attachment: 5623-suggestion.txt

This one works for me.
The observation is that only with the updateLock held are we guaranteed that 
this.writer is not null (as Enis pointed out correctly).

So we can get a writer with the lock held, it is known to be not-null. If that 
writer has been closed we get an IOException, in which case we try again with 
the instance writer and the lock held.

Not entirely pretty, but avoids the AtomicReference everywhere.

> Race condition when rolling the HLog and hlogFlush
> --
>
> Key: HBASE-5623
> URL: https://issues.apache.org/jira/browse/HBASE-5623
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 5623-suggestion.txt, 5623.txt, 5623v2.txt, 
> HBASE-5623_v0.patch, HBASE-5623_v4.patch, HBASE-5623_v5.patch
>
>
> When doing a ycsb test with a large number of handlers 
> (regionserver.handler.count=60), I get the following exceptions:
> {code}
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
>   at $Proxy1.multi(Unknown Source)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
>   at 
> org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
> {code}
> and 
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> {code}
> It seems the root cause of the issue is that we open a new log writer and 
> close the old one at HLog#rollWriter() holding the updateLock, but the other 
> threads doing syncer() calls
> {code} 
> logSyncerThread.hlogFlush(this.writer);
> {code}
> without holding the updateLock. LogSyncer only synchronizes against 

[jira] [Updated] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-23 Thread Enis Soztutar (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-5623:
-

Attachment: HBASE-5623_v6-alt.patch

Thanks Lars for the patch. I retried again based on your comment, and was not 
able to reproduce the condition necessinating AtomicRef's. So, I just added one 
more check for in the unit test, and added a volatile to logRollRunning to your 
patch (5623-suggestion).

I have tested this with the unit test, but I am not able to test it with the 
ycsb cluster. I'll do that Monday. Until then, we can go with this patch, if 
you are comfortable. 

> Race condition when rolling the HLog and hlogFlush
> --
>
> Key: HBASE-5623
> URL: https://issues.apache.org/jira/browse/HBASE-5623
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 5623-suggestion.txt, 5623.txt, 5623v2.txt, 
> HBASE-5623_v0.patch, HBASE-5623_v4.patch, HBASE-5623_v5.patch, 
> HBASE-5623_v6-alt.patch
>
>
> When doing a ycsb test with a large number of handlers 
> (regionserver.handler.count=60), I get the following exceptions:
> {code}
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
>   at $Proxy1.multi(Unknown Source)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
>   at 
> org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
> {code}
> and 
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> {code}
> It seems the root cause of the issue is that we open a new log writer and 
> close the old one at HLog#rollWriter() holding the updateLock, but the other 
> threads doing syncer() calls
> {code} 
> logSyncerThread.hlogFlush(this.writer);
> {code}
> without holdin

[jira] [Updated] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-23 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5623:
--

Attachment: HBASE-5623_v6-alt.patch

Re-attaching Enis' patch for Hadoop QA

> Race condition when rolling the HLog and hlogFlush
> --
>
> Key: HBASE-5623
> URL: https://issues.apache.org/jira/browse/HBASE-5623
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 5623-suggestion.txt, 5623.txt, 5623v2.txt, 
> HBASE-5623_v0.patch, HBASE-5623_v4.patch, HBASE-5623_v5.patch, 
> HBASE-5623_v6-alt.patch, HBASE-5623_v6-alt.patch
>
>
> When doing a ycsb test with a large number of handlers 
> (regionserver.handler.count=60), I get the following exceptions:
> {code}
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
>   at $Proxy1.multi(Unknown Source)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
>   at 
> org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
> {code}
> and 
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> {code}
> It seems the root cause of the issue is that we open a new log writer and 
> close the old one at HLog#rollWriter() holding the updateLock, but the other 
> threads doing syncer() calls
> {code} 
> logSyncerThread.hlogFlush(this.writer);
> {code}
> without holding the updateLock. LogSyncer only synchronizes against 
> concurrent appends and flush(), but not on the passed writer, which can be 
> closed already by rollWriter(). In this case, since 
> SequenceFile#Writer.close() sets it's out field as null, we get the NPE. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA adminis

[jira] [Updated] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-23 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5623:
-

Attachment: 5623-v7.txt

Here's what I believe is a cleaner patch.
Includes Enis' test and volatile change.

> Race condition when rolling the HLog and hlogFlush
> --
>
> Key: HBASE-5623
> URL: https://issues.apache.org/jira/browse/HBASE-5623
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 5623-suggestion.txt, 5623-v7.txt, 5623.txt, 5623v2.txt, 
> HBASE-5623_v0.patch, HBASE-5623_v4.patch, HBASE-5623_v5.patch, 
> HBASE-5623_v6-alt.patch, HBASE-5623_v6-alt.patch
>
>
> When doing a ycsb test with a large number of handlers 
> (regionserver.handler.count=60), I get the following exceptions:
> {code}
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
>   at $Proxy1.multi(Unknown Source)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
>   at 
> org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
> {code}
> and 
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> {code}
> It seems the root cause of the issue is that we open a new log writer and 
> close the old one at HLog#rollWriter() holding the updateLock, but the other 
> threads doing syncer() calls
> {code} 
> logSyncerThread.hlogFlush(this.writer);
> {code}
> without holding the updateLock. LogSyncer only synchronizes against 
> concurrent appends and flush(), but not on the passed writer, which can be 
> closed already by rollWriter(). In this case, since 
> SequenceFile#Writer.close() sets it's out field as null, we get the NPE. 

--
This message is automatically generated by JIRA.
If you think it was

[jira] [Updated] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-23 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5623:
-

Attachment: 5623-v8.txt

Slightly simpler even.
Last patch today... Promised.

> Race condition when rolling the HLog and hlogFlush
> --
>
> Key: HBASE-5623
> URL: https://issues.apache.org/jira/browse/HBASE-5623
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 5623-suggestion.txt, 5623-v7.txt, 5623-v8.txt, 5623.txt, 
> 5623v2.txt, HBASE-5623_v0.patch, HBASE-5623_v4.patch, HBASE-5623_v5.patch, 
> HBASE-5623_v6-alt.patch, HBASE-5623_v6-alt.patch
>
>
> When doing a ycsb test with a large number of handlers 
> (regionserver.handler.count=60), I get the following exceptions:
> {code}
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
>   at $Proxy1.multi(Unknown Source)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
>   at 
> org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
> {code}
> and 
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> {code}
> It seems the root cause of the issue is that we open a new log writer and 
> close the old one at HLog#rollWriter() holding the updateLock, but the other 
> threads doing syncer() calls
> {code} 
> logSyncerThread.hlogFlush(this.writer);
> {code}
> without holding the updateLock. LogSyncer only synchronizes against 
> concurrent appends and flush(), but not on the passed writer, which can be 
> closed already by rollWriter(). In this case, since 
> SequenceFile#Writer.close() sets it's out field as null, we get the NPE. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly

[jira] [Updated] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-26 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5623:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to 0.94 and 0.96.

> Race condition when rolling the HLog and hlogFlush
> --
>
> Key: HBASE-5623
> URL: https://issues.apache.org/jira/browse/HBASE-5623
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 5623-suggestion.txt, 5623-v7.txt, 5623-v8.txt, 5623.txt, 
> 5623v2.txt, HBASE-5623_v0.patch, HBASE-5623_v4.patch, HBASE-5623_v5.patch, 
> HBASE-5623_v6-alt.patch, HBASE-5623_v6-alt.patch
>
>
> When doing a ycsb test with a large number of handlers 
> (regionserver.handler.count=60), I get the following exceptions:
> {code}
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
>   at $Proxy1.multi(Unknown Source)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
>   at 
> org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
> {code}
> and 
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> {code}
> It seems the root cause of the issue is that we open a new log writer and 
> close the old one at HLog#rollWriter() holding the updateLock, but the other 
> threads doing syncer() calls
> {code} 
> logSyncerThread.hlogFlush(this.writer);
> {code}
> without holding the updateLock. LogSyncer only synchronizes against 
> concurrent appends and flush(), but not on the passed writer, which can be 
> closed already by rollWriter(). In this case, since 
> SequenceFile#Writer.close() sets it's out field as null, we get the NPE. 

--
This message is automatically generated by JIRA.
If you think it wa