[ 
https://issues.apache.org/jira/browse/AMQ-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13825318#comment-13825318
 ] 

Hiram Chirino commented on AMQ-4837:
------------------------------------

Tenzin,

Does it also happen with the following build?

https://repository.apache.org/content/repositories/snapshots/org/apache/activemq/apache-activemq/5.10-SNAPSHOT/apache-activemq-5.10-20131106.134045-17-bin.tar.gz

> LevelDB corrupted in AMQ cluster
> --------------------------------
>
>                 Key: AMQ-4837
>                 URL: https://issues.apache.org/jira/browse/AMQ-4837
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: activemq-leveldb-store
>    Affects Versions: 5.9.0
>         Environment: CentOS, Linux version 2.6.32-71.29.1.el6.x86_64
> java-1.7.0-openjdk.x86_64/java-1.6.0-openjdk.x86_64
> zookeeper-3.4.5.2
>            Reporter: Guillaume
>            Assignee: Hiram Chirino
>            Priority: Critical
>         Attachments: LevelDBCorrupted.zip, activemq.xml
>
>
> I have clustered 3 ActiveMQ instances using replicated LevelDB and ZooKeeper.
> While running some tests through the web UI, I came across an issue that
> appears to corrupt the LevelDB data files.
> The issue can be reproduced with the following steps:
> 1.    Start 3 ActiveMQ nodes.
> 2.    Push a message to the master (Node1) and browse the queue using the
> web UI.
> 3.    Stop the master node (Node1).
> 4.    Push a message to the new master (Node2) and browse the queue using
> the web UI. The message summary and queue content are OK.
> 5.    Start Node1.
> 6.    Stop the master node (Node2).
> 7.    Browse the queue using the web UI on the new master (Node3). The
> message summary is OK, but clicking on the queue shows no message details.
> An error (see below) is logged by the master, which then attempts a restart.
> From this point on, the database appears to be corrupted, and the same error
> occurs on every node in an endless shutdown/restart loop. The only workaround
> is to stop the nodes and clear the data files.
> However, when a message is pushed between steps 5 and 6, the error doesn’t
> occur.
> =================================
> LevelDB configuration on the 3 instances:
>     <persistenceAdapter>
>         <replicatedLevelDB
>             directory="${activemq.data}/leveldb"
>             replicas="3"
>             bind="tcp://0.0.0.0:0"
>             zkAddress="zkserver:2181"
>             zkPath="/activemq/leveldb-stores"
>             />
>     </persistenceAdapter>
> =================================
> The error is:
> INFO | Stopping BrokerService[localhost] due to exception, java.io.IOException
> java.io.IOException
>         at org.apache.activemq.util.IOExceptionSupport.create(IOExceptionSupport.java:39)
>         at org.apache.activemq.leveldb.LevelDBClient.might_fail(LevelDBClient.scala:543)
>         at org.apache.activemq.leveldb.LevelDBClient.might_fail_using_index(LevelDBClient.scala:974)
>         at org.apache.activemq.leveldb.LevelDBClient.collectionCursor(LevelDBClient.scala:1270)
>         at org.apache.activemq.leveldb.LevelDBClient.queueCursor(LevelDBClient.scala:1194)
>         at org.apache.activemq.leveldb.DBManager.cursorMessages(DBManager.scala:708)
>         at org.apache.activemq.leveldb.LevelDBStore$LevelDBMessageStore.recoverNextMessages(LevelDBStore.scala:741)
>         at org.apache.activemq.broker.region.cursors.QueueStorePrefetch.doFillBatch(QueueStorePrefetch.java:106)
>         at org.apache.activemq.broker.region.cursors.AbstractStoreCursor.fillBatch(AbstractStoreCursor.java:258)
>         at org.apache.activemq.broker.region.cursors.AbstractStoreCursor.reset(AbstractStoreCursor.java:108)
>         at org.apache.activemq.broker.region.cursors.StoreQueueCursor.reset(StoreQueueCursor.java:157)
>         at org.apache.activemq.broker.region.Queue.doPageInForDispatch(Queue.java:1875)
>         at org.apache.activemq.broker.region.Queue.pageInMessages(Queue.java:2086)
>         at org.apache.activemq.broker.region.Queue.iterate(Queue.java:1581)
>         at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:129)
>         at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:47)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:722)
> Caused by: java.lang.NullPointerException
>         at org.apache.activemq.leveldb.LevelDBClient$$anonfun$queueCursor$1.apply(LevelDBClient.scala:1198)
>         at org.apache.activemq.leveldb.LevelDBClient$$anonfun$queueCursor$1.apply(LevelDBClient.scala:1194)
>         at org.apache.activemq.leveldb.LevelDBClient$$anonfun$collectionCursor$1$$anonfun$apply$mcV$sp$12.apply(LevelDBClient.scala:1272)
>         at org.apache.activemq.leveldb.LevelDBClient$$anonfun$collectionCursor$1$$anonfun$apply$mcV$sp$12.apply(LevelDBClient.scala:1271)
>         at org.apache.activemq.leveldb.LevelDBClient$RichDB.check$4(LevelDBClient.scala:315)
>         at org.apache.activemq.leveldb.LevelDBClient$RichDB.cursorRange(LevelDBClient.scala:317)
>         at org.apache.activemq.leveldb.LevelDBClient$$anonfun$collectionCursor$1.apply$mcV$sp(LevelDBClient.scala:1271)
>         at org.apache.activemq.leveldb.LevelDBClient$$anonfun$collectionCursor$1.apply(LevelDBClient.scala:1271)
>         at org.apache.activemq.leveldb.LevelDBClient$$anonfun$collectionCursor$1.apply(LevelDBClient.scala:1271)
>         at org.apache.activemq.leveldb.LevelDBClient.usingIndex(LevelDBClient.scala:968)
>         at org.apache.activemq.leveldb.LevelDBClient$$anonfun$might_fail_using_index$1.apply(LevelDBClient.scala:974)
>         at org.apache.activemq.leveldb.LevelDBClient.might_fail(LevelDBClient.scala:540)
>         ... 17 more



--
This message was sent by Atlassian JIRA
(v6.1#6144)
