[ 
https://issues.apache.org/jira/browse/AMQ-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rob Waite updated AMQ-3725:
---------------------------

    Comment: was deleted

(was: So I reattempted with the latest snapshot 
(apache-activemq-5.9-20131011.032953-115-bin) and it did seem a lot better. My 
test is really just to make a producer that sends messages in a tight loop and 
to use a USB thumbdrive for the data directory. Then, while the producer is 
writing as fast as it can, I yank out the thumbdrive.

We have seen issues with our SAN becoming unavailable and seeing very similar 
stack traces.

With the new snapshot... it seems like it recovers most of the time. Twice (in 
maybe 15 minutes of testing) it would not start back up. The first time it 
seemed like the USB drive "data" folder got corrupted... I couldnt read from 
the data folder from windows (it woudl say it was corrupted). I reformatted it 
and tried again and the most recent error seemed like it would recover but then 
a very vague null pointer error happened.

Attached is the output... at the end you can see it tried to restart and then a 
nullpointerexception occurred and it halted.

I know a thumbdrive would not be used in prod... but the exceptions seemed very 
similar to what we have seen with our NFS store.)

> Kahadb error during SAN failover delayed write - Allow kahaDB to recover in a 
> similar manner as the JDBC store using the IOExceptionHandler
> -------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-3725
>                 URL: https://issues.apache.org/jira/browse/AMQ-3725
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.5.1
>            Reporter: Jason Sherman
>             Fix For: 5.10.0
>
>         Attachments: AMQ-3725-10112013.txt
>
>
> An issue can arise that causes the broker to terminate when using kahaDB with 
> a SAN, when the SAN fails over.  In this case the failover process is 
> seamless however, on fail back there is a 2-3 sec delay where writes are 
> blocked and the broker terminates.  With the JDBC datastore a similar 
> situation can be handled by using the IOExceptionHandler.  However with 
> kahaDB, when this same IOExceptionHandler is added it prevents the broker 
> from terminating but kahaDB retains an invalid index.
> {code}
>  INFO | ActiveMQ JMS Message Broker (Broker1, 
> ID:macbookpro-251a.home-56915-1328715089252-0:1) started
>  INFO | jetty-7.1.6.v20100715
>  INFO | ActiveMQ WebConsole initialized.
>  INFO | Initializing Spring FrameworkServlet 'dispatcher'
>  INFO | ActiveMQ Console at http://0.0.0.0:8161/admin
>  INFO | ActiveMQ Web Demos at http://0.0.0.0:8161/demo
>  INFO | RESTful file access application at http://0.0.0.0:8161/fileserver
>  INFO | FUSE Web Console at http://0.0.0.0:8161/console
>  INFO | Started [email protected]:8161
> ERROR | KahaDB failed to store to Journal
> java.io.SyncFailedException: sync failed
>       at java.io.FileDescriptor.sync(Native Method)
>       at 
> org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:382)
>       at 
> org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
>  INFO | Ignoring IO exception, java.io.SyncFailedException: sync failed
> java.io.SyncFailedException: sync failed
>       at java.io.FileDescriptor.sync(Native Method)
>       at 
> org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:382)
>       at 
> org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
> ERROR | Checkpoint failed
> java.io.SyncFailedException: sync failed
>       at java.io.FileDescriptor.sync(Native Method)
>       at 
> org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:382)
>       at 
> org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
>  INFO | Ignoring IO exception, java.io.SyncFailedException: sync failed
> java.io.SyncFailedException: sync failed
>       at java.io.FileDescriptor.sync(Native Method)
>       at 
> org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:382)
>       at 
> org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
> ERROR | KahaDB failed to store to Journal
> java.io.FileNotFoundException: /Volumes/NAS-01/data/kahadb/db-1.log (No such 
> file or directory)
>       at java.io.RandomAccessFile.open(Native Method)
>       at java.io.RandomAccessFile.<init>(RandomAccessFile.java:216)
>       at 
> org.apache.kahadb.journal.DataFile.openRandomAccessFile(DataFile.java:70)
>       at 
> org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:324)
>       at 
> org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
>  INFO | Ignoring IO exception, java.io.FileNotFoundException: 
> /Volumes/NAS-01/data/kahadb/db-1.log (No such file or directory)
> java.io.FileNotFoundException: /Volumes/NAS-01/data/kahadb/db-1.log (No such 
> file or directory)
>       at java.io.RandomAccessFile.open(Native Method)
>       at java.io.RandomAccessFile.<init>(RandomAccessFile.java:216)
>       at 
> org.apache.kahadb.journal.DataFile.openRandomAccessFile(DataFile.java:70)
>       at 
> org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:324)
>       at 
> org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
> ERROR | KahaDB failed to store to Journal
> java.io.FileNotFoundException: /Volumes/NAS-01/data/kahadb/db-1.log (No such 
> file or directory)
>       at java.io.RandomAccessFile.open(Native Method)
>       at java.io.RandomAccessFile.<init>(RandomAccessFile.java:216)
>       at 
> org.apache.kahadb.journal.DataFile.openRandomAccessFile(DataFile.java:70)
>       at 
> org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:324)
>       at 
> org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
>  INFO | Ignoring IO exception, java.io.FileNotFoundException: 
> /Volumes/NAS-01/data/kahadb/db-1.log (No such file or directory)
> java.io.FileNotFoundException: /Volumes/NAS-01/data/kahadb/db-1.log (No such 
> file or directory)
>       at java.io.RandomAccessFile.open(Native Method)
>       at java.io.RandomAccessFile.<init>(RandomAccessFile.java:216)
>       at 
> org.apache.kahadb.journal.DataFile.openRandomAccessFile(DataFile.java:70)
>       at 
> org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:324)
>       at 
> org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
>  WARN | Transport failed: java.io.EOFException
>  WARN | Transport failed: java.io.EOFException
>  INFO | KahaDB: Recovering checkpoint thread after death
> ERROR | Checkpoint failed
> java.io.IOException: Input/output error
>       at java.io.RandomAccessFile.write(Native Method)
>       at java.io.RandomAccessFile.writeLong(RandomAccessFile.java:1001)
>       at org.apache.kahadb.page.PageFile.writeBatch(PageFile.java:1006)
>       at org.apache.kahadb.page.PageFile.flush(PageFile.java:484)
>       at 
> org.apache.activemq.store.kahadb.MessageDatabase.checkpointUpdate(MessageDatabase.java:1290)
>       at 
> org.apache.activemq.store.kahadb.MessageDatabase$10.execute(MessageDatabase.java:768)
>       at org.apache.kahadb.page.Transaction.execute(Transaction.java:760)
>       at 
> org.apache.activemq.store.kahadb.MessageDatabase.checkpointCleanup(MessageDatabase.java:766)
>       at 
> org.apache.activemq.store.kahadb.MessageDatabase$3.run(MessageDatabase.java:315)
>  INFO | Ignoring IO exception, java.io.IOException: Input/output error
> java.io.IOException: Input/output error
>       at java.io.RandomAccessFile.write(Native Method)
>       at java.io.RandomAccessFile.writeLong(RandomAccessFile.java:1001)
>       at org.apache.kahadb.page.PageFile.writeBatch(PageFile.java:1006)
>       at org.apache.kahadb.page.PageFile.flush(PageFile.java:484)
>       at 
> org.apache.activemq.store.kahadb.MessageDatabase.checkpointUpdate(MessageDatabase.java:1290)
>       at 
> org.apache.activemq.store.kahadb.MessageDatabase$10.execute(MessageDatabase.java:768)
>       at org.apache.kahadb.page.Transaction.execute(Transaction.java:760)
>       at 
> org.apache.activemq.store.kahadb.MessageDatabase.checkpointCleanup(MessageDatabase.java:766)
>       at 
> org.apache.activemq.store.kahadb.MessageDatabase$3.run(MessageDatabase.java:315)
>  INFO | KahaDB: Recovering checkpoint thread after death
> ERROR | Checkpoint failed
> java.io.IOException: Input/output error
>       at java.io.RandomAccessFile.write(Native Method)
>       at java.io.RandomAccessFile.writeLong(RandomAccessFile.java:1001)
>       at org.apache.kahadb.page.PageFile.writeBatch(PageFile.java:1006)
>       at org.apache.kahadb.page.PageFile.flush(PageFile.java:484)
>       at 
> org.apache.activemq.store.kahadb.MessageDatabase.checkpointUpdate(MessageDatabase.java:1290)
>       at 
> org.apache.activemq.store.kahadb.MessageDatabase$10.execute(MessageDatabase.java:768)
>       at org.apache.kahadb.page.Transaction.execute(Transaction.java:760)
>       at 
> org.apache.activemq.store.kahadb.MessageDatabase.checkpointCleanup(MessageDatabase.java:766)
>       at 
> org.apache.activemq.store.kahadb.MessageDatabase$3.run(MessageDatabase.java:315)
>  INFO | Ignoring IO exception, java.io.IOException: Input/output error
> java.io.IOException: Input/output error
>       at java.io.RandomAccessFile.write(Native Method)
>       at java.io.RandomAccessFile.writeLong(RandomAccessFile.java:1001)
>       at org.apache.kahadb.page.PageFile.writeBatch(PageFile.java:1006)
>       at org.apache.kahadb.page.PageFile.flush(PageFile.java:484)
>       at 
> org.apache.activemq.store.kahadb.MessageDatabase.checkpointUpdate(MessageDatabase.java:1290)
>       at 
> org.apache.activemq.store.kahadb.MessageDatabase$10.execute(MessageDatabase.java:768)
>       at org.apache.kahadb.page.Transaction.execute(Transaction.java:760)
>       at 
> org.apache.activemq.store.kahadb.MessageDatabase.checkpointCleanup(MessageDatabase.java:766)
>       at 
> org.apache.activemq.store.kahadb.MessageDatabase$3.run(MessageDatabase.java:315)
>  WARN | Transport failed: java.io.EOFException
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)

Reply via email to