Hi Christian,

what's the option on the NFS ?

Most of the time, I use I use sync and I increase the rsize and wsize.

Anyway, the difference is huge. The problem could be related to the journal checksum check. Can you try checkForCorruptJournalFiles="false" on the kahadb configuration ?

Maybe increasing the kahadb jobs could help too:

            <kahaDB directory="${karaf.data}/activemq/activemq/kahadb"
                indexWriteBatchSize="1000"
                indexCacheSize="2000"
                journalMaxFileLength="32mb"
                checkForCorruptJournalFiles="false"
                maxAsyncJobs="5000"
                concurrentStoreAndDispatchQueues="true"
                concurrentStoreAndDispatchTopics="true"
                enableJournalDiskSyncs="true"
                enableIndexWriteAsync="true"/>


Regards
JB

On 02/15/2016 01:32 PM, Christian Schneider wrote:
At a customer we are experiencing a very strange behaviour of activemq.

We have two brokers on separate machines A and B. Storage is on a third
machine and imported using nfsv4.
Kahadb contains about 18 GB of journal files.

Scenario 1:
- Stop B
- Stop A (master now)
- Start A

The start of A takes about 50 seconds.

Scnario 2:
- Stop A
- Stop B (master now)
- Start A

The start of A takes about 5 minutes.

In the slow case the log at debug level shows a big gap between adding
the last queue (12:04) and running the ActiveMQ Journal Checkpoint
Worker again after the queue creation (12:09).
I have no idea what activemq is doing in the mean time but it seems to
take several minutes.

Can anyone explain this behaviour or explain what activemq does in the
mean time? I would also be happy about pointer to the code.
What I do not understand is why it makes a difference if the master is
started again or if the slave is becoming master. As they both share the
same kahadb over nfs I would expect the same startup
behaviour.

Christian

-----

Slow start
2016-02-15 12:03:01,943 | INFO  | Refreshing
org.apache.activemq.xbean.XBeanBrokerFactory$1@17b14695: startup date
[Mon Feb 15 12:03:01 CET 2016]; root of context hierarchy |
org.apache.activemq.xbean.XBeanBrokerFactory$1 | main
...
2016-02-15 12:03:03,018 | INFO  |
PListStore:[/path/activemq/data/A/tmp_storage] started |
org.apache.activemq.store.kahadb.plist.PListStoreImpl | main
2016-02-15 12:03:03,033 | INFO  | ignoring zero length, partially
initialised journal data file: db-1.log number = 1 , length = 0 |
org.apache.activemq.store.kahadb.disk.journal.Journal | main
2016-02-15 12:03:03,177 | INFO  | JobSchedulerStore:path/scheduler
started |
org.apache.activemq.store.kahadb.scheduler.JobSchedulerStoreImpl | main
2016-02-15 12:03:03,180 | INFO  | JobScheduler using directory:
path/scheduler | org.apache.activemq.broker.BrokerService | main
2016-02-15 12:03:03,335 | INFO  | Using Persistence Adapter:
KahaDBPersistenceAdapter[/path] |
org.apache.activemq.broker.BrokerService | main
2016-02-15 12:03:03,350 | INFO  | JMX consoles can connect to
service:jmx:rmi:///jndi/rmi://localhost:1617/jmxrmi |
org.apache.activemq.broker.jmx.ManagementContext | JMX connector
2016-02-15 12:03:05,198 | INFO  | Corrupt journal records found in
'path/db-1122.log' between offsets: 17893102..17893732 |
org.apache.activemq.store.kahadb.disk.journal.Journal | main
2016-02-15 12:03:47,925 | INFO  | KahaDB is version 5 |
org.apache.activemq.store.kahadb.MessageDatabase | main
....
2016-02-15 12:03:49,233 | INFO  | Recovering from the journal ... |
org.apache.activemq.store.kahadb.MessageDatabase | main
2016-02-15 12:03:49,237 | INFO  | Recovery replayed 20 operations from
the journal in 0.094 seconds. |
org.apache.activemq.store.kahadb.MessageDatabase | main
2016-02-15 12:04:26,863 | DEBUG | Checkpoint started. |
org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
Checkpoint Worker
2016-02-15 12:04:26,932 | DEBUG | Checkpoint done. |
org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
Checkpoint Worker
2016-02-15 12:04:26,940 | INFO  | Apache ActiveMQ 5.10.2 (A,
ID:A-59167-1455534183070-1:1) is starting |
org.apache.activemq.broker.BrokerService | main
...
2016-02-15 12:04:30,289 | DEBUG | A adding destination: queue://q1.DLQ |
org.apache.activemq.broker.region.AbstractRegion | main
2016-02-15 12:04:57,026 | DEBUG | queue://q1.DLQ expiring messages .. |
org.apache.activemq.broker.region.Queue | ActiveMQ Broker[A] Scheduler
2016-02-15 12:04:57,029 | DEBUG | q1.DLQ toPageIn: 5, Inflight: 0,
pagedInMessages.size 0, pagedInPendingDispatch.size 0, enqueueCount: 0,
dequeueCount: 0, memUsage:0 | org.apache.activemq.broker.region.Queue |
ActiveMQ Broker[A] Scheduler
2016-02-15 12:09:20,440 | DEBUG | Checkpoint started. |
org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
Checkpoint Worker
2016-02-15 12:09:20,476 | DEBUG | Checkpoint done. |
org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
Checkpoint Worker
2016-02-15 12:09:20,488 | DEBUG | queue://q1.DLQ expiring messages done.
| org.apache.activemq.broker.region.Queue | ActiveMQ Broker[A] Scheduler



--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

Reply via email to