On Fri, 2009-07-24 at 12:45 -0700, Sandy Pratt wrote:
> > In practice, it may be possible to fill a journal by using tiny
> > messages (~10 bytes) to just below the enqueue threshold, then
> > dequeueing all these messages except the first. In this case the
> > sheer number of dequeues will fill the remaining approx. 20% of the
> > journal while keeping the first message in place as a blocker. One
> > of the functional tests uses this technique to test that a full
> > journal occurs as expected.
>
> [Sandy Pratt]
> That's a clue. I was using messages of about 21 bytes previously, so
> maybe there were a few of those hanging out in the journal still. I
> didn't explicitly empty the queue when I changed the message size.
>
> > Thanks for bringing this up, I am interested to see how the enqueue
> > capacity check is being evaded. It may be some odd sequence that
> > occurs through JMS - but I fail to see how.
>
> [Sandy Pratt]
> I'll see what I can do.

Thanks for the journal files. Here is a summary of what I have found
and deduced by looking at the journal:
1. Your test (I assume) enqueues 32,000 records. It then dequeues
   about 10,000 records (those with even rids) using non-transactional
   dequeues, while 10,000 records (those with odd rids) are dequeued
   using transactional dequeues with a transaction block size of 1
   (i.e. one txn per dequeue). The last block of about 12,000 records
   dequeues all the remaining records non-transactionally.

2. The enqueues appear normal and are not transactional.

3. The first two sets of dequeues (the even and odd sets) occur at the
   same time, I assume from separate threads or processes. However,
   the transactional dequeues proceed much more slowly than the
   non-transactional ones: by the time the non-transactional (even)
   block of 10,024 records has been dequeued, the transactional block
   has dequeued only 55 of the odd records.

4. At this point, the third block of about 12,000 dequeues (both even
   and odd rids, non-tx) starts. When this block completes, the
   transactional dequeues have reached a total of only 84 of the
   odd-rid records.

5. Finally, with only the transactional dequeues remaining, the next
   1,405 records dequeue transactionally before the journal fills. The
   fact that these records remain until the end is important in
   understanding why the journal filled up.

6. As I have previously explained, only enqueues are subject to the
   threshold check that prevents the journal from getting more than
   about 80% full. Dequeues and transactions are not subject to this
   check, as they tend to free up journal space. Each enqueue record
   consumes 2 data blocks (each data block, or dblk, is 128 bytes) in
   the journal, while a non-transactional dequeue consumes only 1
   dblk. However, a transactional dequeue with a txn block size of 1
   consumes a whopping 8 dblks. This is because a transaction must
   flush both the record being transacted and a commit record, and
   each flush must write to disk in blocks of 512 bytes (i.e. 4 dblks)
   because of the use of O_DIRECT.
   Thus a dequeue-flush-commit-flush sequence consumes 8 dblks.

7. When a total of only 1,489 of the transactional dequeues have been
   written, the journal tries to switch to the next file, which
   contains the very enqueue records still being dequeued. Hence the
   slow pace of the transactional dequeues has caused these early
   records to become blockers in the circular disk journal.

BOTTOM LINE: This is in essence the same case I mentioned earlier, in
which a journal can be filled by dequeueing many small enqueues while
leaving a blocker in place. However, the eye-opener in this case for
me is the fact that transactional dequeues are both so slow and
consume 8 times the disk space of their non-transactional
counterparts. This means that transactions make the journal more
vulnerable to filling this way than I had previously thought.

This problem is fundamentally one of out-of-order dequeueing. Because
the transactional dequeues are so much slower than their
non-transactional counterparts, and because there is no
synchronization mechanism in the test (which would allow the tx
dequeues to catch up at regular intervals), many of the early records
have been allowed to become blockers in the journal.

Solutions are twofold:

1. Synchronize the test so that the different threads do not allow the
   out-of-order problem to become excessive.

2. Make the journal bigger. Both the size and number of the files used
   may be increased using the --num-jfiles and --jfile-size-pgs
   options when starting the broker (see broker help for further
   details).

At some point in the future, when auto-expand becomes available, this
mode of failure will be prevented if the feature is enabled. The
question of how to recover from a full journal remains.

Sandy, if you are interested, I can send you the journal analysis
summary file. I did notice another oddity in the test (not related to
this issue), but not knowing the details of the test itself, I cannot
make any definitive comments.
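For anyone who wants to play with the space arithmetic above, here is a
back-of-envelope sketch (plain Python, not Qpid code) using only the
figures stated in this message: 128-byte dblks, 512-byte O_DIRECT
flushes, 2 dblks per enqueue, 1 dblk per non-tx dequeue. The record
counts plugged in at the bottom are the ones from the analysis; the
function name is just for illustration.

```python
# Journal data-block (dblk) accounting, per the figures in this thread.
DBLK = 128                   # bytes per journal data block
FLUSH_DBLKS = 512 // DBLK    # O_DIRECT writes in 512-byte units -> 4 dblks

ENQ_DBLKS = 2                # each enqueue record: 2 dblks
NONTX_DEQ_DBLKS = 1          # non-transactional dequeue: 1 dblk
# Transactional dequeue with txn block size 1: flush the dequeue record
# (padded to 4 dblks) plus flush the commit record (another 4 dblks).
TX_DEQ_DBLKS = 2 * FLUSH_DBLKS   # = 8 dblks

def journal_dblks(enqueues, nontx_dequeues, tx_dequeues):
    """Total dblks written for a given mix of operations."""
    return (enqueues * ENQ_DBLKS
            + nontx_dequeues * NONTX_DEQ_DBLKS
            + tx_dequeues * TX_DEQ_DBLKS)

# Scenario from the analysis: 32,000 enqueues, ~22,000 non-tx dequeues
# (the even block plus the third block), and 1,489 tx dequeues written
# by the time the journal tried to switch files.
used = journal_dblks(32_000, 22_000, 1_489)
print(f"{used} dblks = {used * DBLK / 1024:.0f} KiB")
print(f"tx dequeue / non-tx dequeue cost: {TX_DEQ_DBLKS // NONTX_DEQ_DBLKS}x")
```

The 8x factor is the key point: each 1-record transaction pays for two
separate 512-byte padded flushes, so slow transactional dequeues both
linger as blockers and eat disk far faster than their size suggests.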
Kim

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:[email protected]
