On Thu, 28 May 2009, Rainer Gerhards wrote:

> As you probably know, I am currently implementing ultra-reliable queue
> processing mode. This is a re-design of important parts of the queue engine.
> The bulk of the work went rather well, but I am now having some serious
> trouble with terminating the queue in disk-assisted cases. Also, there are
> some issues with disk-assisted modes in general.
>
> Just as a reminder: disk-assisted (DA) mode is a mode where the queue usually
> runs in memory but is configured to go to disk if we hit the configured upper
> bound of messages to be kept in core. DA mode is implemented as two queues
> running concurrently. Whenever the high water mark is reached, the regular
> queue consumer is stopped, and a special queue-to-disk-queue consumer is
> started which then shuffles messages from the main queue to the disk queue.
> The regular consumer is then run off the disk queue (and only the disk
> queue). This is reversed as soon as the in-memory queue consumption hits a
> low water mark.
>
> Experience has shown that DA mode is of limited use when it comes to bursts -
> that is because it effectively slows down the engine greatly, as all messages
> need to go to the disk first.
>
> This was done to preserve the order of messages. However, with potentially
> multiple workers and large batches, we do not have any decent order of
> messages at all.
>
> I think DA mode would greatly benefit if we give up the approach of trying
> to preserve message sequence. In that case, I can run both the regular and
> the disk-based consumer at the same time. This solves at least a couple of
> the issues I have with termination (at least I think so), and it also makes
> this mode more efficient in practical use. This is because we can now
> continue to process data from the in-memory queue in parallel to shuffling
> it to disk (this is also the source of out-of-order processing).
>
> I would appreciate feedback on this issue.
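To make the quoted description of the current DA mode concrete, here is a minimal Python sketch of the high/low water mark switching. It is a toy model and not rsyslog's actual code: the class name, method names, and the use of a second in-memory deque as a stand-in for the disk queue are all assumptions made for illustration. It also assumes that any messages still in the disk queue are consumed before the regular consumer returns to the memory queue, which the original message does not spell out.

```python
from collections import deque


class DiskAssistedQueue:
    """Toy model of the DA mode described above (hypothetical, not rsyslog code)."""

    def __init__(self, high_water, low_water):
        assert 0 <= low_water < high_water
        self.high_water = high_water
        self.low_water = low_water
        self.memory = deque()   # in-memory main queue
        self.disk = deque()     # stand-in for the on-disk queue
        self.da_active = False  # True while the queue-to-disk consumer runs

    def enqueue(self, msg):
        """Producer side: messages always enter the in-memory queue."""
        self.memory.append(msg)
        if len(self.memory) >= self.high_water:
            self.da_active = True  # high water mark hit: switch to DA mode

    def shuffle_to_disk(self):
        """Queue-to-disk consumer: move one message from memory to disk.

        Returns True if a message was moved, False otherwise.
        """
        if not self.da_active or not self.memory:
            return False
        self.disk.append(self.memory.popleft())
        if len(self.memory) <= self.low_water:
            self.da_active = False  # low water mark hit: leave DA mode
        return True

    def dequeue(self):
        """Regular consumer: reads only the disk queue while DA mode is active.

        Assumption: leftover disk messages are drained before memory is read.
        """
        if self.disk:
            return self.disk.popleft()
        if not self.da_active and self.memory:
            return self.memory.popleft()
        return None  # nothing available to this consumer right now
```

In this model every message that arrives during DA mode takes a detour through the disk queue before the regular consumer sees it, which is the slowdown the message describes. The proposed change would amount to letting `dequeue` read from `memory` even while `da_active` is set, at the cost of ordering.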
There are already many things that can mess up the order: network hardware can re-order UDP messages, multiple worker threads can process messages out of sequence, etc.

I think it would be useful to document some of these things, as this is not the norm for syslog (which is usually single threaded).

Personally, I don't see DA mode as being for bursts (but then, I think of bursts in terms of saturating gig ethernet, far higher loads than any disk-based queue could handle). I see it as a mechanism to handle temporary (but lengthy) outages of the destination without running out of RAM.

When moving messages between the memory queue and the disk queue, it will probably speed things up significantly if they are moved in batches. This will also keep space in the memory queue to handle bursts of traffic.

David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com
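The batching suggestion in the reply above could look roughly like the following Python sketch. It is only an illustration under stated assumptions: `move_batch` is a hypothetical helper, messages are assumed to be strings without embedded newlines, and the "disk queue" is any writable text file object rather than rsyslog's real queue file format.

```python
from collections import deque


def move_batch(memory, disk, batch_size):
    """Move up to batch_size messages from the memory queue to disk.

    All messages in the batch go out in a single write and flush,
    instead of one write per message. Returns the number moved.
    """
    count = min(batch_size, len(memory))
    batch = [memory.popleft() for _ in range(count)]
    if batch:
        # One newline-terminated record per message (assumed format).
        disk.write("".join(msg + "\n" for msg in batch))
        disk.flush()
    return count
```

Called in a loop with, say, `batch_size=1000`, this frees space in the memory queue in large steps while paying the per-write overhead only once per batch.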

