Hi all,

I found some time to work on haproxy over the last few weeks and to make a number of fundamental changes that have been needed for a long time.

First, while working on SSL and compression at Exceliance, we found that the way the internal buffers and the HTTP messages interact is really annoying. This is a long-standing leftover from the migration which happened in 1.3, and it had to come to an end. Some buffer manipulation functions had to deal with pointers that were copied into other places, and because of this, some operations such as a simple realign were not possible. So I've changed the way it works. A buffer now has a base (or origin) pointer: everything below it is from the past and is leaving the buffer, and everything above it is new and waiting to be forwarded. HTTP messages don't hold an absolute pointer anymore, just offsets relative to the base pointer. The change was complex, but the code is much more manageable and offers much more flexibility now.

Some of these changes conflicted with the ACL and pattern frameworks, so it was the right moment to merge them. We now have a single sample fetch function for each type of data we want to extract, and both ACLs and patterns rely on it. The first user-visible benefit is that ACLs can now match cookies, URL parameters and arbitrary payloads. In practice, the current code is almost ready to enable session tracking on any input criteria. I thought I could make the track-sc1 and track-sc2 actions track headers, but that needed some more changes which were out of scope here, so I left them for later.

Since some ACL and pattern fetch methods supported an argument, a new argument management framework was implemented, which makes it very easy to declare a variable number of typed arguments for new keywords. Thanks to this extension, I could bring new optional arguments to the hdr() and cook() fetch methods to specify an occurrence number.
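
To make the occurrence number concrete, here is a minimal sketch in C. It is not taken from the haproxy sources: pick_occurrence() is a hypothetical helper, and it assumes the convention that positive numbers count from the first occurrence (1 is the first) while negative numbers count from the last one (-1 is the last). In this sketch, 0 and out-of-range values simply return no match.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not haproxy code: pick one value out of the
 * <count> occurrences of a header, given an occurrence number <occ>.
 *   occ > 0 : counted from the first occurrence (1 = first)
 *   occ < 0 : counted from the last occurrence (-1 = last)
 *   occ == 0 or out of range : no match, NULL is returned
 */
const char *pick_occurrence(const char *const *values, int count, int occ)
{
    assert(values != NULL && count >= 0);

    if (occ > 0 && occ <= count)
        return values[occ - 1];
    if (occ < 0 && occ >= -count)
        return values[count + occ];
    return NULL;
}
```

With three X-Forwarded-For values, occurrence 1 selects the first entry and occurrence -1 the last one, which is typically the one appended by the closest proxy.
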
The occurrence number allows stick-tables to extract an IP address from a precise occurrence of the X-Forwarded-For header, for instance, and makes it possible to write ACLs which match such headers against networks found in files.

Another point which had to be addressed was the automatic typing of samples. Since the pattern framework already supported automatic type casts, it was easy to complete this. Thanks to these types, we now support IPv6 ACLs, and the "src" and "dst" ACL/patterns are IPv4 or IPv6 depending on the data found. This is important because it means that it is now possible to mix v4 and v6 addresses in ACL patterns. As a side effect, the "src6" and "dst6" pattern fetches have been removed because they were redundant with "src" and "dst".

All these extensions required improved parsing and error reporting. So I have implemented a simple and convenient error reporting framework based on a new "memprintf()" function which acts on a single pointer that is automatically reallocated and freed. A large number of config parsing options (specifically the ACL ones) which used to report "error at line X" are now able to say something like "occurrence -20 too negative at argument 2 of hdr_ip(), must be >= -10". I wish I had done this earlier. It's so simple that it took far less time to implement than the time I spent coping without it in the past !

Along with these changes, the long-awaited "use-server" directive was introduced. It works as an exception to load balancing and persistence. It is convenient to avoid creating many backends when you want to select a server for a specific purpose (eg: monitoring).

The log framework has now learned to create, emit and log a unique request ID. Using the same syntax as log-format, it is possible to build a string which is supposed to uniquely identify a request in a given environment. This string is logged and emitted in headers so that everyone along the chain can log the same information, making it much easier to correlate events across large infrastructures.

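
Coming back to the error reporting part, the idea of a formatting function working on a single, automatically reallocated pointer can be pictured with the sketch below. It is only an illustration under assumptions, not the haproxy implementation: the name sketch_memprintf(), its exact signature and its failure behaviour are hypothetical. The point it tries to show is that the previous message can be passed as an argument to build the next one, so each parsing layer can add its own context.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch, not the haproxy implementation: format a message
 * into a freshly allocated string, store it in *out and release whatever
 * *out pointed to before. The previous string may be used as an argument,
 * which makes it easy to prepend context to an existing error message.
 * Returns the new string, or NULL on failure (*out is then set to NULL).
 */
char *sketch_memprintf(char **out, const char *fmt, ...)
{
    va_list args;
    char *msg = NULL;
    int len;

    assert(out != NULL && fmt != NULL);

    /* first pass: only compute the length of the formatted message */
    va_start(args, fmt);
    len = vsnprintf(NULL, 0, fmt, args);
    va_end(args);

    if (len >= 0)
        msg = malloc((size_t)len + 1);

    if (msg) {
        /* second pass: build the string; the old *out is still valid here */
        va_start(args, fmt);
        vsnprintf(msg, (size_t)len + 1, fmt, args);
        va_end(args);
    }

    free(*out);   /* free(NULL) is a no-op, so *out may start as NULL */
    *out = msg;
    return msg;
}
```

A caller would start with a NULL pointer, let the lowest layer write something like "occurrence -20 too negative, must be >= -10" into it, then have the upper layer call the function again on the same pointer with "at argument %d of %s(): %s" and the old message as the last argument. Only one free() is needed at the end.
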
The error capture system was lacking a number of important pieces of information. I discovered this while trying to track down a bug on my server which sometimes causes invalid contents to be emitted; haproxy blocks and logs them. Unfortunately, the level of information made these traces unusable. Now there is additional information such as the client's source port, all known internal flags, the position in the stream and the length of the last chunk. This will probably help when I get the error again.

Another point: I found an uninitialized entry in a structure which made me waste 2 hours because on one machine, the first malloc() returned a zeroed area while on another one it did not. So I have added a command line option to enable memory poisoning. It immediately gave me another occurrence which I fixed :-) However I think the code is safe now.

A number of other minor issues were fixed :
  - balance source did not properly hash IPv6 addresses (Alex Markham)
  - logformat could sometimes segfault (William Lallemand)
  - req_ssl_sni would randomly fail if a session ID is present (Emmanuel Bégazu)
  - doc cleanups and fixes to support HTML converter (Cyril Bonté)

What's in the pipe now ? We're still working on getting SSL and compression to work. I think that the deep changes are done for now. We found how to split the socket and protocol handling and it looks promising. I hope that we'll get something working soon so that we can work on the multi-process model, which is an absolute requirement if we want to get some performance on SSL. I already have some ideas and I believe I found how we could share the stats and server states between all processes. But that's for another version :-)

I've just released haproxy 1.5-dev9 with all the changes above. Thanks to all those who read this far.

site index : http://haproxy.1wt.eu/
sources    : http://haproxy.1wt.eu/download/1.5/src/devel/
changelog  : http://haproxy.1wt.eu/download/1.5/src/CHANGELOG

Cheers,
Willy