Re: [ANNOUNCE] haproxy-1.7-dev6
Hi Willy.

Thank you as always for the detailed answer.

On 10-11-2016 06:51, Willy Tarreau wrote:
> Hi Aleks,
>
> On Thu, Nov 10, 2016 at 12:52:22AM +0100, Aleksandar Lazic wrote:
>>> http://www.haproxy.org/download/1.7/doc/SPOE.txt
>>
>> I have read the doc. Very interesting.
>>
>> If I understand this sentence correctly, it is currently only
>> possible to check some headers, right?
>> ###
>> Actually, for now, the SPOE can offload the processing before
>> "tcp-request content", "tcp-response content", "http-request" and
>> "http-response" rules.
>> ###
>
> In theory, since you pass sample fetch results as arguments, it should
> be possible to pass anything. For example, you can already parse the
> beginning of a body in http-request if you have enabled the correct
> option to wait for the body (http-buffer-request or something like
> this). So in theory you could even pass a part of it right now.

Interesting idea.

>> So a header-only WAF is now "easily" possible instead of the full
>> stack with mod_security.
>> http://blog.haproxy.com/2012/10/12/scalable-waf-protection-with-haproxy-and-apache-with-modsecurity/
>
> In theory yes. And that's one of the goals. My initial intent with this
> protocol was to be able to delegate some heavy processing outside of
> the process, both for blocking stuff (eg: ldap libs are always at least
> a little bit blocking) and for anything requiring threads.
>
> Then I realized that it would solve other problems. For example, we
> have 3 device detection engines; none of them is ever built in by
> default because they have external dependencies, so users who want to
> use them have to rebuild haproxy and can no longer use their distro
> packages. Such components could possibly be moved to external agents.
>
> Another point is WAF. People have a love-hate relationship with their
> WAF, whatever it is. When you deploy your first WAF, you start by
> loving it because you see in the logs that it blocks a lot of stuff.
> Then your customers complain about breakage and you have to tune it and
> find a tradeoff between protection and compatibility. And one day they
> get hacked, they declare the WAF useless, and you want to change
> everything. Having the WAF built into haproxy would mean that users
> would have to switch to another LB just to use a different WAF! With
> SPOP we can imagine having various WAF implementations in external
> processes that users can choose from.

Well, nothing to add!

> A last motive is stability. At haptech we have implemented support for
> loadable modules (and you know how much I don't want to see this in our
> version here). Developing these modules requires extreme care and a lot
> of skill with haproxy's internals. We currently have a few such
> modules, providing nice improvements, but their usefulness will be
> debatable depending on the user. Supporting modules is thus interesting
> because not everyone is forced to load code they don't necessarily need
> or want, and it saves us from having to maintain patches. However, we
> have to enforce a very tight check on the internal API to ensure a
> module is not loaded on a different version, which means that users
> have to update their modules at the same time they update the haproxy
> executable. But despite this, there is always the risk that a bug in
> some experimental code we put there corrupts the whole process and does
> nasty stuff (random crashes, corrupted responses, never-ending
> connections, etc). With an external process it's much easier for anyone
> to develop experimental code without taking any risk for the main
> process. And if something crashes, you know on which side it crashed,
> so you can guess why. A WAF is typically not something I would like to
> see implemented as a module; I would fear support escalations for
> crashed processes!
>
> So you see, there are plenty of good reasons for being able to move
> some content processing outside of haproxy, and these reasons have
> driven the protocol design. The first implementation focuses on having
> something usable first, even if not optimal (eg: we didn't implement
> pipelining of requests yet, but given that there are connection pools
> it's not a big deal).

>> Some attacks are also in the POST body; I assume this will come in the
>> future after some good tests.
>
> Yes, that's planned. You should already be able to pass a full buffer
> of data using req.body (not tested). This is even why the protocol
> supports fragmented payloads. It's more complicated to implement,
> though. We could even imagine doing some compression outside (eg: sdch,
> snappy, or whatever). In 1.7, the compression was moved to filters, so
> it's quite possible to move it to an external process as well.
>
> We'll be very interested in getting feedback such as "I tried to
> implement this and failed". The protocol currently looks nice and
> extensible. But I know from experience that you can plan for
> everything, and still the first feature someone requests cannot be
> fulfilled and will require a protocol update :-)

I will start to create a Dockerfile for 1.7 a
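For context, wiring up such an SPOE agent only takes a few lines of configuration. Here is a minimal, untested sketch following the conventions in SPOE.txt; the engine name "my-waf", the agent, backend and message names, and the arguments passed are all made up for illustration:

```
# haproxy.cfg -- attach a hypothetical "my-waf" SPOE engine
frontend www
    mode http
    bind :8080
    filter spoe engine my-waf config /etc/haproxy/spoe-waf.conf
    default_backend apps

# the SPOA (agent) processes sit behind an ordinary TCP backend
backend agents
    mode tcp
    server agent1 127.0.0.1:12345

# /etc/haproxy/spoe-waf.conf
[my-waf]
spoe-agent waf-agent
    messages    check-request
    timeout processing 10ms
    use-backend agents

spoe-message check-request
    # sample fetch results passed as named arguments to the agent
    args method=method path=path
    event on-frontend-http-request
```

The agent can then set variables that later "http-request" rules inspect to accept or deny the request.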
Re: [ANNOUNCE] haproxy-1.7-dev6
Hi Aleks,

On Thu, Nov 10, 2016 at 12:52:22AM +0100, Aleksandar Lazic wrote:
>> http://www.haproxy.org/download/1.7/doc/SPOE.txt
>
> I have read the doc. Very interesting.
>
> If I understand this sentence correctly, it is currently only possible
> to check some headers, right?
> ###
> Actually, for now, the SPOE can offload the processing before
> "tcp-request content", "tcp-response content", "http-request" and
> "http-response" rules.
> ###

In theory, since you pass sample fetch results as arguments, it should
be possible to pass anything. For example, you can already parse the
beginning of a body in http-request if you have enabled the correct
option to wait for the body (http-buffer-request or something like
this). So in theory you could even pass a part of it right now.

> So a header-only WAF is now "easily" possible instead of the full stack
> with mod_security.
> http://blog.haproxy.com/2012/10/12/scalable-waf-protection-with-haproxy-and-apache-with-modsecurity/

In theory yes. And that's one of the goals. My initial intent with this
protocol was to be able to delegate some heavy processing outside of the
process, both for blocking stuff (eg: ldap libs are always at least a
little bit blocking) and for anything requiring threads.

Then I realized that it would solve other problems. For example, we have
3 device detection engines; none of them is ever built in by default
because they have external dependencies, so users who want to use them
have to rebuild haproxy and can no longer use their distro packages.
Such components could possibly be moved to external agents.

Another point is WAF. People have a love-hate relationship with their
WAF, whatever it is. When you deploy your first WAF, you start by loving
it because you see in the logs that it blocks a lot of stuff. Then your
customers complain about breakage and you have to tune it and find a
tradeoff between protection and compatibility. And one day they get
hacked, they declare the WAF useless, and you want to change everything.
Having the WAF built into haproxy would mean that users would have to
switch to another LB just to use a different WAF! With SPOP we can
imagine having various WAF implementations in external processes that
users can choose from.

A last motive is stability. At haptech we have implemented support for
loadable modules (and you know how much I don't want to see this in our
version here). Developing these modules requires extreme care and a lot
of skill with haproxy's internals. We currently have a few such modules,
providing nice improvements, but their usefulness will be debatable
depending on the user. Supporting modules is thus interesting because
not everyone is forced to load code they don't necessarily need or want,
and it saves us from having to maintain patches. However, we have to
enforce a very tight check on the internal API to ensure a module is not
loaded on a different version, which means that users have to update
their modules at the same time they update the haproxy executable. But
despite this, there is always the risk that a bug in some experimental
code we put there corrupts the whole process and does nasty stuff
(random crashes, corrupted responses, never-ending connections, etc).
With an external process it's much easier for anyone to develop
experimental code without taking any risk for the main process. And if
something crashes, you know on which side it crashed, so you can guess
why. A WAF is typically not something I would like to see implemented as
a module; I would fear support escalations for crashed processes!

So you see, there are plenty of good reasons for being able to move some
content processing outside of haproxy, and these reasons have driven the
protocol design. The first implementation focuses on having something
usable first, even if not optimal (eg: we didn't implement pipelining of
requests yet, but given that there are connection pools it's not a big
deal).

> Some attacks are also in the POST body; I assume this will come in the
> future after some good tests.

Yes, that's planned. You should already be able to pass a full buffer of
data using req.body (not tested). This is even why the protocol supports
fragmented payloads. It's more complicated to implement, though. We
could even imagine doing some compression outside (eg: sdch, snappy, or
whatever). In 1.7, the compression was moved to filters, so it's quite
possible to move it to an external process as well.

We'll be very interested in getting feedback such as "I tried to
implement this and failed". The protocol currently looks nice and
extensible. But I know from experience that you can plan for everything,
and still the first feature someone requests cannot be fulfilled and
will require a protocol update :-)

>> Finally some minor performance improvements were brought to the HTTP
>> parser for large requests or responses (eg: long URLs, huge cookies).
>> I've observed up to 10% increase
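The untested req.body idea above would look roughly like this in configuration terms. This is only a sketch: the engine, agent and backend names are invented, and per the discussion the body has to be buffered before the sample has any data:

```
frontend www
    mode http
    bind :8080
    # wait for the request body so the req.body sample has data
    option http-buffer-request
    filter spoe engine body-check config /etc/haproxy/spoe-body.conf
    default_backend apps

backend body-agents
    mode tcp
    server agent1 127.0.0.1:12346

# /etc/haproxy/spoe-body.conf
[body-check]
spoe-agent body-agent
    messages    check-body
    timeout processing 20ms
    use-backend body-agents

spoe-message check-body
    # pass the buffered request body to the agent
    args body=req.body
    event on-frontend-http-request
```

Large bodies are where the protocol's fragmented-payload support would come into play.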
Re: [ANNOUNCE] haproxy-1.7-dev6
Hi Willy.

On 10-11-2016 00:18, Willy Tarreau wrote:
> Hi,
>
> HAProxy 1.7-dev6 was released on 2016/11/09. It added 61 new commits
> after version 1.7-dev5.

Great ;-)

[snip]

> - and the new stream processing offload engine (SPOE). Yes, we had to
>   give it a name, and the protocol is called SPOP. This is what allows
>   haproxy to offload some of its processing to external processes,
>   which can apply some actions and set variables. There are a few
>   things that really please me here. The first one obviously is that
>   it was completed in time; kudos to Christopher on this one! The next
>   one is that I personally find the design quite clean: we left some
>   room to improve the protocol later if needed, and to improve our
>   first implementation of the protocol without breaking backwards
>   compatibility. Another is that the code lies in its own file without
>   affecting the rest of the code at all; it relies solely on the new
>   filters infrastructure, which at the same time starts to prove its
>   maturity, and this is great. The last one is that there's quite an
>   extensive doc and even an example of an external agent to be used as
>   a starting point for moving your processing outside. Most likely the
>   first use cases will be to implement various forms of authentication
>   or content inspection. We're obviously interested in feedback here.
>   Those not using it don't have to fear any side effect. More info
>   here : http://www.haproxy.org/download/1.7/doc/SPOE.txt

I have read the doc. Very interesting.

If I understand this sentence correctly, it is currently only possible
to check some headers, right?
###
Actually, for now, the SPOE can offload the processing before
"tcp-request content", "tcp-response content", "http-request" and
"http-response" rules.
###

So a header-only WAF is now "easily" possible instead of the full stack
with mod_security.
http://blog.haproxy.com/2012/10/12/scalable-waf-protection-with-haproxy-and-apache-with-modsecurity/

Some attacks are also in the POST body; I assume this will come in the
future after some good tests.

> Finally some minor performance improvements were brought to the HTTP
> parser for large requests or responses (eg: long URLs, huge cookies).
> I've observed up to 10% increase in request rate with 1kB cookies and
> 100-char URIs.

For me very impressive. Wow, respect.

> The goal now really is to test this version and to release it with
> minimal changes in 1-2 weeks depending on feedback and bug reports.
> Yes, that's short, so if you have a few minor pending patches that
> you'd like to get merged in 1.7, send them NOW. There are still a
> number of things I'd like to see better arranged, so cleanups and code
> moves may still happen and are still welcome, but we must not perform
> other important changes now. Please, if you want to touch anything in
> dumpstats.c, notify William, who is trying to tidy all this horrible
> mess by moving all non-stats parts to their relevant files (no code
> change, just functions being reshuffled around).

If I interpret this right, HTTP/2 will be on the roadmap for 1.8 or 2.0?
Some of our customers want to use http2_push. I think this requires that
the HTTP/2 client side (towards the backend) also needs to be
implemented, right?

BR Aleks
[ANNOUNCE] haproxy-1.7-dev6
Hi,

HAProxy 1.7-dev6 was released on 2016/11/09. It added 61 new commits
after version 1.7-dev5.

I must say I'm *really* happy because we managed to merge all the stuff
that was still pending in dev5, and everyone involved managed to find
some time to get their work merged, which I do appreciate given that
we're all pretty busy at this time of the year.

There are still quite a few important changes, but with limited impact
on existing code since most of them were performed in side areas. The 4
main changes are :

  - ability to start a server whose address doesn't resolve, to decide
    how it must resolve upon startup, and to let the DNS or the CLI set
    its IP address later. This is achieved thanks to the new "init-addr"
    server setting. It implies that server address resolution is now
    delayed to a later point in the boot sequence, and that it is now
    possible to pre-configure pools of unused servers that can be
    populated at run time when needed. This brings extra interesting
    improvements that we didn't anticipate. The first one is that when a
    config has errors, you now get all resolution errors at once instead
    of having to edit the file one line at a time and try again. The
    second one is that it's now trivial to completely ignore server
    address resolution failures, so we added a new debug option for this
    (-dr). That's convenient for people who, like me, often face configs
    which don't resolve in their environment and still want to validate
    the parsing. Please refer to the doc for this. [work done by
    Baptiste and me]

  - a DNS resolution failure can now finally bring a server down once
    the hold time has expired. This was missing in 1.6 and meant that
    traffic could be sent to a wrong server if the address was
    reassigned to someone else. Combined with init-addr above, this
    provides an interesting method to transparently enable/disable
    servers in dynamic farms. [work done by Baptiste]

  - initial support for OpenSSL 1.1.0 was added. It builds with some
    warnings reminding us that parts of the old API are now deprecated,
    but it seems to work. Compatibility with OpenSSL 1.0.1/1.0.2 was
    maintained and is assured via a compatibility file mapping the new
    API onto the old one. At this moment, OpenSSL 0.9.8 doesn't build
    anymore. It doesn't seem terribly complicated to fix, but as usual
    in these situations it's a painful process, and we preferred to
    focus on the other pending work given that 0.9.8 is not supported
    anymore. However, if someone is willing to address this, patches are
    more than welcome! I suggest adding a distinct section in the
    openssl-compat file for 0.9.8 as its API differs from 1.0.x. Distro
    maintainers might be interested in giving it a try on their next
    distros. [work done by Dirkjan Bussink]

  - and the new stream processing offload engine (SPOE). Yes, we had to
    give it a name, and the protocol is called SPOP. This is what allows
    haproxy to offload some of its processing to external processes,
    which can apply some actions and set variables. There are a few
    things that really please me here. The first one obviously is that
    it was completed in time; kudos to Christopher on this one! The next
    one is that I personally find the design quite clean: we left some
    room to improve the protocol later if needed, and to improve our
    first implementation of the protocol without breaking backwards
    compatibility. Another is that the code lies in its own file without
    affecting the rest of the code at all; it relies solely on the new
    filters infrastructure, which at the same time starts to prove its
    maturity, and this is great. The last one is that there's quite an
    extensive doc and even an example of an external agent to be used as
    a starting point for moving your processing outside. Most likely the
    first use cases will be to implement various forms of authentication
    or content inspection. We're obviously interested in feedback here.
    Those not using it don't have to fear any side effect. More info
    here : http://www.haproxy.org/download/1.7/doc/SPOE.txt

We also now have a third device detection engine, WURFL, contributed by
Scientiamobile. The code is clean and well isolated, so it was not a
problem to merge it this late in the release process. I took this
opportunity to clean up our README by moving the parts specific to
DeviceAtlas and 51Degrees to their own files, as they used to represent
1/3 of the whole file.

Aside from this, we fixed the last pending bugs around the systemd
wrapper, as well as the issue I introduced in 1.6 when porting the peers
to the new applet subsystem, which caused some connections to stay there
forever, sometimes preventing old processes from disappearing upon
reload. The drain state is now properly restored upon reload when using
the state-file.

Finally some minor perfo
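As a hedged illustration of the init-addr and DNS hold behaviour described above (the resolver address and server names are invented for the example):

```
resolvers mydns
    nameserver dns1 10.0.0.53:53
    # how long a previously valid answer is kept once resolution starts
    # failing; after it expires the server can now really go down
    hold valid 10s

backend app
    # try the state-file address first, then libc resolution, and if
    # both fail start anyway without an address, to be filled in later
    # by runtime DNS or the CLI
    server app1 app1.example.com:80 check resolvers mydns init-addr last,libc,none
    # a pre-configured spare slot that intentionally starts addressless
    server spare1 spare1.example.com:80 check disabled init-addr none
```

The new -dr debug option complements this when validating configs, e.g. `haproxy -c -f haproxy.cfg -dr` on a machine where those names don't resolve.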