Re: Inkonsistent forward-for

2014-05-20 Thread Jeffrey 'jf' Lim
On Wed, May 21, 2014 at 2:47 PM, Jürgen Haas  wrote:
> Am 21.05.2014 08:40, schrieb Jeffrey 'jf' Lim:
>> On Wed, May 21, 2014 at 2:29 PM, Jürgen Haas  wrote:
>> It's been some time since I last looked at the code, but I reckon it
>> would be the same issue I came across a while back. Do a dump on the
>> traffic to be sure. The RFC allows headers with multiple values to be
>> represented either as repeated headers, each with one value, or as a
>> single header with all the values separated by commas. Either way,
>> your backend has to be smart enough to deal with both formats.
>>
>> -jf
>
> Thanks Jeffrey, do you reckon I should dump traffic at the backend or
> on the proxy? If the latter, any advice on how this could be done?
>

At the backend, of course. Look into tcpdump. I think you would do
well to investigate the point that others have made about tunnel mode
as well.
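A minimal capture along those lines might look like this; the interface name and port are assumptions, and tcpdump needs root to run:

```shell
# Sketch, not a prescription: capture HTTP traffic on the backend so you
# can see exactly which X-Forwarded-For format arrives on the wire.
# eth0 and port 80 are assumptions -- substitute your backend's values.
tcpdump -nn -A -s0 -i eth0 'tcp port 80'
# Then search the ASCII payload (-A) for "X-Forwarded-For" to check
# whether it arrives as one comma-separated header or repeated headers.
```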

-jf

--
He who settles on the idea of the intelligent man as a static entity
only shows himself to be a fool.

Mensan / Full-Stack Technical Polymath / System Administrator
12 years over the entire web stack: Performance, Sysadmin, Ruby and Frontend



Re: Inkonsistent forward-for

2014-05-20 Thread Jeffrey 'jf' Lim
On Wed, May 21, 2014 at 2:29 PM, Jürgen Haas  wrote:
> Hi there,
>
> I'm having some issues with the forward-for feature. It seems to be
> working in general but for some reason not consistently. My default
> section in the config file looks like this:
>
> defaults
>   log global
>   mode http
>   option httplog
>   option dontlognull
>   option forwardfor
>   retries  3
>   maxconn 1000
>   timeout connect 5000ms
>   timeout client 120s
>   timeout server 120s
>   default_backend backend_ts1
>
> The apache config files on all web servers are configured so that they
> use the X-Forwarded-For header field if available:
>
> LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b \"%{Referer}i\"
> \"%{User-Agent}i\"" proxy
> SetEnvIf X-Forwarded-For "^.*\..*\..*\..*" forwarded
> CustomLog ${APACHE_LOG_DIR}/access.log combined env=!forwarded
> CustomLog ${APACHE_LOG_DIR}/access.log proxy env=forwarded
>
> However, a lot of requests still get logged with the IP address of the
> proxy instead of the original client.
>
> We are using HA-Proxy version 1.5-dev19 2013/06/17 and I wonder if
> anyone had an idea what the reason for that could be.
>


It's been some time since I last looked at the code, but I reckon it
would be the same issue I came across a while back. Do a dump on the
traffic to be sure. The RFC allows headers with multiple values to be
represented either as repeated headers, each with one value, or as a
single header with all the values separated by commas. Either way,
your backend has to be smart enough to deal with both formats.
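To make the point concrete, here is a minimal sketch (Python, with invented names; nothing here comes from haproxy itself) of a backend normalizing the two representations:

```python
# The RFC allows a multi-valued header to arrive either as repeated header
# lines or as a single comma-separated line; a backend should treat both
# the same. Function and variable names here are illustrative only.
def forwarded_for_chain(header_values):
    """Flatten X-Forwarded-For header lines into one ordered address list."""
    chain = []
    for value in header_values:        # one entry per received header line
        for part in value.split(","):  # a line may itself hold several values
            part = part.strip()
            if part:
                chain.append(part)
    return chain

repeated = ["203.0.113.7", "198.51.100.9"]  # two separate header lines
combined = ["203.0.113.7, 198.51.100.9"]    # one comma-separated line
print(forwarded_for_chain(repeated) == forwarded_for_chain(combined))  # True
```

Either wire format yields the same client chain once flattened this way.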

-jf

--
He who settles on the idea of the intelligent man as a static entity
only shows himself to be a fool.

Mensan / Full-Stack Technical Polymath / System Administrator
12 years over the entire web stack: Performance, Sysadmin, Ruby and Frontend



Re: Spam

2014-04-14 Thread Jeffrey 'jf' Lim
On Mon, Apr 14, 2014 at 9:25 PM, Willy Tarreau  wrote:
> On Mon, Apr 14, 2014 at 09:03:44AM -0400, Patrick Hemmer wrote:
>> While I strongly disagree, I can respect your reasoning. But perhaps
>> there are solutions other than restricting non-subscribers. I can think
>> of these few without much thought:
>> 1) add grey listing (http://en.wikipedia.org/wiki/Greylisting).
>
> We already have some greylisting which is why we have "only" a little
> bit of spam.
>
>> 2) add a header indicating whether the sender is subscribed to the
>> mailing list. Then anyone who wants to remain on the list can add a
>> filter to auto-delete mail when the sender isn't on the list. I don't
>> know the numbers, but I'd bet the valid non-subscriber mail is rare.
>
> Among 2000 participants, we have 700 permanent subscribers, which means
> that other ones unsubscribe after a few exchanges. Some people (including
> myself) also have multiple addresses and will post from work or home
> using a different one while they don't want to be subscribed multiple
> times.
>
>> 3) Add a spamhaus IP blacklist. While I doubt this would block any
>> legitimate mail, it is possible. So I expect this to be met with the
>> same resistance as only allowing subscribers.
>
> We're currently using one such crappy BL, which was the reason people
> from gmail were recently denied posting, so we had to relax it. The
> problem with blacklists is that they're maintained by people who quickly
> get addicted to the great power they have by being able to decide who is
> allowed to send mail and who isn't allowed. It quickly turns a technical
> tool into a political one.
>
> I don't have a magic solution to this. The real point is that a mailing
> list always comes with some spam and any mailbox also comes with some
> spam anyway. So as long as the mailing list only adds a few percent of
> spam to the one you already have, I really think it's not worth blocking
> legitimate users to try to save this.
>
> Willy
>

(Note: I'm replying to all, because I don't know who is legitimately
subscribed to the list. Probably all of you are? In which case I'll
assume your mail clients are already set up to de-duplicate messages.)

How about a 4th "magic" solution? (And yes, this will involve some
magic, specifically on the part of the mailing list.) Allow people
brought into a conversation to be automatically subscribed to that
(and only that) specific thread, so they don't lose messages if
somebody neglects to "reply all". No work at all is necessary on their
part (i.e. no subscription and unsubscription). It would involve
tracking a bit more data (specifically, which recipients belong to
which mail threads). The only question is how often this happens
(people getting added to a conversation). And if this extra data
proves to be a problem, it can always be expunged after some
determined interval. Magic tracking! (Just like sticky sessions, haha.)

-jf



Re: Haproxy & F5 usage question

2013-01-09 Thread Jeffrey 'jf' Lim
On Thu, Jan 10, 2013 at 2:05 AM, DeMarco, Alex wrote:

>  I have a situation where a backend server defined in HAProxy may be a
> vip on our F5. The F5 vip is setup for source persistence. Right now
> all the requests to this vip from the haproxy box are all going to one
> pool member. Obviously the f5 is seeing the ip of the server and not the
> true client. I do have haproxy sending out the X-Forwarded-For. But the f5
> does not see it.
>

So let me get this right. You've got a BIGIP sitting behind a HAProxy
instance? Why are things configured this way?

-jf



>
> Anyone have an example of how  scenario like this would work?   Do I need
> to modify haproxy or is this an f5 issue?
>
>
> Thank you again  in advance..
>
> Alex DeMarco
> Manager of Technical Services
> The State University of New York
> State University Plaza - Albany, New York 12246
> Tel: 518.320.1398  Fax: 518.320.1550
> Be a part of Generation SUNY: Facebook - Twitter - YouTube
>

Re: Multiple Load Balancers, stick table and url-embedded session support

2010-12-09 Thread Jeffrey 'jf' Lim
On Thu, Dec 9, 2010 at 7:27 PM, Hank A. Paulson <
h...@spamproof.nospammail.net> wrote:

> Please see the thread:
> "need help figuring out a sticking method"
>
> I asked about this; Willy says there are issues figuring out a workable
> config syntax for 'regex to pull the URL/URI substring', but (I think)
> coding the functionality is not technically super-difficult, just not
> enough hands maybe, and the config syntax?
>
>
Actually, if the "key" is to be taken from a query param, that is
relatively easy (I coded something myself for a client some time back,
based on 1.3.15.4). If, however, more flexibility is required (as in
your case), then the point Willy mentioned will definitely come into play.
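For what it's worth, the query-param case can be sketched in a few lines (Python here purely for illustration; this is not the 1.3.15.4 patch, which was never posted):

```python
from urllib.parse import parse_qs, urlparse

# Illustrative only: pull a sticking key out of a URL-embedded session id,
# as in David's http://example.com/foo?sess=someid example. A balancer
# would hash or look up this key to pick a server.
def sticking_key(url, param="sess"):
    values = parse_qs(urlparse(url).query).get(param)
    return values[0] if values else None

print(sticking_key("http://example.com/foo?sess=someid"))  # someid
print(sticking_key("http://example.com/foo"))              # None
```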

-jf


> I have a feeling this would be a fairly commonly used feature, so it is good
> to see others asking the same question  :)
>
> How are you planning to distribute the traffic to the different haproxy
> instances? LVS? Some hardware?
>
>
> On 12/8/10 8:58 PM, David wrote:
>
>> Hi there,
>>
>> I have been asked to design an architecture for our load-balancing needs,
>> and
>> it looks like haproxy can do almost everything needed in a fairly
>> straightfoward way. Two of the requirements are stickiness support (always
>> send a request for a given session to the same backend) as well as
>> multiple
>> load balancers running at the same time to avoid single point of failure
>> (hotbackup with only one haproxy running at a time is not considered
>> acceptable).
>>
>> Using multiple HAproxy instances in parallel with stickiness support looks
>> relatively easy if cookies are allowed (through e.g. cookie prefixing)
>> since
>> no information needs to be shared. Unfortunately, we also need to support
>> session id embedded in URL (e.g. http://example.com/foo?sess=someid), and
>> I
>> was hoping that the new sticky table replication in 1.5 could help for
>> that,
>> but I am not sure it is the case.
>>
>> As far as I understand, I need to first define a table with string type,
>> and
>> then use the store-request to store the necessary information. I cannot
>> see a
>> way to get some information embedded in the URL using the existing query
>> extraction methods. Am I missing something, or is it difficult to do this
>> with
>> haproxy ?
>>
>> regards,
>>
>> David
>>
>>
>


Re: clarification of CD termination code

2010-08-04 Thread Jeffrey 'jf' Lim
On Thu, Aug 5, 2010 at 7:29 AM, Bryan Talbot  wrote:
> In the tcpdump listed below, doesn't the next-to-last RST also include an
> ACK of the data previously sent?  If so, then the client has received all
> of the data and ACK'd it, but then rudely closed the TCP connection
> without the normal FIN exchange.  Is my reading correct?
>
> 19:03:33.106842 IP 10.79.25.20.4266 > 10.79.6.10.80: S
> 2041799057:2041799057(0) win 65535 
> 19:03:33.106862 IP 10.79.6.10.80 > 10.79.25.20.4266: S
> 266508528:266508528(0) ack 2041799058 win 5840 
> 19:03:33.106945 IP 10.79.25.20.4266 > 10.79.6.10.80: . ack 1 win 65535
> 19:03:33.107045 IP 10.79.25.20.4266 > 10.79.6.10.80: P 1:269(268) ack 1 win
> 65535
> 19:03:33.107060 IP 10.79.6.10.80 > 10.79.25.20.4266: . ack 269 win 6432
> 19:03:33.134401 IP 10.79.6.10.80 > 10.79.25.20.4266: P 1:270(269) ack 269
> win 6432
> 19:03:33.134442 IP 10.79.6.10.80 > 10.79.25.20.4266: F 270:270(0) ack 269
> win 6432
> 19:03:33.134548 IP 10.79.25.20.4266 > 10.79.6.10.80: R 269:269(0) ack 270
> win 0
> 19:03:33.134562 IP 10.79.25.20.4266 > 10.79.6.10.80: R
> 2041799326:2041799326(0) win 0
>
>

Yes - I've encountered this myself, and after looking into the
traffic, observed the very same thing from Windows clients...
Definitely frustrating behaviour in terms of causing all these alerts
in the logs...

-jf


--
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."
--Richard Stallman

"It's so hard to write a graphics driver that open-sourcing it would not help."
-- Andrew Fear, Software Product Manager, NVIDIA Corporation
http://kerneltrap.org/node/7228



Re: Performance on an Atom D510 Dual Core 1.66GHz

2010-07-26 Thread Jeffrey 'jf' Lim
On Tue, Jul 27, 2010 at 1:27 PM, Willy Tarreau  wrote:
>
> OK so here are a few results of haproxy 1.4.8 running on Atom D510 (64-bit)
> without keep-alive :
>
> 6400 hits/s on 0-bytes objets
> 6200 hits/s on 1kB objects (86 Mbps)
> 5700 hits/s on 2kB objects (130 Mbps)
> 5250 hits/s on 4kB objects (208 Mbps)
> 3300 hits/s on 8kB objects (250 Mbps)
> 2000 hits/s on 16kB objects (300 Mbps)
> 1300 hits/s on 32kB objects (365 Mbps)
> 800 hits/s on 64kB objects (450 Mbps)
> 480 hits/s on 128kB objects (535 Mbps)
> 250 hits/s on 256kB objects (575 Mbps)
> 135 hits/s on 512kB objects (610 Mbps)
>
>
> This requires binding the NIC's interrupt on one core and binding haproxy
> to the other core. That way, it leaves about 20% total idle on the NIC's
> core. Otherwise, the system tends to put haproxy on the same core as the
> NIC and the results are approximately half of that.
>
> Quick tests with keep-alive enabled report 7400 hits/s instead of 6400
> for the empty file test, and 600 instead of 5250 for the 4kB file, thus
> minor savings.
>

Hi Willy, are you talking about 6000 ("6000 instead of 5250")? Or 600?

-jf


--
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."
--Richard Stallman

"It's so hard to write a graphics driver that open-sourcing it would not help."
-- Andrew Fear, Software Product Manager, NVIDIA Corporation
http://kerneltrap.org/node/7228



Re: Matching URLs at layer 7

2010-04-28 Thread Jeffrey 'jf' Lim
On Wed, Apr 28, 2010 at 7:51 PM, Andrew Commons
 wrote:
> Hi Beni,
>
> A few things to digest here.
>
> What was leading me up this path was a bit of elementary (and probably
> naïve) white-listing with respect to the contents of the Host header and
> the URI/URL supplied by the user. Tools like Fiddler make request
> manipulation trivial, so filtering out 'obvious' manipulation attempts
> would be a good idea. With this in mind my thinking (if it can be
> considered as such) was that:
>
> (1) user request is for http://www.example.com/whatever
> (2) Host header is www.example.com
> (3) All is good! Pass request on to server.
>
> Alternatively:
>
> (1) user request is for http://www.example.com/whatever
> (2) Host header is www.whatever.com
> (3) All is NOT good! Flick request somewhere harmless.
>

Benedikt has explained this already (see his first reply). There is no
such thing. What you see as the "user request" is really sent as the
Host header plus the URI.

Also, to answer another question you raised: the HTTP specification
states that header names are case-insensitive. I don't know about
haproxy's treatment, though (I'm too lazy to delve into the code right
now - and really, you can test it to find out for yourself).
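A quick way to see the case-insensitivity rule in action (the stdlib `email.message.Message` container happens to do case-insensitive header lookup, which makes it a convenient demo; this says nothing about haproxy's own code):

```python
from email.message import Message  # stdlib container, case-insensitive lookup

# HTTP header names are case-insensitive per the spec, so a conforming
# parser must match them however the sender capitalized them.
headers = Message()
headers["X-Forwarded-For"] = "203.0.113.7"
print(headers["x-forwarded-for"])  # 203.0.113.7
print(headers["X-FORWARDED-FOR"])  # 203.0.113.7
```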

-jf


--
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."
--Richard Stallman

"It's so hard to write a graphics driver that open-sourcing it would not help."
-- Andrew Fear, Software Product Manager, NVIDIA Corporation
http://kerneltrap.org/node/7228



Re: Backend sends 204, haproxy sends 502

2009-10-28 Thread Jeffrey 'jf' Lim
On Wed, Oct 28, 2009 at 9:54 PM, Dirk Taggesell <
dirk.tagges...@googlemail.com> wrote:

> On Wed, Oct 28, 2009 at 2:19 PM, Jeffrey 'jf' Lim wrote:
>
> what version of haproxy is this?
>>
>
> Ah sorry. It is 1.3.17
>
>
>>  do 200 requests from the same backend passed through haproxy work?
>>
>
> Yes, haproxy generally works when i test it with an ordinary Apache as
> back-end instead of the custom app.
>
>
>> I can't say that i've looked too closely at the code for this, but, I get
>> the impression that haproxy generally returns 502 for stuff that it cannot
>> recognize.
>>
>
> I am afraid it is so. There's some paragraphs in the documentation which
> suggest that.
>
>
>> And one other thing to look at - what is the log line like for this
>> particular request?
>>
>
> Oct 28 13:50:57 127.0.0.1 haproxy[3282]: 88.217.248.214:42160
> [28/Oct/2009:13:50:57.690] cookietracker cookietracker/cookietracker
> 1/0/0/-1/3 502 204 - - SL-- 2000/0/0/0/0 0/0 "GET /c HTTP/1.1"
>
>
It looks like Karsten's suspicion is correct. Try adding the
'Content-Length: 0' header. haproxy is still expecting more data from the
backend. (It apparently does not know about status 204???)

And to answer Karsten's question: the Content-Length header isn't strictly
mandated (it's a 'SHOULD').
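For what it's worth, the suggested fix can be sketched as a backend answering 204 with an explicit zero Content-Length; this handler is an illustrative stand-in, not Dirk's actual application:

```python
import http.server
import threading
import urllib.request

# Hedged sketch: a backend that answers 204 with an explicit
# "Content-Length: 0" so an intermediary knows no body follows.
class NoContentHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(204)
        self.send_header("Content-Length", "0")  # the header Karsten suggested
        self.end_headers()                       # 204 carries no body

    def log_message(self, *args):                # keep the demo quiet
        pass

server = http.server.HTTPServer(("127.0.0.1", 0), NoContentHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

resp = urllib.request.urlopen("http://127.0.0.1:%d/c" % server.server_port)
body = resp.read()
print(resp.status, resp.headers.get("Content-Length"), len(body))
server.shutdown()
```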

-jf




> followed after some seconds by about several dozen of these lines:
> Oct 28 13:52:01 127.0.0.1 haproxy[3282]: 10.224.115.160:43562
> [28/Oct/2009:13:51:11.732] trackertest trackertest/ -1/1/0/-1/5 0
> 0 - - sL-- 1902/1902/1902/0/0 0/0 ""
>
> 10.224.115.160 is the server's ip NATed address (Amazon EC2)
>


Re: Backend sends 204, haproxy sends 502

2009-10-28 Thread Jeffrey 'jf' Lim
On Wed, Oct 28, 2009 at 9:02 PM, Dirk Taggesell <
dirk.tagges...@googlemail.com> wrote:

> Hi all,
>
> I want to load balance a new server application that generally sends
> http code 204 - to save bandwidth and to avoid client-side caching.
> In fact it only exchanges cookie data, thus no real content is delivered
> anyway.
>
> When requests are made via haproxy, the backend - as intended - delivers
> a code 204 but haproxy instead turns it into a code 502. Unfortunately I
> cannot use tcp mode because the server app needs the client's IP
> address. Is there something else I can do?
>
>
What version of haproxy is this? Do requests that return 200 from the same
backend work when passed through haproxy? I can't say that I've looked too
closely at the code for this, but I get the impression that haproxy
generally returns 502 for stuff it cannot recognize.

And one other thing to look at - what is the log line like for this
particular request?

-jf

--
In the meantime, here is your PSA:
"It's so hard to write a graphics driver that open-sourcing it would not
help."
   -- Andrew Fear, Software Product Manager, NVIDIA Corporation
http://kerneltrap.org/node/7228


Re: haproxy running at 100% cpu

2009-09-17 Thread Jeffrey 'jf' Lim
On Thu, Sep 17, 2009 at 10:25 PM, Marc  wrote:

> Hi All,
> HAProxy is using up 100% of (1) CPU on my dual-core server.  This had
> happened to me in the past a couple of times and both were due to how I
> compiled HAProxy.  Pretty sure I have it compiled correctly now though.
>
>
I would be curious to know: how did you compile it the previous times,
when you had problems?

-jf


Re: Persistence based on a server id url param

2009-06-02 Thread Jeffrey 'jf' Lim
On Tue, Jun 2, 2009 at 3:32 PM, Willy Tarreau  wrote:

> Hi Ryan,
>
> On Mon, Jun 01, 2009 at 12:22:57PM -0700, Ryan Schlesinger wrote:
> > I've got haproxy set up (with 2 frontends) to load balance a php app
> > which works great.  However, we're using a java uploader applet that
> > doesn't appear to handle cookies.  It would be simple for me to have the
> > uploader use a URL with the server id in it (just like we're already
> > doing with the session id) but I don't see any way to get haproxy to
> > treat that parameter as the actual server id.  Using hashing is not an
> > option as changing the number of running application servers is a normal
> > occurrence for us.  I also can't use the appsession directive as the
> > haproxy session id cache isn't shared between the two frontends (both
> > running an instance of haproxy).  Can this be done with ACLs and I'm
> > missing it?
>


I actually made a patch for a client a while back that does this exact
thing. I'll see if the client is OK with sharing the code - or
open-sourcing it. Willy's approach is also an interesting way of doing
it - you control the decision of what to do when the backend is down
using the 'srv(1|2)_up' ACLs.

-jf



> You could very well use ACLs to match your URL parameter in the
> frontend and switch to either backend 1 or backend 2 depending
> on the value.
>
> Alternatively, you could hash the URL parameter (balance url_param)
> but it would not necessarily be easy for your application to generate
> an URL param which will hash back to the same server. So I think that
> the ACL method is the most appropriate for your case.
>
> Basically you'd do that :
>
> frontend
>acl srv1 url_sub SERVERID=1
>acl srv2 url_sub SERVERID=2
>acl srv1_up nbsrv(bck1) gt 0
>acl srv2_up nbsrv(bck2) gt 0
>use_backend bck1 if srv1_up srv1
>use_backend bck2 if srv2_up srv2
>default_backend bck_lb
>
> backend bck_lb
># Perform load-balancing. Servers state is tracked
># from other backends.
>balance roundrobin
>server srv1 1.1.1.1 track bck1/srv1
>server srv2 1.1.1.2 track bck2/srv2
>...
>
> backend bck1
>balance roundrobin
>server srv1 1.1.1.1 check
>
> backend bck2
>balance roundrobin
>server srv2 1.1.1.2 check
>
> That's just a guideline, but I think you should manage to get
> it working based on that.
>
> Regards,
> Willy
>
>
>


Re: The use of dst_conn vs. connslots

2009-04-23 Thread Jeffrey 'jf' Lim
On Fri, Apr 24, 2009 at 6:09 AM, Ninad Raje  wrote:
> Hi,
>
> I recently upgraded haproxy to 1.3.17.
>
> I use dst_conn to divert the excess server load to an alternate backend. I'm
> exploring the possibility of using connslots instead of dst_conn. But to me
> both look the same.
>

They don't. Read the documentation again; it specifically points out
the difference between 'dst_conn' and 'connslots'. Although,
admittedly, the addition of one statement could help to clarify things
a lot in this case :) (I should probably add that in - together with a
little bit of revision to the code): "'dst_conn' is for frontends,
whereas 'connslots' is for backends."
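A sketch of how the two might sit in a config (names and thresholds are invented for illustration, based on 1.3-era syntax):

```haproxy
frontend fe_web
    bind :80
    # dst_conn counts connections on the frontend's destination address,
    # so it gates the frontend as a whole.
    acl fe_busy dst_conn gt 500
    # connslots counts the *remaining* connection + queue slots across
    # all servers of the named backend, so it gates that backend.
    acl main_full connslots(main) lt 50
    use_backend alternate if main_full
    default_backend main
```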


> My main backend has a pool of 5 servers. Each server handles 100 requests.
> The number 100 includes 20 concurrent, in-process requests + 80 in-queue
> requests. I've following configuration:
> acl use_main dst_conn lt 500
> use_backend main if use_main
> default_backend alternate
>
> Using connslots the configuration looks like,
> acl nearly_full connslots(main) lt 500
> use_backend alternate if nearly_full

You mean 'use_backend main'.


> default_backend alternate
>


> I'm not too clear about the value of connslots. The documentation says that
> connslots is defined per backend. My each server handles 100 requests
> (maxconn = 20 + maxqueue = 80, Total = 100). There are 5 servers in main
> backend. What if I add a server or remove a server from the backend? Do I
> have to set the value of connslots again? manually?
>

yes. But then again, you would have to edit the config as well in
order to add or remove a server from the backend.


> With reference to my configuration, I was expecting the value of connslots
> to be 100 and not 500. So that I don't have to update this value whenever I
> add/delete servers from my backend.
>

This would only work (wanting to specify 100 in your case - instead of
500) if all the servers in your backend were the same.


> Please share your experiences and help me calrify connslots value.
>
> Also I'm not too sure about the syntax of using connslots. The documentation
> says,
>
> connslots(backend) <integer>
>
> But haproxy 1.3.17 complains with that syntax. I changed the syntax to,
>
> connslots(backend) lt <integer>
>
> The haproxy 1.3.17 does not complain.
>

Try that with 'nbsrv' - you get the same thing as well. I believe if
you do not specify an operator with integer matching, it is assumed to
be 'eq'. (But I haven't taken too much of a look at the code for this
- maybe Willy can clarify here.)

-jf

--
In the meantime, here is your PSA:
"It's so hard to write a graphics driver that open-sourcing it would not help."
-- Andrew Fear, Software Product Manager, NVIDIA Corporation
http://kerneltrap.org/node/7228



Re: what are "normal" ports to bind/connect to?

2009-04-23 Thread Jeffrey 'jf' Lim
On Wed, Apr 22, 2009 at 8:27 PM, Jan-Frode Myklebust  wrote:
> On 2009-04-22, Jeffrey 'jf' Lim  wrote:
>>>
>>
>>> A clear error on start up and description of how to resolve would be
>>> useful I guess, but then I guess haproxy wouldn't know it was being
>>> blocked by selinux policy?
>>
>> nope... Unless there were some kind of a "standard" way to inform an
>> application (in which case, of course, the app would have to be
>> programmed for that).
>
> Yes, I think it's supposed to be up to the OS to notify the admin here
> via logs, popups or email.. I always have a
>
>        tail -f /var/log/audit/auditd.log|grep avc
>
> running when I install new services.. Then I immediately see if
> something is denied.
>

ok. So no protocol then.


>
>> -jf (too, ha)
>
> I was considering if two "-jf"'s might be too much, and if I should
> find another ha-proxying solution since you were here first :-)
>
>

Oh come on - you shouldn't have to do that... Where would you go
to find as good a load balancer as haproxy? ;)

-jf

--
In the meantime, here is your PSA:
"It's so hard to write a graphics driver that open-sourcing it would not help."
-- Andrew Fear, Software Product Manager, NVIDIA Corporation
http://kerneltrap.org/node/7228



Re: what are "normal" ports to bind/connect to?

2009-04-22 Thread Jeffrey 'jf' Lim
On Wed, Apr 22, 2009 at 6:00 PM, Malcolm Turnbull
 wrote:
> Jan-Frode,
>
> Thats a tough one :-).
>
> 80 obviously
>
> 21/23/25/81/8080
>

Hm... FTP's tricky. If you want to include 21, then you might want to
consider 20 as well.


> A clear error on start up and description of how to resolve would be
> useful I guess, but then I guess haproxy wouldn't know it was being
> blocked by selinux policy?
>

nope... Unless there were some kind of a "standard" way to inform an
application (in which case, of course, the app would have to be
programmed for that).

-jf (too, ha)

--
In the meantime, here is your PSA:
"It's so hard to write a graphics driver that open-sourcing it would not help."
-- Andrew Fear, Software Product Manager, NVIDIA Corporation
http://kerneltrap.org/node/7228



Re: Weird problem with stats_uri

2009-04-04 Thread Jeffrey 'jf' Lim
On Sun, Apr 5, 2009 at 1:06 PM, Will Buckner  wrote:
> Hey guys,
>
> I'm seeing an interesting thing with 1.3.x in my production environment... I
> have my stats_uri defined as /proxy_http_stats, requiring a username and
> password. When I access this URL, about half the time I'm taken to a 404
> page on my destination servers (haproxy never catches the request for stats)
> and about half of the time I'll get the stats page. Any ideas on where I can
> start looking?
>

Start by eliminating the question of caches from your equation.
What do you mean by "haproxy never catches the request for stats"? Does
it at least come out in the haproxy logs (i.e. haproxy *did* see the
request - it just didn't interpret it as a stats_uri)?

-jf

--
In the meantime, here is your PSA:
"It's so hard to write a graphics driver that open-sourcing it would not help."
-- Andrew Fear, Software Product Manager, NVIDIA Corporation
http://kerneltrap.org/node/7228



Re: patch: nested acl evaluation

2009-04-04 Thread Jeffrey 'jf' Lim
On Sat, Apr 4, 2009 at 3:30 PM, Willy Tarreau  wrote:
> On Sat, Apr 04, 2009 at 10:20:23AM +0800, Jeffrey 'jf' Lim wrote:
>> > OK maybe "use" is OK in fact, considering the alternatives.
>> >
>>
>> :) some proposals for the keywords:
>>
>> for/use
>> condition/use
>> cond/use
>>
>> (cond/use seems the best compromise - short, but understandable enough)
>
> what would you think about "do/use" ?
>

I actually prefer cond/use (more intuitive with the idea of an "if"
block), but hey, you're the man... :P


>> > If we extend the system, we'll have to associate parameters
>> > with a condition.
>> >
>> > But since the entry point is the switching rule here, maybe we'll end up
>> > with something very close to what you have implemented in your patch, in
>> > the end, and it's just that we want to make it more generic to use the
>> > two conditions layers in any ruleset.
>> >
>>
>> I would guess so! Even the redirect rules and 'block' rules look
>> pretty similar... :)
>
> yes, and maybe this is what we should try to improve : converge towards
> a generic condition matching/execution system, to which we pass the
> action and the args. That way, we just have to run it over redirect_rules
> or switching_rules always using the same code.
>

:) right... :D Does this sound like fun? lol

-jf



Re: patch: nested acl evaluation

2009-04-03 Thread Jeffrey 'jf' Lim
On Fri, Apr 3, 2009 at 3:11 PM, Willy Tarreau  wrote:
> On Fri, Apr 03, 2009 at 01:37:53PM +0800, Jeffrey 'jf' Lim wrote:
>> > OK. Just so that I get an idea, how many use_backend rules (and how many
>> > backends) do you have in a large config ? I'm asking because I want to be
>> > able to support ACL files and rules files, and the way to implement them
>> > should mostly be driven by large config users.
>> >
>>
>> we're on the way to reaching 100 use_backend rules even as we speak...
>> :) (and because these rules are largely of the pattern i've
>> mentioned.. this motivated this patch here)
>
> OK, that's still pretty reasonable. I know configurations with 500 "listen"
> sections, and fortunately they don't use ACLs !
>

right.


>> > I have no problem with "use" if we only support "use_backend", but
>> > I was thinking about extending it to support other keywords, and
>> > thought it would be a bit awkward with keywords such as "redirect"
>> > or "block". Maybe I'm wrong and "use" fits all usages ?
>> >
>>
>> 'use' is kind of generic, I suppose. I have to admit, though - that
>> was proposed with my usage situation and tons of 'use_backend'
>> swimming around in my head! :P But I get your point - you want to
>> expand this to other keywords.
>
> OK maybe "use" is OK in fact, considering the alternatives.
>

:) some proposals for the keywords:

for/use
condition/use
cond/use

(cond/use seems the best compromise - short, but understandable enough)


>> Yeah, I had to cut it in two halves because of the "overloading" of the
>> 'use_backend' keyword... Of course if we use a separate keyword, then
>> nothing like that happens :P I cut acl processing in half too - but of
>> course, I guess that's only a given since we want to be able to reduce
>> processing anyway.
>
> Well, keep in mind that processing ACLs is rather cheap anyway. Since most
> of the CPU time is already spent in system, even if you would multiply
> haproxy's work by 10, you would only cut performance in half. I have
> memories of something like less than 10-20% performance loss with 1000 ACLs,
> but I may be wrong. That does not prevent us from optimising for massive
> usage though ;-)
>

half is good for me! :)


> (...)
>> > While we would have something more like this :
>> >
>> >    use_backend_block if X or Y
>> >       use B1 if X1 or Y1 or Z1
>> >       use B2 if X2 or Y2 or Z2
>> >
>> >  => foreach acl in X Y; do
>> >       if (acl)
>> >         foreach rule in B1 B2; do
>> >           foreach acl in X1 Y1 Z1; do
>> >             if (acl) use_backend $B
>> >
>>
>> hm... I dunno. I think it's more of like this? (this is great for
>> dealing with 'and' conditions as well, btw)
>>
>> if X or Y; do // or could be 'X and Y'! 'unless X', ... whatever
>>   if X1 or Y1 or Z1
>>     use_backend B1
>>   else if X2 or Y2 or Z2
>>     use_backend B2
>
> Yes, you're right, in fact your "if" above are the call to acl_exec_cond()
> while my "foreach acl in" was also the call to acl_exec_cond(). I just wanted
> to emphasize on the fact that we're not scanning ACLs then deciding on an
> action, but rather check all conditions for an actions and perform it if a
> rule matches.

right.


> If we extend the system, we'll have to associate parameters
> with a condition.
>
> But since the entry point is the switching rule here, maybe we'll end up
> with something very close to what you have implemented in your patch, in
> the end, and it's just that we want to make it more generic to use the
> two conditions layers in any ruleset.
>

I would guess so! Even the redirect rules and 'block' rules look
pretty similar... :)

-jf



Re: patch: nested acl evaluation

2009-04-02 Thread Jeffrey 'jf' Lim
On Fri, Apr 3, 2009 at 1:00 PM, Willy Tarreau  wrote:
> On Fri, Apr 03, 2009 at 08:55:11AM +0800, Jeffrey 'jf' Lim wrote:
> (...)
>>
>> Having said that, there are blocks already in the standard config
>> (like "backend", with multiple "server" lines) - but the difference is
>> they don't need an ending keyword, so ok I see the problem here is
>> with the mandatory end (guess that's what you mean by "section", vs
>> "block"!)
>
> yes. I don't like it when we need to explicitly terminate a block (or
> section if you want). This always leads to trouble in the long term,
> because of missing or inverted termination statements. So the language
> must be unambiguous enough not to require this. BTW, that's what you
> have in cisco-like configs.
>

:) now that you mention it, yeah...


>> > With all that in mind, I'm wondering if what you want is not just sort
>> > of a massive remapping feature. I see that you have arbitrarily factored
>> > on the Host part and you are selecting the backend based on the path
>> > part. Is this always what you do ? Are there any other cases ?
>>
>> for us atm, no. (Although i do appreciate the flexibility, of course,
>> to factor on anything else ['for_acl' will do this right now]. Key
>> thing for me is, I want to reduce acl processing when we've got big
>> sticking common acl among all the statements [you've seen my
>> examples])
>
> OK. Just so that I get an idea, how many use_backend rules (and how many
> backends) do you have in a large config ? I'm asking because I want to be
> able to support ACL files and rules files, and the way to implement them
> should mostly be driven by large config users.
>

we're on the way to reaching 100 use_backend rules even as we speak...
:) (and because these rules are largely of the pattern i've
mentioned.. this motivated this patch here)


>
>> > If you think you still need ACLs but
>> > just for use_backend rules, maybe we should just use a slightly
>> > different keyword : simply not repeat "use_backend" and use "select"
>> > instead, which does not appear in the normal config section :
>> >
>> >   use_backend bk1 if acl1
>> >   use_backend_block if Host1
>> >        select bk1 if path1 or path2
>> >        select bk2 if path3
>> >        select bk3 if path4 src1
>> >   ...
>> >   use_backend bkN if aclN
>> >
>> > That one would present the advantage of being more intuitive and
>> > would integrate better with other rules. Also, it would make it
>> > more intuitive how to write such blocks for other rule sets, and
>> > is very close to what you've already done. And that does not
>> > require any end tag since the keyword used in the block ("select"
>> > above) is not present in the containing block.
>> >
>>
>> right! I would be fine with this. Perhaps we could use the keyword
>> "use" instead of "select"? so
>>
>> use_backend_block if host_www
>>     use bk1 if path 1 or path 2
>>     use bk2 if path3
>>     use bk3 if path4 src1
>> ...
>
> I have no problem with "use" if we only support "use_backend", but
> I was thinking about extending it to support other keywords, and
> thought it would be a bit awkward with keywords such as "redirect"
> or "block". Maybe I'm wrong and "use" fits all usages ?
>

'use' is kind of generic, i suppose. I have to admit, though - that
was proposed with my usage situation and tons of 'use_backend'
swimming around in my head! :P But I get your point - you want to
expand this to other keywords.

(oh, and I don't like call/with either).


>> I can trivially modify my patch to be able to do this.
>
> The problem is not to modify a patch but more to do it the right
> way. Right now, from what I've seen, your patch cuts the use_backend
> processing in two halves, while I don't think we should implement it
> that way.
>

Yeah, I had to cut it in two halves cos of the "overloading" of the
'use_backend' keyword... Of course if we use a separate keyword, then
nothing like that happens :P I cut acl processing in half too - but of
course, i guess that's only a given since we want to be able to reduce
processing anyway.


>> > Maybe with a little bit more thinking we could come up with something
>> > more generic like this :
>> >
>> >   ...
>> >   call use_backend if acl

Re: patch: nested acl evaluation

2009-04-02 Thread Jeffrey 'jf' Lim
On Fri, Apr 3, 2009 at 5:20 AM, Willy Tarreau  wrote:
> Hi Jeffrey,
>
> On Thu, Apr 02, 2009 at 02:23:44PM +0800, Jeffrey 'jf' Lim wrote:
> (...)
>> Ok perhaps "combinatorial" was not the word that i should have used,
>> but... I hope you can see the point/s with the explanation that i
>> gave. The "head" acl only gets checked once - after which it goes
>> into the body (you could treat it like the standard "if" statement in
>> any programming language) to do the evaluation.
>>
>> Without this, the "head" acl has to be evaluated every time for every
>> combination of "head" acl + "sub" acl before you can go into each
>> 'use_backend'. So eg:
>>
>> use_backend b1 if host_www.domain.com path1
>> use_backend b2 if host_www.domain.com path2
>> use_backend b3 if host_www.domain.com path3
>> use_backend b4 if host_www.domain path4 or host_www.domain path5
>> ...
>>
>> use_backend b10 if host_www.d2.com path1
>> use_backend b11 if host_www.d2.com path2
>> ...
>>
>>
>> So does it make sense to cut out all of this? Of course it does.
>> Faster acl processing (you dont repeat having to process
>> 'host_www.domain.com' every time!), much neater (and maintainable)
>> config file.
>>
>> ==
>>
>> The following is a refined patch (did i mention i was serious about
>> this? just to allay the "April Fool's" thing)
>>
>> One caveat to take note: at this point in time I'm not going to cater
>> for nested 'for_acl's (I think if u really need this, you probably
>> have bigger problems). One level of 'for_acl' should be all you
>> need... (why is this deja vu) (Willy, I'll work out the documentation
>> later once/if u give the go ahead for this, thanks)
>
> I understand the usefulness of your proposal, but I really dislike
> the implementation for at least two reasons :
>  - right now the config works on a line basis. There are sections, but
>    no block.

yeah, I realize :P


> This is the first statement which introduces the notion
>    of a block, with a beginning and a mandatory end. I don't like that
>    for the same reason I don't like config languages with braces. It's
>    harder to debug and maintain.
>

I dunno. I'm finding the "line by line" thing harder to debug! (I once
made an error where the hostname was wrong because I copied the config
en masse from another host - and somehow missed out on changing
the hostname for this one line).

Having said that, there are blocks already in the standard config
(like "backend", with multiple "server" lines) - but the difference is
they don't need an ending keyword, so ok I see the problem here is
with the mandatory end (guess that's what you mean by "section", vs
"block"!)


>  - only the "use_backend" rules are covered. So the "for_acl" keyword
>    is not suited since in fact you're just processing a "use_backend"
>    list.

right!


> Also, that leaves me with two new questions : what could be
>    done for other rules ? Do we need to duplicate all the code, or can
>    we factor the same block out and use it everywhere ? Second
>    question : maybe you need this for use_backend only, which marks it
>    as a special case which might be handled even more easily ?
>

yes, I want it really just for 'use_backend' only. Keyword wise, I was
struggling to find something useful. I ended up with 'for_acl' :P But
I get what u mean. My bad - I was focussing only on 'use_backend' cos
that's what we have a lot of here. :P


> With all that in mind, I'm wondering if what you want is not just sort
> of a massive remapping feature. I see that you have arbitrarily factored
> on the Host part and you are selecting the backend based on the path
> part. Is this always what you do ? Are there any other cases ?

for us atm, no. (Although i do appreciate the flexibility, of course,
to factor on anything else ['for_acl' will do this right now]. Key
thing for me is, I want to reduce acl processing when we've got big
sticking common acl among all the statements [you've seen my
examples])


> You still
> have to declare ACLs for each path. Maybe it would be better to simply
> support something like this :
>
>    use_backend back1 if acl1
>    map_path_to_backend if Host1
>        use BK1 on ^/img
>        use BK2 on ^/js
>        use BK3 on ^/static/.*\.{jpg|gif|png}
>        ...
>    use_backend backN if aclN
>
>
> See ? No more ACLs on the path an

Re: patch: nested acl evaluation

2009-04-01 Thread Jeffrey 'jf' Lim
On Wed, Apr 1, 2009 at 6:05 PM, Jeffrey 'jf' Lim  wrote:
> Combinatorial acls, everybody!!! :) The usefulness of this patch is
> best explained by a config snippet, so here goes...
>
> ===
>        for_acl  if  host_www.domain.com
>
>                use_backend  b1                                 if  path1
>
>                use_backend  b2                   if  path2
>                use_backend  b3                                 if  path3
>
>                use_backend  b4   if  path4 or path5
>
>                use_backend  b5  if  path6 or path7 or path8
>                use_backend  b6 if path9
>
>        endfor_acl
> ===
>
> (Yeah, I know i had to sanitize some names ;) but I hope u get the
> usefulness of this! Saner config for me, faster acl processing as
> well. Can be used in a hosting situation as well, where multiple
> hostnames are tied to the same ip)
>
>


(Some extra commentary):

Ok perhaps "combinatorial" was not the word that i should have used,
but... I hope you can see the point/s with the explanation that i
gave. The "head" acl only gets checked once - after which it goes
into the body (you could treat it like the standard "if" statement in
any programming language) to do the evaluation.

Without this, the "head" acl has to be evaluated every time for every
combination of "head" acl + "sub" acl before you can go into each
'use_backend'. So eg:

use_backend b1 if host_www.domain.com path1
use_backend b2 if host_www.domain.com path2
use_backend b3 if host_www.domain.com path3
use_backend b4 if host_www.domain path4 or host_www.domain path5
...

use_backend b10 if host_www.d2.com path1
use_backend b11 if host_www.d2.com path2
...


So does it make sense to cut out all of this? Of course it does.
Faster acl processing (you dont repeat having to process
'host_www.domain.com' every time!), much neater (and maintainable)
config file.

==

The following is a refined patch (did i mention i was serious about
this? just to allay the "April Fool's" thing)

One caveat to take note: at this point in time I'm not going to cater
for nested 'for_acl's (I think if u really need this, you probably
have bigger problems). One level of 'for_acl' should be all you
need... (why is this deja vu) (Willy, I'll work out the documentation
later once/if u give the go ahead for this, thanks)


diff -ur haproxy-1.3.17-PRISTINE/include/types/proxy.h
haproxy-1.3.17/include/types/proxy.h
--- haproxy-1.3.17-PRISTINE/include/types/proxy.h   2009-03-29
21:26:57.0 +0800
+++ haproxy-1.3.17/include/types/proxy.h2009-04-01 17:53:23.714670412 
+0800
@@ -277,8 +277,14 @@
struct error_snapshot invalid_req, invalid_rep; /* captures of last 
errors */
 };

+/* values for switching_rule->type */
+#define SWR_SIMPLE  0
+#define SWR_NESTED  1
+
 struct switching_rule {
+   short type; /* jf: acl type - simple... or 
a complex 'if' body */
struct list list;   /* list linked to from the 
proxy */
+   struct list second_rules;   /* jf: 2nd-level switching 
rulesz */
struct acl_cond *cond;  /* acl condition to meet */
union {
struct proxy *backend;  /* target backend */
diff -ur haproxy-1.3.17-PRISTINE/src/cfgparse.c haproxy-1.3.17/src/cfgparse.c
--- haproxy-1.3.17-PRISTINE/src/cfgparse.c  2009-03-29 21:26:57.0 
+0800
+++ haproxy-1.3.17/src/cfgparse.c   2009-04-02 13:48:03.604287568 +0800
@@ -131,6 +131,7 @@
 int cfg_maxpconn = DEFAULT_MAXCONN;/* # of simultaneous connections
per proxy (-N) */
 int cfg_maxconn = 0;   /* # of simultaneous connections, (-n) 
*/
 unsigned int acl_seen = 0; /* CFG_ACL_* */
+short in_for_acl = 0;

 /* List head of all known configuration keywords */
 static struct cfg_kw_list cfg_keywords = {
@@ -1407,12 +1408,87 @@
}

rule = (struct switching_rule *)calloc(1, sizeof(*rule));
+   rule->type = SWR_SIMPLE;// !!jf: cos 'use_backend' IS 
simple
rule->cond = cond;
rule->be.name = strdup(args[1]);
LIST_INIT(&rule->list);
-   LIST_ADDQ(&curproxy->switching_rules, &rule->list);
+   if (in_for_acl)
+   LIST_ADDQ(&LIST_ELEM(curproxy->switching_rules.p, struct
switching_rule *, list)->second_rules, &rule->list);
+   else
+   LIST_ADDQ(&curproxy->switching_rules, &rule->list);
+
acl_seen |= CFG_ACL_BACKEND;
}
+   else if (!strcmp(args[0], "for_acl")) {
+
+   if (in_for_acl) {
+  

patch: nested acl evaluation

2009-04-01 Thread Jeffrey 'jf' Lim
Combinatorial acls, everybody!!! :) The usefulness of this patch is
best explained by a config snippet, so here goes...

===
for_acl  if  host_www.domain.com

use_backend  b1 if  path1

use_backend  b2   if  path2
use_backend  b3 if  path3

use_backend  b4   if  path4 or path5

use_backend  b5  if  path6 or path7 or path8
use_backend  b6 if path9

endfor_acl
===

(Yeah, I know i had to sanitize some names ;) but I hope u get the
usefulness of this! Saner config for me, faster acl processing as
well. Can be used in a hosting situation as well, where multiple
hostnames are tied to the same ip)


diff -ur haproxy-1.3.17-PRISINE/include/types/proxy.h
haproxy-1.3.17/include/types/proxy.h
--- haproxy-1.3.17-PRISINE/include/types/proxy.h2009-03-29
21:26:57.0 +0800
+++ haproxy-1.3.17/include/types/proxy.h2009-04-01 17:53:23.714670412 
+0800
@@ -277,8 +277,14 @@
struct error_snapshot invalid_req, invalid_rep; /* captures of last 
errors */
 };

+/* values for switching_rule->type */
+#define SWR_SIMPLE  0
+#define SWR_NESTED  1
+
 struct switching_rule {
+   short type; /* jf: acl type - simple... or 
a complex 'if' body */
struct list list;   /* list linked to from the 
proxy */
+   struct list second_rules;   /* jf: 2nd-level switching 
rulesz */
struct acl_cond *cond;  /* acl condition to meet */
union {
struct proxy *backend;  /* target backend */
diff -ur haproxy-1.3.17-PRISINE/src/cfgparse.c haproxy-1.3.17/src/cfgparse.c
--- haproxy-1.3.17-PRISINE/src/cfgparse.c   2009-03-29 21:26:57.0 
+0800
+++ haproxy-1.3.17/src/cfgparse.c   2009-04-01 17:49:05.108321600 +0800
@@ -131,6 +131,7 @@
 int cfg_maxpconn = DEFAULT_MAXCONN;/* # of simultaneous connections
per proxy (-N) */
 int cfg_maxconn = 0;   /* # of simultaneous connections, (-n) 
*/
 unsigned int acl_seen = 0; /* CFG_ACL_* */
+short in_for_acl = 0;

 /* List head of all known configuration keywords */
 static struct cfg_kw_list cfg_keywords = {
@@ -1407,12 +1408,81 @@
}

rule = (struct switching_rule *)calloc(1, sizeof(*rule));
+   rule->type = SWR_SIMPLE;// !!jf: cos 'use_backend' IS 
simple
rule->cond = cond;
rule->be.name = strdup(args[1]);
LIST_INIT(&rule->list);
-   LIST_ADDQ(&curproxy->switching_rules, &rule->list);
+   if (in_for_acl)
+   LIST_ADDQ(&LIST_ELEM(curproxy->switching_rules.p, struct
switching_rule *, list)->second_rules, &rule->list);
+   else
+   LIST_ADDQ(&curproxy->switching_rules, &rule->list);
+
acl_seen |= CFG_ACL_BACKEND;
}
+   else if (!strcmp(args[0], "for_acl")) {
+   int pol = ACL_COND_NONE;
+   struct acl_cond *cond;
+   struct switching_rule *toprule;
+
+   if (curproxy == &defproxy) {
+   Alert("parsing [%s:%d] : '%s' not allowed in 'defaults'
section.\n", file, linenum, args[0]);
+   return -1;
+   }
+
+   if (warnifnotcap(curproxy, PR_CAP_FE, file, linenum, args[0], 
NULL))
+   return 0;
+
+   if (*(args[1]) == 0) {
+   Alert("parsing [%s:%d] : '%s' expects at least one acl 
name.\n",
file, linenum, args[0]);
+   return -1;
+   }
+
+   if (!strcmp(args[1], "if"))
+   pol = ACL_COND_IF;
+   else if (!strcmp(args[1], "unless"))
+   pol = ACL_COND_UNLESS;
+
+   if (pol == ACL_COND_NONE) {
+   Alert("parsing [%s:%d] : '%s' requires either 'if' or 
'unless'
followed by a condition.\n",
+ file, linenum, args[0]);
+   return -1;
+   }
+
+   if ((cond = parse_acl_cond((const char **)args + 2, 
&curproxy->acl,
pol)) == NULL) {
+   Alert("parsing [%s:%d] : error detected while parsing 
switching rule.\n",
+ file, linenum);
+   return -1;
+   }
+
+   cond->line = linenum;
+   if (cond->requires & ACL_USE_RTR_ANY) {
+   struct acl *acl;
+   const char *name;
+
+   acl = cond_find_require(cond, ACL_USE_RTR_ANY);
+   name = acl ? acl->name : "(unknown)";
+   Warning("parsing [%s:%d] : acl '%s' involves some 
response-only
criteria which will be ignored.\n",
+  

Re: [RFC] development model for future haproxy versions

2009-03-30 Thread Jeffrey 'jf' Lim
On Tue, Mar 31, 2009 at 5:06 AM, Willy Tarreau  wrote:
> Hi all!
>
> Now that the storm of horror stories has gone with release of 1.3.17,
> I'd like to explain what I'm planning to do for future versions of
> haproxy.
>
> Right now there are a few issues with the development process and
> the version numbering in general :
>
> 
>
> 4) encourage people to work on a next feature set with their own tree.
>
> Since haproxy has migrated to use GIT for version control, it has really
> changed my life, and made it a lot more convenient for some contributors
> to maintain their own patchsets.
>

yo, I hadn't noticed! What's the clone url, though? All links I've
tried only get me to a gitweb interface.

-jf



Re: balance source based on a X-Forwarded-For

2009-03-29 Thread Jeffrey 'jf' Lim
On Mon, Mar 30, 2009 at 12:42 PM, Willy Tarreau  wrote:
> On Mon, Mar 30, 2009 at 08:48:14AM +0800, Jeffrey 'jf' Lim wrote:
>> On Mon, Mar 30, 2009 at 3:48 AM, Willy Tarreau  wrote:
>> > On Sun, Mar 29, 2009 at 12:31:27PM -0700, John L. Singleton wrote:
>> >> I'm a little mystified as to the usefulness of this as well. I mean,
>> >> what does hashing the domain name solve that just balancing back to a
>> >> bunch of Apache instances with virtual hosting turned on doesn't? Are
>> >> you saying that you have domains like en.example.com, fr.example.com
>> >> and you want them all to be sticky to the same backend server when
>> >> they balance? If that's the case, I could see that being useful if the
>> >> site in question were doing some sort of expensive per-user asset
>> >> generation that was being cached on the server. Is this what you are
>> >> talking about?
>> >
>> > There are proxies which can do prefetching, and in this case, it's
>> > desirable that all requests for a same domain name pass through the
>> > same cache.
>> >
>>
>> so are you saying haproxy -> cache -> backend? (in which case, you
>> would be talking more about an ISP, i think? or does anybody here not
>> running an ISP actually do this (I would be interested to know))
>
> not necessarily, it can also be :
>   customers -> haproxy -> caches -> world
>

right!! :) interesting

-jf

--
In the meantime, here is your PSA:
"It's so hard to write a graphics driver that open-sourcing it would not help."
-- Andrew Fear, Software Product Manager, NVIDIA Corporation
http://kerneltrap.org/node/7228



Re: balance source based on a X-Forwarded-For

2009-03-29 Thread Jeffrey 'jf' Lim
On Mon, Mar 30, 2009 at 3:48 AM, Willy Tarreau  wrote:
> On Sun, Mar 29, 2009 at 12:31:27PM -0700, John L. Singleton wrote:
>> I'm a little mystified as to the usefulness of this as well. I mean,
>> what does hashing the domain name solve that just balancing back to a
>> bunch of Apache instances with virtual hosting turned on doesn't? Are
>> you saying that you have domains like en.example.com, fr.example.com
>> and you want them all to be sticky to the same backend server when
>> they balance? If that's the case, I could see that being useful if the
>> site in question were doing some sort of expensive per-user asset
>> generation that was being cached on the server. Is this what you are
>> talking about?
>
> There are proxies which can do prefetching, and in this case, it's
> desirable that all requests for a same domain name pass through the
> same cache.
>

so are you saying haproxy -> cache -> backend? (in which case, you
would be talking more about an ISP, i think? or does anybody here not
running an ISP actually do this (I would be interested to know))

-jf




Re: balance source based on a X-Forwarded-For

2009-03-29 Thread Jeffrey 'jf' Lim
On Wed, Mar 25, 2009 at 8:02 PM, Benoit  wrote:
>
>
> diff -ru haproxy-1.3.15.7/doc/configuration.txt 
> haproxy-1.3.15.7-cur/doc/configuration.txt
> --- haproxy-1.3.15.7/doc/configuration.txt      2008-12-04 11:29:13.0 
> +0100
> +++ haproxy-1.3.15.7-cur/doc/configuration.txt  2009-02-24 16:17:19.0 
> +0100
> @@ -788,6 +788,19 @@
>
>                 balance url_param  [check_post []]
>
> +      header      The Http Header specified in argument will be looked up in
> +                  each HTTP request.
> +
> +                  With the "Host" header name, an optionnal use_domain_only
> +                  parameter is available, for reducing the hash algorithm to
> +                  the main domain part, eg for "haproxy.1wt.eu", only "1wt"
> +                  will be taken into consideration.
> +

I'm not so sure how balancing based on a hash of the Host header would
be useful. How would this be useful? I would see an application for
balancing on perhaps other headers (like xff as mentioned), but for
Host... I dunno... (so basically what I'm saying is, is the code for
the 'use_domain_only' bit useful? can it be left out?)
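For what it's worth, the use_domain_only behaviour described in the quoted doc text could be sketched roughly like this. This is illustrative C only (the real patch hooks into haproxy's balance code); the hash itself is a made-up placeholder:

```c
#include <string.h>

/* Reduce the Host header to its second-level label before hashing,
 * so "haproxy.1wt.eu" and "www.1wt.eu" land on the same server.
 * Illustrative sketch; not the patch's actual implementation. */
static unsigned long hash_host_domain(const char *host)
{
    const char *end = strrchr(host, '.');   /* dot before the TLD, e.g. ".eu" */
    const char *start = host;

    if (end) {
        /* walk back to the dot preceding the second-level label */
        for (const char *p = end - 1; p > host; p--) {
            if (*p == '.') {
                start = p + 1;
                break;
            }
        }
    } else {
        end = host + strlen(host);          /* no dot: hash the whole name */
    }

    unsigned long h = 0;
    for (const char *p = start; p < end; p++)   /* simple multiplicative hash */
        h = h * 31 + (unsigned char)*p;
    return h;
}
```

So for "haproxy.1wt.eu" only "1wt" contributes to the hash, matching the quoted documentation.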

-jf




Re: haproxy 1.3.16 getting really really closer

2009-03-07 Thread Jeffrey 'jf' Lim
On Sat, Mar 7, 2009 at 7:32 PM, Willy Tarreau  wrote:
> Hi Jeff,
>
> On Sat, Mar 07, 2009 at 07:03:15PM +0800, Jeffrey 'jf' Lim wrote:
>> Woohoo!! :) thanks, Willy, for the work. Seems like a really great
>> list of stuff there.
>>
>> Especially love the "HTTP invalid request and response captures per
>> frontend/backend" feature - I would definitely love to be able to see
>> what we're getting over here where we use haproxy
>
> It's really useful. Many people have wasted a lot of time with traffic
> captures trying to catch an anomaly, while haproxy knows when there is
> an error, so simplifies the troubleshooting a lot to only capture errors.
> Also, now I see what people send in their attacks ;-)
>

heh, yeah, I want to know too! ;p


> Probably that the feature will be improved so that we can decide by
> configuration what type of errors should cause the request/response
> to be captured.
>

or where it should get logged to? it sounds like right now it's only
getting logged to memory. (I mean, how about on disk? although, of
course conversely, we want to prevent attacks from overwhelming our
disk as well...)


>> One question if u dont mind - session rate limiting on frontends -
>> what's the use case for this?
>
> There are several use cases :
>  - you can limit the request rate on a fragile server which has a
>    dedicated frontend (eg: a local search engine)
>
>  - if you're hosting several customer sites on your own infrastructure,
>    you may want to limit each of them to a reasonable load so that none
>    of them consumes all the CPU resources
>
>  - you can limit incoming mail rate to protect your SMTP relays,
>    especially when you have anti-virus and anti-spam forking like mad
>    under load (I've done that here at home to protect my poor old vax).
>
>  - it can help protecting your servers against DoS attacks.
>
> I'm sure other people will find smarter ideas and usages ;-)
>

:) thanks for that. I was initially thinking that putting a maxconn on
your frontend and/or backends would do it - but yeah, I see now how
having a request rate limit might be useful as well...
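For reference, the keyword from Willy's 1.3.16 announcement would be used roughly like this (a sketch only; the frontend name and numbers are made up):

```
frontend public
    bind :80
    rate-limit sessions 100   # cap new sessions per second (arrival rate)
    maxconn 1000              # cap concurrent sessions (a separate limit)
    default_backend servers
```

The two limits are complementary: maxconn bounds how many sessions exist at once, while rate-limit sessions bounds how fast new ones are accepted.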

thanks,
-jf




Re: "option httpchk" is reporting servers as down when they're not

2009-03-07 Thread Jeffrey 'jf' Lim
On Sat, Mar 7, 2009 at 2:38 AM, Willy Tarreau  wrote:
> Hi Thomas,
>
> On Thu, Mar 05, 2009 at 08:45:20AM -0500, Allen, Thomas wrote:
>> Hi Jeff,
>>
>> The thing is that if I don't include the health check, the load balancer 
>> works fine and each server receives equal distribution. I have no idea why 
>> the servers would be reported as "down" but still work when unchecked.
>
> It is possible that your servers expect the "Host:" header to
> be set during the checks. There's a trick to do it right now
> (don't forget to escape spaces) :
>
>        option httpchk GET /index.php HTTP/1.0\r\nHost:\ www.mydomain.com
>

you know Thomas, Willy may be very right here. And I just realized as
well - u say u're using 'option httpchk /index.php', without
specifying the 'GET' verb?
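Spelled out as a full listen section, Willy's trick above would look like this (a hypothetical sketch - the IPs and hostname are placeholders, and the backslashes escaping the spaces are mandatory):

```
listen http_proxy :80
    mode http
    balance roundrobin
    # full request line + Host header in the health check; escape spaces
    option httpchk GET /index.php HTTP/1.0\r\nHost:\ www.mydomain.com
    server webA 192.0.2.11:80 cookie A check
    server webB 192.0.2.12:80 cookie B check
```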

-jf


> Also, you should check the server's logs to see why it is reporting
> the service as down. And as a last resort, a tcpdump of the traffic
> between haproxy and a failed server will show you both the request
> and the complete error from the server.
>
> Regards,
> Willy
>
>



Re: haproxy 1.3.16 getting really really closer

2009-03-07 Thread Jeffrey 'jf' Lim
Woohoo!! :) thanks, Willy, for the work. Seems like a really great
list of stuff there.

Especially love the "HTTP invalid request and response captures per
frontend/backend" feature - I would definitely love to be able to see
what we're getting over here where we use haproxy

One question if u dont mind - session rate limiting on frontends -
what's the use case for this?

-jf



On Sat, Mar 7, 2009 at 7:11 AM, Willy Tarreau  wrote:
>
> Hi all !
>
> About 3 months ago I told you that 1.3.16 was getting closer. Now I
> really think it's getting even closer. Since then, we have fixed all
> remaining visible bugs, which does not mean that all bugs went away of
> course. Also several new useful features have been implemented due to
> concrete opportunities :
>
>  - doc is finished. All keywords, log options etc... have been migrated
>    to the new doc. The old one is still there "just in case", but should
>    not be needed anymore.
>
>  - autonomous forwarding layer between sockets without waking the
>    task up : when data must be forwarded from one socket to another
>    one, we don't wake the task up anymore if not needed. This saves
>    many CPU cycles on large objects and has improved the maximum data
>    rate by 10-15% depending on the workload. A further improvement
>    will consist in allocating buffers on the fly from a pool just
>    during the transfer, and releasing empty buffers when not in use
>    in order to reduce memory requirements.
>
>  - TCP splicing on very recent linux 2.6 (2.6.27.19, 2.6.28.6, or
>    2.6.29). TCP splicing enables zero-copy forward between two network
>    interfaces. With most network interfaces it does not seem to change
>    anything, however with some other NICs (at least Myricom's 10GE), we
>    observe huge savings. I could even reach 10 Gbps of layer 7 data
>    forwarding with only 25% CPU usage! Please note that this must not
>    be used with any kernel earlier than versions above since all
>    previous tcp-splice implementations are buggy and will randomly and
>    silently corrupt your data.
>
>  - unix stats socket is working again.
>
>  - complete session table dumps on the unix socket. It reports
>    pointers, states, protocol, timeouts, etc... It's primarily meant
>    for development, but will help understand why sometimes a process
>    refuses to die when some sessions remain present.
>
>  - HTTP invalid request and response captures per frontend/backend :
>    those who are fed up with tracking 502 coming from buggy servers
>    will love this one. Each invalid request or response (non HTTP
>    compliant) is copied into a reserved space in the frontend (request)
>    or backend (response) with info about the client's IP, the server,
>    the exact date, etc... so that the admin can later consult those
>    errors by simply sending a "show errors" request on the unix stats
>    socket. The exact position of the invalid character is indicated,
>    and that eases the troubleshooting a lot ! It's also useful to keep
>    complete captures of attacks, when the attacker sends invalid
>    requests :-)
>
>  - the layering work has been continued for a long time and a massive
>    cleanup has been performed (another one is still needed though)
>
>  - the internal I/O and scheduler subsystems are progressively getting
>    more mature, making it easier to integrate new features.
>
>  - add support for options "clear-cookie", "set-cookie" and "drop-query"
>    to the "redirect" keyword.
>
>  - ability to bind to a specific interface for listeners as well as for
>    source of outgoing connections. This will help on complex setups where
>    several interfaces are attached to the same LAN.
>
>  - ability to bind some instances to some processes in multi-process
>    mode.  Some people want to run frontend X on process 1 and frontend
>    Y on process 2. This is now possible. A future improvement will
>    consist in defining what CPU/core each process can run on (OS
>    dependant).
>
>  - session rate measurement per frontend, backend and server. This is
>    now reported in the stats in a "rate" column, in number of sessions
>    per second. Right now this is only on the last second, but at least
>    the algorithm gives an accurate measurement with very low CPU usage.
>    This value may also be checked in ACLs in order to write conditions
>    based on performance.
>
>  - session rate limiting on frontends : using "rate-limit sessions XXX"
>    it is now possible to limit the rate at which a frontend will accept
>    connections. This is very accurate too. Right now it's only limited
>    to sessions per second, but it will evolve to different periods. I've
>    successfully tried values as low as 1 and as high as 55000 sessions/s,

Re: "option httpchk" is reporting servers as down when they're not

2009-03-04 Thread Jeffrey 'jf' Lim
well, looks like ur servers are actually down then. Do a curl from
your haproxy machine to both servers. What do you get?

-jf



On Wed, Mar 4, 2009 at 9:40 PM, Allen, Thomas  wrote:
> Never mind, I got it going. My stats page simply says that both servers are 
> down. What else should I be looking for?
>
> Thomas Allen
> Web Developer, ASCE
> 703.295.6355
>
> -----Original Message-
> From: Jeffrey 'jf' Lim [mailto:jfs.wo...@gmail.com]
> Sent: Wednesday, March 04, 2009 2:22 AM
> To: Allen, Thomas
> Cc: haproxy@formilux.org
> Subject: Re: "option httpchk" is reporting servers as down when they're not
>
> - Show quoted text -
> On Wed, Mar 4, 2009 at 4:05 AM, Allen, Thomas  wrote:
>> Hi,
>>
>> I like the idea of having HAProxy check server health, but for some reason,
>> it reports all of my servers as down. Here's my full config:
>>
>> listen http_proxy :80
>>     mode http
>>     balance roundrobin
>>     option httpchk
>>     server webA {IP} cookie A check
>>     server webB {IP} cookie B check
>>
>> I tried "option httpchk /index.php" just to be sure, and got the same
>> result. If I remove the httpchk option, HAProxy has no problem proxying
>> these servers. What am I doing wrong?
>>
>
> what's listed under "Status" for these servers when viewing your
> haproxy status page?
>
> -jf
>
>



Re: "option httpchk" is reporting servers as down when they're not

2009-03-03 Thread Jeffrey 'jf' Lim
On Wed, Mar 4, 2009 at 4:05 AM, Allen, Thomas  wrote:
> Hi,
>
> I like the idea of having HAProxy check server health, but for some reason,
> it reports all of my servers as down. Here's my full config:
>
> listen http_proxy :80
>     mode http
>     balance roundrobin
>     option httpchk
>     server webA {IP} cookie A check
>     server webB {IP} cookie B check
>
> I tried "option httpchk /index.php" just to be sure, and got the same
> result. If I remove the httpchk option, HAProxy has no problem proxying
> these servers. What am I doing wrong?
>

what's listed under "Status" for these servers when viewing your
haproxy status page?
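
One thing worth checking alongside that: with no arguments, "option httpchk" sends "OPTIONS / HTTP/1.0", and some setups answer that with an error status, so the servers get marked DOWN even though they serve pages fine. A sketch with an explicit method and URI (the addresses stand in for the "{IP}" placeholders above, and /index.php is only an example):

```
listen http_proxy :80
    mode http
    balance roundrobin
    # request a page that definitely exists and returns 2xx/3xx
    option httpchk GET /index.php HTTP/1.0
    server webA 192.0.2.10:80 cookie A check
    server webB 192.0.2.11:80 cookie B check
```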

-jf




Re: Amount of ACLs per Backend

2009-03-02 Thread Jeffrey 'jf' Lim
On Mon, Mar 2, 2009 at 7:20 PM, Bernhard Krieger  wrote:
> Hello,
>
> 
>
> To get a good overview of our domains I created more than 40 acls, but I'm
> getting an error after starting the proxy.
>
>
> This happens if I use more than 30 acls in a backend.
>
> use_backend WEB1 if acl1 or acl2 .. acl30 or acl40
>
>
> So I decided to create only one acl, but I'm getting an error too!
>
> starting haproxy[ALERT] 060/115647 (12911) : parsing
> [/etc/haproxy/config:91]: line too long, limit: 2047.
> [ALERT] 060/115647 (12911) : Error reading configuration file :
> /etc/haproxy/config
>

FYI, "2047" is there because 2048 is allocated in the source for
reading in a config line. I wouldn't recommend changing this in the
source, though - I think you probably have more problems than the
limit for a config line.

Why so many acls anyway? Must they really be there? I administer a
similar setup - but I definitely work at keeping the acls sane and
small. We definitely do not have 40 acls here...
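
Two ways to stay under the line limit without touching the source, sketched with illustrative names (repeated "use_backend" lines for the same backend behave as an OR, since the first matching line wins; likewise, declaring the same acl name on several lines ORs the patterns together):

```
# several short lines instead of one 2000+ character line
use_backend WEB1 if acl1 or acl2 or acl3
use_backend WEB1 if acl4 or acl5 or acl6

# or fold many domain checks into one acl name declared across lines
acl is_web1 hdr(host) -i www.example.com
acl is_web1 hdr(host) -i shop.example.com
use_backend WEB1 if is_web1
```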



>
> So I have the following questions:
> 1) How many acls can I use per backend configuration?
> 2) Other solutions?
>
> Version: 1.3.15.7
>
>
> Thank you!
>
>

-jf




Re: REQ errors

2009-02-16 Thread Jeffrey 'jf' Lim
On Tue, Feb 17, 2009 at 1:49 AM, Mayur B  wrote:
> Thanks.
> Since we're running on Amazon EC2, I hope it's not the network. It's about 5%
> of all our http calls, which is what worries me.
>
> How would I go about checking the log files to see what those REQ errors
> are?
>

grep for 'BADREQ'. Then look at the flags to see what the problem is.
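
A self-contained illustration (the log line below is fabricated for the example, including the "PR--" flags; real entries live wherever your syslog puts haproxy's output, and the flags vary with the failure mode):

```shell
# write one fabricated haproxy-style log entry to a scratch file
printf '%s\n' \
  'haproxy[1234]: 203.0.113.5:4432 [16/Feb/2009:10:00:00.123] fe be/<NOSRV> -1/-1/-1/-1/0 400 187 - - PR-- 1/1/0/0/0 0/0 "<BADREQ>"' \
  > /tmp/haproxy.sample.log
# the four-character flags field (here "PR--") is what explains why the
# request was rejected; look it up in the haproxy logging documentation
grep 'BADREQ' /tmp/haproxy.sample.log
```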

-jf





> On Mon, Feb 16, 2009 at 9:38 AM, Jeffrey 'jf' Lim 
> wrote:
>>
>> On Tue, Feb 17, 2009 at 1:10 AM, Mayur B  wrote:
>> > Hello,
>> >
>> > We are seeing a high number of errors in the REQ column of the HAproxy
>> > UI.
>> > What is the REQ error column? How can we resolve this?
>> >
>>
>> Bad http requests. May not exactly be your fault (so not exactly
>> resolvable) - could be due to a number of things. Unless you suspect
>> something wrong with your network connection (or the other party's
>> network connection), these are usually port scans, or server scans...
>>
>> -jf
>>
>
>



Re: REQ errors

2009-02-16 Thread Jeffrey 'jf' Lim
On Tue, Feb 17, 2009 at 1:10 AM, Mayur B  wrote:
> Hello,
>
> We are seeing a high number of errors in the REQ column of the HAproxy UI.
> What is the REQ error column? How can we resolve this?
>

Bad http requests. May not exactly be your fault (so not exactly
resolvable) - could be due to a number of things. Unless you suspect
something wrong with your network connection (or the other party's
network connection), these are usually port scans, or server scans...

-jf
