Re: Script greasemonkey to navigate in Haproxy docs ...

2010-04-28 Thread Aleksandar Lazic

Hi all,

On Wed 28.04.2010 23:16, Cyril Bonté wrote:

Hi all,

On Tuesday 27 April 2010 22:31:10, Willy Tarreau wrote:

Hello Damien,

On Tue, Apr 27, 2010 at 10:33:41AM +0200, Damien Hardy wrote:
> Hello all,
> 
> haproxy docs are quite difficult to manipulate.

>
> I had begun a Greasemonkey
> (https://addons.mozilla.org/fr/firefox/addon/748) script to create
> a "clickable" ToC in the haproxy documentation ...
> 
> You can DL it at http://userscripts.org/scripts/show/75315


Cool, thanks for your interest in the doc! You may want to join
efforts with Cyril Bonté (CC'd) who showed me an impressive HTML
conversion of the doc. I believe some of the processing was still by
hand, but there's definitely some good stuff there.


Yes, during the 1.4-rc version I was working on a python script to
convert the doc into html. It's still in an ugly state and it's
difficult to find time to clean the code :-)
It could be interesting to share the ideas !


[snipp]


OK, I'll try to work on it this week-end to provide a simpler version
"soon" (maybe without table detection for the moment).


how about using:

http://txt2html.sourceforge.net/

I will try it today and post the results.

BR

Aleks



Re: HAProxy, Set-Cookie and Cacheable Elements

2010-04-28 Thread Karsten Elfenbein
Hi,

Why do you want to persist a cacheable request to a backend server?

I use "cookie PHPSESSID prefix" to persist users only when needed. (if they 
got a session from logging in)
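
For illustration, a minimal sketch of that setup (backend and server
names/addresses are placeholders); in prefix mode haproxy only piggybacks on a
PHPSESSID cookie the application itself has set, so purely anonymous requests
stay load-balanced:

  backend app
      mode http
      cookie PHPSESSID prefix
      # server cookie values s1/s2 get prefixed onto the application's PHPSESSID
      server app1 10.0.0.1:80 cookie s1 check
      server app2 10.0.0.2:80 cookie s2 check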

Karsten


On Wednesday, 28 April 2010, John Marrett wrote:
> I've noticed some interesting behaviour with persistence cookies and
> haproxy.
> 
> Let's say you use the following settings in your haproxy.cfg:
> 
>   cookie SERVERID insert indirect
>   server static1 172.25.0.10:1080 cookie server1 check inter 15s
>   server static2 172.25.0.11:1080 cookie server2 check inter 15s
> 
> Any time haproxy receives a request that has no SERVERID cookie it will
> set one. Unfortunately, this doesn't take into consideration the
> cacheability of the request. If a user receives a set-cookie in their
> response, on cacheable content, and if the proxy server isn't configured
> to strip Set-Cookie responses when serving from cache, all users of that
> proxy server will persist to a single backend server.
> 
> I noticed this while looking into some other issues we were having with
> Set-Cookie and proxy servers (notably the great firewall of Singapore).
> 
> Within our own application we either set Cache-Control: Private or
> ensure that we don't send a Set-Cookie on content that is declared as
> cacheable. I don't know if this kind of functionality could be
> interesting for haproxy, but I thought I'd share my findings and see if
> anyone else was aware of this pattern of behaviour, if it was causing
> issues, and if there is or should be a way to address this issue.
> 
> -JohnF
> 




Re: haproxy & websockets

2010-04-28 Thread Willy Tarreau
Hi Dustin,

On Wed, Apr 28, 2010 at 04:51:41PM -0700, Dustin Moskovitz wrote:
> Actually, I should have mentioned at the beginning that we are using
> websockets to communicate with a *stateful* server, so we don't want to
> close the connection.

I'm not speaking about closing the websockets connection at all. That's
what's nice about websockets compared to earlier long-polling methods:
once the server asks to switch to websockets, a bidirectional tunnel is
established over the HTTP connection between the client and the server.
At this point, haproxy does not care anymore about what passes through it.

> However, explicitly declaring "mode http" in either
> the defaults section or the frontend & backend sections has made things
> happy.

Yes, that's expected.

> Now, a new problem (please let me know if/when I should start new threads).
> If I introduce ssl into the mix, I cannot get it to work with http over a
> websockets connection (I have it working without websockets, so it is
> otherwise ok).

haproxy does not decrypt SSL by itself. You need something like stunnel,
nginx or pound in front of it to handle it. I like stunnel for that because
it does not mangle the HTTP part at all. However, if you don't want to
decrypt the SSL traffic, you can make this instance work in TCP mode
and have the server do the job. You won't have any persistence though,
and the server will not get the client's IP address.
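
For the non-decrypting variant, a rough sketch of a TCP-mode pass-through
section (names, ports and addresses are placeholders):

  listen https_passthrough
      bind :443
      mode tcp
      option tcplog
      balance roundrobin
      # SSL stays encrypted end-to-end; the server terminates it itself
      server app1 10.0.0.1:443 check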

> Using the config at the bottom of this email, if I hit port 80 with a
> request or two and then move to ssl, haproxy will systematically *seg
> fault*, like so:

Yes, this was recently reported and fixed. It's only a debug mode issue.
When running in debug mode, haproxy tries to display all headers it
receives. When it receives invalid requests or responses (SSL traffic
being invalid from an HTTP point of view), it dereferences a NULL
pointer. The fix is pending in the tree. It is not a problem for you
anyway, because when you see the segfault, it indicates that haproxy
received something invalid which would not have worked in non-debug
mode.

Regards,
Willy




Re: haproxy & websockets

2010-04-28 Thread Dustin Moskovitz
Actually, I should have mentioned at the beginning that we are using
websockets to communicate with a *stateful* server, so we don't want to
close the connection. However, explicitly declaring "mode http" in either
the defaults section or the frontend & backend sections has made things
happy.
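
For reference, a minimal sketch of the defaults section in question (timeout
values are placeholders and may need to be much larger for long-lived
websocket tunnels):

  defaults
      mode http
      option http-server-close
      timeout connect 5s
      timeout client  60s
      timeout server  60s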

Now, a new problem (please let me know if/when I should start new threads).
If I introduce ssl into the mix, I cannot get it to work with http over a
websockets connection (I have it working without websockets, so it is
otherwise ok). Using the config at the bottom of this email, if I hit port
80 with a request or two and then move to ssl, haproxy will systematically *seg
fault*, like so:

r...@beta_lb001 ~# /var/asana-config/asana/3rdparty/haproxy/haproxy -f
/var/asana-config/asana/admin/production/proxy/haproxy.cfg.beta -p
/var/run/haproxy-private.pid -d -sf $( wrote:

> On Tue, Apr 27, 2010 at 05:36:12PM -0700, Dustin Moskovitz wrote:
> > Actually, I spoke too soon. When I create a config similar to Laurie's
> (see
> > below), I find that I still don't see headers and my default backend is
> > always utilized. I originally mistook this for working. Laurie, are you
> > positive you are actually routing traffic based on the hdr rule? This
> > implies to me the default is tcp? When I explicitly declare mode http in
> the
> > backends, the client never sees the response, as I mentioned in the first
> > email.
>
> You don't have any "mode http" statement, so haproxy does not wait for
> any HTTP contents to decide what backend to use.
>
> You have no httpclose/http-server-close option, so you're working in
> tunnel mode for backwards compatibility, where only the first request
> is processed and the rest is considered as data.
>
> Please just add "mode http" and "option http-server-close" to your
> defaults section and it should be OK.
>
> Regards,
> Willy
>
>


Re: Script greasemonkey to navigate in Haproxy docs ...

2010-04-28 Thread Cyril Bonté
Hi all,

On Tuesday 27 April 2010 22:31:10, Willy Tarreau wrote:
> Hello Damien,
> 
> On Tue, Apr 27, 2010 at 10:33:41AM +0200, Damien Hardy wrote:
> > Hello all,
> > 
> > haproxy docs are quite difficult to manipulate.
> >
> > I had begun a Greasemonkey
> > (https://addons.mozilla.org/fr/firefox/addon/748) script to create a
> > "clickable" ToC in the haproxy documentation ...
> > 
> > You can DL it at http://userscripts.org/scripts/show/75315
> 
> Cool, thanks for your interest in the doc! You may want to join
> efforts with Cyril Bonté (CC'd) who showed me an impressive HTML
> conversion of the doc. I believe some of the processing was still
> by hand, but there's definitely some good stuff there.

Yes, during the 1.4-rc version I was working on a python script to convert the 
doc into html. It's still in an ugly state and it's difficult to find time to 
clean the code :-)
It could be interesting to share the ideas !

Currently, the script is able to:
- add links on sections
- detect tables and render them into html; this is the part of the code that 
should be rewritten, it's a nightmare to read ;-)
- detect keywords in the text and transform them into links to point to their 
documentation part (everywhere and in the "See also" parts). I still have some 
false-positives due to keywords like "maxconn" that are used in several 
contexts (will try to add locality detection).
- colorize keywords parameters (optional, required, choices)
- mark deprecated keywords

This is also useful to detect keywords missing in the keywords matrix, 
differences between each "May be used in sections" and the matrix, and to find 
lines that don't follow the document format (tabulations, more than 80 
characters, ...).
I'd also like to detect examples.

OK, I'll try to work on it this week-end to provide a simpler version "soon" 
(maybe without table detection for the moment).

-- 
Cyril Bonté



Re: stats via http

2010-04-28 Thread Willy Tarreau
On Wed, Apr 28, 2010 at 10:51:16AM -0500, Graham Barr wrote:
> On Apr 28, 2010, at 12:53 AM, Willy Tarreau wrote:
> > On Tue, Apr 27, 2010 at 05:07:12PM -0500, Graham Barr wrote:
> >> We are using 1.4.4 but whenever we access the stats via 
> >> /admin?stats;norefresh we are unable to get the full page and end up with 
> >> a page within a page.
> > 
> > what do you call "a page within a page" ?
> 
> Basically haproxy seems to be sending the page more than once within the same 
> http request. The 2nd page starts before the first is finished, so when it 
> gets rendered it looks like a page within itself.
> 
> > Could you please save what
> > you receive and send it ?
> 
> Sure, I have attached 3 examples

Wow, thanks !

That's weird and amazingly strange. I've just checked and some outputs
were not duplicates (eg: page 1 has two consecutive blocks of exactly
29548 bytes but with different counters).

This is just as if haproxy believed it had got two requests and started
processing the dummy second one completely. Could you check if you see
2 requests in your logs ? I also suppose this only happens when keep-alive
is enabled. You might want to use "option httpclose" or "option forceclose"
in the section which handles the stats to see if that works around it.
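
For reference, a rough sketch of that workaround, assuming a dedicated section
for the stats page (name, port and URI are placeholders):

  listen stats_page
      bind :8080
      mode http
      option httpclose        # or "option forceclose"
      stats enable
      stats uri /admin?stats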

If you have the time to do so, I would love to get a tcpdump capture of
the request as seen on the haproxy machine, because clearly there's a bug
here and it must be fixed, and for that I need some hints on how to
reproduce it since I've never observed that yet !

If it's not too much to ask, an output of
"strace -o output.file -s200 -p $pidofhaproxy"
during the capture would help a lot too. I don't know if it's easy for
you to reproduce the problem, nor if it only happens on a production
system or even on a test setup. That's why I'm asking.

Thanks,
Willy




Re: Matching URLs at layer 7

2010-04-28 Thread Willy Tarreau
On Wed, Apr 28, 2010 at 06:21:34PM +0930, Andrew Commons wrote:
> As an aside, should the documentation extract below actually read:
> 
> acl local_dst    hdr(Host) -i localhost
>                      ^
>                      ^
> i.e. is the name of the header case sensitive? In my attempts to work this
> out I think that I had to use 'Host' rather than 'host' before it worked.

no, a header name is not case-sensitive, and the hdr() directive takes
care of that for you. However a header value is case sensitive, and since
the host header holds a DNS name, which is not case sensitive, you have to
use -i to be sure to match any possible syntax a user might use.
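
As a small illustration (host value and backend name are placeholders), both
spellings of the header name below select the same header, while -i relaxes
the value comparison:

  acl is_www hdr(host) -i www.example.com
  acl is_www hdr(Host) -i www.example.com
  use_backend www if is_www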

Regards,
Willy




Re: Hardware recommendations

2010-04-28 Thread Holger Just
On 2010-04-28 19:10, Alex Forrow wrote:
> We're looking to upgrade our HAProxy hardware soon. Does anyone have any
> recommendations on the things we should be looking for? e.g. Are there
> any NICs we should use/avoid?

Hi Alex,

I'm just writing down here what comes to my mind. Sorry if it looks a
bit unorganized...

Haproxy itself is not very demanding. A two core system will suffice.
Make sure you have enough RAM to hold all your sessions, but since it's
rather cheap to get 4 or 8 Gigs you should be safe here :)

Always think about the resource demands of the TCP stack. On large
load balancer instances (esp. with many short connections), the TCP stack
will consume much more resources than your Haproxy.

Some NICs can offload some responsibilities, like the calculation of
packet checksums, to silicon. Interrupt mitigation is something you
most probably want to have. Normally, each packet will trigger an
interrupt, which will eat away all your resources if there are many of
them. Some NICs can cap the number of interrupts per second, which
might increase latency a bit but saves your load balancer from dying :)

Make really sure your intended NIC is very well supported by your
intended OS. Many show surprising behaviour under stress. So the best
advice would possibly be to have a look at the vendor's hardware support
lists and ask in the respective channels.

You most probably want to stay away from most on-board NICs from vendors
like Broadcom or SiS. Dedicated PCIe NICs from Intel are normally safe
(you find them also on some server boards from e.g. Supermicro). But
make sure to check the individual capabilities.

As a load balancer is always I/O bound, check your data paths. Most
interesting is the speed from and to the NIC (so that the network line
is always the bottleneck) and between memory (ECC of course) and CPU.
Hard disks are obviously uninteresting :)

Hope this helps,
--Holger



Re: question regarding maxconn/ weight

2010-04-28 Thread Willy Tarreau
Hi Corin,

On Wed, Apr 28, 2010 at 08:50:46AM +0200, Corin Langosch wrote:
> Hi!
> 
> I wonder how maxconn and weight (both in a server statement) exactly
> interact. The docs in section 5 says about maxconn:

They don't interact. Weight is only used as a number of occurrences in
a load balancing cycle, which contains as many places as the sum of all
servers' weights. So a server normally receives its share of the load
corresponding to its weight / total weight.

> >If the number of incoming concurrent requests goes higher than this value, 
> >they will be queued, waiting for a connection to be released.
> 
> But now I wonder where the requests get queued:
> 1. in the queue of this specific server - the request has to wait

This happens when the request has a cookie indicating that only this
server may be selected. Then yes, the request waits for this exact
server to release a connection.

> 2. in the global queue - other free servers can serve this request
> immediately

This happens with all "anonymous" requests, those which are not tied to
a specific server.

> My test config contains 2 servers: one with one cpu, the other with 4
> cpus. Right now I simply set maxconn to the cpu count of each server.
> The weight is the same for both.

This is wrong. The maxconn is normally bound by your amount of RAM,
because it limits the concurrency. Concurrency means sessions, threads,
processes, contexts, file descriptors or whatever your application server
runs on. A server with more CPUs will not be able to support a larger
number of concurrent connections. However, it will process them faster,
making it possible to accept more users in the same time frame.

The weight is exactly what should be tuned to match your CPU differences,
because it ensures that the servers with less CPU power will receive less
work.

Let's consider the classical example of Apache servers with MaxClients=256.
Then set your maxconn values to 250 (leave some margin for health checks and
your own test connections). This is independent of the CPUs. You should
however ensure that your apache supports 256 connections with the installed
RAM. Then adjust the weights to reflect the CPU power, and you're done.
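
As a sketch of that tuning, assuming the Apache example above and a 1-CPU vs
4-CPU pair (names, addresses and the exact weights are placeholders):

  # both capped below MaxClients; the weights carry the CPU difference
  server small 10.0.0.1:80 check maxconn 250 weight 25
  server big   10.0.0.2:80 check maxconn 250 weight 100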

> I guess if queueing works according to 1. this config is bad and I'll
> have to adjust the weights too.
> I guess if queueing works according to 2. this config is perfectly
> fine - right?

it is both and the config is wrong :-)

Regards,
Willy




Re: Matching URLs at layer 7

2010-04-28 Thread Willy Tarreau
On Wed, Apr 28, 2010 at 09:21:31PM +0930, Andrew Commons wrote:
> Hi Beni,
> 
> A few things to digest here.
> 
> What was leading me up this path was a bit of elementary (and probably naïve) 
> white-listing with respect to the contents of the Host header and the URI/L 
> supplied by the user. Tools like Fiddler make request manipulation trivial so 
> filtering out 'obvious' manipulation attempts would be a good idea. With this 
> in mind my thinking (if it can be considered as such) was that:
> 
> (1) user request is for http://www.example.com/whatever
> (2) Host header is www.example.com
> (3) All is good! Pass request on to server.
> 
> Alternatively:
> 
> (1) user request is for http://www.example.com/whatever
> (2) Host header is www.whatever.com
> (3) All is NOT good! Flick request somewhere harmless.
> 
> I'm not sure whether your solution supports this, and if your interpretation 
> is correct maybe HAProxy doesn't support it either.
> 
> I'll do some more experimenting and I hope I don't lock myself out ;-)

I'm not sure what you're trying to achieve. Requests beginning with
"http://" are normally for proxy servers, though they're also valid
on origin servers. If what you want is to explicitly match any of
those, then you must consider that HTTP/1.1 declares a request with
a Host field which does not match the one in the URL as invalid. So
in practice you could always just use the Host header as the one to
perform your switching on, and never use the URL part. You can even
decide to block any request beginning with "http://". No browser
will send that to you anyway.
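
A sketch of that approach (host, acl and backend names are placeholders):

  acl is_example hdr(host) -i www.example.com
  acl abs_uri    url_beg -i http://
  # refuse proxy-style absolute URIs, switch on the Host header only
  block if abs_uri
  use_backend bk_example if is_example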

Regards,
Willy




Re: Logging of the IP addy for the syslog

2010-04-28 Thread Willy Tarreau
Hello,

On Wed, Apr 28, 2010 at 02:39:32PM -0400, johnskar...@informed-llc.com wrote:
> 
> Greetings. 
> 
> 
> I've got a quick question about logging in haproxy. We've got a setup that 
> utilizes nginx as the ssl decryptor, which then passes off decrypted http 
> requests to haproxy, which then does what it needs. My question involves the 
> logging of the IP address in the syslogs. X-Forwarded-For exists for logging 
> of HTTP requests in the web server logs; however, haproxy logs the proxy's 
> address in the system logs. Is there an option to reflect what 
> X-Forwarded-For does? 

yes, see "capture request header". You'd want something like
"capture request header x-forwarded-for len 15".

Regards,
Willy




Re: Matching URLs at layer 7

2010-04-28 Thread Benedikt Fraunhofer
Hi *,

> (2) Host header is www.example.com
> (3) All is good! Pass request on to server.
> (2) Host header is www.whatever.com
> (3) All is NOT good! Flick request somewhere harmless.

If that's all you want, you should be able to go with

 acl xxx_host hdr(Host)  -i xxx.example.com
 block if !xxx_host

in your listen (or frontend, ...) section. But everything comes with a downside:
IMHO HTTP/1.0 doesn't require the Host header to be set, so you'll effectively
lock out all the HTTP/1.0 users unless you make another rule checking for an
undefined Host header (and allowing that), or checking for HTTP/1.0 (there
should be a "macro" for that).
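
A sketch of what that extra rule could look like, assuming the predefined
HTTP_1.0 ACL from the configuration manual (the host name is a placeholder):

 acl xxx_host hdr(Host) -i xxx.example.com
 # only block requests that neither match the expected host nor are HTTP/1.0
 block if !xxx_host !HTTP_1.0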

Just my 2cent
  Beni.



Hardware recommendations

2010-04-28 Thread Alex Forrow
Hi,

We're looking to upgrade our HAProxy hardware soon. Does anyone have any
recommendations on the things we should be looking for? e.g. Are there any
NICs we should use/avoid?

Our site primarily serves lots of small objects.


Kind regards,

Alex


Re: Matching URLs at layer 7

2010-04-28 Thread Jeffrey 'jf' Lim
On Wed, Apr 28, 2010 at 7:51 PM, Andrew Commons
 wrote:
> Hi Beni,
>
> A few things to digest here.
>
> What was leading me up this path was a bit of elementary (and probably naïve) 
> white-listing with respect to the contents of the Host header and the URI/L 
> supplied by the user. Tools like Fiddler make request manipulation trivial so 
> filtering out 'obvious' manipulation attempts would be a good idea. With this 
> in mind my thinking (if it can be considered as such) was that:
>
> (1) user request is for http://www.example.com/whatever
> (2) Host header is www.example.com
> (3) All is good! Pass request on to server.
>
> Alternatively:
>
> (1) user request is for http://www.example.com/whatever
> (2) Host header is www.whatever.com
> (3) All is NOT good! Flick request somewhere harmless.
>

Benedikt has explained this already (see his first reply). There is no
such thing: what you see as the "user request" is really sent as a Host
header plus a URI.

Also to answer another question you raised - the http specification
states that header names are case-insensitive. I don't know about
haproxy's treatment, though (I'm too lazy to delve into the code right
now - and really you can test it out to find out for yourself).

-jf


--
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."
--Richard Stallman

"It's so hard to write a graphics driver that open-sourcing it would not help."
-- Andrew Fear, Software Product Manager, NVIDIA Corporation
http://kerneltrap.org/node/7228



HAProxy, Set-Cookie and Cacheable Elements

2010-04-28 Thread John Marrett
I've noticed some interesting behaviour with persistence cookies and
haproxy.

Let's say you use the following settings in your haproxy.cfg:

  cookie SERVERID insert indirect
  server static1 172.25.0.10:1080 cookie server1 check inter 15s
  server static2 172.25.0.11:1080 cookie server2 check inter 15s

Any time haproxy receives a request that has no SERVERID cookie it will
set one. Unfortunately, this doesn't take into consideration the
cacheability of the request. If a user receives a set-cookie in their
response, on cacheable content, and if the proxy server isn't configured
to strip Set-Cookie responses when serving from cache, all users of that
proxy server will persist to a single backend server.

I noticed this while looking into some other issues we were having with
Set-Cookie and proxy servers (notably the great firewall of Singapore).

Within our own application we either set Cache-Control: Private or
ensure that we don't send a Set-Cookie on content that is declared as
cacheable. I don't know if this kind of functionality could be
interesting for haproxy, but I thought I'd share my findings and see if
anyone else was aware of this pattern of behaviour, if it was causing
issues, and if there is or should be a way to address this issue.
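
For what it's worth, a sketch of one possible mitigation on the haproxy side,
assuming the "nocache" flag documented for the cookie keyword (it marks a
response as non-cacheable whenever a persistence cookie has to be inserted);
servers as in the example above:

  cookie SERVERID insert indirect nocache
  server static1 172.25.0.10:1080 cookie server1 check inter 15s
  server static2 172.25.0.11:1080 cookie server2 check inter 15s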

-JohnF



RE: Matching URLs at layer 7

2010-04-28 Thread Andrew Commons
Hi Beni,

A few things to digest here.

What was leading me up this path was a bit of elementary (and probably naïve) 
white-listing with respect to the contents of the Host header and the URI/L 
supplied by the user. Tools like Fiddler make request manipulation trivial so 
filtering out 'obvious' manipulation attempts would be a good idea. With this 
in mind my thinking (if it can be considered as such) was that:

(1) user request is for http://www.example.com/whatever
(2) Host header is www.example.com
(3) All is good! Pass request on to server.

Alternatively:

(1) user request is for http://www.example.com/whatever
(2) Host header is www.whatever.com
(3) All is NOT good! Flick request somewhere harmless.

I'm not sure whether your solution supports this, and if your interpretation is 
correct maybe HAProxy doesn't support it either.

I'll do some more experimenting and I hope I don't lock myself out ;-)

Cheers
Andrew

-Original Message-
From: myse...@gmail.com [mailto:myse...@gmail.com] On Behalf Of Benedikt 
Fraunhofer
Sent: Wednesday, 28 April 2010 7:42 PM
To: Andrew Commons
Cc: haproxy@formilux.org
Subject: Re: Matching URLs at layer 7

Hi Andrew,

2010/4/28 Andrew Commons :

> url_beg <string>
>  Returns true when the URL begins with one of the strings. This can be used to
>  check whether a URL begins with a slash or with a protocol scheme.
>
> So I'm assuming that "protocol scheme" means http:// or ftp:// or whatever

I would assume that, too..
but :) reading the other matching options it looks like those only
affect the "anchoring" of the matching. Like

> url_ip <ip_address>
>  Applies to the IP address specified in the absolute URI in an HTTP request.
>  It can be used to prevent access to certain resources such as local network.
>  It is useful with option "http_proxy".

yep. but watch this "http_proxy"


> url_port <range>
>  "http_proxy". Note that if the port is not specified in the request, port 80
>  is assumed.

same here.. This enables plain proxy mode where requests are issued
(from the client) like

 GET http://www.example.com/importantFile.txt HTTP/1.0
.

> This seems to be reinforced (I think!) by:
>
> url_dom <string>
>  Returns true when one of the strings is found isolated or delimited with dots
>  in the URL. This is used to perform domain name matching without the risk of
>  wrong match due to colliding prefixes. See also "url_sub".

I personally don't think so.. I guess this is just another version of
"anchoring", here
"\.$STRING\."

> If I'm suffering from a bit of 'brain fade' here just set me on the right 
> road :-) If the url_ criteria have different interpretations in terms of what 
> the 'url' is then let's find out what these are!

I currently can't give it a try as i finally managed to lock myself out, but

http://haproxy.1wt.eu/download/1.4/doc/configuration.txt

has an example that looks exactly as what you need:
---
To select a different backend for requests to static contents on the "www" site
and to every request on the "img", "video", "download" and "ftp" hosts :

   acl url_static  path_beg /static /images /img /css
   acl url_static  path_end .gif .png .jpg .css .js
   acl host_www    hdr_beg(host) -i www
   acl host_static hdr_beg(host) -i img. video. download. ftp.

   # now use backend "static" for all static-only hosts, and for static urls
   # of host "www". Use backend "www" for the rest.
   use_backend static if host_static or host_www url_static
   use_backend www    if host_www

---

and as "begin" really means anchoring it with "^" in a regex this
would mean that there's no host in url as this would redefine the
meaning of "begin" which should not be done :)

So you should be fine with

   acl xxx_host hdr(Host)  -i xxx.example.com
   acl xxx_url  url_beg /
   #there's already a predefined acl doing this.
   use_backend xxx if xxx_host xxx_url

if i recall your example correctly.. But you should really put
something behind the url_beg to be of any use :)

Just my 2 cent

 Beni.




Re: Matching URLs at layer 7

2010-04-28 Thread Benedikt Fraunhofer
Hi Andrew,

2010/4/28 Andrew Commons :

> url_beg <string>
>  Returns true when the URL begins with one of the strings. This can be used to
>  check whether a URL begins with a slash or with a protocol scheme.
>
> So I'm assuming that "protocol scheme" means http:// or ftp:// or whatever

I would assume that, too..
but :) reading the other matching options it looks like those only
affect the "anchoring" of the matching. Like

> url_ip <ip_address>
>  Applies to the IP address specified in the absolute URI in an HTTP request.
>  It can be used to prevent access to certain resources such as local network.
>  It is useful with option "http_proxy".

yep. but watch this "http_proxy"


> url_port <range>
>  "http_proxy". Note that if the port is not specified in the request, port 80
>  is assumed.

same here.. This enables plain proxy mode where requests are issued
(from the client) like

 GET http://www.example.com/importantFile.txt HTTP/1.0
.

> This seems to be reinforced (I think!) by:
>
> url_dom <string>
>  Returns true when one of the strings is found isolated or delimited with dots
>  in the URL. This is used to perform domain name matching without the risk of
>  wrong match due to colliding prefixes. See also "url_sub".

I personally don't think so.. I guess this is just another version of
"anchoring", here
"\.$STRING\."

> If I'm suffering from a bit of 'brain fade' here just set me on the right 
> road :-) If the url_ criteria have different interpretations in terms of what 
> the 'url' is then let's find out what these are!

I currently can't give it a try as i finally managed to lock myself out, but

http://haproxy.1wt.eu/download/1.4/doc/configuration.txt

has an example that looks exactly as what you need:
---
To select a different backend for requests to static contents on the "www" site
and to every request on the "img", "video", "download" and "ftp" hosts :

   acl url_static  path_beg /static /images /img /css
   acl url_static  path_end .gif .png .jpg .css .js
   acl host_www    hdr_beg(host) -i www
   acl host_static hdr_beg(host) -i img. video. download. ftp.

   # now use backend "static" for all static-only hosts, and for static urls
   # of host "www". Use backend "www" for the rest.
   use_backend static if host_static or host_www url_static
   use_backend www    if host_www

---

and as "begin" really means anchoring it with "^" in a regex this
would mean that there's no host in url as this would redefine the
meaning of "begin" which should not be done :)

So you should be fine with

   acl xxx_host hdr(Host)  -i xxx.example.com
   acl xxx_url  url_beg /
   #there's already a predefined acl doing this.
   use_backend xxx if xxx_host xxx_url

if i recall your example correctly.. But you should really put
something behind the url_beg to be of any use :)

Just my 2 cent

 Beni.



RE: Matching URLs at layer 7

2010-04-28 Thread Andrew Commons
Hi Beni,

Thanks for responding :-)

The doco states that:

url_beg <string>
  Returns true when the URL begins with one of the strings. This can be used to
  check whether a URL begins with a slash or with a protocol scheme.

So I'm assuming that "protocol scheme" means http:// or ftp:// or whatever

Other parts of the documentation state that:

url_ip <ip_address>
  Applies to the IP address specified in the absolute URI in an HTTP request.
  It can be used to prevent access to certain resources such as local network.
  It is useful with option "http_proxy".

url_port <range>
  Applies to the port specified in the absolute URI in an HTTP request. It can
  be used to prevent access to certain resources. It is useful with option
  "http_proxy". Note that if the port is not specified in the request, port 80
  is assumed.

So I've been assuming that anything starting with url_ refers to the whole
user-supplied string, parameters and all...

This seems to be reinforced (I think!) by:

url_dom <string>
  Returns true when one of the strings is found isolated or delimited with dots
  in the URL. This is used to perform domain name matching without the risk of
  wrong match due to colliding prefixes. See also "url_sub".

Which sure looks like the host portion to me!

If I'm suffering from a bit of 'brain fade' here just set me on the right road 
:-) If the url_ criteria have different interpretations in terms of what the 
'url' is then let's find out what these are!

Cheers
Andrew

-Original Message-
From: myse...@gmail.com [mailto:myse...@gmail.com] On Behalf Of Benedikt 
Fraunhofer
Sent: Wednesday, 28 April 2010 6:23 PM
To: Andrew Commons
Cc: haproxy@formilux.org
Subject: Re: Matching URLs at layer 7

Hi *,

2010/4/28 Andrew Commons :
>acl xxx_url  url_beg    -i http://xxx.example.com
>acl xxx_url  url_sub    -i xxx.example.com
>acl xxx_url  url_dom    -i xxx.example.com

The Url is the part of the URI without the host :)
A http request looks like

 GET /index.html HTTP/1.0
 Host: www.example.com

so you can't use url_beg to match on the host unless you somehow
construct your urls to look like
 http://www.example.com/www.example.com/
but don't do that :)

so what you want is something like chaining
acl xxx_host hdr(Host) <yourhost>
acl xxx_urlbe1 url_beg /toBE1/
use_backend BE1 if xxx_host xxx_urlbe1
?

Cheers

  Beni.





Re: Matching URLs at layer 7

2010-04-28 Thread Benedikt Fraunhofer
Hi *,

2010/4/28 Andrew Commons :
>        acl xxx_url      url_beg        -i http://xxx.example.com
>        acl xxx_url      url_sub        -i xxx.example.com
>        acl xxx_url      url_dom        -i xxx.example.com

The Url is the part of the URI without the host :)
A http request looks like

 GET /index.html HTTP/1.0
 Host: www.example.com

so you can't use url_beg to match on the host unless you somehow
construct your urls to look like
 http://www.example.com/www.example.com/
but don't do that :)

so what you want is something like chaining
acl xxx_host hdr(Host) <yourhost>
acl xxx_urlbe1 url_beg /toBE1/
use_backend BE1 if xxx_host xxx_urlbe1
?

Cheers

  Beni.



RE: Matching URLs at layer 7

2010-04-28 Thread Andrew Commons
As an aside, should the documentation extract below actually read:

acl local_dst    hdr(Host) -i localhost
                     ^
                     ^
i.e. is the name of the header case sensitive? In my attempts to work this
out I think that I had to use 'Host' rather than 'host' before it worked.


4.2. Alphabetically sorted keywords reference
-

This section provides a description of each keyword and its usage.


acl <aclname> <criterion> [flags] [operator] <value> ...
  Declare or complete an access list.
  May be used in sections :   defaults | frontend | listen | backend
                                  no    |    yes   |   yes  |   yes
  Example:
acl invalid_src  src  0.0.0.0/7 224.0.0.0/3
acl invalid_src  src_port 0:1023
acl local_dst    hdr(host) -i localhost

  See section 7 about ACL usage.


-Original Message-
From: Andrew Commons [mailto:andrew.comm...@bigpond.com] 
Sent: Wednesday, 28 April 2010 4:06 PM
To: 'haproxy@formilux.org'
Subject: Matching URLs at layer 7

I'm confused over the behaviour of the url criteria in layer 7 acls. 

If I have a definition of the form:

acl xxx_host hdr(Host)  -i xxx.example.com

then something like this works fine:

use_backend xxx if xxx_host

If I try something like this:

acl xxx_url  url_beg    -i http://xxx.example.com
use_backend xxx if xxx_url

then it fails.

I've tried:

acl xxx_url  url_sub    -i xxx.example.com
acl xxx_url  url_dom    -i xxx.example.com

Same result... I'm missing something obvious here, I just can't see it :-(

My ultimate goal is to have:

use_backend xxx if xxx_url xxx_host

which I think makes sense for a browser request that has not been fiddled
with...if I could test it I would be able to find out!

Any insights appreciated :-)

Cheers
andrew