Syslog message size

2014-02-06 Thread Hervé COMMOWICK
Hello list,

After some discussion in the #haproxy channel about MAX_SYSLOG_LEN, I think
we should raise this default value from 1024 to 2048.

There are multiple arguments for that. RFC 5426
(http://tools.ietf.org/html/rfc5426#section-3.2) encourages all receivers
to support messages of up to 2048 bytes, and recommends that syslog senders
restrict message sizes such that IP datagrams do not exceed the smallest
MTU of the network in use.
In our case, the network almost all of us use is the loopback, as we usually
chain on the loopback and then forward logs to the outside when needed. The
loopback MTU is 16436, so there is no limitation here.

The syslog daemons used by default in GNU/Linux distributions are rsyslog and
syslog-ng: RHEL and Debian use rsyslog, SLES uses syslog-ng. Both daemons
support messages of at least 2048 bytes, and syslog-ng even defaults to 8192.
http://www.rsyslog.com/doc/rsyslog_conf_global.html
http://www.balabit.com/sites/default/files/documents/syslog-ng-pe-4.0-guides/en/syslog-ng-pe-v4.0-guide-admin-en/html-single/index.html#idp8428192

The rsyslog developers say that "testing showed that 4k seems to be the typical
maximum for UDP based syslog. This is an IP stack restriction. Not
always ... but very often".

*BSD (FreeBSD/OpenBSD) support is not as good: the base syslogd only supports
messages of up to 1024 bytes, so maybe we should keep 1024 only on those platforms.

BTW, I think the best option would also be to make this value configurable,
with something like a tune.syslog.maxlength directive.
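
As a purely illustrative sketch, such a tunable (hypothetical at this point;
it only exists as a proposal in this thread) could look like this in the
global section:

    global
        # proposed directive, not implemented yet: maximum syslog message length
        tune.syslog.maxlength 2048
        log 127.0.0.1:514 local0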

Hervé.

-- 
Hervé COMMOWICK
Ingénieur systèmes et réseaux.

http://www.rezulteo.com
by Lizeo Online Media Group 
42 quai Rambaud - 69002 Lyon (France) ⎮ ☎ +33 (0)4 26 99 03 77



capture.req.hdr

2014-02-06 Thread Patrick Hemmer
I really like this feature, and it was something actually on my todo
list of things to look into adding to haproxy.
However, there is one thing I would consider supporting: instead of
requiring the index of the capture keyword in the config, which is very
cumbersome and awkward in my opinion, support using the header name.

Now I imagine the immediate response to this is going to be that this
would require searching for the header by name every time
capture.req.hdr is used, since the captured headers are stored in a simple
array that does not keep the header names; this would complicate the code
and possibly slow haproxy down.
But, an alternate idea would be to transform the header name into its
index at the time of parsing configuration. This would let the user use
a header name, but the actual haproxy code which translates
capture.req.hdr wouldn't change at all.
It would be a lot less fragile when someone updates their config to
capture an additional header but forgets to update all the indexes (plus
having to keep track of the indexes in the first place).
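
For illustration, here is a minimal sketch of the two styles; the name-based
form is only the proposal discussed here, not existing syntax, and the header
names are arbitrary examples (to be placed in a frontend or listen section):

    # existing syntax: captured headers are referenced by index
    capture request header Host len 64              # index 0
    capture request header X-Forwarded-For len 64   # index 1
    http-request set-header X-Orig-XFF %[capture.req.hdr(1)]

    # proposed: let the config parser translate the name into the index
    http-request set-header X-Orig-XFF %[capture.req.hdr(X-Forwarded-For)]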


-Patrick


Re: [ANNOUNCE] haproxy-1.5-dev22

2014-02-06 Thread Qingshan Xie
Willy, 
in your release announcement, you mentioned 
"Some code is still pending for a next version. Thierry has finishedthe map+acl 
merge which will allow to manipulate ACLs on the fly just
like maps today, .."



On Sunday, February 2, 2014 4:48 PM, Willy Tarreau  wrote:
 
Hi all,

after 1.5 months of head scratching and hair pulling leading to many
bugs being fixed, here comes 1.5-dev22.

This release comes with two important changes :

  - rework of the whole polling system, which is the lower layer of
    haproxy ; This was needed to definitely get rid of the frequent
    regressions that were caused each time we did a small change
    more or less related to this area. The "speculative I/O" mechanism
    designed 7 years ago was totally reworked to become a complete
    event cache which remembers what direction a file descriptor is
    ready in even after being temporarily disabled. This was necessary
    because the previous model didn't work well with SSL. Or in fact,
    it used to work well enough to hide the fact that the SSL API is
    not compatible at all with polled I/O due to its internal buffers.
    This part was really difficult to get right, but the code is much
    less tricky and much safer, and despite the important change, I
    already trust it much more than I did for the previous one.

  - switch to HTTP keep-alive mode by default. This is a major step
    forward since 1.1 where we used to run in tunnel mode by default.
    The reason is that tunnel mode was the only way to have something
    close to keep-alive for many years. Now that we have end-to-end
    keep-alive, we have no reason for keeping tunnel mode as the
    default. It causes all the trouble everyone has faced at least
    once ("my rule randomly matches") which everyone now is used to
    respond to with "your config is missing http-server-close". So
    now a config without any close directive is not tunnel anymore
    but end-to-end keep-alive. I know there are corner cases where
    people want the tunnel mode. There's now a new option "tunnel"
    exactly for this. It will need to be set in both the
    frontend and the backend, just as before when tunnel mode
    required that neither of them had a close option.

Even though I took extreme care on these changes and did many many
tests (I individually tested the 25 combinations of the 5 HTTP
modes), it is still possible that I didn't notice something, despite
this version currently being run in production on the main site. So
reports are welcome (success, doubts or failures).

I won't enumerate all of the 32 bugs that were fixed since dev21
(some of them introduced there) thanks to all the feedback we got
here on the list and to the detailed information some participants
provided.

The main interesting features that were included are :
  - optimization of the SSL buffer sizes during a handshake to
    reduce the number of round trips, as suggested by Ilya Grigorik.
    Tests run by Ilya show that the handshake time can be reduced by
    a factor of 3! Work done by Emeric.

  - addition of more debugging information on the stats socket in
    "show info" such as SSL connections etc, and memory pools usage
    using "show pools".

  - added the ability to set a hard limit on the SSL session rate
    (maxsslrate) in order to protect the SSL stack against incoming
    connection rushes which can happen during a restart, a config
    change (eg: different algos) or an attack. It works exactly
    like the "rate-limit sessions" except that it applies to SSL
    only (a small config sketch follows this list).

  - new "capture.req.hdr()" and "capture.res.hdr()" sample fetches
    are used to include contents of selected captured headers in logs
    or other headers (William).

  - keep-alive: stick to the same server if possible after receiving
    a 401 or 407 from the server, so that the user has a chance to
    complete an authentication handshake (eg: NTLM). This avoids the
    need for "option prefer-last-server" for such situations.

  - tcp-check: new "tcp-check connect" directive to establish a
    connection to a specific port. This allows multi-port checks
    (Baptiste).
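
To make the last two items above more concrete, here is a minimal config
sketch based purely on the descriptions in this announcement (ports,
addresses and numbers are placeholders, so treat it as an illustration
rather than a reference):

    global
        maxsslrate 100                 # hard limit on incoming SSL sessions per second

    backend mail
        option tcp-check
        tcp-check connect port 25      # multi-port check: verify port 25...
        tcp-check connect port 587     # ...and port 587 on the same server
        server mx1 192.0.2.10:25 check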

Some code is still pending for a next version. Thierry has finished
the map+acl merge which will allow to manipulate ACLs on the fly just
like maps today, the code is still under review (massive changes),
and is so often requested that we'd better merge it before 1.5-final.

Another SSL optim is currently under test.

All the easy things that were pending have been merged. This leaves
us with only the bind-process fixes, buffer management to fix
compression on chunks, and the agent-checks modifications. We'll see
how all this goes and if some parts are too difficult to fix before
the release.

In the meantime, please test and report. Testers have been amazingly
helpful and determined these last months, and that's what makes the
quality in the end. So please continue like this!

Last point, I've been backporting all relevant fixes to 1.4 and am
pl

Re: [ANNOUNCE] haproxy-1.5-dev22

2014-02-06 Thread Willy Tarreau
Hi,

On Thu, Feb 06, 2014 at 10:05:49AM -0800, Qingshan Xie wrote:
> Willy, 
> in your release announcement, you mentioned 
> "Some code is still pending for a next version. Thierry has finishedthe 
> map+acl merge which will allow to manipulate ACLs on the fly just
> like maps today, .."
(...)
> does it mean "this new feature will allow us to modify ACLs dynamically
> without haproxy restart"?  Will it be included in 1.5-dev23?

Hopefully yes. Currently you can already emulate this using maps. Maps
allow you to associate a value with a matching key. By using ACLs on top
of maps, you can convert the expected value to "1" for example and have
the ACL match on value 1. Then you can update your maps from the CLI.

What Thierry has done is to fuse the ACL match and the maps match so
that the keys can be managed similarly.
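
For reference, a minimal sketch of the map-based emulation described above
(file names, addresses and the whitelisting use case are purely illustrative):

    frontend www
        bind :80
        # the map file holds "key value" pairs and returns "1" for whitelisted sources
        acl allowed src,map_ip(/etc/haproxy/whitelist.map,0) -m str 1
        http-request deny unless allowed

The map can then be updated at runtime over the stats socket, for example:

    echo "add map /etc/haproxy/whitelist.map 203.0.113.7 1" | socat stdio /var/run/haproxy.sock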

Regards,
Willy




Re: Externalizing health checks

2014-02-06 Thread Bhaskar Maddala
Hello,

   Since I did not get any responses on this, I decided to try to
motivate a response by attempting an implementation. I am attaching a
patch that does this. Admittedly this patch is an iteration and I am
not submitting it for anything more than receiving feedback on the
requirement, alternative ideas and the implementation.

The following is an explanation.

I added an option httpchksrv which takes an ipv4/6 address (the external
health checker) and an optional http header name. The http header is used
to communicate to the health check server which backend server to check.

option  httpchk    GET /_health.php HTTP/1.1
option  httpchksrv <ipv4|ipv6 address>[:port] [header <header-name>]

Next, I added a "header-value" specification to the server definition:

 server a1 magic.tumblr.com:80 weight 20 maxconn 5 check inter 2s header-value magic.tumblr.com

The header-value is used as the value of the http header name specified in httpchksrv.

Here is an example of the health check request

GET /_health.php HTTP/1.1
X-Check-For: magic.tumblr.com

The default value of header-value is the server id, in this case 'a1'.

The following is a little abstract and describes how health checks can be cached
using this change; please bear with my attempts to describe it, as they may be
inadequate. Please take this for what it is: broad strokes of an idea. I am not
in any way advocating for this deployment.

Going back to my original motivation "excessive health checks due to increasing
proxy and web application deployment", here is a description of how I can solve
it using this implementation.

On haproxy I define 2 frontends, one on port 80 and one on port 6777. The
httpchksrv specification is used to direct health checks back to haproxy on port
6777, with haproxy in http mode:

option  httpchksrv 127.0.0.1:6777

Each server on the backend for port 80 (production traffic) uses a server
line such as:

 server a1 server:80 weight 20 maxconn 5 check inter 2s

I define a backend of varnish nodes to use with the frontend on port 6777.
I also make sure that the varnish backend uses only L4 health checks.

Health checks are passed to varnish from all the proxies, consistently hashed
on the http header X-Check-For, via their frontend on port 6777. Varnish VCL
is used to obtain the value of the 'X-Check-For' header and make a health
check request to the appropriate web host if required; it may return cached
health check responses according to the configured TTL.
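
For concreteness, here is a rough haproxy sketch of the deployment described
above (the httpchksrv and header-value keywords exist only in the attached
patch, and all names, ports and addresses are placeholders):

    # production traffic; health checks are redirected to the checker frontend
    backend production
        mode http
        option httpchk    GET /_health.php HTTP/1.1
        option httpchksrv 127.0.0.1:6777 header X-Check-For
        server a1 web1.example.com:80 weight 20 maxconn 5 check inter 2s header-value web1.example.com

    # frontend receiving the health checks and spreading them over varnish
    frontend health_checks
        mode http
        bind 127.0.0.1:6777
        default_backend varnish_cache

    backend varnish_cache
        mode http
        balance hdr(X-Check-For)       # consistent hashing on the check header
        hash-type consistent
        server v1 192.0.2.31:80 check  # plain L4 checks only
        server v2 192.0.2.32:80 check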


Thanks
Bhaskar

On Fri, Jan 31, 2014 at 1:46 PM, Bhaskar Maddala  wrote:
> Hello,
>
> As the number of haproxy deployments (>20) grows in our infrastructure,
> along with an increase in the number of backends (~1500), we are beginning to
> see non-trivial resources allocated to health checks, with each proxy instance
> health checking each backend every 2 seconds.
>
>   In an earlier conversation with Willy I was directed to look into the
> fastinter and on-error configuration options. I have done this but wanted to
> speak about how others might have addressed this and if there was any
> interest in implementing something along these lines and gather ideas/comments
> on what such an implementation would look like.
>
>  We use haproxy as a http load balancer and I have not given any thought
> about how the following description applies to tcp mode.
>
> Currently we http check our backends using
>
> option httpchk GET /_check.php HTTP/1.1\r\nHost:\ www.domain.com
>
>   We were considering adding an additional directive to specify a check server
> in addition to the httpchk directive
>
> option  httpchk GET /_health.php HTTP/1.1\r\nHost:\ hdr(Host)
> option  chksrv  server hcm-008dad0f 172.16.114.52:80
>
> The change would add a dynamic field to the health check request.
> hdr(Host) (http host header in this instance) is the field used to communicate
> the server to be health checked to the external check server.
>
> The check server can/will be implemented to cache health check responses from
> the back ends.
>
> One of the justifications for implementing this is the need in my
> environment to take into consideration factors not available to the
> backends when responding to a health check. As an example, we will be
> implementing in our check server the ability to force success/failure
> of health checks on groups of backends related in some manner.
> We expect this to allow us to avoid brown out scenarios we have
> encountered in the past.
>
> Has anyone considered/achieved something along these lines, or have 
> suggestions
> on how we could implement the same?
>
> Thanks
> Bhaskar
From 914db3e485831e29e5b76bf3d276ce56442b498f Mon Sep 17 00:00:00 2001
From: Bhaskar Maddala 
Date: Wed, 5 Feb 2014 23:58:36 -0500
Subject: [PATCH] Attempt at adding ability to externalize health check

Summary:
We add new option 'httpchksrv' which allows us to specify
the server to use to health check backends. The backend
to health check is commu

Re: RabbitMQ-HAProxy raising a exception.

2014-02-06 Thread Ryan O'Hara
On Thu, Feb 06, 2014 at 02:05:07PM -0600, Kuldip Madnani wrote:
> Hi,
> 
> I am trying to connect my RabbitMQ cluster through HAProxy. When connected
> directly to RabbitMQ nodes it works fine, but when connected through HAProxy
> it raises the following exception:

What are your client/server timeouts?

Ryan

> com.rabbitmq.client.ShutdownSignalException: connection error; reason:
> java.io.EOFException
> at
> com.rabbitmq.client.impl.AMQConnection.startShutdown(AMQConnection.java:678)
> at com.rabbitmq.client.impl.AMQConnection.shutdown(AMQConnection.java:668)
> at
> com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:546)
> Caused by: java.io.EOFException
> at java.io.DataInputStream.readUnsignedByte(DataInputStream.java:290)
> at com.rabbitmq.client.impl.Frame.readFrom(Frame.java:95)
> at
> com.rabbitmq.client.impl.SocketFrameHandler.readFrame(SocketFrameHandler.java:131)
> at
> com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:515)
> 
> What could be the reason? I see the RabbitMQ guys said in many forums to check
> it with HAProxy.
> 
> Thanks & Regards,
> Kuldip Madnani



Keep-alive and websocket connections

2014-02-06 Thread Chris Yang
Dear all,

In the latest HAProxy 1.5 release (dev22), it is indicated that
keep-alive is now enabled by default for both client and server sides.
I have some questions regarding its use in the following scenario.

I use HAProxy in front of an array of servers: one nginx for
delivering static files, and the others being application servers. One
of the application servers exclusively deals with websocket (or in the
event of ws failure, switching to streaming) connections. Currently, I
am using 'http-server-close' by default for all servers, but I think
it'd be better to switch to 'http-keep-alive' for the nginx and keep
'http-server-close' for the websockets server.

Is this a correct setup? Thanks.

Best,

Chris



RabbitMQ-HAProxy raising a exception.

2014-02-06 Thread Kuldip Madnani
Hi,

I am trying to connect my RabbitMQ cluster through HAProxy. When connected
directly to RabbitMQ nodes it works fine, but when connected through HAProxy
it raises the following exception:

com.rabbitmq.client.ShutdownSignalException: connection error; reason:
java.io.EOFException
at
com.rabbitmq.client.impl.AMQConnection.startShutdown(AMQConnection.java:678)
at com.rabbitmq.client.impl.AMQConnection.shutdown(AMQConnection.java:668)
at
com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:546)
Caused by: java.io.EOFException
at java.io.DataInputStream.readUnsignedByte(DataInputStream.java:290)
at com.rabbitmq.client.impl.Frame.readFrom(Frame.java:95)
at
com.rabbitmq.client.impl.SocketFrameHandler.readFrame(SocketFrameHandler.java:131)
at
com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:515)

What could be the reason? I see the RabbitMQ guys said in many forums to check
it with HAProxy.

Thanks & Regards,
Kuldip Madnani


Re: RabbitMQ-HAProxy raising a exception.

2014-02-06 Thread Kuldip Madnani
I have the following setting for HAProxy and no settings in client for
connectionFactory:

defaults
    log     global
    mode    tcp
    option  tcplog
    option  dontlognull
    retries 3
    option  redispatch
    maxconn 4096
    timeout connect 5s # default 5 second time out if a backend is not found
    timeout client 300s
    timeout server 300s


# Entries for rabbitmq_CLUSTER6 Listener
#--#
listen rabbitmq_CLUSTER6 *:5678
    mode    tcp
    maxconn 8092
    option  allbackups
    balance roundrobin
    server LISTENER_rabbitmq_CLUSTER6_zldv3697_vci_att_com_5672 zldv3697.XXX.XXX.com:5672 weight 10 check inter 5000 rise 2 fall 3
##

Could these values have an impact and cause the java.io.EOFException?

Thanks & Regards,
Kuldip Madnani



On Thu, Feb 6, 2014 at 2:08 PM, Ryan O'Hara  wrote:

> On Thu, Feb 06, 2014 at 02:05:07PM -0600, Kuldip Madnani wrote:
> > Hi,
> >
> > I am trying to connect my RabbitMQ cluster through HAProxy.When connected
> > directly to RabbitMQ nodes it works fine but when connected through
> HAProxy
> > it raises following exception :
>
> What are your client/server timeouts?
>
> Ryan
>
> > com.rabbitmq.client.ShutdownSignalException: connection error; reason:
> > java.io.EOFException
> > at
> >
> com.rabbitmq.client.impl.AMQConnection.startShutdown(AMQConnection.java:678)
> > at
> com.rabbitmq.client.impl.AMQConnection.shutdown(AMQConnection.java:668)
> > at
> >
> com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:546)
> > Caused by: java.io.EOFException
> > at java.io.DataInputStream.readUnsignedByte(DataInputStream.java:290)
> > at com.rabbitmq.client.impl.Frame.readFrom(Frame.java:95)
> > at
> >
> com.rabbitmq.client.impl.SocketFrameHandler.readFrame(SocketFrameHandler.java:131)
> > at
> >
> com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:515)
> >
> > What could be the reason.I see RabbitMQ guys said in many forums to check
> > it with HAProxy.
> >
> > Thanks & Regards,
> > Kuldip Madnani
>


Re: RabbitMQ-HAProxy raising a exception.

2014-02-06 Thread Ryan O'Hara
On Thu, Feb 06, 2014 at 02:15:31PM -0600, Kuldip Madnani wrote:
> I have the following setting for HAProxy and no settings in client for
> connectionFactory:
> 
> defaults
> log global
> modetcp
> option  tcplog
> option  dontlognull
> retries 3
> option  redispatch
> maxconn 4096
> timeout connect 5s # default 5 second time out if a backend is not found
> timeout client 300s
> timeout server 300s

OK. 300s is more than enough.

> # Entries for rabbitmq_CLUSTER6 Listener
> #--#
> listen rabbitmq_CLUSTER6   *:5678
> mode   tcp
> maxconn8092
> option allbackups
> balanceroundrobin
> server LISTENER_rabbitmq_CLUSTER6_zldv3697_vci_att_com_5672
> zldv3697.XXX.XXX.com:5672 weight 10 check inter 5000 rise 2 fall 3
> ##
> 
> Do these values impact and throw java.io.EOFException.

I have no idea. My first thought was that your connections were timing
out and the application didn't handle it well.

I don't think this is an haproxy issue. I have haproxy working in
front of a RabbitMQ cluster and have not hit any problems. The
configuration I am using can be found here:

http://openstack.redhat.com/RabbitMQ

Ryan

> Thanks & Regards,
> Kuldip Madnani
> 
> 
> 
> On Thu, Feb 6, 2014 at 2:08 PM, Ryan O'Hara  wrote:
> 
> > On Thu, Feb 06, 2014 at 02:05:07PM -0600, Kuldip Madnani wrote:
> > > Hi,
> > >
> > > I am trying to connect my RabbitMQ cluster through HAProxy.When connected
> > > directly to RabbitMQ nodes it works fine but when connected through
> > HAProxy
> > > it raises following exception :
> >
> > What are your client/server timeouts?
> >
> > Ryan
> >
> > > com.rabbitmq.client.ShutdownSignalException: connection error; reason:
> > > java.io.EOFException
> > > at
> > >
> > com.rabbitmq.client.impl.AMQConnection.startShutdown(AMQConnection.java:678)
> > > at
> > com.rabbitmq.client.impl.AMQConnection.shutdown(AMQConnection.java:668)
> > > at
> > >
> > com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:546)
> > > Caused by: java.io.EOFException
> > > at java.io.DataInputStream.readUnsignedByte(DataInputStream.java:290)
> > > at com.rabbitmq.client.impl.Frame.readFrom(Frame.java:95)
> > > at
> > >
> > com.rabbitmq.client.impl.SocketFrameHandler.readFrame(SocketFrameHandler.java:131)
> > > at
> > >
> > com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:515)
> > >
> > > What could be the reason.I see RabbitMQ guys said in many forums to check
> > > it with HAProxy.
> > >
> > > Thanks & Regards,
> > > Kuldip Madnani
> >



Re: Question about logging in HAProxy

2014-02-06 Thread Kuldip Madnani
I configured my rsyslog with the settings below and was able to generate
haproxy.log in my directory /opt/app/haproxy, but it gets generated with
owner root and file permission rw--. Does anybody know how to change
the ownership and file permissions for the log directory? I know it's
more of an rsyslog question than an HAProxy one, but does anybody have
any idea?

Step 1: Edit the rsyslog config file (rsyslog.conf).
Uncomment the lines below in rsyslog.conf:
#$ModLoad imudp
#$UDPServerRun 514

Add the following line:
$UDPServerAddress 127.0.0.1

Step 2: Create a file /etc/rsyslog.d/haproxy.conf and insert the following:

if ($programname == 'haproxy' and $syslogseverity-text == 'info') then
-/opt/app/haproxy/log/haproxy-info.log
& ~

Step 3: Restart the rsyslog service:
service rsyslog restart
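
For reference, rsyslog's legacy configuration format also provides ownership
and permission directives; a rough sketch of what could be added to
/etc/rsyslog.d/haproxy.conf before the file action (user, group and modes
below are only example values, untested here):

$FileOwner haproxy
$FileGroup haproxy
$FileCreateMode 0644
$DirOwner haproxy
$DirGroup haproxy
$DirCreateMode 0755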


Thanks & Regards,

Kuldip Madnani



On Tue, Feb 4, 2014 at 4:44 PM, Willy Tarreau  wrote:

> Hi Ryan,
>
> On Tue, Feb 04, 2014 at 04:00:14PM -0600, Ryan O'Hara wrote:
> > On Tue, Feb 04, 2014 at 02:05:24PM -0600, Kuldip Madnani wrote:
> > > Hi,
> > >
> > > I want to redirect the logs generated by HAProxy into some specific
> file .I
> > > read that in the global section in log option i can put a file location
> > > instead of IP address.I tried using that setting but it dint work for
> me,
> > > also i enabled tcp logging in my listener but no luck.Could any body
> tell
> > > if i am missing something.Here is my configuration:
> > > global
> > > 
> > > log /opt/app/workload/haproxy/log/haproxy.log syslog info
> > > 
> >
> > On my systems (which use rsyslog) I do this:
> >
> > log /dev/log local0
> >
> > Then I create /etc/rsyslog.d/haproxy.conf, which contains:
> >
> > local0.* /var/log/haproxy
> >
> > And everything gets logged there.
>
> Just a minor point here, when you're dealing with a proxy which is used
> in contexts of high load (thousands to tens of thousands of requests per
> second), the unix socket's log buffers are too small on many systems, and
> many log messages are dropped. Thus on these systems, logging over UDP is
> preferred (which requires setting up the syslog server to listen on UDP,
> and preferably only on localhost).
>
> Best regards,
> Willy
>
>


Re: 'packet of death' in 1.5-dev21.x86_64.el6_4

2014-02-06 Thread James Hogarth
Hi,

Just providing some closure to this interesting edge case...

>
> Thank you for this detailed report, this is *very* useful. As you tracked
> the crash to happen inside openssl, I think you should file a report to
> centos/redhat because it's a security issue.


CentOS is bug-for-bug compatible (see the recent libsvr2 issue there) and,
seeing as we don't at this time have a support contract with RH, plus it
is not reproducible on the current release (the glibc in 6.5 fixes it), I
doubt there will be any traction in that direction.


> It's possible that the bug
> is easier to trigger with haproxy or with a specific version of it than
> other products, but nevertheless, no lib should ever crash depending on
> the traffic so I suspect there's an unchecked error code in it causing a
> NULL pointer to be dereferenced.
>
>
It wasn't a null pointer but an explicit choice of behaviour in libkrb5 it
seems...


> In 1.5-dev12, I believe we did not yet support SNI, which could be an
> explanation for the different behaviour between the two versions. I
> think that the chroot is needed to trigger the bug simply because the
> glibc does not find a file it looks up, and causes a different return
> code to be fed to openssl.


This is possible and seems likely ... Incidentally, blocking one of the
'ciphers of death' by not permitting it on the bind also appears to avoid
the code path - the 'cipher not permitted' error gets triggered before
whatever leads to the query through libkrb that results in the SIGABRT.


> It would be useful to know if you can also
> trigger the issue using the legacy openssl library instead of the
> distro's version (just pick 1.0.0l or 1.0.1f from the site if you're
> willing to rebuild it).
>
>
We probably won't get a chance to do this and we're unlikely to move off of
the libssl supported by the distro due to the added maintenance overhead.



> Thanks a lot!
> Willy
>
>
Not a problem ... our Head of IS did a detailed write up on our
investigation process and findings at his blog if you are interested:

http://blog.tinola.com/?e=36

Cheers,

James


Re: RabbitMQ-HAProxy raising a exception.

2014-02-06 Thread Willy Tarreau
Hi,

On Thu, Feb 06, 2014 at 02:25:02PM -0600, Ryan O'Hara wrote:
> On Thu, Feb 06, 2014 at 02:15:31PM -0600, Kuldip Madnani wrote:
> > I have the following setting for HAProxy and no settings in client for
> > connectionFactory:
> > 
> > defaults
> > log global
> > modetcp
> > option  tcplog
> > option  dontlognull
> > retries 3
> > option  redispatch
> > maxconn 4096
> > timeout connect 5s # default 5 second time out if a backend is not found
> > timeout client 300s
> > timeout server 300s
> 
> OK. 300s is more than enough.
> 
> > # Entries for rabbitmq_CLUSTER6 Listener
> > #--#
> > listen rabbitmq_CLUSTER6   *:5678
> > mode   tcp
> > maxconn8092
> > option allbackups
> > balanceroundrobin
> > server LISTENER_rabbitmq_CLUSTER6_zldv3697_vci_att_com_5672
> > zldv3697.XXX.XXX.com:5672 weight 10 check inter 5000 rise 2 fall 3
> > ##
> > 
> > Do these values impact and throw java.io.EOFException.
> 
> I have no idea. My first thought was the your connections were timing
> out and the application didn't handle it well.
> 
> I don't think this is an haproxy issue. I have haproxy working in
> front of a RabbitMQ cluster and have not hit any problems. The
> configuration I am using can be found here:
> 
> http://openstack.redhat.com/RabbitMQ

Do you know if any port information is transported in the protocol ? Since
haproxy and the server are on different ports, this could be one possible
explanation.

Willy




Re: RabbitMQ-HAProxy raising a exception.

2014-02-06 Thread Thomas Spicer
Stupid question, but what ports do you have defined in your “bunnies.config”
file? I see you listening on 5678, but I know the default port for Rabbit is
5672. I see 5672 listed in your haproxy config, but RabbitMQ might be listening
elsewhere. Also, what client are you using to make the connection? I would
check its config file as well to make sure it points to port 5678, as most
clients will assume you are using 5672.


On Feb 6, 2014, at 3:47 PM, Willy Tarreau  wrote:

> Hi,
> 
> On Thu, Feb 06, 2014 at 02:25:02PM -0600, Ryan O'Hara wrote:
>> On Thu, Feb 06, 2014 at 02:15:31PM -0600, Kuldip Madnani wrote:
>>> I have the following setting for HAProxy and no settings in client for
>>> connectionFactory:
>>> 
>>> defaults
>>> log global
>>> modetcp
>>> option  tcplog
>>> option  dontlognull
>>> retries 3
>>> option  redispatch
>>> maxconn 4096
>>> timeout connect 5s # default 5 second time out if a backend is not found
>>> timeout client 300s
>>> timeout server 300s
>> 
>> OK. 300s is more than enough.
>> 
>>> # Entries for rabbitmq_CLUSTER6 Listener
>>> #--#
>>> listen rabbitmq_CLUSTER6   *:5678
>>> mode   tcp
>>> maxconn8092
>>> option allbackups
>>> balanceroundrobin
>>> server LISTENER_rabbitmq_CLUSTER6_zldv3697_vci_att_com_5672
>>> zldv3697.XXX.XXX.com:5672 weight 10 check inter 5000 rise 2 fall 3
>>> ##
>>> 
>>> Do these values impact and throw java.io.EOFException.
>> 
>> I have no idea. My first thought was the your connections were timing
>> out and the application didn't handle it well.
>> 
>> I don't think this is an haproxy issue. I have haproxy working in
>> front of a RabbitMQ cluster and have not hit any problems. The
>> configuration I am using can be found here:
>> 
>> http://openstack.redhat.com/RabbitMQ
> 
> Do you know if any port information is transported in the protocol ? Since
> haproxy and the server are on different ports, this could be one possible
> explanation.
> 
> Willy



Re: 'packet of death' in 1.5-dev21.x86_64.el6_4

2014-02-06 Thread Willy Tarreau
Hi James,

On Thu, Feb 06, 2014 at 08:36:00PM +, James Hogarth wrote:
> CentOS is bug-for-bug compatible (see recent libsvr2 issue there) and
> seeing as we don't at this time have a support contract with RH, plus it
> is not reproducible on the current release (the glibc in 6.5 fixes it), I
> doubt there will be any traction in that direction.

Possible, but according to this below, 6.4 is very recent and supposed to
be maintained till 2015 :

   https://access.redhat.com/site/support/policy/updates/errata/

So maybe they're interested in backporting the fix from 6.5 into 6.4. There
are some Red Hat people here on the list, maybe they could relay that
information internally (Ryan ?).

> > It's possible that the bug
> > is easier to trigger with haproxy or with a specific version of it than
> > other products, but nevertheless, no lib should ever crash depending on
> > the traffic so I suspect there's an unchecked error code in it causing a
> > NULL pointer to be dereferenced.
> >
> >
> It wasn't a null pointer but an explicit choice of behaviour in libkrb5 it
> seems...

Indeed, but given the code you showed, I suspect that the abort() was put
there a bit in a hurry or as a sign of despair. abort() followed by return -1
is pointless and not that common!

> > In 1.5-dev12, I believe we did not yet support SNI, which could be an
> > explanation for the different behaviour between the two versions. I
> > think that the chroot is needed to trigger the bug simply because the
> > glibc does not find a file it looks up, and causes a different return
> > code to be fed to openssl.
> 
> 
> This is possible and seems a likely possibility ... Incidentally blocking
> one of the 'ciphers of death' by not permitting it on the bind also appears
> to avoid the code path - the 'cipher not permitted' gets triggered before
> whatever leads to the query through libkrb that results in the SIGABRT.

Yes so it seems you needed a perfect alignment of planets for this to
happen, but that in your environments, planets are always aligned :-/

> > It would be useful to know if you can also
> > trigger the issue using the legacy openssl library instead of the
> > distro's version (just pick 1.0.0l or 1.0.1f from the site if you're
> > willing to rebuild it).
> >
> >
> We probably won't get a chance to do this and we're unlikely to move off of
> the libssl supported by the distro due to the added maintenance overhead.

I easily understand! Especially since you really want to rely on someone
who knows it well to correctly backport only the right fixes and not the
bogus ones from upstream...

> > Thanks a lot!
> > Willy
> >
> >
> Not a problem ... our Head of IS did a detailed write up on our
> investigation process and findings at his blog if you are interested:
> 
> http://blog.tinola.com/?e=36

Ah cool, thanks for the link!

Cheers,
Willy




RE: Keep-alive and websocket connections

2014-02-06 Thread Lukas Tribus
Hi,


> In the latest HAProxy 1.5 release (dev22), it is indicated that
> keep-alive is now enabled by default for both client and server sides.
> I have some questions regarding its use in the following scenario.
>
> I use HAProxy in front of an array of servers: one nginx for
> delivering static files, and the others being application servers. One
> of the application servers exclusively deals with websocket (or in the
> event of ws failure, switching to streaming) connections. Currently, I
> am using 'http-server-close' by default for all servers, but I think
> it'd be better to switch to 'http-keep-alive' for the nginx and keep
> 'http-server-close' for the websockets server.

You can just default to http-keep-alive everywhere.

HAProxy recognizes the upgrade headers and switches to TCP mode
automatically [1].

Recognizing the upgrade in an HTTP transaction is possible with all modes
except tcp mode (of course) and (the pre-dev22 default) http tunnel mode [2].
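
For illustration, a minimal sketch of such a setup (section names, addresses
and timeouts are placeholders only):

    defaults
        mode http
        option http-keep-alive
        timeout connect 5s
        timeout client  30s
        timeout server  30s
        timeout tunnel  1h          # applies once a connection is upgraded to websocket

    frontend www
        bind :80
        acl is_ws hdr(Upgrade) -i websocket
        use_backend websocket if is_ws
        default_backend static

    backend static
        server nginx1 192.0.2.10:80 check

    backend websocket
        server app1 192.0.2.20:8080 check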



> Is this a correct setup? Thanks.

It is, but you may as well simplify it with http-keep-alive on all sections.

I don't see any advantage by configuring http-server-close on the websocket
backend.

Of course you should test this, before putting it in production.



Regards,

Lukas



[1] http://blog.exceliance.fr/2012/11/07/websockets-load-balancing-with-haproxy/
[2] http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#4  
  


timeouts, and general config help

2014-02-06 Thread Mr. NPP
we are in the process of switching from a foundry load balancer (really
old) to haproxy.

we originally went with 1.4 and stunnel, however stunnel caved at 1000
connections and we had to find a new solution, so we are now running 1.5.

we've done a lot of testing, but we feel that we just don't have exactly
what we need, or could use.

we are also running into an error with a large file upload timing out after
10 minutes, and we can't seem to explain it.

below is all the information.



HA-Proxy version 1.5-dev22-1a34d57 2014/02/03
Copyright 2000-2014 Willy Tarreau 

Build options :
  TARGET  = linux2628
  CPU = generic
  CC  = x86_64-pc-linux-gnu-gcc
  CFLAGS  = -O2 -pipe -fomit-frame-pointer -mno-tls-direct-seg-refs
-fno-strict-aliasing
  OPTIONS = USE_LIBCRYPT=1 USE_GETADDRINFO=1 USE_ZLIB=1 USE_OPENSSL=1
USE_PCRE=1 USE_PCRE_JIT=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 8192, maxpollevents = 200

Encrypted password support via crypt(3): yes
Built with zlib version : 1.2.8
Compression algorithms supported : identity, deflate, gzip
Built with OpenSSL version : OpenSSL 1.0.1e 11 Feb 2013
Running on OpenSSL version : OpenSSL 1.0.1e 11 Feb 2013
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports prefer-server-ciphers : yes
Built with PCRE version : 8.33 2013-05-28
PCRE library supports JIT : yes
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT
IP_FREEBIND

Available polling systems :
  epoll : pref=300,  test result OK
   poll : pref=200,  test result OK
 select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.
---


the config is as follows, please feel free to make fun of us for our severe
lack of proper config.

https://gist.github.com/anonymous/8deb82dba88980e8d243


the multiple frontends for the same thing are for redirects. last night was
a "we have to swap it out or we are going to have major problems" kind of
night, and i couldn't find an ACL that would match on the proxy to do the
correct redirects if it was needed.

anyway, thank you.

mr.npp


Re: Keep-alive and websocket connections

2014-02-06 Thread Chris Yang
Thanks for your suggestion, Lukas.

For my own understanding, are you saying that there is no difference
between having "http-keep-alive" and having "http-server-close" for a
backend server once a websocket connection to that server is established,
and that both settings allow the websocket connection to be established
perfectly?

So is there any advantage to having "http-keep-alive" for a websocket backend?


On Thu, Feb 6, 2014 at 4:56 PM, Lukas Tribus  wrote:
> Hi,
>
>
>> In the latest HAProxy 1.5 release (dev22), it is indicated that
>> keep-alive is now enabled by default for both client and server sides.
>> I have some questions regarding its use in the following scenario.
>>
>> I use HAProxy in front of an array of servers: one nginx for
>> delivering static files, and the others being application servers. One
>> of the application servers exclusively deals with websocket (or in the
>> event of ws failure, switching to streaming) connections. Currently, I
>> am using 'http-server-close' by default for all servers, but I think
>> it'd be better to switch to 'http-keep-alive' for the nginx and keep
>> 'http-server-close' for the websockets server.
>
> You can just default to http-keep-alive everywhere.
>
> HAProxy recognizes the upgrade headers and switches to TCP mode
> automatically [1].
>
> Recognizing the upgrade in a HTTP transaction is possible with all modes
> expect tcp mode (of course) and (the pre-dev22 default) http tunnel mode [2].
>
>
>
>> Is this a correct setup? Thanks.
>
> It is, but you may as well simplify it with http-keep-alive on all sections.
>
> I don't see any advantage by configuring http-server-close on the websocket
> backend.
>
> Of course you should test this, before putting it in production.
>
>
>
> Regards,
>
> Lukas
>
>
>
> [1] 
> http://blog.exceliance.fr/2012/11/07/websockets-load-balancing-with-haproxy/
> [2] http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#4



Re: optimizing TLS time to first byte

2014-02-06 Thread Ilya Grigorik
> > (a) It's not clear to me how the threshold upgrade is determined? What
> > triggers the record size bump internally?
>
> The forwarding mechanism does two things :
>   - the read side counts the number of consecutive iterations that
> read() filled the whole receive buffer. After 3 consecutive times,
> it considers that it's a streaming transfer and sets the flag
> CF_STREAMER on the communication channel.
>
>   - after 2 incomplete reads, the flag disappears.
>
>   - the send side detects the number of times it can send the whole
> buffer at once. It sets CF_STREAMER_FAST if it can flush the
> whole buffer 3 times in a row.
>
>   - after 2 incomplete writes, the flag disappears.
>
> I preferred to only rely on CF_STREAMER and ignore the _FAST variant
> because it would only favor high bandwidth clients (it's used to
> enable splice() in fact). But I thought that CF_STREAMER alone would
> do the right job. And your WPT test seems to confirm this, when we
> look at the bandwidth usage!
>
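
To restate the mechanism above in rough C-like pseudo-code (purely
illustrative, not the actual haproxy source; names are invented, only the
thresholds come from the description above):

    #include <stdbool.h>

    #define CF_STREAMER 0x01

    struct chan {
        unsigned int flags;     /* CF_STREAMER set when streaming is detected */
        int full_reads;         /* consecutive reads that filled the buffer   */
        int partial_reads;      /* consecutive incomplete reads               */
    };

    /* Called after each read(); 'filled' is true when the buffer was filled. */
    static void update_streamer(struct chan *c, bool filled)
    {
        if (filled) {
            c->partial_reads = 0;
            if (++c->full_reads >= 3)
                c->flags |= CF_STREAMER;    /* looks like a streaming transfer */
        } else {
            c->full_reads = 0;
            if (++c->partial_reads >= 2)
                c->flags &= ~CF_STREAMER;   /* back to conservative behaviour */
        }
    }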

Gotcha, thanks. As a follow up question, is it possible for me to control
the size of the read buffer?


> > (b) If I understood your earlier comment correctly, HAProxy will
> > automatically begin each new request with small record size... when it
> > detects that it's a new request.
>
> Indeed. In HTTP mode, it processes transactions (request+response), not
> connections, and each new transaction starts in a fresh state where these
> flags are cleared.


Awesome.


> > This works great if we're talking to a
> > backend in "http" mode: we parse the HTTP/1.x protocol and detect when a
> > new request is being processed, etc. However, what if I'm using HAProxy
> to
> > terminate TLS (+alpn negotiate) and then route the data to a "tcp" mode
> > backend.. which is my spdy / http/2 server talking over a non-encrypted
> > channel.
>
> Ah good point. I *suspect* that in practice it will work because :
>
>   - the last segment of the first transfer will almost always be incomplete
> (you don't always transfer exact multiples of the buffer size) ;
>   - the first response for the next request will almost always be
> incomplete
> (headers and not all data)
>

Ah, clever. To make this more interesting, say we have multiple streams in
flight: the frames may be interleaved and some streams may finish sooner
than others, but since multiple are in flight, chances are we'll be able to
fill the read buffer until the last stream completes.. which is actually
exactly what we want: we wouldn't want to reset the window at the end of each
stream, but only when the connection goes quiet!


> So if we're in this situation, this will be enough to reset the CF_STREAMER
> flag (2 consecutive incomplete reads). I think it would be worth testing
> it.
> A very simple way to test it in your environment would be to chain two
> instances, one in TCP mode deciphering, and one in HTTP mode.
>

That's clever. I think for a realistic test we'd need a SPDY backend
though, since that's the only way we can actually get the multiplexed
streams flowing in parallel.


> > In this instance this logic wouldn't work, since HAProxy doesn't
> > have any knowledge or understanding of spdy / http/2 streams -- we'd
> start
> > the entire connection with small records, but then eventually upgrade it
> to
> > 16KB and keep it there, correct?
>
> It's not kept, it really depends on the transfer sizes all along. It
> matches
> more or less what you explained at the beginning of this thread, but based
> on transfer sizes at the lower layers.


Yep, this makes sense now - thanks.


>  > Any clever solutions for this? And on that note, are there future plans
> to
> > add "http/2" smarts to HAProxy, such that we can pick apart different
> > streams within a session, etc?
>
> Yes, I absolutely want to implement HTTP/2 but it will be time consuming
> and
> we won't have this for 1.5 at all. I also don't want to implement SPDY nor
> too early releases of 2.0, just because whatever we do will take a lot of
> time. Haproxy is a low level component, and each protocol adaptation is
> expensive to do. Not as much expensive as what people have to do with
> ASICs,
> but still harder than what some other products can do by using a small lib
> to perform the abstraction.
>

Makes sense, and great to hear!


> One of the huge difficulties we'll face will be to manage multiple streams
> over one connection. I think it will change the current paradigm of how
> requests are instanciated (which already started). From the very first
> version, we instanciated one "session" upon accept(), and this session
> contains buffers on which analyzers are plugged. The HTTP parsers are
> such analyzers. All the states and counters are stored at the session
> level. In 1.5, we started to change a few things. A connection is
> instanciated upon accept, then the session allocated after the connection
> is initialized (eg: SSL handshake complete). But splitting the sessions
> between

Re: HAProxy Question

2014-02-06 Thread Ben Timby
TCP mode load balancing would treat each TCP quad (source ip/source port,
dest ip/dest port), stream, or flow as a "session"; in other words, the
TCP stream is the basic unit of TCP load balancing.

You can enable the stats http interface and monitor that in your browser
for some useful metrics such as session count etc. There are also tools
such as hatop that will monitor the stats socket (unix domain socket) and
print a summary on the console.

See "stats *" directives in manual...
http://haproxy.1wt.eu/download/1.5/doc/configuration.txt
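
As a minimal sketch, enabling the stats page and the stats socket looks
roughly like this (URI, port, credentials and paths are placeholders):

    global
        stats socket /var/run/haproxy.sock mode 600 level admin

    listen stats
        bind :8404
        mode http
        stats enable
        stats uri /stats
        stats auth admin:changeme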


Re: optimizing TLS time to first byte

2014-02-06 Thread Willy Tarreau
Hi Ilya,

On Thu, Feb 06, 2014 at 04:14:14PM -0800, Ilya Grigorik wrote:
(...)
> > I preferred to only rely on CF_STREAMER and ignore the _FAST variant
> > because it would only favor high bandwidth clients (it's used to
> > enable splice() in fact). But I thought that CF_STREAMER alone would
> > do the right job. And your WPT test seems to confirm this, when we
> > look at the bandwidth usage!
> 
> Gotcha, thanks. As a follow up question, is it possible for me to control
> the size of the read buffer?

Yes, in the global section, you can set :

  - tune.bufsize : size of the buffer
  - tune.maxrewrite : reserve at the end of the buffer which is left
untouched when receiving HTTP headers

So during the headers phase, the buffer is considered full with
(bufsize-maxrewrite) bytes. After that, it's bufsize only.
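
For example, in the global section (the values below are simply the
compiled-in defaults shown in the build output earlier in this digest):

    global
        tune.bufsize    16384   # size of the request/response buffer
        tune.maxrewrite 8192    # reserve kept free for header rewrites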

> > > This works great if we're talking to a
> > > backend in "http" mode: we parse the HTTP/1.x protocol and detect when a
> > > new request is being processed, etc. However, what if I'm using HAProxy
> > to
> > > terminate TLS (+alpn negotiate) and then route the data to a "tcp" mode
> > > backend.. which is my spdy / http/2 server talking over a non-encrypted
> > > channel.
> >
> > Ah good point. I *suspect* that in practice it will work because :
> >
> >   - the last segment of the first transfer will almost always be incomplete
> > (you don't always transfer exact multiples of the buffer size) ;
> >   - the first response for the next request will almost always be
> > incomplete
> > (headers and not all data)
> >
> 
> Ah, clever. To make this more interesting, say we have multiple streams in
> flight: the frames may be interleaved and some streams may finish sooner
> than others, but since multiple are in flight, chances are we'll be able to
> fill the read buffer until the last stream completes.. which is actually
> exactly what we want: we wouldn't want to reset the window at end of each
> stream, but only when the connection goes quiet!

But then if we have multiple streams in flight, chances are that almost
all reads will be large enough to fill the buffer. This will certainly
not always be the case, but even if we're doing incomplete reads, we'll
send all what we have at once until there are at least two consecutive
incomplete reads. That said, the best way to deal with this will obviously
be to implement support for the upper protocols themselves at some point.

> > So if we're in this situation, this will be enough to reset the CF_STREAMER
> > flag (2 consecutive incomplete reads). I think it would be worth testing
> > it.
> > A very simple way to test it in your environment would be to chain two
> > instances, one in TCP mode deciphering, and one in HTTP mode.
> >
> 
> That's clever. I think for a realistic test we'd need a SPDY backend
> though, since that's the only way we can actually get the multiplexed
> streams flowing in parallel.

Yes it would be interesting to know how it behaves.

> > One of the huge difficulties we'll face will be to manage multiple streams
> > over one connection. I think it will change the current paradigm of how
> > requests are instanciated (which already started). From the very first
> > version, we instanciated one "session" upon accept(), and this session
> > contains buffers on which analyzers are plugged. The HTTP parsers are
> > such analyzers. All the states and counters are stored at the session
> > level. In 1.5, we started to change a few things. A connection is
> > instanciated upon accept, then the session allocated after the connection
> > is initialized (eg: SSL handshake complete). But splitting the sessions
> > between multiple requests will be quite complex. For example, I fear
> > that we'll have to always copy data because we'll have multiple
> > connections on one side and a single multiplexed one on the other side.
> > You can take a look at doc/internal/entities.pdf if you're interested.
> >
> 
> Yep, and you guys are not the only ones that will have to go through this
> architectural shift... I think many of the popular servers (Apache in
> particular comes to mind), might have to seriously reconsider their
> internal architecture. Not an easy thing to do, but I think it'll be worth
> it. :-)

Yes but the difficulty is that we also try to remain performant. At the
moment, we have no problem load-balancing videos or moderately large
objects (images) at 40 Gbps through a single-socket Xeon. I really fear
that the architecture required for HTTP/2 will make this drop significantly,
just because of the smaller windows, extra send/recv calls, and possibly
extra copies. And the worst thing would be to lose this performance in
HTTP/1 just because of the architecture shift needed to support HTTP/2.
We'll see...

Cheers,
Willy