Re: Load Balancing with haproxy makes my application slower?

2011-01-30 Thread Willy Tarreau
On Sun, Jan 30, 2011 at 06:34:34AM -0700, Sean Hess wrote:
> Ah, shoot, I forgot to mention that I'm running that ab test on ten boxes in
> parallel, so actual concurrency level is 10x what ab spits out. That's why I
> put "1500" above the one that had a concurrency of 150.
> 
> Sorry for the confusion.

no problem.

> I'm currently looking into keepalive stuff, going with a simpler config:
> 
> listen main *:80
> mode http
> balance roundrobin
> option http-server-close
> server  api1 api1:80 check
> server  api2 api2:80 check
> server  api3 api3:80 check
> server  api4 api4:80 check
> 
> It sounds like http-server-close keeps the client -> haproxy connection
> open, but closes the haproxy -> app server connection between requests. It
> doesn't really make sense to me why that would be faster than keeping them
> all open all the time, but yes, I should examine my logs and figure out
> where the holdup is.

Right now with option httpclose, connections are advertised as closed but
not actively closed: the client waits for the server to close. If the server
does not, or delays the close for whatever reason, that can cause what
you're observing. In your logs, you'd see short request/connect/response
times but a high total time.
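As an aside: those per-request timings come from "option httplog", whose timer field is Tq/Tw/Tc/Tr/Tt (request, queue, connect, response, total, all in milliseconds). A small sketch of pulling that field apart; the sample values below are made up, not from Sean's logs:

```python
# Split the slash-separated timer field of an haproxy 1.4 "option httplog"
# line. Tq = time to receive the request, Tw = time spent queued,
# Tc = connect time, Tr = server response time, Tt = total session time
# (all in milliseconds; Tt includes the close phase).
def parse_timers(field):
    names = ["Tq", "Tw", "Tc", "Tr", "Tt"]
    return dict(zip(names, (int(v) for v in field.split("/"))))

# Hypothetical sample matching the symptom described above:
t = parse_timers("3/0/1/2/1130")
unaccounted = t["Tt"] - (t["Tq"] + t["Tw"] + t["Tc"] + t["Tr"])
print(unaccounted)  # 1124 ms spent after the response, i.e. waiting on close
```

Short request/connect/response timers combined with a large total is the signature of a delayed close rather than a slow request.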

Regards,
Willy




Re: Load Balancing with haproxy makes my application slower?

2011-01-30 Thread Sean Hess
Ah, shoot, I forgot to mention that I'm running that ab test on ten boxes in
parallel, so actual concurrency level is 10x what ab spits out. That's why I
put "1500" above the one that had a concurrency of 150.

Sorry for the confusion.

I'm currently looking into keepalive stuff, going with a simpler config:

listen main *:80
mode http
balance roundrobin
option http-server-close
server  api1 api1:80 check
server  api2 api2:80 check
server  api3 api3:80 check
server  api4 api4:80 check

It sounds like http-server-close makes the client -> haproxy connection stay
open, but haproxy -> app server close between requests. It doesn't really
make sense to me why that would be faster than keeping them all open all the
time, but yes, I should examine my logs and figure out where the holdup is.

I feel like we're getting close! Thank you so much for your help!



On Sun, Jan 30, 2011 at 5:05 AM, Willy Tarreau wrote:

> On Sat, Jan 29, 2011 at 04:01:33PM -0700, Sean Hess wrote:
> > Unfortunately, the results are almost exactly the same with haproxy 1.4
> and
> > those changes you recommended. I'm so confused...
>
> Your numbers indicate a big problem somewhere:
>
>  Concurrency Level: 200
>  Requests per second: 176.56 [#/sec] (mean)
>  Time per request: 1132.776 [ms] (mean)
>
> That's much too slow; you should get approximately 100-200 times that.
>
> What do the logs say? They will report details about where each request
> spends time (connection, headers, response, ...).
>
> One thing that could happen would be that the server ignores the
> "Connection: close" header and keeps the connection open for
> 1 second. With 1.4, you'd better use "option http-server-close"
> and remove "option httpclose". This option actively closes the
> server-side connection and does not depend on the server's will
> to comply with the request.
>
> Regards,
> Willy
>
>


Re: Load Balancing with haproxy makes my application slower?

2011-01-30 Thread Willy Tarreau
On Sat, Jan 29, 2011 at 04:01:33PM -0700, Sean Hess wrote:
> Unfortunately, the results are almost exactly the same with haproxy 1.4 and
> those changes you recommended. I'm so confused...

Your numbers indicate a big problem somewhere:

  Concurrency Level: 200
  Requests per second: 176.56 [#/sec] (mean)
  Time per request: 1132.776 [ms] (mean)

That's much too slow; you should get approximately 100-200 times that.
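(As a sanity check, the three figures above are self-consistent under Little's law, which says that in steady state concurrency = throughput * mean response time; in other words, the test is latency-bound rather than dropping requests. A quick sketch:)

```python
# Little's law: in steady state, concurrency = throughput * mean response time.
# Plugging in the ab figures quoted above:
concurrency = 200
rps = 176.56                # "Requests per second (mean)"
mean_response_s = 1.132776  # "Time per request (mean)", converted to seconds
print(round(rps * mean_response_s))  # 200: every concurrent slot stays busy
```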

What do the logs say? They will report details about where each request
spends time (connection, headers, response, ...).

One thing that could happen would be that the server ignores the
"Connection: close" header and keeps the connection open for
1 second. With 1.4, you'd better use "option http-server-close"
and remove "option httpclose". This option actively closes the
server-side connection and does not depend on the server's will
to comply with the request.

Regards,
Willy




Re: Load Balancing with haproxy makes my application slower?

2011-01-30 Thread John Feuerstein
Hi Sean,

> So, it looks about the same. The single instance outperforms the
> cluster, which doesn't make any sense. I'll try those changes and see if
> it gets any better. 

at first glance it looks like your problem (increased latency) could be
related to connection setup costs:

You're using "option httpclose", which closes the client and server
connection after every request, requiring two new connections to be
created for each and every request (client -> haproxy, haproxy -> server).

I didn't see the implementation/configuration of your application
server, but it will end up with at least half the amound of established
connections. And if using keepalive, it will be even less.

So to get this sorted out, make sure to know how keep-alive and
connections are handled in both scenarios. This can be a big factor,
especially when doing synthetic benchmarks.

For the haproxy 1.4+ case, you might want to use "option
http-server-close" if you want real round-robin on a per-request basis. This
will keep the client connection open, but close the server connection
after every request, which can result in better latency if clients
support keep-alive.

Have a look at this paragraph from the haproxy 1.4 doc:

> By default HAProxy operates in a tunnel-like mode with regards to persistent
> connections: for each connection it processes the first request and forwards
> everything else (including additional requests) to selected server. Once
> established, the connection is persisted both on the client and server
> sides. Use "option http-server-close" to preserve client persistent 
> connections
> while handling every incoming request individually, dispatching them one after
> another to servers, in HTTP close mode. Use "option httpclose" to switch both
> sides to HTTP close mode. "option forceclose" and "option
> http-pretend-keepalive" help working around servers misbehaving in HTTP close
> mode.
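To summarize the modes that paragraph describes, here is how they map onto configuration options (a sketch based on the 1.4 doc, not a complete config):

```
# tunnel (default)          : only the first request of a connection is
#                             processed; the rest is forwarded as-is, so no
#                             per-request load balancing
# option http-server-close  : keep the client connection alive, actively
#                             close the server side after each request
# option httpclose          : advertise "Connection: close" on both sides
#                             (passive; does not actively close)
# option forceclose         : actively close both sides once the response ends
# option http-pretend-keepalive : work around servers that misbehave in
#                             HTTP close mode
```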

Best regards,
John



Re: Load Balancing with haproxy makes my application slower?

2011-01-29 Thread Sean Hess
Unfortunately, the results are almost exactly the same with haproxy 1.4 and
those changes you recommended. I'm so confused...

Thanks for your help!

On Sat, Jan 29, 2011 at 3:25 PM, Sean Hess wrote:

> Ok, here are the results from apache benchmark *before* making any other
> changes to the system (1.4, timeouts, etc).
>
> The Test - https://gist.github.com/802251
>
> The Results against the 1*256 haproxy -> 4*512 node cluster -
> https://gist.github.com/802268
> Here's the haproxy status after the test -
> http://dl.dropbox.com/u/1165308/ab_haproxy.png
>
> The Results against the 1*512 node instance -
> https://gist.github.com/802271
>
> So, it looks about the same. The single instance outperforms the cluster,
> which doesn't make any sense. I'll try those changes and see if it gets any
> better.
>
>
>
> On Sat, Jan 29, 2011 at 2:44 PM, Sean Hess wrote:
>
>> Thanks Joel,
>>
>> I'm working on converting the test to ab (shouldn't take long) and trying
>> out 1.4, but to answer your questions. RSTavg is average response time.
>> There's a 500ms timer in the http response, and some serialization. It's
>> over the local network. So that should be about 550ms under no load.
>>
>> Users per second, yes.
>>
>> I didn't use ab to start because I'm not interested in response time, per
>> se, but at what load response time starts to fail. I don't know an effective
>> way to do this with ab, partially because it doesn't support stepping (my
>> test steps through the concurrency levels specified by "users", I should
>> rename Usersps to sessions per second, because if a "user" takes less
>> than 1 second they start again right away). My testing harness allows me to
>> write tests in my application language, blah blah.. you get the idea. But
>> yes, I'll run ab and see if I get the same results.
>>
>> I'll also try your changes to the timeouts. Thanks for your help!
>>
>>
>>
>> On Sat, Jan 29, 2011 at 1:00 PM, Joel Krauska wrote:
>>
>>> Speculation, but using a newer version of haproxy (1.4) might also
>>> improve performance for you.
>>>
>>> --Joel
>>>
>>>
>>> On 1/29/11 10:53 AM, Sean Hess wrote:
>>>
 I'm performing real-world load tests for the first time, and my results
 aren't making a lot of sense.

 Just to make sure I have the test harness working, I'm not testing
 "real" application code yet, I'm just hitting a web page that simulates
 an IO delay (500 ms), and then serializes out some json (about 85 bytes
 of content). It's not accessing the database, or doing anything other
 than printing out that data. My application servers are written in
 node.js, on 512MB VPSes on rackspace (centos55).

 Here are the results that don't make sense:

 https://gist.github.com/802082

 When I run this test against a single application server (bottom one),
 You can see that it stays pretty flat (about 550ms response time) until
 it gets to 1500 simultaneous users, when it starts to error out and get
 slow.

 When I run it against an haproxy instance in front of 4 of the same
 nodes (top one), my performance is worse. It doesn't drop any
 connections, but the response time edges up much earlier than against a
 single node.

 Does this make any sense to you? Does haproxy need more RAM? I was
 watching the box while the test was running and the haproxy process
 didn't get higher than 20% CPU and 10% RAM.

 Please help, thanks!

>>>
>>>
>>
>


Re: Load Balancing with haproxy makes my application slower?

2011-01-29 Thread Sean Hess
Ok, here are the results from apache benchmark *before* making any other
changes to the system (1.4, timeouts, etc).

The Test - https://gist.github.com/802251

The Results against the 1*256 haproxy -> 4*512 node cluster -
https://gist.github.com/802268
Here's the haproxy status after the test -
http://dl.dropbox.com/u/1165308/ab_haproxy.png

The Results against the 1*512 node instance - https://gist.github.com/802271

So, it looks about the same. The single instance outperforms the cluster,
which doesn't make any sense. I'll try those changes and see if it gets any
better.



On Sat, Jan 29, 2011 at 2:44 PM, Sean Hess wrote:

> Thanks Joel,
>
> I'm working on converting the test to ab (shouldn't take long) and trying
> out 1.4, but to answer your questions. RSTavg is average response time.
> There's a 500ms timer in the http response, and some serialization. It's
> over the local network. So that should be about 550ms under no load.
>
> Users per second, yes.
>
> I didn't use ab to start because I'm not interested in response time, per
> se, but at what load response time starts to fail. I don't know an effective
> way to do this with ab, partially because it doesn't support stepping (my
> test steps through the concurrency levels specified by "users", I should
> rename Usersps to sessions per second, because if a "user" takes less than
> 1 second they start again right away). My testing harness allows me to write
> tests in my application language, blah blah.. you get the idea. But yes,
> I'll run ab and see if I get the same results.
>
> I'll also try your changes to the timeouts. Thanks for your help!
>
>
>
> On Sat, Jan 29, 2011 at 1:00 PM, Joel Krauska wrote:
>
>> Speculation, but using a newer version of haproxy (1.4) might also improve
>> performance for you.
>>
>> --Joel
>>
>>
>> On 1/29/11 10:53 AM, Sean Hess wrote:
>>
>>> I'm performing real-world load tests for the first time, and my results
>>> aren't making a lot of sense.
>>>
>>> Just to make sure I have the test harness working, I'm not testing
>>> "real" application code yet, I'm just hitting a web page that simulates
>>> an IO delay (500 ms), and then serializes out some json (about 85 bytes
>>> of content). It's not accessing the database, or doing anything other
>>> than printing out that data. My application servers are written in
>>> node.js, on 512MB VPSes on rackspace (centos55).
>>>
>>> Here are the results that don't make sense:
>>>
>>> https://gist.github.com/802082
>>>
>>> When I run this test against a single application server (bottom one),
>>> You can see that it stays pretty flat (about 550ms response time) until
>>> it gets to 1500 simultaneous users, when it starts to error out and get
>>> slow.
>>>
>>> When I run it against an haproxy instance in front of 4 of the same
>>> nodes (top one), my performance is worse. It doesn't drop any
>>> connections, but the response time edges up much earlier than against a
>>> single node.
>>>
>>> Does this make any sense to you? Does haproxy need more RAM? I was
>>> watching the box while the test was running and the haproxy process
>>> didn't get higher than 20% CPU and 10% RAM.
>>>
>>> Please help, thanks!
>>>
>>
>>
>


Re: Load Balancing with haproxy makes my application slower?

2011-01-29 Thread Sean Hess
(Sorry for the double-post Joel, I accidentally only sent this to you
instead of the mailing list)

Thanks Joel,

I'm working on converting the test to ab (shouldn't take long) and trying
out 1.4, but to answer your questions. RSTavg is average response time.
There's a 500ms timer in the http response, and some serialization. It's
over the local network. So that should be about 550ms under no load.

Users per second, yes.

I didn't use ab to start because I'm not interested in response time, per
se, but at what load response time starts to fail. I don't know an effective
way to do this with ab, partially because it doesn't support stepping (my
test steps through the concurrency levels specified by "users", I should
rename Usersps to sessions per second, because if a "user" takes less than 1
second they start again right away). My testing harness allows me to write
tests in my application language, blah blah.. you get the idea. But yes,
I'll run ab and see if I get the same results.

I'll also try your changes to the timeouts. Thanks for your help!

On Sat, Jan 29, 2011 at 12:57 PM, Joel Krauska wrote:

> Sean,
>
> I think it would be helpful to further explain your testing scenario.
>
> How do you simulate concurrent users?
>
> What is RSTav?
>
> Usersps is sessions per second??
>
> I think most folks use Apache Bench
> http://httpd.apache.org/docs/2.0/programs/ab.html
> as a fairly common industry standard for HTTP server performance.
>
> Would you consider rerunning your test using ab as well?
>
> Equivalently, you might look at httperf (see the haproxy web page for some
> notes)
>
>
> One tuning thing you might try is dropping down your timeouts.
> You have:
>timeout connect 1
>timeout client 30
>timeout server 30
>
> I typically use an order of magnitude smaller.
> 5000
> 5
> 5
> (these are example defaults listed in section 2.3 of the HAProxy
> docs)
> http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
>
>
> Best of luck,
>
> Joel
>
>
>
> On 1/29/11 10:53 AM, Sean Hess wrote:
>
>> I'm performing real-world load tests for the first time, and my results
>> aren't making a lot of sense.
>>
>> Just to make sure I have the test harness working, I'm not testing
>> "real" application code yet, I'm just hitting a web page that simulates
>> an IO delay (500 ms), and then serializes out some json (about 85 bytes
>> of content). It's not accessing the database, or doing anything other
>> than printing out that data. My application servers are written in
>> node.js, on 512MB VPSes on rackspace (centos55).
>>
>> Here are the results that don't make sense:
>>
>> https://gist.github.com/802082
>>
>> When I run this test against a single application server (bottom one),
>> You can see that it stays pretty flat (about 550ms response time) until
>> it gets to 1500 simultaneous users, when it starts to error out and get
>> slow.
>>
>> When I run it against an haproxy instance in front of 4 of the same
>> nodes (top one), my performance is worse. It doesn't drop any
>> connections, but the response time edges up much earlier than against a
>> single node.
>>
>> Does this make any sense to you? Does haproxy need more RAM? I was
>> watching the box while the test was running and the haproxy process
>> didn't get higher than 20% CPU and 10% RAM.
>>
>> Please help, thanks!
>>
>
>


Re: Load Balancing with haproxy makes my application slower?

2011-01-29 Thread Joel Krauska

Sean,

I think it would be helpful to further explain your testing scenario.

How do you simulate concurrent users?

What is RSTav?

Usersps is sessions per second??

I think most folks use Apache Bench
http://httpd.apache.org/docs/2.0/programs/ab.html
as a fairly common industry standard for HTTP server performance.

Would you consider rerunning your test using ab as well?

Equivalently, you might look at httperf (see the haproxy web page for
some notes)



One tuning thing you might try is dropping down your timeouts.
You have:
timeout connect 1
timeout client 30
timeout server 30

I typically use an order of magnitude smaller.
5000
5
5
(these are example defaults listed in section 2.3 of the HAProxy docs)
http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
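For reference, the defaults block in that section of the 1.4 doc looks roughly like this (a sketch; the archive appears to have mangled Joel's exact numbers, so check the linked documentation):

```
defaults
    mode    http
    timeout connect 5000ms
    timeout client  50000ms
    timeout server  50000ms
```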


Best of luck,

Joel


On 1/29/11 10:53 AM, Sean Hess wrote:

I'm performing real-world load tests for the first time, and my results
aren't making a lot of sense.

Just to make sure I have the test harness working, I'm not testing
"real" application code yet, I'm just hitting a web page that simulates
an IO delay (500 ms), and then serializes out some json (about 85 bytes
of content). It's not accessing the database, or doing anything other
than printing out that data. My application servers are written in
node.js, on 512MB VPSes on rackspace (centos55).

Here are the results that don't make sense:

https://gist.github.com/802082

When I run this test against a single application server (bottom one),
You can see that it stays pretty flat (about 550ms response time) until
it gets to 1500 simultaneous users, when it starts to error out and get
slow.

When I run it against an haproxy instance in front of 4 of the same
nodes (top one), my performance is worse. It doesn't drop any
connections, but the response time edges up much earlier than against a
single node.

Does this make any sense to you? Does haproxy need more RAM? I was
watching the box while the test was running and the haproxy process
didn't get higher than 20% CPU and 10% RAM.

Please help, thanks!





Re: Load Balancing with haproxy makes my application slower?

2011-01-29 Thread Sean Hess
Oh, here's my haproxy config

https://gist.github.com/802098

and here's what my haproxy status looks like shortly after the test

http://dl.dropbox.com/u/1165308/haproxy.png

On Sat, Jan 29, 2011 at 11:53 AM, Sean Hess wrote:

> I'm performing real-world load tests for the first time, and my results
> aren't making a lot of sense.
>
> Just to make sure I have the test harness working, I'm not testing "real"
> application code yet, I'm just hitting a web page that simulates an IO delay
> (500 ms), and then serializes out some json (about 85 bytes of content).
> It's not accessing the database, or doing anything other than printing out
> that data. My application servers are written in node.js, on 512MB VPSes on
> rackspace (centos55).
>
> Here are the results that don't make sense:
>
> https://gist.github.com/802082
>
> When I run this test against a single application server (bottom one), you
> can see that it stays pretty flat (about 550ms response time) until it gets
> to 1500 simultaneous users, when it starts to error out and get slow.
>
> When I run it against an haproxy instance in front of 4 of the same nodes
> (top one), my performance is worse. It doesn't drop any connections, but the
> response time edges up much earlier than against a single node.
>
> Does this make any sense to you? Does haproxy need more RAM? I was watching
> the box while the test was running and the haproxy process didn't get higher
> than 20% CPU and 10% RAM.
>
> Please help, thanks!


Load Balancing with haproxy makes my application slower?

2011-01-29 Thread Sean Hess
I'm performing real-world load tests for the first time, and my results
aren't making a lot of sense.

Just to make sure I have the test harness working, I'm not testing "real"
application code yet, I'm just hitting a web page that simulates an IO delay
(500 ms), and then serializes out some json (about 85 bytes of content).
It's not accessing the database, or doing anything other than printing out
that data. My application servers are written in node.js, on 512MB VPSes on
rackspace (centos55).

Here are the results that don't make sense:

https://gist.github.com/802082

When I run this test against a single application server (bottom one), you
can see that it stays pretty flat (about 550ms response time) until it gets
to 1500 simultaneous users, when it starts to error out and get slow.

When I run it against an haproxy instance in front of 4 of the same nodes
(top one), my performance is worse. It doesn't drop any connections, but the
response time edges up much earlier than against a single node.

Does this make any sense to you? Does haproxy need more RAM? I was watching
the box while the test was running and the haproxy process didn't get higher
than 20% CPU and 10% RAM.

Please help, thanks!