Re: haproxy invoked oom-killer
Lukas, from what I see it looks like swap was not being utilized. It was available but not used.

On Saturday, February 2, 2013, Lukas Tribus wrote:
Notice that having the HAProxy box swapping is a huge performance killer and you absolutely do not want to do that, so apart from tuning/configuring the OOM killer you should track the issue down and avoid memory depletion and swapping in the first place.

Date: Sat, 2 Feb 2013 12:34:03 +
Subject: Re: haproxy invoked oom-killer
From: cont...@jpluscplusm.com
To: haproxy@formilux.org

On 1 February 2013 21:57, Dusty Doris du...@doris.name wrote:
oom-killer just killed my haproxy instance. Anyone know if there is a way to prioritize haproxy and have it get killed after something else? Or, any tuning that might help.

https://www.google.co.uk/search?q=linux+exclude+proces+from+oom-killer

--
Jonathan Matthews // Oxford, London, UK
http://www.jpluscplusm.com/contact.html
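For reference, the approach behind that search result is to lower the process's OOM score. A minimal sketch, assuming a kernel new enough to expose /proc/<pid>/oom_score_adj (2.6.36 and later); the pidof-based lookup and the -1000 value are illustrative, not from the thread:

    # Make the running haproxy process(es) effectively exempt from the
    # OOM killer; -1000 means "never kill", while smaller negative values
    # merely deprioritize. Must run as root, and again after each restart.
    for pid in $(pidof haproxy); do
        echo -1000 > /proc/$pid/oom_score_adj
    done

    # Older kernels expose the legacy knob oom_adj instead, where -17
    # means "never kill":
    # echo -17 > /proc/$pid/oom_adj

Note this only redirects the OOM killer at some other process; as Lukas says above, it does not address the underlying memory depletion.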
Re: haproxy invoked oom-killer
1.4.2. I agree; I don't understand how haproxy was utilizing all the RAM. I'm not actually certain it was. Could you elaborate on the sysctls comment?

On Saturday, February 2, 2013, Baptiste wrote:
Hi, my guess is that you have configured too-large buffers in HAProxy or in your TCP sysctls. There is no reason for HAProxy to use all the available memory, unless you're running a 1.5-dev with a memory leak (dev15 had some kind of leak; at least I've already seen it being OOMed). Could you please let us know which version you are running? Cheers

On Sat, Feb 2, 2013 at 6:01 PM, Lukas Tribus luky...@hotmail.com wrote:
Notice that having the HAProxy box swapping is a huge performance killer and you absolutely do not want to do that, so apart from tuning/configuring the OOM killer you should track the issue down and avoid memory depletion and swapping in the first place.

Date: Sat, 2 Feb 2013 12:34:03 +
Subject: Re: haproxy invoked oom-killer
From: cont...@jpluscplusm.com
To: haproxy@formilux.org

On 1 February 2013 21:57, Dusty Doris du...@doris.name wrote:
oom-killer just killed my haproxy instance. Anyone know if there is a way to prioritize haproxy and have it get killed after something else? Or, any tuning that might help.

https://www.google.co.uk/search?q=linux+exclude+proces+from+oom-killer

--
Jonathan Matthews // Oxford, London, UK
http://www.jpluscplusm.com/contact.html
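On the sysctls question: the settings Baptiste means are the per-socket TCP buffer limits, which multiply across thousands of connections. A quick way to inspect them, plus HAProxy's own per-connection buffer size (a sketch; the config path is a common default, and the values shown by sysctl are limits, not recommendations):

    # Per-socket TCP buffer sizes (min / default / max, in bytes).
    # Large max values here get multiplied by the number of open sockets.
    sysctl net.ipv4.tcp_rmem
    sysctl net.ipv4.tcp_wmem
    sysctl net.core.rmem_max net.core.wmem_max

    # HAProxy allocates two buffers per connection, sized by the global
    # tune.bufsize directive (16384 bytes by default in 1.4):
    grep -i 'tune\.' /etc/haproxy/haproxy.cfg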
haproxy invoked oom-killer
oom-killer just killed my haproxy instance. Anyone know if there is a way to prioritize haproxy and have it get killed after something else? Or, any tuning that might help. It looked like I had plenty of swap space available when it decided to kill haproxy. Thanks for any advice.

Linux 3.3.7-1.fc16.x86_64
HA-Proxy version 1.4.20

# free -m
             total       used       free     shared    buffers     cached
Mem:           995        357        637          0          3         25
-/+ buffers/cache:        328        667
Swap:         2015         92       1923

messages:

Feb 1 15:48:03 prx2 kernel: [21556065.639023] sched: RT throttling activated
Feb 1 15:48:03 prx2 heartbeat: [15556]: WARN: Gmain_timeout_dispatch: Dispatch function for check for signals was delayed 1470 ms (> 1010 ms) before being called (GSource: 0x20b4c20)
Feb 1 15:48:03 prx2 heartbeat: [15556]: info: Gmain_timeout_dispatch: started at 2588817760 should have started at 2588817613
Feb 1 15:48:14 prx2 kernel: [21556076.952895] oom_kill_process: 997778 callbacks suppressed
Feb 1 15:48:14 prx2 kernel: [21556076.952900] haproxy invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=0, oom_score_adj=0
Feb 1 15:48:14 prx2 kernel: [21556076.952934] haproxy cpuset=/ mems_allowed=0
Feb 1 15:48:14 prx2 kernel: [21556076.952946] Pid: 9654, comm: haproxy Not tainted 3.3.7-1.fc16.x86_64 #1
Feb 1 15:48:14 prx2 kernel: [21556076.952948] Call Trace:
Feb 1 15:48:14 prx2 kernel: [21556076.952978] [810c7811] ? cpuset_print_task_mems_allowed+0x91/0xa0
Feb 1 15:48:14 prx2 kernel: [21556076.952993] [81123cd0] dump_header+0x80/0x1d0
Feb 1 15:48:14 prx2 kernel: [21556076.952997] [81124125] oom_kill_process+0x85/0x290
Feb 1 15:48:14 prx2 kernel: [21556076.953000] [81124770] out_of_memory+0x1c0/0x400
Feb 1 15:48:14 prx2 kernel: [21556076.953004] [81129d7f] __alloc_pages_nodemask+0x8df/0x8f0
Feb 1 15:48:14 prx2 kernel: [21556076.953016] [81521652] ? __ip_local_out+0xa2/0xb0
Feb 1 15:48:14 prx2 kernel: [21556076.953022] [81160a93] alloc_pages_current+0xa3/0x110
Feb 1 15:48:14 prx2 kernel: [21556076.953025] [8152adde] tcp_sendmsg+0x53e/0xdf0
Feb 1 15:48:14 prx2 kernel: [21556076.953031] [81550e74] inet_sendmsg+0x64/0xb0
Feb 1 15:48:14 prx2 kernel: [21556076.953043] [8126dc63] ? selinux_socket_sendmsg+0x23/0x30
Feb 1 15:48:14 prx2 kernel: [21556076.953052] [814ced17] sock_sendmsg+0x117/0x130
Feb 1 15:48:14 prx2 kernel: [21556076.953055] [81521652] ? __ip_local_out+0xa2/0xb0
Feb 1 15:48:14 prx2 kernel: [21556076.953065] [81067d6e] ? mod_timer+0x13e/0x2f0
Feb 1 15:48:14 prx2 kernel: [21556076.953069] [814d220d] sys_sendto+0x13d/0x190
Feb 1 15:48:14 prx2 kernel: [21556076.953073] [810d345c] ? __audit_syscall_entry+0xcc/0x310
Feb 1 15:48:14 prx2 kernel: [21556076.953076] [810d3a76] ? __audit_syscall_exit+0x3d6/0x410
Feb 1 15:48:14 prx2 kernel: [21556076.953084] [815fc529] system_call_fastpath+0x16/0x1b
Feb 1 15:48:14 prx2 kernel: [21556076.953086] Mem-Info:
Feb 1 15:48:14 prx2 kernel: [21556076.953088] Node 0 DMA per-cpu:
Feb 1 15:48:14 prx2 kernel: [21556076.953198] CPU0: hi:   0, btch: 1 usd: 0
Feb 1 15:48:14 prx2 kernel: [21556076.953200] CPU1: hi:   0, btch: 1 usd: 0
Feb 1 15:48:14 prx2 kernel: [21556076.953203] CPU2: hi:   0, btch: 1 usd: 0
Feb 1 15:48:14 prx2 kernel: [21556076.953205] CPU3: hi:   0, btch: 1 usd: 0
Feb 1 15:48:14 prx2 kernel: [21556076.953206] Node 0 DMA32 per-cpu:
Feb 1 15:48:14 prx2 kernel: [21556076.953209] CPU0: hi: 186, btch: 31 usd: 56
Feb 1 15:48:14 prx2 kernel: [21556076.953210] CPU1: hi: 186, btch: 31 usd: 0
Feb 1 15:48:14 prx2 kernel: [21556076.953212] CPU2: hi: 186, btch: 31 usd: 0
Feb 1 15:48:14 prx2 kernel: [21556076.953214] CPU3: hi: 186, btch: 31 usd: 29
Feb 1 15:48:14 prx2 kernel: [21556076.953218] active_anon:138 inactive_anon:194 isolated_anon:0
Feb 1 15:48:14 prx2 kernel: [21556076.953219] active_file:24 inactive_file:80 isolated_file:0
Feb 1 15:48:14 prx2 kernel: [21556076.953220] unevictable:4373 dirty:0 writeback:213 unstable:0
Feb 1 15:48:14 prx2 kernel: [21556076.953221] free:12235 slab_reclaimable:47686 slab_unreclaimable:25122
Feb 1 15:48:14 prx2 kernel: [21556076.953222] mapped:1506 shmem:2 pagetables:725 bounce:0
Feb 1 15:48:14 prx2 kernel: [21556076.953224] Node 0 DMA free:4640kB min:680kB low:848kB high:1020kB active_anon:44kB inactive_anon:84kB active_file:0kB inactive_file:44kB unevictable:352kB isolated(anon):0kB isolated(file):0kB present:15656kB mlocked:352kB dirty:0kB writeback:112kB mapped:352kB shmem:0kB slab_reclaimable:328kB slab_unreclaimable:276kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:592 all_unreclaimable? yes
Feb 1 15:48:14 prx2 kernel: [21556076.953236] lowmem_reserve[]: 0 992 992 992
Feb 1
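A note on the unused swap: in the Mem-Info dump above, anonymous memory is tiny (active_anon:138 pages), while slab_reclaimable:47686 plus slab_unreclaimable:25122 pages comes to roughly 284 MB of kernel slab (e.g. socket buffers), and kernel slab cannot be swapped out; that would explain swap sitting idle while the box runs out of RAM. A rough way to track this down over time, as Lukas suggests (a sketch; the log path is arbitrary):

    # Sample slab and TCP socket-memory counters once a minute so growth
    # can be correlated with traffic. /proc/meminfo and /proc/net/sockstat
    # are standard Linux interfaces.
    while true; do
        { date;
          grep -E '^(Slab|SReclaimable|SUnreclaim)' /proc/meminfo;
          cat /proc/net/sockstat; } >> /var/log/mem-track.log
        sleep 60
    done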
Soft Stop vs Disable
Could someone help me understand the difference between Soft Stop/Soft Start and Disable/Enable? I can see that when I disable a node in the web UI it is marked as MAINT, and when I issue a Soft Stop, the weight is changed to 0. How do those differences impact my connections?

What I am working on is automating deployments by disabling a server from the proxy, waiting until its connections reach 0, updating the code and restarting the process, then re-enabling the server in the proxy. I'd like to wait until all connections are closed before I begin the upgrade (we have some long-running file transfers that we don't want to cut off by dispatching to a different server).

From what I understand, setting the weight to 0 and then back to 1 will accomplish my goal. What I am curious about is whether I can accomplish the same goal by disabling the server into MAINT mode, or if using both weight and MAINT adds anything useful. Thanks for your help!
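For what it's worth, a rough sketch of that drain-then-upgrade sequence using the stats socket (the socket path and the app/web1 backend/server names are placeholders; "set weight" and "show stat" are standard stats-socket commands, and the current-session count is the scur field, column 5 of the show stat CSV):

    SOCK=/var/run/haproxy.sock
    SRV="app/web1"

    # Stop dispatching new connections to the server; established
    # connections are left alone and allowed to finish.
    echo "set weight $SRV 0" | socat stdio $SOCK

    # Poll until the server's current session count (scur) reaches 0.
    # The awk field tests must match the backend/server names in $SRV.
    while true; do
        scur=$(echo "show stat" | socat stdio $SOCK |
               awk -F, '$1 == "app" && $2 == "web1" { print $5 }')
        [ "$scur" = "0" ] && break
        sleep 5
    done

    # ... update the code and restart the application here ...

    # Put the server back into rotation.
    echo "set weight $SRV 1" | socat stdio $SOCK

One practical difference to verify on your version: a server at weight 0 keeps being health-checked and still accepts connections matched by an existing persistence entry (cookie or stick table), whereas MAINT stops checks and removes the server entirely, so persistent clients get redispatched. For draining long transfers, weight 0 is usually the gentler option.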
Re: HAProxy mod_rails (Passenger)
Right now, I simply have that file in the public directory so it's served by Apache directly, instead of placing it in a Rails queue. This does make haproxy blind to whether or not Rails is broken, because it's only checking to see if Apache is responding. However, if you want to take down the instance, you can just turn off Apache and haproxy will take it out of rotation. Or simply rename that file. But I realize that's not ideal. I still haven't come up with the best way to do this so that haproxy is checking the actual status of your Ruby processes. I'm thinking it might involve actually starting up a single mongrel/thin/ebb/webrick process, and then using mod_rewrite to send those particular httpchk requests to that process instead of into the Passenger queue.

In addition, I use the global queue in Passenger. That seems to help if we have any long-running actions. http://www.modrails.com/documentation/Users%20guide.html#PassengerUseGlobalQueue

Hope that is helpful.

On Fri, Feb 20, 2009 at 4:27 AM, Matthias Müller matthias.c.muel...@gmail.com wrote:
Hi there, it seems like the failing-requests issue is related to the check interval for "option httpchk /health_check.html". I guess these check requests are being queued by Apache itself. We have a couple of long-running requests on our machines, which are queued by Apache and by Passenger's internal queue; maybe the check requests are queued as well. So, if it takes longer than the specified check interval to complete the check request, HAProxy treats Apache as down. And as there is no other server to forward the request to, the request is dismissed. After increasing the check interval, HAProxy works fine. Does anyone have experience with having option httpchk enabled for Passenger? Thanks, Matt

Willy Tarreau wrote:
On Thu, Feb 19, 2009 at 10:02:36AM +0100, Matthias Müller wrote:
Hello there, I'm trying to find a suitable solution to load balance Rails applications via Passenger and HAProxy. Currently I'm doing a lot of testing using Apache Bench. The setup is as simple as follows: machine A: HAProxy; machine B: Apache with mod_rails. My test: 100 concurrent requests via Apache Bench. When running 100 concurrent requests against HAProxy, Apache Bench gets a lot of non-2XX responses and I get a lot of BADREQ entries in my HAProxy log, like:

Feb 19 08:50:25 localhost.localdomain haproxy[1890]: 10.0.0.1:33917 [19/Feb/2009:08:50:10.898] http-proxy http-proxy/member1 -1/13816/1/-1/14702 503 13757 - - 99/99/99/9/0 0/32 BADREQ

There's something odd in the log above. It implies that the request was sent into the queue waiting for a server to be free, but also that the request was invalid or perhaps empty. I suspect that AB has timed out slightly before haproxy while waiting for a server response, but I don't understand why we have BADREQ here; we should not have got a 503 with BADREQ. Could you please indicate what exact version this is? This would help explain why we can have BADREQ and 503 at the same time. Regards, Willy
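Since the fix Matthias describes is the check interval, one hedged way to confirm the diagnosis is to time the health-check URL while the backend is under the same ab load and compare against the interval HAProxy uses (in haproxy.cfg that is the "inter" parameter on the server line, 2000 ms by default, with "rise" and "fall" controlling the up/down thresholds). The machine-b hostname and request paths below are placeholders standing in for the thread's "machine B" setup:

    # Generate load comparable to the failing test in the background...
    ab -n 1000 -c 100 http://machine-b/some/rails/action &

    # ...and repeatedly time the health-check URL. If this regularly
    # exceeds HAProxy's "inter" value, checks will flap the server down.
    for i in 1 2 3 4 5; do
        curl -s -o /dev/null -w '%{time_total}s\n' http://machine-b/health_check.html
        sleep 2
    done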