Re: Speeding up mod_proxy_balancer on Windows
On Oct 13, 2008, at 4:42 PM, Ruediger Pluem wrote:

On 10/13/2008 10:04 PM, Jess Holle wrote:

Jess Holle wrote:

Ruediger Pluem wrote:

So if no one finds a registry entry to stop this RFC-violating behaviour

I'd love to see this solved by such a discovery, option 0.

I see only two options on Windows: 1. Fiddle around with GetTcpTable.

I've attached my incomplete code in this regard (as a diff against 2.2.9, which is what I used as the base for my changes) for what it's worth. There are TO_DO notes where I know I'm missing stuff. I tested basic use of GetTcpTable(), which solved the problem, but haven't completed my conversion to caching this data -- in part because I don't know where to allocate a lock to arbitrate access to this cached data.

I forgot the -u on my diff. Here's a unified diff.

Thanks for this. Given that it introduces a lot of platform-specific code to the proxy, and given the outstanding cache problem, I would like to wait for Bill's proposal to improve apr_socket_connect within APR, as this looks more appealing overall. If improving APR turns out not to be possible, I would come back to your patch.

++1
Re: Speeding up mod_proxy_balancer on Windows
Mladen Turk wrote:

Ruediger Pluem wrote:

Not exactly. I would prefer to fix the basic issue with Windows. If we need to support milliseconds for connection timeouts seems to be another story for me. Can some of the Windows gurus come to the rescue to either confirm and explain why it takes that long for connect to return after trying to connect to a closed port (on the same machine !!!) or if this must be a local issue on a specific machine.

According to Microsoft (http://support.microsoft.com/default.aspx/kb/314053):

TcpMaxConnectRetransmissions
Key: Tcpip\Parameters
Value Type: REG_DWORD - Number
Valid Range: 0 - 0x
Default: 2
Description: This parameter determines the number of times that TCP retransmits a connect request (SYN) before aborting the attempt. The retransmission timeout is doubled with each successive retransmission in a particular connect attempt. The initial timeout value is three seconds.

So, looks like with default of 2 retransmits there is at least 9 second delay.

I thought the problem was with local connects to a port on which no process is listening. In this case the OS should immediately send a RST (reset) and the client should not resend the SYN. I expect TcpMaxConnectRetransmissions to be used in the case where the remote server doesn't send any answer (neither RST nor SYN/ACK) and the client needs to retry the connection establishment after some timeout. That shouldn't be the case here, because the local system should always be able to send a RST if nothing is listening on the port, unless a local firewall with packet drop comes into the game (or name resolution timeouts, or ...).

Regards,
Rainer
Re: Speeding up mod_proxy_balancer on Windows
Ruediger Pluem wrote:

Not exactly. I would prefer to fix the basic issue with Windows. If we need to support milliseconds for connection timeouts seems to be another story for me. Can some of the Windows gurus come to the rescue to either confirm and explain why it takes that long for connect to return after trying to connect to a closed port (on the same machine !!!) or if this must be a local issue on a specific machine.

According to Microsoft (http://support.microsoft.com/default.aspx/kb/314053):

TcpMaxConnectRetransmissions
Key: Tcpip\Parameters
Value Type: REG_DWORD - Number
Valid Range: 0 - 0x
Default: 2
Description: This parameter determines the number of times that TCP retransmits a connect request (SYN) before aborting the attempt. The retransmission timeout is doubled with each successive retransmission in a particular connect attempt. The initial timeout value is three seconds.

So, looks like with default of 2 retransmits there is at least 9 second delay.

Regards
--
^(TM)
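The 9-second figure follows from the documented schedule: the first retransmission goes out after the initial 3-second timeout and the second after a further 6 seconds. A minimal sketch of that arithmetic, assuming only the doubling rule quoted from the KB article; the function name is hypothetical and the code does not model how the stack reacts to a RST:

```c
#include <assert.h>

/* Offset (in milliseconds) from the first SYN at which the last
 * retransmitted SYN is sent, assuming the timeout starts at initial_ms
 * and doubles after every retransmission (the rule quoted from the KB
 * article). Hypothetical helper, for illustration only. */
long syn_last_retransmit_offset_ms(long initial_ms, int retransmits)
{
    long offset = 0;
    long timeout = initial_ms;
    int i;

    assert(initial_ms >= 0 && retransmits >= 0);
    for (i = 0; i < retransmits; i++) {
        offset += timeout;  /* wait out the current timeout, then resend */
        timeout *= 2;       /* documented doubling */
    }
    return offset;
}
```

With the documented defaults, `syn_last_retransmit_offset_ms(3000, 2)` yields 9000 ms, which is the "at least 9 second" delay mentioned above.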
Re: Speeding up mod_proxy_balancer on Windows
Ruediger Pluem wrote: On 10/13/2008 12:50 AM, Jess Holle wrote: Perhaps I misunderstand things here, but isn't this connection timeout setting used for more than just the timing out the initial formation of the connection? It would seem that logical that there would be a connection timeout for forming the initial connection and another for timeouts of responses, etc, but I had understood this was not the case. We currently have connection timeout set very, very large as otherwise we got timeouts when the backend URL was something very computationally intensive that took a long time to respond with data (e.g. a good number of minutes). That should seemingly be distinct from an initial connection timeout, but my understanding was that it is not. Am I just confused here? No you are not. The next 2.2.x release will contain the parameter connectiontimeout where you can set *just* the connection timeout. The other parameter you are referring to is timeout. It will keep its meaning and will be used as a connection timeout if connectiontimeout is not set. By next 2.2.x release do you mean 2.2.10 (assuming it goes out)? Or trunk beyond 2.2.10? -- Jess Holle
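The fallback rule described above (use connectiontimeout when it is set, otherwise fall back to timeout) can be sketched as follows. This is only an illustration under stated assumptions: the struct and function names are hypothetical and are not taken from the mod_proxy source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the two worker parameters discussed above. */
typedef struct {
    long timeout_sec;            /* "timeout" parameter */
    long connectiontimeout_sec;  /* "connectiontimeout" parameter */
    int  connectiontimeout_set;  /* nonzero if connectiontimeout was configured */
} worker_timeouts;

/* Timeout to apply while establishing the connection: connectiontimeout
 * when configured, otherwise the general timeout. */
long effective_connect_timeout_sec(const worker_timeouts *w)
{
    assert(w != NULL);
    return w->connectiontimeout_set ? w->connectiontimeout_sec
                                    : w->timeout_sec;
}
```

Under this reading, a worker with a very large timeout for slow backend responses can still fail fast on connect once connectiontimeout is set separately.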
Re: Speeding up mod_proxy_balancer on Windows
Ruediger Pluem wrote: Not exactly. I would prefer to fix the basic issue with Windows. If we need to support milliseconds for connection timeouts seems to be another story for me. Can some of the Windows gurus come to the rescue to either confirm and explain why it takes that long for connect to return after trying to connect to a closed port (on the same machine !!!) or if this must be a local issue on a specific machine. We have had another engineer verify the issue on another machine using a .Net application, so (1) it is not Apache specific and (2) it is not specific to my machine. He's also somewhat of a Windows guru, but I'd be ecstatic if someone could point out a reasonable way around this issue. -- Jess Holle
Re: Speeding up mod_proxy_balancer on Windows
Ruediger Pluem wrote: On 10/13/2008 12:50 AM, Jess Holle wrote: Perhaps I misunderstand things here, but isn't this connection timeout setting used for more than just the timing out the initial formation of the connection? It would seem that logical that there would be a connection timeout for forming the initial connection and another for timeouts of responses, etc, but I had understood this was not the case. We currently have connection timeout set very, very large as otherwise we got timeouts when the backend URL was something very computationally intensive that took a long time to respond with data (e.g. a good number of minutes). That should seemingly be distinct from an initial connection timeout, but my understanding was that it is not. Am I just confused here? No you are not. The next 2.2.x release will contain the parameter connectiontimeout where you can set *just* the connection timeout. The other parameter you are referring to is timeout. It will keep its meaning and will be used as a connection timeout if connectiontimeout is not set. Ah, so we additionally need something like Matt Stevenson's patch (or just change connection timeout to be a float or double rather than changing its units) to allow connection timeouts of much less than 1 second (e.g. 0.125 seconds) to address my Windows issue with slow connection rejection -- and that's /if/ Windows consistently connects in less than that timeframe when it is going to connect. Right? -- Jess Holle
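To make the floating-point idea concrete, here is a minimal sketch of parsing a fractional-seconds value such as 0.125 into microseconds. The function name, the -1 error convention, and the upper bound are assumptions made for this illustration; this is not how the directive is actually parsed in httpd.

```c
#include <assert.h>
#include <stdlib.h>

/* Parse a timeout given in (possibly fractional) seconds and return it
 * in microseconds, or -1 if the text is not a usable non-negative number.
 * Hypothetical helper; the 1e9-second cap is an arbitrary sanity limit. */
long long parse_timeout_usec(const char *text)
{
    char *end;
    double seconds;

    assert(text != NULL);
    seconds = strtod(text, &end);
    if (end == text || *end != '\0')
        return -1;                       /* empty or trailing garbage */
    if (!(seconds >= 0.0) || seconds > 1.0e9)
        return -1;                       /* negative, NaN, or absurdly large */
    return (long long)(seconds * 1000000.0 + 0.5);  /* round to nearest usec */
}
```

For example, `parse_timeout_usec("0.125")` gives 125000 microseconds, while a plain `"2"` still means two seconds.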
Re: Speeding up mod_proxy_balancer on Windows
Mladen Turk wrote: Ruediger Pluem wrote: Not exactly. I would prefer to fix the basic issue with Windows. If we need to support milliseconds for connection timeouts seems to be another story for me. Can some of the Windows gurus come to the rescue to either confirm and explain why it takes that long for connect to return after trying to connect to a closed port (on the same machine !!!) or if this must be a local issue on a specific machine. According to the Microsoft (http://support.microsoft.com/default.aspx/kb/314053) TcpMaxConnectRetransmissions Key: Tcpip\Parameters Value Type: REG_DWORD - Number Valid Range: 0 - 0x Default: 2 Description: This parameter determines the number of times that TCP retransmits a connect request (SYN) before aborting the attempt. The retransmission timeout is doubled with each successive retransmission in a particular connect attempt. The initial timeout value is three seconds. So, looks like with default of 2 retransmits there is at least 9 second delay. Hmm... Oddly I'm seeing right around 1 second (just a little over) delay for the rejection of each connection on a port on which nothing is listening. This obviously does not match up with the 9 seconds in any way. -- Jess Holle
Re: Speeding up mod_proxy_balancer on Windows
On 10/13/2008 11:46 AM, Jess Holle wrote: Ruediger Pluem wrote: On 10/13/2008 12:50 AM, Jess Holle wrote: Perhaps I misunderstand things here, but isn't this connection timeout setting used for more than just the timing out the initial formation of the connection? It would seem that logical that there would be a connection timeout for forming the initial connection and another for timeouts of responses, etc, but I had understood this was not the case. We currently have connection timeout set very, very large as otherwise we got timeouts when the backend URL was something very computationally intensive that took a long time to respond with data (e.g. a good number of minutes). That should seemingly be distinct from an initial connection timeout, but my understanding was that it is not. Am I just confused here? No you are not. The next 2.2.x release will contain the parameter connectiontimeout where you can set *just* the connection timeout. The other parameter you are referring to is timeout. It will keep its meaning and will be used as a connection timeout if connectiontimeout is not set. Ah, so we additionally need something like Matt Stevenson's patch (or just change connection timeout to be a float or double rather than changing its units) to allow connection timeouts of much less than 1 second (e.g. 0.125 seconds) to address my Windows issue with slow connection rejection -- and that's /if/ Windows consistently connects in less than that timeframe when it is going to connect. Right? Not exactly. I would prefer to fix the basic issue with Windows. If we need to support milliseconds for connection timeouts seems to be another story for me. Can some of the Windows gurus come to the rescue to either confirm and explain why it takes that long for connect to return after trying to connect to a closed port (on the same machine !!!) or if this must be a local issue on a specific machine. Regards Rüdiger
Re: Speeding up mod_proxy_balancer on Windows
On 10/13/2008 03:54 PM, Rainer Jung wrote: Mladen Turk wrote: Ruediger Pluem wrote: Not exactly. I would prefer to fix the basic issue with Windows. If we need to support milliseconds for connection timeouts seems to be another story for me. Can some of the Windows gurus come to the rescue to either confirm and explain why it takes that long for connect to return after trying to connect to a closed port (on the same machine !!!) or if this must be a local issue on a specific machine. According to the Microsoft (http://support.microsoft.com/default.aspx/kb/314053) TcpMaxConnectRetransmissions Key: Tcpip\Parameters Value Type: REG_DWORD - Number Valid Range: 0 - 0x Default: 2 Description: This parameter determines the number of times that TCP retransmits a connect request (SYN) before aborting the attempt. The retransmission timeout is doubled with each successive retransmission in a particular connect attempt. The initial timeout value is three seconds. So, looks like with default of 2 retransmits there is at least 9 second delay. I thought the problem was with local connects to a port, to which no process listens to. In this case the OS should immediately sending a RST (reset) and the client should not resend the SYN. I expect this TcpMaxConnectRetransmissions to be used in the case, were the remote server doesn't send any answer (neither RST nor SYN/ACK) and the client needs to retry the connection establishment after some timeout. That shouldn't be the case here, because the local system should always be able to send a RST if nothing is listening on the port. Except there's a local firewall with packet drop coming into the game (or name resolution timeouts, or ...). Exactly my understanding of the problem. So I see no use in TcpMaxConnectRetransmissions for this case. Regards Rüdiger
Re: Speeding up mod_proxy_balancer on Windows
Ruediger Pluem wrote:

According to Microsoft (http://support.microsoft.com/default.aspx/kb/314053):

TcpMaxConnectRetransmissions
Key: Tcpip\Parameters
Value Type: REG_DWORD - Number
Valid Range: 0 - 0x
Default: 2
Description: This parameter determines the number of times that TCP retransmits a connect request (SYN) before aborting the attempt. The retransmission timeout is doubled with each successive retransmission in a particular connect attempt. The initial timeout value is three seconds.

So, looks like with default of 2 retransmits there is at least 9 second delay.

I thought the problem was with local connects to a port on which no process is listening. In this case the OS should immediately send a RST (reset) and the client should not resend the SYN. I expect TcpMaxConnectRetransmissions to be used in the case where the remote server doesn't send any answer (neither RST nor SYN/ACK) and the client needs to retry the connection establishment after some timeout. That shouldn't be the case here, because the local system should always be able to send a RST if nothing is listening on the port, unless a local firewall with packet drop comes into the game (or name resolution timeouts, or ...).

Exactly my understanding of the problem. So I see no use in TcpMaxConnectRetransmissions for this case.

Regards
Rüdiger

Looks like it might be the retry issue:

3963 13.831213 source destination TCP 1230 > 2608 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
4087 14.280717 source destination TCP 2608 > 1230 [SYN] Seq=0 Win=65535 Len=0 MSS=1460
4088 14.280735 source destination TCP 1230 > 2608 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
4238 14.827581 source destination TCP 2608 > 1230 [SYN] Seq=0 Win=65535 Len=0 MSS=1460
4239 14.827603 source destination TCP 1230 > 2608 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0

The RSTs occur remotely and Windows retries twice.

1428 4.025656 unixsource destination TCP 57864 > 1230 [SYN] Seq=0 Win=32768 Len=0 MSS=1460 WS=0
1429 4.025674 unixsource destination TCP 1230 > 57864 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0

That's the same exchange from an HP-UX source to a Linux destination. I'm assuming that Windows pulls the same crap for localhost traffic even though we can't capture it to prove the case.

Oh the joys of TCP/IP troubleshooting on Windows :)

Andy
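The timestamps in the Windows capture can be turned into retry delays with simple subtraction. A minimal sketch, assuming the capture times are converted to integer microseconds by hand; the helper names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Difference between two capture timestamps, both in microseconds. */
long long usec_between(long long earlier_usec, long long later_usec)
{
    assert(later_usec >= earlier_usec);
    return later_usec - earlier_usec;
}

/* Span covered by a list of capture timestamps (last minus first).
 * The list is assumed to be in capture order. */
long long capture_span_usec(const long long *timestamps_usec, size_t count)
{
    assert(timestamps_usec != NULL && count > 0);
    return usec_between(timestamps_usec[0], timestamps_usec[count - 1]);
}
```

Applied to the capture, the first RST (13.831213) to the second SYN (14.280717) is 449504 microseconds, the second RST (14.280735) to the third SYN (14.827581) is 546846 microseconds, and the whole excerpt from the first RST to the last RST (14.827603) spans 996390 microseconds, that is, roughly one second spent on retries after the first refusal.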
Re: Speeding up mod_proxy_balancer on Windows
I just set this parameter to 0 and the issue went away entirely. Good catch, Ruediger! Thank you -- and all who helped on this thread! It would appear that Microsoft's documentation slipped a decimal place somewhere as it would appear there is about 0.3 second delay on the initial retry and about a 0.6 second on the second -- resulting in about a 1 second overall delay when other overhead/latency is included. I don't see a way to reduce this delay and overall concur with Andy that this parameter should be 0 by all rights. Any thoughts? -- Jess Holle Andy Wang wrote: Ruediger Pluem wrote: According to the Microsoft (http://support.microsoft.com/default.aspx/kb/314053) TcpMaxConnectRetransmissions Key: Tcpip\Parameters Value Type: REG_DWORD - Number Valid Range: 0 - 0x Default: 2 Description: This parameter determines the number of times that TCP retransmits a connect request (SYN) before aborting the attempt. The retransmission timeout is doubled with each successive retransmission in a particular connect attempt. The initial timeout value is three seconds. So, looks like with default of 2 retransmits there is at least 9 second delay. I thought the problem was with local connects to a port, to which no process listens to. In this case the OS should immediately sending a RST (reset) and the client should not resend the SYN. I expect this TcpMaxConnectRetransmissions to be used in the case, were the remote server doesn't send any answer (neither RST nor SYN/ACK) and the client needs to retry the connection establishment after some timeout. That shouldn't be the case here, because the local system should always be able to send a RST if nothing is listening on the port. Except there's a local firewall with packet drop coming into the game (or name resolution timeouts, or ...). Exactly my understanding of the problem. So I see no use in TcpMaxConnectRetransmissions for this case. 
Regards
Rüdiger

Looks like it might be the retry issue:

3963 13.831213 source destination TCP 1230 > 2608 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
4087 14.280717 source destination TCP 2608 > 1230 [SYN] Seq=0 Win=65535 Len=0 MSS=1460
4088 14.280735 source destination TCP 1230 > 2608 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
4238 14.827581 source destination TCP 2608 > 1230 [SYN] Seq=0 Win=65535 Len=0 MSS=1460
4239 14.827603 source destination TCP 1230 > 2608 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0

The RSTs occur remotely and Windows retries twice.

1428 4.025656 unixsource destination TCP 57864 > 1230 [SYN] Seq=0 Win=32768 Len=0 MSS=1460 WS=0
1429 4.025674 unixsource destination TCP 1230 > 57864 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0

That's the same exchange from an HP-UX source to a linux destination. I'm assuming that windows pulls the same crap for localhost traffic even though we can't capture it to prove the case.

Oh the joys of TCP/IP troubleshooting on Windows :)

Andy
Re: Speeding up mod_proxy_balancer on Windows
After a poke and a prod from someone else here about this delay algorithm being used for timeouts, having that default to 0 doesn't seem like it would be appropriate either as it could severely hamper network connectivity in legitimate timeout cases. It seems like MS' TCP stack seems to think a RST is the same as a timeout. argh!! Andy Jess Holle wrote: I just set this parameter to 0 and the issue went away entirely. Good catch, Ruediger! Thank you -- and all who helped on this thread! It would appear that Microsoft's documentation slipped a decimal place somewhere as it would appear there is about 0.3 second delay on the initial retry and about a 0.6 second on the second -- resulting in about a 1 second overall delay when other overhead/latency is included. I don't see a way to reduce this delay and overall concur with Andy that this parameter should be 0 by all rights. Any thoughts? -- Jess Holle Andy Wang wrote: Ruediger Pluem wrote: According to the Microsoft (http://support.microsoft.com/default.aspx/kb/314053) TcpMaxConnectRetransmissions Key: Tcpip\Parameters Value Type: REG_DWORD - Number Valid Range: 0 - 0x Default: 2 Description: This parameter determines the number of times that TCP retransmits a connect request (SYN) before aborting the attempt. The retransmission timeout is doubled with each successive retransmission in a particular connect attempt. The initial timeout value is three seconds. So, looks like with default of 2 retransmits there is at least 9 second delay. I thought the problem was with local connects to a port, to which no process listens to. In this case the OS should immediately sending a RST (reset) and the client should not resend the SYN. I expect this TcpMaxConnectRetransmissions to be used in the case, were the remote server doesn't send any answer (neither RST nor SYN/ACK) and the client needs to retry the connection establishment after some timeout. 
That shouldn't be the case here, because the local system should always be able to send a RST if nothing is listening on the port. Except there's a local firewall with packet drop coming into the game (or name resolution timeouts, or ...).

Exactly my understanding of the problem. So I see no use in TcpMaxConnectRetransmissions for this case.

Regards
Rüdiger

Looks like it might be the retry issue:

3963 13.831213 source destination TCP 1230 > 2608 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
4087 14.280717 source destination TCP 2608 > 1230 [SYN] Seq=0 Win=65535 Len=0 MSS=1460
4088 14.280735 source destination TCP 1230 > 2608 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
4238 14.827581 source destination TCP 2608 > 1230 [SYN] Seq=0 Win=65535 Len=0 MSS=1460
4239 14.827603 source destination TCP 1230 > 2608 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0

The RSTs occur remotely and Windows retries twice.

1428 4.025656 unixsource destination TCP 57864 > 1230 [SYN] Seq=0 Win=32768 Len=0 MSS=1460 WS=0
1429 4.025674 unixsource destination TCP 1230 > 57864 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0

That's the same exchange from an HP-UX source to a linux destination. I'm assuming that windows pulls the same crap for localhost traffic even though we can't capture it to prove the case.

Oh the joys of TCP/IP troubleshooting on Windows :)

Andy
Re: Speeding up mod_proxy_balancer on Windows
Jess Holle wrote: I just set this parameter to 0 and the issue went away entirely. And indeed http://support.microsoft.com/kb/175523 confirms, that Microsoft has a different way of handling RST than Unixes. Good catch, Ruediger! Thank you -- and all who helped on this thread! I think it was Mladen ... It would appear that Microsoft's documentation slipped a decimal place somewhere as it would appear there is about 0.3 second delay on the initial retry and about a 0.6 second on the second -- resulting in about a 1 second overall delay when other overhead/latency is included. I don't see a way to reduce this delay and overall concur with Andy that this parameter should be 0 by all rights. Any thoughts? -- Jess Holle Andy Wang wrote: Ruediger Pluem wrote: According to the Microsoft (http://support.microsoft.com/default.aspx/kb/314053) TcpMaxConnectRetransmissions Key: Tcpip\Parameters Value Type: REG_DWORD - Number Valid Range: 0 - 0x Default: 2 Description: This parameter determines the number of times that TCP retransmits a connect request (SYN) before aborting the attempt. The retransmission timeout is doubled with each successive retransmission in a particular connect attempt. The initial timeout value is three seconds. So, looks like with default of 2 retransmits there is at least 9 second delay. I thought the problem was with local connects to a port, to which no process listens to. In this case the OS should immediately sending a RST (reset) and the client should not resend the SYN. I expect this TcpMaxConnectRetransmissions to be used in the case, were the remote server doesn't send any answer (neither RST nor SYN/ACK) and the client needs to retry the connection establishment after some timeout. That shouldn't be the case here, because the local system should always be able to send a RST if nothing is listening on the port. Except there's a local firewall with packet drop coming into the game (or name resolution timeouts, or ...). 
Exactly my understanding of the problem. So I see no use in TcpMaxConnectRetransmissions for this case.

Regards
Rüdiger

Looks like it might be the retry issue:

3963 13.831213 source destination TCP 1230 > 2608 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
4087 14.280717 source destination TCP 2608 > 1230 [SYN] Seq=0 Win=65535 Len=0 MSS=1460
4088 14.280735 source destination TCP 1230 > 2608 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
4238 14.827581 source destination TCP 2608 > 1230 [SYN] Seq=0 Win=65535 Len=0 MSS=1460
4239 14.827603 source destination TCP 1230 > 2608 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0

The RSTs occur remotely and Windows retries twice.

1428 4.025656 unixsource destination TCP 57864 > 1230 [SYN] Seq=0 Win=32768 Len=0 MSS=1460 WS=0
1429 4.025674 unixsource destination TCP 1230 > 57864 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0

That's the same exchange from an HP-UX source to a linux destination. I'm assuming that windows pulls the same crap for localhost traffic even though we can't capture it to prove the case.

Oh the joys of TCP/IP troubleshooting on Windows :)

Andy
Re: Speeding up mod_proxy_balancer on Windows
On 10/13/2008 07:46 PM, Rainer Jung wrote:

Jess Holle wrote:

I just set this parameter to 0 and the issue went away entirely.

And indeed http://support.microsoft.com/kb/175523 confirms that Microsoft has a different way of handling RST than Unixes.

Good catch, Ruediger! Thank you -- and all who helped on this thread!

I think it was Mladen ...

Correct, and my statement was not meant to suggest setting this registry value to 0. I think this is a dangerous road and could lead to other network problems. I wanted to say that the value Mladen mentioned is not really helping, as it normally should deal with situations where the SYN packets get lost and retransmission is due. But as we all have to learn again, Microsoft has its very own interpretation of RFCs:

RFC793: Reset Processing

In all states except SYN-SENT, all reset (RST) segments are validated by checking their SEQ-fields. A reset is valid if its sequence number is in the window. In the SYN-SENT state (a RST received in response to an initial SYN), the RST is acceptable if the ACK field acknowledges the SYN.

The receiver of a RST first validates it, then changes state. If the receiver was in the LISTEN state, it ignores it. If the receiver was in SYN-RECEIVED state and had previously been in the LISTEN state, then the receiver returns to the LISTEN state, otherwise the receiver aborts the connection and goes to the CLOSED state. If the receiver was in any other state, it aborts the connection and advises the user and goes to the CLOSED state.

In particular the last sentence tells us what to do if a SYN is answered with a RST:

If the receiver [of the RST packet] was in any other state, it aborts the connection and advises the user and goes to the CLOSED state.

So if no one finds a registry entry to stop this RFC-violating behaviour, I see only two options on Windows:

1. Fiddle around with GetTcpTable.
2. Allow connectiontimeout to somehow accept milliseconds.

A possible idea for 2. would be to decide by the size of the value whether it is in seconds or milliseconds: if the value is < 100 it is seconds, if it is >= 100 it is milliseconds. I guess there is no sense in values < 100 ms and > 99 s for a *connection* timeout.

Regards
Rüdiger
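The size-based rule proposed for option 2 can be sketched in a few lines. This is an illustration of the proposal only, assuming the threshold of 100 described above and a result in microseconds; the function and macro names are hypothetical and nothing here reflects what httpd actually implements.

```c
#include <assert.h>

#define USEC_PER_SEC  1000000LL
#define USEC_PER_MSEC 1000LL

/* Interpret a configured connectiontimeout value by its size:
 * values below 100 are taken as seconds, values of 100 or more as
 * milliseconds. Returns the timeout in microseconds. */
long long connectiontimeout_to_usec(long value)
{
    assert(value >= 0);
    if (value < 100)
        return value * USEC_PER_SEC;   /* e.g. 5 -> 5 seconds */
    return value * USEC_PER_MSEC;      /* e.g. 125 -> 125 milliseconds */
}
```

The trade-off of such a rule is that 99 and 100 land almost three orders of magnitude apart (99 s versus 100 ms), so a configuration typo near the threshold changes the meaning drastically.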
Re: Speeding up mod_proxy_balancer on Windows
Ruediger Pluem wrote:

So if noone finds a registry entry to stop this RFC violating behaviour

I'd love to see this solved by such a discovery, option 0.

I see only two options on Windows: 1. Fiddle around with GetTcpTable.

I've attached my incomplete code in this regard (as a diff against 2.2.9, which is what I used as the base for my changes) for what it's worth. There are TO_DO notes where I know I'm missing stuff. I tested basic use of GetTcpTable(), which solved the problem, but haven't completed my conversion to caching this data -- in part because I don't know where to allocate a lock to arbitrate access to this cached data.

2. Allow connectiontimeout to somehow accept milliseconds.

Or a floating point number? Unfortunately this would seem to impact actual connection timeouts as an undesired side-effect of trying to address Windows' bad treatment of RSTs, right?

--
Jess Holle

31a32,35
> #ifdef WIN32
> #include <iphlpapi.h>
> #endif
>
2268a2273,2397
> #ifdef WIN32
>
> typedef struct live_port_data_t live_port_data_t;
> struct live_port_data_t {
>     apr_time_t time_obtained;
>     int n_ports;
>     int *ports;
> };
>
> static live_port_data_t *live_port_data = NULL;
>
> static int int_comparator( const void *pint1, const void *pint2 )
> {
>     int int1 = *((int*)pint1);
>     int int2 = *((int*)pint2);
>     if ( int1 < int2 )
>         return -1;
>     if ( int1 > int2 )
>         return 1;
>     return 0;
> }
>
> static live_port_data_t *get_port_data()
> {
>     /* Much of this routine adapted directly from http://msdn.microsoft.com/en-us/library/aa366026(VS.85).aspx */
>
>     /* Declare and initialize variables */
>     PMIB_TCPTABLE pTcpTable;
>     DWORD dwSize;
>     DWORD dwRetVal;
>
>     pTcpTable = (MIB_TCPTABLE *) malloc( sizeof (MIB_TCPTABLE) );
>     if ( pTcpTable == NULL )
>         return NULL;
>
>     dwSize = sizeof (MIB_TCPTABLE);
>     /* Make an initial call to GetTcpTable to
>        get the necessary size into the dwSize variable */
>     if ((dwRetVal = GetTcpTable(pTcpTable, &dwSize, FALSE)) == ERROR_INSUFFICIENT_BUFFER) {
>         free(pTcpTable);
>         pTcpTable = (MIB_TCPTABLE *) malloc(dwSize);
>         if (pTcpTable == NULL)
>             return NULL;
>     }
>     /* Make a second call to GetTcpTable to get
>        the actual data we require */
>     if ((dwRetVal = GetTcpTable(pTcpTable, &dwSize, FALSE)) != NO_ERROR) {
>         free(pTcpTable);
>         return NULL;
>     }
>     else
>     {
>         apr_time_t time_now = apr_time_now();
>         live_port_data_t *port_data;
>         int nUniqPorts = 0;
>         int *uniqPorts;
>         {
>             int nEntries = (int) pTcpTable->dwNumEntries;
>             int *ports = (int*) malloc( nEntries * sizeof( int ) );
>             int prevPort = -9;
>             int i;
>
>             /* copy ports from pTcpTable to ports array */
>             for (i = 0; i < nEntries; i++)
>                 ports[i] = ntohs( (u_short) pTcpTable->table[i].dwLocalPort );
>             free( pTcpTable );
>
>             /* sort ports array */
>             qsort( ports, nEntries, sizeof( int ), int_comparator );
>
>             /* reduce ports array to list of unique ports */
>             uniqPorts = (int*) malloc( nEntries * sizeof( int ) ); /* array will be oversized in the end; value speed over small memory savings */
>             for (i = 0; i < nEntries; i++)
>             {
>                 int port = ports[i];
>                 if ( port != prevPort )
>                 {
>                     uniqPorts[nUniqPorts] = port;
>                     ++nUniqPorts;
>                     prevPort = port;
>                 }
>             }
>             free( ports );
>         }
>         port_data = malloc( sizeof( live_port_data_t ) );
>         port_data->time_obtained = time_now;
>         port_data->n_ports = nUniqPorts;
>         port_data->ports = uniqPorts;
>         return port_data;
>     }
> }
>
> static void destroy_port_data( live_port_data_t *port_data )
> {
>     free( port_data->ports );
>     free( port_data );
> }
>
> static int port_in_data( const live_port_data_t *port_data, int port )
> {
>     return ( bsearch( &port, port_data->ports, port_data->n_ports, sizeof( int ), int_comparator ) != NULL );
> }
>
> /* TO_DO: make this configurable */
> #define LIVE_PORT_DATA_TTL 150 /* use hard-wired time-to-live of 1.5 seconds for port data */
>
> static int port_is_clearly_not_alive( const apr_sockaddr_t *addr, const server_rec *s )
> {
>     /* if not dealing with localhost, then simply return 0 */
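The platform-independent core of the patch is the lookup structure: collect local ports, sort them, drop duplicates, and answer membership queries with a binary search. A minimal portable sketch of just that part follows, assuming the port numbers have already been obtained (the GetTcpTable call itself is left out); the function names are hypothetical and differ from those in the patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Three-way comparison of two ints, suitable for qsort and bsearch. */
static int int_comparator(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sort the array in place and squeeze out duplicates.
 * Returns the number of unique ports left at the front of the array. */
size_t sort_unique_ports(int *ports, size_t n)
{
    size_t i, out = 0;

    assert(ports != NULL || n == 0);
    if (n == 0)
        return 0;
    qsort(ports, n, sizeof(int), int_comparator);
    for (i = 0; i < n; i++) {
        if (out == 0 || ports[i] != ports[out - 1])
            ports[out++] = ports[i];
    }
    return out;
}

/* Nonzero if port occurs in the sorted, de-duplicated list. */
int port_in_list(const int *ports, size_t n, int port)
{
    assert(ports != NULL || n == 0);
    if (n == 0)
        return 0;
    return bsearch(&port, ports, n, sizeof(int), int_comparator) != NULL;
}
```

Note that the key passed to `bsearch` must be a pointer to the port value, not the value itself, and that the comparator has to return a negative, zero, or positive result consistently for both `qsort` and `bsearch` to work.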
Re: Speeding up mod_proxy_balancer on Windows
Ruediger Pluem wrote:

Correct and my statement didn't imply to set this registry value to 0. I think this is a dangerous road and could lead to other network problems.

While your statement didn't imply that, the Microsoft knowledge base article seems to imply that this registry setting should be used to tune performance behavior in this scenario. However, they do warn that setting it to 0 shouldn't help much. Apparently, 1s per socket connection will make no difference in their book. No wonder people question performance on Windows, especially with the "a second here, a second there shouldn't matter" mentality.

I wanted to say that the value Mladen mentioned is not really helping as it normally should deal with situations where the SYN packets get lost and retransmission is due. But as we all have to learn again Microsoft has their very own interpretation of RFCs:

RFC793: Reset Processing

In all states except SYN-SENT, all reset (RST) segments are validated by checking their SEQ-fields. A reset is valid if its sequence number is in the window. In the SYN-SENT state (a RST received in response to an initial SYN), the RST is acceptable if the ACK field acknowledges the SYN.

The receiver of a RST first validates it, then changes state. If the receiver was in the LISTEN state, it ignores it. If the receiver was in SYN-RECEIVED state and had previously been in the LISTEN state, then the receiver returns to the LISTEN state, otherwise the receiver aborts the connection and goes to the CLOSED state. If the receiver was in any other state, it aborts the connection and advises the user and goes to the CLOSED state.

In particular the last sentence tells us what to do if a SYN is answered with a RST:

If the receiver [of the RST packet] was in any other state, it aborts the connection and advises the user and goes to the CLOSED state.

I'm glad you looked that up. I had found the Microsoft KB article shortly after my last e-mail and didn't bother to question their statement that the RFC is ambiguous. After reading that, it's clearly not ambiguous. Gotta love what you can do when you're the 800lb gorilla in the room.

Andy
Re: Speeding up mod_proxy_balancer on Windows
Jess Holle wrote:
> Ruediger Pluem wrote:
>> So if no one finds a registry entry to stop this RFC violating behaviour
> I'd love to see this solved by such a discovery, option 0.
>> I see only two options on Windows:
>> 1. Fiddle around with GetTcpTable.
> I've attached my incomplete code in this regard (as a diff against 2.2.9, which is what I used as the base for my changes) for what they're worth. There are TO_DO notes where I know I'm missing stuff. I tested basic use of GetTcpTable(), which solved the problem, but haven't completed my conversion to caching this data -- in part because I don't know where to allocate a lock to arbitrate access to this cached data.

I forgot the -u on my diff. Here's a unified diff.

-- Jess Holle

--- proxy_util-2.2.9.c 2008-05-28 16:11:24.0 -0500
+++ proxy_util.c 2008-10-13 14:32:26.342593500 -0500
@@ -29,6 +29,10 @@
 #define apr_socket_create apr_socket_create_ex
 #endif
 
+#ifdef WIN32
+#include <iphlpapi.h>
+#endif
+
 /* Global balancer counter */
 int PROXY_DECLARE_DATA proxy_lb_workers = 0;
 static int lb_workers_limit = 0;
@@ -2266,6 +2270,131 @@
 }
 #endif /* USE_ALTERNATE_IS_CONNECTED */
 
+#ifdef WIN32
+
+typedef struct live_port_data_t live_port_data_t;
+struct live_port_data_t {
+    apr_time_t time_obtained;
+    int n_ports;
+    int *ports;
+};
+
+static live_port_data_t *live_port_data = NULL;
+
+static int int_comparator( const void *pint1, const void *pint2 )
+{
+    int int1 = *((int*)pint1);
+    int int2 = *((int*)pint2);
+    if ( int1 < int2 )
+        return -1;
+    if ( int1 > int2 )
+        return 1;
+    return 0;
+}
+
+static live_port_data_t *get_port_data()
+{
+    /* Much of this routine adapted directly from http://msdn.microsoft.com/en-us/library/aa366026(VS.85).aspx */
+
+    /* Declare and initialize variables */
+    PMIB_TCPTABLE pTcpTable;
+    DWORD dwSize;
+    DWORD dwRetVal;
+
+    pTcpTable = (MIB_TCPTABLE *) malloc( sizeof (MIB_TCPTABLE) );
+    if ( pTcpTable == NULL )
+        return NULL;
+
+    dwSize = sizeof (MIB_TCPTABLE);
+    /* Make an initial call to GetTcpTable to
+       get the necessary size into the dwSize variable */
+    if ((dwRetVal = GetTcpTable(pTcpTable, &dwSize, FALSE)) ==
+        ERROR_INSUFFICIENT_BUFFER) {
+        free(pTcpTable);
+        pTcpTable = (MIB_TCPTABLE *) malloc(dwSize);
+        if (pTcpTable == NULL)
+            return NULL;
+    }
+
+    /* Make a second call to GetTcpTable to get
+       the actual data we require */
+    if ((dwRetVal = GetTcpTable(pTcpTable, &dwSize, FALSE)) != NO_ERROR) {
+        free(pTcpTable);
+        return NULL;
+    }
+    else
+    {
+        apr_time_t time_now = apr_time_now();
+        live_port_data_t *port_data;
+        int nUniqPorts = 0;
+        int *uniqPorts;
+        {
+            int nEntries = (int) pTcpTable->dwNumEntries;
+            int *ports = (int*) malloc( nEntries * sizeof( int ) );
+            int prevPort = -9;
+            int i;
+            /* copy ports from pTcpTable to ports array */
+            for (i = 0; i < nEntries; i++)
+                ports[i] = ntohs( (u_short) pTcpTable->table[i].dwLocalPort );
+            free( pTcpTable );
+            /* sort ports array */
+            qsort( ports, nEntries, sizeof( int ), int_comparator );
+            /* reduce ports array to list of unique ports */
+            uniqPorts = (int*) malloc( nEntries * sizeof( int ) ); /* array will be oversized in the end; value speed over small memory savings */
+            for (i = 0; i < nEntries; i++) {
+                int port = ports[i];
+                if ( port != prevPort )
+                {
+                    uniqPorts[nUniqPorts] = port;
+                    ++nUniqPorts;
+                    prevPort = port;
+                }
+            }
+            free( ports );
+        }
+        port_data = malloc( sizeof( live_port_data_t ) );
+        port_data->time_obtained = time_now;
+        port_data->n_ports = nUniqPorts;
+        port_data->ports = uniqPorts;
+        return port_data;
+    }
+}
+
+static void destroy_port_data( live_port_data_t *port_data )
+{
+    free( port_data->ports );
+    free( port_data );
+}
+
+static int port_in_data( const live_port_data_t *port_data, int port )
+{
+    return ( bsearch( &port, port_data->ports, port_data->n_ports, sizeof( int ), int_comparator ) != NULL );
+}
+
+/* TO_DO: make this configurable */
+#define LIVE_PORT_DATA_TTL
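The lookup technique in the patch (sort the local ports, drop duplicates, then binary-search) can be tried out without any Windows API. The following is only a portable sketch under stated assumptions: the helper names `compare_ints`, `sort_unique_ports` and `port_in_set` are hypothetical and not part of httpd, APR, or the patch, and the port list is supplied by the caller instead of coming from GetTcpTable().

```c
#include <stdlib.h>

/* Hypothetical helper names; not part of httpd, APR, or the patch. */

/* Three-way comparison of two ints, as qsort() and bsearch() require. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sort ports in place and squeeze out duplicates; returns the new count. */
static size_t sort_unique_ports(int *ports, size_t n)
{
    size_t i, out = 0;

    if (n == 0)
        return 0;
    qsort(ports, n, sizeof(int), compare_ints);
    for (i = 0; i < n; i++) {
        /* keep an entry only if it differs from the last one kept */
        if (out == 0 || ports[i] != ports[out - 1])
            ports[out++] = ports[i];
    }
    return out;
}

/* Returns 1 if port is present in the sorted, de-duplicated array. */
static int port_in_set(const int *ports, size_t n, int port)
{
    /* bsearch() takes the address of the key, not the key itself. */
    return bsearch(&port, ports, n, sizeof(int), compare_ints) != NULL;
}
```

De-duplicating in place avoids the second allocation the patch makes, at the cost of overwriting the caller's array; whether that trade-off suits the proxy code is a judgment call.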
Re: Speeding up mod_proxy_balancer on Windows
On 10/13/2008 09:37 PM, Jess Holle wrote:
> Ruediger Pluem wrote:
>> So if no one finds a registry entry to stop this RFC violating behaviour
> I'd love to see this solved by such a discovery, option 0.
>> I see only two options on Windows:
>> 1. Fiddle around with GetTcpTable.
> I've attached my incomplete code in this regard (as a diff against 2.2.9, which is what I used as the base for my changes) for what they're worth.

Mind to attach this as a unified diff?

> There are TO_DO notes where I know I'm missing stuff. I tested basic use of GetTcpTable(), which solved the problem, but haven't completed my conversion to caching this data -- in part because I don't know where to allocate a lock to arbitrate access to this cached data.

I guess the post config hook would be the correct place to create such a mutex. Depending on what type of mutex you need a call to apr_global_mutex_child_init is due additionally in the child_init hook. Have a look at other caching modules in httpd that have to deal with this like ldap or on trunk the small objects caches.

>> 2. Allow connectiontimeout to somehow accept milliseconds.
> Or a floating point number? Unfortunately this would seem to impact actual connection timeouts as an undesired side-effect of trying to address Windows' bad treatment of RSTs, right?

Not directly as you can interpret an integer value of an existing configuration also as a float, but I would like to keep the value an integer. This should be doable the way I proposed ( 100 seconds = 100 milliseconds), but comments on this approach are welcome.

Regards

Rüdiger
Re: Speeding up mod_proxy_balancer on Windows
Ruediger Pluem wrote:
> Not exactly. I would prefer to fix the basic issue with Windows. If we need to support milliseconds for connection timeouts seems to be another story for me. Can some of the Windows gurus come to the rescue to either confirm and explain why it takes that long for connect to return after trying to connect to a closed port (on the same machine !!!) or if this must be a local issue on a specific machine.

Apparently, nobody is using apr_socket_connect() much on win32 to have noticed this before, but the implementation;

    if (connect(sock->socketdes, (const struct sockaddr *)&sa->sa.sin,
                sa->salen) == SOCKET_ERROR) {
        int rc;
        struct timeval tv, *tvptr;
        fd_set wfdset, efdset;

        rv = apr_get_netos_error();
        if (rv != APR_FROM_OS_ERROR(WSAEWOULDBLOCK)) {
            return rv;
        }
        [...]
        if (sock->timeout < 0) {
            tvptr = NULL;
        }
        else {
            /* casts for winsock/timeval definition */
            tv.tv_sec = (long)apr_time_sec(sock->timeout);
            tv.tv_usec = (int)apr_time_usec(sock->timeout);
            tvptr = &tv;
        }
        rc = select(FD_SETSIZE+1, NULL, &wfdset, &efdset, tvptr);

is so suboptimal, it might as well be pure posix. It's obviously the msvcrt select() implementation which ignores tv_usec, based on the reporter's comments.

So where does this leave us? Just set a damned event for the completion context of connect (or WSAConnect) and block on the event with exactly the timeout you want (using [WSA]WaitForSingleObject). That should provide near-instant acknowledgment of success or failure. Because mod_ftp PORT needs the behavior, I can play with this pretty readily late in the week or at the con.
Re: Speeding up mod_proxy_balancer on Windows
Hi,

I think the option of sub-second connection timeouts is a good thing. It also has the nice benefit of working around Windows' interesting RST behavior. It also means a jk/http proxy can do things some L7 switches can't do. I've also had a need for it in the past.

For most cases a connection is going to be very quick (LAN 20 msec, good or bad (RST)) or be very long if unreachable/not there (apache timeout?). So on a LAN a second is too big, especially if you have an average HTTP response of 15-30 milliseconds (keepalive connection). I've run Apaches where that was the main aim. A 1 sec connection timeout is 30 times the normal response time, kind of large in that case.

I think the connect timeout is meaningful in the 0-10 second range, which is better served by having a microsecond timeout value (well, millisecond, but Apache timeouts seem to be usec or sec?).

Anyway, having the connection timeout is great (sec or usec), thanks for applying the patch to trunk.

Regards

Matt
Re: Speeding up mod_proxy_balancer on Windows
On 10/13/2008 10:04 PM, Jess Holle wrote: Jess Holle wrote: Ruediger Pluem wrote: So if noone finds a registry entry to stop this RFC violating behaviour I'd love to see this solved by such a discovery, option 0. I see only two options on Windows: 1. Fiddle around with GetTcpTable. I've attached my incomplete code in this regard (as a diff against 2.2.9, which is what I used as the base for my changes) for what they're worth. There are TO_DO notes where I know I'm missing stuff. I tested basic use of GetTcpTable(), which solved the problem, but haven't completed my conversion to caching this data -- in part because I don't know where to allocate an lock to arbitrate access to this cached data. I forgot the -u on my diff. Here's a unified diff. Thanks for this. Given that it introduces a lot of platform specific code to the proxy and given the outstanding cache problem I would like to wait for Bill's proposal to improve apr_socket_connect within APR as this looks more appealing overall. If improving APR turns out to be not possible I would come back to your patch. Regards Rüdiger
Re: Speeding up mod_proxy_balancer on Windows
Ruediger Pluem wrote:
> On 10/13/2008 09:37 PM, Jess Holle wrote:
>> Ruediger Pluem wrote:
>>> So if no one finds a registry entry to stop this RFC violating behaviour
>> I'd love to see this solved by such a discovery, option 0.
>>> I see only two options on Windows:
>>> 1. Fiddle around with GetTcpTable.
>> I've attached my incomplete code in this regard (as a diff against 2.2.9, which is what I used as the base for my changes) for what they're worth.
> Mind to attach this as a unified diff?

Already did -- I goofed the first time...

>> There are TO_DO notes where I know I'm missing stuff. I tested basic use of GetTcpTable(), which solved the problem, but haven't completed my conversion to caching this data -- in part because I don't know where to allocate a lock to arbitrate access to this cached data.
> I guess the post config hook would be the correct place to create such a mutex. Depending on what type of mutex you need a call to apr_global_mutex_child_init is due additionally in the child_init hook. Have a look at other caching modules in httpd that have to deal with this like ldap or on trunk the small objects caches.

Thanks for the pointer.

>>> 2. Allow connectiontimeout to somehow accept milliseconds.
>> Or a floating point number? Unfortunately this would seem to impact actual connection timeouts as an undesired side-effect of trying to address Windows' bad treatment of RSTs, right?
> Not directly as you can interpret an integer value of an existing configuration also as a float, but I would like to keep the value an integer. This should be doable the way I proposed ( 100 seconds = 100 milliseconds), but comments on this approach are welcome.

The range-based interpretation just seems too subtle to me, but I'm probably biased. I've used floating point seconds for configuration in my own server code after having been burned by having to switch from seconds to milliseconds to nanoseconds, etc. I'd rather hide the implementation's units from the user. I still use an integer when I wish to assert that nothing below a second is allowable and don't see any value in being able to specify 2.5 in addition to 2 and 3, though.

-- Jess Holle
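To make the floating-point-seconds idea concrete, here is a minimal sketch of parsing such a value into microseconds (the unit APR's apr_time_t uses). It is not what httpd does: the function name `parse_timeout_seconds` and the one-day upper bound are assumptions made purely for illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define USEC_PER_SEC 1000000LL  /* same scale as APR's apr_time_t */

/* Hypothetical parser (not an httpd function): "0.75" -> 750000.
 * Returns 0 on success, -1 on malformed or out-of-range input. */
static int parse_timeout_seconds(const char *text, int64_t *usec_out)
{
    char *end;
    double secs;

    errno = 0;
    secs = strtod(text, &end);
    if (end == text || *end != '\0' || errno == ERANGE)
        return -1;   /* not a number, trailing junk, or out of range */
    if (!(secs >= 0.0) || secs > 86400.0)
        return -1;   /* negative, NaN, or above an assumed one-day cap */

    /* Round to the nearest microsecond rather than truncating. */
    *usec_out = (int64_t)(secs * USEC_PER_SEC + 0.5);
    return 0;
}
```

With this shape an existing integer setting such as `2` keeps its meaning (two seconds), while `0.75` becomes expressible without introducing a second parameter.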
Re: Speeding up mod_proxy_balancer on Windows
Ruediger Pluem wrote: On 10/13/2008 10:04 PM, Jess Holle wrote: Jess Holle wrote: Ruediger Pluem wrote: So if noone finds a registry entry to stop this RFC violating behaviour I'd love to see this solved by such a discovery, option 0. I see only two options on Windows: 1. Fiddle around with GetTcpTable. I've attached my incomplete code in this regard (as a diff against 2.2.9, which is what I used as the base for my changes) for what they're worth. There are TO_DO notes where I know I'm missing stuff. I tested basic use of GetTcpTable(), which solved the problem, but haven't completed my conversion to caching this data -- in part because I don't know where to allocate an lock to arbitrate access to this cached data. I forgot the -u on my diff. Here's a unified diff. Thanks for this. Given that it introduces a lot of platform specific code to the proxy and given the outstanding cache problem I would like to wait for Bill's proposal to improve apr_socket_connect within APR as this looks more appealing overall. If improving APR turns out to be not possible I would come back to your patch. That makes perfect sense to me. I was going to set the code aside for now myself for similar reasons, but wanted to share it before I forgot in case it turns out to be useful. [And, yes, I know the platform-specific bit in the middle of mod_proxy was rather ugly -- and requires 1 additional Win32 library be added to mod_proxy's VC++ config as well...] -- Jess Holle
Re: Speeding up mod_proxy_balancer on Windows
I've managed to create a workaround for this issue with GetTcpTable(). The only remaining issue I have is that I don't want to call this too often. I want to hold on to the data with a time-to-live during which I'll assume the data has not changed. That's all easy enough except for locking. That's easy at a logical level, but where can I allocate locks for such a thing so that I simply have a single lock per worker process (there's only one on Windows, of course, which is all I care about) allocated up front for the life of the process? I'd like to just use APR locks (possibly read/write), but to do so I clearly need to hook into the right place in the Apache life cycle and the right pool. -- Jess Holle P.S. Sorry for the stupid question -- the nuances of Apache lifecycle, pools, etc, are still clearly beyond me.
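The time-to-live logic itself is small once the clock is passed in explicitly. The sketch below is an illustration only: `port_snapshot_t`, `snapshot_is_fresh` and `snapshot_refresh_if_stale` are hypothetical names, times are plain microsecond counts in the style of apr_time_t, and locking is deliberately left to the caller (in httpd that would presumably be an APR mutex created in an appropriate hook).

```c
#include <stdint.h>

/* Hypothetical cache record; times are microseconds, like apr_time_t. */
typedef struct {
    int64_t time_obtained;  /* when the snapshot was taken */
    int     valid;          /* 0 until the first snapshot exists */
} port_snapshot_t;

/* Returns 1 if the snapshot may still be used at time `now`. */
static int snapshot_is_fresh(const port_snapshot_t *snap, int64_t now,
                             int64_t ttl)
{
    if (!snap->valid)
        return 0;
    /* A clock that stepped backwards also forces a refresh. */
    return now >= snap->time_obtained && now - snap->time_obtained < ttl;
}

/* Refreshes the snapshot if it is stale; returns 1 when a refresh happened.
 * The caller is assumed to hold whatever lock protects `snap`. */
static int snapshot_refresh_if_stale(port_snapshot_t *snap, int64_t now,
                                     int64_t ttl)
{
    if (snapshot_is_fresh(snap, now, ttl))
        return 0;
    /* ...this is where the port table would be queried again... */
    snap->time_obtained = now;
    snap->valid = 1;
    return 1;
}
```

Keeping the freshness test separate from the refresh makes it possible to check under a read lock and only take a write lock when a refresh is actually needed.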
Re: Speeding up mod_proxy_balancer on Windows
Hi,

Sent this to the wrong address the first time. It may have saved the GetTcpTable coding.

Here is a usec timeout fix, although I wouldn't go below 100 milliseconds without some testing under load. I'm not sure it's the perfect way to do it, but it avoids changing the connectiontimeout parameter to usec (it still defaults to sec). Order is important: connectiontimeoutisusec must come after connectiontimeout. Ideas on better ways to do it are welcome. I can see a need for timeouts of less than a second outside the Windows case.

Also included is the non-blocking patch without the ifdefs.

Regards

Matt

ProxyPass / balancer://hotcluster/
<Proxy balancer://hotcluster>
# below IPs are not reachable, acts like a down box (if timeout is small enough)
# 1 sec
BalancerMember ajp://192.168.0.23:7010 loadfactor=1 connectiontimeout=100 connectiontimeoutisusec=1
# 1 sec normal
BalancerMember ajp://192.168.0.24:7010 loadfactor=1 connectiontimeout=1
# 750 milli sec.
BalancerMember ajp://192.168.0.25:7010 loadfactor=1 connectiontimeout=75 connectiontimeoutisusec=1
BalancerMember ajp://localhost:8009 loadfactor=1 connectiontimeout=2
</Proxy>

Index: modules/proxy/proxy_util.c
===================================================================
--- modules/proxy/proxy_util.c (revision 703688)
+++ modules/proxy/proxy_util.c (working copy)
@@ -2358,9 +2358,17 @@
                      "proxy: %s: fam %d socket created to connect to %s",
                      proxy_function, backend_addr->family, worker->hostname);
 
+        /* use non blocking for connect timeouts to work. The ifdef
+           limits to unix systems which have apr_wait_for_io_or_timeout.
+           TODO: remove the ifdef and see what works/breaks */
+
+        apr_socket_opt_set(newsock, APR_SO_NONBLOCK, 1);
+
         /* make the connection out of the socket */
         rv = apr_socket_connect(newsock, backend_addr);
 
+        apr_socket_opt_set(newsock, APR_SO_NONBLOCK, 0);
+
         /* if an error occurred, loop round and try again */
         if (rv != APR_SUCCESS) {
             apr_socket_close(newsock);
Index: modules/proxy/mod_proxy.c
===================================================================
--- modules/proxy/mod_proxy.c (revision 703688)
+++ modules/proxy/mod_proxy.c (working copy)
@@ -291,6 +291,13 @@
         worker->conn_timeout = apr_time_from_sec(ival);
         worker->conn_timeout_set = 1;
     }
+    else if (!strcasecmp(key, "connectiontimeoutisusec")) {
+        /* change timeout to useconds */
+        ival = atoi(val);
+        if (ival == 1 && worker->conn_timeout_set == 1){
+            worker->conn_timeout = apr_time_make(0, apr_time_sec(worker->conn_timeout) );
+        }
+    }
     else {
         return "unknown Worker parameter";
     }
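The unit switch in this patch comes down to simple arithmetic: the number given to connectiontimeout is first stored as seconds, and the flag then reinterprets the same number as microseconds, which is why the flag has to come second. A minimal standalone sketch of that arithmetic follows; `time_from_sec` and `time_make` are local stand-ins written here to mirror APR's apr_time_from_sec() and apr_time_make(), and `connect_timeout` is a hypothetical helper, not code from the patch.

```c
#include <stdint.h>

#define USEC_PER_SEC 1000000LL

/* Local stand-ins mirroring APR's apr_time_from_sec() / apr_time_make(). */
static int64_t time_from_sec(int64_t sec)
{
    return sec * USEC_PER_SEC;
}

static int64_t time_make(int64_t sec, int64_t usec)
{
    return sec * USEC_PER_SEC + usec;
}

/* Hypothetical helper: `value` is the number given to connectiontimeout,
 * `is_usec` is the connectiontimeoutisusec flag. Result is microseconds. */
static int64_t connect_timeout(int64_t value, int is_usec)
{
    return is_usec ? time_make(0, value) : time_from_sec(value);
}
```

Under this reading, a value of 750000 with the flag set gives a 750 millisecond timeout, and 2 without the flag gives two seconds.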
Re: Speeding up mod_proxy_balancer on Windows
Perhaps I misunderstand things here, but isn't this connection timeout setting used for more than just timing out the initial formation of the connection?

It would seem logical that there would be a connection timeout for forming the initial connection and another for timeouts of responses, etc, but I had understood this was not the case.

We currently have connection timeout set very, very large as otherwise we got timeouts when the backend URL was something very computationally intensive that took a long time to respond with data (e.g. a good number of minutes). That should seemingly be distinct from an initial connection timeout, but my understanding was that it is not.

Am I just confused here?

-- Jess Holle

Matt Stevenson wrote:
> Hi,
>
> Sent this to the wrong address the first time. It may have saved the GetTcpTable coding.
>
> Here is a usec timeout fix, although I wouldn't go below 100 milliseconds without some testing under load. I'm not sure it's the perfect way to do it, but it avoids changing the connectiontimeout parameter to usec (it still defaults to sec). Order is important: connectiontimeoutisusec must come after connectiontimeout. Ideas on better ways to do it are welcome. I can see a need for timeouts of less than a second outside the Windows case.
>
> Also included is the non-blocking patch without the ifdefs.
>
> Regards
>
> Matt
>
> ProxyPass / balancer://hotcluster/
> <Proxy balancer://hotcluster>
> # below IPs are not reachable, acts like a down box (if timeout is small enough)
> # 1 sec
> BalancerMember ajp://192.168.0.23:7010 loadfactor=1 connectiontimeout=100 connectiontimeoutisusec=1
> # 1 sec normal
> BalancerMember ajp://192.168.0.24:7010 loadfactor=1 connectiontimeout=1
> # 750 milli sec.
> BalancerMember ajp://192.168.0.25:7010 loadfactor=1 connectiontimeout=75 connectiontimeoutisusec=1
> BalancerMember ajp://localhost:8009 loadfactor=1 connectiontimeout=2
> </Proxy>
>
> Index: modules/proxy/proxy_util.c
> ===================================================================
> --- modules/proxy/proxy_util.c (revision 703688)
> +++ modules/proxy/proxy_util.c (working copy)
> @@ -2358,9 +2358,17 @@
>                       "proxy: %s: fam %d socket created to connect to %s",
>                       proxy_function, backend_addr->family, worker->hostname);
>  
> +        /* use non blocking for connect timeouts to work. The ifdef
> +           limits to unix systems which have apr_wait_for_io_or_timeout.
> +           TODO: remove the ifdef and see what works/breaks */
> +
> +        apr_socket_opt_set(newsock, APR_SO_NONBLOCK, 1);
> +
>          /* make the connection out of the socket */
>          rv = apr_socket_connect(newsock, backend_addr);
>  
> +        apr_socket_opt_set(newsock, APR_SO_NONBLOCK, 0);
> +
>          /* if an error occurred, loop round and try again */
>          if (rv != APR_SUCCESS) {
>              apr_socket_close(newsock);
> Index: modules/proxy/mod_proxy.c
> ===================================================================
> --- modules/proxy/mod_proxy.c (revision 703688)
> +++ modules/proxy/mod_proxy.c (working copy)
> @@ -291,6 +291,13 @@
>          worker->conn_timeout = apr_time_from_sec(ival);
>          worker->conn_timeout_set = 1;
>      }
> +    else if (!strcasecmp(key, "connectiontimeoutisusec")) {
> +        /* change timeout to useconds */
> +        ival = atoi(val);
> +        if (ival == 1 && worker->conn_timeout_set == 1){
> +            worker->conn_timeout = apr_time_make(0, apr_time_sec(worker->conn_timeout) );
> +        }
> +    }
>      else {
>          return "unknown Worker parameter";
>      }
Re: Speeding up mod_proxy_balancer on Windows
On 10/13/2008 12:50 AM, Jess Holle wrote:
> Perhaps I misunderstand things here, but isn't this connection timeout setting used for more than just timing out the initial formation of the connection?
>
> It would seem logical that there would be a connection timeout for forming the initial connection and another for timeouts of responses, etc, but I had understood this was not the case.
>
> We currently have connection timeout set very, very large as otherwise we got timeouts when the backend URL was something very computationally intensive that took a long time to respond with data (e.g. a good number of minutes). That should seemingly be distinct from an initial connection timeout, but my understanding was that it is not.
>
> Am I just confused here?

No, you are not. The next 2.2.x release will contain the parameter connectiontimeout, where you can set *just* the connection timeout. The other parameter you are referring to is timeout. It will keep its meaning and will be used as the connection timeout if connectiontimeout is not set.

Regards

Rüdiger
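The fallback rule described here can be written down in a few lines. This is a sketch of the rule as stated in the message, not the httpd source: the struct `worker_timeouts_t` and the function `effective_connect_timeout` are hypothetical, with field names chosen to echo the `conn_timeout` and `conn_timeout_set` fields that appear in the patch earlier in the thread.

```c
#include <stdint.h>

/* Hypothetical view of the two worker settings, in microseconds. */
typedef struct {
    int64_t timeout;           /* general timeout ("timeout") */
    int64_t conn_timeout;      /* connect-only timeout ("connectiontimeout") */
    int     conn_timeout_set;  /* nonzero if connectiontimeout was configured */
} worker_timeouts_t;

/* connectiontimeout wins when set; otherwise fall back to timeout. */
static int64_t effective_connect_timeout(const worker_timeouts_t *w)
{
    return w->conn_timeout_set ? w->conn_timeout : w->timeout;
}
```

So a deployment with a very large timeout for slow backend responses can still fail fast on connect by setting connectiontimeout separately.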
Re: Speeding up mod_proxy_balancer on Windows
Jess Holle wrote: Ruediger Pluem wrote: Did you check whether the currently running thread proxy_ajp connect timeout fix. (http://mail-archives.apache.org/mod_mbox/httpd-dev/200810.mbox/[EMAIL PROTECTED] and http://mail-archives.apache.org/mod_mbox/httpd-dev/200810.mbox/[EMAIL PROTECTED]) does fix your issue on Windows? I was watching this. I'll have to give this a try as it just became clear from the latest of these messages that the fix applied to Windows. I just tried this fix -- it didn't help. Also the lowest connection timeout one can set via the proxy config options is 1 second, right? And that's roughly what it takes for connection to each dead port to take anyway (just a little over a second). -- Jess Holle
Speeding up mod_proxy_balancer on Windows
I had previously discovered that mod_proxy_balancer takes over 1 second on Windows to determine that nothing is listening on the target port. This becomes problematic if you are balancing over a sparsely populated set of proxy ports.

A Windows guru here found the Windows GetTcpTable API, which would appear to offer a quicker way to determine a port's status -- whereas doing the obvious thing and attempting to connect takes over a second to fail.

I'd like to experiment with using this API to address this issue upon attempted formation of the first connection for a given worker one is balancing over. Can anyone suggest where I should look to add such a call? Eventually this should presumably be an APR-level thing, but in the short term I'm just looking for where I can experiment with inserting it in an #ifdef in the proxy code -- and getting a little lost here, unfortunately.

-- Jess Holle
Re: Speeding up mod_proxy_balancer on Windows
P.S. Yes, I know this approach only has any hope of working when Apache and the proxy backends are on the same host. Jess Holle wrote: I had previously discovered that mod_proxy_balancer takes over 1 second on Windows to determine that nothing is listening on the target port. This becomes problematic if you are balancing over a sparsely populated set of proxy ports. A Windows guru here found the Windows GetTcpTable which would appear to offer a quicker way to determine a port's status -- whereas doing the obvious thing and attempting to connect takes over a second to fail. I'd like to experiment with using this API to address this issue upon attempted formation of the first connection for a given worker one is balancing over. Can anyone suggest where I should look to do add such a call? Eventually this should presumably be an APR-level thing, but in the short term I'm just looking for where I can experiment with inserting it in an #ifdef in the proxy code -- and getting a little lost here, unfortunately. -- Jess Holle
Re: Speeding up mod_proxy_balancer on Windows
On 10/09/2008 11:50 PM, Jess Holle wrote: P.S. Yes, I know this approach only has any hope of working when Apache and the proxy backends are on the same host. Jess Holle wrote: I had previously discovered that mod_proxy_balancer takes over 1 second on Windows to determine that nothing is listening on the target port. This becomes problematic if you are balancing over a sparsely populated set of proxy ports. A Windows guru here found the Windows GetTcpTable which would appear to offer a quicker way to determine a port's status -- whereas doing the obvious thing and attempting to connect takes over a second to fail. I'd like to experiment with using this API to address this issue upon attempted formation of the first connection for a given worker one is balancing over. Can anyone suggest where I should look to do add such a call? Eventually this should presumably be an APR-level thing, but in the short term I'm just looking for where I can experiment with inserting it in an #ifdef in the proxy code -- and getting a little lost here, unfortunately. Did you check whether the currently running thread proxy_ajp connect timeout fix. (http://mail-archives.apache.org/mod_mbox/httpd-dev/200810.mbox/[EMAIL PROTECTED] and http://mail-archives.apache.org/mod_mbox/httpd-dev/200810.mbox/[EMAIL PROTECTED]) does fix your issue on Windows? If httpd and the backends are running on the same machine this shouldn't take a second. The connect call should return immediately with an error code indicating that the connection was refused (if the port is down). If not is it possible that there is a local firewall that causes this trouble? Regards Rüdiger
Re: Speeding up mod_proxy_balancer on Windows
Ruediger Pluem wrote:
> Did you check whether the currently running thread "proxy_ajp connect timeout fix." (http://mail-archives.apache.org/mod_mbox/httpd-dev/200810.mbox/[EMAIL PROTECTED] and http://mail-archives.apache.org/mod_mbox/httpd-dev/200810.mbox/[EMAIL PROTECTED]) does fix your issue on Windows?

I was watching this. I'll have to give this a try as it just became clear from the latest of these messages that the fix applied to Windows.

> If httpd and the backends are running on the same machine this shouldn't take a second. The connect call should return immediately with an error code indicating that the connection was refused (if the port is down).

Yes, that's what I'd assumed. It has become clear from testing on multiple machines that this is not the case. Another engineer did testing with Windows APIs directly in a .NET app and came up with the same result -- over 1 second per port refusal.

> If not is it possible that there is a local firewall that causes this trouble?

I tried disabling that and other such steps.

I certainly concur -- the refusal should be immediate and certainly /far/ faster than 1 second per port. I've been left wondering if this isn't an odd-ball hack by Microsoft to slow down remote port scans.

I'll give the timeout fix a try, but I'm not hopeful given the data so far.

-- Jess Holle