Re: a simple formatter
Tested. I have made a (small) file of the structures I've noticed past formatting errors on. With a manual fix-up to what I think it should look like in readable C++ under the published squid formatting description. The output of your script does this: .. It is the --break-blocks option of the astyle. I think it can removed. Also, I noticed that the formatter will accept _anything_ given to it as a filename and create the files. Thats rather nasty when non-valid filename sequences are entered. ie --help OK fixed :-). Now accepts only files with .cc,.h,.cci and .c extensions and ignore all other files. I am re-sending the fixed script . In this version also I removed the --break-blocks astyle option. Formated code looks better without it. -- Christos #!/usr/bin/perl # # Author: Tsantilas Christos # email: [EMAIL PROTECTED] # # Distributed under the terms of the GNU General Public License as published # by the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # The ldap_manager library is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU # Library General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA # # See LICENSE or http://www.gnu.org/licenses/gpl.html for details . # use IPC::Open2; $ASTYLE_BIN=/usr/bin/astyle; $ASTYLE_ARGS =--mode=c -s4 -O -l; #$ASTYLE_ARGS=--mode=c -s4 -O --break-blocks -l; #$ASTYLE_BIN=/usr/local/src/astyle-1.21/bin/astyle; if(! -e $ASTYLE_BIN){ print \nFile $ASTYLE_BIN not found\n; print Please fix the ASTYLE_BIN variable in this script!\n\n; exit -1; } $ASTYLE_BIN=$ASTYLE_BIN. .$ASTYLE_ARGS; $INDENT = ; $out = shift @ARGV; #read options, currently no options available while($out eq || $out =~ /^-\w+$/){ if($out eq -h) { usage($0); exit 0; } else { usage($0); exit -1; } } while($out){ if( $out !~ /\.cc$|\.cci$|\.h$|\.c$/) { print Unknown suffix for file $out, ignoring\n; $out = shift @ARGV; next; } $in= $out.astylebak; my($new_in) = $in; my($i) = 0; while(-e $new_in) { $new_in=$in...$i; $i++; } $in=$new_in; rename($out, $in); local (*FROM_ASTYLE, *TO_ASTYLE); $pid_style=open2(\*FROM_ASTYLE, \*TO_ASTYLE, $ASTYLE_BIN); if(!$pid_style){ print An error while open2\n; exit -1; } if($pid=fork()){ #do parrent staf close(FROM_ASTYLE); if(!open(IN, $in)){ print Can not open input file: $in\n; exit -1; } my($line) = ''; while(IN){ $line=$line.$_; if(input_filter(\$line)==0){ next; } print TO_ASTYLE $line; $line = ''; } if($line){ print TO_ASTYLE $line; } close(TO_ASTYLE); waitpid($pid,0); } else{ # child staf close(TO_ASTYLE); if(!open(OUT,$out)){ print Can't open output file: $out\n; exit -1; } my($line)=''; while(FROM_ASTYLE){ $line = $line.$_; if(output_filter(\$line)==0){ next; } print OUT $line; $line = ''; } if($line){ print OUT $line; } close(OUT); exit 0; } $out = shift @ARGV; } sub input_filter{ my($line)[EMAIL PROTECTED]; #if we have integer declaration, get it all before processing it.. if($$line =~/.*\s*int\s+.*/ ){ if(index($$line,;) == -1){ return 0; } if($$line =~ /(.*)\s*int\s+([^:]*):\s*(\w+)\s*\;(.*)/s){ # print .$$line.($1)\n; $$line= $1. int .$2.__FORASTYLE__.$3.;.$4; # print -.$$line.\n; } return 1; } return 1; } sub output_filter{ my($line)[EMAIL PROTECTED]; if($$line =~ /^\s*$/){ return 1; } #if line is not preprocessor directive keep indentation if($$line !~ /^(\s*)\#/){ $$line =~ /^(\s*)./; $INDENT = $1; # here we can handle only the case we have a one line C++/C command. # but looks enougth.. } # The #preprocessor_directive { case if($$line =~ /^(\s*)\#(.*){\s*$/ ){ $$line=$1.#.$2.\n.$INDENT.{\n; } # The { #preprocessor directive case if($$line =~ /^(\s*)\{(.*)#(if|else|endif)(.*)$/ ){ # print From :.$$line.\n; $$line= $1.{.$2.\n#.$3.$4.\n; # print --.$$line.\n; } # The unsigned int:1; case . $$line =~ s/__FORASTYLE__/:/; return 1; } sub usage{ my($name)[EMAIL PROTECTED]; print Usage:\n $name file1 file2 file3 \n; }
Re: squid-3.HEAD IPAddress leak
On Thu, 2008-02-07 at 12:21 +0900, Adrian Chadd wrote: Well, I haven't removed the temporary malloc/free pair, whatever its called; I've just removed Amos' workaround in src/comm.cc so it doesn't leak on my system whilst I profile. Still, this is one of those death of a thousand cuts method of killing performance.. Right, I haven't seen the commit; care to mail the diff? Here you go. --- src/comm.cc 2008-01-19 20:50:42.0 +1300 +++ src/comm.cc-2 2008-02-08 09:56:11.0 +1300 @@ -1381,7 +1381,7 @@ } -#ifdef _SQUID_LINUX_ +#ifdef 0 // _SQUID_LINUX_ /* 2007-11-27: * Linux Debian replaces our allocated AI pointer with garbage when * connect() fails. This leads to segmentation faults deallocating In reply to your last few emails Adrian. re: can we join up on IRC? I may not be able to for the next few days, too much else on. re: Why are you trying to allocate the structure on invocation of GetAddrInfo() ? The design was to follow the well-known structure of the kernel call getaddrinfo(), which fills kernel-managed memory in a thread-safe way down a dynamic structure. The addrinfo structs are too nasty to use naked for anything like the amount of comm usage squid has. They are not an object per-se, but the root of a pointer tree to various types of node, which depend on the flags for their memory sizes. We could make a variant of the call which takes an addrinfo object and sets it up to point at the IPAddress internals correctly. BUT the bug we are looking at would then cause the IPAddress to become garbage, maybe have the kernel free'ing stack memory, etc. re: Argh, this temporary malloc/free pair is peppered throughout the codebase! Grr. You already know there are naked comm calls everywhere. :-( Most of them calls addrinfo was desgined for use in. re: I've removed that hack, and things work fine for me. Ubuntu 7.01 here, x86_64. Yes, I found no problem on ubuntu also. I simply could not (after very little searching) find a #if test that would only work for Debian. Amos
Re: squid-3.HEAD IPAddress leak
Uhm, the kernel won't be free'ing userland memory at all; it doesn't give a rats how its managed. You're probably confused with the library - the libc is fumbling with all the struct addrinfo things. Also, if the bug bugs you, create a temporary pointer and pass that in. ;) Adrian On Fri, Feb 08, 2008, Amos Jeffries wrote: On Thu, 2008-02-07 at 12:21 +0900, Adrian Chadd wrote: Well, I haven't removed the temporary malloc/free pair, whatever its called; I've just removed Amos' workaround in src/comm.cc so it doesn't leak on my system whilst I profile. Still, this is one of those death of a thousand cuts method of killing performance.. Right, I haven't seen the commit; care to mail the diff? Here you go. --- src/comm.cc 2008-01-19 20:50:42.0 +1300 +++ src/comm.cc-2 2008-02-08 09:56:11.0 +1300 @@ -1381,7 +1381,7 @@ } -#ifdef _SQUID_LINUX_ +#ifdef 0 // _SQUID_LINUX_ /* 2007-11-27: * Linux Debian replaces our allocated AI pointer with garbage when * connect() fails. This leads to segmentation faults deallocating In reply to your last few emails Adrian. re: can we join up on IRC? I may not be able to for the next few days, too much else on. re: Why are you trying to allocate the structure on invocation of GetAddrInfo() ? The design was to follow the well-known structure of the kernel call getaddrinfo(), which fills kernel-managed memory in a thread-safe way down a dynamic structure. The addrinfo structs are too nasty to use naked for anything like the amount of comm usage squid has. They are not an object per-se, but the root of a pointer tree to various types of node, which depend on the flags for their memory sizes. We could make a variant of the call which takes an addrinfo object and sets it up to point at the IPAddress internals correctly. BUT the bug we are looking at would then cause the IPAddress to become garbage, maybe have the kernel free'ing stack memory, etc. re: Argh, this temporary malloc/free pair is peppered throughout the codebase! Grr. You already know there are naked comm calls everywhere. :-( Most of them calls addrinfo was desgined for use in. re: I've removed that hack, and things work fine for me. Ubuntu 7.01 here, x86_64. Yes, I found no problem on ubuntu also. I simply could not (after very little searching) find a #if test that would only work for Debian. Amos -- - Xenion - http://www.xenion.com.au/ - VPS Hosting - Commercial Squid Support - - $25/pm entry-level VPSes w/ capped bandwidth charges available in WA -
Re: squid-3.HEAD IPAddress leak
On Thu, 2008-02-07 at 14:08 +0900, Adrian Chadd wrote: On Thu, Feb 07, 2008, Robert Collins wrote: Still, this is one of those death of a thousand cuts method of killing performance.. Right, I haven't seen the commit; care to mail the diff? Which? I just looked at the places where Amos is calling GetAddrInfo() and FreeAddrInfo(); more then one are: * GetAddrInfo(temp, ); * F-{something} = temp; * FreeAddrInfo(temp); Wheee, surely that would be better as F-{something} = GetAddrInfo(...) F-something should all be IPAddress type. Assignment to update based on some results kernel places in temp. That usage above is a guaranteed leak of N bytes where N can be up to the size of a small DNS zone! Please read this for a short synopsis of the requirements of addrinfo usage: http://linux.die.net/man/3/getaddrinfo Amos
Re: squid-3.HEAD IPAddress leak
Have you tried running valgrind? Also, wait. What exactly are you using addr_info for? Can you explain it to me? There's a netural sock addr size type - its called sockaddr_storage. addr_info is for return results from hostname-ip queries.. Adrian On Fri, Feb 08, 2008, Amos Jeffries wrote: On Thu, Feb 07, 2008, Robert Collins wrote: Still, this is one of those death of a thousand cuts method of killing performance.. Right, I haven't seen the commit; care to mail the diff? Which? I just looked at the places where Amos is calling GetAddrInfo() and FreeAddrInfo(); more then one are: * GetAddrInfo(temp, ); * F-{something} = temp; * FreeAddrInfo(temp); Where?! I hate addrinfo enough that I only added addrinfo where it needed: GetAddrInfo(temp) syscall_needing_neutral_sockaddr_size_family(temp) FreeAddrInfo(temp) OR: GetAddrInfo(temp) syscall_altering_addrinfo(temp) F-something = temp; FreeAddrInfo(temp) Amos -- - Xenion - http://www.xenion.com.au/ - VPS Hosting - Commercial Squid Support - - $25/pm entry-level VPSes w/ capped bandwidth charges available in WA -
Re: squid-3.HEAD IPAddress leak
Uhm, the kernel won't be free'ing userland memory at all; it doesn't give a rats how its managed. You're probably confused with the library - the libc is fumbling with all the struct addrinfo things. Possibly. Once an external call is made I no longer care if its kernel or library. Bad effects are equally dangerous to squid. Also, if the bug bugs you, create a temporary pointer and pass that in. ;) Try it. You will be horrified. The pointer-tree needs by the OS to work with screws that up. We would need to allocate a new addrinfo for each master node and set its pointers to new memory, copy the original data into it ... and hey! thats what IPAddress::GetAddrInfo() does! Do not confuse IPAddress::GetAddrInfo() which allocates an initializes a 'full' one-IP addrinfo* tree quickly which is about to be used in place of a blocking getaddrinfo() system call. The second-best alternative in many of these places is hard-coding sockaddr_in* types and using #if USE_IPV6 to alternate the calls. With IPAddress::InitAddrInfo() which allocates a single empty addrinfo* ready for new data to be inserted and only sets the flags up. Amos Adrian On Fri, Feb 08, 2008, Amos Jeffries wrote: On Thu, 2008-02-07 at 12:21 +0900, Adrian Chadd wrote: Well, I haven't removed the temporary malloc/free pair, whatever its called; I've just removed Amos' workaround in src/comm.cc so it doesn't leak on my system whilst I profile. Still, this is one of those death of a thousand cuts method of killing performance.. Right, I haven't seen the commit; care to mail the diff? Here you go. --- src/comm.cc 2008-01-19 20:50:42.0 +1300 +++ src/comm.cc-2 2008-02-08 09:56:11.0 +1300 @@ -1381,7 +1381,7 @@ } -#ifdef _SQUID_LINUX_ +#ifdef 0 // _SQUID_LINUX_ /* 2007-11-27: * Linux Debian replaces our allocated AI pointer with garbage when * connect() fails. This leads to segmentation faults deallocating In reply to your last few emails Adrian. re: can we join up on IRC? I may not be able to for the next few days, too much else on. re: Why are you trying to allocate the structure on invocation of GetAddrInfo() ? The design was to follow the well-known structure of the kernel call getaddrinfo(), which fills kernel-managed memory in a thread-safe way down a dynamic structure. The addrinfo structs are too nasty to use naked for anything like the amount of comm usage squid has. They are not an object per-se, but the root of a pointer tree to various types of node, which depend on the flags for their memory sizes. We could make a variant of the call which takes an addrinfo object and sets it up to point at the IPAddress internals correctly. BUT the bug we are looking at would then cause the IPAddress to become garbage, maybe have the kernel free'ing stack memory, etc. re: Argh, this temporary malloc/free pair is peppered throughout the codebase! Grr. You already know there are naked comm calls everywhere. :-( Most of them calls addrinfo was desgined for use in. re: I've removed that hack, and things work fine for me. Ubuntu 7.01 here, x86_64. Yes, I found no problem on ubuntu also. I simply could not (after very little searching) find a #if test that would only work for Debian. Amos -- - Xenion - http://www.xenion.com.au/ - VPS Hosting - Commercial Squid Support - - $25/pm entry-level VPSes w/ capped bandwidth charges available in WA -
Re: squid-3.HEAD IPAddress leak
On Thu, Feb 07, 2008, Robert Collins wrote: Still, this is one of those death of a thousand cuts method of killing performance.. Right, I haven't seen the commit; care to mail the diff? Which? I just looked at the places where Amos is calling GetAddrInfo() and FreeAddrInfo(); more then one are: * GetAddrInfo(temp, ); * F-{something} = temp; * FreeAddrInfo(temp); Where?! I hate addrinfo enough that I only added addrinfo where it needed: GetAddrInfo(temp) syscall_needing_neutral_sockaddr_size_family(temp) FreeAddrInfo(temp) OR: GetAddrInfo(temp) syscall_altering_addrinfo(temp) F-something = temp; FreeAddrInfo(temp) Amos
Re: squid-3.HEAD IPAddress leak
Have you tried running valgrind? Also, wait. What exactly are you using addr_info for? Can you explain it to me? The basics: addrinfo* extends the neutral sockaddr* type ( union { sockaddr_in, sockaddr_in6, sockaddr_storage} ) while adding the size values and moving the flags and protocol details. I'm using it in three ways. First Usage: [simplest via InitAddrInfo() paired with an assignment somewhere ] - to create an empty addrinfo tree root and pass to the system to be filled out. The results are copied to the fde or xxData objects for later async use. Second usage: [ hiding complex setup via GetAddrInfo() ] - to create addrinfo tree containing the current IPAddress so system or squid can use the structure fields in some comm call. Variant 1: addrinfo provides: AF_*, sockaddr_storage*, and sizeof() results in one package with fixed field names/locations for squid to send individually to a system call without mode-specific #ifdef's. Variant 2: whole addrinfo goes to system for use. Third Usage: [ hiding complex usage via GetAddrInfo() paired with an assignment somewhere ] - to create addrinfo tree containing the current IP and socket information so system can update/add its knowledge of the FQDN, hostname, alternate IPs, socket options, or original destination (in NAT case). (both above Variantions apply here too) - which then gets saved somewhere for squid to use later. All three uses above end up with a tree of dynamic data squid is expected to cleanup. So all uses are terminated with a FreeAddrInfo() to perform that cleanup. There's a netural sock addr size type - its called sockaddr_storage. addr_info is for return results from hostname-ip queries.. struct addrinfo extends struct sockaddr_storage to handle multiple IPs, hostnames, and removes the need for many type-castings in user-space code. Amos Adrian On Fri, Feb 08, 2008, Amos Jeffries wrote: On Thu, Feb 07, 2008, Robert Collins wrote: Still, this is one of those death of a thousand cuts method of killing performance.. Right, I haven't seen the commit; care to mail the diff? Which? I just looked at the places where Amos is calling GetAddrInfo() and FreeAddrInfo(); more then one are: * GetAddrInfo(temp, ); * F-{something} = temp; * FreeAddrInfo(temp); Where?! I hate addrinfo enough that I only added addrinfo where it needed: GetAddrInfo(temp) syscall_needing_neutral_sockaddr_size_family(temp) FreeAddrInfo(temp) OR: GetAddrInfo(temp) syscall_altering_addrinfo(temp) F-something = temp; FreeAddrInfo(temp) Amos -- - Xenion - http://www.xenion.com.au/ - VPS Hosting - Commercial Squid Support - - $25/pm entry-level VPSes w/ capped bandwidth charges available in WA -
Re: squid-3.HEAD IPAddress leak
On Fri, Feb 08, 2008, Amos Jeffries wrote: Try it. You will be horrified. Why? Whats being trashed, AI, or AI-somemember? If AI is being trashed, then just create a temporary pointer copy of AI, use that in the socket call, and then throw it away. if its trashing the memory -at- the AI rather than the AI pointer itself then you should probably spend some time with valgrind. The pointer-tree needs by the OS to work with screws that up. We would need to allocate a new addrinfo for each master node and set its pointers to new memory, copy the original data into it ... and hey! thats what IPAddress::GetAddrInfo() does! It doesn't do that in C! I don't get it. Do not confuse IPAddress::GetAddrInfo() which allocates an initializes a 'full' one-IP addrinfo* tree quickly which is about to be used in place of a blocking getaddrinfo() system call. The second-best alternative in many of these places is hard-coding sockaddr_in* types and using #if USE_IPV6 to alternate the calls. With IPAddress::InitAddrInfo() which allocates a single empty addrinfo* ready for new data to be inserted and only sets the flags up. Are there any situations where you iterate over a list of AddrInfo's ? You're not passing in a struct addr_info which you've allocated yourself into a glibc call which then frobs it and possibly allocates another? Adrian
Re: squid-3.HEAD IPAddress leak
On Fri, Feb 08, 2008, Amos Jeffries wrote: Try it. You will be horrified. Why? Whats being trashed, AI, or AI-somemember? AI-somemember. If AI is being trashed, then just create a temporary pointer copy of AI, use that in the socket call, and then throw it away. if its trashing the memory -at- the AI rather than the AI pointer itself then you should probably spend some time with valgrind. The pointer-tree needs by the OS to work with screws that up. We would need to allocate a new addrinfo for each master node and set its pointers to new memory, copy the original data into it ... and hey! thats what IPAddress::GetAddrInfo() does! It doesn't do that in C! I don't get it. Do not confuse IPAddress::GetAddrInfo() which allocates an initializes a 'full' one-IP addrinfo* tree quickly which is about to be used in place of a blocking getaddrinfo() system call. The second-best alternative in many of these places is hard-coding sockaddr_in* types and using #if USE_IPV6 to alternate the calls. With IPAddress::InitAddrInfo() which allocates a single empty addrinfo* ready for new data to be inserted and only sets the flags up. Are there any situations where you iterate over a list of AddrInfo's ? Yes, in all those F-somewhere = *AI calls. Robert pointed out. In all results from get*() system calls. In all results from xgetaddrinfo() system calls. You're not passing in a struct addr_info which you've allocated yourself into a glibc call which then frobs it and possibly allocates another? I suspect thats whats happening. Or, errors changing it to some sockaddr* thing representing a bad-IP state. Same diff to squid. When it should NOT be altering the pointers, but writing to the memory provided. Look at this: int connect(int sockfd, const struct sockaddr *serv_addr, socklen_t addrlen); in squid ... AI = { ... ai_addr (0xfff...) my pointer somewhere to sockaddr ai_addrlen = 16 ... }; connect(fd, AI-ai_addr, AI-ai_addrlen); afterwards: AI = { ... ai_addr (0x80...) definately NOT my pointer ai_addrlen = 16 ... }; the rest of AI is untouched. So its altering a CONST parameter. And worse. It's taking the pointer by-value as if it was accepting by-ref and alocating its own. Amos
Re: squid-3.HEAD IPAddress leak
And which version of debian on which platform is showing the busted behaviour? Adrian On Fri, Feb 08, 2008, Amos Jeffries wrote: On Thu, 2008-02-07 at 12:21 +0900, Adrian Chadd wrote: Well, I haven't removed the temporary malloc/free pair, whatever its called; I've just removed Amos' workaround in src/comm.cc so it doesn't leak on my system whilst I profile. Still, this is one of those death of a thousand cuts method of killing performance.. Right, I haven't seen the commit; care to mail the diff? Here you go. --- src/comm.cc 2008-01-19 20:50:42.0 +1300 +++ src/comm.cc-2 2008-02-08 09:56:11.0 +1300 @@ -1381,7 +1381,7 @@ } -#ifdef _SQUID_LINUX_ +#ifdef 0 // _SQUID_LINUX_ /* 2007-11-27: * Linux Debian replaces our allocated AI pointer with garbage when * connect() fails. This leads to segmentation faults deallocating In reply to your last few emails Adrian. re: can we join up on IRC? I may not be able to for the next few days, too much else on. re: Why are you trying to allocate the structure on invocation of GetAddrInfo() ? The design was to follow the well-known structure of the kernel call getaddrinfo(), which fills kernel-managed memory in a thread-safe way down a dynamic structure. The addrinfo structs are too nasty to use naked for anything like the amount of comm usage squid has. They are not an object per-se, but the root of a pointer tree to various types of node, which depend on the flags for their memory sizes. We could make a variant of the call which takes an addrinfo object and sets it up to point at the IPAddress internals correctly. BUT the bug we are looking at would then cause the IPAddress to become garbage, maybe have the kernel free'ing stack memory, etc. re: Argh, this temporary malloc/free pair is peppered throughout the codebase! Grr. You already know there are naked comm calls everywhere. :-( Most of them calls addrinfo was desgined for use in. re: I've removed that hack, and things work fine for me. Ubuntu 7.01 here, x86_64. Yes, I found no problem on ubuntu also. I simply could not (after very little searching) find a #if test that would only work for Debian. Amos -- - Xenion - http://www.xenion.com.au/ - VPS Hosting - Commercial Squid Support - - $25/pm entry-level VPSes w/ capped bandwidth charges available in WA -
Re: squid-3.HEAD IPAddress leak
And which version of debian on which platform is showing the busted behaviour? Dell i386 Kernels 2.6.18-4 thru 2.6.22-2 libstdc6 with g++ 4.1 thru 4.3 All the debian setups I have had to test with basically since I first found it. Amos On Fri, Feb 08, 2008, Amos Jeffries wrote: On Thu, 2008-02-07 at 12:21 +0900, Adrian Chadd wrote: Well, I haven't removed the temporary malloc/free pair, whatever its called; I've just removed Amos' workaround in src/comm.cc so it doesn't leak on my system whilst I profile. Still, this is one of those death of a thousand cuts method of killing performance.. Right, I haven't seen the commit; care to mail the diff? Here you go. --- src/comm.cc 2008-01-19 20:50:42.0 +1300 +++ src/comm.cc-2 2008-02-08 09:56:11.0 +1300 @@ -1381,7 +1381,7 @@ } -#ifdef _SQUID_LINUX_ +#ifdef 0 // _SQUID_LINUX_ /* 2007-11-27: * Linux Debian replaces our allocated AI pointer with garbage when * connect() fails. This leads to segmentation faults deallocating In reply to your last few emails Adrian. re: can we join up on IRC? I may not be able to for the next few days, too much else on. re: Why are you trying to allocate the structure on invocation of GetAddrInfo() ? The design was to follow the well-known structure of the kernel call getaddrinfo(), which fills kernel-managed memory in a thread-safe way down a dynamic structure. The addrinfo structs are too nasty to use naked for anything like the amount of comm usage squid has. They are not an object per-se, but the root of a pointer tree to various types of node, which depend on the flags for their memory sizes. We could make a variant of the call which takes an addrinfo object and sets it up to point at the IPAddress internals correctly. BUT the bug we are looking at would then cause the IPAddress to become garbage, maybe have the kernel free'ing stack memory, etc. re: Argh, this temporary malloc/free pair is peppered throughout the codebase! Grr. You already know there are naked comm calls everywhere. :-( Most of them calls addrinfo was desgined for use in. re: I've removed that hack, and things work fine for me. Ubuntu 7.01 here, x86_64. Yes, I found no problem on ubuntu also. I simply could not (after very little searching) find a #if test that would only work for Debian. Amos -- - Xenion - http://www.xenion.com.au/ - VPS Hosting - Commercial Squid Support - - $25/pm entry-level VPSes w/ capped bandwidth charges available in WA -
Re: a simple formatter
Tested. I have made a (small) file of the structures I've noticed past formatting errors on. With a manual fix-up to what I think it should look like in readable C++ under the published squid formatting description. The output of your script does this: .. It is the --break-blocks option of the astyle. I think it can removed. Also, I noticed that the formatter will accept _anything_ given to it as a filename and create the files. Thats rather nasty when non-valid filename sequences are entered. ie --help OK fixed :-). Now accepts only files with .cc,.h,.cci and .c extensions and ignore all other files. I am re-sending the fixed script . In this version also I removed the --break-blocks astyle option. Formated code looks better without it. Excellent. That dropit it down to: --- astyle-test-cases.cc2008-02-08 12:37:14.0 +1300 +++ astyle-test-cases.cc-original 2008-02-07 12:13:07.0 +1300 @@ -10,9 +10,9 @@ /// structures with bit-fields struct t1 { -unsigned int a:1; -unsigned int b:1; -unsigned int c:1; +unsigned int a:1; +unsigned int b:1; +unsigned int c:1; }; I suppose we could live with that. Though I think it may be a problem with the formatter.pl final replacement. Amos
Re: squid-3.HEAD IPAddress leak
On Fri, 2008-02-08 at 11:46 +1300, Amos Jeffries wrote: The basics: addrinfo* extends the neutral sockaddr* type ( union { sockaddr_in, sockaddr_in6, sockaddr_storage} ) while adding the size values and moving the flags and protocol details. Hi Amos, I understand that we need to support IPv4 and IPv6 addresses. I do not understand why we are suddenly talking about pointers and trees? I can understand if wherever Squid2 was using an IPv4 address, Squid3 would use an (ip4, ip6, whichOneFlag) class with appropriate methods that check the flag and return or set the right thing. I know that would not be the best way to implement it, but it would work and I would understand what is going on. Can you explain AddrInfo in the above simple (ip4, ip6, whichOneFlag) class terms? Why do we need AddrInfo now and why are we talking about a tree with pointers and alloc/free operations? Did Squid2 use an equivalent of AddrInfo? If yes, did it use it as often as Squid3 does now? Did Squid2 allocated and freed it? Thank you, Alex. I'm using it in three ways. First Usage: [simplest via InitAddrInfo() paired with an assignment somewhere ] - to create an empty addrinfo tree root and pass to the system to be filled out. The results are copied to the fde or xxData objects for later async use. Second usage: [ hiding complex setup via GetAddrInfo() ] - to create addrinfo tree containing the current IPAddress so system or squid can use the structure fields in some comm call. Variant 1: addrinfo provides: AF_*, sockaddr_storage*, and sizeof() results in one package with fixed field names/locations for squid to send individually to a system call without mode-specific #ifdef's. Variant 2: whole addrinfo goes to system for use. Third Usage: [ hiding complex usage via GetAddrInfo() paired with an assignment somewhere ] - to create addrinfo tree containing the current IP and socket information so system can update/add its knowledge of the FQDN, hostname, alternate IPs, socket options, or original destination (in NAT case). (both above Variantions apply here too) - which then gets saved somewhere for squid to use later. All three uses above end up with a tree of dynamic data squid is expected to cleanup. So all uses are terminated with a FreeAddrInfo() to perform that cleanup. There's a netural sock addr size type - its called sockaddr_storage. addr_info is for return results from hostname-ip queries.. struct addrinfo extends struct sockaddr_storage to handle multiple IPs, hostnames, and removes the need for many type-castings in user-space code. Amos Adrian On Fri, Feb 08, 2008, Amos Jeffries wrote: On Thu, Feb 07, 2008, Robert Collins wrote: Still, this is one of those death of a thousand cuts method of killing performance.. Right, I haven't seen the commit; care to mail the diff? Which? I just looked at the places where Amos is calling GetAddrInfo() and FreeAddrInfo(); more then one are: * GetAddrInfo(temp, ); * F-{something} = temp; * FreeAddrInfo(temp); Where?! I hate addrinfo enough that I only added addrinfo where it needed: GetAddrInfo(temp) syscall_needing_neutral_sockaddr_size_family(temp) FreeAddrInfo(temp) OR: GetAddrInfo(temp) syscall_altering_addrinfo(temp) F-something = temp; FreeAddrInfo(temp) Amos -- - Xenion - http://www.xenion.com.au/ - VPS Hosting - Commercial Squid Support - - $25/pm entry-level VPSes w/ capped bandwidth charges available in WA -
Re: a simple formatter
On Wed, 2008-02-06 at 21:26 +0200, Tsantilas Christos wrote: Do we need something like that? Any comments/suggestions? Any testers? I believe we do and I appreciate you working on it! Please try to fix the remaining problems Amos pointed out. Also, can you apply it to the entire Squid tree, remove all whitespace, and calculate md5, comparing that with the virgin whitespace-less tree? The MD5s should be the same, right? This is not a perfect check because spaces within strings/etc are stripped and not checked, but it is a pretty good one. Thank you, Alex.
Re: a simple formatter
On Wed, 2008-02-06 at 21:26 +0200, Tsantilas Christos wrote: Do we need something like that? Any comments/suggestions? Any testers? I believe we do and I appreciate you working on it! Please try to fix the remaining problems Amos pointed out. Also, can you apply it to the entire Squid tree, remove all whitespace, and calculate md5, comparing that with the virgin whitespace-less tree? The MD5s should be the same, right? Oooh. Nice test. This is not a perfect check because spaces within strings/etc are stripped and not checked, but it is a pretty good one. It will also miss the #if 0 { block problem. Though my unit-test for that will catch it any any other whitsspace-exact case we care to write the test for. We may need a compile test for any other similar ones that have not had test-cases written. Just have to figure out the best way to have the test file built into the codebase for automatic testing. Amos
Re: a simple formatter
On Fri, 2008-02-08 at 14:31 +1300, Amos Jeffries wrote: On Wed, 2008-02-06 at 21:26 +0200, Tsantilas Christos wrote: Do we need something like that? Any comments/suggestions? Any testers? I believe we do and I appreciate you working on it! Please try to fix the remaining problems Amos pointed out. Also, can you apply it to the entire Squid tree, remove all whitespace, and calculate md5, comparing that with the virgin whitespace-less tree? The MD5s should be the same, right? Oooh. Nice test. This is not a perfect check because spaces within strings/etc are stripped and not checked, but it is a pretty good one. It will also miss the #if 0 { block problem. Are there free C++ source code obfuscation programs? If they are guaranteed to generate the same source code regardless of formatting, then applying them would catch even more bugs. Too bad compilers produce different output for each execution due to timestamps and such. Perhaps there is a way to avoid that and compare md5s of squid executables? Cheers, Alex. P.S. A basic test file would be good to have anyway, so thank you for bootstrapping that.
Re: squid-3.HEAD IPAddress leak
On Fri, 2008-02-08 at 11:46 +1300, Amos Jeffries wrote: The basics: addrinfo* extends the neutral sockaddr* type ( union { sockaddr_in, sockaddr_in6, sockaddr_storage} ) while adding the size values and moving the flags and protocol details. Hi Amos, I understand that we need to support IPv4 and IPv6 addresses. I do not understand why we are suddenly talking about pointers and trees? I can understand if wherever Squid2 was using an IPv4 address, Squid3 would use an (ip4, ip6, whichOneFlag) class with appropriate methods that check the flag and return or set the right thing. I know that would not be the best way to implement it, but it would work and I would understand what is going on. The protocol extensions defined to replace sockaddr_in and in_addr went through three generations of 'migration-pathway' attempts. The first; sockaddr* defines a union type that was nasty for casting and did not include that flag you mention in the structure itself. Most of the accept(), connect(), bind() etc still use this as their input type. The second; sockaddr_storage (as Husni uses, and Adrian mentioned) was created to provide a better way of using sockaddr* so the sockaddr_in and sockaddr_in6 bits could be read-written easily. But the big/litte endian problems between OS screwed up the sockaddr_in* sa_family field locations inside it so developers still can't portably use it for the v6/v6 flag they wanted. The third; addrinfo* defines a whole new type. Wrapping that old sockaddr* mess and providing in a nice set fo bells and whistles for use. Most importantly that flag we need to pas to the system calls. It is in _exactly_ the class you mention. But in struct form so C people can use it too. Can you explain AddrInfo in the above simple (ip4, ip6, whichOneFlag) class terms? There are a few extra bits I'v snipped from this explanation. The important bits for the squid comm code are: struct addrinfo { ... here is that flag you want to switch on if its been set properly. ... AF_INET (IPv4) or AF_INET6 (IPv6) or AF_UNSPEC (NULL, error, 'whatever') int ai_family; ... some bits needed only by the socket() creation calls int ai_socktype; int ai_protocol; ... and the actual data for the IP address: ... (1) pointer to somewhere holding the sockaddr_storage/*_in/*_in6 struct sockaddr *ai_addr; ... (2) and the sizeof() for the (1) object. ... if the flag is not set this _might_ be magic'd instead. int ai_addrlen; ... along with some bits set when ia DNS lookup has been done ... _sometimes_ it's the host FQDN ... optional so we never new/free this ourselves in squid. char *ai_canonname; ... and because some system lookups in IPv6 can return _multiple_ addresses ... it has a linked-list of branchs for each of the alternatives struct addrinfo *ai_next; /* Pointer to next in list. */ }; Why do we need AddrInfo now To kill dozens of castings, magic operations on object sizes, and cut down dozens of lines of 'compatibility code'. and why are we talking about a tree with pointers and alloc/free operations? I see a tree of pointers I call it a tree even if its a class-1 (dynamic linked-list). addrinfo - makes a tree node. ai_addr (a sockaddr pile of unfortunate structs) ai_canonname (a null-terminated string), ai_next (another addrinfo .. repeat ad inifinitum) Together they make a pretty tree. But every used piece is eseentially another new, memset, free. HTH. Did Squid2 use an equivalent of AddrInfo? If yes, did it use it as often as Squid3 does now? Did Squid2 allocated and freed it? What I've seen of Husni's final 2.6s13 patch he used addrinfo in many of the same system-interface places and uses the naked system getaddrinfo()/freeaddrinfo() calls where I use my GetAddrInfo() wrappers to the system ones. Then where he is passing around either sockaddr* or sockaddr_storage* where I universally use IPAddress . There are a few spots where a 'read-only' variant might be possible. So as to pretend to allocate when no new/free is actually done, just a pointer to the IPAddress field and a local addrinfo. One of them is this connect() IFF that bug can be guaranteed not to overwrite the internal IPAddress data and warp squid. I may have some time to do this tonight. Bug me about it if I'm on. Amos Thank you, Alex. I'm using it in three ways. First Usage: [simplest via InitAddrInfo() paired with an assignment somewhere ] - to create an empty addrinfo tree root and pass to the system to be filled out. The results are copied to the fde or xxData objects for later async use. Second usage: [ hiding complex setup via GetAddrInfo() ] - to create addrinfo tree containing the current IPAddress so system or squid can use the structure fields in some comm call. Variant 1: addrinfo provides: AF_*, sockaddr_storage*, and sizeof() results in one package with fixed field names/locations for squid to send individually to a system call
Re: a simple formatter
On Fri, 2008-02-08 at 14:31 +1300, Amos Jeffries wrote: On Wed, 2008-02-06 at 21:26 +0200, Tsantilas Christos wrote: Do we need something like that? Any comments/suggestions? Any testers? I believe we do and I appreciate you working on it! Please try to fix the remaining problems Amos pointed out. Also, can you apply it to the entire Squid tree, remove all whitespace, and calculate md5, comparing that with the virgin whitespace-less tree? The MD5s should be the same, right? Oooh. Nice test. This is not a perfect check because spaces within strings/etc are stripped and not checked, but it is a pretty good one. It will also miss the #if 0 { block problem. Are there free C++ source code obfuscation programs? If they are guaranteed to generate the same source code regardless of formatting, then applying them would catch even more bugs. Too bad compilers produce different output for each execution due to timestamps and such. What do you mean by this? A standard entry-level compiler test is that it produces the same output from the same input every time. Beginner students are taught that in the bootstrapping lessons: If it compiles its own code and produces a different binary, do it again until it stops or breaks. If it breaks you started with buggy code and still have work to do. Perhaps there is a way to avoid that and compare md5s of squid executables? Cheers, Alex. P.S. A basic test file would be good to have anyway, so thank you for bootstrapping that. I thought it would be helpful if you and Christos decided on an astyle formatter. Turned out right. Amos
Re: squid-3.HEAD IPAddress leak
On Fri, Feb 08, 2008, Amos Jeffries wrote: sockaddr_storage (as Husni uses, and Adrian mentioned) was created to provide a better way of using sockaddr* so the sockaddr_in and sockaddr_in6 bits could be read-written easily. But the big/litte endian problems between OS screwed up the sockaddr_in* sa_family field locations inside it so developers still can't portably use it for the v6/v6 flag they wanted. How'd that screw things up? Hm, I'll have to read. ... (1) pointer to somewhere holding the sockaddr_storage/*_in/*_in6 struct sockaddr *ai_addr; Which you have to allocate. and do when needed. char *ai_canonname; Which we're not using, and you have to allocate. and don't. ... and because some system lookups in IPv6 can return _multiple_ addresses ... it has a linked-list of branchs for each of the alternatives struct addrinfo *ai_next; /* Pointer to next in list. */ }; Which is only useful in the context of providing lists of addresses for a given host or IP address. What getaddrinfo() returns. and consider read-only. Why do we need AddrInfo now To kill dozens of castings, magic operations on object sizes, and cut down dozens of lines of 'compatibility code'. And add in a minimum of one, and a maximum of two allocations per socket operation. Together they make a pretty tree. But every used piece is eseentially another new, memset, free. Ah, and here you will have problems. Agreed. The members of that struct should probably be malloc, free, and not new/delete. You're using new/delete which -should- map to a default new operator and head off to the malloc libraries, but -squids- idea of the malloc interface could differ from the -library- idea of the malloc interface. I was thinking squid overiding the new/delete to its xmalloc/xfree at the same time it overrode the general malloc/free. So you should probably drop the new/delete'ing of the addrinfo stuff and replace it with malloc/free. If thats better, no problem. You're also memset()'ing the addrinfo struct whether you allocated it or not, which may be double-memset'ing the thing, and if someone passed in an addrinfo it may have structure members which have now been leaked. In the current usage the call should be the allocation. Not external. Allowing external allocation via another API call would be the only optimisation I can think of that does not break anything anywhere. memset is needed there because I could not tell that the new/delete _guaranteed_ pre-NULL'd memory and a single set bit in any unused fields might cause a crash later. With your deeper knowledge of the memory allocation in squid-3, feel free to alter the default memset()'s. Amos
Re: squid-3.HEAD IPAddress leak
On Fri, Feb 08, 2008, Amos Jeffries wrote: sockaddr_storage (as Husni uses, and Adrian mentioned) was created to provide a better way of using sockaddr* so the sockaddr_in and sockaddr_in6 bits could be read-written easily. But the big/litte endian problems between OS screwed up the sockaddr_in* sa_family field locations inside it so developers still can't portably use it for the v6/v6 flag they wanted. How'd that screw things up? Hm, I'll have to read. ... (1) pointer to somewhere holding the sockaddr_storage/*_in/*_in6 struct sockaddr *ai_addr; Which you have to allocate. char *ai_canonname; Which we're not using, and you have to allocate. ... and because some system lookups in IPv6 can return _multiple_ addresses ... it has a linked-list of branchs for each of the alternatives struct addrinfo *ai_next; /* Pointer to next in list. */ }; Which is only useful in the context of providing lists of addresses for a given host or IP address. What getaddrinfo() returns. Why do we need AddrInfo now To kill dozens of castings, magic operations on object sizes, and cut down dozens of lines of 'compatibility code'. And add in a minimum of one, and a maximum of two allocations per socket operation. Together they make a pretty tree. But every used piece is eseentially another new, memset, free. Ah, and here you will have problems. The members of that struct should probably be malloc, free, and not new/delete. You're using new/delete which -should- map to a default new operator and head off to the malloc libraries, but -squids- idea of the malloc interface could differ from the -library- idea of the malloc interface. So you should probably drop the new/delete'ing of the addrinfo stuff and replace it with malloc/free. You're also memset()'ing the addrinfo struct whether you allocated it or not, which may be double-memset'ing the thing, and if someone passed in an addrinfo it may have structure members which have now been leaked. Adrian
Re: squid-3.HEAD IPAddress leak
On Fri, 2008-02-08 at 15:52 +1300, Amos Jeffries wrote: The second; sockaddr_storage (as Husni uses, and Adrian mentioned) was created to provide a better way of using sockaddr* so the sockaddr_in and sockaddr_in6 bits could be read-written easily. But the big/litte endian problems between OS screwed up the sockaddr_in* sa_family field locations inside it so developers still can't portably use it for the v6/v6 flag they wanted. Do you have a reference for that? I do not want to bug you with more questions but I am surprised to learn that some kind of a sockaddr_storage wrapper cannot work well for Squid... We may have to fix Polygraph that is using that approach, IIRC. The third; addrinfo* defines a whole new type. Wrapping that old sockaddr* mess and providing in a nice set fo bells and whistles for use. Most importantly that flag we need to pas to the system calls. I have to say that the nice set of bells and whistles in a basic address structure used throughout a performance-sensitive program raises red flags, but perhaps the actual performance implications are not as bad. The important bits for the squid comm code are: struct addrinfo { int ai_family; int ai_socktype; int ai_protocol; struct sockaddr *ai_addr; // pointer to sockaddr_storage/*_in/*_in6 int ai_addrlen; char *ai_canonname; // we never new/free this ourselves in squid. struct addrinfo *ai_next; // Pointer to next in list. }; Can you replace IPAddress data members with the above, except not use any pointers and forget about ai_next and ai_canonname? I think doing so will eliminate temporary allocations and other things that look rather scary to both code quality and performance folks. When you do need a list of addresses or canonname, it is OK to use addrinfo and convert from/to IPAddress as needed, of course. I am guessing such uses will not affect performance or overall code quality. Thank you, Alex.