Re: [OMPI devel] Possible buffer overrun bug in opal_free_list_grow, called by MPI::Init
Stephan, I think you're completely right, and that I had a wrong understanding of the modulus operator. Based on my memory, I was pretty sure that the modulus is ALWAYS positive. Now even Wikipedia seems to contradict me :) It has a pretty good rundown of how % is defined in each programming language (http://en.wikipedia.org/wiki/Modulo_operation). I will apply your patch to all places where we use modulus in Open MPI. Thanks for your help on this issue.

Thanks,
george.

On Oct 19, 2008, at 1:43 PM, Stephan Kramer wrote:

George Bosilca wrote:

Stephan,

You might be right. intptr_t is a signed type, which allows the result of % to be potentially negative. On the other hand, mod is declared as a size_t, which is definitely unsigned, as it represents a size. Did you try to apply your patch to Open MPI? If so, does it resolve the issue?

george.

Yes, I have applied the patch intptr_t -> uintptr_t and it does resolve the issue. I think the way this works (I'm not a C programmer myself) is:
- the outcome of the % is a signed and negative number, say -x
- this number gets wrapped in the assignment to the unsigned integer mod, giving UINT_MAX+1-x
- in the subtraction CACHE_LINE_SIZE - mod, the result is wrapped around again, giving CACHE_LINE_SIZE+x

Cheers
Stephan

On Oct 16, 2008, at 7:29 PM, Stephan Kramer wrote:

George Bosilca wrote:

I did investigate this issue for about 3 hours yesterday. Neither valgrind nor efence reports any errors on my cluster. I'm using Debian unstable with gcc 4.1.2. Adding printfs doesn't show the same output as yours; all addresses are in the correct range. I went over the code manually, and to be honest I cannot imagine how this might happen IF the compiler is doing what it is supposed to do. I have run out of options on this one. If you can debug it and figure out what the problem is, I'll be happy to hear.

george.

Hi George,

Thanks a lot for your effort of looking into this. I think I've come a bit further with this.
The reproducibility may in fact have to do with 32-bit/64-bit differences. I think the culprit is line 105 of opal_free_list.c:

    mod = (intptr_t)ptr % CACHE_LINE_SIZE;
    if (mod != 0) {
        ptr += (CACHE_LINE_SIZE - mod);
    }

Since intptr_t is a signed type, on 32-bit systems with addresses above 0x7fffffff the outcome of mod will be negative. Thus ptr will be increased by more than CACHE_LINE_SIZE, which is not accounted for in the buffer size allocated at line 93, and a buffer overrun will appear in the subsequent element loop. This is confirmed by the output of some debugging statements I've pasted below. Also, I haven't come across the same bug on 64-bit machines. I guess this should be uintptr_t instead?

Cheers
Stephan Kramer

The debugging output:

    mpidebug: num_elements = 1, flist->fl_elem_size = 40
    mpidebug: sizeof(opal_list_item_t) = 16
    mpidebug: allocating 184
    mpidebug: allocated at memory address 0xb2d29f48
    mpidebug: mod = -40, CACHE_LINE_SIZE = 128

and at the point of the buffer overrun/efence segfault in gdb:

    (gdb) print item
    $23 = (opal_free_list_item_t *) 0xb2d2a000

which is exactly at (over) the end of the buffer: 0xb2d2a000 = 0xb2d29f48 + 184.

On Oct 14, 2008, at 11:03 AM, Stephan Kramer wrote:

Would someone mind having another look at the bug reported below? I'm hitting exactly the same problem with Debian unstable, openmpi 1.2.7~rc2. Both valgrind and efence are indispensable tools for any developer, and each may catch errors the other won't report. Electric Fence is especially good at catching buffer overruns, as it protects the beginning and end of each allocated buffer. The original bug report shows an undeniable buffer overrun in MPI::Init: the attached patch prints out exactly the address it's trying to access, which is past the end of the buffer. Any help would be much appreciated.

Stephan Kramer

Patrick, I'm unable to reproduce the buffer overrun with the latest trunk.
I run valgrind (with the memchecker tool) on a regular basis on the trunk, and I have never noticed anything like that. Moreover, I went over the code, and I cannot imagine how we could overrun the buffer in the code you pinpointed.

Thanks,
george.

On Aug 23, 2008, at 7:57 PM, Patrick Farrell wrote:

> Hi,
>
> I think I have found a buffer overrun in a function
> called by MPI::Init, though explanations of why I am
> wrong are welcome.
>
> I am using the openmpi included in Ubuntu Hardy,
> version 1.2.5, though I have inspected the latest trunk by eye
> and I don't believe the relevant code has changed.
>
> I was trying to use Electric Fence, a memory debugging library,
> to debug a suspected buffer overrun in my own program.
> Electric Fence works by replacing malloc/free in such
> a way that bounds violation errors issue a segfault.
> While running my program under Electric Fence, I found
> that I got a segfault issued at:
>
> 0xb5cdd
Re: [OMPI devel] -display-map
Hmmm... just to be sure we are all clear on this. The reason we proposed to use mpirun is that "hostfile" has no meaning outside of mpirun. That's why ompi_info can't do anything in this regard. We have no idea what hostfile the user may specify until we actually get the mpirun cmd line. They may have specified a default hostfile, but they could also specify hostfiles for the individual app_contexts. These may or may not include the node upon which mpirun is executing.

So the only way to provide you with a separate command to get a hostfile<->nodename mapping would require you to provide us with the default-hostfile and/or hostfile cmd line options just as if you were issuing the mpirun cmd. We just wouldn't launch - it would be the exact equivalent of doing "mpirun --do-not-launch".

Am I missing something? If so, please do correct me - I would be happy to provide a tool if that would make it easier. I'm just not sure what that tool would do.

Thanks
Ralph

On Oct 19, 2008, at 1:59 PM, Greg Watson wrote:

Ralph,

It seems a little strange to be using mpirun for this, but barring a separate command, or using ompi_info, I think this would solve our problem.

Thanks,
Greg

On Oct 17, 2008, at 10:46 AM, Ralph Castain wrote:

Sorry for the delay - I had to ponder this one for a while. Jeff and I agree that adding something to ompi_info would not be a good idea. ompi_info has no knowledge or understanding of hostfiles, and adding that capability to it would be a major distortion of its intended use.

However, we think we can offer an alternative that might better solve the problem. Remember, we now treat hostfiles in a very different manner than before - see the wiki page for a complete description, or "man orte_hosts". So the problem is that, to provide you with what you want, we need to "dump" the information from whatever default hostfile was provided and, if no default hostfile was provided, then the information from each hostfile that was provided with an app_context.
The best way we could think of to do this is to add another mpirun cmd line option, --dump-hostfiles, that would output the line-by-line name from the hostfile plus the name we resolved it to. Of course, --xml would cause it to be in XML format. Would that meet your needs?

Ralph

On Oct 15, 2008, at 3:12 PM, Greg Watson wrote:

Hi Ralph,

We've been discussing this back and forth a bit internally and don't really see an easy solution. Our problem is that Eclipse is not running on the head node, so gethostbyname will not necessarily resolve to the same address. For example, the hostfile might refer to the head node by an internal network address that is not visible to the outside world. Since gethostname also looks in /etc/hosts, it may resolve locally but not on a remote system. The only thing I can think of would be, rather than us reading the hostfile directly as we do now, to provide an option to ompi_info that would dump the hostfile using the same rules that you apply when you're using the hostfile. Would that be feasible?

Greg

On Sep 22, 2008, at 4:25 PM, Ralph Castain wrote:

Sorry for the delay - I was on vacation and am now trying to work my way back to the surface. I'm not sure I can fix this one, for two reasons:

1. In general, OMPI doesn't really care what name is used for the node. However, the problem is that it needs to be consistent. In this case, ORTE has already used the name returned by gethostname to create its session directory structure long before mpirun reads a hostfile. This is why we retain the value from gethostname instead of allowing it to be overwritten by the name in whatever allocation we are given.
Using the name in the hostfile would require that I either find some way to remember any prior name, or that I tear down and rebuild the session directory tree - neither seems attractive nor simple (e.g., what happens when the user provides multiple entries in the hostfile for the node, each with a different IP address based on another interface in that node? Sounds crazy, but we have already seen it done - which one do I use?).

2. We don't actually store the hostfile info anywhere - we just use it and forget it. For us to add an XML attribute containing any hostfile-related info would therefore require us to re-read the hostfile. I could have it do that -only- in the case of "XML output required", but it seems rather ugly.

An alternative might be for you to simply do a "gethostbyname" lookup of the IP address or hostname to see if it matches, instead of just doing a strcmp. This is what we have to do internally, as we frequently have problems with FQDN vs. non-FQDN vs. IP addresses etc. If the local OS hasn't cached the IP address for the node in question it can take a little time to DNS-resolve it, but otherwise it works fine. I can point you to the code in OPAL that we use - I w