Robert Bonomi wrote:
I've _got_ to be doing something wrong, since I'm getting heap corruption
calling it.  But for the life of me, I can't figure out -what- is wrong.

What I've got makes "a whole lot of no sense" -- I get corruption of
the _same_ malloc()'d data structure (at *exactly* the same offset into
the structure!!) on both 7.2 i386 and 8.0 amd64 releases (on different
hardware).

Unfortunately there is a _lot_ of code, including significant use of
malloc()/free(), so attempting to whittle things down to a
minimal test case would be very awkward.  Right now, I've got the
corruption at a 'known' place, but no clue as to -how- it's happening --
available evidence seems to exclude everything passed _into_ getpwnam_r().

the offending call is :
   getpwnam_r(cp3, &pw_data, buffer2, sizeof(buffer2), &pwd);

data declaration at the beginning of the function:
   char buffer[1024];
   char buffer2[1024];
   char mailbox[1024];
   char *cp,*cp2,*cp3,*cp4 = buffer;
   struct passwd pw_data,pw_data2,*pwd=&pw_data2;
   int i;

The whole program is around 2500 lines of code and headers, with, as
mentioned, _lots_ of malloc()/free() activity.  I can put it up on
my web-server, if somebody really wants to dig.

I've tried changing the size of buffer2 to 8kb, in case I was
over-running the 1k buffer.  (Unfortunately the manpage does _not_
specify a minimum size for the buffer.)
I've tried declaring buffer2 _and_ the 'struct passwd' items as
'static', so that _if_ the corruption was coming from one of
those addresses, the corruption *should* move.

*NONE* of those changes made _any_ difference in where the corruption
was occurring, or _what_ was being written there.

I'm *really* baffled.   HELP!!! <*whimper*>

_What_ stuff shows up _does_ differ between the 7.2 and 8.0 systems.

8.0 reliably produces 116 bytes corrupted:
[gdb command: x/112 &private_data_pointer->remotehostname]
0x40a0e070:     103 'g' 114 'r' 111 'o' 117 'u' 112 'p' 0 '\0'  0 '\0'  0 '\0'
0x40a0e078:     99 'c'  111 'o' 109 'm' 112 'p' 97 'a'  116 't' 0 '\0'  0 '\0'
0x40a0e080:     99 'c'  111 'o' 109 'm' 112 'p' 97 'a'  116 't' 0 '\0'  0 '\0'
0x40a0e088:     104 'h' 111 'o' 115 's' 116 't' 115 's' 0 '\0'  0 '\0'  0 '\0'
0x40a0e090:     102 'f' 105 'i' 108 'l' 101 'e' 115 's' 0 '\0'  0 '\0'  0 '\0'
0x40a0e098:     102 'f' 105 'i' 108 'l' 101 'e' 115 's' 0 '\0'  0 '\0'  0 '\0'
0x40a0e0a0:     102 'f' 105 'i' 108 'l' 101 'e' 115 's' 0 '\0'  0 '\0'  0 '\0'
0x40a0e0a8:     112 'p' 97 'a'  115 's' 115 's' 119 'w' 100 'd' 0 '\0'  0 '\0'
0x40a0e0b0:     99 'c'  111 'o' 109 'm' 112 'p' 97 'a'  116 't' 0 '\0'  0 '\0'
0x40a0e0b8:     115 's' 104 'h' 101 'e' 108 'l' 108 'l' 115 's' 0 '\0'  0 '\0'
0x40a0e0c0:     102 'f' 105 'i' 108 'l' 101 'e' 115 's' 0 '\0'  0 '\0'  0 '\0'
0x40a0e0c8:     99 'c'  111 'o' 109 'm' 112 'p' 97 'a'  116 't' 0 '\0'  0 '\0'
0x40a0e0d0:     102 'f' 105 'i' 108 'l' 101 'e' 115 's' 0 '\0'  0 '\0'  0 '\0'
0x40a0e0d8:     102 'f' 105 'i' 108 'l' 101 'e' 115 's' 0 '\0'  0 '\0'  0 '\0'


7.2 produces 64 bytes corrupted:
[gdb command: x/112 &private_data_pointer->remotehostname]
0x2820b098:     100 'd' 110 'n' 115 's' 0 '\0'  110 'n' 105 'i' 115 's' 0 '\0'
0x2820b0a0:     110 'n' 105 'i' 115 's' 0 '\0'  114 'r' 112 'p' 99 'c'  0 '\0'
0x2820b0a8:     0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'
0x2820b0b0:     0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'
0x2820b0b8:     0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'
0x2820b0c0:     0 '\0'  0 '\0'  0 '\0'  0 '\0'  3 '\003'        0 '\0'  8 '\b'  0 '\0'
0x2820b0c8:     0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'
0x2820b0d0:     2 '\002'        0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'

This stuff looks like it -might- be from a nsswitch.conf parse.
I dunno.

anybody got _any_ ideas?

This might help.

Just above the crashing call, open a file, dump the contents of the variables you're passing to the function with fprintf, then close the file. I suspect cp3 doesn't point to valid data and is causing the function call to fail (it looks like it's pointing to the same thing as buffer, which doesn't look like how the function should be called). Having the contents of the individual variables will help you narrow down exactly what's occurring. The data you want to see is the pointers themselves, and maybe the first 8 or so characters of the data each one points to.
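
Something along these lines is what I have in mind. It's only a rough sketch: dump_call_args() and the log path are names I made up, and the parameters just mirror the arguments in your call. It logs each pointer value, plus the first 8 characters of the name.

```c
#include <assert.h>
#include <pwd.h>
#include <stdio.h>
#include <string.h>

/*
 * dump_call_args() is a hypothetical helper, not part of any library.
 * It appends the pointer values about to be handed to getpwnam_r() to
 * a log file, plus the first 8 characters of the name string.
 * Caveat: if name is a wild pointer, the %.8s read may itself fault,
 * which is also a useful data point.
 */
int
dump_call_args(const char *path, const char *name, struct passwd *pw,
    char *buf, size_t buflen, struct passwd **result)
{
    FILE *fp;

    assert(path != NULL);
    fp = fopen(path, "a");
    if (fp == NULL)
        return -1;

    fprintf(fp, "name   = %p", (const void *)name);
    if (name != NULL)
        fprintf(fp, " \"%.8s\"", name);
    fputc('\n', fp);
    fprintf(fp, "pw     = %p\n", (void *)pw);
    fprintf(fp, "buf    = %p (len %zu)\n", (void *)buf, buflen);
    fprintf(fp, "result = %p\n", (void *)result);

    /* fclose() flushes the stream before the suspect call is made. */
    return fclose(fp) == 0 ? 0 : -1;
}
```

You would call it immediately before the suspect line, e.g. dump_call_args("/tmp/args.log", cp3, &pw_data, buffer2, sizeof(buffer2), &pwd); and then compare the logged addresses with the address of the structure that gets corrupted.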

And if this doesn't help, there's always Google CodeSearch http://www.google.com/codesearch , for examples of how to call it.
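
For comparison, here is roughly how I'd expect a call to look. Treat it as a sketch under assumptions, not the one true way: lookup_uid() is a made-up wrapper, the 1024-byte fallback and the 1 MB cap are arbitrary, and it assumes sysconf(_SC_GETPW_R_SIZE_MAX) gives a usable size hint on your system (it can return -1). It checks the return value, grows the buffer on ERANGE, and treats a NULL result as "no such user".

```c
#include <assert.h>
#include <errno.h>
#include <pwd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * lookup_uid() is a hypothetical wrapper for illustration only.
 * Returns 0 and stores the uid on success, ENOENT if no such user
 * was found, or the error number reported by getpwnam_r().
 */
int
lookup_uid(const char *name, uid_t *uid_out)
{
    struct passwd pw, *result = NULL;
    long hint = sysconf(_SC_GETPW_R_SIZE_MAX);
    size_t buflen = (hint > 0) ? (size_t)hint : 1024; /* arbitrary fallback */
    char *buf = NULL, *tmp;
    int rc;

    assert(name != NULL && uid_out != NULL);

    for (;;) {
        tmp = realloc(buf, buflen);
        if (tmp == NULL) {
            free(buf);
            return ENOMEM;
        }
        buf = tmp;

        rc = getpwnam_r(name, &pw, buf, buflen, &result);
        if (rc != ERANGE || buflen >= (size_t)1 << 20)
            break;          /* done, or give up at the arbitrary 1 MB cap */
        buflen *= 2;        /* buffer too small: grow and retry */
    }

    if (rc == 0 && result == NULL)
        rc = ENOENT;        /* call succeeded but no entry matched */
    if (rc == 0)
        *uid_out = result->pw_uid;

    free(buf);
    return rc;
}
```

Note that the last argument is the address of a struct passwd pointer that the function fills in; it doesn't need to point at anything beforehand.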


_______________________________________________
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "freebsd-questions-unsubscr...@freebsd.org"
