Re: Odd Memory Corruption

2007-07-06 Thread chromatic
On Thursday 05 July 2007 19:58:50 Patrick R. Michaud wrote:

 I also get segfaults after applying this patch.

 However, if I change the patch such that the size_t sentinel;
 line goes at the end of the struct PMC instead of the beginning,
 then everything appears to compile and run.

 In fact, if size_t sentinel; is placed as the second or third
 line in the struct (i.e., after pobj_t obj; or VTABLE *vtable;)
 then things still appear to compile and run.  It's only when the
 sentinel is given as the first line that I get errors.

 So, I'm guessing there's some code somewhere that assumes that
 obj is always the first component of PMC.

Right, and the clue was also in parrot/pobj.h.  Because buffers, strings, and 
PMCs all are PObjs, their structs must be isomorphic.  Hoisting the sentinel 
into the pobj_t struct gets past that segfault.

Structural subtyping hates me.

-- c


Re: Odd Memory Corruption

2007-07-06 Thread Ron Blaschke
chromatic wrote:
 In theory, this patch should apply and run cleanly.  It doesn't.
 
 Thus, something somewhere pokes into memory it shouldn't.
 
 Any ideas?  Alternately, any comments on this analysis?

I think adding memory checks is a brilliant idea, especially because
memory is sometimes recycled with GC, thus malloc/free checks being less
useful.  The bug where a PMC was still used while already being put on
the free list, and aggravated by the fact that the free list pointer is
put at the same address as the data comes into mind.

I really like what Microsoft has done with their memory guards.

The Structure of a Page Heap Block
http://msdn2.microsoft.com/en-us/library/ms220938(vs.80).aspx

It would be great if someone could have a look at patches 43432 and
43529.  They might be memory relevant too.

Ron


Odd Memory Corruption

2007-07-05 Thread chromatic
In theory, this patch should apply and run cleanly.  It doesn't.

Thus, something somewhere pokes into memory it shouldn't.

Any ideas?  Alternately, any comments on this analysis?

-- c

=== include/parrot/pobj.h
==
--- include/parrot/pobj.h	(revision 4520)
+++ include/parrot/pobj.h	(local)
@@ -139,6 +139,7 @@
 #define PMC_DATA_IN_EXT 1
 
 struct PMC {
+size_t sentinel;
 pobj_t obj;
 VTABLE *vtable;
 PMC *real_self;
@@ -183,7 +184,7 @@
 #ifdef NDEBUG
 #  define PMC_ext_checked(pmc) (pmc)-pmc_ext
 #else
-#  define PMC_ext_checked(pmc) (assert((pmc)-pmc_ext), (pmc)-pmc_ext)
+#  define PMC_ext_checked(pmc) (assert((pmc)-pmc_ext  (pmc)-sentinel == 0xbeefbeef), (pmc)-pmc_ext)
 #endif /* NDEBUG */
 #if PMC_DATA_IN_EXT
 #  define PMC_data(pmc)   PMC_ext_checked(pmc)-data
=== src/headers.c
==
--- src/headers.c	(revision 4520)
+++ src/headers.c	(local)
@@ -249,6 +249,7 @@
 ? interp-arena_base-constant_pmc_pool
 : interp-arena_base-pmc_pool;
 PMC * const pmc = (PMC *)pool-get_free_object(interp, pool);
+pmc-sentinel = 0xbeefbeef;
 
 /* clear flags, set is_PMC_FLAG */
 if (flags  PObj_is_PMC_EXT_FLAG) {
=== src/pmc/hash.pmc
==
--- src/pmc/hash.pmc	(revision 4520)
+++ src/pmc/hash.pmc	(local)
@@ -526,6 +531,9 @@
 */
 
 PMC* get_pmc_keyed_str(STRING *key) {
+if (!PMC_struct_val(SELF))
+return PMCNULL;
+
 HashBucket * const b =
 parrot_hash_get_bucket(INTERP, (Hash*) PMC_struct_val(SELF), key);
 
=== src/pmc.c
==
--- src/pmc.c	(revision 4520)
+++ src/pmc.c	(local)
@@ -199,6 +199,7 @@
 pmc-real_self = pmc;
 VTABLE_set_pointer(interp, pmc, pmc);
 }
+pmc-sentinel = 0xdeadbeef;
 return pmc;
 }
 if (vtable-flags  VTABLE_IS_CONST_PMC_FLAG) {
@@ -245,6 +246,7 @@
 fprintf(stderr, \t= new %p type %d\n, pmc, (int)base_type);
 }
 #endif
+pmc-sentinel = 0xbeefbeef;
 return pmc;
 }
 


Re: Odd Memory Corruption

2007-07-05 Thread Patrick R. Michaud
On Thu, Jul 05, 2007 at 06:30:44PM -0700, chromatic wrote:
 In theory, this patch should apply and run cleanly.  It doesn't.
 
 Thus, something somewhere pokes into memory it shouldn't.
 
 Any ideas?  Alternately, any comments on this analysis?

I also get segfaults after applying this patch.

However, if I change the patch such that the size_t sentinel;
line goes at the end of the struct PMC instead of the beginning, 
then everything appears to compile and run.  

In fact, if size_t sentinel; is placed as the second or third
line in the struct (i.e., after pobj_t obj; or VTABLE *vtable;)
then things still appear to compile and run.  It's only when the
sentinel is given as the first line that I get errors.

So, I'm guessing there's some code somewhere that assumes that
obj is always the first component of PMC.

Pm


Re: [perl #31493] Overlapping memory corruption

2004-09-10 Thread Leopold Toetsch
Steve Fink wrote:
Hey, your reason is much better than my reason. Still, why do the
_noinit stuff and duplicate the creation code? Why not just call pmc_new
as in my replacement code?
Cpmc_new would create a Hash already. But the clone has to create one 
of the source type, which might not be quite the same.

leo


[perl #31493] Overlapping memory corruption

2004-09-09 Thread via RT
# New Ticket Created by  Steve Fink 
# Please include the string:  [perl #31493]
# in the subject line of all future correspondence about this issue. 
# URL: http://rt.perl.org:80/rt3/Ticket/Display.html?id=31493 


---
osname= linux
osvers= 2.4.21-1.1931.2.382.entsmp
arch=   i386-linux-thread-multi
cc= gcc 3.2.2 20030222 (Red Hat Linux 3.2.2-5)
---
Flags:
category=core
severity=high
ack=no
---
I have a dynamic PMC named 'Match' that extends the PerlHash PMC. I
have occasionally been seeing destroy() not implemented in class
'Match' on program shutdown. Match does not define destroy(), nor
does it set the active destroy flag. Looking deeper, I find that it's
trying to dispose of a PMC with its -obj.flags member completely
trashed, and in particular it has an erroneous bit set for the active
destroy flag.

I won't go through all the details of what I looked at (though I'll
post them in my blog eventually), but what's happening is that this
line (from perlhash.pmc's clone() implementation) is corrupting the
flags field:

((Hash*)PMC_struct_val(dest))-container = dest;

The problem is that the dest PMC contains a Hash structure in its
struct_val field, but the address of the Hash is only 0x18 bytes
before the address of the PMC itself, and Hashes are 0x40 bytes
each. The offset of the container field within a Hash is 0x20, and the
offset of the flags field within a PMC is 0x04. So assigning to that
container field overwrites the flags field, since 0x18 + 0x04 == 0x20.

It seems like an initialization ordering problem, especially since I
can make the problem go away by changing the clone() method from

PMC* clone () {
PMC* dest = pmc_new_noinit(INTERP, SELF-vtable-base_type);
PObj_custom_mark_SET(dest);
((Hash*)PMC_struct_val(dest))-container = dest;
hash_clone(INTERP, (Hash *)PMC_struct_val(SELF),
   (Hash**)PMC_struct_val(dest));
return dest;
}

to

PMC* clone () {
PMC* dest = pmc_new(INTERP, SELF-vtable-base_type);
hash_clone(INTERP, (Hash *)PMC_struct_val(SELF),
   (Hash**)PMC_struct_val(dest));
return dest;
}

But I'm not sure why, and I'm not sure if this is a bug in perlhash's
clone(), or some weird side effect of using dynamic PMCs. (For now,
I'm just reimplementing clone() in my subclass.)

---
Summary of my parrot 0.1.0 configuration:
  configdate='Wed Sep  8 23:20:30 2004'
  Platform:
osname=linux, archname=i386-linux-thread-multi
jitcapable=1, jitarchname=i386-linux,
jitosname=LINUX, jitcpuarch=i386
execcapable=1
perl=/usr/bin/perl
  Compiler:
cc='ccache gcc', ccflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS 
-DDEBUGGING  -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 
-I/usr/include/gdbm -O1',
  Linker and Libraries:
ld='gcc', ldflags=' -L/usr/local/lib',
cc_ldflags='',
libs='-lnsl -ldl -lm -lpthread -lcrypt -lutil -lrt -lgmp'
  Dynamic Linking:
so='.so', ld_shared='-shared -L/usr/local/lib -fPIC',
ld_shared_flags=''
  Types:
iv=long, intvalsize=4, intsize=4, opcode_t=long, opcode_t_size=4,
ptrsize=4, ptr_alignment=1 byteorder=1234, 
nv=double, numvalsize=8, doublesize=8

---
Environment:
HOMELANGLANGUAGELC_ALLLD_LIBRARY_PATHLOGDIRPATHSHELL


Re: [perl #31493] Overlapping memory corruption

2004-09-09 Thread Leopold Toetsch
Steve Fink (via RT) wrote:
I won't go through all the details of what I looked at (though I'll
post them in my blog eventually), but what's happening is that this
line (from perlhash.pmc's clone() implementation) is corrupting the
flags field:
((Hash*)PMC_struct_val(dest))-container = dest;
Ah, yep. PMC_struct_val(dest) doesn't hold the hash yet, it is created 
in hash_clone() only after this line.

The problem is that the dest PMC contains a Hash structure in its
struct_val field
No. That's the pointer of the free_list, pointing to the previous PMC in 
that size class.
Putting above line after the hash_clone() fixes that bug.

Thanks for reporting,
leo


Re: [perl #31493] Overlapping memory corruption

2004-09-09 Thread Steve Fink
On Sep-09, Leopold Toetsch wrote:
 Steve Fink (via RT) wrote:
 
 I won't go through all the details of what I looked at (though I'll
 post them in my blog eventually), but what's happening is that this
 line (from perlhash.pmc's clone() implementation) is corrupting the
 flags field:
 
 ((Hash*)PMC_struct_val(dest))-container = dest;
 
 Ah, yep. PMC_struct_val(dest) doesn't hold the hash yet, it is created 
 in hash_clone() only after this line.
 
 The problem is that the dest PMC contains a Hash structure in its
 struct_val field
 
 No. That's the pointer of the free_list, pointing to the previous PMC in 
 that size class.
 Putting above line after the hash_clone() fixes that bug.

Hey, your reason is much better than my reason. Still, why do the
_noinit stuff and duplicate the creation code? Why not just call pmc_new
as in my replacement code?


Re: Memory corruption

2004-01-21 Thread Leopold Toetsch
Leopold Toetsch [EMAIL PROTECTED] wrote:
 Steve Fink [EMAIL PROTECTED] wrote:

 It crashes on a memcpy inside compact_pool

 Some remarks what I could find out:

 - current COW copying of stacks is AFAIK borken - both copies of
   one COWed stack share the same Buffer header

I have now more signs for that:

- First: I have a smaller program, that shows a simimlar corruption bug (a
  Random PMC turns suddenly into a RetContinuation PMC - while its
  called Random it shouldn't expose randomness on itself :)
- Turning GC off stops that bug - as well as in your program
- Using the LEA/system allocator (Configure.pl --gc=libc) makes the bug
  vanish

   the GC/compact_pool in resources.c does correctly deal with such
   Buffers. COWed strings have distinct (String)Buffer headers.

I'm quite sure now, that the bug is caused by the new COWed register
chunks, which share *one* Buffer header.

leo


Re: Memory corruption

2004-01-20 Thread Leopold Toetsch
Steve Fink [EMAIL PROTECTED] wrote:

 Here's another obnoxious test case.
[ ... ]
 So, in case anyone is curious (hi leo!), attached is a 56KB (9KB
 gzipped) imc file.

 It crashes on a memcpy inside compact_pool

Some remarks what I could find out:

- the damaged Buffer is in a pool of object_size 60
- Array/IntList (i.e. List) buffer_likes have that size
- valgrind reports a lot of unitialised values during DOD [1]
- some of these are in mark_context
- the context contains all kinds of aggregates
- current COW copying of stacks is AFAIK borken - both copies of
  one COWed stack share the same Buffer header - I don't think the
  the GC/compact_pool in resources.c does correctly deal with such
  Buffers. COWed strings have distinct (String)Buffer headers.

It could be that e.g. lex Pads (which contain Arrays) aren't marked
properly (valgrind has indications for that):

==16728== Conditional jump or move depends on uninitialised value(s)
==16728==at 0x80C4BE0: pobject_lives (src/dod.c:91)
==16728==by 0x81C1B72: mark_stack (src/method_util.c:162)
==16728==by 0x81BD2B8: mark_context (src/sub.c:59)

And finally I have to admit, that the list code is too complex. I don't
think, that the error is directly caused by the list code (DOD/GC is
turned off during each allocation). But it seems to be overkill for the
average usage (mainly small Arrays).

It can deal with huge sparse arrays, tries to be space efficient, with
small arrays and is pretty fast. But all current array usage (except a
few tests) just put a few items into arrays and do not much mess around.

leo

[1] some of these are during stack trace, where valgrind doesn't see
what Parrot is doing - we can ignore these


Memory corruption

2004-01-19 Thread Steve Fink
Here's another obnoxious test case. I started to try to strip it down,
but it starts working again if I even delete nonsense lines from a
subroutine that is never called. And I'm working on something else and
not at all in the mood to re-learn how to debug parrot internals. It
turns out that I don't get the crash when running JITted, so I think
I'll just do that for now.

So, in case anyone is curious (hi leo!), attached is a 56KB (9KB
gzipped) imc file.

It crashes on a memcpy inside compact_pool (triggered by new_hash).
b-buflen is obviously corrupted. Using -G to disable garbage
collection (does that work?) doesn't seem to help matters at all.

Deleting the __setup sub at the end of the file makes the problem go
away. (Note that __setup is never actually called, and the body of the
routine is irrelevant other than its length.)


dead.imc.gz
Description: GNU Zip compressed data


Do my debugging for me? (memory corruption)

2003-11-21 Thread Steve Fink
I'm staring at a crash, my eyes are glazing over, and I need sleep. So
I was wondering if anyone would be interested in taking a look at a
.imc file that is giving me a seg fault while marking a hash in a gc
run triggered by a hash PMC allocation. Or at least tell me whether
it's seg faulting on your machine too.

I'll attach the 5KB compressed .imc file (25KB uncompressed; PIR code
is redundant!) It's generated from the following Perl6 code, but you'd
need my local changes in order to regenerate it:

  rule thing() {
  [a-z]+
  }

  rule parens() {
  { print entering parens\n; }
  \( [ thing | parens | \s ]* \)
  { print leaving parens\n; }
  }

  sub main() {
  my $t = ()()(((blah blah () blah))(blah));
  my $t2 = $t _ );
  print ok 8\n if $t2 !~ /^ parens $/;
  }

(the entering/leaving parens printouts have no effect on the bug;
they're just remnants of my flailing.)

If you run with --gc-debug, it dies a little earlier, but in what
appears to be the same op.

Hopefully,
Steve


hash_bug.imc.gz
Description: GNU Zip compressed data


Re: Do my debugging for me? (memory corruption)

2003-11-21 Thread Leopold Toetsch
Steve Fink [EMAIL PROTECTED] wrote:
 I'm staring at a crash

 I'll attach the 5KB compressed .imc file (25KB uncompressed; PIR code

Its really good, to have such short code snippets, that clearly show,
where a bug is coming from ;) Anyway, it was again me causing this bug -
sorry.

Fixed and updated the comment which I didn't understand when removing
it.

 Hopefully,
 Steve

Thanks for the failure report,
leo


Memory Corruption Bug

2002-03-29 Thread Michel J Lambert

I've been using clintp's pasm test script for much of my testing, which is
supposed to be an infix expression evaluator. I've been stress-testing
parrot and finding bugs by putting increasingly-large expressions into the
pasm file for it to evaluate.

Attached is a .pasm file which causes some string data to be written into
the middle of the string_pool-pool_buffer list of entries, such that when
it tries to dereference foo in new_pmc_header, it's pointing to garbage
memory. 0x20202020 for me, which is four spaces. Changing the save/restore
of spaces in the pasm file to use periods causes the pointer to be
0x2e2e2e2e.  I tried for a bit on this, but couldn't really track it down
any more than that. Hopefully someone else can figure it out.

To change the expression to make it more stressful (on both your debugging
sessions, and parrot), add to the 'save ' at the very bottom of the
script. Currently, it's along the lines of:

2 . ( +(45*2+6)-3*5/(2+9/7-(5*6+3-5*7/2-2)*31+52/67*10-32-56) x
$somelargenumberofpastes )

Mike Lambert

PS: Clint, I hope you don't mind me giving out the pasm file. I believe it
is in the best interest of your master plan. :)




gc_string_stress.pasm
Description: gc_string_stress.pasm


Re: Memory Corruption Bug

2002-03-29 Thread Peter Gibbs

Michel J Lambert [EMAIL PROTECTED] wrote:

 Attached is a .pasm file which causes some string data to be written into
 the middle of the string_pool-pool_buffer list of entries, such that when
 it tries to dereference foo in new_pmc_header, it's pointing to garbage
 memory. 0x20202020 for me, which is four spaces. Changing the save/restore
 of spaces in the pasm file to use periods causes the pointer to be
 0x2e2e2e2e.  I tried for a bit on this, but couldn't really track it down
 any more than that. Hopefully someone else can figure it out.

Two problems found so far:
1) mem_realloc passes the incorrect size to Parrot_allocate (this causes the
specific error mentioned above)
2) add_header_to_free calls mem_realloc calls Parrot_allocate calls
go_collect which moves the free header pool

The first is a simple fix; the second needs either suppression of collection
during the procedure, or not using mem_realloc. The patch below puts the
reallocation code into add_header_to_free and add_pmc_to_free, this means
that, in the scenario described, the free pool will be moved by the garbage
collector, and then immediately moved again. Since the free pools are not
compressed by go_collect, perhaps they should be allocated independently and
not copied around all the time??

After fixing the above, the test program still abends, this time with
subend somehow is less than substart - I have not yet followed up on this.

--
Peter Gibbs
EmKel Systems


Index: memory.c
===
RCS file: /home/perlcvs/parrot/memory.c,v
retrieving revision 1.29
diff -u -r1.29 memory.c
--- memory.c  18 Mar 2002 16:42:06 -  1.29
+++ memory.c  29 Mar 2002 17:41:17 -
@@ -145,7 +145,7 @@
 {
 size_t copysize = (fromsize  tosize ? tosize : fromsize);
 void *mem;
-mem = Parrot_allocate(interpreter, copysize);
+mem = Parrot_allocate(interpreter, tosize);
 if (!mem) {
 return NULL;
 }
Index: resources.c
===
RCS file: /home/perlcvs/parrot/resources.c,v
retrieving revision 1.35
diff -u -r1.35 resources.c
--- resources.c 26 Mar 2002 16:33:01 - 1.35
+++ resources.c 29 Mar 2002 17:42:15 -
@@ -18,22 +18,26 @@
 static void add_header_to_free(struct Parrot_Interp *interpreter,
struct free_pool *pool, void *to_add);

-/* Add a string header to the free string header pool */
+/* Add a PMC header to the free pool */
 static void add_pmc_to_free(struct Parrot_Interp *interpreter,
 struct free_pool *pool, void
 *to_add) {
   PMC **temp_ptr;
   /* First, check and see if there's enough space in the free pool. If
-   we're within the size of a STRING pointer, we make it bigger */
+   we're within the size of a pointer, we make it bigger */
   if (pool-entries_in_pool * sizeof(PMC *) =
   pool-pool_buffer.buflen - sizeof(PMC *)) {
 /* If not, make the free pool bigger. We enlarge it by 20% */
-pool-pool_buffer.bufstart = mem_realloc(interpreter,
- pool-pool_buffer.bufstart,
- pool-pool_buffer.buflen,
-
(UINTVAL)(pool-pool_buffer.buflen * 1.2));
-pool-pool_buffer.buflen = (UINTVAL)(pool-pool_buffer.buflen * 1.2);
-
+/* don't use mem_realloc, because garbage collection may occur */
+size_t new_size = pool-pool_buffer.buflen * 1.2;
+void *new_ptr = Parrot_allocate(interpreter, new_size);
+if (!new_ptr) {
+internal_exception(ALLOCATION_ERROR,
+   No room to expand free PMC pool);
+}
+memcpy(new_ptr, pool-pool_buffer.bufstart, pool-pool_buffer.buflen);
+pool-pool_buffer.bufstart = new_ptr;
+pool-pool_buffer.buflen = new_size;
   }

   /* Okay, so there's space. Add the header on */
@@ -269,12 +273,16 @@
   if (pool-entries_in_pool * sizeof(STRING *) =
   pool-pool_buffer.buflen - sizeof(STRING *)) {
 /* If not, make the free pool bigger. We enlarge it by 20% */
-pool-pool_buffer.bufstart = mem_realloc(interpreter,
- pool-pool_buffer.bufstart,
- pool-pool_buffer.buflen,
-
(UINTVAL)(pool-pool_buffer.buflen * 1.2));
-pool-pool_buffer.buflen = (UINTVAL)(pool-pool_buffer.buflen * 1.2);
-
+/* don't use mem_realloc, because garbage collection may occur */
+size_t new_size = pool-pool_buffer.buflen * 1.2;
+void *new_ptr = Parrot_allocate(interpreter, new_size);
+if (!new_ptr) {
+internal_exception(ALLOCATION_ERROR,
+   No room to expand free buffer header pool);
+}
+memcpy(new_ptr, pool-pool_buffer.bufstart, pool-pool_buffer.buflen);
+pool-pool_buffer.bufstart = new_ptr;
+pool-pool_buffer.buflen = new_size;
   }

   /* Okay, so there's space. Add the header on */





[APPLIED (partially)] Re: Memory Corruption Bug

2002-03-29 Thread Melvin Smith

At 07:57 PM 3/29/2002 +0200, you wrote:
Michel J Lambert [EMAIL PROTECTED] wrote:

  Attached is a .pasm file which causes some string data to be written into
  the middle of the string_pool-pool_buffer list of entries, such that when
  it tries to dereference foo in new_pmc_header, it's pointing to garbage
  memory. 0x20202020 for me, which is four spaces. Changing the save/restore
  of spaces in the pasm file to use periods causes the pointer to be
  0x2e2e2e2e.  I tried for a bit on this, but couldn't really track it down
  any more than that. Hopefully someone else can figure it out.

Two problems found so far:
1) mem_realloc passes the incorrect size to Parrot_allocate (this causes the
specific error mentioned above)

Clearly a bug. Applied, thanks.


2) add_header_to_free calls mem_realloc calls Parrot_allocate calls
go_collect which moves the free header pool

The first is a simple fix; the second needs either suppression of collection
during the procedure, or not using mem_realloc. The patch below puts the
reallocation code into add_header_to_free and add_pmc_to_free, this means
that, in the scenario described, the free pool will be moved by the garbage
collector, and then immediately moved again. Since the free pools are not
compressed by go_collect, perhaps they should be allocated independently and
not copied around all the time??

After fixing the above, the test program still abends, this time with
subend somehow is less than substart - I have not yet followed up on this.

I left the second part concering realloc() out, pending further discussion.
Not saying this is wrong, just that is sounds as if you may suspect
a more elegant fix that may warrant comments from the list.

-Melvin