On Wed, Sep 20, 2023 at 08:08:23AM +0200, Otto Moerbeek wrote:

> On Sun, Sep 03, 2023 at 09:21:18AM +0200, Otto Moerbeek wrote:
> 
> > Hello,
> > 
> > I'm seeing some reports of "write after free" reported by malloc by
> > peolpe running current.  Malloc has become more strict since begining
> > of June. Let me explain:
> > 
> > Small allocations share a page. e.g. a 4k page will hold 8 512 byte
> > allocations.
> > 
> > When one such allocation is freed, it will be filled with a "free
> > junk" pattern, stay for a short while in the delayed free list (and be
> > subhject to write after free checks) and after that it will be marked
> > free and it might be allocated again. On a new allocation the write
> > after free check will also be done. That is the new part of my commit
> > in June.  Another change is that on the allocation of the page
> > containing chunks, all chunks wil be filled with a "free junk" (0xdf)
> > patttern. 
> > 
> > The change has a consequence: the time span of the "write after free"
> > checks is much longer, as it can take a long time for a chunk to get
> > used again.
> > 
> > I do believe the changes itself are correct and can help findiung
> > bugs in other code. 
> > 
> > You can also be set upon a wrong foot: if an out of bounds write on a
> > adjacent chunk happens and lands in (another) free chunk, upon
> > allocation of that free chunk it will be reported as a "write after
> > free" case. It might even be that an allocation that was never
> > allocated and never freed is still "write after free" because of
> > "close" out of bounds writes. 
> > 
> > I'm thinking how to change the error message to reflect that.
> > 
> > If these extra checks hinder progress, it is possible to swicth them
> > off using malloc option f (small F).
> 
> That should be j (small J).
> 
> > For general debugging I recommend using malloc option S, is has all
> > the extra strictness that can be enabled while still conforming to the
> > malloc API.
> > 
> > Please be aware of these things while hunting for bugs.
> 
> I'm happy to say two of the more complex ones are (being) fixed: one
> turned out to be a reference counting bug in firefox. See
> http://undeadly.org/cgi?action=article;sid=20230912094727
> 
> The other, a write after free that crashed the X server when running
> picard was diagnosed by me.  This one was a bit nasty, as it required
> instrumenting malloc to print some extra info to find the root cause. 
> 
> The bug is that the call in
> https://github.com/openbsd/xenocara/blob/master/xserver/Xext/xvdisp.c#L1002
> overwrites the first 4 bytes of the chunk next to the one allocated on
> line 995.
> 
> A workaround is to allocate 4 bytes extra, matthieu@ will be looking
> for a proper fix, as it requires knowledge of the X internals.
> 
> I am pondering if the instrumentation hack I used could be modified to
> be always available and print the function that did the original
> allocation when detecting a write after free.
> 
> The conclusion is that in these two cases, malloc helped in
> detecting/finding the bugs (and it was not a bug in malloc itself :-).
> Coincidentally this was the topioc of my talk during EuroBSD2023 last
> weekend, see http://www.openbsd.org/events.html
> 
> I hope to look at the reports I got for the Wayland port (which is
> WIP) as well, see
> https://github.com/openbsd/ports/blob/master/wayland/TODO-Wayland.md 

The wayland issue was found as well, using the same method.
I'm working on programming the heuristic that is quite effective into
malloc itself. It currently looks like this for the X case above:

X(67417) in malloc(): write to free mem 0xdd0277d4270 [0..7]@16
allocated at /usr/lib/libc.so.97.1 0x92f22
(preceding chunk 0xdd0277d4260 allocated at /usr/X11R6/bin/X 0x177374 (now 
free))

$ addr2line -e /usr/lib/libc.so.97.1 0x92f22
/usr/src/lib/libc/string/strdup.c:45
$ addr2line -e /usr/X11R6/bin/X 0x177374
/usr/xenocara/xserver/Xext/xvdisp.c:995

The idea is: if a buffer overflow is detected, it is very wel possible
that the overwrite happened as an out-of-bound write of the preceding
chunk. It is also possible that it is a genuine write-after-free, in
that case the developer should inspect the code that allocated and
manipulated the first chunk. Malloc has no way to the the difference,
that requires debugging by a human.

This is all experimental and the final form may change but it
certainly look promising, especially as regular malloc code did not
change at all (the extra info needed is only collected if malloc flag
D is set).

        -Otto

Reply via email to