On Mon, 2020-12-21 at 16:17 -0500, Alexander Bulekov wrote:
> On 201220 0256, Qiuhao Li wrote:
> > Since programmers usually trigger an IO just before they need it.
> > Try to
> > delay some IO instructions may help us better understanding the
> > timing
> > context when debug.
> > 
> > Tested with Bug 1908062. Refined vs. Original result:
> > 
> > outl 0xcf8 0x8000081c            outl 0xcf8 0x0
> > outb 0xcfc 0xc3                | outl 0xcf8 0x8000081c
> > outl 0xcf8 0x80000804          | outb 0xcfc 0xc3
> > outl 0xcfc 0x10000006          | outl 0xcf8 0x80000804
> > write 0xc300001028 0x1 0x5a    | outl 0xcfc 0x10000006
> > write 0xc300001024 0x2 0x10    | write 0xc300001028 0x1 0x5a
> > write 0xc30000101c 0x1 0x01    | writel 0xc30000100c 0x2a6f6c63
> > write 0xc300003002 0x1 0x0     v write 0xc300001024 0x2 0x10
> > write 0x5c 0x1 0x10              write 0xc30000101c 0x1 0x01
> > writel 0xc30000100c 0x2a6f6c63   write 0xc300001018 0x1 0x80
> > write 0xc300001018 0x1 0x80      write 0x5c 0x1 0x10
> > outl 0xcf8 0x0                   write 0xc300003002 0x1 0x0
> > 
> 
> In this example, I can remove the outl 0xcf8 0x0, and I still see the
> crash, so maybe the 1st step in the minimizer is failing somewhere.. 

I think it might because of our one-time scan and remove strategy,
which is not suitable for timing dependent instructions.

For example, instruction A will indicate an address where the config
chunk locates, and instruction B will make the configuration active. If
we have the following instruction sequence:

...
A1
B1
A2
B2
...

A2 and B2 are the actual instructions that trigger the bug.

If we scan from top to bottom, after we remove A1, the behavior of B1
might be unknowable, including not to crash the program. But we will
successfully remove B1 later cause A2 and B2 will crash the process
anyway:

...
A1
A2
B2
...

Now one more trimming will remove A1.

As for the example I gave, the instructions before the delaying
minimizer are like this:

outl 0xcf8 0x8000081c
outb 0xcfc 0xc3
outl 0xcf8 0x0                <--- The A instruction, didn't be removed
(outl 0xcfc 0x0)              <--- The B instruction, removed
outl 0xcf8 0x80000804
outl 0xcfc 0x10000006
write 0xc300001024 0x2 0x10
write 0xc300001028 0x1 0x5a
write 0xc30000101c 0x1 0x01
writel 0xc30000100c 0x2a6f6c63
write 0xc300001018 0x1 0x80
write 0x5c 0x1 0x10
write 0xc300003002 0x1 0x0

If we run the remove minimizer again, The A instruction outl 0xcf8 0x0
will be removed.

Since we only remove instructions, this iterative algorithm is
converging. Maybe we can keep removing the trace until the
len(newtrace) become unchanged.

> 
> Is the Refined one better? To me the original one read as:
> "Do a bunch of PCI configuration to map an MMIO BAR, then interact
> with
> the MMIO range and trigger some DMA activity". I also know exactly
> the
> line that will trigger the DMA activity and access 0x5c. With the
> refined one, I'm not so sure. Which line now causes the DMA read from
> 0x5c? writel 0xc30000100c? write 0xc300001018?
> Is there another example where this type of reordering makes the
> result
> easier to read?
> 
> > Signed-off-by: Qiuhao Li <qiuhao...@outlook.com>
> > ---
> >  scripts/oss-fuzz/minimize_qtest_trace.py | 21
> > +++++++++++++++++++++
> >  1 file changed, 21 insertions(+)
> > 
> > diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py
> > b/scripts/oss-fuzz/minimize_qtest_trace.py
> > index f3e88064c4..da7aa73b3c 100755
> > --- a/scripts/oss-fuzz/minimize_qtest_trace.py
> > +++ b/scripts/oss-fuzz/minimize_qtest_trace.py
> > @@ -214,6 +214,27 @@ def minimize_trace(inpath, outpath):
> >  
> >      assert(check_if_trace_crashes(newtrace, outpath))
> >  
> > +    # delay IO instructions until they can't trigger the crash
> > +    # Note: O(n^2) and many timeouts, kinda slow
> 
> Maybe do a binary search instead of a linear scan for the optimal
> position
> to save some time?
> Also, if you re-run this multiple times, you can end up with
> different
> results, since some lines might not really be tied to a position
> (e.g.
> the outl cf8 0x0 in your example). Maybe it's not a problem, but i'm
> still not sure that this is making the result easier to read.
> -Alex

I'm not familiar with the PCI configuration and DMA mechanism in QEMU.
This patch is just random thinking based on empiricism. Maybe I should
add the "RFC" tag :). In version 2, I won't post this patch.

BTW, may I ask where I can learn about these IO emulations? How do you
know the address corresponding to the PCI bar and DMA?

> 
> > +    i = len(newtrace) - 1
> > +    while i >= 0:
> > +        tmp_i = newtrace[i]
> > +        if len(tmp_i) < 2:
> > +            i -= 1
> > +            continue
> > +        print("Delaying ", newtrace[i])
> > +        for j in reversed(range(i+1, len(newtrace)+1)):
> > +            newtrace.insert(j, tmp_i)
> > +            del newtrace[i]
> > +            if check_if_trace_crashes(newtrace, outpath):
> > +                break
> > +            newtrace.insert(i, tmp_i)
> > +            del newtrace[j]
> > +        i -= 1
> > +
> > +    assert(check_if_trace_crashes(newtrace, outpath))
> > +    # maybe another removing round
> > +
> >  
> >  if __name__ == '__main__':
> >      if len(sys.argv) < 3:
> > -- 
> > 2.25.1
> > 


Reply via email to