On Friday, 31 July 2020 11:28:36 CEST Jean Wolter wrote:
> On 31/07/2020 01:23, Paul Boddie wrote:
> > And if I look in the objdump output, at least on some occasions, I can
> > find an instruction which would be causing the exception. The code looks
> > like this:
> >
> >   100490f:       49 8b 04 24             mov    (%r12),%rax
> >   1004913:       4c 89 ee                mov    %r13,%rsi
> >   1004916:       31 d2                   xor    %edx,%edx
> >   1004918:       4c 89 e7                mov    %r12,%rdi
> >   100491b:       ff 50 70                callq  *0x70(%rax)
> > 
> > It is at this final instruction that the exception occurs, and the offset
> > is as reported, too.
> > 
> > The awkward thing here, though, is that the offending instruction is a
> > virtual method call within the same instance:
> > 
> > this->flush_flexpage(flexpage);
> 
> This looks like a pointer to an object is null.
> 
> Is there any chance that the method containing the
> "this->flush_flexpage(flexpage);" is a method, that is not virtual? And
> that you invoked the method on a pointer to an object and that the
> pointer is a nullptr? A simple test whether "this" is null could shed
> some light on this ...

I used the good old way of debugging, adding output statements, which can be 
disruptive in this environment, but it eventually yielded the following 
information about the method call:

0x423770 -> remove_flexpage( 0x16f90 );
L4Re[rm]: unhandled read page fault at 0x70 pc=0x100491b

The remove_flexpage method is not virtual, but the flush_flexpage method that 
causes the exception is virtual.
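
One way to read the output above: the fault address equals the displacement 
in "callq *0x70(%rax)", so %rax must have been zero at that point. Since %rax 
was loaded by "mov (%r12),%rax" without faulting, this suggests that the 
object pointer itself was usable but the first word of the object was zero. 
With GCC or Clang on x86-64 (Itanium C++ ABI, an assumption here), that first 
word is where the virtual method table pointer normally lives. The following 
is only a sketch of a check that could be placed before the call. Accessor is 
a hypothetical stand-in for the real class, and first_word and has_vtable are 
made-up helper names, not part of L4Re.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the real class; only its shape matters here.
struct Accessor
{
    virtual ~Accessor() {}
    virtual void flush_flexpage(void *flexpage) { (void) flexpage; }
};

// Read the first pointer-sized word of an object. Assumption: the Itanium
// C++ ABI, where a simple polymorphic object keeps its virtual method table
// pointer at offset 0. This is implementation-specific, not standard C++.
inline std::uintptr_t first_word(const void *obj)
{
    std::uintptr_t word;
    std::memcpy(&word, obj, sizeof(word));
    return word;
}

// True if obj is non-null and its first word (the presumed table pointer)
// is non-zero, so a virtual call would at least find a table to index.
inline bool has_vtable(const void *obj)
{
    return obj != nullptr && first_word(obj) != 0;
}
```

Logging has_vtable(accessor) just before accessor->remove_flexpage(flexpage) 
would help distinguish a null object pointer from an object whose table 
pointer has been cleared or not yet set up.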

Making the remove_flexpage method virtual actually seems to make the error 
occur on the invocation of that method instead. That invocation is made from 
another instance, and the debugging output above corresponds to this 
statement:

accessor->remove_flexpage(flexpage);

So, if remove_flexpage is virtual, I get an error like this:

L4Re[rm]: unhandled read page fault at 0x70 pc=0x1006613

And the address corresponds to this disassembly:

 1006609:       49 8b 04 24             mov    (%r12),%rax
 100660d:       48 89 de                mov    %rbx,%rsi
 1006610:       4c 89 e7                mov    %r12,%rdi
 1006613:       ff 50 70                callq  *0x70(%rax)

(With remove_flexpage now occupying offset 0x70 in the virtual method table, 
the flush_flexpage method is relocated to 0x78, and its invocation still 
resides at the previously reported location.)
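
The relationship between these displacements and table entries can be sketched 
as simple arithmetic, assuming 8-byte table entries as on x86-64. The names 
slot_size, slot_index and call_target_address are purely illustrative, and the 
index is counted from wherever the table pointer in the object points.

```cpp
#include <cstdint>

// Size of one virtual method table entry (assumed: 8-byte pointers, x86-64).
constexpr std::uint64_t slot_size = 8;

// Index of the table entry addressed by a displacement such as 0x70.
constexpr std::uint64_t slot_index(std::uint64_t displacement)
{
    return displacement / slot_size;
}

// Address read by "callq *disp(%rax)" given the table pointer held in %rax.
constexpr std::uint64_t call_target_address(std::uint64_t table,
                                            std::uint64_t displacement)
{
    return table + displacement;
}
```

Here slot_index(0x70) is 14 and slot_index(0x78) is 15, so adding one virtual 
method ahead of flush_flexpage moves it along by exactly one entry. With a 
zero table pointer, call_target_address(0, 0x70) is 0x70, matching the 
reported page fault address.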

But I still have the feeling that I need to audit and rework my code, paying 
better attention to concurrency issues.

Paul



_______________________________________________
l4-hackers mailing list
l4-hackers@os.inf.tu-dresden.de
http://os.inf.tu-dresden.de/mailman/listinfo/l4-hackers
