[chromium-dev] Re: purecall exceptions and the manbearpig

2009-04-04 Thread Nicolas Sylvain
On Fri, Apr 3, 2009 at 7:19 PM, Tommi to...@chromium.org wrote:

 Yes, that's one way of running into purecall.  but, just in case my email
 is being misunderstood, now with italics! :)

   purecall is not called *when* an exception occurs.  purecall actually
 *throws the exception - or exits the program*

 purecall is called when attempting to call a virtual method for which there
 is no implementation.  purecall is the default virtual method if you will.


Yes, that's the low level description of purecall, and no one is debating
that.

But it is also misleading. From a high-level perspective, when you look at
my code, you see that the developer did implement the virtual method
explicitly. So, still from a high-level perspective, purecall can also
happen for a virtual method that does have an implementation, if the object
has been deleted prior to the call.  [All that because when the derived
object is deleted, one of the things it does is revert its vtable pointer to
the base class vtable. That part is not obvious/known to the high-level
developer.]

Implementing all your virtual functions correctly does not guarantee that
your objects won't purecall. But I'm sure you know that, I just wanted to
make sure I'm not misunderstood either ;)

Nicolas




 When you call _set_purecall_handler, you're giving _purecall a pointer to
 your function that purecall will delegate to.  There's not an exception that
 triggers this.  Calling purecall is just a regular function call.

 Here's CRT's implementation of __purecall:

 void __cdecl _purecall() {
   _purecall_handler purecall =
       (_purecall_handler) _decode_pointer(__pPurecall);
   if (purecall != NULL) {
     purecall();
     /* shouldn't return, but if it does, we drop back to
        default behaviour */
   }

   _NMSG_WRITE(_RT_PUREVIRT);
   /* do not write the abort message */
   _set_abort_behavior(0, _WRITE_ABORT_MSG);
   abort();
 }

 and here's the implementation of _set_purecall_handler:

 _purecall_handler _set_purecall_handler(_purecall_handler pNew) {
   _purecall_handler pOld = NULL;
   pOld = (_purecall_handler) _decode_pointer(__pPurecall);
   __pPurecall = (_purecall_handler) _encode_pointer(pNew);
   return pOld;
 }


 On Fri, Apr 3, 2009 at 8:42 PM, Nicolas Sylvain nsylv...@chromium.org wrote:

 The code below shows that it's possible to throw a purecall exception by
 calling a function on a deleted object.

 I suspect this is what is happening in our code.

 Nicolas


 #include <windows.h>
 #include <stdlib.h>  // _set_purecall_handler
 #include <tchar.h>   // _tmain, _TCHAR

 class Derived;
 class Base {
  public:
   Base(Derived* derived) : m_pDerived(derived) {}
   ~Base() {}  // Needed, don't know why.
   virtual void function(void) = 0;
   void bleh();
   Derived* m_pDerived;
 };

 class Derived : public Base {
  public:
   Derived() : Base(this) {}  // C4355
   virtual void function(void) {}
 };

 void Base::bleh() {
   m_pDerived->function();
 }

 void purecall(void) {
   __debugbreak();
 }

 int _tmain(int argc, _TCHAR* argv[]) {
   _set_purecall_handler(purecall);
   Base* base = NULL;
   {
     Derived myDerived;
     myDerived.function();
     base = &myDerived;
   }
   base->bleh();  // myDerived has already been destroyed here
 }

 On Fri, Apr 3, 2009 at 2:17 PM, Tommi to...@chromium.org wrote:

 purecall isn't called when an exception occurs.  purecall actually throws
 the exception - or exits the program (by default the crt throws up a dialog
 and then abort()s).  in addition to cpu's email, raymond chen's article is a
 good (and short) read :)
 http://blogs.msdn.com/oldnewthing/archive/2004/04/28/122037.aspx

 On Fri, Apr 3, 2009 at 3:15 PM, Huan Ren hu...@google.com wrote:

 Based on what I saw in the bug, it looks like an exception happening
 during CALL instruction may lead to PureCall().

 For example, an object obj has been freed and later on someone calls
 obj->func(). Then the assembly code looks like this:

 // ecx: pointer to obj, which is in memory
 // [ecx]: supposed to be the pointer to the vtable; it has an invalid value
 //        since obj is freed
 // edx: now has the pointer to the vtable, which is invalid
 mov edx,dword ptr [ecx]

 // deref the vtable and make the call
 call dword ptr [edx+4]

 When a (hardware) exception happens during the call instruction,
 control will eventually be transferred to the routine handling this
 type of exception, which I *think* is PureCall().

 Huan

 On Fri, Apr 3, 2009 at 11:26 AM, Ricardo Vargas rvar...@chromium.org wrote:
  I certainly don't want to imply that it is the case with this particular
  bug, but I have seen crashes when the cause of the problem is using an
  object that was previously deleted (and only end up with this exception
  when all the planets are properly aligned). I guess that it depends on the
  actual class hierarchy of the objects in question, but I'd think that
  simple examples end up on a lot of crashes right after the cl that exposes
  the problem.
 
  On Fri, Apr 3, 2009 at 12:52 AM, Dean McNamee de...@chromium.org
 wrote:
 
  You 

[chromium-dev] Re: purecall exceptions and the manbearpig

2009-04-04 Thread Tommi
hehe, understood! :)


[chromium-dev] Re: purecall exceptions and the manbearpig

2009-04-03 Thread Ricardo Vargas
I certainly don't want to imply that it is the case with this particular
bug, but I have seen crashes when the cause of the problem is using an
object that was previously deleted (and only end up with this exception when
all the planets are properly aligned). I guess that it depends on the actual
class hierarchy of the objects in question, but I'd think that simple
examples end up on a lot of crashes right after the cl that exposes the
problem.

On Fri, Apr 3, 2009 at 12:52 AM, Dean McNamee de...@chromium.org wrote:


 You could, however, corrupt the vtable pointer (not the vtable).  Say
 somehow 32 was added to it, now the table is misaligned, and you might
 get a purecall, etc.  Not sure that's likely at all though.

 Since  the vtable pointer is the first field, it seems ripe for
 problems w/ use after free, etc.  I kinda doubt that's what's
 happening here though.  Anyone who is working on one of these can bug
 me and I'll look at the crash dump.

 On Fri, Apr 3, 2009 at 7:24 AM, Tommi to...@chromium.org wrote:
  On Thu, Apr 2, 2009 at 7:09 PM, cpu c...@chromium.org wrote:
 
 
 
  On Apr 2, 3:53 pm, Nicolas Sylvain nsylv...@chromium.org wrote:
   Another simple(r) example:
   http://msdn.microsoft.com/en-us/library/t296ys27(VS.80).aspx
  
   But, as discussed in bug 8544, we've seen many purecall crashes happen
   that we don't think are related to virtual functions. The only thing I
   can think of is that the vtable is corrupted (overwritten or freed).
  
   Does it not make sense?
 
  I don't think you can overwrite a vtable, because vtables should be in
  the code section of the executable (the pages marked as read-execute);
  they are known at compile time and it would not make sense to
  construct them on the fly.
 
  But if you know of a case then that would be very interesting.
 
 
  yes they should be protected with read/execute and besides, you'd have to
  overwrite entries in the vtable with a pointer to __purecall for that to
  happen
 
 
 
 
  
   Nicolas
  
  
  

[chromium-dev] Re: purecall exceptions and the manbearpig

2009-04-03 Thread Tommi
purecall isn't called when an exception occurs.  purecall actually throws
the exception - or exits the program (by default the crt throws up a dialog
and then abort()s).  in addition to cpu's email, raymond chen's article is a
good (and short) read :)
http://blogs.msdn.com/oldnewthing/archive/2004/04/28/122037.aspx


[chromium-dev] Re: purecall exceptions and the manbearpig

2009-04-02 Thread Nicolas Sylvain
Another simple(r) example:
http://msdn.microsoft.com/en-us/library/t296ys27(VS.80).aspx

But, as discussed in bug 8544, we've seen many purecall crashes happen that
we don't think are related to virtual functions. The only thing I can think
of is that the vtable is corrupted (overwritten or freed).

Does it not make sense?

Nicolas


On Thu, Apr 2, 2009 at 1:54 PM, cpu c...@chromium.org wrote:


 After reading some speculation in bugs such as
 http://code.google.com/p/chromium/issues/detail?id=8544 I felt
 compelled to dispel some myths and misunderstandings about the origin
 and meaning of the mythical _purecall_ exception. My hope is that then
 you can spot the problems in our source code and fix them. Sorry for
 the long post.

 So first of all, what do you see when you get this error? If you are
 in a debug build and you are not eating the exceptions via some custom
 handler, you see this dialog:

 ---
 Debug Error!
 R6025
 - pure virtual function call
 (Press Retry to debug the application)
 ---
 Abort   Retry   Ignore
 ---

 For chrome/chromium we install a special handler, which forces a crash
 dump, in which case you'll see something like this in the debugger
 analysis:

  [chrome_dll_main.cc:100] - `anonymous namespace'::PureCall()
  [purevirt.c:47] - _purecall

 Before going into too much detail, let me show you a small program
 that causes this exception:

 =
 #include <tchar.h>  // _tmain, _TCHAR

 class Base {
  public:
   virtual ~Base() {
     ThreeFn();
   }

   virtual void OneFn() = 0;
   virtual void TwoFn() = 0;

   void ThreeFn() {
     OneFn();
     TwoFn();
   }
 };

 class Concrete : public Base {
  public:
   Concrete() : state_(0) {
   }

   virtual void OneFn() {
     state_ += 1;
   }
   virtual void TwoFn() {
     state_ += 2;
   }
  private:
   int state_;
 };


 int _tmain(int argc, _TCHAR* argv[]) {

   Concrete* obj = new Concrete();
   obj->OneFn();
   obj->TwoFn();
   obj->ThreeFn();

   delete obj;

   return 0;
 }
 =

 Can you spot the problem? Do you know at which line it crashes, and do you
 know why? If so I have wasted your time, apologies. If you are unsure
 then read on.

 This program crashes when trying to call OneFn() with a purecall
 exception on debug build. On release build it exits with no error, but
 your mileage might vary depending on what optimizations are active.

 The call stack for the crash is:

   msvcr80d.dll!__purecall()  + 0x25    <-- shows the dialog (debug only)
   app.exe!Base::ThreeFn()  Line 16 + 0xfc    <-- error here
   app.exe!Base::~Base()  Line 10  C++
   app.exe!Concrete::~Concrete()  + 0x2b
   app.exe!Concrete::`scalar deleting destructor'()  + 0x2b    <-- delete obj

 So as you have guessed it has to do with calling virtual functions
 from a destructor.

 What happens is that during construction an object evolves from the
 earliest base class to the actual type, and during destruction the
 object devolves (is that a word?) from the actual type to the
 earliest base class. When we reach the ~Base() body the object is no
 longer of type Concrete but of type Base, and thus the call to
 Base::OneFn() is an error because that class does not in fact have any
 implementation.

 What the compiler does is create two vtables. The vtable of Concrete
 (vtable 1) and the vtable of Base (vtable 2) look like this:

 vtable 1:
 [ 0 ] -> Concrete::OneFn()
 [ 1 ] -> Concrete::TwoFn()

 vtable 2:
 [ 0 ] -> msvcr80d.dll!__purecall()
 [ 1 ] -> msvcr80d.dll!__purecall()

 The dtor of Concrete is the default dtor, which does nothing except
 call Base::~Base(), but the dtor of Base does:

 this->vtbl_ptr = vtable2
 call ThreeFn()


 Now, why doesn't the release build crash?

 That's because the compiler does not bother generating the second
 vtable (after all, it is not going to be used) and thus also eliminates
 the related lines such as this->vtbl_ptr = vtable2. Therefore the object
 reaches the base dtor with the vtbl_ptr still pointing to vtable1, which
 makes the call to ThreeFn() just work.

 But that was just luck. If you ever modify the base class, such as
 introducing a new virtual function that is not pure, like this:

 #include <wchar.h>  // wprintf

 class Base {
  public:
   virtual ~Base() {
     ThreeFn();
   }

   virtual void OneFn() = 0;
   virtual void TwoFn() = 0;

   virtual void FourFn() {  // <--- new function, not pure virtual
     wprintf(L"aw snap");
   }

   void ThreeFn() {
     OneFn();
     TwoFn();
   }
 };

 // Same program below.
 // ...
 // 

 Then you are forcing the compiler to generate vtable 2, which looks like
 this:

 vtable 2:
 [ 0 ] -> msvcr80d.dll!__purecall()
 [ 1 ] -> msvcr80d.dll!__purecall()
 [ 2 ] -> Base::FourFn()

 And now the purecall crash magically happens (on the same spot) on
 release builds, which is quite surprising since the trigger was the
 introduction of FourFn() which has _nothing_ to do with the crash