I don't know if I can build on Linux, but I have seen ways to play with the
clock and set affinity.  However, I don't know how to take the GIL
properly during destruction.  What we're talking about is a shared_ptr to
an interface that was created on the Python side.  I can slap GIL
acquisition in the destructor, but I know that won't wrap everything that
ends up calling into the Python runtime to free the object.  Do you
know how that would be done?
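
One possible shape for this, as a rough sketch only: rather than putting the
GIL acquisition in the class destructor, wrap the shared_ptr in a second
shared_ptr whose deleter takes the lock and then drops the inner reference,
so the inner deleter (the Boost.Python one, in my case) runs with the lock
held.  To keep the sketch buildable without Python headers, FakeGil and
GilGuard are stand-ins with made-up names; in real code the guard would call
PyGILState_Ensure() and PyGILState_Release().  release_under_gil and Probe
are also hypothetical names, and std::shared_ptr stands in for
boost::shared_ptr.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>

// Stand-in for the GIL so this builds without Python headers (assumption).
// A real guard would call PyGILState_Ensure() / PyGILState_Release().
struct FakeGil {
    std::recursive_mutex mutex;
    int depth = 0;  // number of guards currently holding the lock
};

inline FakeGil& fake_gil() {
    static FakeGil gil;
    return gil;
}

// RAII guard: holds the stand-in lock for the lifetime of the object.
class GilGuard {
public:
    GilGuard() : lock_(fake_gil().mutex) { ++fake_gil().depth; }
    ~GilGuard() { --fake_gil().depth; }  // runs before lock_ unlocks
    GilGuard(const GilGuard&) = delete;
    GilGuard& operator=(const GilGuard&) = delete;
private:
    std::unique_lock<std::recursive_mutex> lock_;
};

// Hypothetical payload: records whether it was destroyed under the lock.
struct Probe {
    bool* destroyed_under_lock;
    explicit Probe(bool* flag) : destroyed_under_lock(flag) {}
    ~Probe() { *destroyed_under_lock = fake_gil().depth > 0; }
};

// Returns a wrapper that shares the same object.  When the last copy of the
// wrapper dies, its deleter takes the guard first and only then drops the
// captured inner reference, so the inner deleter runs with the lock held.
template <typename T>
std::shared_ptr<T> release_under_gil(std::shared_ptr<T> inner) {
    T* raw = inner.get();
    return std::shared_ptr<T>(raw, [inner](T*) mutable {
        GilGuard guard;
        inner.reset();
    });
}
```

This only covers the reference held inside the wrapper.  Any other copy of
the inner shared_ptr that is kept around still releases wherever it happens
to die, so the wrapping would have to happen at the point where the pointer
first crosses over from Python.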

In the meantime I'm looking at the Stackless examples that relate to how
CCP Games said they built their engine.  If I can make it work, it would
mean all Python stuff gets scheduled on one thread.  I am not entirely sure
about object destruction though.
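
On the destruction question, one way to keep that on the Python thread too,
sketched under assumptions: C++ code running elsewhere hands its reference
to a queue instead of letting it go out of scope, and the thread that runs
Python drains the queue once per tick.  DeferredReleaseQueue, retire, drain
and Message are hypothetical names; none of this is taken from the Stackless
examples or from CCP's engine, and std::shared_ptr stands in for
boost::shared_ptr.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical message type; the flag lets a caller observe destruction.
struct Message {
    bool* destroyed;
    explicit Message(bool* flag) : destroyed(flag) {}
    ~Message() { *destroyed = true; }
};

class DeferredReleaseQueue {
public:
    // Callable from any thread: park a reference instead of dropping it.
    // shared_ptr<void> keeps the original deleter, so any pointee type works.
    void retire(std::shared_ptr<void> ref) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.push_back(std::move(ref));
    }

    // Call only from the thread that owns the interpreter.  Returns how many
    // parked references were dropped on this call.
    std::size_t drain() {
        std::vector<std::shared_ptr<void>> batch;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            batch.swap(pending_);
        }
        const std::size_t count = batch.size();
        batch.clear();  // references are dropped here, outside the queue lock
        return count;
    }

private:
    std::mutex mutex_;
    std::vector<std::shared_ptr<void>> pending_;
};
```

The idea is that a function on another thread calls
queue.retire(std::move(message)) on its way out, and the Python thread calls
queue.drain() in its main loop.  A parked reference that turns out to be the
last one is then destroyed on the Python thread.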

On Sun, May 27, 2012 at 9:05 AM, Niall Douglas <s_sourcefo...@nedprod.com> wrote:

> Try pinning everything to a single CPU, see what happens.
>
> Try pinning the CPU clock speed to its minimum. If it doesn't trip,
> you have a timing race.
>
> If you can build on Linux, try valgrind.
>
> Oh, you mentioned bits you weren't locking when destroying. I'd lock
> those and see what happens.
>
> Niall
>
> On 26 May 2012 at 13:19, Adam Preble wrote:
>
> > This might be one for the main Python lists but since I have a whole lot
> > of stuff wrapped in Boost floating about this problem, I wanted to try
> > the C++-sig for starters.  I run my little game experiment for anywhere
> > between 15 and 60 seconds, where it's sending a lot of events and
> > messages around between my C++ runtime and the Python runtime.  The code
> > that's failing often completes hundreds of times without fault.  I don't
> > know when this started to happen, but it's something of a recent
> > phenomenon.  Given the asynchronous stuff in my code, this could have
> > been latent in it the whole time.
> >
> > I'm specifically using Stackless Python 2.6.5 with Boost 1.49, with debug
> > symbols, so I'll paste _Py_ForgetReference so it's in front of you:
> >
> > void
> > _Py_ForgetReference(register PyObject *op)
> > {
> > #ifdef SLOW_UNREF_CHECK
> >     register PyObject *p;
> > #endif
> >     if (op->ob_refcnt < 0)
> >         Py_FatalError("UNREF negative refcnt");
> >     if (op == &refchain ||
> >         op->_ob_prev->_ob_next != op || op->_ob_next->_ob_prev != op)
> >         Py_FatalError("UNREF invalid object");
> > #ifdef SLOW_UNREF_CHECK
> >     for (p = refchain._ob_next; p != &refchain; p = p->_ob_next) {
> >         if (p == op)
> >             break;
> >     }
> >     if (p == &refchain) /* Not found */
> >         Py_FatalError("UNREF unknown object");
> > #endif
> >     op->_ob_next->_ob_prev = op->_ob_prev;
> >     op->_ob_prev->_ob_next = op->_ob_next;
> >     op->_ob_next = op->_ob_prev = NULL;
> >     _Py_INC_TPFREES(op);
> > }
> >
> > Here's where I get bit:
> >
> >     if (op == &refchain ||
> >         op->_ob_prev->_ob_next != op || op->_ob_next->_ob_prev != op)
> >         Py_FatalError("UNREF invalid object");
> >
> > I am developing in Visual Studio 2010, and I use the immediate window to
> > test those logic clauses.
> > There are two general situations where it happens:
> >
> > 1. A shared pointer to a message I created in the C++ runtime was passed
> > to an object in the Python runtime, processed, and the control was
> > returned back.  It secured and released the GIL in and out of that
> > Python crossing.  On the way out of the original C++ function it
> > naturally decrements the shared_ptr use count and starts to destroy it.
> > That is what I want to happen.  Call stack of relevant bits:
> >
> >   python26_d.dll!Py_FatalError(const char * msg)  Line 1679 C
> >   python26_d.dll!_Py_ForgetReference(_object * op)  Line 2178 + 0xa bytes C
> >   python26_d.dll!_Py_Dealloc(_object * op)  Line 2197 + 0x9 bytes C
> >   wva.exe!boost::python::xdecref<_object>(_object * p)  Line 36 + 0xb3 bytes C++
> >   wva.exe!boost::python::handle<_object>::reset()  Line 249 + 0xb bytes C++
> >   wva.exe!boost::python::converter::shared_ptr_deleter::operator()(const void * __formal)  Line 36 C++
> >   wva.exe!boost::detail::sp_counted_impl_pd<void *,boost::python::converter::shared_ptr_deleter>::dispose()  Line 149 C++
> >   wva.exe!boost::detail::sp_counted_base::release()  Line 102 + 0xf bytes C++
> >   wva.exe!boost::detail::shared_count::~shared_count()  Line 309 C++
> >
> > If I probe that if condition I see this:
> > op == &refchain
> > 0
> > op->_ob_prev->_ob_next != op
> > 0
> > op->_ob_next->_ob_prev != op
> > 0
> >
> > Nothing was true!  How could that conditional trigger?  All I can suppose
> > is a gremlin came in and changed a condition on me.  Something that has
> > concerned me is I don't grab the GIL when I deallocate these objects.  I
> > don't know how I'd do that.  I have suspected that was a liability for a
> > while, but I'm not entirely sure how.
> >
> > 2. Within the same block of C++ code, at the point that I'm trying to
> > transmit the message to the Python-derived object, it'll puke too.  So
> > here it has already created a shared_ptr for the message and is
> > triggering a callback into the Python code to handle it.  The
> > Python-derived object is being called through a wrapper, and that call
> > has the GIL.  This stack trace is much more obnoxious and it's difficult
> > for me to make any sense of it.  Note there's some Stackless stuff in
> > there.  I think the interpreter is at least starting to call some of the
> > code in the Python derivation, but I haven't been able to figure out how
> > far it gets.  I'll take any advice on how to probe this stuff since I
> > feel I am too vague here--look for square brackets on a few lines for
> > some things I figured out:
> >
> >   python26_d.dll!Py_FatalError(const char * msg)  Line 1679 C
> >   python26_d.dll!_Py_ForgetReference(_object * op)  Line 2178 C
> >   python26_d.dll!_Py_Dealloc(_object * op)  Line 2197 + 0x9 bytes C
> >   python26_d.dll!tupledealloc(PyTupleObject * op)  Line 170 + 0x86 bytes C
> >   python26_d.dll!_Py_Dealloc(_object * op)  Line 2198 + 0x7 bytes C
> >   python26_d.dll!PyObject_CallFunctionObjArgs(_object * callable, ...)  Line 2751 + 0x54 bytes C
> >   python26_d.dll!handle_callback(_PyWeakReference * ref, _object * callback)  Line 881 + 0xf bytes C
> >   python26_d.dll!PyObject_ClearWeakRefs(_object * object)  Line 928 + 0xd bytes C
> >   wva.exe!instance_dealloc(_object * inst)  Line 344 + 0xa bytes C++
> >       [This is Boost.Python class.cpp, statically linked]
> >   python26_d.dll!subtype_dealloc(_object * self)  Line 1020 + 0x7 bytes C
> >   python26_d.dll!_Py_Dealloc(_object * op)  Line 2198 + 0x7 bytes C
> >       [I know here it's deallocating a wrapped type for a 3d vector I was passing around]
> >   python26_d.dll!insertdict(_dictobject * mp, _object * key, long hash, _object * value)  Line 459 + 0x54 bytes C
> >       [It has replaced an existing 3d vector with the passed one, and trying to nuke the old one]
> >   python26_d.dll!PyDict_SetItem(_object * op, _object * key, _object * value)  Line 701 + 0x15 bytes C
> >   python26_d.dll!PyObject_GenericSetAttr(_object * obj, _object * name, _object * value)  Line 1504 + 0x11 bytes C
> >   python26_d.dll!PyObject_SetAttr(_object * v, _object * name, _object * value)  Line 1247 + 0x14 bytes C
> >       [value is my 3d vector I am passing around]
> >   python26_d.dll!PyEval_EvalFrame_value(_frame * f, int throwflag, _object * retval)  Line 2063 C
> >   python26_d.dll!PyEval_EvalFrameEx_slp(_frame * f, int throwflag, _object * retval)  Line 836 + 0x15 bytes C
> >   python26_d.dll!slp_frame_dispatch_top(_object * retval)  Line 719 + 0x12 bytes C
> >   python26_d.dll!slp_run_tasklet()  Line 1204 + 0x9 bytes C
> >   python26_d.dll!slp_eval_frame(_frame * f)  Line 299 + 0x5 bytes C
> >   python26_d.dll!climb_stack_and_eval_frame(_frame * f)  Line 266 + 0x9 bytes C
> >   python26_d.dll!slp_eval_frame(_frame * f)  Line 294 + 0x9 bytes C
> >   python26_d.dll!PyEval_EvalCodeEx(PyCodeObject * co, _object * globals, _object * locals, _object * * args, int argcount, _object * * kws, int kwcount, _object * * defs, int defcount, _object * closure)  Line 3294 + 0x6 bytes C
> >   python26_d.dll!function_call(_object * func, _object * arg, _object * kw)  Line 540 + 0x40 bytes C
> >   python26_d.dll!PyObject_Call(_object * func, _object * arg, _object * kw)  Line 2502 + 0x3c bytes C
> >   python26_d.dll!instancemethod_call(_object * func, _object * arg, _object * kw)  Line 2586 + 0x11 bytes C
> >   python26_d.dll!PyObject_Call(_object * func, _object * arg, _object * kw)  Line 2502 + 0x3c bytes C
> >   python26_d.dll!PyEval_CallObjectWithKeywords(_object * func, _object * arg, _object * kw)  Line 3931 + 0x11 bytes C
> >   python26_d.dll!PyEval_CallFunction(_object * obj, const char * format, ...)  Line 556 + 0xf bytes C
> >   wva.exe!boost::python::override::operator()<boost::shared_ptr<game::IComponentCommunicatable>,unsigned int,boost::shared_ptr<game::ComponentMessage> >(const boost::shared_ptr<game::IComponentCommunicatable> & a0, const unsigned int & a1, const boost::shared_ptr<game::ComponentMessage> & a2)  Line 138 + 0xac bytes C++
> >   wva.exe!game::ComponentWrapper::IncomingSignalEvent(boost::shared_ptr<game::IComponentCommunicatable> source, unsigned int id, boost::shared_ptr<game::ComponentMessage> message)  Line 234 + 0x4b bytes C++
> >
> > At least this time the logic condition is true...
> > op == &refchain
> > 0
> > op->_ob_prev->_ob_next != op
> > 1
> > op->_ob_next->_ob_prev != op
> > 0
> >
> > What it's trying to free is of type __PyWeakref_RefType.
> >
> > The 3d vector wrapper has a lot of methods, but I think of particular
> > interest would be the class block.  It looks like this:
> >
> >   class_<vector3df>("vector3df", init<float, float, float>())
> >
> > Stuff in Python can create them and they can also get passed around from
> > C++ to Python and back.
> >
> > The impression I get is that something is getting freed before its time.
> > I couldn't tell if it's the shared_ptr self-destructing or the Python GC
> > jumping the gun.
> >
> > I don't have a good impression of what is wrong so I'm finding it hard to
> > write a simplified, self-contained example for the list.  I didn't
> > expect a silver bullet with this first message, but I figured somebody
> > had enough experience that I could start isolating things and pare it
> > down.
> >
>
>
> --
> Technology & Consulting Services - ned Productions Limited.
> http://www.nedproductions.biz/. VAT reg: IE 9708311Q.
> Work Portfolio: http://careers.stackoverflow.com/nialldouglas/
>
>
>
> _______________________________________________
> Cplusplus-sig mailing list
> Cplusplus-sig@python.org
> http://mail.python.org/mailman/listinfo/cplusplus-sig
>
