Here is what I found just by analyzing the logs. It seems the first failures appeared after this change:
http://svn.python.org/view/python/branches/release30-maint/Objects/object.c?rev=67888&view=diff&r1=67888&r2=67887&p1=python/branches/release30-maint/Objects/object.c&p2=/python/branches/release30-maint/Objects/object.c

The logs of the failing test runs all show the same error message:

[31481 refs]
* ob
object  : <refcnt 0 at 0x3a97728>
type    : str
refcount: 0
address : 0x3a97728
* op->_ob_prev->_ob_next
object  : <refcnt 0 at 0x3a97728>
type    : str
refcount: 0
address : 0x3a97728
* op->_ob_next->_ob_prev
object  : [31776 refs]

This is the output of _Py_ForgetReference (which calls _PyObject_Dump), called either from _PyUnicode_New or from unicode_subtype_new. In both cases, this implies that PyObject_MALLOC returned NULL when allocating the internal array of a str object. However, I have no idea why malloc() is failing there.

By counting the number of [reftotal] markers printed in the log, I found that the failing test could be one of the following: test_invalid_args, test_invalid_bufsize, test_list2cmdline, test_no_leaking. Looking at the tests, it seems only test_no_leaking could be problematic:

* test_list2cmdline checks whether the subprocess.list2cmdline function works correctly; only Python code is involved here;
* test_invalid_args checks whether using an option unsupported on a platform raises an exception; only Python code is involved here;
* test_invalid_bufsize only checks whether Popen rejects a non-integer bufsize; only Python code is involved here.
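For anyone who wants to repeat the count, here is a rough sketch of how the [reftotal] markers could be tallied. It is only an illustration, not the exact procedure I used: the helper name count_reftotals_before and the sample text are assumptions, and the sample reuses only the two markers quoted above rather than a full buildbot log.

```python
import re

# Matches the "[N refs]" marker seen in the quoted log.
REFTOTAL = re.compile(r"\[(\d+) refs\]")


def count_reftotals_before(log_text, marker="* ob"):
    """Count "[N refs]" markers that occur before the first dump marker.

    Hypothetical helper: if the marker is absent, the whole text is scanned.
    """
    head, _sep, _tail = log_text.partition(marker)
    return len(REFTOTAL.findall(head))


# Illustrative excerpt built from the log lines quoted in this message.
sample_log = """\
[31481 refs]
* ob
object  : <refcnt 0 at 0x3a97728>
type    : str
refcount: 0
address : 0x3a97728
* op->_ob_next->_ob_prev
object  : [31776 refs]
"""

print(count_reftotals_before(sample_log))
```

Matching that count against the order of the tests in test_subprocess.py is what narrows the failure down to a handful of candidates.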
Unsurprisingly, test_no_leaking is the failing test:

test test_subprocess failed -- Traceback (most recent call last):
  File "/home/pybot/buildarea-sid/3.0.klose-debian-sparc/build/Lib/test/test_subprocess.py", line 423, in test_no_leaking
    data = p.communicate(b"lime")[0]
  File "/home/pybot/buildarea-sid/3.0.klose-debian-sparc/build/Lib/subprocess.py", line 671, in communicate
    return self._communicate(input)
  File "/home/pybot/buildarea-sid/3.0.klose-debian-sparc/build/Lib/subprocess.py", line 1171, in _communicate
    bytes_written = os.write(self.stdin.fileno(), chunk)
OSError: [Errno 32] Broken pipe

It seems one of the spawned processes runs out of memory while allocating a new PyUnicode object. I believe we don't see the usual MemoryError because the parent process captures the stderr and stdout of the children.

Also, only the klose-*-sparc buildbots are failing this way; loewis-sun is failing too, but for a different reason. So, how much memory is available on this machine (or actually, on this virtual machine)?

Now, I wonder why manipulating the GIL caused the bug to appear in 3.0, but not in 2.x. Maybe it is related to the new I/O library in Python 3.0.

Regards,
-- Alexandre

On Tue, Dec 30, 2008 at 4:20 PM, Nick Coghlan <ncogh...@gmail.com> wrote:
> Does anyone have local access to a sparc machine to try to track down
> the ongoing buildbot failures in test_subprocess?
>
> (I think the problem is specific to 3.x builds on sparc machines, but I
> haven't checked the buildbots all that closely - that assessment is just
> based on what I recall of the buildbot failure emails).
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
> ---------------------------------------------------------------
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> http://mail.python.org/mailman/options/python-dev/alexandre%40peadrop.com