[issue37224] test__xxsubinterpreters fails randomly

2020-03-11 Thread Kyle Stanley


Kyle Stanley  added the comment:

I have a few spare cycles to take another stab at this issue. I can say with 
some certainty that the failure in test__xxsubinterpreters.DestroyTests does 
not occur on the latest commit to master (3.9):

```
$ ./python -m test test__xxsubinterpreters --match DestroyTests -j200 -F -v

OK
0:09:28 load avg: 80.87 [2188] test__xxsubinterpreters passed
test_all (test.test__xxsubinterpreters.DestroyTests) ... ok
test_already_destroyed (test.test__xxsubinterpreters.DestroyTests) ... ok
test_bad_id (test.test__xxsubinterpreters.DestroyTests) ... ok
test_does_not_exist (test.test__xxsubinterpreters.DestroyTests) ... ok
test_from_current (test.test__xxsubinterpreters.DestroyTests) ... ok
test_from_other_thread (test.test__xxsubinterpreters.DestroyTests) ... ok
test_from_sibling (test.test__xxsubinterpreters.DestroyTests) ... ok
test_main (test.test__xxsubinterpreters.DestroyTests) ... ok
test_one (test.test__xxsubinterpreters.DestroyTests) ... ok
test_still_running (test.test__xxsubinterpreters.DestroyTests) ... ok

== Tests result: INTERRUPTED ==
Test suite interrupted by signal SIGINT.

2188 tests OK.
```

So at this point it seems to be a matter of looking through the diff between 
3.8 and 3.9 for any relevant code paths, and attempting to determine what change 
resolved the failure for 3.9. It might also be useful to determine whether the 
failure occurred when the tests were first added in 3.8 or whether some other 
commit introduced a subtle regression.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue37224] test__xxsubinterpreters fails randomly

2020-02-04 Thread Kyle Stanley


Kyle Stanley  added the comment:

> Thanks, Kyle.  That helps at least a little. :)

No problem. (:

I'll certainly spend some additional time investigating the main issue when I 
have the chance to, but in the meantime that test change should make it 
slightly easier to determine the point of failure.

--




[issue37224] test__xxsubinterpreters fails randomly

2020-02-04 Thread Eric Snow


Eric Snow  added the comment:

Thanks, Kyle.  That helps at least a little. :)

--




[issue37224] test__xxsubinterpreters fails randomly

2020-02-04 Thread miss-islington


miss-islington  added the comment:


New changeset 9a740b6c7e7a88185d79128b8a1993ac387d5091 by Miss Islington (bot) 
in branch '3.8':
bpo-37224: Improve test__xxsubinterpreters.DestroyTests (GH-18058)
https://github.com/python/cpython/commit/9a740b6c7e7a88185d79128b8a1993ac387d5091


--




[issue37224] test__xxsubinterpreters fails randomly

2020-02-02 Thread miss-islington


Change by miss-islington :


--
pull_requests: +17693
pull_request: https://github.com/python/cpython/pull/18318




[issue37224] test__xxsubinterpreters fails randomly

2020-01-31 Thread miss-islington


miss-islington  added the comment:


New changeset f03a8f8d5001963ad5b5b28dbd95497e9cc15596 by Kyle Stanley in 
branch 'master':
bpo-37224: Improve test__xxsubinterpreters.DestroyTests (GH-18058)
https://github.com/python/cpython/commit/f03a8f8d5001963ad5b5b28dbd95497e9cc15596


--
nosy: +miss-islington




[issue37224] test__xxsubinterpreters fails randomly

2020-01-18 Thread Kyle Stanley


Change by Kyle Stanley :


--
pull_requests: +17452
pull_request: https://github.com/python/cpython/pull/18058




[issue37224] test__xxsubinterpreters fails randomly

2020-01-17 Thread Eric Snow


Eric Snow  added the comment:

On Wed, Jan 15, 2020 at 12:20 AM Kyle Stanley  wrote:
> As can be seen from the results above, the interpreter is not even running in
> the first place before it's destroyed, so of course destroy() won't raise a
> RuntimeError. I think this proves that interpreters.destroy() is _not_ where we
> should be focusing our efforts (at least for now). Instead, we should first
> investigate why it's not even running at this point.

Good catch.

> I suspect the issue _might_ be a race condition within the "_running()"
> context manager that's preventing the interpreter from being run, but I'll
> have to do some further investigation.

Sounds good.

> Notably, a rather hard-to-explain side effect occurred from adding the new
> assertion.
> [snip]
> But I have no explanation for this.

Yeah, that sounds a bit strange.  Keep in mind that there have been
other changes in this part of the runtime code, so this might be
related.  Or I suppose it could be a side effect of calling
is_running() (though that definitely should not have side effects).

> do you think it might be worth adding in the changes I made to 
> DestroyTests.test_still_running above?

Yeah, it's a good sanity check on the assumptions made by the test.
Please do open a PR and request a review from me.

--




[issue37224] test__xxsubinterpreters fails randomly

2020-01-15 Thread STINNER Victor


Change by STINNER Victor :


--
nosy:  -vstinner




[issue37224] test__xxsubinterpreters fails randomly

2020-01-14 Thread Kyle Stanley


Kyle Stanley  added the comment:

I just made a rather interesting discovery. Instead of specifically focusing my 
efforts on the logic with interp_destroy(), I decided to take a closer look at 
the failing unit test itself. 

The main test within DestroyTests that's failing is the following:

```
def test_still_running(self):
    main, = interpreters.list_all()
    interp = interpreters.create()
    with _running(interp):
        with self.assertRaises(RuntimeError):
            interpreters.destroy(interp)
        self.assertTrue(interpreters.is_running(interp))
```

(Specifically, "self.assertRaises(RuntimeError): interpreters.destroy(interp)" 
is the main point of failure)

In order to be 100% certain that it was an issue occurring from 
interpreters.destroy(), I decided to add in a bit of a "sanity check" to ensure 
the interpreter was actually running in the first place before destroying it (I 
also added some extra debugging info):

```
def test_still_running(self):
    main, = interpreters.list_all()
    interp = interpreters.create()
    with _running(interp):
        self.assertTrue(interpreters.is_running(interp),
                        msg=f"Interp {interp} should be running before destruction.")

        with self.assertRaises(RuntimeError,
                               msg=f"Should not be able to destroy interp {interp} while"
                                   " it's still running."):
            interpreters.destroy(interp)

        self.assertTrue(interpreters.is_running(interp))
```

The results were very interesting...

```
OK
0:00:49 load avg: 135.49 [306/1] test__xxsubinterpreters failed
test_all (test.test__xxsubinterpreters.DestroyTests) ... ok
test_already_destroyed (test.test__xxsubinterpreters.DestroyTests) ... ok
test_bad_id (test.test__xxsubinterpreters.DestroyTests) ... ok
test_does_not_exist (test.test__xxsubinterpreters.DestroyTests) ... ok
test_from_current (test.test__xxsubinterpreters.DestroyTests) ... ok
test_from_other_thread (test.test__xxsubinterpreters.DestroyTests) ... ok
test_from_sibling (test.test__xxsubinterpreters.DestroyTests) ... ok
test_main (test.test__xxsubinterpreters.DestroyTests) ... ok
test_one (test.test__xxsubinterpreters.DestroyTests) ... ok
test_still_running (test.test__xxsubinterpreters.DestroyTests) ... FAIL

==
FAIL: test_still_running (test.test__xxsubinterpreters.DestroyTests)
--
Traceback (most recent call last):
  File "/home/aeros/repos/aeros-cpython/Lib/test/test__xxsubinterpreters.py", line 763, in test_still_running
    self.assertTrue(interpreters.is_running(interp),
AssertionError: False is not true : Interp 12 should be running before destruction.

--
```

As can be seen from the results above, the interpreter is not even running in 
the first place before it's destroyed, so of course destroy() won't raise a 
RuntimeError. I think this proves that interpreters.destroy() is _not_ where we 
should be focusing our efforts (at least for now). Instead, we should first 
investigate why it's not even running at this point. I suspect the issue 
_might_ be a race condition within the "_running()" context manager that's 
preventing the interpreter from being run, but I'll have to do some further 
investigation.
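
To illustrate the kind of race suspected here, below is a minimal sketch in plain Python. It does not use the real "_running()" helper or the _xxsubinterpreters module; "running_worker", "state", and the two events are hypothetical names chosen for the example. The assumption is that a helper which starts a thread and yields immediately can hand control to the "with" body before the thread has begun its work, and that waiting on an explicit "started" signal closes that window.

```python
import contextlib
import threading


@contextlib.contextmanager
def running_worker(state, timeout=5.0):
    """Hypothetical stand-in for a test helper that keeps a worker "running"."""
    started = threading.Event()   # set by the worker once it is really running
    release = threading.Event()   # set by the caller to let the worker finish

    def run():
        state["running"] = True
        started.set()
        release.wait()
        state["running"] = False

    thread = threading.Thread(target=run)
    thread.start()
    try:
        # Without this wait, the "with" body could start before run() has
        # executed its first line and observe state["running"] == False.
        if not started.wait(timeout):
            raise RuntimeError("worker did not start in time")
        yield
    finally:
        release.set()
        thread.join()
```

With the wait in place, an assertion at the top of the "with" body (like the new is_running() check above) can no longer fail just because the worker had not been scheduled yet. Whether the real failure has this cause is still an open question at this point in the thread.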

I also ran this ~20 times to be 100% certain, and every single one of those 
times the point of failure was at the new assertion check I added before 
destroy(). 

Notably, a rather hard-to-explain side effect occurred from adding the new 
assertion. The average number of tests before failure increased by a 
significant amount. In the above test, it was able to pass 306 iterations 
before failing, and in one of my earlier tests it reached over 1000. 

That never happened before on the 3.8 branch; it would very consistently fail 
in the first set of parallel workers, if not very soon after. I can say that 
with a degree of certainty as well, since I've run this set of tests countless 
times while trying to debug the failure. But I have no explanation for this.

Do you have any potential ideas, Eric? Also, do you think it might be worth 
adding in the changes I made to DestroyTests.test_still_running above? It 
doesn't directly address the core failure occurring, but I think it improves 
the test significantly; both in functionality and debugging info. I would be 
glad to open a PR if you think the test changes might be useful, as well as 
make any needed adjustments to them.

--

[issue37224] test__xxsubinterpreters fails randomly

2020-01-14 Thread Kyle Stanley


Kyle Stanley  added the comment:

> I also just realized that I can run 
> "test.test__xxsubinterpreters.DestroyTests" by itself with:

> ./python -m test test__xxsubinterpreters.DestroyTests -j200 -F -v

Oops, the correct syntax is:

./python -m test test__xxsubinterpreters --match DestroyTests -j200 -F -v

--




[issue37224] test__xxsubinterpreters fails randomly

2020-01-14 Thread Kyle Stanley


Kyle Stanley  added the comment:

I also just realized that I can run "test.test__xxsubinterpreters.DestroyTests" 
by itself with:

./python -m test test__xxsubinterpreters.DestroyTests -j200 -F -v

For some reason, I hadn't thought of running that class of tests by itself to 
isolate the failure. Prior to this issue, I didn't have experience in debugging 
a group of intermittent failures occurring across different tests. For most 
bugs I've worked on so far, it was a single, clearly defined point of failure; 
or a behavioral issue that wasn't covered in the regression tests.

But, that's certainly useful to know for future debugging and will help to 
improve my workflow for further investigating this issue. It's a lot more 
effective than adding a bunch of skip test annotations throughout the test file!
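
For comparison, the same idea of isolating one test class can be sketched with plain unittest instead of regrtest. This is only an illustration: the two test classes below are placeholders invented for the example, not the real classes from test__xxsubinterpreters, and "run_only" is a hypothetical helper name.

```python
import io
import unittest


class DestroyTestsDemo(unittest.TestCase):
    """Placeholder for the one class we want to run in isolation."""

    def test_one(self):
        self.assertTrue(True)


class OtherTestsDemo(unittest.TestCase):
    """Placeholder for the classes we want to leave out."""

    def test_other(self):
        self.assertTrue(True)


def run_only(test_case_class):
    # Build a suite from a single TestCase class, so nothing else in the
    # module has to be skipped by hand.
    suite = unittest.TestLoader().loadTestsFromTestCase(test_case_class)
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite)


result = run_only(DestroyTestsDemo)
```

Only the tests from the selected class are collected, which is the effect the temporary skip annotations were approximating.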

--




[issue37224] test__xxsubinterpreters fails randomly

2020-01-14 Thread Kyle Stanley


Kyle Stanley  added the comment:

Update: I have a bit of good news and not so great news.

The good news is that I had some time to work on this again, specifically with 
isolating the failure in test__xxsubinterpreters.DestroyTests. Locally, I added 
a few temporary "@unittest.skip()" annotations to the other failing tests in an 
unmodified 3.8 until the only failure that would occur was the following:

==
FAIL: test_still_running (test.test__xxsubinterpreters.DestroyTests)
--
Traceback (most recent call last):
  File "/home/aeros/repos/aeros-cpython/Lib/test/test__xxsubinterpreters.py", line 764, in test_still_running
    interpreters.destroy(interp)
AssertionError: RuntimeError not raised

--

Once the failure was isolated, I applied my changes on top of that branch with 
the skipped tests to ensure my proposed change resolved the above failure. 
Unfortunately, it still occurred.

It was somewhat disappointing, but I'm certainly glad that I waited to 
thoroughly test the proposed changes before opening a PR. I'll continue working 
on this and see if I can find another fix.

--




[issue37224] test__xxsubinterpreters fails randomly

2020-01-09 Thread Kyle Stanley


Kyle Stanley  added the comment:

> For a struct-specific getter we usually end the prefix with an
> underscore: _PyInterpreter_IsFinalizing.  Otherwise, that's the same
> name I would have used. :)

Good to know, thanks!

> No worries (or hurries).  Just request a review from me when you're
> ready.  Thanks again for working on this!

No problem, and thanks for your continued patience. I've been busier than usual 
recently with the holidays and preparing for an upcoming technical interview w/ 
Amazon, so I didn't have time to thoroughly test my proposed fix. This would 
potentially be my first full-time job in the software engineering industry, so 
I have a lot to prepare for.

When I get the chance to open a PR, after confirming that it fully fixes the 
failure I was targeting, I'll be sure to add you as a reviewer. 

If any deadline comes up for this, feel free to move forward with implementing 
any part of my suggestions (or an entirely different fix that resolves the 
other failures at the same time). I should have time to work on this in the 
near future though.

Thanks again for the guidance and patience with this! I've learned a lot about 
the C-API and C in general (which has been a weaker area for me) from my time 
spent working on the issue.

--




[issue37224] test__xxsubinterpreters fails randomly

2019-12-17 Thread Eric Snow


Eric Snow  added the comment:

On Fri, Dec 13, 2019 at 8:08 PM Kyle Stanley  wrote:
> Yeah, I named it "_PyInterpreterIsFinalizing" and it's within 
> Include/cpython. Definitely open
> to suggestions on the name though, it's basically just a private getter for 
> interp->finalizing.

For a struct-specific getter we usually end the prefix with an
underscore: _PyInterpreter_IsFinalizing.  Otherwise, that's the same
name I would have used. :)

> Oh, awesome! In that case, I'll do some more rigorous testing before opening 
> the PR then;
> [snip]
> This might be a bit of a time-consuming process, but I should have time in
> the next week or so to work on it.

No worries (or hurries).  Just request a review from me when you're
ready.  Thanks again for working on this!

--




[issue37224] test__xxsubinterpreters fails randomly

2019-12-13 Thread Kyle Stanley


Kyle Stanley  added the comment:

> Yep, it has to use the public C-API just like any other module.  The
> function has a "_Py" prefix and is defined in Include/cpython, right?

Yeah, I named it "_PyInterpreterIsFinalizing" and it's within Include/cpython. 
Definitely open to suggestions on the name though, it's basically just a 
private getter for interp->finalizing.

> We don't need everything to be fixed in a single PR.  Feel free to
> create a PR just for the "finalizing" fix.

Oh, awesome! In that case, I'll do some more rigorous testing before opening 
the PR then; I'd like to be 99.99% certain that it at least resolves the 
following failure:

FAIL: test_still_running (test.test__xxsubinterpreters.DestroyTests)
--
Traceback (most recent call last):
  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/test/test__xxsubinterpreters.py", line 765, in test_still_running
    interpreters.destroy(interp)
AssertionError: RuntimeError not raised

Especially since it would be adding a new private C-API function, whose primary 
purpose is to address this specific failure.

This might be a bit of a time-consuming process, but I should have time in the 
next week or so to work on it.

--




[issue37224] test__xxsubinterpreters fails randomly

2019-12-13 Thread Eric Snow


Eric Snow  added the comment:

On Sat, Nov 30, 2019 at 9:23 PM Kyle Stanley  wrote:
> I have a few ideas that I'd like to test out for fixing this failure, and if 
> any of them produce positive results I'll report back.

Sounds good.

> Since the failures are still consistently occurring, I have not yet revised 
> GH-16293. I'll do that when/if I come up with a more thorough solution.

We don't need everything to be fixed in a single PR.  Feel free to
create a PR just for the "finalizing" fix.

--




[issue37224] test__xxsubinterpreters fails randomly

2019-12-13 Thread Eric Snow


Eric Snow  added the comment:

On Sat, Nov 30, 2019 at 9:23 PM Kyle Stanley  wrote:
> Based on the above hint, I was able to make some progress on a potential 
> solution. Thanks Eric.

That's great!

> Instead of only checking "frame->f_executing", I changed "_is_running()" to 
> also check the
> "finalizing" field of PyInterpreterState. The "finalizing" field is set to 1 
> in "Py_EndInterpreter()",
> so this ensures that an interpreter in the process of being destroyed is 
> considered "running",
> so that operations (such as running scripts, destroying the interpreter, etc) 
> can't occur during
> finalization.

Ah, that makes sense.

> I had to add a private function to the C-API in order to access 
> "interp->finalizing" from
> Modules/_xxsubinterpretersmodule.c due to the struct for PyInterpreterState 
> being internal only.

Yep, it has to use the public C-API just like any other module.  The
function has a "_Py" prefix and is defined in Include/cpython, right?

--




[issue37224] test__xxsubinterpreters fails randomly

2019-11-30 Thread Kyle Stanley


Kyle Stanley  added the comment:

> so that operations (such as running scripts, destroying the interpreter, etc) 
> can't occur during finalization

Clarification: by "destroying the interpreter" I am specifically referring to 
calling `interp_destroy()` after finalization has already started.

--




[issue37224] test__xxsubinterpreters fails randomly

2019-11-30 Thread Kyle Stanley


Kyle Stanley  added the comment:

> Regarding "is_running()", notice that it relies almost entirely on 
> "frame->f_executing".  That might not be enough (or maybe the behavior there 
> changed).  That would be worth checking out.

Based on the above hint, I was able to make some progress on a potential 
solution. Thanks Eric.

Instead of only checking "frame->f_executing", I changed "_is_running()" to 
also check the "finalizing" field of PyInterpreterState. The "finalizing" field 
is set to 1 in "Py_EndInterpreter()", so this ensures that an interpreter in 
the process of being destroyed is considered "running", so that operations 
(such as running scripts, destroying the interpreter, etc) can't occur during 
finalization. I had to add a private function to the C-API in order to access 
"interp->finalizing" from Modules/_xxsubinterpretersmodule.c due to the struct 
for PyInterpreterState being internal only.
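
As a rough model of the change described above, here is a sketch in plain Python rather than the actual C code. "InterpState", its two fields, and the error message are hypothetical stand-ins for "frame->f_executing" and "interp->finalizing"; the only point is the combined check.

```python
from dataclasses import dataclass


@dataclass
class InterpState:
    """Hypothetical, simplified stand-in for the C-level interpreter state."""
    frame_executing: bool = False  # models frame->f_executing
    finalizing: bool = False       # models interp->finalizing


def is_running(interp: InterpState) -> bool:
    # An interpreter that is being torn down is treated as still "running",
    # so callers refuse to run code in it or to destroy it a second time.
    return interp.frame_executing or interp.finalizing


def destroy(interp: InterpState) -> None:
    if is_running(interp):
        raise RuntimeError("interpreter is running or finalizing")
    interp.finalizing = True  # finalization has started
```

In this model a second destroy() on the same interpreter raises RuntimeError, because the first call already marked it as finalizing.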

The above fix seems to completely remove the test failure that occurs in 
"interpreters.destroy(interp)" in "test_still_running" after running it 
several times, but I'm able to consistently reproduce the following:

Exception in thread Thread-8:
Traceback (most recent call last):
  File "/home/aeros/repos/aeros-cpython/Lib/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/home/aeros/repos/aeros-cpython/Lib/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/aeros/repos/aeros-cpython/Lib/test/test__xxsubinterpreters.py", line 51, in run
    interpreters.run_string(interp, dedent(f"""
RuntimeError: unrecognized interpreter ID 46
test test__xxsubinterpreters failed -- Traceback (most recent call last):
  File "/home/aeros/repos/aeros-cpython/Lib/test/test__xxsubinterpreters.py", line 492, in test_subinterpreter
    self.assertTrue(interpreters.is_running(interp))
AssertionError: False is not true

I have a few ideas that I'd like to test out for fixing this failure, and if 
any of them produce positive results I'll report back. Since the failures are 
still consistently occurring, I have not yet revised GH-16293. I'll do that 
when/if I come up with a more thorough solution.

--




[issue37224] test__xxsubinterpreters fails randomly

2019-11-23 Thread Kyle Stanley


Kyle Stanley  added the comment:

> So, I was finally able to replicate a failure in test_still_running locally, 
> it required using a rather ridiculous number of parallel workers

I forgot to mention that I was able to replicate the above failure on the 
latest commit to the 3.8 branch. I was _unable_ to replicate the failure on the 
latest commit to master, even when scaling up the number of parallel workers 
(the max I tested was 300 workers and ~3k iterations of 
test__xxsubinterpreters).

--




[issue37224] test__xxsubinterpreters fails randomly

2019-11-23 Thread Kyle Stanley


Kyle Stanley  added the comment:

> I was able to consistently reproduce the above failure using 200 parallel 
> workers, even without `-F`. 

Oops, I didn't mean without passing `-F`, as this would result in only a single 
test being run. I meant without letting it repeat multiple times, as in the 
test failure would consistently occur in the first batch of 200 parallel 
workers.
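
To make "the first batch" concrete, here is a small sketch of running a check in repeated parallel batches and reporting which batch hit the first failure. It is only an analogy: regrtest's -j option uses worker processes, while this example uses threads, and "run_until_failure" and "FailsOnNthCall" are hypothetical names made up for the illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_until_failure(check, workers, max_batches):
    """Run `check` in batches of `workers` parallel calls.

    Returns the 1-based number of the first batch containing a failure,
    or None if every batch passed.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch in range(1, max_batches + 1):
            futures = [pool.submit(check) for _ in range(workers)]
            results = [future.result() for future in futures]
            if not all(results):
                return batch
    return None


class FailsOnNthCall:
    """Deterministic fake of a flaky check: only the n-th call fails."""

    def __init__(self, n):
        self._n = n
        self._calls = 0
        self._lock = threading.Lock()

    def __call__(self):
        with self._lock:
            self._calls += 1
            return self._calls != self._n
```

A result of 1 would correspond to the situation described above, where the failure already shows up among the first set of parallel workers without any repetition.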

--




[issue37224] test__xxsubinterpreters fails randomly

2019-11-22 Thread Kyle Stanley


Kyle Stanley  added the comment:

So, I was finally able to replicate a failure in test_still_running locally; it 
required using a rather ridiculous number of parallel workers:

$ ./python -m test test__xxsubinterpreters -j200 -F
...
Exception in thread Thread-7:
Traceback (most recent call last):
  File "/home/aeros/repos/aeros-cpython/Lib/threading.py", line 944, in _bootstrap_inner
    self.run()
  File "/home/aeros/repos/aeros-cpython/Lib/threading.py", line 882, in run
    self._target(*self._args, **self._kwargs)
  File "/home/aeros/repos/aeros-cpython/Lib/test/test__xxsubinterpreters.py", line 51, in run
    interpreters.run_string(interp, dedent(f"""
RuntimeError: unrecognized interpreter ID 39
test test__xxsubinterpreters failed -- Traceback (most recent call last):
  File "/home/aeros/repos/aeros-cpython/Lib/test/test__xxsubinterpreters.py", line 766, in test_still_running
    interpreters.destroy(interp)
AssertionError: RuntimeError not raised
...
== Tests result: FAILURE ==

94 tests OK.

1 test failed:
test__xxsubinterpreters

Total duration: 1 min 49 sec
Tests result: FAILURE

OS: Arch Linux x86_64
Kernel: 5.3.11
CPU: Intel i5-4460

I was able to consistently reproduce the above failure using 200 parallel 
workers, even without `-F`. 

I have a few different theories that I'd like to test for fixing the failure; 
I'll report back if any of them yield positive results.

--




[issue37224] test__xxsubinterpreters fails randomly

2019-11-22 Thread Kyle Stanley


Kyle Stanley  added the comment:

> Sorry I haven't gotten back to you sooner, Kyle.  Thanks for working on this. 
>  I'm looking at your PR right now.

> BTW, Kyle, your problem-solving approach on this is on-track.  Don't get 
> discouraged.  This stuff is tricky. :)

No problem at all, and thank you for the words of encouragement. (:

> Regarding "is_running()", notice that it relies almost entirely on 
> "frame->f_executing".  That might not be enough (or maybe the behavior there 
> changed).  That would be worth checking out.

Hmm, that's an interesting point that I hadn't considered. 

> @aeros, feel free to keep investigating.  I'd be glad to help you out.  
> Otherwise I'll dive into this probably next week.

Sounds good, I'll do some further digging around, particularly anywhere that 
interacts with PyFrameObject's `f_executing` field. I think it's possible that 
there's a non-obvious issue with `_is_running()`, where it works correctly 99% 
of the time. That seems to be a significant commonality between the different 
areas where the intermittent failures are occurring: they all directly or 
indirectly call `_is_running()`.

Also, thanks again Eric for the PR review. Looking back at it after Eric's 
analysis and having had a couple of months to think it over, I don't think that 
GH-16293 is the correct solution. It seems unlikely that this is being caused 
by a lack of proper GIL acquisition, as that would likely be causing far more 
consistent build failures. I'm thinking that it's more likely to be an issue 
with either:

1) Subinterpreters occasionally disappearing due to premature cleanup (as Eric 
suggested)

2) _is_running() being incorrect a small percentage of the time

--




[issue37224] test__xxsubinterpreters fails randomly

2019-11-22 Thread Eric Snow


Eric Snow  added the comment:

Thus far these are the failures we've seen:

* not running when we expect it to be running:
   * interpreters.is_running(interp)
   * interpreters.run_string(interp, ...)
   * interpreters.destroy(interp)
* can't find the interpreter even though we expect it to exist
   * interpreters.run_string(interp, ...)
* finds it running when we expect it to not be running
   * interpreters.run_string(interp, ...)

Except for the last one (which might be a separate issue), they all look like 
they could be explained by the same thing: the subinterpreter stopped (or went 
away) prematurely.  That could be related to the code in 
_xxsubinterpretersmodule.c or it could be the cleanup code that makes sure 
interpreters get cleaned up at the end of tests (e.g. running too soon).  
Either way I expect the fix will be in the module code and not the tests.

Regarding "is_running()", notice that it relies almost entirely on 
"frame->f_executing".  That might not be enough (or maybe the behavior there 
changed).  That would be worth checking out.

@aeros, feel free to keep investigating.  I'd be glad to help you out.  
Otherwise I'll dive into this probably next week.

--




[issue37224] test__xxsubinterpreters fails randomly

2019-11-22 Thread Eric Snow


Eric Snow  added the comment:

BTW, Kyle, your problem-solving approach on this is on-track.  Don't get 
discouraged.  This stuff is tricky. :)

--




[issue37224] test__xxsubinterpreters fails randomly

2019-11-22 Thread Eric Snow


Eric Snow  added the comment:

Sorry I haven't gotten back to you sooner, Kyle.  Thanks for working on this.  
I'm looking at your PR right now.

--




[issue37224] test__xxsubinterpreters fails randomly

2019-11-22 Thread Eric Snow


Change by Eric Snow :


--
keywords: +patch
pull_requests: +16843
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/16293




[issue37224] test__xxsubinterpreters fails randomly

2019-11-14 Thread STINNER Victor

STINNER Victor  added the comment:

AMD64 Windows8.1 Refleaks 3.8:
https://buildbot.python.org/all/#/builders/224/builds/151

0:54:23 load avg: 5.62 [306/423/3] test__xxsubinterpreters failed -- running: test_asyncio (4 min 26 sec), test_zipfile (6 min 6 sec), test_compileall (6 min 21 sec)
beginning 6 repetitions
123456
..Exception in thread Thread-27:
Traceback (most recent call last):
  File "D:\buildarea\3.8.ware-win81-release.refleak\build\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "D:\buildarea\3.8.ware-win81-release.refleak\build\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "D:\buildarea\3.8.ware-win81-release.refleak\build\lib\test\test__xxsubinterpreters.py", line 51, in run
    interpreters.run_string(interp, dedent(f"""
RuntimeError: unrecognized interpreter ID 175
test test__xxsubinterpreters failed -- Traceback (most recent call last):
  File "D:\buildarea\3.8.ware-win81-release.refleak\build\lib\test\test__xxsubinterpreters.py", line 763, in test_still_running
    interpreters.destroy(interp)
AssertionError: RuntimeError not raised

FAIL: test_subinterpreter (test.test__xxsubinterpreters.IsRunningTests)

--




[issue37224] test__xxsubinterpreters fails randomly

2019-10-28 Thread STINNER Victor


STINNER Victor  added the comment:

See also bpo-33868: test__xxsubinterpreters fails randomly since at least 
2018-06-15.

--




[issue37224] test__xxsubinterpreters fails randomly

2019-10-23 Thread STINNER Victor


STINNER Victor  added the comment:

AMD64 FreeBSD Shared 3.x:

https://buildbot.python.org/all/#/builders/371/builds/7

test_subinterpreter (test.test__xxsubinterpreters.IsRunningTests) ... Exception in thread Thread-8:
Traceback (most recent call last):
  File "/usr/home/buildbot/python/3.x.koobs-freebsd-564d/build/Lib/threading.py", line 944, in _bootstrap_inner
    self.run()
  File "/usr/home/buildbot/python/3.x.koobs-freebsd-564d/build/Lib/threading.py", line 882, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/home/buildbot/python/3.x.koobs-freebsd-564d/build/Lib/test/test__xxsubinterpreters.py", line 51, in run
    interpreters.run_string(interp, dedent(f"""
RuntimeError: unrecognized interpreter ID 46
FAIL

test_already_running (test.test__xxsubinterpreters.RunStringTests) ... Exception in thread Thread-9:
Traceback (most recent call last):
  File "/usr/home/buildbot/python/3.x.koobs-freebsd-564d/build/Lib/threading.py", line 944, in _bootstrap_inner
    self.run()
  File "/usr/home/buildbot/python/3.x.koobs-freebsd-564d/build/Lib/threading.py", line 882, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/home/buildbot/python/3.x.koobs-freebsd-564d/build/Lib/test/test__xxsubinterpreters.py", line 51, in run
    interpreters.run_string(interp, dedent(f"""
RuntimeError: interpreter already running
FAIL


==
FAIL: test_subinterpreter (test.test__xxsubinterpreters.IsRunningTests)
--
Traceback (most recent call last):
  File "/usr/home/buildbot/python/3.x.koobs-freebsd-564d/build/Lib/test/test__xxsubinterpreters.py", line 492, in test_subinterpreter
    self.assertTrue(interpreters.is_running(interp))
AssertionError: False is not true

==
FAIL: test_already_running (test.test__xxsubinterpreters.RunStringTests)
--
Traceback (most recent call last):
  File "/usr/home/buildbot/python/3.x.koobs-freebsd-564d/build/Lib/test/test__xxsubinterpreters.py", line 853, in test_already_running
    interpreters.run_string(self.id, 'print("spam")')
AssertionError: RuntimeError not raised

--




[issue37224] test__xxsubinterpreters fails randomly

2019-10-18 Thread STINNER Victor


STINNER Victor  added the comment:

The race condition can be reproduced on Linux by running many parallel test 
worker processes. For example, I reproduced it on my laptop (8 CPUs) using:

$ ./python -m test test__xxsubinterpreters -F -j30
...
0:00:26 load avg: 17.86 [ 24] test__xxsubinterpreters passed
0:00:26 load avg: 17.86 [ 25] test__xxsubinterpreters passed
0:00:26 load avg: 17.86 [ 26] test__xxsubinterpreters passed
0:00:27 load avg: 17.86 [ 27/1] test__xxsubinterpreters failed
Exception in thread Thread-8:
Traceback (most recent call last):
  File "/home/vstinner/python/master/Lib/threading.py", line 944, in _bootstrap_inner
    self.run()
  File "/home/vstinner/python/master/Lib/threading.py", line 882, in run
    self._target(*self._args, **self._kwargs)
  File "/home/vstinner/python/master/Lib/test/test__xxsubinterpreters.py", line 51, in run
    interpreters.run_string(interp, dedent(f"""
RuntimeError: unrecognized interpreter ID 46
test test__xxsubinterpreters failed -- Traceback (most recent call last):
  File "/home/vstinner/python/master/Lib/test/test__xxsubinterpreters.py", line 492, in test_subinterpreter
    self.assertTrue(interpreters.is_running(interp))
AssertionError: False is not true
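
One plausible reading of this output, offered as an assumption rather than a confirmed diagnosis, is a startup race: the test's main thread checks the interpreter state before the helper thread has actually begun running code in it. The sketch below reproduces that class of race with plain threads only. `FakeInterpreter`, `check_is_running`, and `startup_delay` are hypothetical names for illustration; none of this is the real `_xxsubinterpreters` API or the test's own helper.

```python
import threading
import time


class FakeInterpreter:
    """Hypothetical stand-in for a subinterpreter; not the real API."""

    def __init__(self):
        self.running = False

    def run(self, started, release):
        self.running = True
        started.set()      # announce that the "script" is now executing
        release.wait()     # stay "running" until the main thread releases us
        self.running = False


def check_is_running(synchronize, startup_delay=0.05):
    """Return what the main thread observes right after starting the worker."""
    interp = FakeInterpreter()
    started = threading.Event()
    release = threading.Event()

    def worker():
        # Simulate a slow thread start, as might happen on a heavily loaded host.
        time.sleep(startup_delay)
        interp.run(started, release)

    t = threading.Thread(target=worker)
    t.start()
    if synchronize:
        started.wait()     # do not look at the state until the worker is running
    observed = interp.running
    release.set()          # let the worker finish
    t.join()
    return observed


if __name__ == "__main__":
    print("unsynchronized:", check_is_running(synchronize=False))
    print("synchronized:  ", check_is_running(synchronize=True))
```

Without the `started.wait()` handshake the observed state depends on scheduling, which would match a failure that only shows up under high load (`-j30`). With the handshake the check cannot run before the worker reports that it is running.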

--




[issue37224] test__xxsubinterpreters fails randomly

2019-10-14 Thread Kyle Stanley


Kyle Stanley  added the comment:

> Kyle Stanley proposed a fix: PR 16293.

Note that I'm not confident about the fix I proposed in GH-16293; it was more 
of an idea for fixing the intermittent failures than anything else. It 
definitely needs review from someone knowledgeable about sub-interpreters and 
the PyInterpreterState API. This isn't an area that I'm experienced in, but I 
am interested in it.

--




[issue37224] test__xxsubinterpreters fails randomly

2019-10-14 Thread STINNER Victor


STINNER Victor  added the comment:

AMD64 FreeBSD CURRENT Shared 3.8:
https://buildbot.python.org/all/#/builders/212/builds/310

FAIL: test_still_running (test.test__xxsubinterpreters.DestroyTests)
FAIL: test_run_coroutine_threadsafe_with_timeout (test.test_asyncio.test_tasks.RunCoroutineThreadsafeTests)
1:01:44 load avg: 7.72 Re-running failed tests in verbose mode
1:01:44 load avg: 7.72 Re-running test__xxsubinterpreters in verbose mode
FAIL: test_still_running (test.test__xxsubinterpreters.DestroyTests)
1:02:01 load avg: 7.58 Re-running test_asyncio in verbose mode

--




[issue37224] test__xxsubinterpreters fails randomly

2019-10-14 Thread STINNER Victor


STINNER Victor  added the comment:

AMD64 FreeBSD CURRENT Shared 3.x:
https://buildbot.python.org/all/#/builders/168/builds/1630

FAIL: test_still_running (test.test__xxsubinterpreters.DestroyTests)
1:07:09 load avg: 8.18 Re-running failed tests in verbose mode
1:07:09 load avg: 8.18 Re-running test__xxsubinterpreters in verbose mode
FAIL: test_still_running (test.test__xxsubinterpreters.DestroyTests)
FAIL: test_subinterpreter (test.test__xxsubinterpreters.IsRunningTests)
FAIL: test_already_running (test.test__xxsubinterpreters.RunStringTests)

--




[issue37224] test__xxsubinterpreters fails randomly

2019-10-14 Thread STINNER Victor


Change by STINNER Victor :


--
title: test__xxsubinterpreters failed on AMD64 Windows8.1 Refleaks 3.8 -> 
test__xxsubinterpreters fails randomly
