On Wednesday, August 19, 2020 at 3:04:18 PM UTC+1 erik....@gmail.com wrote:
> On Friday, August 14, 2020 at 1:44:38 PM UTC+2 david....@gmail.com wrote:
>
>> Same here on a fresh clone with macOS 10.15.16
>>
>> [dochtml] Building en/constructions.
>> [dochtml]
>> [dochtml] [construct] building [html]: targets for 16 source files that are out of date
>> [dochtml] [construct] updating environment: [new config] 16 added, 0 changed, 0 removed
>> ^C[dochtml] Error building the documentation.
>> [dochtml] Traceback (most recent call last):
>> [dochtml]   File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/runpy.py", line 193, in _run_module_as_main
>> [dochtml]     "__main__", mod_spec)
>> [dochtml]   File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/runpy.py", line 85, in _run_code
>> [dochtml]     exec(code, run_globals)
>> [dochtml]   File "/Users/dcoudert/sage/local/lib/python3.7/site-packages/sage_setup/docbuild/__main__.py", line 2, in <module>
>> [dochtml]     main()
>> [dochtml]   File "/Users/dcoudert/sage/local/lib/python3.7/site-packages/sage_setup/docbuild/__init__.py", line 1721, in main
>> [dochtml]     builder()
>> [dochtml]   File "/Users/dcoudert/sage/local/lib/python3.7/site-packages/sage_setup/docbuild/__init__.py", line 337, in _wrapper
>> [dochtml]     build_many(build_other_doc, L)
>> [dochtml]   File "/Users/dcoudert/sage/local/lib/python3.7/site-packages/sage_setup/docbuild/__init__.py", line 281, in build_many
>> [dochtml]     _build_many(target, args, processes=NUM_THREADS)
>> [dochtml]   File "/Users/dcoudert/sage/local/lib/python3.7/site-packages/sage_setup/docbuild/utils.py", line 263, in build_many
>> [dochtml]     waited_pid, waited_exitcode = wait_for_one()
>> [dochtml]   File "/Users/dcoudert/sage/local/lib/python3.7/site-packages/sage_setup/docbuild/utils.py", line 179, in wait_for_one
>> [dochtml]     pid, sts = os.wait()
>> [dochtml]   File "src/cysignals/signals.pyx", line 320, in cysignals.signals.python_check_interrupt
>> [dochtml] KeyboardInterrupt
>> [dochtml]
>> [dochtml] Note: incremental documentation builds sometimes cause spurious
>> [dochtml] error messages. To be certain that these are real errors, run
>> [dochtml] "make doc-clean" first and try again.
>> make[3]: *** [doc-html] Error 1
>> make[2]: *** [all-start] Interrupt: 2
>> make[1]: *** [all-start] Interrupt: 2
>> make: *** [all] Interrupt: 2
>>
> As I noted at [1], this implies that one or more of the docbuilds are
> running some code that hangs forever, so it would help to narrow it down by
> finding out what code it's running to cause a hang. Normally in docbuilds
> the most likely code to run will be some plotting code, so you can try to
> take the plot code present in the relevant documentation pages, run it, and
> see if it hangs. Chances are if you run it in a single process it might
> *not* hang; frequently this happens only in a forked subprocess. So you
> can try something like:
>
>     from multiprocessing import Process
>     p = Process(target=<function implementing the plotting code>)
>     p.start()
>
> and see if it hangs. I've often found this to be the case with calls to
> np.dot(), which matplotlib sometimes uses to perform various
> transformations, and which in turn often results in a call to a
> multi-threaded OpenBLAS function; these are sometimes buggy.
>
> As an aside, I would like to make the parallel doc build code more robust
> to this kind of hang, but it's hard to say exactly what the right solution
> is (I know more or less what the technical solution is, but I mean more the
> UX solution). As far as the docbuild code is concerned, it doesn't
> know that the process is "hung". It's just taking however long it needs to
> take, and the code will wait for it to finish. Perhaps we could implement
> some kind of timeout to kill docbuild processes that are taking too long
> (but what should the timeout be?), or report back to the user exactly which
> docbuilds are still running (like a log message at some interval) so that
> the user can decide whether or not to take action.
>

I found out that the problem is due to OpenMP, see
https://trac.sagemath.org/ticket/30351#comment:50
The workaround (or perhaps the only way to proceed, I don't know) is to set
the environment variable OMP_NUM_THREADS=1.

>
> [1] https://trac.sagemath.org/ticket/30351#comment:40
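
For anyone who wants to try the forked-subprocess test quoted above without having to press Ctrl-C, here is a minimal sketch that adds a timeout to it. It is only an illustration, not what the docbuild code does: finishes_in_time and suspect_plot_code are hypothetical names, the 60-second limit is arbitrary, and the "fork" start method is assumed to be available (Unix only).

```python
import multiprocessing as mp
import time


def finishes_in_time(target, timeout):
    """Run target() in a forked child process.

    Returns True if the child exits cleanly within `timeout` seconds,
    False if it is still running (and had to be terminated) or failed.
    """
    # "fork" is requested explicitly because the hang reportedly shows up
    # only in forked children; this start method is Unix-only.
    ctx = mp.get_context("fork")
    p = ctx.Process(target=target)
    p.start()
    p.join(timeout)      # wait at most `timeout` seconds
    if p.is_alive():     # still running after the timeout: treat as hung
        p.terminate()
        p.join()
        return False
    return p.exitcode == 0


def suspect_plot_code():
    # Hypothetical placeholder: paste the plotting code from the
    # documentation page under suspicion here.
    time.sleep(0.1)


if __name__ == "__main__":
    # 60 seconds is an arbitrary choice, not a recommended value.
    ok = finishes_in_time(suspect_plot_code, 60)
    print("finished" if ok else "hung or failed")
```

To check the workaround, run the same script once as usual and once with OMP_NUM_THREADS=1 set in the shell environment before Python starts, and compare the two results.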