On Wednesday, August 19, 2020 at 3:04:18 PM UTC+1 erik....@gmail.com wrote:

> On Friday, August 14, 2020 at 1:44:38 PM UTC+2 david....@gmail.com wrote:
>
>> Same here on a fresh clone with macOS 10.15.16
>>
>>
>> [dochtml] Building en/constructions.
>> [dochtml] 
>> [dochtml] [construct] building [html]: targets for 16 source files that are out of date
>> [dochtml] [construct] updating environment: [new config] 16 added, 0 changed, 0 removed
>> ^C[dochtml] Error building the documentation.
>> [dochtml] Traceback (most recent call last):
>> [dochtml]   File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/runpy.py", line 193, in _run_module_as_main
>> [dochtml]     "__main__", mod_spec)
>> [dochtml]   File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/runpy.py", line 85, in _run_code
>> [dochtml]     exec(code, run_globals)
>> [dochtml]   File "/Users/dcoudert/sage/local/lib/python3.7/site-packages/sage_setup/docbuild/__main__.py", line 2, in <module>
>> [dochtml]     main()
>> [dochtml]   File "/Users/dcoudert/sage/local/lib/python3.7/site-packages/sage_setup/docbuild/__init__.py", line 1721, in main
>> [dochtml]     builder()
>> [dochtml]   File "/Users/dcoudert/sage/local/lib/python3.7/site-packages/sage_setup/docbuild/__init__.py", line 337, in _wrapper
>> [dochtml]     build_many(build_other_doc, L)
>> [dochtml]   File "/Users/dcoudert/sage/local/lib/python3.7/site-packages/sage_setup/docbuild/__init__.py", line 281, in build_many
>> [dochtml]     _build_many(target, args, processes=NUM_THREADS)
>> [dochtml]   File "/Users/dcoudert/sage/local/lib/python3.7/site-packages/sage_setup/docbuild/utils.py", line 263, in build_many
>> [dochtml]     waited_pid, waited_exitcode = wait_for_one()
>> [dochtml]   File "/Users/dcoudert/sage/local/lib/python3.7/site-packages/sage_setup/docbuild/utils.py", line 179, in wait_for_one
>> [dochtml]     pid, sts = os.wait()
>> [dochtml]   File "src/cysignals/signals.pyx", line 320, in cysignals.signals.python_check_interrupt
>> [dochtml] KeyboardInterrupt
>> [dochtml] 
>> [dochtml]     Note: incremental documentation builds sometimes cause spurious
>> [dochtml]     error messages. To be certain that these are real errors, run
>> [dochtml]     "make doc-clean" first and try again.
>> make[3]: *** [doc-html] Error 1
>> make[2]: *** [all-start] Interrupt: 2
>> make[1]: *** [all-start] Interrupt: 2
>> make: *** [all] Interrupt: 2
>>
>>
>>
> As I noted at [1], this implies that one or more of the docbuilds are 
> running code that hangs forever, so it would help to find out which code 
> that is.  In docbuilds the most likely culprit is plotting code, so you can 
> take the plot code from the relevant documentation pages, run it, and 
> see if it hangs.  Chances are that it will *not* hang if you run it in a 
> single process; frequently this happens only in a forked subprocess.  So 
> you can try something like:
>
> from multiprocessing import Process
> p = Process(target=<function implementing the plotting code>)
> p.start()
>
> and see if it hangs. I've often found this to be the case with calls to 
> np.dot(), which matplotlib sometimes uses to perform various 
> transformations, and which in turn often results in a call to a 
> multi-threaded OpenBLAS function; these are sometimes buggy.
>
> As an aside, I would like to make the parallel doc build code more robust 
> to this kind of hang, but it's hard to say exactly what the right solution 
> is (I know more or less what the technical solution is; I mean the UX 
> solution).  As far as the docbuild code is concerned, it doesn't know that 
> the process is "hung": the process is just taking however long it needs to 
> take, and the code will wait for it to finish.  Perhaps we could implement 
> a timeout to kill docbuild processes that are taking too long (but what 
> should the timeout be?), or report back to the user which docbuilds are 
> still running (say, a log message at some interval) so that the user can 
> decide whether or not to take action.
>
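
For anyone who wants to try the forked-subprocess check quoted above, here 
is a minimal sketch that also adds the timeout and the periodic "still 
running" message discussed there.  The names run_with_timeout, quick_task 
and hanging_task are made up for illustration and are not part of the Sage 
docbuild code; replace the task functions with the plotting code under 
suspicion.  It assumes a POSIX system where the "fork" start method is 
available.

```python
import multiprocessing as mp
import time


def run_with_timeout(target, timeout, report_every=5.0, start_method="fork"):
    """Run target() in a child process and wait at most `timeout` seconds.

    Returns (finished, exitcode).  finished is False if the child was
    still running at the deadline and had to be terminated.
    """
    ctx = mp.get_context(start_method)
    proc = ctx.Process(target=target)
    proc.start()
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        # Wait a little, then check whether the child has exited.
        proc.join(min(report_every, remaining))
        if not proc.is_alive():
            return True, proc.exitcode
        print(f"pid {proc.pid} is still running", flush=True)
    # Deadline reached: treat the child as hung and stop it (SIGTERM).
    proc.terminate()
    proc.join()
    return False, proc.exitcode


def quick_task():
    # Hypothetical stand-in for plotting code that returns promptly.
    pass


def hanging_task():
    # Hypothetical stand-in for plotting code that never returns.
    while True:
        time.sleep(1)


if __name__ == "__main__":
    print(run_with_timeout(quick_task, timeout=10))
    print(run_with_timeout(hanging_task, timeout=2, report_every=1))
```

If the suspect code reports finished=False only when run this way, and not 
when called directly, that points at a fork-related hang.  What a sensible 
timeout would be for a real docbuild remains an open question.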

I found out that the problem is due to OpenMP; see 
https://trac.sagemath.org/ticket/30351#comment:50
The workaround (or perhaps the only way to proceed, I don't know) is to set 
the environment variable OMP_NUM_THREADS=1.
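
A minimal sketch of applying that workaround from a Python wrapper script, 
in case that is more convenient than exporting the variable in the shell.  
The helper names are hypothetical.  The variable is put into the 
environment of the child process before it starts, since OpenMP runtimes 
generally read OMP_NUM_THREADS when they initialize.

```python
import os
import subprocess
import sys


def env_with_single_omp_thread(base_env=None):
    """Return a copy of the environment with OMP_NUM_THREADS forced to 1."""
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = "1"
    return env


def run_single_threaded(cmd):
    """Run cmd with OMP_NUM_THREADS=1 and return its exit status."""
    return subprocess.run(cmd, env=env_with_single_omp_thread()).returncode


if __name__ == "__main__":
    # Show the value that a child process sees.
    check = "import os; print(os.environ['OMP_NUM_THREADS'])"
    run_single_threaded([sys.executable, "-c", check])
```

For the build itself, the call would be something like 
run_single_threaded(["make", "doc-html"]) from the Sage source directory; 
the doc-html target name is taken from the make output quoted above.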

 

>
> [1] https://trac.sagemath.org/ticket/30351#comment:40 
>
