Re: [Numpy-discussion] The mu.py script will keep running and never end.

Hongyi Zhao Sun, 11 Oct 2020 03:28:31 -0700

On Sun, Oct 11, 2020 at 2:56 PM Evgeni Burovski
<[email protected]> wrote:
>
> The script seems to be computing the particle numbers for an array of 
> chemical potentials.
>
> Two ways of speeding it up, both likely simpler than using dask:


What do you mean by *dask*?

>
> First: use numpy
>
> 1. Move constructing mu_all out of the loop (np.linspace)
> 2. Arrange the integrands into a 2d array
> 3. np.trapz along an axis which corresponds to a single integrand array
> (Or avoid the overhead of trapz by just implementing the trapezoid formula 
> manually)
>
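A minimal sketch of steps 1 to 3 above, under stated assumptions: the `energy` and `DOS` arrays here are synthetic stand-ins for the data that mu.py loads, and the grid sizes and `kT` value are made up for illustration. The `fermi` formula is the one visible in the traceback further down the thread. This is not the original script, only one way the suggestion might look.

```python
import numpy as np

def fermi(E, mu, kT):
    # Fermi-Dirac occupation, same formula as in the traceback.
    return 1.0 / (np.exp((E - mu) / kT) + 1.0)

# Hypothetical inputs: the real script reads energy and DOS from its data.
energy = np.linspace(-11.0, 20.0, 1500)
DOS = np.exp(-0.5 * ((energy - 2.0) / 3.0) ** 2)
kT = 0.05  # assumed value, chosen so that exp() does not overflow here

# Step 1: build all chemical potentials once, outside any loop.
mu_all = np.linspace(energy[0], energy[-1], 600)

# Step 2: broadcast to a 2D array of shape (n_mu, n_energy),
# one integrand per row.
integrands = DOS * fermi(energy[np.newaxis, :], mu_all[:, np.newaxis], kT)

# Step 3: integrate every row in a single call along the energy axis.
# np.trapezoid is the NumPy 2.x name; older releases only have np.trapz.
trapezoid = getattr(np, "trapezoid", None) or np.trapz
N_all = trapezoid(integrands, energy, axis=1)

# Alternative to step 3: the trapezoid formula written out by hand.
dE = np.diff(energy)
N_manual = 0.5 * ((integrands[:, 1:] + integrands[:, :-1]) * dE).sum(axis=1)

print(N_all.shape, np.allclose(N_all, N_manual))
```

The 2D array holds n_mu * n_energy floats, so for large grids it may be necessary to process `mu_all` in chunks rather than all at once.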
> Second:
>
> Move the loop into cython.

Would moving the loop into Cython be more efficient than parallelizing it
with a Python module such as joblib?
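For comparison, here is a rough sketch of what parallelizing over chunks of chemical potentials might look like. It uses the standard library's `ThreadPoolExecutor` as a stand-in rather than joblib; with joblib, the same per-chunk function could be dispatched through `Parallel` and `delayed`. The helper name `particle_numbers`, the synthetic `energy` and `DOS` arrays, the chunk count, and `kT` are all assumptions made for illustration. Whether this beats a Cython loop or plain vectorization has to be measured on the real data.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def fermi(E, mu, kT):
    # Fermi-Dirac occupation, same formula as in the traceback.
    return 1.0 / (np.exp((E - mu) / kT) + 1.0)

def particle_numbers(mu_chunk, energy, DOS, kT):
    # Hypothetical helper: integrate DOS * fermi over energy for each mu
    # in the chunk, using the trapezoid formula written out by hand.
    integrands = DOS * fermi(energy[np.newaxis, :], mu_chunk[:, np.newaxis], kT)
    dE = np.diff(energy)
    return 0.5 * ((integrands[:, 1:] + integrands[:, :-1]) * dE).sum(axis=1)

# Hypothetical inputs: the real script reads energy and DOS from its data.
energy = np.linspace(-11.0, 20.0, 1500)
DOS = np.exp(-0.5 * ((energy - 2.0) / 3.0) ** 2)
kT = 0.05
mu_all = np.linspace(energy[0], energy[-1], 600)

# Split the chemical potentials into independent chunks of work.
chunks = np.array_split(mu_all, 4)

# pool.map returns results in the same order as the input chunks.
with ThreadPoolExecutor(max_workers=4) as pool:
    parts = list(pool.map(lambda c: particle_numbers(c, energy, DOS, kT), chunks))

N_parallel = np.concatenate(parts)
print(N_parallel.shape)
```

Threads can only help to the extent that NumPy releases the GIL inside its array operations; a process-based backend avoids that limit but pays for copying the arrays to each worker.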

>
>
>
>
> Sun, 11 Oct 2020, 9:32 Hongyi Zhao <[email protected]>:
>>
>> On Sun, Oct 11, 2020 at 2:02 PM Andrea Gavana <[email protected]> 
>> wrote:
>> >
>> >
>> >
>> > On Sun, 11 Oct 2020 at 07.52, Hongyi Zhao <[email protected]> wrote:
>> >>
>> >> On Sun, Oct 11, 2020 at 1:33 PM Andrea Gavana <[email protected]> 
>> >> wrote:
>> >> >
>> >> >
>> >> >
>> >> > On Sun, 11 Oct 2020 at 07.14, Andrea Gavana <[email protected]> 
>> >> > wrote:
>> >> >>
>> >> >> Hi,
>> >> >>
>> >> >> On Sun, 11 Oct 2020 at 00.27, Hongyi Zhao <[email protected]> 
>> >> >> wrote:
>> >> >>>
>> >> >>> On Sun, Oct 11, 2020 at 1:48 AM Robert Kern <[email protected]> 
>> >> >>> wrote:
>> >> >>> >
>> >> >>> > You don't need to use vectorize() on fermi(). fermi() will work 
>> >> >>> > just fine on arrays and should be much faster.
>> >> >>>
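To illustrate the point about `vectorize()`: a small sketch, assuming `fermi` is written with NumPy's `exp` and using a made-up energy grid and parameter values. `np.vectorize` calls the Python function once per element, while the direct call runs the whole array through NumPy's compiled loops; both should give the same numbers.

```python
import numpy as np

def fermi(E, mu, kT):
    # Fermi-Dirac occupation, same formula as in the traceback.
    return 1.0 / (np.exp((E - mu) / kT) + 1.0)

# Hypothetical inputs, chosen so that exp() does not overflow.
energy = np.linspace(-11.0, 20.0, 2001)
mu, kT = 0.0, 0.05

# Direct call: the arithmetic is applied to the whole array at once.
occ_direct = fermi(energy, mu, kT)

# np.vectorize: a convenience wrapper around a Python-level loop,
# so fermi is invoked separately for every element of energy.
occ_vectorized = np.vectorize(fermi)(energy, mu, kT)

print(np.allclose(occ_direct, occ_vectorized))
```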
>> >> >>> Yes, it really does the trick. See the following for the benchmark
>> >> >>> based on your suggestion:
>> >> >>>
>> >> >>> $ time python mu.py
>> >> >>> [-10.999 -10.999 -10.999 ...  20.     20.     20.   ] [4.973e-84 4.973e-84 4.973e-84 ... 4.973e-84 4.973e-84 4.973e-84]
>> >> >>>
>> >> >>> real    0m41.056s
>> >> >>> user    0m43.970s
>> >> >>> sys    0m3.813s
>> >> >>>
>> >> >>>
>> >> >>> But are there any ways to further improve efficiency?
>> >> >>
>> >> >>
>> >> >>
>> >> >> I believe it will get a bit better if you don’t column_stack an array 
>> >> >> 6000 times - maybe pre-allocate your output first?
>> >> >>
>> >> >> Andrea.
>> >> >
>> >> >
>> >> >
>> >> > I’m sorry, scratch that: I saw a ghost whitespace in front of your 
>> >> > column_stack call and it made me think you were stacking your results 
>> >> > many times over, which is not the case.
>> >>
>> >> Your solution to this problem is still not clear to me. Could you
>> >> please post the corresponding snippet of your enhancement here?
>> >
>> >
>> > I have no solution. I originally thought you were calling “column_stack” 
>> > 6000 times in the loop, but that is not the case; I was mistaken. My 
>> > apologies for that.
>> >
>> > The timing of your approach is highly dependent on the sizes of your 
>> > “energy” and “DOS” arrays -
>>
>> The sizes of the “energy” and “DOS” arrays are determined by the problem and
>> shouldn't be reduced arbitrarily.
>>
>> > not to mention calling trapz 6000 times in a loop.
>>
>> I'm currently thinking about parallelizing the execution of the for
>> loop, say, with joblib <https://github.com/joblib/joblib>, but I still
>> haven't figured out the corresponding code. If you have some
>> experience with this type of solution, could you please give me some
>> more hints?
>>
>> >  Maybe there’s a better way to do it with another approach, but at the 
>> > moment I can’t think of one...
>> >
>> >>
>> >>
>> >> Regards,
>> >> HY
>> >> >
>> >> >>
>> >> >>
>> >> >>>
>> >> >>>
>> >> >>> Regards,
>> >> >>> HY
>> >> >>>
>> >> >>> >
>> >> >>> > On Sat, Oct 10, 2020, 8:23 AM Hongyi Zhao <[email protected]> 
>> >> >>> > wrote:
>> >> >>> >>
>> >> >>> >> Hi,
>> >> >>> >>
>> >> >>> >> My environment is Ubuntu 20.04 and Python 3.8.3 managed by pyenv. I
>> >> >>> >> tried to run the script
>> >> >>> >> <https://notebook.rcc.uchicago.edu/files/acs.chemmater.9b05047/Data/bulk/dft/mu.py>,
>> >> >>> >> but it keeps running and never finishes. When I press 'Ctrl + C' to
>> >> >>> >> terminate it, it gives the following output:
>> >> >>> >>
>> >> >>> >> $ python mu.py
>> >> >>> >> [-10.999 -10.999 -10.999 ...  20.     20.     20.   ] [4.973e-84 4.973e-84 4.973e-84 ... 4.973e-84 4.973e-84 4.973e-84]
>> >> >>> >>
>> >> >>> >> I had to terminate it, and obtained the following traceback:
>> >> >>> >>
>> >> >>> >> ^CTraceback (most recent call last):
>> >> >>> >>   File "mu.py", line 38, in <module>
>> >> >>> >>     integrand=DOS*fermi_array(energy,mu,kT)
>> >> >>> >>   File "/home/werner/.pyenv/versions/datasci/lib/python3.8/site-packages/numpy/lib/function_base.py", line 2108, in __call__
>> >> >>> >>     return self._vectorize_call(func=func, args=vargs)
>> >> >>> >>   File "/home/werner/.pyenv/versions/datasci/lib/python3.8/site-packages/numpy/lib/function_base.py", line 2192, in _vectorize_call
>> >> >>> >>     outputs = ufunc(*inputs)
>> >> >>> >>   File "mu.py", line 8, in fermi
>> >> >>> >>     return 1./(exp((E-mu)/kT)+1)
>> >> >>> >> KeyboardInterrupt
>> >> >>> >>
>> >> >>> >>
>> >> >>> >> Any help or hints on this problem would be highly appreciated.
>> >> >>> >>
>> >> >>> >> Regards,
>> >> >>> >> --
>> >> >>> >> Hongyi Zhao <[email protected]>
>> >> >>> >> _______________________________________________
>> >> >>> >> NumPy-Discussion mailing list
>> >> >>> >> [email protected]
>> >> >>> >> https://mail.python.org/mailman/listinfo/numpy-discussion
>> >> >>> >
>> >> >>>
>> >> >>>
>> >> >>>
>> >> >>> --
>> >> >>> Hongyi Zhao <[email protected]>
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Hongyi Zhao <[email protected]>
>> >
>>
>>
>>
>> --
>> Hongyi Zhao <[email protected]>
>



-- 
Hongyi Zhao <[email protected]>