On Sun, Oct 11, 2020 at 2:56 PM Evgeni Burovski <evgeny.burovs...@gmail.com> wrote:
>
> The script seems to be computing the particle numbers for an array of
> chemical potentials.
>
> Two ways of speeding it up, both are likely simpler than using dask:

What do you mean by *dask*?

>
> First: use numpy
>
> 1. Move constructing mu_all out of the loop (np.linspace)
> 2. Arrange the integrands into a 2d array
> 3. np.trapz along an axis which corresponds to a single integrand array
> (Or avoid the overhead of trapz by just implementing the trapezoid formula
> manually)
>
> Second:
>
> Move the loop into cython.

Will this be more efficient than parallelizing the loop with a Python
module such as joblib?

>
> On Sun, Oct 11, 2020 at 9:32, Hongyi Zhao <hongyi.z...@gmail.com> wrote:
>>
>> On Sun, Oct 11, 2020 at 2:02 PM Andrea Gavana <andrea.gav...@gmail.com>
>> wrote:
>> >
>> > On Sun, 11 Oct 2020 at 07.52, Hongyi Zhao <hongyi.z...@gmail.com> wrote:
>> >>
>> >> On Sun, Oct 11, 2020 at 1:33 PM Andrea Gavana <andrea.gav...@gmail.com>
>> >> wrote:
>> >> >
>> >> > On Sun, 11 Oct 2020 at 07.14, Andrea Gavana <andrea.gav...@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> Hi,
>> >> >>
>> >> >> On Sun, 11 Oct 2020 at 00.27, Hongyi Zhao <hongyi.z...@gmail.com>
>> >> >> wrote:
>> >> >>>
>> >> >>> On Sun, Oct 11, 2020 at 1:48 AM Robert Kern <robert.k...@gmail.com>
>> >> >>> wrote:
>> >> >>> >
>> >> >>> > You don't need to use vectorize() on fermi(). fermi() will work
>> >> >>> > just fine on arrays and should be much faster.
>> >> >>>
>> >> >>> Yes, it really does the trick. See the following for the benchmark
>> >> >>> based on your suggestion:
>> >> >>>
>> >> >>> $ time python mu.py
>> >> >>> [-10.999 -10.999 -10.999 ... 20. 20. 20. ] [4.973e-84
>> >> >>> 4.973e-84 4.973e-84 ... 4.973e-84 4.973e-84 4.973e-84]
>> >> >>>
>> >> >>> real 0m41.056s
>> >> >>> user 0m43.970s
>> >> >>> sys 0m3.813s
>> >> >>>
>> >> >>> But are there any ways to further improve efficiency?
>> >> >>
>> >> >> I believe it will get a bit better if you don’t column_stack an array
>> >> >> 6000 times - maybe pre-allocate your output first?
>> >> >>
>> >> >> Andrea.
>> >> >
>> >> > I’m sorry, scratch that: I’ve seen a ghost white space in front of your
>> >> > column_stack call and it made me think you were stacking your results
>> >> > very many times, which is not the case.
>> >>
>> >> I'm still not clear on your solution for this problem. Could you
>> >> please post here the corresponding snippet of your enhancement?
>> >
>> > I have no solution, I originally thought you were calling “column_stack”
>> > 6000 times in the loop, but that is not the case, I was mistaken. My
>> > apologies for that.
>> >
>> > The timings of your approach are highly dependent on the size of your
>> > “energy” and “DOS” array -
>>
>> The sizes of the “energy” and “DOS” arrays are problem-related and
>> shouldn't be reduced arbitrarily.
>>
>> > not to mention calling trapz 6000 times in a loop.
>>
>> I'm currently thinking of parallelizing the execution of the for
>> loop, say, with joblib <https://github.com/joblib/joblib>, but I still
>> haven't figured out the corresponding code. If you have some
>> experience with this type of solution, could you please give me some
>> more hints?
>>
>> > Maybe there’s a better way to do it with another approach, but at the
>> > moment I can’t think of one...
>> >
>> >>
>> >> Regards,
>> >> HY
>> >> >
>> >> >>>
>> >> >>> Regards,
>> >> >>> HY
>> >> >>>
>> >> >>> >
>> >> >>> > On Sat, Oct 10, 2020, 8:23 AM Hongyi Zhao <hongyi.z...@gmail.com>
>> >> >>> > wrote:
>> >> >>> >>
>> >> >>> >> Hi,
>> >> >>> >>
>> >> >>> >> My environment is Ubuntu 20.04 and python 3.8.3 managed by pyenv. I
>> >> >>> >> tried to run the script
>> >> >>> >> <https://notebook.rcc.uchicago.edu/files/acs.chemmater.9b05047/Data/bulk/dft/mu.py>,
>> >> >>> >> but it keeps running and never ends. When I use 'Ctrl + c' to
>> >> >>> >> terminate it, it gives the following output:
>> >> >>> >>
>> >> >>> >> $ python mu.py
>> >> >>> >> [-10.999 -10.999 -10.999 ... 20. 20. 20. ] [4.973e-84
>> >> >>> >> 4.973e-84 4.973e-84 ... 4.973e-84 4.973e-84 4.973e-84]
>> >> >>> >>
>> >> >>> >> I had to terminate it and obtained the following information:
>> >> >>> >>
>> >> >>> >> ^CTraceback (most recent call last):
>> >> >>> >>   File "mu.py", line 38, in <module>
>> >> >>> >>     integrand=DOS*fermi_array(energy,mu,kT)
>> >> >>> >>   File
>> >> >>> >> "/home/werner/.pyenv/versions/datasci/lib/python3.8/site-packages/numpy/lib/function_base.py",
>> >> >>> >> line 2108, in __call__
>> >> >>> >>     return self._vectorize_call(func=func, args=vargs)
>> >> >>> >>   File
>> >> >>> >> "/home/werner/.pyenv/versions/datasci/lib/python3.8/site-packages/numpy/lib/function_base.py",
>> >> >>> >> line 2192, in _vectorize_call
>> >> >>> >>     outputs = ufunc(*inputs)
>> >> >>> >>   File "mu.py", line 8, in fermi
>> >> >>> >>     return 1./(exp((E-mu)/kT)+1)
>> >> >>> >> KeyboardInterrupt
>> >> >>> >>
>> >> >>> >> Any help or hints on this problem would be highly appreciated.
>> >> >>> >>
>> >> >>> >> Regards,
>> >> >>> >> --
>> >> >>> >> Hongyi Zhao <hongyi.z...@gmail.com>
>> >> >>> >> _______________________________________________
>> >> >>> >> NumPy-Discussion mailing list
>> >> >>> >> NumPy-Discussion@python.org
>> >> >>> >> https://mail.python.org/mailman/listinfo/numpy-discussion

--
Hongyi Zhao <hongyi.z...@gmail.com>
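
For reference, the first suggestion quoted above (build mu_all once, arrange the integrands into a 2d array, and apply the trapezoid formula along one axis) could look roughly like the sketch below. It is only an illustration under stated assumptions, not the code from mu.py: the function name particle_numbers and its signature are hypothetical, energy and dos are assumed to be 1d arrays of equal length, and fermi() is rewritten here from the expression shown in the traceback.

```python
import numpy as np


def fermi(E, mu, kT):
    """Fermi-Dirac occupation; works elementwise on arrays, no vectorize() needed."""
    return 1.0 / (np.exp((E - mu) / kT) + 1.0)


def particle_numbers(energy, dos, mu_all, kT):
    """Hypothetical helper: integrate dos * fermi over energy for every mu at once.

    energy, dos : 1d arrays of the same length (assumed)
    mu_all      : 1d array of chemical potentials, built once outside any loop
    Returns a 1d array with one particle number per entry of mu_all.
    """
    # 2d array of integrands: one row per chemical potential,
    # one column per energy grid point (broadcasting does the work).
    integrand = dos[np.newaxis, :] * fermi(
        energy[np.newaxis, :], mu_all[:, np.newaxis], kT
    )
    # Manual trapezoid formula along the energy axis (axis=1).
    dE = np.diff(energy)
    return 0.5 * np.sum((integrand[:, 1:] + integrand[:, :-1]) * dE, axis=1)


if __name__ == "__main__":
    # Made-up example data, chosen only to exercise the function.
    energy = np.linspace(-5.0, 5.0, 201)
    dos = np.ones_like(energy)
    mu_all = np.linspace(-1.0, 1.0, 5)
    print(particle_numbers(energy, dos, mu_all, kT=0.1))
```

The 2d integrand array has shape (len(mu_all), len(energy)), so for very long energy grids it may be necessary to process mu_all in chunks to keep memory use reasonable.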