[Numpy-discussion] Re: Polyfit error in displacement

2024-03-25 Thread Andrea Gavana
On Mon, 25 Mar 2024 at 20:09, Charles R Harris 
wrote:

>
>
> On Mon, Mar 25, 2024 at 11:28 AM Luca Bertolotti <
> luca72.bertolo...@gmail.com> wrote:
>
>> Hello
>> in a VB program they use a 3rd-degree approximation and get these values,
>> including displacement (SC):
>> [image: image.png]
>>
>> I think that I'm doing the same with numpy, but I get different values.
>> Can anyone help me, please?
>>
>> radious = [1821, 1284, 957, 603,450, 245]
>> y = [6722, 6940, 7227, 7864,8472, 10458]
>> p = np.polyfit(radious, y, 3,)
>> t = np.polyval(p, radious)
>> [ 6703.33694696  7061.23784145  7051.49974149  7838.84623289
>>   8654.47847319 10373.60076402]
>> You can see that polyval's output is different from the SC column of the
>> table. Any help is really appreciated.
>>
>
> What is sc?
>


At the beginning I thought it was the difference between the fitted y and
the measured y, but of course that is not the case.

For what it's worth, doing it in Excel using LINEST:

*=LINEST(A2:A7, B2:B7^{1,2,3})*

And then back-calculating the results of the fit, I get this for the
"fitted" y:

6703.34, 7061.24, 7051.50, 7838.85, 8654.48, 10373.60

which are indeed the same numbers obtained by numpy's polyfit. I also get
the same result using the automatic line fitting in an Excel graph.

Not that I trust Excel with anything, it was just for fun.
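For completeness, the NumPy side of the comparison can be reproduced with a short, self-contained sketch (data taken from the original post; the variable names are mine):

```python
import numpy as np

# Radii and measured y values from the original post
radius = np.array([1821, 1284, 957, 603, 450, 245], dtype=float)
y = np.array([6722, 6940, 7227, 7864, 8472, 10458], dtype=float)

# Cubic least-squares fit, then evaluate the polynomial at the input radii
p = np.polyfit(radius, y, 3)
t = np.polyval(p, radius)

# t matches the values quoted above: 6703.34, 7061.24, 7051.50, ...
```

Since polyfit and LINEST agree, the discrepancy most likely lies on the VB side.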

It's not entirely clear what the OP is doing in VB, but either we are
missing a crucial piece of information or the VB code is incorrect.

Andrea.
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/


Re: [Numpy-discussion] C-coded dot 1000x faster than numpy?

2021-02-23 Thread Andrea Gavana
Hi,

On Tue, 23 Feb 2021 at 19.11, Neal Becker  wrote:

> I have code that performs dot product of a 2D matrix of size (on the
> order of) [1000,16] with a vector of size [1000].  The matrix is
> float64 and the vector is complex128.  I was using numpy.dot but it
> turned out to be a bottleneck.
>
> So I coded dot2x1 in c++ (using xtensor-python just for the
> interface).  No fancy SIMD was used, unless g++ did it on its own.
>
> On a simple benchmark using timeit I find my hand-coded routine is on
> the order of 1000x faster than numpy?  Here is the test code:
> My custom c++ code is dot2x1.  I'm not copying it here because it has
> some dependencies.  Any idea what is going on?



I had a similar experience - albeit with an older numpy and Python 2.7, so
my comments may easily be outdated and irrelevant. This was on Windows 10,
64-bit, with way more than plenty of RAM.

It took me forever to find out that numpy.dot was the culprit, and I ended
up using Fortran + f2py. Even with the overhead of having to go through the
f2py bridge, the Fortran dot_product was several times faster.

Sorry if it doesn’t help much.
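One thing worth trying, assuming the slowdown comes from the mixed float64/complex128 dtypes (which can keep np.dot off the optimized BLAS path), is to split the complex vector into real and imaginary parts so that each product is a pure-real BLAS call. This is only a sketch; whether it helps will depend on the NumPy version and the BLAS it links against:

```python
import numpy as np

rng = np.random.default_rng(0)
a = np.ones((1000, 16))                                      # float64 matrix
b = rng.standard_normal(16) + 1j * rng.standard_normal(16)   # complex128 vector

# Mixed-dtype product (the reported slow path)
d_mixed = np.dot(a, b)

# Two pure-real products instead of one mixed-dtype one
d_split = np.dot(a, b.real) + 1j * np.dot(a, b.imag)

assert np.allclose(d_mixed, d_split)
```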

Andrea.



>
> import numpy as np
>
> from dot2x1 import dot2x1
>
> a = np.ones ((1000,16))
> b = np.array([ 0.80311816+0.80311816j,  0.80311816-0.80311816j,
>-0.80311816+0.80311816j, -0.80311816-0.80311816j,
> 1.09707981+0.29396165j,  1.09707981-0.29396165j,
>-1.09707981+0.29396165j, -1.09707981-0.29396165j,
> 0.29396165+1.09707981j,  0.29396165-1.09707981j,
>-0.29396165+1.09707981j, -0.29396165-1.09707981j,
> 0.25495815+0.25495815j,  0.25495815-0.25495815j,
>-0.25495815+0.25495815j, -0.25495815-0.25495815j])
>
> def F1():
> d = dot2x1 (a, b)
>
> def F2():
> d = np.dot (a, b)
>
> from timeit import timeit
> print (timeit ('F1()', globals=globals(), number=1000))
> print (timeit ('F2()', globals=globals(), number=1000))
>
> In [13]: 0.013910860987380147 << 1st timeit
> 28.608758996007964  << 2nd timeit
> --
> Those who don't understand recursion are doomed to repeat it
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] The mu.py script will keep running and never end.

2020-10-12 Thread Andrea Gavana
Hi,

On Mon, 12 Oct 2020 at 16.22, Hongyi Zhao  wrote:

> On Mon, Oct 12, 2020 at 9:33 PM Andrea Gavana 
> wrote:
> >
> > Hi,
> >
> > On Mon, 12 Oct 2020 at 14:38, Hongyi Zhao  wrote:
> >>
> >> On Sun, Oct 11, 2020 at 3:42 PM Evgeni Burovski
> >>  wrote:
> >> >
> >> > On Sun, Oct 11, 2020 at 9:55 AM Evgeni Burovski
> >> >  wrote:
> >> > >
> >> > > The script seems to be computing the particle numbers for an array
> of chemical potentials.
> >> > >
> >> > > Two ways of speeding it up, both are likely simpler than using dask:
> >> > >
> >> > > First: use numpy
> >> > >
> >> > > 1. Move constructing mu_all out of the loop (np.linspace)
> >> > > 2. Arrange the integrands into a 2d array
> >> > > 3. np.trapz along an axis which corresponds to a single integrand
> array
> >> > > (Or avoid the overhead of trapz by just implementing the trapezoid
> formula manually)
> >> >
> >> >
> >> > Roughly like this:
> >> > https://gist.github.com/ev-br/0250e4eee461670cf489515ee427eb99
> >>
> >> I've compared the execution efficiency of your above method and the
> >> original method of the python script, directly using fermi() without
> >> applying vectorize() to it. Very surprisingly, the latter is more
> >> efficient than the former; see the following for more info:
> >>
> >> $ time python fermi_integrate_np.py
> >> [[1.0300e+01 4.55561775e+17]
> >>  [1.03001000e+01 4.55561780e+17]
> >>  [1.03002000e+01 4.55561786e+17]
> >>  ...
> >>  [1.08997000e+01 1.33654085e+21]
> >>  [1.08998000e+01 1.33818034e+21]
> >>  [1.08999000e+01 1.33982054e+21]]
> >>
> >> real    1m8.797s
> >> user    0m47.204s
> >> sys     0m27.105s
> >> $ time python mu.py
> >> [[1.0300e+01 4.55561775e+17]
> >>  [1.03001000e+01 4.55561780e+17]
> >>  [1.03002000e+01 4.55561786e+17]
> >>  ...
> >>  [1.08997000e+01 1.33654085e+21]
> >>  [1.08998000e+01 1.33818034e+21]
> >>  [1.08999000e+01 1.33982054e+21]]
> >>
> >> real    0m38.829s
> >> user    0m41.541s
> >> sys     0m3.399s
> >>
> >> So, I think that the benchmark dataset you used for testing code
> >> efficiency is not so appropriate. What's your point of view on these
> >> test results?
> >
> >
> >
> >   Evgeni has provided an interesting example of how to speed up your
> code - granted, he used toy data, but the improvement is real. As far as I
> can see, you haven't specified how big your DOS etc. vectors are, so it's
> not obvious how to draw any conclusions. I find it highly puzzling
> that his implementation appears to be slower than your original code.
> >
> > In any case, if performance is so paramount for you, then I would
> suggest you move in the direction Evgeni was proposing, i.e. shifting
> your implementation to C/Cython or Fortran/f2py.
>
> If so, I think that the C/Fortran based implementations should be more
> efficient than the ones using Cython/f2py.


That is not what I meant. What I meant is: write the time-consuming part of
your code in C or Fortran, and then bridge it to Python using Cython or
f2py.

Andrea.


>
>
> > I had much better results myself using Fortran/f2py than pure NumPy or
> C/Cython, but this is mostly because my knowledge of Cython is quite
> limited. That said, your problem should be fairly easy to implement in a
> compiled language.
> >
> > Andrea.
> >
> >
> >>
> >>
> >> Regards,
> >> HY
> >>
> >> >
> >> >
> >> >
> >> > > Second:
> >> > >
> >> > > Move the loop into cython.
> >> > >
> >> > >
> >> > >
> >> > >
> >> > > вс, 11 окт. 2020 г., 9:32 Hongyi Zhao :
> >> > >>
> >> > >> On Sun, Oct 11, 2020 at 2:02 PM Andrea Gavana <
> andrea.gav...@gmail.com> wrote:
> >> > >> >
> >> > >> >
> >> > >> >
> >> > >> > On Sun, 11 Oct 2020 at 07.52, Hongyi Zhao 
> wrote:
> >> > >> >>
> >> > >> >> On Sun, Oct 11, 2020 at 1:33 PM Andrea Gavana <
> andrea

Re: [Numpy-discussion] The mu.py script will keep running and never end.

2020-10-12 Thread Andrea Gavana
Hi,

On Mon, 12 Oct 2020 at 14:38, Hongyi Zhao  wrote:

> On Sun, Oct 11, 2020 at 3:42 PM Evgeni Burovski
>  wrote:
> >
> > On Sun, Oct 11, 2020 at 9:55 AM Evgeni Burovski
> >  wrote:
> > >
> > > The script seems to be computing the particle numbers for an array of
> chemical potentials.
> > >
> > > Two ways of speeding it up, both are likely simpler than using dask:
> > >
> > > First: use numpy
> > >
> > > 1. Move constructing mu_all out of the loop (np.linspace)
> > > 2. Arrange the integrands into a 2d array
> > > 3. np.trapz along an axis which corresponds to a single integrand array
> > > (Or avoid the overhead of trapz by just implementing the trapezoid
> formula manually)
> >
> >
> > Roughly like this:
> > https://gist.github.com/ev-br/0250e4eee461670cf489515ee427eb99
>
> I've compared the execution efficiency of your above method and the
> original method of the python script, directly using fermi() without
> applying vectorize() to it. Very surprisingly, the latter is more
> efficient than the former; see the following for more info:
>
> $ time python fermi_integrate_np.py
> [[1.0300e+01 4.55561775e+17]
>  [1.03001000e+01 4.55561780e+17]
>  [1.03002000e+01 4.55561786e+17]
>  ...
>  [1.08997000e+01 1.33654085e+21]
>  [1.08998000e+01 1.33818034e+21]
>  [1.08999000e+01 1.33982054e+21]]
>
> real    1m8.797s
> user    0m47.204s
> sys     0m27.105s
> $ time python mu.py
> [[1.0300e+01 4.55561775e+17]
>  [1.03001000e+01 4.55561780e+17]
>  [1.03002000e+01 4.55561786e+17]
>  ...
>  [1.08997000e+01 1.33654085e+21]
>  [1.08998000e+01 1.33818034e+21]
>  [1.08999000e+01 1.33982054e+21]]
>
> real    0m38.829s
> user    0m41.541s
> sys     0m3.399s
>
> So, I think that the benchmark dataset you used for testing code
> efficiency is not so appropriate. What's your point of view on these
> test results?
>


  Evgeni has provided an interesting example of how to speed up your code -
granted, he used toy data, but the improvement is real. As far as I can see,
you haven't specified how big your DOS etc. vectors are, so it's not
obvious how to draw any conclusions. I find it highly puzzling that his
implementation appears to be slower than your original code.

In any case, if performance is so paramount for you, then I would suggest
you move in the direction Evgeni was proposing, i.e. shifting your
implementation to C/Cython or Fortran/f2py. I had much better results
myself using Fortran/f2py than with pure NumPy or C/Cython, but this is
mostly because my knowledge of Cython is quite limited. That said, your
problem should be fairly easy to implement in a compiled language.
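Evgeni's suggestion - arranging all the integrands into a 2D array and calling np.trapz once along an axis - can be sketched with toy data (the grid sizes, DOS, and kT below are assumptions, not the OP's actual inputs):

```python
import numpy as np

# np.trapz was renamed np.trapezoid in NumPy 2.0
trapz = np.trapezoid if hasattr(np, "trapezoid") else np.trapz

def fermi(E, mu, kT):
    # Elementwise on arrays; no np.vectorize needed
    with np.errstate(over="ignore"):  # exp overflow -> inf -> fermi -> 0
        return 1.0 / (np.exp((E - mu) / kT) + 1.0)

energy = np.linspace(-11.0, 20.0, 2001)        # toy energy grid
DOS = np.exp(-0.5 * (energy - 10.0) ** 2)      # toy density of states
kT = 0.025
mu_all = np.linspace(10.3, 10.9, 6000)         # all chemical potentials at once

# Broadcast to an (n_mu, n_energy) integrand matrix, integrate each row
integrand = DOS * fermi(energy[None, :], mu_all[:, None], kT)
N_all = trapz(integrand, energy, axis=1)

result = np.column_stack((mu_all, N_all))      # stack once, at the end
```

This replaces 6000 separate trapz calls with one vectorized call, at the cost of holding the full integrand matrix in memory.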

Andrea.



>
> Regards,
> HY
>
> >
> >
> >
> > > Second:
> > >
> > > Move the loop into cython.
> > >
> > >
> > >
> > >
> > > вс, 11 окт. 2020 г., 9:32 Hongyi Zhao :
> > >>
> > >> On Sun, Oct 11, 2020 at 2:02 PM Andrea Gavana <
> andrea.gav...@gmail.com> wrote:
> > >> >
> > >> >
> > >> >
> > >> > On Sun, 11 Oct 2020 at 07.52, Hongyi Zhao 
> wrote:
> > >> >>
> > >> >> On Sun, Oct 11, 2020 at 1:33 PM Andrea Gavana <
> andrea.gav...@gmail.com> wrote:
> > >> >> >
> > >> >> >
> > >> >> >
> > >> >> > On Sun, 11 Oct 2020 at 07.14, Andrea Gavana <
> andrea.gav...@gmail.com> wrote:
> > >> >> >>
> > >> >> >> Hi,
> > >> >> >>
> > >> >> >> On Sun, 11 Oct 2020 at 00.27, Hongyi Zhao <
> hongyi.z...@gmail.com> wrote:
> > >> >> >>>
> > >> >> >>> On Sun, Oct 11, 2020 at 1:48 AM Robert Kern <
> robert.k...@gmail.com> wrote:
> > >> >> >>> >
> > >> >> >>> > You don't need to use vectorize() on fermi(). fermi() will
> work just fine on arrays and should be much faster.
> > >> >> >>>
> > >> >> >>> Yes, it really does the trick. See the following for the
> benchmark
> > >> >> >>> based on your suggestion:
> > >> >> >>>
> > >> >> >>> $ time python mu.py
> > >> >> >>> [-10.999 -10.999 -10.999 ...  20. 20. 20.   ]
> [4.973e-84
> > >> >> >>> 4.973

Re: [Numpy-discussion] The mu.py script will keep running and never end.

2020-10-10 Thread Andrea Gavana
On Sun, 11 Oct 2020 at 07.52, Hongyi Zhao  wrote:

> On Sun, Oct 11, 2020 at 1:33 PM Andrea Gavana 
> wrote:
> >
> >
> >
> > On Sun, 11 Oct 2020 at 07.14, Andrea Gavana 
> wrote:
> >>
> >> Hi,
> >>
> >> On Sun, 11 Oct 2020 at 00.27, Hongyi Zhao 
> wrote:
> >>>
> >>> On Sun, Oct 11, 2020 at 1:48 AM Robert Kern 
> wrote:
> >>> >
> >>> > You don't need to use vectorize() on fermi(). fermi() will work just
> fine on arrays and should be much faster.
> >>>
> >>> Yes, it really does the trick. See the following for the benchmark
> >>> based on your suggestion:
> >>>
> >>> $ time python mu.py
> >>> [-10.999 -10.999 -10.999 ...  20. 20. 20.   ] [4.973e-84
> >>> 4.973e-84 4.973e-84 ... 4.973e-84 4.973e-84 4.973e-84]
> >>>
> >>> real    0m41.056s
> >>> user    0m43.970s
> >>> sys     0m3.813s
> >>>
> >>>
> >>> But are there any ways to further improve/increase efficiency?
> >>
> >>
> >>
> >> I believe it will get a bit better if you don’t column_stack an array
> 6000 times - maybe pre-allocate your output first?
> >>
> >> Andrea.
> >
> >
> >
> > I’m sorry, scratch that: I saw a ghost whitespace in front of your
> column_stack call, which made me think you were stacking your results very
> many times; that is not the case.
>
> Still not so clear on your solutions for this problem. Could you
> please post here the corresponding snippet of your enhancement?


I have no solution. I originally thought you were calling “column_stack”
6000 times in the loop, but that is not the case; I was mistaken. My
apologies for that.

The timing of your approach is highly dependent on the size of your
“energy” and “DOS” arrays, not to mention calling trapz 6000 times in a
loop. Maybe there’s a better way to do it with another approach, but at the
moment I can’t think of one...


>
> Regards,
> HY
> >
> >>
> >>
> >>>
> >>>
> >>> Regards,
> >>> HY
> >>>
> >>> >
> >>> > On Sat, Oct 10, 2020, 8:23 AM Hongyi Zhao 
> wrote:
> >>> >>
> >>> >> Hi,
> >>> >>
> >>> >> My environment is Ubuntu 20.04 and python 3.8.3 managed by pyenv. I
> >>> >> try to run the script
> >>> >> <
> https://notebook.rcc.uchicago.edu/files/acs.chemmater.9b05047/Data/bulk/dft/mu.py
> >,
> >>> >> but it will keep running and never end. When I use 'Ctrl + c' to
> >>> >> terminate it, it will give the following output:
> >>> >>
> >>> >> $ python mu.py
> >>> >> [-10.999 -10.999 -10.999 ...  20. 20. 20.   ] [4.973e-84
> >>> >> 4.973e-84 4.973e-84 ... 4.973e-84 4.973e-84 4.973e-84]
> >>> >>
> >>> >> I have to terminate it and obtained the following information:
> >>> >>
> >>> >> ^CTraceback (most recent call last):
> >>> >>   File "mu.py", line 38, in 
> >>> >> integrand=DOS*fermi_array(energy,mu,kT)
> >>> >>   File
> "/home/werner/.pyenv/versions/datasci/lib/python3.8/site-packages/numpy/lib/function_base.py",
> >>> >> line 2108, in __call__
> >>> >> return self._vectorize_call(func=func, args=vargs)
> >>> >>   File
> "/home/werner/.pyenv/versions/datasci/lib/python3.8/site-packages/numpy/lib/function_base.py",
> >>> >> line 2192, in _vectorize_call
> >>> >> outputs = ufunc(*inputs)
> >>> >>   File "mu.py", line 8, in fermi
> >>> >> return 1./(exp((E-mu)/kT)+1)
> >>> >> KeyboardInterrupt
> >>> >>
> >>> >>
> >>> >> Any help and hints for this problem will be highly appreciated.
> >>> >>
> >>> >> Regards,
> >>> >> --
> >>> >> Hongyi Zhao 
> >>> >> ___
> >>> >> NumPy-Discussion mailing list
> >>> >> NumPy-Discussion@python.org
> >>> >> https://mail.python.org/mailman/listinfo/numpy-discussion
> >>> >
> >>> > ___
> >>> > NumPy-Discussion mailing list
> >>> > NumPy-Discussion@python.org
> >>> > https://mail.python.org/mailman/listinfo/numpy-discussion
> >>>
> >>>
> >>>
> >>> --
> >>> Hongyi Zhao 
> >>> ___
> >>> NumPy-Discussion mailing list
> >>> NumPy-Discussion@python.org
> >>> https://mail.python.org/mailman/listinfo/numpy-discussion
> >
> > ___
> > NumPy-Discussion mailing list
> > NumPy-Discussion@python.org
> > https://mail.python.org/mailman/listinfo/numpy-discussion
>
>
>
> --
> Hongyi Zhao 
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>


Re: [Numpy-discussion] The mu.py script will keep running and never end.

2020-10-10 Thread Andrea Gavana
On Sun, 11 Oct 2020 at 07.14, Andrea Gavana  wrote:

> Hi,
>
> On Sun, 11 Oct 2020 at 00.27, Hongyi Zhao  wrote:
>
>> On Sun, Oct 11, 2020 at 1:48 AM Robert Kern 
>> wrote:
>> >
>> > You don't need to use vectorize() on fermi(). fermi() will work just
>> fine on arrays and should be much faster.
>>
>> Yes, it really does the trick. See the following for the benchmark
>> based on your suggestion:
>>
>> $ time python mu.py
>> [-10.999 -10.999 -10.999 ...  20. 20. 20.   ] [4.973e-84
>> 4.973e-84 4.973e-84 ... 4.973e-84 4.973e-84 4.973e-84]
>>
>> real    0m41.056s
>> user    0m43.970s
>> sys     0m3.813s
>>
>>
>> But are there any ways to further improve/increase efficiency?
>
>
>
> I believe it will get a bit better if you don’t column_stack an array 6000
> times - maybe pre-allocate your output first?
>
> Andrea.
>


I’m sorry, scratch that: I saw a ghost whitespace in front of your
column_stack call, which made me think you were stacking your results very
many times; that is not the case.


>
>
>>
>> Regards,
>> HY
>>
>> >
>> > On Sat, Oct 10, 2020, 8:23 AM Hongyi Zhao 
>> wrote:
>> >>
>> >> Hi,
>> >>
>> >> My environment is Ubuntu 20.04 and python 3.8.3 managed by pyenv. I
>> >> try to run the script
>> >> <
>> https://notebook.rcc.uchicago.edu/files/acs.chemmater.9b05047/Data/bulk/dft/mu.py
>> >,
>> >> but it will keep running and never end. When I use 'Ctrl + c' to
>> >> terminate it, it will give the following output:
>> >>
>> >> $ python mu.py
>> >> [-10.999 -10.999 -10.999 ...  20. 20. 20.   ] [4.973e-84
>> >> 4.973e-84 4.973e-84 ... 4.973e-84 4.973e-84 4.973e-84]
>> >>
>> >> I have to terminate it and obtained the following information:
>> >>
>> >> ^CTraceback (most recent call last):
>> >>   File "mu.py", line 38, in 
>> >> integrand=DOS*fermi_array(energy,mu,kT)
>> >>   File
>> "/home/werner/.pyenv/versions/datasci/lib/python3.8/site-packages/numpy/lib/function_base.py",
>> >> line 2108, in __call__
>> >> return self._vectorize_call(func=func, args=vargs)
>> >>   File
>> "/home/werner/.pyenv/versions/datasci/lib/python3.8/site-packages/numpy/lib/function_base.py",
>> >> line 2192, in _vectorize_call
>> >> outputs = ufunc(*inputs)
>> >>   File "mu.py", line 8, in fermi
>> >> return 1./(exp((E-mu)/kT)+1)
>> >> KeyboardInterrupt
>> >>
>> >>
>> >> Any help and hints for this problem will be highly appreciated.
>> >>
>> >> Regards,
>> >> --
>> >> Hongyi Zhao 
>> >> ___
>> >> NumPy-Discussion mailing list
>> >> NumPy-Discussion@python.org
>> >> https://mail.python.org/mailman/listinfo/numpy-discussion
>> >
>> > ___
>> > NumPy-Discussion mailing list
>> > NumPy-Discussion@python.org
>> > https://mail.python.org/mailman/listinfo/numpy-discussion
>>
>>
>>
>> --
>> Hongyi Zhao 
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
>


Re: [Numpy-discussion] The mu.py script will keep running and never end.

2020-10-10 Thread Andrea Gavana
Hi,

On Sun, 11 Oct 2020 at 00.27, Hongyi Zhao  wrote:

> On Sun, Oct 11, 2020 at 1:48 AM Robert Kern  wrote:
> >
> > You don't need to use vectorize() on fermi(). fermi() will work just
> fine on arrays and should be much faster.
>
> Yes, it really does the trick. See the following for the benchmark
> based on your suggestion:
>
> $ time python mu.py
> [-10.999 -10.999 -10.999 ...  20. 20. 20.   ] [4.973e-84
> 4.973e-84 4.973e-84 ... 4.973e-84 4.973e-84 4.973e-84]
>
> real    0m41.056s
> user    0m43.970s
> sys     0m3.813s
>
>
> But are there any ways to further improve/increase efficiency?



I believe it will get a bit better if you don’t column_stack an array 6000
times - maybe pre-allocate your output first?
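In sketch form, the pre-allocation idea amounts to filling a preallocated array inside the loop and stacking, if at all, only once at the end; the sizes and the integrand below are placeholders, not the OP's actual values:

```python
import numpy as np

n = 6000
mu_all = np.linspace(10.3, 10.9, n)

# Pre-allocate once; growing a result by stacking inside the loop
# copies the whole array on every iteration
out = np.empty((n, 2))
for i, mu in enumerate(mu_all):
    out[i, 0] = mu
    out[i, 1] = mu ** 2   # placeholder for the computed particle number
```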

Andrea.



>
> Regards,
> HY
>
> >
> > On Sat, Oct 10, 2020, 8:23 AM Hongyi Zhao  wrote:
> >>
> >> Hi,
> >>
> >> My environment is Ubuntu 20.04 and python 3.8.3 managed by pyenv. I
> >> try to run the script
> >> <
> https://notebook.rcc.uchicago.edu/files/acs.chemmater.9b05047/Data/bulk/dft/mu.py
> >,
> >> but it will keep running and never end. When I use 'Ctrl + c' to
> >> terminate it, it will give the following output:
> >>
> >> $ python mu.py
> >> [-10.999 -10.999 -10.999 ...  20. 20. 20.   ] [4.973e-84
> >> 4.973e-84 4.973e-84 ... 4.973e-84 4.973e-84 4.973e-84]
> >>
> >> I have to terminate it and obtained the following information:
> >>
> >> ^CTraceback (most recent call last):
> >>   File "mu.py", line 38, in 
> >> integrand=DOS*fermi_array(energy,mu,kT)
> >>   File
> "/home/werner/.pyenv/versions/datasci/lib/python3.8/site-packages/numpy/lib/function_base.py",
> >> line 2108, in __call__
> >> return self._vectorize_call(func=func, args=vargs)
> >>   File
> "/home/werner/.pyenv/versions/datasci/lib/python3.8/site-packages/numpy/lib/function_base.py",
> >> line 2192, in _vectorize_call
> >> outputs = ufunc(*inputs)
> >>   File "mu.py", line 8, in fermi
> >> return 1./(exp((E-mu)/kT)+1)
> >> KeyboardInterrupt
> >>
> >>
> >> Any help and hints for this problem will be highly appreciated.
> >>
> >> Regards,
> >> --
> >> Hongyi Zhao 
> >> ___
> >> NumPy-Discussion mailing list
> >> NumPy-Discussion@python.org
> >> https://mail.python.org/mailman/listinfo/numpy-discussion
> >
> > ___
> > NumPy-Discussion mailing list
> > NumPy-Discussion@python.org
> > https://mail.python.org/mailman/listinfo/numpy-discussion
>
>
>
> --
> Hongyi Zhao 
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>


Re: [Numpy-discussion] Proposal to add clause to license prohibiting use by oil and gas extraction companies

2020-07-01 Thread Andrea Gavana
On Wed, 1 Jul 2020 at 21.23, gyro funch  wrote:

> Hello,
>
> I greatly respect the intention, but this is a very slippery slope.
>
> Will you exempt groups within these companies that are working on
> 'green' technologies (e.g., biofuels)?
>
> Will you add to the license restrictions companies who make use of oil
> and gas extracted by these companies (automotive, chemical/polymers, etc.)?
>
> Will you follow the chain from extraction to consumption and add the
> links to the license 'blacklist'?
>
> -gyro


Thank you for injecting some sense and a few reality checks into the
discussion.

Andrea.



>
>
> On 7/1/2020 12:34 PM, John Preston wrote:
> > Hello all,
> >
> > The following proposal was originally issue #16722 on GitHub but at
> > the request of Matti Picus I am moving the discussion to this list.
> >
> >
> > "NumPy is the fundamental package needed for scientific computing with
> Python."
> >
> > I am asking the NumPy project to leverage its position as a core
> > dependency among statistical, numerical, and ML projects, in the
> > pursuit of climate justice. It is easy to identify open-source
> > software used by the oil and gas industry which relies on NumPy [1]
> > [2] , and it is highly likely that NumPy is used in closed-source and
> > in-house software at oil and gas extraction companies such as Aramco,
> > ExxonMobil, BP, Shell, and others. I believe it is possible to use
> > software licensing to discourage the use of NumPy and dependent
> > packages by companies such as these, and that doing so would frustrate
> > the ability of these companies to identify and extract new oil and gas
> > reserves.
> >
> > I propose NumPy's current BSD 3-Clause license be extended to include
> > the following conditions, in line with the Climate Strike License [3]
> > :
> >
> > * The Software may not be used in applications and services that
> > are used for or
> >aid in the exploration, extraction, refinement, processing, or
> > transportation
> >of fossil fuels.
> >
> > * The Software may not be used by companies that rely on fossil
> > fuel extraction
> >as their primary means of revenue. This includes but is not
> > limited to the
> >companies listed at https://climatestrike.software/blocklist
> >
> > I accept that there are issues around adopting such a proposal,
> including that:
> >
> > addition of such clauses violates the Open Source Initiative's
> > canonical Open Source Definition, which explicitly excludes licenses
> > that limit re-use "in a specific field of endeavor", and therefore if
> > these clauses were adopted NumPy would no longer "be open-source" by
> > this definition;
> > there may be collateral damage among the wider user base and project
> > sponsorship, due to the vague nature of the first clause, and this may
> > affect the longevity of the project and its standing within the
> > Python, numerical, statistical, and ML communities.
> >
> > My intention with the opening of this issue is to promote constructive
> > discussion of the use of software licensing -- and other measures --
> > for working towards climate justice -- and other forms of justice --
> > in the context of NumPy and other popular open-source libraries. Some
> > people will say that NumPy is "just a tool" and that it sits
> > independent of how it is used, but due to its utility and its
> > influence as a major open-source library, I think it is essential that
> > we consider the position of the Climate Strike License authors, that
> > "as tech workers, we should take responsibility in how our software is
> > used".
> >
> > Many thanks to all of the contributors who have put so much time and
> > energy into NumPy. ✨ ❤️ 😃
> >
> > [1] https://github.com/gazprom-neft/petroflow
> > [2] https://github.com/climate-strike/analysis
> > [3] https://github.com/climate-strike/license
> > ___
> > NumPy-Discussion mailing list
> > NumPy-Discussion@python.org
> > https://mail.python.org/mailman/listinfo/numpy-discussion
> >
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>


Re: [Numpy-discussion] PEP 574 - zero-copy pickling with out of band data

2018-07-03 Thread Andrea Gavana
Hi,


On Tue, 3 Jul 2018 at 09.20, Gael Varoquaux 
wrote:

> On Tue, Jul 03, 2018 at 08:54:51AM +0200, Andrea Gavana wrote:
> > This sounds so very powerful... it’s such a pity that these types of
> gems won’t
> > be backported to Python 2 - we have so many legacy applications smoothly
> > running in Python 2 and nowhere near the required resources to even start
> > porting to Python 3,
>
> I am a strong defender of stability and long-term support in scientific
> software. But what you are demanding is that developers who do free work
> do not benefit from their own work to have a more powerful environment.
>
> More recent versions of Python are improved compared to older ones and
> make it much easier to write certain idioms. Developers make these
> changes over years to ensure that codebases are always simpler and more
> robust. Backporting in effect means doing this work twice, but the second
> time with more constraints. I just allocated something like a man-year to
> have robust parallel-computing features work both on Python 2 and Python
> 3. With this man-year we could have done many other things. Did I make
> the correct decision? I am not sure, because this is just creating more
> technical dept.
>
> I understand that we all sit on piles of code that we wrote for a given
> application and one point, and that we will not be able to modernise it
> all. But the fact that we don't have the bandwidth to make it evolve
> probably means that we need to triage what's important and call a loss
> the rest. Just like if I have 5 old cars in my backyard, I won't be able
> to keep them all on the road unless I am very rich.
>
>
> People asking for infinite backport to Python 2 are just asking
> developers to write them a second free check, even larger than the one
> they just got by having the feature under Python 3.
>

Just to clarify: I wasn’t asking for anything, just complimenting Antoine’s
work on something that appears to be a wonderful feature. There was a bit
of a rant on my part, for sure, but I never asked for anyone to redo the
work to make it run on Python 2.

Allocating resources to port hundreds of thousands of LOC is close to an
impossibility in the industry I work in, especially because our big team
(the two of us) doesn’t code for a living; we have many other duties. We
code to make our lives easier.

I’m happy if you feel better after your tirade.

Andrea.




>
> Gaël
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>


Re: [Numpy-discussion] PEP 574 - zero-copy pickling with out of band data

2018-07-02 Thread Andrea Gavana
On Tue, 3 Jul 2018 at 07.35, Gael Varoquaux 
wrote:

> On Mon, Jul 02, 2018 at 05:31:05PM -0600, Charles R Harris wrote:
> > ISTR that some parallel processing applications sent pickled arrays
> around to
> > different processes, I don't know if that is still the case, but if so,
> no copy
> > might be a big gain for them.
>
> Yes, most parallel code that's across processes or across computers use
> some form a pickle. I hope that this PEP would enable large speed ups.
> This would be a big deal for parallelism in numerical Python.



This sounds so very powerful... it’s such a pity that these types of gems
won’t be backported to Python 2 - we have so many legacy applications
smoothly running in Python 2 and nowhere near the required resources to
even start porting to Python 3, and pickle5 looks like a small revolution
in the data-persistence world.
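For readers who land here on Python 3.8+, where PEP 574 entered the standard library as pickle protocol 5, the out-of-band mechanism looks roughly like this (a minimal sketch):

```python
import pickle

import numpy as np

arr = np.arange(1_000_000, dtype=np.float64)

# With protocol 5, large buffers are handed to a callback instead of
# being copied into the pickle byte stream
buffers = []
payload = pickle.dumps(arr, protocol=5, buffer_callback=buffers.append)

# The receiver reassembles the object from the stream plus the raw buffers
arr2 = pickle.loads(payload, buffers=buffers)
assert np.array_equal(arr, arr2)
```

The pickle stream itself stays tiny, because the 8 MB of array data travels out of band.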

Andrea.



> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>


Re: [Numpy-discussion] List comprehension and loops performances with NumPy arrays

2017-10-08 Thread Andrea Gavana
On Sat, 7 Oct 2017 at 16.59, Nicholas Nadeau 
wrote:

> Hi Andrea!
>
> Checkout the following SO answers for similar contexts:
> -
> https://stackoverflow.com/questions/22108488/are-list-comprehensions-and-functional-functions-faster-than-for-loops
> -
> https://stackoverflow.com/questions/30245397/why-is-list-comprehension-so-faster
>
> To better visualize the issue, I made an iPython gist (simplifying the code
> a bit): https://gist.github.com/nnadeau/3deb6f18d028009a4495590cfbbfaa40
>
> From a quick view of the disassembled code (I'm not an expert, so correct
> me if I'm wrong), list comprehension has much less overhead compared to
> iterating/looping through the pre-allocated data and building/storing each
> slice.
>


Thank you Nicholas, I suspected that the approach of using list
comprehensions was close to unbeatable, thanks for the analysis!
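For anyone finding this thread today, here is a Python 3 version of the comparison, with the class trimmed to its essentials (timings will of course vary by machine and NumPy version):

```python
import timeit

import numpy as np

class Item:
    def __init__(self):
        self.values = np.random.rand(8, 1)

    def do_something(self):
        # Same shape of work as the snippet above: returns a length-8 vector
        f = np.dot(0.5 * np.ones(8), self.values)[0]
        out = np.empty(8)
        out.fill(f)
        return out

items = [Item() for _ in range(500)]

def list_comp():
    return np.asarray([item.do_something() for item in items]).T

def prealloc():
    out = np.empty((8, len(items)))
    for i, item in enumerate(items):
        out[:, i] = item.do_something()
    return out

assert np.allclose(list_comp(), prealloc())
print("list comprehension:", timeit.timeit(list_comp, number=200))
print("empty plus loop   :", timeit.timeit(prealloc, number=200))
```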

Andrea.



> Cheers,
>
>
>
> --
> Nicholas Nadeau, P.Eng., AVS
>
> On 7 October 2017 at 05:56, Andrea Gavana  wrote:
>
>> Apologies, correct timeit code this time (I had gotten the wrong shape
>> for the output matrix in the loop case):
>>
>> if __name__ == '__main__':
>>
>> repeat = 1000
>> items = [Item('item_%d'%(i+1)) for i in xrange(500)]
>>
>> output = numpy.asarray([item.do_something() for item in items]).T
>> statements = ['''
>>   output = numpy.asarray([item.do_something() for item in
>> items]).T
>>   ''',
>>   '''
>>   output = numpy.empty((8, 500))
>>   for i, item in enumerate(items):
>>   output[:, i] = item.do_something()
>>   ''']
>>
>> methods = ['List Comprehension', 'Empty plus Loop   ']
>>
>> setup  = 'from __main__ import numpy, items'
>>
>> for stmnt, method in zip(statements, methods):
>>
>> elapsed = timeit.repeat(stmnt, setup=setup, number=1,
>> repeat=repeat)
>> minv, maxv, meanv = min(elapsed), max(elapsed),
>> numpy.mean(elapsed)
>> elapsed.sort()
>> best_of_3 = numpy.mean(elapsed[0:3])
>> result = numpy.asarray((minv, maxv, meanv, best_of_3))*repeat
>>
>> print method, ': MIN: %0.2f ms , MAX: %0.2f ms , MEAN: %0.2f ms ,
>> BEST OF 3: %0.2f ms'%tuple(result.tolist())
>>
>>
>> Results are the same as before...
>>
>>
>>
>> On 7 October 2017 at 11:52, Andrea Gavana 
>> wrote:
>>
>>> Hi All,
>>>
>>> I have this little snippet of code:
>>>
>>> import timeit
>>> import numpy
>>>
>>> class Item(object):
>>>
>>> def __init__(self, name):
>>>
>>> self.name = name
>>> self.values = numpy.random.rand(8, 1)
>>>
>>> def do_something(self):
>>>
>>> sv = self.values.sum(axis=0)
>>> array = numpy.empty((8, ))
>>> f = numpy.dot(0.5*numpy.ones((8, )), self.values)[0]
>>> array.fill(f)
>>> return array
>>>
>>>
>>> In my real application, the method do_something does a bit more than
>>> that, but I believe the snippet is enough to start playing with it. What I
>>> have is a list of (on average) 500-1,000 classes Item, and I am trying to
>>> retrieve the output of do_something for each of them in a single, big 2D
>>> numpy array.
>>>
>>> My current approach is to use list comprehension like this:
>>>
>>> output = numpy.asarray([item.do_something() for item in items]).T
>>>
>>> (Note: I need the transposed of that 2D array, always).
>>>
>>> But then I thought: why not preallocate the output array and use a
>>> simple loop:
>>>
>>> output = numpy.empty((500, 8))
>>> for i, item in enumerate(items):
>>> output[i, :] = item.do_something()
>>>
>>>
>>> I was expecting this version to be marginally faster - as the previous
>>> one has to call asarray and then transpose the matrix, but I was in for a
>>> surprise:
>>>
>>> if __name__ == '__main__':
>>>
>>> repeat = 1000
>>> items = [Item('item_%d'%(i+1)) for i in xrange(500)]
>>>
>>> statements = ['''
>>>   output = numpy.asarray([item.do_something(

Re: [Numpy-discussion] List comprehension and loops performances with NumPy arrays

2017-10-07 Thread Andrea Gavana
Apologies, correct timeit code this time (I had gotten the wrong shape for
the output matrix in the loop case):

if __name__ == '__main__':

repeat = 1000
items = [Item('item_%d'%(i+1)) for i in xrange(500)]

output = numpy.asarray([item.do_something() for item in items]).T
statements = ['''
  output = numpy.asarray([item.do_something() for item in
items]).T
  ''',
  '''
  output = numpy.empty((8, 500))
  for i, item in enumerate(items):
  output[:, i] = item.do_something()
  ''']

methods = ['List Comprehension', 'Empty plus Loop   ']
setup  = 'from __main__ import numpy, items'

for stmnt, method in zip(statements, methods):

elapsed = timeit.repeat(stmnt, setup=setup, number=1, repeat=repeat)
minv, maxv, meanv = min(elapsed), max(elapsed), numpy.mean(elapsed)
elapsed.sort()
best_of_3 = numpy.mean(elapsed[0:3])
result = numpy.asarray((minv, maxv, meanv, best_of_3))*repeat

print method, ': MIN: %0.2f ms , MAX: %0.2f ms , MEAN: %0.2f ms ,
BEST OF 3: %0.2f ms'%tuple(result.tolist())


Results are the same as before...
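
A side note on the measurement itself (my addition, not part of the
original exchange): the timeit documentation recommends looking at the
minimum of the repeats, since values above the minimum are usually caused
by other processes interfering rather than by the code under test. A
minimal sketch:

```python
import timeit

# repeat() returns one total time per run; the minimum is the most
# representative figure - higher values mostly reflect system noise
elapsed = timeit.repeat("sum(range(100))", number=1000, repeat=5)
best = min(elapsed)

assert best > 0
assert best <= sum(elapsed) / len(elapsed)   # min never exceeds the mean
```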



On 7 October 2017 at 11:52, Andrea Gavana  wrote:

> Hi All,
>
> I have this little snippet of code:
>
> import timeit
> import numpy
>
> class Item(object):
>
> def __init__(self, name):
>
> self.name = name
> self.values = numpy.random.rand(8, 1)
>
> def do_something(self):
>
> sv = self.values.sum(axis=0)
> array = numpy.empty((8, ))
> f = numpy.dot(0.5*numpy.ones((8, )), self.values)[0]
> array.fill(f)
> return array
>
>
> In my real application, the method do_something does a bit more than that,
> but I believe the snippet is enough to start playing with it. What I have
> is a list of (on average) 500-1,000 classes Item, and I am trying to
> retrieve the output of do_something for each of them in a single, big 2D
> numpy array.
>
> My current approach is to use list comprehension like this:
>
> output = numpy.asarray([item.do_something() for item in items]).T
>
> (Note: I need the transposed of that 2D array, always).
>
> But then I thought: why not preallocate the output array and use a
> simple loop:
>
> output = numpy.empty((500, 8))
> for i, item in enumerate(items):
> output[i, :] = item.do_something()
>
>
> I was expecting this version to be marginally faster - as the previous one
> has to call asarray and then transpose the matrix, but I was in for a
> surprise:
>
> if __name__ == '__main__':
>
> repeat = 1000
> items = [Item('item_%d'%(i+1)) for i in xrange(500)]
>
> statements = ['''
>   output = numpy.asarray([item.do_something() for item in
> items]).T
>   ''',
>   '''
>   output = numpy.empty((500, 8))
>   for i, item in enumerate(items):
>   output[i, :] = item.do_something()
>   ''']
>
> methods = ['List Comprehension', 'Empty plus Loop   ']
>
> setup  = 'from __main__ import numpy, items'
>
> for stmnt, method in zip(statements, methods):
>
> elapsed = timeit.repeat(stmnt, setup=setup, number=1,
> repeat=repeat)
> minv, maxv, meanv = min(elapsed), max(elapsed), numpy.mean(elapsed)
> elapsed.sort()
> best_of_3 = numpy.mean(elapsed[0:3])
> result = numpy.asarray((minv, maxv, meanv, best_of_3))*repeat
>
> print method, ': MIN: %0.2f ms , MAX: %0.2f ms , MEAN: %0.2f ms ,
> BEST OF 3: %0.2f ms'%tuple(result.tolist())
>
>
> I get this:
>
> List Comprehension : MIN: 7.32 ms , MAX: 9.13 ms , MEAN: 7.85 ms , BEST OF
> 3: 7.33 ms
> Empty plus Loop: MIN: 7.99 ms , MAX: 9.57 ms , MEAN: 8.31 ms , BEST OF
> 3: 8.01 ms
>
>
> Now, I know that list comprehensions are renowned for being insanely fast,
> but I thought that doing asarray plus transpose would easily outweigh their
> advantage, especially since the list comprehension is used to call a
> method, not to do some simple arithmetic inside it...
>
> I guess I am missing something obvious here... oh, and if anyone has
> suggestions about how to improve my crappy code (performance wise), please
> feel free to add your thoughts.
>
> Thank you.
>
> Andrea.
>


[Numpy-discussion] List comprehension and loops performances with NumPy arrays

2017-10-07 Thread Andrea Gavana
Hi All,

I have this little snippet of code:

import timeit
import numpy

class Item(object):

def __init__(self, name):

self.name = name
self.values = numpy.random.rand(8, 1)

def do_something(self):

sv = self.values.sum(axis=0)
array = numpy.empty((8, ))
f = numpy.dot(0.5*numpy.ones((8, )), self.values)[0]
array.fill(f)
return array


In my real application, the method do_something does a bit more than that,
but I believe the snippet is enough to start playing with it. What I have
is a list of (on average) 500-1,000 Item instances, and I am trying to
retrieve the output of do_something for each of them in a single, big 2D
numpy array.

My current approach is to use list comprehension like this:

output = numpy.asarray([item.do_something() for item in items]).T

(Note: I need the transposed of that 2D array, always).
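
(Worth noting, as an aside of my own: the transpose itself is essentially
free in numpy, since .T returns a constant-time view over the same memory
rather than a copy. A quick check:)

```python
import numpy

a = numpy.asarray([[1.0, 2.0], [3.0, 4.0]])
t = a.T

# the transpose is a view: it shares memory with the original array
assert t.base is a
a[0, 1] = 99.0
assert t[1, 0] == 99.0
```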

But then I thought: why not preallocate the output array and use a simple
loop:

output = numpy.empty((500, 8))
for i, item in enumerate(items):
output[i, :] = item.do_something()


I was expecting this version to be marginally faster - as the previous one
has to call asarray and then transpose the matrix, but I was in for a
surprise:

if __name__ == '__main__':

repeat = 1000
items = [Item('item_%d'%(i+1)) for i in xrange(500)]

statements = ['''
  output = numpy.asarray([item.do_something() for item in
items]).T
  ''',
  '''
  output = numpy.empty((500, 8))
  for i, item in enumerate(items):
  output[i, :] = item.do_something()
  ''']

methods = ['List Comprehension', 'Empty plus Loop   ']
setup  = 'from __main__ import numpy, items'

for stmnt, method in zip(statements, methods):

elapsed = timeit.repeat(stmnt, setup=setup, number=1, repeat=repeat)
minv, maxv, meanv = min(elapsed), max(elapsed), numpy.mean(elapsed)
elapsed.sort()
best_of_3 = numpy.mean(elapsed[0:3])
result = numpy.asarray((minv, maxv, meanv, best_of_3))*repeat

print method, ': MIN: %0.2f ms , MAX: %0.2f ms , MEAN: %0.2f ms ,
BEST OF 3: %0.2f ms'%tuple(result.tolist())


I get this:

List Comprehension : MIN: 7.32 ms , MAX: 9.13 ms , MEAN: 7.85 ms , BEST OF
3: 7.33 ms
Empty plus Loop: MIN: 7.99 ms , MAX: 9.57 ms , MEAN: 8.31 ms , BEST OF
3: 8.01 ms


Now, I know that list comprehensions are renowned for being insanely fast,
but I thought that doing asarray plus transpose would easily outweigh their
advantage, especially since the list comprehension is used to call a
method, not to do some simple arithmetic inside it...

I guess I am missing something obvious here... oh, and if anyone has
suggestions about how to improve my crappy code (performance wise), please
feel free to add your thoughts.
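
One performance suggestion along those lines (a sketch of my own, not from
the thread, and assuming do_something stays exactly as in the snippet):
since each output column is just 0.5 times the sum of that item's values,
stacking all the values into one 2D array collapses the whole computation
into a couple of vectorized operations with no Python-level loop at all:

```python
import numpy

values = numpy.random.rand(500, 8)   # one row per Item, replacing 500 (8, 1) arrays

# do_something fills an (8,) array with 0.5 * values.sum(); for all items
# at once this is one sum along the rows and one tile
f = 0.5 * values.sum(axis=1)         # shape (500,)
output = numpy.tile(f, (8, 1))       # shape (8, 500), already "transposed"

assert output.shape == (8, 500)
assert numpy.allclose(output[:, 0], 0.5 * values[0].sum())
```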

Thank you.

Andrea.