Re: [Numpy-discussion] Fwd: [numfocus] Grants up to $3k available to NumFOCUS projects (sponsored & affiliated)

2017-04-03 Thread Julian Taylor
On 31.03.2017 16:07, Julian Taylor wrote:
> On 31.03.2017 15:51, Nathaniel Smith wrote:
>> On Mar 31, 2017 1:15 AM, "Ralf Gommers" wrote:
>>
>>
>>
>> On Mon, Mar 27, 2017 at 11:42 PM, Ralf Gommers wrote:
>>
>>
>>
>> On Mon, Mar 27, 2017 at 11:33 PM, Julian Taylor wrote:
>>
>> I have two ideas under one big important topic: make numpy
>> python3
>> compatible.
>>
>> The first fits pretty well with the grant size and nobody
>> wants to do it
>> for free:
>> - fix our text IO functions under python3 and support multiple
>> encodings, not only latin1.
>> Reasonably simple to do, slap encoding arguments on the
>> functions,
>> generate test cases and somehow keep backward compatibility.
>> Some
>> preliminary unfinished work is in
>> https://github.com/numpy/numpy/pull/4208
>> 
>>
>>
>> I like that idea, it's a recurring pain point. Are you
>> interested to work on it, or are you thinking to advertise the
>> idea here to see if anyone steps up?
>>
>>
>> More thoughts on this anyone? Or preferences for this idea or the
>> numpy.org one? Submission deadline is April 3rd
>> and we can only put in one proposal this time, so we need to (a)
>> make a choice between these ideas, and (b) write up a proposal.
>>
>> If there's not enough replies to this so the choice is clear cut, I
>> will send out a poll to the core devs.
>>
>>
>> Do we have anyone interested in doing the work in either case? That
>> seems like the most important consideration to me...
>>
>> -n
>>
> 
> I could do the textio thing if no one shows up for numpy.org. I can
> probably check again what is required in the next few days and write a
> proposal.
> The change will need reviewing in the end too, should that be
> compensated too? It feels weird if not.
> 

I have decided not to do it, as it is more or less just a bugfix and I
currently do not feel capable of doing it with added completion pressure.
But I have collected some of the related issues and discussions:

https://github.com/numpy/numpy/issues/4600
https://github.com/numpy/numpy/issues/3184
http://numpy-discussion.10968.n7.nabble.com/using-loadtxt-to-load-a-text-file-in-to-a-numpy-array-tt35992.html#a36003
# loadtxt
https://github.com/numpy/numpy/pull/4208
# genfromtxt
http://numpy-discussion.10968.n7.nabble.com/genfromtxt-universal-newline-support-td37816.html
https://github.com/dhomeier/numpy/commit/995ec93
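
To make the encoding problem concrete, here is a minimal stdlib-only sketch, assuming Python 3. It is not NumPy's text IO code: `read_rows` is a hypothetical helper that only shows what "slap an encoding argument on the function" could mean, and why a hard-coded latin1 default silently garbles UTF-8 input instead of failing.

```python
import io
import os
import tempfile


def read_rows(path, encoding="latin1", delimiter=","):
    """Hypothetical helper: decode a text file with an explicit encoding
    and split every non-empty line into fields."""
    with io.open(path, "r", encoding=encoding) as fh:
        return [line.rstrip("\n").split(delimiter) for line in fh if line.strip()]


def demo():
    """Write a small UTF-8 file, then read it back with two encodings."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        with io.open(path, "w", encoding="utf-8") as fh:
            # "Z\u00fcrich" is Zurich spelled with u-umlaut.
            fh.write(u"Z\u00fcrich,1.5\nK\u00f8benhavn,2.5\n")
        return read_rows(path, encoding="utf-8"), read_rows(path, encoding="latin1")
    finally:
        os.remove(path)


utf8_rows, latin1_rows = demo()
print(utf8_rows)    # text fields decoded as intended
print(latin1_rows)  # same bytes, wrong encoding: no error, but garbled text
```

The numeric columns survive either way because they are pure ASCII; only the text fields are damaged, which is part of why this kind of bug goes unnoticed.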
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Fwd: [numfocus] Grants up to $3k available to NumFOCUS projects (sponsored & affiliated)

2017-04-03 Thread Renato Fabbri
Maybe OT, but for some years now I have kept coming back to the idea of
making a very simple module for obtaining arrays related to musical
elements.
Everything is here:
https://github.com/ttm/dissertacao
scripts/ has Python/NumPy implementations of the musical elements.
dissertacaoCorrigida.pdf holds a thorough description of the framework.

I envision it as a module inside NumPy,
but I understand it might be more reasonable to do it as a SciPy kit.

I handed in my doctorate a few days ago and might be willing to
put some time into this.

PS. Long time no post. Hello!






-- 
Renato Fabbri
GNU/Linux User #479299
labmacambira.sourceforge.net


Re: [Numpy-discussion] speed of random number generator compared to Julia

2017-04-03 Thread Pierre Haessig
Hello,

On 30/03/2017 at 13:31, Pierre Haessig wrote:
> []
>
> But how come Julia is 4-5x faster since Numpy uses C implementation
> for the entire process ? (Mersenne Twister -> uniform double ->
> Box-Muller transform to get a Gaussian
> https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/randomkit.c).
> Also I noticed that Julia uses a different algorithm (Ziggurat Method
> from Marsaglia and Tsang ,
> https://github.com/JuliaLang/julia/blob/master/base/random.jl#L700)
> but this doesn't explain the difference for uniform rng.
>
Any ideas?

Do you think Stackoverflow would be a better place for my question?
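
For reference, the Gaussian step in the pipeline quoted above can be sketched as the polar (Marsaglia) variant of the Box-Muller transform. This is a stdlib-only illustration, not a transcription of randomkit.c; `gauss_pair` is a hypothetical name, and Python's own `random.Random` stands in for the uniform generator.

```python
import math
import random


def gauss_pair(uniform):
    """Return two independent standard normal samples.

    `uniform` is any callable returning floats in [0, 1).
    """
    while True:
        # Draw a point in the square [-1, 1) x [-1, 1) ...
        x1 = 2.0 * uniform() - 1.0
        x2 = 2.0 * uniform() - 1.0
        r2 = x1 * x1 + x2 * x2
        # ... and keep it only if it lies strictly inside the unit circle.
        if 0.0 < r2 < 1.0:
            break
    factor = math.sqrt(-2.0 * math.log(r2) / r2)
    return factor * x1, factor * x2


rng = random.Random(12345)
samples = [value for _ in range(20000) for value in gauss_pair(rng.random)]
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, variance)
```

Each accepted pair costs a logarithm, a square root and a division, plus the rejected draws, which is the per-sample overhead that table-based methods such as the Ziggurat are designed to avoid.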

best,

Pierre



Re: [Numpy-discussion] speed of random number generator compared to Julia

2017-04-03 Thread Jaime Fernández del Río
On Mon, Apr 3, 2017 at 3:20 PM, Pierre Haessig wrote:

> Hello,
> On 30/03/2017 at 13:31, Pierre Haessig wrote:
>
> []
>
> But how come Julia is 4-5x faster since Numpy uses C implementation
> for the entire process ? (Mersenne Twister -> uniform double ->
> Box-Muller transform to get a Gaussian
> https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/randomkit.c).
> Also I noticed that Julia uses a different algorithm (Ziggurat Method
> from Marsaglia and Tsang,
> https://github.com/JuliaLang/julia/blob/master/base/random.jl#L700)
> but this doesn't explain the difference for uniform rng.
>
> Any ideas?
>

This says that Julia uses this library, which is different from the
home-brewed version of the Mersenne Twister in NumPy. The second link I
posted claims their speed comes from generating double precision numbers
directly, rather than generating random bytes that have to be converted
to doubles, as is the case of NumPy through this magical incantation.
They also throw the SIMD acronym around, which likely means their random
number generation is parallelized.

My guess is that most of the speed-up comes from the SIMD parallelization:
the Mersenne algorithm does a lot of work to produce 32 random bits, so
that likely dominates over a couple of arithmetic operations, even if
divisions are involved.
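
The integer-to-double conversion being discussed is, as far as I know, the standard 53-bit recipe from the Mersenne Twister reference code: two 32-bit outputs are trimmed to 27 and 26 bits and combined. A minimal Python sketch, with `uniform53` as a hypothetical name and `random.getrandbits(32)` standing in for the 32-bit generator:

```python
import random


def uniform53(next_uint32):
    """Combine two 32-bit draws into a double in [0, 1) with 53 random bits."""
    a = next_uint32() >> 5  # keep the top 27 bits
    b = next_uint32() >> 6  # keep the top 26 bits
    # 67108864 == 2**26 and 9007199254740992 == 2**53
    return (a * 67108864.0 + b) / 9007199254740992.0


rng = random.Random(2017)


def draw32():
    """Stand-in 32-bit source; any generator of 32-bit integers would do."""
    return rng.getrandbits(32)


values = [uniform53(draw32) for _ in range(1000)]
print(min(values), max(values))
```

Note that every double costs two full 32-bit outputs of the generator, on top of the multiply, add and divide.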

Jaime

> Do you think Stackoverflow would be a better place for my question?
>
> best,
>
> Pierre
>
>
>


-- 
(\__/)
( O.o)
( > <) This is Bunny. Copy Bunny into your signature and help him in his plans
for world domination.


Re: [Numpy-discussion] speed of random number generator compared to Julia

2017-04-03 Thread Neal Becker
Take a look here:
https://bashtage.github.io/ng-numpy-randomstate/doc/index.html



Re: [Numpy-discussion] speed of random number generator compared to Julia

2017-04-03 Thread Pierre Haessig

On 03/04/2017 at 15:52, Neal Becker wrote:
> Take a look here:
> https://bashtage.github.io/ng-numpy-randomstate/doc/index.html
Thanks for the pointer. A very feature-rich random generator package.

So it is indeed possible to have in Python/Numpy both the "advanced"
Mersenne Twister (dSFMT) at the lower level and the Ziggurat algorithm
for Gaussian transform on top. Perfect!

In an ideal world, this would be implemented by default in Numpy, but I
understand that this would break the reproducibility of existing code.
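
The reproducibility concern can be illustrated with a small stdlib-only sketch. It does not use NumPy: Python's `random.gauss` and `random.normalvariate` are used purely as stand-ins for "the existing Gaussian algorithm" and "a replacement algorithm" running on top of the same seeded generator, and `first_normals` is a hypothetical helper.

```python
import random


def first_normals(method_name, seed, n=5):
    """Return the first n standard normal draws from a freshly seeded generator."""
    rng = random.Random(seed)
    draw = getattr(rng, method_name)
    return [draw(0.0, 1.0) for _ in range(n)]


run_a = first_normals("gauss", seed=1234)
run_b = first_normals("gauss", seed=1234)
swapped = first_normals("normalvariate", seed=1234)

print(run_a == run_b)    # same seed, same algorithm
print(run_a == swapped)  # same seed, different algorithm
```

Both methods sample the same distribution, but a script that relies on a fixed seed to reproduce exact numbers would get different results after such a swap.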

best,
Pierre


Re: [Numpy-discussion] speed of random number generator compared to Julia

2017-04-03 Thread Neal Becker
I think the intention is that this is the next gen of numpy randomstate,
and will eventually be merged in.



Re: [Numpy-discussion] speed of random number generator compared to Julia

2017-04-03 Thread Pierre Haessig
On 03/04/2017 at 15:44, Jaime Fernández del Río wrote:
> This says that Julia uses this library, which is different from the
> home-brewed version of the Mersenne Twister in NumPy. The second link I
> posted claims their speed comes from generating double precision
> numbers directly, rather than generating random bytes that have to be
> converted to doubles, as is the case of NumPy through this magical
> incantation.
> They also throw the SIMD acronym around, which likely means their
> random number generation is parallelized.
>
> My guess is that most of the speed-up comes from the SIMD
> parallelization: the Mersenne algorithm does a lot of work to produce
> 32 random bits, so that likely dominates over a couple of arithmetic
> operations, even if divisions are involved.
Thanks for the feedback.

I'm not good enough at reading Julia to be 100% sure, but I think that
random.jl
(https://github.com/JuliaLang/julia/blob/master/base/random.jl) contains
a Julia implementation of the Mersenne Twister... but I have no idea whether
it is the "fancy" SIMD version or the "old" 32-bit version.

best,
Pierre


Re: [Numpy-discussion] speed of random number generator compared to Julia

2017-04-03 Thread Nathaniel Smith
On Apr 3, 2017 8:59 AM, "Pierre Haessig" wrote:

> On 03/04/2017 at 15:44, Jaime Fernández del Río wrote:
>
> > This says that Julia uses this library, which is different from the
> > home-brewed version of the Mersenne Twister in NumPy. The second link I
> > posted claims their speed comes from generating double precision
> > numbers directly, rather than generating random bytes that have to be
> > converted to doubles, as is the case of NumPy through this magical
> > incantation.
> > They also throw the SIMD acronym around, which likely means their
> > random number generation is parallelized.
> >
> > My guess is that most of the speed-up comes from the SIMD
> > parallelization: the Mersenne algorithm does a lot of work to produce
> > 32 random bits, so that likely dominates over a couple of arithmetic
> > operations, even if divisions are involved.
>
> Thanks for the feedback.
>
> I'm not good enough at reading Julia to be 100% sure, but I think that
> random.jl
> (https://github.com/JuliaLang/julia/blob/master/base/random.jl) contains
> a Julia implementation of the Mersenne Twister... but I have no idea
> whether it is the "fancy" SIMD version or the "old" 32-bit version.


That code contains many references to "dSFMT", which is the name of the
"fancy" algorithm. IIUC dSFMT is related to the mersenne twister but is
actually a different generator altogether -- advertising that Julia uses
the mersenne twister is somewhat misleading IMHO. Of course this is really
the fault of the algorithm's designers for creating multiple algorithms
that have "mersenne twister" as part of their names...
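
As I understand it, the "generate doubles directly" idea attributed to dSFMT earlier in the thread amounts to filling the 52 mantissa bits of an IEEE 754 double whose exponent is fixed so that the value lies in [1, 2), then subtracting 1. A minimal Python sketch of just that bit trick follows; `double_from_mantissa` is a hypothetical name, and `random.getrandbits` is a stand-in source of bits, not the dSFMT recurrence itself.

```python
import random
import struct


def double_from_mantissa(bits52):
    """Map 52 random bits to a double in [0, 1) via the [1, 2) bit pattern."""
    # 0x3FF is the biased exponent of 1.0 in IEEE 754 binary64, so the
    # assembled pattern always encodes a value in [1, 2).
    raw = (0x3FF << 52) | (bits52 & ((1 << 52) - 1))
    (value,) = struct.unpack("<d", struct.pack("<Q", raw))
    return value - 1.0


rng = random.Random(2017)
samples = [double_from_mantissa(rng.getrandbits(52)) for _ in range(1000)]
print(min(samples), max(samples))
```

Compared with the shift, multiply and divide route, this needs only a bitwise OR and one subtraction per double, at the price of 52 rather than 53 random bits.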

-n


Re: [Numpy-discussion] speed of random number generator compared to Julia

2017-04-03 Thread Pierre Haessig
On 03/04/2017 at 17:49, Neal Becker wrote:
> I think the intention is that this is the next gen of numpy
> randomstate, and will eventually be merged in.
Ah yes, I found the related issue in the meantime:
https://github.com/numpy/numpy/issues/6967

Thanks again for the pointers.

Pierre



Re: [Numpy-discussion] Fwd: [numfocus] Grants up to $3k available to NumFOCUS projects (sponsored & affiliated)

2017-04-03 Thread Ralf Gommers
On Mon, Apr 3, 2017 at 11:28 PM, Julian Taylor <
jtaylor.deb...@googlemail.com> wrote:

> On 31.03.2017 16:07, Julian Taylor wrote:
> > On 31.03.2017 15:51, Nathaniel Smith wrote:
> >> On Mar 31, 2017 1:15 AM, "Ralf Gommers"  >> > wrote:
> >>
> >>
> >>
> >> On Mon, Mar 27, 2017 at 11:42 PM, Ralf Gommers
> >> mailto:ralf.gomm...@gmail.com>> wrote:
> >>
> >>
> >>
> >> On Mon, Mar 27, 2017 at 11:33 PM, Julian Taylor
> >>  >> > wrote:
> >>
> >> I have two ideas under one big important topic: make numpy
> >> python3
> >> compatible.
> >>
> >> The first fits pretty well with the grant size and nobody
> >> wants to do it
> >> for free:
> >> - fix our text IO functions under python3 and support multiple
> >> encodings, not only latin1.
> >> Reasonably simple to do, slap encoding arguments on the
> >> functions,
> >> generate test cases and somehow keep backward compatibility.
> >> Some
> >> preliminary unfinished work is in
> >> https://github.com/numpy/numpy/pull/4208
> >> 
> >>
> >>
> >> I like that idea, it's a recurring pain point. Are you
> >> interested to work on it, or are you thinking to advertise the
> >> idea here to see if anyone steps up?
> >>
> >>
> >> More thoughts on this anyone? Or preferences for this idea or the
> >> numpy.org  one? Submission deadline is April 3rd
> >> and we can only put in one proposal this time, so we need to (a)
> >> make a choice between these ideas, and (b) write up a proposal.
> >>
> >> If there's not enough replies to this so the choice is clear cut, I
> >> will send out a poll to the core devs.
> >>
> >>
> >> Do we have anyone interested in doing the work in either case? That
> >> seems like the most important consideration to me...
>

Fair enough. I had a plan, but my weekend went a bit differently than
planned, so I couldn't follow up on it.


> >>
> >> -n
> >>
> >
> > I could do the textio thing if no one shows up for numpy.org. I can
> > probably check again what is required in the next few days and write a
> > proposal.
> > The change will need reviewing in the end too, should that be
> > compensated too? It feels weird if not.
> >
>
> I have decided not to do it, as it is more or less just a bugfix and I
> currently do not feel capable of doing it with added completion pressure.
>

Good call Julian. I struggled with the same thing - had a designer to do
the numpy.org work, but that still needed someone to do the content,
review, etc. Decided not to try to take that on, because I'm already
struggling to keep up.



> But I have collected some of the related issues and discussions:
>

Thanks, I'm sure that'll be of use at some point.

Ralf


>
> https://github.com/numpy/numpy/issues/4600
> https://github.com/numpy/numpy/issues/3184
> http://numpy-discussion.10968.n7.nabble.com/using-loadtxt-to-load-a-text-file-in-to-a-numpy-array-tt35992.html#a36003
> # loadtxt
> https://github.com/numpy/numpy/pull/4208
> # genfromtxt
> http://numpy-discussion.10968.n7.nabble.com/genfromtxt-universal-newline-support-td37816.html
> https://github.com/dhomeier/numpy/commit/995ec93
>