Sorry for the previous mispost.
This thread reminds me of something I've thought about for a while: Would
NumPy benefit from an np.ndarraylist subclass of np.ndarray, that has an
O(1) amortized append like Python lists? (Other methods of Python lists
(pop, extend) would be worth considering as we
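The idea above can be sketched in pure Python. This is only an illustration of the amortized-append technique, not a proposal for the actual class; the name `ArrayList` and its API are made up here.

```python
import numpy as np

class ArrayList:
    """Toy ndarray-backed list with O(1) amortized append (hypothetical)."""
    def __init__(self, dtype=float):
        self._buf = np.empty(4, dtype=dtype)
        self._n = 0

    def append(self, value):
        if self._n == self._buf.shape[0]:
            # Geometric growth is what makes append O(1) amortized,
            # just like CPython's list over-allocation strategy.
            new = np.empty(2 * self._buf.shape[0], dtype=self._buf.dtype)
            new[:self._n] = self._buf
            self._buf = new
        self._buf[self._n] = value
        self._n += 1

    def asarray(self):
        # A view of the filled part of the buffer; no copy.
        return self._buf[:self._n]

lst = ArrayList(dtype=np.int64)
for i in range(100):
    lst.append(i)
```

Each doubling copies all elements, but a doubling happens only every O(n) appends, so the total copy cost stays linear.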
Den 15.07.2010 15:41, skrev Skipper Seabold:
> On Thu, Jul 15, 2010 at 5:54 AM, John Porter wrote:
>
>> Has anyone got any advice about array creation. I've been using numpy
>> for a long time and have just noticed something unexpected about array
>> concatenation.
>>
>> It seems that using nu
> On May 26th, I sent an email titled "curious about how people would
> feel about moving to github."
Should we really be supporting Ruby like that ;)
Personally I am an idiot when it comes to SVN, so a move to GitHub might
make it easier for me to contribute.
Sturla
There has been some discussion on FFTPACK lately. Problems with FFTPACK
seem to be:
- Written in old Fortran 77.
- Imprecise in single precision.
- Can sometimes be very slow, depending on input size.
- Can only handle a few small prime factors {2,3,4,5} efficiently.
- How to control integer si
ys before the VC++
2008 compiler is totally unavailable.
Sturla Molden
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Sturla Molden skrev:
> http://shootout.alioth.debian.org/u32/which-programming-languages-are-fastest.php
>
> They are benchmarking with tasks that burn the CPU, like computing and
> bitmapping Mandelbrot sets and processing DNA data.
>
>
It is also the kind of tasks where Nu
Sebastian Haase skrev:
> Hi Sturla,
> what is this even about ... ? Do you have some references ? It does
> indeed sound interesting ... but what kind of code / problem are they
> actually testing here ?
>
>
http://shootout.alioth.debian.org/u32/which-programming-languages-are-fastest.php
They
I was just looking at Debian's benchmark. LuaJIT is now (on median)
beating Intel Fortran! Consider that Lua is a dynamic language very
similar to Python. I know it's "just a benchmark" but this has to count
as insanely impressive. Beating Intel Fortran with a dynamic scripting
language... How
Lisandro Dalcin skrev:
> No, no sarcasm at all! I just realized that PyCObject were
> (pending)deprecated in 2.7 ... Anyway, let me say I'm as annoyed and
> upset as you.
>
>
PyCapsule should be used instead. It has two main advantages over
PyCObject: First, it associates a 'name' with the void pointer
Den 17.06.2010 16:29, skrev greg whittier:
> I have files (from an external source) that contain ~10 GB of
> big-endian uint16's that I need to read into a series of arrays. What
> I'm doing now is
>
> import numpy as np
> import struct
>
> fd = open('file.raw', 'rb')
>
> for n in range(1)
>
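For big-endian uint16 data, np.fromfile with an explicit byte-order dtype avoids the struct-based loop entirely. A small self-contained sketch (the real 'file.raw' is external, so the data below is made up; for ~10 GB inputs one would read in chunks via the count= argument):

```python
import os
import tempfile
import numpy as np

# Create a small sample file of big-endian uint16 values.
path = os.path.join(tempfile.gettempdir(), 'file.raw')
np.arange(10, dtype='>u2').tofile(path)

# '>u2' means big-endian uint16; np.fromfile parses the raw bytes directly.
raw = np.fromfile(path, dtype='>u2')

# Convert to native byte order once, so later arithmetic is fast.
native = raw.astype(np.uint16)
```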
Den 15.06.2010 18:30, skrev Sturla Molden:
> A very radical solution would be to get rid of all C, and go for a
> "pure Python" solution. NumPy could build up a text string with OpenCL
> code on the fly, and use the OpenCL driver as a "JIT compiler" for
> fast
A very radical solution would be to get rid of all C, and go for a "pure
Python" solution. NumPy could build up a text string with OpenCL code on
the fly, and use the OpenCL driver as a "JIT compiler" for fast array
expressions. Most GPUs and CPUs will support OpenCL, and thus there will
be no
Den 13.06.2010 18:19, skrev Charles R Harris:
>
> It's the combination of unsigned with signed that causes the
> promotion. The int64 type can't hold the largest values in uint64.
> Strictly speaking, doubles can't hold either of the 64 bit integer
> types without loss of precision but at least
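The promotion described above is easy to demonstrate: since neither 64-bit integer type can hold the other's full range, NumPy falls back to float64 for the mixed operation.

```python
import numpy as np

u = np.array([2**63], dtype=np.uint64)   # too large for int64
i = np.array([-1], dtype=np.int64)

# uint64 combined with int64 promotes to float64, with possible
# loss of precision for values above 2**53.
result = u + i
```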
Den 13.06.2010 05:47, skrev David Cournapeau:
>
> This only works in simple cases. What do you do when you don't know
> the output size ?
First: If you don't know, you don't know. Then you're screwed and C is
not going to help.
Second: If we cannot figure out how much to allocate before starting
Den 13.06.2010 02:39, skrev David Cournapeau:
>
> But the point is to get rid of the python dependency, and if you don't
> allow any api call to allocate memory, there is not much left to
> implement in the core.
>
>
Memory allocation is platform dependent. A CPython version could use
bytearray
Den 12.06.2010 15:57, skrev David Cournapeau:
> Anything non trivial will require memory allocation and object
> ownership conventions. If the goal is interoperation with other
> languages and vm, you may want to use something else than plain
> malloc, to interact better with the allocation strateg
Den 11.06.2010 17:17, skrev Anne Archibald:
>
> On the other hand, since memory reads are very slow, optimizations
> that do more calculation per load/store could make a very big
> difference, eliminating temporaries as a side effect.
>
Yes, that's the main issue, not the extra memory they use
Den 11.06.2010 09:14, skrev Sebastien Binet:
> it of course depends on the granularity at which you wrap and use
> numpy-core but tight loops calling ctypes ain't gonna be pretty
> performance-wise.
>
Tight loops in Python are never pretty.
The purpose of vectorization with NumPy is to avoid
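The point about tight loops can be shown concretely: the vectorized expression runs its loop in C inside NumPy, while the Python-level equivalent calls back into the interpreter once per element.

```python
import numpy as np

x = np.arange(1_000_000, dtype=np.float64)

# Vectorized: one expression, the loop runs in C.
y = x * 2.0 + 1.0

# The equivalent tight Python loop would enter the interpreter a
# million times (orders of magnitude slower):
# y = np.array([xi * 2.0 + 1.0 for xi in x])
```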
Den 11.06.2010 10:17, skrev Pauli Virtanen:
>> 1. Collect an array of pointers to each subarray (e.g. using
>> std::vector or dtype**)
>> 2. Dispatch on the pointer array...
>>
> This is actually what the current ufunc code does.
>
> The innermost dimension is handled via the ufunc loop, whi
Den 11.06.2010 03:02, skrev Charles R Harris:
>
> But for an initial refactoring it probably falls in the category of
> premature optimization. Another thing to avoid on the first go around
> is micro-optimization, as it tends to complicate the code and often
> doesn't do much for performance.
Den 11.06.2010 04:19, skrev David:
>
> Ah, ok, I did not know this was called copy-in/copy-out, thanks for the
> explanation. I agree this would be a good direction to pursue, but maybe
> out of scope for the first refactoring,
>
>
Copy-in copy-out is actually an implementation detail in Fortran
Den 11.06.2010 00:57, skrev David Cournapeau:
Do you have the code for this ? That's something I wanted to do, but
never took the time to do. Faster generic iterator would be nice, but
very hard to do in general.
/* this computes the start address for every vector along a dimension
(axis)
Den 10.06.2010 22:07, skrev Travis Oliphant:
>
>> 2. The core should be a plain DLL, loadable with ctypes. (I know David
>> Cournapeau and Robert Kern is going to hate this.) But if Python can have a
>> custom loader for .pyd files, so can NumPy for its core DLL. For ctypes we
>> just need to s
Den 10.06.2010 22:28, skrev Pauli Virtanen:
>
> Some places where Openmp could probably help are in the inner ufunc
> loops. However, improving the memory efficiency of the data access
> pattern is another low-hanging fruit for multidimensional arrays.
>
>
Getting the intermediate array out of
s the NumPy core DLL. Yes it will not work with older versions of
Python. But for a complete refactoring ("NumPy 3000"), backwards
compatibility should not matter much. (It's easier to backport bytearray
than track down memory leaks in NumPy's core.)
Sturla
Den 10.06
Den 10.06.2010 18:48, skrev Sturla Molden:
> ctypes will also make porting to other Python implementations easier
> (or even other languages: Ruby, JavaScript). Not to mention
> that it will make NumPy impervious to changes in the Python C API.
Linking is also easier with
I have a few radical suggestions:
1. Use ctypes as glue to the core DLL, so we can completely forget about
refcounts and similar mess. Why put manual reference counting and error
handling in the core? It's stupid.
2. The core should be a plain DLL, loadable with ctypes. (I know David
Courna
> Sturla Molden wrote:
>> I would suggest using GotoBLAS instead of ATLAS.
>
>> http://www.tacc.utexas.edu/tacc-projects/
>
> That does look promising -- any idea what the license is? They don't
> make it clear on the site
UT TACC Research License (Source Code)
> I also tried to Install numpy with intel mkl 9.1
> I still used gfortran for numpy installation as intel mkl 9.1 supports gnu
> compiler.
I would suggest using GotoBLAS instead of ATLAS. It is easier to build
than ATLAS (basically no configuration), and has even better performance
than MKL.
ht
Colin J. Williams skrev:
> When one has a smallish sample size, what gives the best estimate of the
> variance?
What do you mean by "best estimate"?
Unbiased? Smallest standard error?
> In the widely used Analysis of Variance (ANOVA), the degrees of freedom
> are reduced for each mean estimate
Colin J. Williams skrev:
>
> suggested that 1 (one) would be a better default but Robert Kern told
> us that it won't happen.
>
>
I don't even see the need for this keyword argument, as you can always
multiply the variance by n/(n-1) to get what you want.
Also, normalization by n gives th
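The n/(n-1) rescaling is exactly what the ddof keyword does; a quick check:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
n = x.size

biased = np.var(x)             # normalizes by n
unbiased = np.var(x, ddof=1)   # normalizes by n - 1

# Multiplying the n-normalized variance by n/(n-1) recovers
# the unbiased estimate, so either default is workable.
rescaled = biased * n / (n - 1)
```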
David Cournapeau skrev:
> We are talking about the numpy extensions here, which are not
> installed through the install_data command. The problem is about how
> windows looks for dll with the manifest mechanism, and how to
> build/install extensions when the C runtime (or any other "system"
> dll)
David Cournapeau skrev:
> If every python package starts to put its extensions (*.pyd) into a
> directory, what happens when two different packages have an extension
> with the same name (e.g. package foo has a package multiarray.pyd) ? I
> would also be really annoyed if a 3rd party extension star
th a bias to the right.
- Hit 7 or better, with no bias.
Do you think it can be shown that the latter option is the better?
No?
Sturla Molden
Pauli Virtanen skrev:
> XXX: 3K: numpy.random is disabled for now, uses PyString_*
> XXX: 3K: numpy.ma is disabled for now -- some issues
>
I thought numpy.random uses Cython? Is it just a matter of recompiling
the pyx-file?
> I remember Dag was working on this a bit: how far did it go?
>
>
Robin skrev:
> Ah, I hadn't realised it was an OS constraint - I thought it was
> possible to unload dlls - and that was why matlab provides the clear
> function. mex automatically clears a function when you rebuild it - I
> thought that was how you can rebuild and reload mex functions without
> re
Robin skrev:
> So far the only remotely tricky thing I did was redirect sys.stdout
> and sys.stderr to a wrapper that uses mexPrintf so output goes to the
> matlab console.
>
Be careful when you are using file handles. You have to be sure that
Matlab, Python and NumPy are all linked against the
Robin skrev:
> I had assumed when matlab unloads the mex function it would also
> unload python - but it looks like other dynamic libs pulled in from
> the mex function (in this case python and in turn numpy) aren't
> unloaded...
>
Matlab MEX functions are DLLs, the Python interpreter is a DLL, NumP
Alexey Tigarev skrev:
> I have implemented multiple regression in a following way:
>
>
You should be using QR or SVD for this.
Sturla
Jake VanderPlas wrote:
> Does anybody know a
> way to directly access the numpy.linalg routines from a C extension,
> without the overhead of a python callback? Thanks for the help.
>
You find a C function pointer wrapped in a CObject in the ._cpointer
attribute.
David Cournapeau wrote:
> On Fri, Nov 6, 2009 at 6:54 AM, David Goldsmith
> wrote:
>
>> Interesting thread, which leaves me wondering two things: is it documented
>> somewhere (e.g., at the IEEE site) precisely how many *decimal* mantissae
>> are representable using the 64-bit IEEE standard fo
Sturla Molden skrev:
> Robert Kern skrev:
>
>> Then let me clarify: it was written to support integer ranges up to
>> sys.maxint. Absolutely, it would be desirable to extend it.
>>
>>
>>
> Actually it only supports integers up to sys.maxin
Robert Kern skrev:
> Then let me clarify: it was written to support integer ranges up to
> sys.maxint. Absolutely, it would be desirable to extend it.
>
>
Actually it only supports integers up to sys.maxint-1, as
random_integers call randint. random_integers includes the upper range,
but randi
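The interval conventions are the crux here. randint uses a half-open interval [low, high), while the old random_integers wrapper was inclusive of high and forwarded to randint(low, high + 1), which is where sys.maxint overflowed a C long. A sketch with randint (random_integers itself has since been removed from NumPy):

```python
import numpy as np

# Half-open interval: values fall in [0, 10), i.e. 0..9 inclusive.
x = np.random.randint(0, 10, size=1000)

# The inclusive wrapper was equivalent to:
#   randint(low, high + 1)
# so high = sys.maxint pushed high + 1 past the C long range.
```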
Robert Kern skrev:
> Then let me clarify: it was written to support integer ranges up to
> sys.maxint. Absolutely, it would be desirable to extend it.
>
>
I know, but look at this:
>>> import sys
>>> sys.maxint
2147483647
>>> 2**31-1
2147483647L
sys.maxint becomes a long, which is what conf
Robert Kern skrev:
> 64-bit and larger integers could be done, but it requires
> modification. The integer distributions were written to support C
> longs, not anything larger. You could also use .bytes() and
> np.fromstring().
>
But as of Python 2.6.4, even 32-bit integers fail, at least on Win
Thomas Robitaille skrev:
> np.random.random_integers(np.iinfo(np.int32).min,high=np.iinfo
> (np.int32).max,size=10)
>
> which gives
>
> array([-1506183689, 662982379, -1616890435, -1519456789, 1489753527,
> -604311122, 2034533014, 449680073, -444302414,
> -1924170329])
>
>
Th
Bill Blinn skrev:
> v = multiview((3, 4))
> #the idea of the following lines is that the 0th row of v is
> #a view on the first row of a. the same would hold true for
> #the 1st and 2nd row of v and the 0th rows of b and c, respectively
> v[0] = a[0]
This would not even work, because a[0] does not
Anne Archibald skrev:
> The short answer is, you can't.
Not really true. It is possible to create an array (sub)class that stores
memory addresses (pointers) instead of values. It is doable, but I am
not wasting my time implementing it.
Sturla
Lisandro Dalcin skrev:
> Is there any specific naming convention for these XML files to work
> with KATE? Would it be fine to call it 'cython-mode-kate.xml' to push
> it to the repo? Will it still work (I mean, with that name) when
> placed appropriately in KATE config dirs or whatever? ... Just
>
Sturla Molden skrev:
and "Cython with NumPy" shows up under Sources. Anyway, this is the
syntax high-lighter I use to write Cython.
It seems I posted the wrong file. :-(
S.M.
as
cimport
import
as you wish.
P.S. I am also cleaning up Python high-lighting for KDE. Not done yet,
but I will post a "Python with NumPy" highlighter later on if this is
interesting.
P.P.S. This also covers Pyrex, but add in some Cython stuff.
Sturla Molden
Ralf Gommers skrev:
>
> If anyone with knowledge of the differences between the C and Fortran
> versions could add a few notes at the above link, that would be great.
>
The most notable difference (from a user perspective) is that the
Fortran version has more transforms, such as discrete sine and
Dag Sverre Seljebotn skrev:
> Microsoft's compilers don't support C99 (or, at least, versions that
> still has to be used doesn't).
>
>
Except for automatic arrays, they do support some of the more important
parts of C99 as extensions to C89:
inline functions
restrict qualifier
for (int i
ver
arrays, possibly using where/find masks, are fast. So although NumPy is
not executed on a vector machine like the Cray C90, it certainly behaves
like one performance-wise.
I'd say that a MIMD machine running NumPy is a Turing machine emulating
a SIMD/vector machine.
And now I a
Mathieu Blondel skrev:
> As I wrote earlier in this thread, I confused Cython and CPython. PN
> was suggesting to include Numpy in the CPython distribution (not
> Cython). The reason why was also given earlier.
>
>
First, that would currently not be possible, as NumPy does not support
Py3k. Se
Mathieu Blondel skrev:
> Peter Norvig suggested to merge Numpy into Cython but he didn't
> mention SIMD as the reason (this one is from me).
I don't know what Norvig said or meant.
However:
There is NumPy support in Cython. Cython has a general syntax applicable
to any PEP 3118 buffer. (As Num
Matthieu Brucher skrev:
> I agree with Sturla, for instance nVidia GPUs do SIMD computations
> with blocs of 16 values at a time, but the hardware behind can't
> compute on so much data at a time. It's SIMD from our point of view,
> just like Numpy does ;)
>
>
A computer with a CPU and a GPU is
Robert Kern skrev:
> No, I think you're right. Using "SIMD" to refer to numpy-like
> operations is an abuse of the term not supported by any outside
> community that I am aware of. Everyone else uses "SIMD" to describe
> hardware instructions, not the application of a single syntactical
> element o
__array_interface__['data'][0]
offset = (16 - address % 16) % 16
return tmp[offset:offset+N].view(dtype=dtype)
Sturla Molden
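The fragment above is the tail of the usual over-allocate-and-slice alignment trick. A self-contained version of the whole function might look like this (the name aligned_empty is mine, not from the post):

```python
import numpy as np

def aligned_empty(n, dtype=np.float64, align=16):
    """Uninitialized 1-D array whose data pointer is align-byte aligned."""
    itemsize = np.dtype(dtype).itemsize
    # Over-allocate by `align` bytes so some 16-byte boundary is inside.
    buf = np.empty(n * itemsize + align, dtype=np.uint8)
    address = buf.__array_interface__['data'][0]
    offset = (align - address % align) % align
    # Slice from the aligned boundary, then reinterpret as the target dtype.
    return buf[offset:offset + n * itemsize].view(dtype=dtype)

a = aligned_empty(100)
```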
Skipper Seabold skrev:
> I'm curious about this as I use ss, which is just np.sum(a*a, axis),
> in statsmodels and didn't much think about it.
>
> Do the number of loops matter in the timings and is dot always faster
> even without the blas dot?
>
The thing is that a*a returns a temporary array
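The temporary is the whole story: a*a materializes a full-size intermediate array before sum ever runs, while dot makes a single pass (possibly through BLAS).

```python
import numpy as np

a = np.random.rand(10000)

ss = np.sum(a * a)   # a*a allocates a temporary array the size of a
d = np.dot(a, a)     # one fused pass over the data, no temporary
```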
Francesc Alted skrev:
> The response is clear: avoid memcpy() if you can. It is true that memcpy()
> performance has improved quite a lot in latest gcc (it has been quite good in
> Win versions since many years ago), but working with data in-place (i.e.
> avoiding a memory copy) is always faste
Robert Kern skrev:
> collections.deque() is a linked list of 64-item chunks.
>
Thanks for that useful information. :-) But it would not help much for a
binary tree...
Since we are on the NumPy list... One could image making linked lists
using NumPy arrays with dtype=object. They are storage
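A minimal sketch of that idea: a cons cell as a 2-slot object array, slot 0 the value and slot 1 the next node (the cons helper is my own naming for illustration).

```python
import numpy as np

def cons(value, rest=None):
    # One linked-list node: (value, next) stored in an object array.
    node = np.empty(2, dtype=object)
    node[0], node[1] = value, rest
    return node

lst = cons(1, cons(2, cons(3)))

# Walk the list, collecting values.
values = []
node = lst
while node is not None:
    values.append(node[0])
    node = node[1]
```

Storage-wise each node is a full ndarray object, so this is far heavier than a plain Python object per node; it is a curiosity, not an optimization.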
Robert Kern skrev:
> While this description is basically true of numpy arrays, I would
> caution you that every language has a different lexicon, and the same
> word can mean very different things in each. For example, Python lists
> are *not* linked lists; they are like C++'s std::vectors with a
>
Xavier Gnata skrev:
> I have a large 2D numpy array as input and a 1D array as output.
> In between, I would like to use C code.
> C is requirement because it has to be fast and because the algorithm
> cannot be written in a numpy oriented way :( (no way...really).
>
There are certain algorithm
René Dudfield skrev:
> Another way is to make your C function then load it with ctypes
Also one should be aware that ctypes is a stable part of the Python
standard library.
Cython is still unstable and in rapid development.
Pyrex is more stable than Cython, but interfacing with ndarrays is harder
René Dudfield skrev:
> Another way is to make your C function then load it with ctypes(or
> wrap it with something else) and pass it pointers with
> array.ctype.data.
numpy.ctypeslib.ndpointer is preferred when using ndarrays with ctypes.
> You can find the shape of the array in python, and
> p
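ndpointer builds a ctypes-compatible type that validates dtype, ndim, and flags before the call ever reaches C. A sketch (the C library `mylib` in the comment is hypothetical; the type check itself runs without any C code):

```python
import numpy as np
from numpy import ctypeslib

# Expected argument type: 1-D, C-contiguous float64.
double_1d = ctypeslib.ndpointer(dtype=np.float64, ndim=1,
                                flags='C_CONTIGUOUS')

# A real C function would be wired up like:
#   mylib.scale.argtypes = [double_1d, ctypes.c_size_t]

# ctypes calls from_param to type-check each argument:
double_1d.from_param(np.zeros(4))          # accepted
try:
    double_1d.from_param(np.zeros((2, 2)))  # wrong ndim
    rejected = False
except TypeError:
    rejected = True
```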
Sebastian Haase skrev:
> I know that cython's numpy is still getting better and better over
> time, but is it already today possible to have numpy support when
> using Cython in "pure python" mode?
>
I'm not sure. There is this odd memoryview syntax:
import cython
view = cython.int[:,:](my2darr
mtt/doc/mtt.pdf
<http://home.online.no/%7Epjacklam/matlab/doc/mtt/doc/mtt.pdf>
Sturla Molden
V. Armando Solé skrev:
> In python 2.6:
>
> >>>import numpy.core._dotblas as dotblas
> ...
> ImportError: No module named _dotblas
>
>>> import numpy.core._dotblas as dotblas
>>> dotblas.__file__
'C:\\Python26\\lib\\site-packages\\numpy\\core\\_dotblas.pyd'
V. Armando Solé skrev:
> import numpy
> import time
> a=numpy.arange(1000000.)
> a.shape=1000,1000
> t0=time.time()
> b=numpy.dot(a.T,a)
> print "Elapsed time = ",time.time()-t0
>
> reports an "Elapsed time" of 1.4 seconds under python 2.5 and 15 seconds
> under python 2.6
>
My computer reports
Francesc Alted skrev:
>
> Numexpr already uses the Python parser, instead of building a new one.
> However the bytecode emitted after the compilation process is
> different, of course.
>
> Also, I don't see the point in requiring immutable buffers. Could you
> develop this further?
>
If you do lazy
Rohit Garg skrev:
> gtx280-->141GBps-->has 1GB
> ati4870-->115GBps-->has 1GB
> ati5870-->153GBps (launches sept 22, 2009)-->2GB models will be there too
>
That is going to help if buffers are kept in graphics memory. But the
problem is that graphics memory is a scarce resource.
S.M.
closer to a specialized SciPy JIT-compiler. It would
be fun to make if I could find time for it.
Sturla Molden
cannot be the source of the slowness.
Sturla Molden
AS, not in NumPy's core. One could e.g. consider linking with a
BLAS wrapper that directs these special cases to the GPU and the rest to
ATLAS / MKL / netlib BLAS.
Sturla Molden
Daniel Platz skrev:
> data1 = numpy.zeros((256,200),dtype=int16)
> data2 = numpy.zeros((256,200),dtype=int16)
>
> This works for the first array data1. However, it returns with a
> memory error for array data2. I have read somewhere that there is a
> 2GB limit for numpy arrays on a 32 bit m
Alan G Isaac skrev:
> http://article.gmane.org/gmane.comp.python.general/630847
>
Yes, but here you still have to look up the name 'f' from locals in each
iteration. map is written in C; once it has a PyObject* to the callable
it does not need to look up the name anymore. The dictionary look
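The difference is easy to see side by side: the loop resolves the name `f` on every iteration, while map binds the callable once and drives the iteration from C.

```python
f = abs
data = [-1, 2, -3]

# Name `f` is looked up on each pass through the loop body.
out_loop = [f(x) for x in data]

# map captures the callable object once, then iterates in C.
out_map = list(map(f, data))
```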
Mark Wendell skrev:
> for i in range(5):
> for j in range(5):
> a[i,j].myMethod(var3,var4)
> print a[i,j].attribute1
>
> Again, is there a quicker way than above to call myMethod or access attribute1
One option is to look up the name of the method unbound, and then use
bui
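The hoisted-lookup idea can be sketched like this (the Cell class and its double method are invented stand-ins for the poster's a[i,j].myMethod):

```python
class Cell:
    def __init__(self, v):
        self.v = v

    def double(self):
        self.v *= 2

cells = [Cell(i) for i in range(5)]

# Look the method up once, unbound, outside the loop; calling
# method(c) is equivalent to c.double() without the per-item
# attribute lookup.
method = Cell.double
for c in cells:
    method(c)
```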
Charles R Harris skrev:
> The size of long depends on the compiler as well as the operating
> system. On linux x86_64, IIRC, it is 64 bits, on Windows64 I believe
> it is 32. Ints always seem to be 32 bits.
If I remember the C standard correctly, a long is guaranteed to be at
least 32 bits, wher
David Warde-Farley skrev:
>> The odd values might be from the format code in the error message:
>>
>>PyErr_Format(PyExc_ValueError,
>>"%ld requested and %ld written",
>>(long) size, (long) n);
>>
>
> Yes, I saw that. My C is rusty
dian you can actually create
quite an efficient median filter using a 3D ndarray. For example if you
use an image of 640 x 480 pixels and want a 9 pixel median filter, you
can put shifted images in an 640 x 480 x 9 ndarray, and call median
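A small-scale sketch of the stacking trick, filtering only the valid interior of the image (the 3x3 window stands in for the 9-pixel neighborhood; function name is mine):

```python
import numpy as np

def median3x3(img):
    """9-point median filter via a stacked (h-2, w-2, 9) array."""
    h, w = img.shape
    stack = np.empty((h - 2, w - 2, 9), dtype=img.dtype)
    k = 0
    # Each of the 9 shifted views of the image becomes one layer.
    for dy in range(3):
        for dx in range(3):
            stack[:, :, k] = img[dy:dy + h - 2, dx:dx + w - 2]
            k += 1
    # One vectorized median over the last axis does the filtering.
    return np.median(stack, axis=2)

img = np.arange(25.0).reshape(5, 5)
out = median3x3(img)
```

On a linear ramp like this the 3x3 median of each interior pixel is the pixel itself, which makes a handy sanity check.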
Robert Kern skrev:
> When he is talking about 2D, I believe he is referring to median
> filtering rather than computing the median along an axis. I.e.,
> replacing each pixel with the median of a specified neighborhood
> around the pixel.
>
>
That's not something numpy's median function should b
d be fast regardless of dimensions.
I haven't tested the Cython code /thoroughly/, but at least it does compile.
Sturla Molden
ary file
store C structs written successively.
Sturla Molden
Sebastian Haase skrev:
> A mockarray is initialized with a list of nd-arrays. The result is a
> mock array having one additional dimension "in front".
This is important, because often in the case of 'concatenation' a real
concatenation is not needed. But then there is a common tool called
Matlab
V. Armando Solé skrev:
> I am looking for a way to have a non contiguous array C in which the
> "left" (1, 2000) elements point to A and the "right" (1, 4000)
> elements point to B.
>
> Any hint will be appreciated.
If you know in advance that A and B are going to be duplicated, you can
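Presumably the suggestion continues along these lines: allocate the combined array first, then let A and B be views into it, so filling A and B fills C with no copying (the sizes below follow the (1, 2000) / (1, 4000) split in the post).

```python
import numpy as np

# Allocate the combined buffer up front.
C = np.empty((1, 6000))

# A and B are views into C: writing through them writes into C.
A = C[:, :2000]
B = C[:, 2000:]

A[...] = 1.0
B[...] = 2.0
```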
Citi, Luca skrev:
> Hello Sturla,
> In "_median" how can you, if n==2, use s[] if s is not defined?
> What if n==1?
>
That was a typo.
> Also, I think when returning an empty array, it should be of
> the same type you would get in the other cases.
Currently median returns numpy.nan for empty
ose to a megabyte of Cython generated gibberish C just for the
median.
Sturla Molden
Sturla Molden skrev:
>
> http://projects.scipy.org/numpy/attachment/ticket/1213/generate_qselect.py
> http://projects.scipy.org/numpy/attachment/ticket/1213/quickselect.pyx
My suggestion for a new median function is here:
http://projects.scipy.org/numpy/attachment/ticket/1213/media
Sturla Molden skrev:
>
> By the way, here is a more polished version, does it look ok?
No it doesn't... Got to keep the GIL for the general case (sorting
object arrays). Fixing that.
SM
lect.py
http://projects.scipy.org/numpy/attachment/ticket/1213/quickselect.pyx
Cython needs something like Java's generics by the way :-)
Regards,
Sturla Molden
Dag Sverre Seljebotn skrev:
> Nitpick: This will fail on large arrays. I guess numpy.npy_intp is the
> right type to use in this case?
>
>
Yup. You are right. Thanks.
Sturla
lementation of
median. I disabled overwrite_input because the median function calls
numpy.apply_along_axis.
Regards,
Sturla Molden
median.py.gz
Description: application/gzip
Sturla Molden skrev:
> We recently had a discussion regarding an optimization of NumPy's median
> to average O(n) complexity. After some searching, I found out there is a
> selection algorithm competitive in speed with Hoare's quick select.
>
> Reference:
> ht
odule.c.src. When you are done, it might be
about 10% faster than this. :-)
Reference:
http://ndevilla.free.fr/median/median.pdf
Best regards,
Sturla Molden
edian replacement myself. I was
thinking in the same lines, except I don't store those two arrays. I
just keep track of counts in them. For the even case, I also keep track
the elements closest to the pivot (smaller and bigger). It's incredibly
simple actually. So let's see who gets the
s worse.
Improved memory usage - e.g. through lazy evaluation and JIT compilation
of expressions - can give up to a tenfold increase in performance. That
is where we must start optimising to get a faster NumPy. Incidentally,
this will also make it easier to leverage
Xavier Saint-Mleux skrev:
> Of course, the mathematically correct way would be to use a correct
> jumpahead function, but all the implementations that I know of are GPL.
> A recent article about this is:
>
> www.iro.umontreal.ca/~lecuyer/myftp/papers/jumpmt.pdf
>
>
I know of no efficient "jumpahead
Sturla Molden skrev:
> It seems there is a special version of the Mersenne Twister for this.
> The code is LGPL (annoying for SciPy but ok for me).
http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/DC/dgene.pdf
<http://www.math.sci.hiroshima-u.ac.jp/%7Em-mat/MT/DC/dgene.pdf>
Basically