Re: [Numpy-discussion] Statistical distributions on samples

2011-08-15 Thread Andrea Gavana
Hi Chris & Brennan,

On 15 August 2011 00:59, Brennan Williams wrote:
 You can use scipy.stats.truncnorm, can't you? Unless I misread, you want to
 sample a normal distribution but with generated values only being within a
 specified range? However you also say you want to do this with triangular
 and log normal and for these I presume the easiest way is to sample and then
 accept/reject.

 Brennan
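
[A minimal sketch of the truncnorm approach Brennan mentions, assuming the
numbers used elsewhere in the thread (mean 125, bounds [50, 200]) and an
illustrative sigma of 25:]

import scipy.stats as stats

loc, scale = 125.0, 25.0
a = (50.0 - loc) / scale    # truncnorm takes its bounds in standard-deviation units
b = (200.0 - loc) / scale
samples = stats.truncnorm.rvs(a, b, loc=loc, scale=scale, size=200)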

 On 13/08/2011 2:53 a.m., Christopher Jordan-Squire wrote:

 Hi Andrea--An easy way to get something like this would be

 import numpy as np
 import scipy.stats as stats

 sigma = #some reasonable standard deviation for your application
 x = stats.norm.rvs(size=1000, loc=125, scale=sigma)
 x = x[x > 50]
 x = x[x < 200]

 That will give a roughly normal distribution to your velocities, as long as,
 say, sigma < 25. (I'm using the rule of thumb for the normal distribution that
 normal random samples lie more than 3 standard deviations away from the mean
 about 1 out of 350 times.) Though you won't be able to get exactly normal errors
 about your mean since normal random samples can theoretically be of any
 size.

 You can use this same process for any other distribution, as long as you've
 chosen a scale variable so that the probability of samples being outside
 your desired interval is really small. Of course, once again your random
 errors won't be exactly from the distribution you get your original samples
 from.
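
[As a quick sanity check on the 3-sigma rule of thumb mentioned above:]

import scipy.stats as stats

print(stats.norm.sf(3) * 2)   # two-sided tail beyond 3 sigma, ~0.0027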

Thank you for your answer. Indeed, it appears that a truncated
distribution implementation exists only for the normal distribution
(in the subset of distributions I need to use). I haven't checked yet
what the code for truncnorm does, but it might be possible to apply the
same approach to other distributions. In any case the sample/accept/reject
approach is the best one for me, given my ignorance about statistical
things :-)

Thank you again.

Andrea.

Imagination Is The Only Weapon In The War Against Reality.
http://xoomer.alice.it/infinity77/


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] ULONG not in UINT16, UINT32, UINT64 under 64-bit windows, is this possible?

2011-08-15 Thread Pearu Peterson
Hi,

A student of mine, using 32-bit numpy 1.5 under 64-bit Windows 7, noticed
that when a numpy array with dtype=uint32 is passed to an extension module,
the following codelet fails:

switch(PyArray_TYPE(ARR)) {
  case PyArray_UINT16: /* do smth */ break;
  case PyArray_UINT32: /* do smth */ break;
  case PyArray_UINT64: /* do smth */ break;
  default: /* raise type error exception */
}

The same test worked fine under Linux.

Checking the value of PyArray_TYPE(ARR) (=8) showed that it corresponds to
NPY_ULONG (when counting the items in the enum definition).

Is this situation possible where NPY_ULONG does not correspond to a 16 or 32
or 64 bit integer?
Or does this indicate a bug somewhere for this particular platform?

Thanks,
Pearu
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] ULONG not in UINT16, UINT32, UINT64 under 64-bit windows, is this possible?

2011-08-15 Thread Olivier Delalleau
The reason is that there can be multiple dtypes (i.e. with different .num)
representing the same kind of data.
Usually in Python this goes unnoticed, because you do not test a dtype
through its .num; instead you use, for instance, == 'uint32', and all works
fine.
However, it can indeed confuse C code in situations like the one you
describe, because of the direct comparison of .num.
I guess you have a few options:
- Do not compare .num (I'm not sure what the equivalent of == 'uint32'
would be in C though) => probably slower
- Re-cast your array into the exact dtype you need (in Python you can do
this with .view) => probably cumbersome
- Write a customized comparison function that figures out at initialization
time all dtypes that represent the same data, and is then able to do a fast
comparison based on .num => probably best, but requires more work
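
[A rough Python-level illustration of the point above; the exact .num values
are platform dependent, so the comments are indicative only:]

import numpy as np

a = np.zeros(3, dtype='uint32')
print(a.dtype == 'uint32')      # True on every platform
print(a.dtype.num)              # platform dependent: NPY_UINT on most builds,
                                # reported as NPY_ULONG (8) in the Windows case above

# Distinct type numbers can describe same-width integers (C unsigned int vs.
# unsigned long); which one 'uint32' maps to depends on the platform.
print(np.dtype(np.uintc).num)
print(np.dtype(np.uint).num)

# Option 2 above: reinterpret the buffer as an explicitly chosen same-size dtype.
b = a.view(np.uintc)
print(b.dtype.num)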

Here's some Python code that lists the various scalar dtypes associated with
unique .num values in numpy (excerpt slightly modified from code found in Theano --
http://deeplearning.net/software/theano -- BSD license). Call the
get_numeric_types() function, and print both the string representation of
the resulting dtypes as well as their .num.

import numpy


def get_numeric_subclasses(cls=numpy.number, ignore=None):
    """
    Return subclasses of `cls` in the numpy scalar hierarchy.

    We only return subclasses that correspond to unique data types.
    The hierarchy can be seen here:
    http://docs.scipy.org/doc/numpy/reference/arrays.scalars.html
    """
    if ignore is None:
        ignore = []
    rval = []
    dtype = numpy.dtype(cls)
    dtype_num = dtype.num
    if dtype_num not in ignore:
        # Safety check: we should be able to represent 0 with this data type.
        numpy.array(0, dtype=dtype)
        rval.append(cls)
        ignore.append(dtype_num)
    for sub in cls.__subclasses__():
        rval += [c for c in get_numeric_subclasses(sub, ignore=ignore)]
    return rval


def get_numeric_types():
    """
    Return numpy numeric data types.

    :returns: A list of unique data type objects. Note that multiple data
        types may share the same string representation, but can be
        differentiated through their `num` attribute.
    """
    rval = []

    def is_within(cls1, cls2):
        # Return True if scalars defined from `cls1` are within the hierarchy
        # starting from `cls2`.
        # The third test below is to catch for instance the fact that
        # one can use ``dtype=numpy.number`` and obtain a float64 scalar, even
        # though `numpy.number` is not under `numpy.floating` in the class
        # hierarchy.
        return (cls1 is cls2 or
                issubclass(cls1, cls2) or
                isinstance(numpy.array([0], dtype=cls1)[0], cls2))

    for cls in get_numeric_subclasses():
        dtype = numpy.dtype(cls)
        rval.append([str(dtype), dtype, dtype.num])
    # We sort it to be deterministic, then remove the string and num elements.
    return [x[1] for x in sorted(rval, key=str)]
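
[A possible usage sketch for the function above; the output depends on the
platform and numpy build:]

for dtype in get_numeric_types():
    print("%-12s num=%d" % (dtype, dtype.num))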


2011/8/15 Pearu Peterson pearu.peter...@gmail.com


 Hi,

 A student of mine, using 32-bit numpy 1.5 under 64-bit Windows 7, noticed
 that when a numpy array with dtype=uint32 is passed to an extension module,
 the following codelet fails:

 switch(PyArray_TYPE(ARR)) {
   case PyArray_UINT16: /* do smth */ break;
   case PyArray_UINT32: /* do smth */ break;
   case PyArray_UINT64: /* do smth */ break;
   default: /* raise type error exception */
 }

 The same test worked fine under Linux.

 Checking the value of PyArray_TYPE(ARR) (=8) showed that it corresponds to
 NPY_ULONG (when counting the items in the enum definition).

 Is this situation possible where NPY_ULONG does not correspond to a 16 or
 32 or 64 bit integer?
 Or does this indicate a bug somewhere for this particular platform?

 Thanks,
 Pearu

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Statistical distributions on samples

2011-08-15 Thread Andrea Gavana
Hi Chris and All,

On 12 August 2011 16:53, Christopher Jordan-Squire wrote:
 Hi Andrea--An easy way to get something like this would be

 import numpy as np
 import scipy.stats as stats

 sigma = #some reasonable standard deviation for your application
 x = stats.norm.rvs(size=1000, loc=125, scale=sigma)
 x = x[x > 50]
 x = x[x < 200]

 That will give a roughly normal distribution to your velocities, as long as,
 say, sigma < 25. (I'm using the rule of thumb for the normal distribution that
 normal random samples lie more than 3 standard deviations away from the mean about 1
 out of 350 times.) Though you won't be able to get exactly normal errors
 about your mean since normal random samples can theoretically be of any
 size.

 You can use this same process for any other distribution, as long as you've
 chosen a scale variable so that the probability of samples being outside
 your desired interval is really small. Of course, once again your random
 errors won't be exactly from the distribution you get your original samples
 from.

Thank you for your suggestion. There are a couple of things I am not
clear about, however. The first one (the easy one) is: let's suppose I
need 200 values, and the accept/discard procedure removes 5 of them
from the list. Is there any way to draw these 200 values from a bigger
sample so that the accept/reject procedure will not interfere too
much? And how do I get 200 values out of the bigger sample so that
these values are still representative?

Another thing, possibly completely unrelated. I am trying to design a
toy Latin Hypercube script (just for my own understanding). I found
this piece of code on the web (and I modified it slightly):

import numpy
from scipy import stats


def lhs(dist, size=100):
    '''
    Latin Hypercube sampling of any distribution.
    dist is a scipy.stats random number generator
    such as stats.norm, stats.beta, etc.

    :Parameters:
    - `dist`: random number generator from the scipy.stats module.
    - `size`: size of the output sample.
    '''

    n = size

    # One equally spaced percentile per stratum, in random order.
    perc = numpy.arange(0.0, 1.0, 1.0/n)
    numpy.random.shuffle(perc)

    # Jitter each percentile uniformly within its stratum of width 1/n.
    smp = [stats.uniform(i, 1.0/n).rvs() for i in perc]

    # Map the stratified percentiles through the inverse CDF of the
    # target distribution.
    v = dist.ppf(smp)

    return v


Now, I am not 100% clear on what the percent point function is (I have
read around the web, but please keep in mind that my statistical
skills are close to minus infinity). From this page:

http://www.itl.nist.gov/div898/handbook/eda/section3/eda362.htm

I gather that, if you plot the results of the ppf, with the horizontal
axis as probability, the vertical axis goes from the smallest to the
largest value of the cumulative distribution function. If I do this:

numpy.random.seed(123456)

distribution = stats.norm(loc=125, scale=25)

my_lhs = lhs(distribution, 50)

Will my_lhs always contain valid values (i.e., between 50 and
200)? I assume the answer is no... but even if this were the case, is
this my_lhs array ready to be used to set up an LHS experiment when I
have multi-dimensional problems (in which all the variables are
completely independent from each other - no correlation)?

My apologies for the idiocy of the questions.

Andrea.

Imagination Is The Only Weapon In The War Against Reality.
http://xoomer.alice.it/infinity77/


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] [ANN] Constrained optimization solver with guaranteed precision

2011-08-15 Thread Dmitrey
 Hi all,
   I'm glad to inform you that general constraints handling for interalg
   (a free solver with guaranteed user-defined precision) is now available.
   Although it is still very premature and needs lots of improvements, it
   is already capable of outperforming the commercial BARON (example:
   http://openopt.org/interalg_bench#Test_4) and thus you may be
   interested in trying it right now (the next OpenOpt release will be no
   sooner than a month from now).

   interalg can be especially effective compared to BARON (and some other
   competitors) on problems with a huge or absent Lipschitz constant, for
   example on funcs like sqrt(x), log(x), 1/x, x**alpha with alpha < 1, when
   the domain of x is something like [small_positive_value, another_value].

   Let me also remind you that interalg can search for all solutions of
   nonlinear equations / systems of equations where local solvers like
   scipy.optimize fsolve cannot find any, and can compute single/multiple
   integrals with guaranteed user-defined precision (the speed of integration
   is intended to be improved in the future).
   However, only FuncDesigner models are handled (read the interalg webpage
   for more details).

   Regards, D.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Statistical distributions on samples

2011-08-15 Thread Christopher Jordan-Squire
On Mon, Aug 15, 2011 at 8:53 AM, Andrea Gavana andrea.gav...@gmail.com wrote:

 Hi Chris and All,

 On 12 August 2011 16:53, Christopher Jordan-Squire wrote:
  Hi Andrea--An easy way to get something like this would be
 
  import numpy as np
  import scipy.stats as stats
 
  sigma = #some reasonable standard deviation for your application
  x = stats.norm.rvs(size=1000, loc=125, scale=sigma)
  x = x[x > 50]
  x = x[x < 200]
 
  That will give a roughly normal distribution to your velocities, as long
 as,
 say, sigma < 25. (I'm using the rule of thumb for the normal distribution
 that
  normal random samples lie more than 3 standard deviations away from the mean about
 1
  out of 350 times.) Though you won't be able to get exactly normal errors
  about your mean since normal random samples can theoretically be of any
  size.
 
  You can use this same process for any other distribution, as long as
 you've
  chosen a scale variable so that the probability of samples being outside
  your desired interval is really small. Of course, once again your random
  errors won't be exactly from the distribution you get your original
 samples
  from.

 Thank you for your suggestion. There are a couple of things I am not
 clear with, however. The first one (the easy one), is: let's suppose I
 need 200 values, and the accept/discard procedure removes 5 of them
 from the list. Is there any way to draw these 200 values from a bigger
 sample so that the accept/reject procedure will not interfere too
 much? And how do I get 200 values out of the bigger sample so that
 these values are still representative?


FWIW, I'm not really advocating a truncated normal so much as making the
standard deviation small enough so that there's no real difference between a
true normal distribution and a truncated normal.

If you're worried about getting exactly 200 samples, then you could sample N
with N > 200, such that after throwing out the ones that lie outside your
desired region you're left with M > 200. Then just randomly pick 200 from
those M. That shouldn't bias anything as long as you randomly pick them. (Or
just pick the first 200, if you haven't done anything to impose any order on
the samples, such as sorting them by size.) But I'm not sure why you'd want
exactly 200 samples instead of some number of samples close to 200.
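
[A minimal sketch of that oversample-then-subsample idea, with sigma assumed
to be 25 as in the earlier example:]

import numpy as np
import scipy.stats as stats

sigma = 25.0
x = stats.norm.rvs(size=1000, loc=125, scale=sigma)   # draw more than needed
x = x[(x > 50) & (x < 200)]                           # accept/reject step
np.random.shuffle(x)                                  # random order, so the first 200 are a random subset
x = x[:200]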



 Another thing, possibly completely unrelated. I am trying to design a
 toy Latin Hypercube script (just for my own understanding). I found
 this piece of code on the web (and I modified it slightly):

 def lhs(dist, size=100):
     '''
     Latin Hypercube sampling of any distribution.
     dist is a scipy.stats random number generator
     such as stats.norm, stats.beta, etc.

     :Parameters:
     - `dist`: random number generator from the scipy.stats module.
     - `size`: size of the output sample.
     '''

     n = size

     perc = numpy.arange(0.0, 1.0, 1.0/n)
     numpy.random.shuffle(perc)

     smp = [stats.uniform(i, 1.0/n).rvs() for i in perc]

     v = dist.ppf(smp)

     return v


 Now, I am not 100% clear of what the percent point function is (I have
 read around the web, but please keep in mind that my statistical
 skills are close to minus infinity). From this page:

 http://www.itl.nist.gov/div898/handbook/eda/section3/eda362.htm


The ppf is what's called the quantile function elsewhere. I do not know why
scipy calls it the ppf/percent point function.

The quantile function is the inverse of the cumulative distribution function
(cdf). So dist.ppf(z) is the x such that P(dist <= x) = z. Roughly. (Things
get slightly more finicky if you think about discrete distributions because
then you have to pick what happens at the jumps in the cdf.) So
dist.ppf(0.5) gives the median of dist, and dist.ppf(0.25) gives the
lower/first quartile of dist.
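
[A quick illustration of the ppf/cdf relationship described above, using the
distribution from Andrea's example:]

import scipy.stats as stats

dist = stats.norm(loc=125, scale=25)
print(dist.ppf(0.5))               # 125.0, the median
print(dist.ppf(0.25))              # lower/first quartile
print(dist.cdf(dist.ppf(0.9)))     # ~0.9: ppf inverts the cdf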


 I gather that, if you plot the results of the ppf, with the horizontal
 axis as probability, the vertical axis goes from the smallest to the
 largest value of the cumulative distribution function. If I do this:

 numpy.random.seed(123456)

 distribution = stats.norm(loc=125, scale=25)

 my_lhs = lhs(distribution, 50)

 Will my_lhs always contain valid values (i.e., included between 50 and
 200)? I assume the answer is no... but even if this was the case, is
 this my_lhs array ready to be used to setup a LHS experiment when I
 have multi-dimensional problems (in which all the variables are
 completely independent from each other - no correlation)?


I'm not really sure if the above function is doing the lhs you want. To
answer your question, it won't always generate values within [50, 200]. If
size is large enough then you're dividing up the probability space evenly.
So even with the random perturbations (whose use I don't really understand),
you'll ensure that some of the samples you get when you apply the ppf will
correspond to the extremely low probability samples that are < 50 or > 200.
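
[To see this concretely for the distribution in question (a sketch, using the
parameters from Andrea's example):]

import scipy.stats as stats

dist = stats.norm(loc=125, scale=25)
print(dist.ppf(0.0005))            # well below 50
print(dist.ppf(0.9995))            # well above 200
print(dist.cdf(50) + dist.sf(200)) # total probability mass outside [50, 200]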

-Chris JS

My 

Re: [Numpy-discussion] [ANN] Constrained optimization solver with guaranteed precision

2011-08-15 Thread Andrea Gavana
Hi Dmitrey,

2011/8/15 Dmitrey tm...@ukr.net:
 Hi all,
 I'm glad to inform you that general constraints handling for interalg (free
 solver with guaranteed user-defined precision) now is available. Despite it
 is very premature and requires lots of improvements, it is already capable
 of outperforming commercial BARON (example:
 http://openopt.org/interalg_bench#Test_4)  and thus you could be interested
 in trying it right now (next OpenOpt release will be no sooner than 1
 month).

 interalg can be especially more effective than BARON (and some other
 competitors) on problems with huge or absent Lipschitz constant, for example
 on funcs like sqrt(x), log(x), 1/x, x**alpha, alpha < 1, when domain of x is
 something like [small_positive_value, another_value].

 Let me also remember you that interalg can search for all solutions of
 nonlinear equations / systems of them where local solvers like
 scipy.optimize fsolve cannot find anyone, and search single/multiple
 integral with guaranteed user-defined precision (speed of integration is
 intended to be enhanced in future).
 However, only FuncDesigner models are handled (read interalg webpage for
 more details).

Thank you for these new improvements. I am one of those who use OpenOpt
on real-life problems, and if I can advance a suggestion (for the
second time): when you post a benchmark of various optimization
methods, please do not consider elapsed time as the only
meaningful variable for measuring the success/failure of an algorithm.

Some (most?) real-life problems require intensive and time-consuming
simulations for every *function evaluation*; the time spent
by the solver itself doing its calculations simply disappears next to
the real process simulation. I know this because our simulations take
between 2 and 48 hours to run, so what's 300 seconds more or less in
the solver calculations? If you talk about synthetic problems (such as
the ones defined by a formula), I can see your point. For everything
else, I believe the number of function evaluations is a more direct
way to assess the quality of an optimization algorithm.

Just my 2c.

Andrea.

Imagination Is The Only Weapon In The War Against Reality.
http://xoomer.alice.it/infinity77/


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [ANN] Constrained optimization solver with guaranteed precision

2011-08-15 Thread Dmitrey
 Hi Andrea,
   I believe benchmarks should be done the way Hans Mittelman does them
   ( http://plato.asu.edu/bench.html ), and of course the number of function
   evaluations matters when slow Python code is tested against compiled code,
   but my current work doesn't leave me much time for OpenOpt development,
   let alone auxiliary work such as benchmarking (and doing it properly like
   that). Also, benchmarks of someone's own software are usually not very
   trustworthy, especially on his own problems.

   BTW, please don't reply to my posts on the scipy mailing lists - I use
   them only to post announcements like this one and may miss a reply.

   Regards, D.

   --- Original message ---
   From:  Andrea Gavana andrea.gav...@gmail.com
   To:  Discussion of Numerical Python numpy-discussion@scipy.org
   Date: 15 August 2011, 23:01:05
   Subject: Re: [Numpy-discussion] [ANN] Constrained optimization solver
   with guaranteed precision



 Hi Dmitrey,
 
 2011/8/15 Dmitrey  tm...@ukr.net :
  Hi all,
  I'm glad to inform you that general constraints handling for interalg (free
  solver with guaranteed user-defined precision) now is available. Despite it
  is very premature and requires lots of improvements, it is already capable
  of outperforming commercial BARON (example:
  http://openopt.org/interalg_bench#Test_4 ) and thus you could be interested
  in trying it right now (next OpenOpt release will be no sooner than 1
  month).
 
  interalg can be especially more effective than BARON (and some other
  competitors) on problems with huge or absent Lipschitz constant, for example
  on funcs like sqrt(x), log(x), 1/x, x**alpha, alpha < 1, when domain of x is
  something like [small_positive_value, another_value].
 
  Let me also remember you that interalg can search for all solutions of
  nonlinear equations / systems of them where local solvers like
  scipy.optimize fsolve cannot find anyone, and search single/multiple
  integral with guaranteed user-defined precision (speed of integration is
  intended to be enhanced in future).
  However, only FuncDesigner models are handled (read interalg webpage for
  more details).
 
 Thank you for this new improvements. I am one of those who use OpenOpt
 in real life problems, and if I can advance a suggestion (for the
 second time), when you post a benchmark of various optimization
 methods, please do not consider the elapsed time only as a
 meaningful variable to measure a success/failure of an algorithm.
 
 Some (most?) of real life problems require intensive and time
 consuming simulations for every *function evaluation*; the time spent
 by the solver itself doing its calculations simply disappears in front
 of the real process simulation. I know it because our simulations take
 between 2 and 48 hours to run, so what's 300 seconds more or less in
 the solver calculations? If you talk about synthetic problems (such as
 the ones defined by a formula), I can see your point. For everything
 else, I believe the number of function evaluations is a more direct
 way to assess the quality of an optimization algorithm.
 
 Just my 2c.
 
 Andrea.
 
   
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] inverting and calculating eigenvalues for many small matrices

2011-08-15 Thread Daniel Wheeler
Hi, I put together a set of tools for inverting, multiplying and
finding eigenvalues for many small matrices (arrays of shape (N, M, M),
where MxM is the size of each matrix). Thanks to the poster who
suggested using the Tokyo package. Although it is not used directly, it
helped with figuring out the correct arguments to pass to the lapack
routines and with getting started with cython. I put the code up here if
anyone happens to be interested:

   http://matforge.org/fipy/browser/sandbox/smallMatrixTools/smt

It consists of three files: smallMatrixTools.py, smt.pyx and smt.pxd.
The speed tests comparing against numpy came out something like this...

  N, M, M: 1, 2, 2
  mulinv speed up: 65.9, eigvec speed up: 11.2

  N, M, M: 1, 3, 3
  mulinv speed up: 32.3, eigvec speed up: 7.2

  N, M, M: 1, 4, 4
  mulinv speed up: 24.1, eigvec speed up: 5.9

  N, M, M: 1, 5, 5
  mulinv speed up: 17.0, eigvec speed up: 5.2

for random matrices. Not bad speed ups, but not out of this world. I'm
new to cython so there may be some costly mistakes in the
implementation. I profiled and it seems that most of the time is now
being spent in the lapack routines, but I'm still not completely convinced
by the profiling results. One thing that I know I'm doing wrong is
reassigning every sub-matrix to a new array. This may not be that
costly, but it seems fairly ugly. I wasn't sure how to pass the
address of the submatrix to the lapack routines so I'm assigning to a
new array and passing that instead.

I tested with 
http://matforge.org/fipy/browser/sandbox/smallMatrixTools/smt/test.py
and speed tests were done using
http://matforge.org/fipy/browser/sandbox/smallMatrixTools/smt/speedTest.py.
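
[For readers without the smt package handy, a rough sketch of the kind of
pure-NumPy baseline such a speed comparison measures; the shapes and sizes
here are illustrative assumptions, not the actual speedTest.py:]

import time
import numpy as np

def numpy_inv(A):
    # Baseline: invert N MxM matrices, stacked along the first axis, one at a time.
    return np.array([np.linalg.inv(a) for a in A])

N, M = 10000, 3
A = np.random.rand(N, M, M) + M * np.eye(M)   # diagonal boost keeps the matrices well conditioned

t0 = time.time()
numpy_inv(A)
print("pure-NumPy inv of %d %dx%d matrices: %.3f s" % (N, M, M, time.time() - t0))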

Cheers

On Tue, Jul 12, 2011 at 3:51 AM, Dag Sverre Seljebotn
d.s.seljeb...@astro.uio.no wrote:
 On 07/11/2011 11:01 PM, Daniel Wheeler wrote:
 Hi, I am trying to find the eigenvalues and eigenvectors as well as
 the inverse for a large number of small matrices. The matrix size
 (MxM) will typically range from 2x2 to 8x8 at most. The number of
 matrices (N) can be from 100 up to a million or more. My current
 solution is to define eig and inv to be,

 def inv(A):
     """
     Inverts N MxM matrices, A.shape = (M, M, N), inv(A).shape = (M, M, N).
     """
     return np.array(map(np.linalg.inv, A.transpose(2, 0, 1))).transpose(1, 2, 0)

 def eig(A):
     """
     Calculate the eigenvalues and eigenvectors of N MxM matrices,
     A.shape = (M, M, N), eig(A)[0].shape = (M, N), eig(A)[1].shape = (M, M, N)
     """
     tmp = zip(*map(np.linalg.eig, A.transpose(2, 0, 1)))
     return (np.array(tmp[0]).swapaxes(0, 1), np.array(tmp[1]).transpose(1, 2, 0))

 The above uses map to fake a vector solution, but this is heinously
 slow. Are there any better ways to do this without resorting to cython
 or weave (would it even be faster (or possible) to use np.linalg.eig
 and np.linalg.inv within cython)? I could write specialized versions

 If you want to go the Cython route, here's a start:

 http://www.vetta.org/2009/09/tokyo-a-cython-blas-wrapper-for-fast-matrix-math/

 Dag Sverre
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion




-- 
Daniel Wheeler
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Efficient way to load a 1Gb file?

2011-08-15 Thread Russell E. Owen
In article 
ca+rwobwjyy_abjijnxepkseraeom608uimywffgag-6xdgs...@mail.gmail.com,
 Torgil Svensson torgil.svens...@gmail.com wrote:

 Try the fromiter function, that will allow you to pass an iterator
 which can read the file line by line and not preload the whole file.
 
 file_iterator = iter(open('filename.txt'))
 line_parser = lambda x: map(float,x.split('\t'))
 a=np.fromiter(itertools.imap(line_parser,file_iterator),dtype=float)
 
 You also have the option to iterate over the file twice and pass the
 count argument.
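
[A self-contained variant of that suggestion might look like the sketch below;
the file name and the tab-separated float format are assumptions, and the rows
are flattened because np.fromiter with a plain float dtype expects a stream of
scalars:]

import numpy as np

filename = 'filename.txt'

with open(filename) as f:
    n_rows = sum(1 for _ in f)                 # first pass: count the lines

with open(filename) as f:
    n_cols = len(f.readline().split('\t'))     # number of columns from the first line

def values(fname):
    # Yield one float at a time, row by row.
    with open(fname) as f:
        for line in f:
            for tok in line.split('\t'):
                yield float(tok)

a = np.fromiter(values(filename), dtype=float, count=n_rows * n_cols)
a = a.reshape(n_rows, n_cols)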

Thanks. That sounds great!

-- RUssell

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Segfault for np.lookfor

2011-08-15 Thread Matthew Brett
Hi,

On current trunk, all tests pass but running the (forgive my language)
doctests, I found this:

In [1]: import numpy as np

In [2]: np.__version__
Out[2]: '2.0.0.dev-730b861'

In [3]: np.lookfor('cos')
Segmentation fault

on:

Linux angela 2.6.38-10-generic #46-Ubuntu SMP Tue Jun 28 15:07:17 UTC
2011 x86_64 x86_64 x86_64 GNU/Linux
Ubuntu Natty Python 2.7.1+

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpydoc - latex longtables error

2011-08-15 Thread Matthew Brett
Hi,

On Wed, Aug 10, 2011 at 5:17 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Wed, Aug 10, 2011 at 5:03 PM,  josef.p...@gmail.com wrote:
 On Wed, Aug 10, 2011 at 6:17 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Wed, Aug 10, 2011 at 12:38 PM, Skipper Seabold jsseab...@gmail.com 
 wrote:
 On Wed, Aug 10, 2011 at 3:28 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 I think this one might be for Pauli.

 I've run into an odd problem that seems to be an interaction of
 numpydoc and autosummary and large classes.

 In summary, large classes and numpydoc lead to large tables of class
 methods, and there seems to be an error in the creation of the large
 tables in latex.

 Specifically, if I run 'make latexpdf' with the attached minimal
 sphinx setup, I get a pdflatex error ending thus:

 ...
 l.118 \begin{longtable}{LL}

 and this is because longtable does not accept LL as an argument, but
 needs '|l|l|' (bar - el - bar - el - bar).

 I see in sphinx.writers.latex.py, around line 657, that sphinx knows
 about this in general, and long tables in standard ReST work fine with
 the el-bar arguments passed to longtable.

        if self.table.colspec:
            self.body.append(self.table.colspec)
        else:
            if self.table.has_problematic:
                colwidth = 0.95 / self.table.colcount
                colspec = ('p{%.3f\\linewidth}|' % colwidth) * \
                          self.table.colcount
                self.body.append('{|' + colspec + '}\n')
            elif self.table.longtable:
                self.body.append('{|' + ('l|' * self.table.colcount) + '}\n')
            else:
                self.body.append('{|' + ('L|' * self.table.colcount) + '}\n')

 However, using numpydoc and autosummary (see the conf.py file), what
 seems to happen is that, when we reach the self.table.colspec test at
 the beginning of the snippet above, 'self.table.colspec' is defined:

 In [1]: self.table.colspec
 Out[1]: '{LL}\n'

 and thus the LL gets written as the arg to longtable:

 \begin{longtable}{LL}

 and the pdf build breaks.

 I'm using the numpydoc out of the current numpy source tree.

 At that point I wasn't sure how to proceed with debugging.  Can you
 give any hints?


 It's not a proper fix, but our workaround is to edit the Makefile for
 latex (and latexpdf) to

 https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/docs/Makefile#L94
 https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/docs/make.bat#L121

 to call the script to replace the longtable arguments

 https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/docs/fix_longtable.py

 The workaround itself probably isn't optimal, and I'd be happy to hear
 of a proper fix.

 Thanks - yes - I found your workaround in my explorations, I put in a
 version in our tree too:

 https://github.com/matthew-brett/nipy/blob/latex_build_fixes/tools/fix_longtable.py

  - but I agree it seems much better to get to the root cause.

 When I tried to figure this out, I never found out why the correct
 sphinx longtable code path never gets reached, or which code
 (numpydoc, autosummary or sphinx) is filling in the colspec.

 No - it looked hard to debug.  I established that it required numpydoc
 and autosummary to be enabled.

It looks like this conversation dried up, so I've moved it to a ticket:

http://projects.scipy.org/numpy/ticket/1935
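
[For anyone who just needs the build to pass, a minimal sketch of the kind of
post-processing the workaround scripts above perform, assuming the offending
column spec is the literal '{LL}' reported earlier:]

import sys

def fix_longtable(path):
    # Replace the invalid longtable column spec with one longtable accepts.
    tex = open(path).read()
    tex = tex.replace(r'\begin{longtable}{LL}', r'\begin{longtable}{|l|l|}')
    open(path, 'w').write(tex)

if __name__ == '__main__':
    fix_longtable(sys.argv[1])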

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Segfault for np.lookfor

2011-08-15 Thread Charles R Harris
On Mon, Aug 15, 2011 at 3:53 PM, Matthew Brett matthew.br...@gmail.com wrote:

 Hi,

 On current trunk, all tests pass but running the (forgive my language)
 doctests, I found this:

 In [1]: import numpy as np

 In [2]: np.__version__
 Out[2]: '2.0.0.dev-730b861'

 In [3]: np.lookfor('cos')
 Segmentation fault

 on:

 Linux angela 2.6.38-10-generic #46-Ubuntu SMP Tue Jun 28 15:07:17 UTC
 2011 x86_64 x86_64 x86_64 GNU/Linux
 Ubuntu Natty Python 2.7.1+


The problem is somewhere in print_coercion_tables.py

Chuck
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Segfault for np.lookfor

2011-08-15 Thread Charles R Harris
On Mon, Aug 15, 2011 at 6:56 PM, Charles R Harris charlesr.har...@gmail.com
 wrote:



 On Mon, Aug 15, 2011 at 3:53 PM, Matthew Brett matthew.br...@gmail.com wrote:

 Hi,

 On current trunk, all tests pass but running the (forgive my language)
 doctests, I found this:

 In [1]: import numpy as np

 In [2]: np.__version__
 Out[2]: '2.0.0.dev-730b861'

 In [3]: np.lookfor('cos')
 Segmentation fault

 on:

 Linux angela 2.6.38-10-generic #46-Ubuntu SMP Tue Jun 28 15:07:17 UTC
 2011 x86_64 x86_64 x86_64 GNU/Linux
 Ubuntu Natty Python 2.7.1+


 The problem is somewhere in print_coercion_tables.py


Or more precisely, it is triggered by importing print_coercion_tables.py. I
don't think lookfor should be doing that, but in any case:

array + scalar
+ ? b h i l q p B H I L Q P e f d g F D G S U V O M m
? ? b h i l l l B H I L L L e f d g F D G O O # O ! m
b b b b b b b b b b b b b b e f d g F D G O O # O ! m
h h h h h h h h h h h h h h f f d g F D G O O # O ! m
i i i i i i i i i i i i i i d d d g D D G O O # O ! m
l l l l l l l l l l l l l l d d d g D D G O O # O ! m
q l l l l l l l l l l l l l d d d g D D G O O # O ! m
p l l l l l l l l l l l l l d d d g D D G O O # O ! m
B B B B B B B B B B B B B B e f d g F D G O O # O ! m
H H H H H H H H H H H H H H f f d g F D G O O # O ! m
I I I I I I I I I I I I I I d d d g D D G O O # O ! m
L L L L L L L L L L L L L L d d d g D D G O O # O ! m
Q L L L L L L L L L L L L L d d d g D D G O O # O ! m
P L L L L L L L L L L L L L d d d g D D G O O # O ! m
e e e e e e e e e e e e e e e e e e F F F O O # O ! #
f f f f f f f f f f f f f f f f f f F F F O O # O ! #
d d d d d d d d d d d d d d d d d d D D D O O # O ! #
g g g g g g g g g g g g g g g g g g G G G O O # O ! #
F F F F F F F F F F F F F F F F F F F F F O O # O ! #
D D D D D D D D D D D D D D D D D D D D D O O # O ! #
G G G G G G G G G G G G G G G G G G G G G O O # O ! #
S O O O O O O O O O O O O O O O O O O O O O O # O ! O
U O O O O O O O O O O O O O O O O O O O O O O # O ! O
Segmentation fault (core dumped)

Chuck
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Segfault for np.lookfor

2011-08-15 Thread Charles R Harris
On Mon, Aug 15, 2011 at 7:09 PM, Charles R Harris charlesr.har...@gmail.com
 wrote:



 On Mon, Aug 15, 2011 at 6:56 PM, Charles R Harris 
 charlesr.har...@gmail.com wrote:



 On Mon, Aug 15, 2011 at 3:53 PM, Matthew Brett 
 matthew.br...@gmail.com wrote:

 Hi,

 On current trunk, all tests pass but running the (forgive my language)
 doctests, I found this:

 In [1]: import numpy as np

 In [2]: np.__version__
 Out[2]: '2.0.0.dev-730b861'

 In [3]: np.lookfor('cos')
 Segmentation fault

 on:

 Linux angela 2.6.38-10-generic #46-Ubuntu SMP Tue Jun 28 15:07:17 UTC
 2011 x86_64 x86_64 x86_64 GNU/Linux
 Ubuntu Natty Python 2.7.1+


 The problem is somewhere in print_coercion_tables.py


 Or more precisely, it is triggered by importing print_coercion_tables.py. I
 don't think lookfor should be doing that, but in any case:

 array + scalar
 + ? b h i l q p B H I L Q P e f d g F D G S U V O M m
 ? ? b h i l l l B H I L L L e f d g F D G O O # O ! m
 b b b b b b b b b b b b b b e f d g F D G O O # O ! m
 h h h h h h h h h h h h h h f f d g F D G O O # O ! m
 i i i i i i i i i i i i i i d d d g D D G O O # O ! m
 l l l l l l l l l l l l l l d d d g D D G O O # O ! m
 q l l l l l l l l l l l l l d d d g D D G O O # O ! m
 p l l l l l l l l l l l l l d d d g D D G O O # O ! m
 B B B B B B B B B B B B B B e f d g F D G O O # O ! m
 H H H H H H H H H H H H H H f f d g F D G O O # O ! m
 I I I I I I I I I I I I I I d d d g D D G O O # O ! m
 L L L L L L L L L L L L L L d d d g D D G O O # O ! m
 Q L L L L L L L L L L L L L d d d g D D G O O # O ! m
 P L L L L L L L L L L L L L d d d g D D G O O # O ! m
 e e e e e e e e e e e e e e e e e e F F F O O # O ! #
 f f f f f f f f f f f f f f f f f f F F F O O # O ! #
 d d d d d d d d d d d d d d d d d d D D D O O # O ! #
 g g g g g g g g g g g g g g g g g g G G G O O # O ! #
 F F F F F F F F F F F F F F F F F F F F F O O # O ! #
 D D D D D D D D D D D D D D D D D D D D D O O # O ! #
 G G G G G G G G G G G G G G G G G G G G G O O # O ! #
 S O O O O O O O O O O O O O O O O O O O O O O # O ! O
 U O O O O O O O O O O O O O O O O O O O O O O # O ! O
 Segmentation fault (core dumped)


A quick fix is to put the print statements in a function.

diff --git a/numpy/testing/print_coercion_tables.py b/numpy/testing/print_coercion_tables.py
index d875449..3bc9253 100755
--- a/numpy/testing/print_coercion_tables.py
+++ b/numpy/testing/print_coercion_tables.py
@@ -65,22 +65,23 @@ def print_coercion_table(ntypes, inputfirstvalue, inputsecondvalue, fir
             print char,
         print
 
-print "can cast"
-print_cancast_table(np.typecodes['All'])
-print
-print "In these tables, ValueError is '!', OverflowError is '@', TypeError is '#'"
-print
-print "scalar + scalar"
-print_coercion_table(np.typecodes['All'], 0, 0, False)
-print
-print "scalar + neg scalar"
-print_coercion_table(np.typecodes['All'], 0, -1, False)
-print
-print "array + scalar"
-print_coercion_table(np.typecodes['All'], 0, 0, True)
-print
-print "array + neg scalar"
-print_coercion_table(np.typecodes['All'], 0, -1, True)
-print
-print "promote_types"
-print_coercion_table(np.typecodes['All'], 0, 0, False, True)
+def printem():
+    print "can cast"
+    print_cancast_table(np.typecodes['All'])
+    print
+    print "In these tables, ValueError is '!', OverflowError is '@', TypeError is '#'"
+    print
+    print "scalar + scalar"
+    print_coercion_table(np.typecodes['All'], 0, 0, False)
+    print
+    print "scalar + neg scalar"
+    print_coercion_table(np.typecodes['All'], 0, -1, False)
+    print
+    print "array + scalar"
+    print_coercion_table(np.typecodes['All'], 0, 0, True)
+    print
+    print "array + neg scalar"
+    print_coercion_table(np.typecodes['All'], 0, -1, True)
+    print
+    print "promote_types"
+    print_coercion_table(np.typecodes['All'], 0, 0, False, True)
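
[If a change like the above were applied, importing the module would have no
side effects, and the tables could still be produced on demand, e.g.:]

from numpy.testing.print_coercion_tables import printem
printem()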

Chuck
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] As any array, really any array

2011-08-15 Thread Luis Pedro Coelho
Hello all,

I often find myself writing the following code:

try:
features = np.asanyarray(features)
except:
features = np.asanyarray(features, dtype=object)

I basically want to be able to use fancy indexing on features and, in most
cases, it will be a numpy floating point array. Otherwise, I default to having
it be an array of dtype=object.

Is there a more elegant way to do it with numpy?

Thank you,
-- 
Luis Pedro Coelho | Carnegie Mellon University | http://luispedro.org


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion