from:"David Huard"

Re: [Numpy-discussion] Fortran 90 Library and .mod files numpy.distutils

2014-05-30 Thread David Huard

Hi Onur,

Have you taken a look at https://github.com/numpy/numpy/issues/1350 ? Maybe
both issues are related.

Cheers,

David H.


On Fri, May 30, 2014 at 6:20 AM, Onur Solmaz  wrote:

> Was this mail seen? I cannot be sure because it is the first time I posted.
>
>
>
> On Mon, May 26, 2014 at 2:48 PM, Onur Solmaz  wrote:
>
>> I am building a Fortran 90 library and its extension. .mod files get
>> generated inside the build/temp.linux-x86_64-2.7/ directory, and stay
>> there; so when building the extension, the compiler complains that it
>> cannot find the modules
>> This is because the include paths do not have the temp directory. I can
>> work this around by adding that to the include paths for the extension, but
>> this is not a clean solution.
>> What is the best solution to this?
>>
>> I also want to be able to use the modules later, because I will
>> distribute the library. It is some other issue whether the modules should
>> be distributed with the library under /usr/lib or /usr/include, refer to
>> this <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49138> bug.
>>
>> Also one can refer to this
>> <https://gcc.gnu.org/ml/fortran/2011-06/msg00117.html> thread. This is
>> what convinced me to distribute the modules, rather than putting module
>> definitions into header files, which the user can include in their code to
>> recreate the modules. Yet another way is to use submodules, but that
>> feature is not available in Fortran 90.
>>
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>


-- 
David Huard, PhD
Conseiller scientifique, Ouranos
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] need a better way to fill a grid

2011-01-24 Thread David Huard

Hi John,

Since you have a regular grid, you should be able to find the x and y
indices without np.where, ie something like

I = (lon-grid.outlon0 / grid.dx).astype(int)
J = (lat-grid.outlat0 / grid.dy).astype(int)

for i, j, e in zip(I, J, emissions):
Z[i,j] += e


David

On Mon, Jan 24, 2011 at 8:53 AM, John  wrote:
> Hello,
>
> I'm trying to cycle over some vectors (lat,lon,emissions) of
> irregularly spaced lat/lon spots, and values. I need to sum the values
> each contributing to grid on a regular lat lon grid.
>
> This is what I have presently, but it is too slow. Is there a more
> efficient way to do this? I would prefer not to create an external
> module (f2py, cython) unless there is really no way to make this more
> efficient... it's the looping through the grid I guess that takes so
> long.
>
> Thanks,
> john
>
>
>
>    def grid_emissions(lon,lat,emissions,grid.dx, grid.dy,
> grid.outlat0, grid.outlon0, grid.nxmax, grid.nymax):
>        """ sample the emissions into a grid to fold into model output
>        """
>
>        dx = grid.dxout
>        dy = grid.dyout
>
>        # Generate a regular grid to fill with the sum of emissions
>        xi = np.linspace(grid.outlon0,
> grid.outlon0+(grid.nxmax*grid.d), grid.nxmax)
>        yi = np.linspace(grid.outlat0,
> grid.outlat0+(grid.nymax*grid.dy), grid.nymax)
>
>        X, Y = np.meshgrid(yi, xi)
>        Z = np.zeros(X.shape)
>
>        for i,x in enumerate(xi):
>            for j,y in enumerate(yi):
>                Z[i,j] = np.sum( emissions[\
>                         np.where(((lat>y-dy) & (lat ((lon>x-dx) & (lon
>        return Z
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] read ascii file from complex fortran format() -- genfromtxt

2010-09-21 Thread David Huard

Have you tried

http://code.google.com/p/python-fortranformat/

It's not officially released yet but it's probably worth a try.

David H.

On Tue, Sep 21, 2010 at 8:25 AM, Andrew Jaffe  wrote:

> Hi all,
>
> I've got an ascii file with a relatively complicated structure,
> originally written by fortran with the format:
>
> 135format(a12,1x,2(f10.5,1x),i3,1x,4(f9.3,1x),4(i2,1x),3x,
>  1 16(f7.2,1x),i3,3x,f13.5,1x,f10.5,1x,f10.6,1x,i3,1x,
>  2 4(f10.6,1x),
>  2 i2,1x,f5.2,1x,f10.3,1x,i3,1x,f7.2,1x,f7.2,3x,4(f7.4,1x),
>  3 4(f7.2,1x),3x,f7.2,1x,i4,3x,f10.3,1x,14(f6.2,1x),i3,1x,
>  1  3x,2f10.5,8f11.2,2f10.5,f12.3,3x,
>  4 2(a6,1x),a23,1x,a22,1x,a22)
>
> Note, in particular, that many of the strings contain white space.
>
> Is there a relatively straightforward way to translate this into  dtype
> (and delimiter?) arguments for use with genfromtxt or do I just have to
> do it by hand?
>
> Andrew
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] numpy histogram normed=True (bug / confusing behavior)

2010-08-31 Thread David Huard

On Tue, Aug 31, 2010 at 7:02 AM, Ralf Gommers
wrote:

>
>
> On Tue, Aug 31, 2010 at 3:44 AM, David Huard wrote:
>
>>
>> I just added a warning alerting concerned users (r8674), so this takes
>> care of the bug fix and Nils wish to avoid a silent change in behavior.
>> These two changes could be included in 1.5 if Ralf feels this is
>> worthwhile.
>>
>> That looks like a reasonable solution. I haven't got a strong opinion on
> whether or not to change the 'normed' keyword to 'density'.
>
> Looking at the changes, I don't think that is the right way to do the
> filtering in the tests. resetwarnings() removes all filters including the
> ones previously set by users, and should therefore not be used. Better to
> either raise a specific warning and filter on that, or to filter on the
> message content with:
> warnings.filterwarnings('ignore' , message="This release of NumPy fixes
> a normalization bug in histogram").
> I found one more place where resetwarnings() is used, in
> test_arraysetops.py, I'll change that in trunk. Related problem there is
> that the warning in warnings.warn is not a DeprecationWarning.
>
> The above problem is easy to fix, but in any case it's too late to go into
> 1.5.0 - I'll tag the final release tonight.
>
>
Ralf,

test_function_base and test_arraysetops now do not use resetwarnings. What I
did is added a warning filter and popped it out of the filters list
afterwards. Is this OK ?

In other tests, what is done is rather

  warnings.simplefilter('ignore', DeprecationWarning)
  test_function()
  warnings.simplefilter('default', DeprecationWarning)

but that will also override any user-defined setup, no ?

David


Cheers,
> Ralf
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] numpy histogram normed=True (bug / confusing behavior)

2010-08-30 Thread David Huard

On Mon, Aug 30, 2010 at 3:02 PM,  wrote:

> On Mon, Aug 30, 2010 at 2:43 PM, Benjamin Root  wrote:
> > On Mon, Aug 30, 2010 at 10:50 AM,  wrote:
> >>
> >> On Mon, Aug 30, 2010 at 11:39 AM, Bruce Southey 
> >> wrote:
> >> > On 08/30/2010 09:19 AM, Benjamin Root wrote:
> >> >
> >> > On Mon, Aug 30, 2010 at 8:29 AM, David Huard 
> >> > wrote:
> >> >>
> >> >> Thanks for the feedback,
> >> >> As far as I understand it, the proposition is to keep histogram as it
> >> >> is
> >> >> for 1.5, then in 2.0, deprecate normed=True but keep the buggy
> >> >> behavior,
> >> >> while adding a density keyword that fixes the bug. In a later
> release,
> >> >> we
> >> >> could then get rid of normed. While the bug won't be present in
> >> >> histogramdd
> >> >> and histogram2d, the keyword change should be mirrored in those
> >> >> functions as
> >> >> well.
> >> >> I personally am not too keen on changing the keyword normed for
> >> >> density. I
> >> >> feel we are trading clarity for a few new users against additional
> >> >> trouble
> >> >> for many existing users. We could mitigate this by first documenting
> >> >> the
> >> >> change in the docstring and live with both keywords for a few years
> >> >> before
> >> >> raising a DeprecationWarning.
> >> >> Since this has a direct impact on matloblib's hist, I'd be keen to
> >> >> hears
> >> >> the devs on this.
> >> >> David
> >> >
> >> > I am not a dev, but I would like to give a word of warning from
> >> > matplotlib.
> >> >
> >> > In matplotlib, the bar/hist family of functions grew organically as
> the
> >> > devs
> >> > took on various requests to add keywords and such to modify the style
> >> > and
> >> > behavior of those graphing functions.  It has now become an
> >> > unmaintainable
> >> > mess, prompting discussions on how to rip it out and replace it with a
> >> > cleaner implementation.  While everyone agrees that it needs to be
> done,
> >> > we
> >> > all don't want to break backwards compatibility.
> >> >
> >> > My personal feeling is that a function should do one thing, and do
> that
> >> > one
> >> > thing well.  So, to me, that means that histogram() should return an
> >> > array
> >> > of counts and the bins for those counts.  Anything more is merely
> window
> >> > dressing to me.  With this information, one can easily compute a
> >> > cumulative
> >> > distribution function, and/or normalize the result.  The idea is that
> if
> >> > there is nothing special that needs to be done within the histogram
> >> > algorithm to accommodate these extra features, then they belong
> outside
> >> > the
> >> > function.
> >> >
> >> > My 2 cents,
> >> > Ben Root
> >> >
> >> > ___
> >> > NumPy-Discussion mailing list
> >> > NumPy-Discussion@scipy.org
> >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >> >
> >> > +1 for Ben's approach.
> >> > This is very similar to my view regarding to the contingency table
> class
> >> > proposed for scipy ( http://projects.scipy.org/scipy/ticket/1258). We
> >> > need
> >> > to provide the core functionality that other approaches such as
> density
> >> > estimation can use but not be limited to specific details.
> >>
> >> I think (a corrected) density histogram is core functionality for
> >> unequal bin lengths.
> >>
> >> The graph with raw count in the case of unequal bin sizes would be
> >> quite misleading when plotted and interpreted on the real line and not
> >> on discrete points (shaded areas instead of vertical lines). And as
> >> the origin of this thread showed, it's not trivial to figure out what
> >> the correct normalization is.
> >> So, I think, if we drop the density normalization, we just need a new
> >> function that does it.
> >>
> >> My 2c,
> >>
> >> Josef
> >>
> >>
> >
> > Why

Re: [Numpy-discussion] numpy histogram normed=True (bug / confusing behavior)

2010-08-30 Thread David Huard

I tend to agree with Josef here,

To me, bincount and digitize are the low-level functions, and histogram
contains a bit more functionality since its used so often and for many use
cases. My guess is that if we removed the normalization, it could annoy a
lot of people and would quickly appear on the desired feature list.

Just to put things in perspective, this was indeed a trivial bug that
required a one line fix. It only affected use cases with non-uniform bin
widths and normed=True, a combination that is probably uncommon. I believe
it is a genuine bug, not just a confusing behavior, and that's why I
initially thought a warning was unnecessary.

In any case, I'm not sure this is really a "while we're at it" situation,
that is, I think the switch from "normed" to "density" should be addressed
in another context. That would allow us to include the bug fix (with a
warning) in the upcoming 1.5 release.

David H.

On Mon, Aug 30, 2010 at 11:50 AM,  wrote:

> On Mon, Aug 30, 2010 at 11:39 AM, Bruce Southey 
> wrote:
> > On 08/30/2010 09:19 AM, Benjamin Root wrote:
> >
> > On Mon, Aug 30, 2010 at 8:29 AM, David Huard 
> wrote:
> >>
> >> Thanks for the feedback,
> >> As far as I understand it, the proposition is to keep histogram as it is
> >> for 1.5, then in 2.0, deprecate normed=True but keep the buggy behavior,
> >> while adding a density keyword that fixes the bug. In a later release,
> we
> >> could then get rid of normed. While the bug won't be present in
> histogramdd
> >> and histogram2d, the keyword change should be mirrored in those
> functions as
> >> well.
> >> I personally am not too keen on changing the keyword normed for density.
> I
> >> feel we are trading clarity for a few new users against additional
> trouble
> >> for many existing users. We could mitigate this by first documenting the
> >> change in the docstring and live with both keywords for a few years
> before
> >> raising a DeprecationWarning.
> >> Since this has a direct impact on matloblib's hist, I'd be keen to hears
> >> the devs on this.
> >> David
> >
> > I am not a dev, but I would like to give a word of warning from
> matplotlib.
> >
> > In matplotlib, the bar/hist family of functions grew organically as the
> devs
> > took on various requests to add keywords and such to modify the style and
> > behavior of those graphing functions.  It has now become an
> unmaintainable
> > mess, prompting discussions on how to rip it out and replace it with a
> > cleaner implementation.  While everyone agrees that it needs to be done,
> we
> > all don't want to break backwards compatibility.
> >
> > My personal feeling is that a function should do one thing, and do that
> one
> > thing well.  So, to me, that means that histogram() should return an
> array
> > of counts and the bins for those counts.  Anything more is merely window
> > dressing to me.  With this information, one can easily compute a
> cumulative
> > distribution function, and/or normalize the result.  The idea is that if
> > there is nothing special that needs to be done within the histogram
> > algorithm to accommodate these extra features, then they belong outside
> the
> > function.
> >
> > My 2 cents,
> > Ben Root
> >
> > ___
> > NumPy-Discussion mailing list
> > NumPy-Discussion@scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
> > +1 for Ben's approach.
> > This is very similar to my view regarding to the contingency table class
> > proposed for scipy ( http://projects.scipy.org/scipy/ticket/1258). We
> need
> > to provide the core functionality that other approaches such as density
> > estimation can use but not be limited to specific details.
>
> I think (a corrected) density histogram is core functionality for
> unequal bin lengths.
>
> The graph with raw count in the case of unequal bin sizes would be
> quite misleading when plotted and interpreted on the real line and not
> on discrete points (shaded areas instead of vertical lines). And as
> the origin of this thread showed, it's not trivial to figure out what
> the correct normalization is.
> So, I think, if we drop the density normalization, we just need a new
> function that does it.
>
> My 2c,
>
> Josef
>
>
> >
> > Bruce
> >
> >
> >
> >
> > ___
> > NumPy-Discussion mailing list
> > NumPy-Discussion@scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
> >
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] numpy histogram normed=True (bug / confusing behavior)

2010-08-30 Thread David Huard

Thanks for the feedback,

As far as I understand it, the proposition is to keep histogram as it is for
1.5, then in 2.0, deprecate normed=True but keep the buggy behavior, while
adding a density keyword that fixes the bug. In a later release, we could
then get rid of normed. While the bug won't be present in histogramdd and
histogram2d, the keyword change should be mirrored in those functions as
well.

I personally am not too keen on changing the keyword normed for density. I
feel we are trading clarity for a few new users against additional trouble
for many existing users. We could mitigate this by first documenting the
change in the docstring and live with both keywords for a few years before
raising a DeprecationWarning.

Since this has a direct impact on matloblib's hist, I'd be keen to hears the
devs on this.

David



On Sun, Aug 29, 2010 at 5:06 PM, Sebastian Haase wrote:

> On Sun, Aug 29, 2010 at 3:21 PM, Nils Becker  wrote:
> >> On Sat, Aug 28, 2010 at 04:12, Zbyszek Szmek  wrote:
> >>> Hi,
> >>>
> >>> On Fri, Aug 27, 2010 at 06:43:26PM -0600, Charles R Harris wrote:
> >>>> ? ?On Fri, Aug 27, 2010 at 2:47 PM, Robert Kern <
> robert.k...@gmail.com>
> >>>> ? ?wrote:
> >>>>
> >>>> ? ? ?On Fri, Aug 27, 2010 at 15:32, David Huard <
> david.hu...@gmail.com>
> >>>> ? ? ?wrote:
> >>>> ? ? ?> Nils and Joseph,
> >>>> ? ? ?> Thanks for the bug report, this is now fixed in SVN (r8672).
> >>>>
> >>>> ? ? ?While we're at it, can we change the name of the argument?
> "normed"
> >>>> ? ? ?has caused so much confusion over the years. We could deprecate
> >>>> ? ? ?normed=True in favor of pdf=True or density=True.
> >>> I think it might be a good moment to also include a different type of
> normalization:
> >>> ? ? ? n = n / n.sum()
> >>> i.e. the frequency of counts in each bin. This one is of course very
> simple to calculate
> >>> by hand, but very common. I think it would be useful to have this
> normalization
> >>> available too. [
> http://www.itl.nist.gov/div898/handbook/eda/section3/histogra.htm]
> >>
> >> My feeling is that this is trivial to do "by hand". I do not see a
> >> reason to add an option to histogram() to do this.
> >>
> > Hi,
> >
> > +1 for not silently changing the behavior of normed=True. (I'm one of
> > the people who have worked around it).
> >
> > One argument in favor of putting both normalizing styles 'frequency' and
> > 'density' may be that the documentation will automatically become very
> > clear. A user sees all options and there is little chance of a
> > misunderstanding. Of course, a sentence like "If you want frequency
> > normalization, use histogram(data, normalized=False)/sum(data)" would
> > also make things clear, without adding the frequency option.
> >
> I am in favor of adding an option for the density mode (not for this
> release I guess).
> I often have a long expressing in place of `data` and the one extra
> keyword saves lot's of typing.
>
> -Sebastian Haase
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] numpy histogram normed=True (bug / confusing behavior)

2010-08-27 Thread David Huard

Nils and Joseph,

Thanks for the bug report, this is now fixed in SVN (r8672).

Ralph. is this something that you want to see backported in 1.5 ?

Regards,

David


On Fri, Aug 6, 2010 at 7:49 PM,  wrote:

> On Fri, Aug 6, 2010 at 4:53 PM, Nils Becker  wrote:
> > Hi again,
> >
> > first a correction: I posted
> >
> >> I believe np.histogram(data, bins, normed=True) effectively does :
>  np.histogram(data, bins, normed=False) / (bins[-1] - bins[0]).
> 
>  However, it _should_ do
>  np.histogram(data, bins, normed=False) / bins_widths
> >
> > but there is a normalization missing; it should read
> >
> > I believe np.histogram(data, bins, normed=True) effectively does
> > np.histogram(data, bins, normed=False) / (bins[-1] - bins[0]) /
> data.sum()
> >
> > However, it _should_ do
> > np.histogram(data, bins, normed=False) / bins_widths / data.sum()
> >
> > Bruce Southey replied:
> >> As I recall, there as issues with this aspect.
> >> Please search the discussion regarding histogram especially David
> >> Huard's reply in this thread:
> >> http://thread.gmane.org/gmane.comp.python.numeric.general/22445
> > I think this discussion pertains to a switch in calling conventions
> > which happened at the time. The last reply of D. Huard (to me) seems to
> > say that they did not fix anything in the _old_ semantics, but that the
> > new semantics is expected to work properly.
> >
> > I tried with an infinite bin:
> > counts, dmy = np.histogram([1,2,3,4], [0.5,1.5,np.inf])
> > counts
> > array([1,3])
> > ncounts, dmy = np.histogram([1,2,3,4], [0.5,1.5,np.inf], normed=1)
> > ncounts
> > array([0.,0.])
> >
> > this also does not make a lot of sense to me. A better result would be
> > array([0.25, 0.]), since 25% of the points fall in the first bin; 75%
> > fall in the second but are spread out over an infinite interval, giving
> > 0. This is what my second proposal would give. I cannot find anything
> > wrong with it so far...
>
> I didn't find any different information about the meaning of
> normed=True on the mailing list nor in the trac history
>
>169
>170 if normed:
>171 db = array(np.diff(bins), float)
>172 return n/(n*db).sum(), bins
>
> this does not look like the correct piecewise density with unequal
> binsizes.
>
> Thanks Nils for pointing this out, I tried only equal binsizes for a
> histogram distribution.
>
> Josef
>
>
>
>
>
> >
> > Cheers, Nils
> > ___
> > NumPy-Discussion mailing list
> > NumPy-Discussion@scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] "Dynamic convolution" in Numpy

2010-06-04 Thread David Huard

On Thu, Jun 3, 2010 at 9:52 AM, arthur de conihout <
arthurdeconih...@gmail.com> wrote:

> Hi,
> thanks for your answer
>
> *why don't you compute all possible versions beforehand*
>
>
> thats exactly what i m doing presently cause i m using a 187 filters
> database(azimuth and elevation).I would love to be able to reduce the angle
> thread under 5° which triggers around 1000 files to produce.For a 3Mb
> original sound file, it becomes huge.
>
> Indeed ! I'll be curious to see what solutions ends up working best. Keep
us posted.

David


> Thanks
>
> Arthur
>
>
> 2010/6/3 David Huard 
>
> Hi Arthur,
>>
>> I've no experience whatsoever with what you are doing, but my first
>> thought was why don't you compute all possible versions beforehand and then
>> progressively switch from one version to another by interpolation between
>> the different versions. If the resolution is 15 degrees, there aren't that
>> many versions to compute beforehand.
>>
>> David
>>
>> On Thu, Jun 3, 2010 at 6:49 AM, arthur de conihout <
>> arthurdeconih...@gmail.com> wrote:
>>
>>> Hello everybody
>>>
>>> i m fighting with a dynamic binaural synthesis(can give more hints on it
>>> if necessary).
>>>
>>> i would like to modify the sound playing according to listener's head
>>> position. I got special filters(binaural ones) per head position that i
>>> convolve in real time with a monophonic sound.When the head moves i want to
>>> be able to play the next position version of the stereo generated sound but
>>> from the playing position(bit number in the unpacked datas) of the previous
>>> one.My problem is to hamper audible artefacts due to transition.
>>> At moment i m only using *short audio wav* that i play and repeat if
>>> necessary entirely because my positionning resolution is 15°
>>> degrees.Evolution of head angle position let time for the whole process to
>>> operate (getting position->choosing corresponding filter->convolution->play
>>> sound)
>>>
>>> *For long **audio wav* I could make a fade-in fade-out from the
>>> transition point but i have no idea how to implement it(i m using audiolab
>>> and numpy for convolution)
>>>
>>> An other solution could be dynamic filtering means when i change position
>>> i convolve the next position filter from the place the playing must stop for
>>> the previous one(but won't pratically stop to let the next convolution
>>> operates on enough frames) in accordance with the filter frame length(all
>>> the filters are impulse response of the same lenght 128).
>>>
>>> The "drawing" i introduce just below is my mental representation of what
>>> i m looking to implement, i already apologize for its crapitude (and one of
>>> my brain too):
>>>
>>>
>>> t0_t1__t2__t3___t=len(stimulus)
>>> monophonic sound(time and bit position in the unpacked datas)
>>>
>>> C1C1C1C1C1C1C1C1C1C1C1...
>>> running convolution with filter 1 corresponding to position 1 (ex: angle
>>> from reference=15°)
>>>
>>>P1___
>>>sound playing 1
>>>
>>>^
>>>position 2 detection(angle=30°)
>>>
>>>C2C2C2C2C2C2C2C2C2C2C2...
>>>running convolution with filter 2
>>>
>>>P1_x
>>>keep playing 1 for convolution 2 to operate on enough
>>> frames (latency)
>>>
>>>FIFO
>>>fade in fade out
>>>
>>> P2_
>>> sound playing 2
>>>
>>>
>>> I don't know if i made myself very clear.
>>>
>>> if anyone has suggestions or has already operated a dynamic filtering i
>>> would be well interested.
>>>
>>> Cheers
>>>
>>> Arthur
>>>
>>>
>>> ___
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion@scipy.org
>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>
>>>
>>
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] "Dynamic convolution" in Numpy

2010-06-03 Thread David Huard

Hi Arthur,

I've no experience whatsoever with what you are doing, but my first thought
was why don't you compute all possible versions beforehand and then
progressively switch from one version to another by interpolation between
the different versions. If the resolution is 15 degrees, there aren't that
many versions to compute beforehand.

David

On Thu, Jun 3, 2010 at 6:49 AM, arthur de conihout <
arthurdeconih...@gmail.com> wrote:

> Hello everybody
>
> i m fighting with a dynamic binaural synthesis(can give more hints on it if
> necessary).
>
> i would like to modify the sound playing according to listener's head
> position. I got special filters(binaural ones) per head position that i
> convolve in real time with a monophonic sound.When the head moves i want to
> be able to play the next position version of the stereo generated sound but
> from the playing position(bit number in the unpacked datas) of the previous
> one.My problem is to hamper audible artefacts due to transition.
> At moment i m only using *short audio wav* that i play and repeat if
> necessary entirely because my positionning resolution is 15°
> degrees.Evolution of head angle position let time for the whole process to
> operate (getting position->choosing corresponding filter->convolution->play
> sound)
>
> *For long **audio wav* I could make a fade-in fade-out from the transition
> point but i have no idea how to implement it(i m using audiolab and numpy
> for convolution)
>
> An other solution could be dynamic filtering means when i change position i
> convolve the next position filter from the place the playing must stop for
> the previous one(but won't pratically stop to let the next convolution
> operates on enough frames) in accordance with the filter frame length(all
> the filters are impulse response of the same lenght 128).
>
> The "drawing" i introduce just below is my mental representation of what i
> m looking to implement, i already apologize for its crapitude (and one of my
> brain too):
>
>
> t0_t1__t2__t3___t=len(stimulus)
> monophonic sound(time and bit position in the unpacked datas)
>
> C1C1C1C1C1C1C1C1C1C1C1...
> running convolution with filter 1 corresponding to position 1 (ex: angle
> from reference=15°)
>
>P1___
>sound playing 1
>
>^
>position 2 detection(angle=30°)
>
>C2C2C2C2C2C2C2C2C2C2C2...
>running convolution with filter 2
>
>P1_x
>keep playing 1 for convolution 2 to operate on enough
> frames (latency)
>
>FIFO
>fade in fade out
>
> P2_
> sound playing 2
>
>
> I don't know if i made myself very clear.
>
> if anyone has suggestions or has already operated a dynamic filtering i
> would be well interested.
>
> Cheers
>
> Arthur
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Bug in frompyfunc starting at 10000 elements?

2010-05-26 Thread David Huard

And in 2.0.0.dev8437.

More hints:

Assume has shape (N, Da) and b has shape (N, Db)

* There is a problem wben N >= 1, Db=1 and Da > 1.
* There is no problem when N >= 1, Da=1 and Db > 1.
* The first row is OK, but for all others, there is one error per row,
appearing in first column, then last column, first, etc.

Happy debugging !

David H.


On Fri, May 21, 2010 at 9:22 PM, David Warde-Farley  wrote:
> Confirmed in NumPy 1.4.1, Py 2.6.5.
>
> David
>
> On Fri, 21 May 2010, James Bergstra wrote:
>
>> Hi all, I'm wondering if this is a bug...
>>
>> Something strange happens with my ufunc as soon as I use 1 elements. As
>> the test shows, the ufunc computes the correct result for either the first
>> or last  elements, but both at the same time is no good.
>>
>> Turns out I'm only running numpy 1.3.0 with Python 2.6.4... could someone
>> with a more recent installation maybe check to see if this has been fixed?
>>
>> Thanks,
>>
>> def test_ufunc():
>>    np = numpy
>>
>>    rng = np.random.RandomState(2342)
>>    a = rng.randn(1, 2)
>>    b = rng.randn(1, 1)
>>
>>
>>    f = lambda x,y:x*y
>>    ufunc = np.frompyfunc(lambda *x:numpy.prod(x), 2, 1)
>>
>>    def g(x,y):
>>        return np.asarray(ufunc(x,y), dtype='float64')
>>
>>
>>    assert numpy.allclose(f(a[:-1],b[:-1]), g(a[:-1],b[:-1]))
>>   # PASS
>>    assert numpy.allclose(f(a[1:],b[1:]), g(a[1:],b[1:]))          # PASS
>>    assert numpy.allclose(f(a,b), g(a,b))                             # FAIL
>>
>>
>> --
>> http://www-etud.iro.umontreal.ca/~bergstrj
>>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Aggregate memmap

2010-04-25 Thread David Huard

Hi Matt,

I don't think the memmap code support this. However, you can stack memmaps
just as easily as arrays, so if you define individual memmaps for each slice
and stack them (numpy.vstack), the resulting array will behave as a regular
3D array.

HTH,

David H.



On Wed, Apr 21, 2010 at 3:41 PM, Matthew Turk  wrote:

> Hi there,
>
> I've quite a bit of unformatted fortran data that I'd like to use as
> input to a memmap, as sort of a staging area for selection of
> subregions to be loaded into RAM.  Unfortunately, what I'm running
> into is that the data was output as a set of "slices" through a 3D
> cube, instead of a single 3D cube -- the end result being that each
> slice also contains a record delimiter.  I was wondering if there's a
> way to either specify that every traversal through the least-frequent
> dimension requires an additional offset or to calculate individual
> offsets for each slice myself and then aggregate these into a "super
> memmap."
>
> Thanks for any suggestions you might have!
>
> -Matt
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Sea Ice Concentrations from Nimbus-7 SMMR and DMSP SSM/I Passive Microwave Data

2010-02-26 Thread David Huard

Nicolas,

I've attached a script I used to load the files, metadata and coordinates.

You owe me a donut.

David

On Fri, Feb 26, 2010 at 8:11 AM, Dag Sverre Seljebotn
 wrote:
> Nicolas wrote:
>> Hello
>>
>> a VERY specific question, but I thought someone in the list might have
>> used this dataset before and could give me some advice
>>
>> I am trying to read the daily Antarctic Sea Ice Concentrations from
>> Nimbus-7 SMMR and DMSP SSM/I Passive Microwave Data, as provided by
>> the national snow and ice data center (http://nsidc.org), more
>> specifically, these are the files as downloaded from their ftp site
>> (ftp://sidads.colorado.edu/pub/DATASETS/seaice/polar-stereo/nasateam/final-gsfc/south/daily)
>>
>> they are provided in binary files (e.g. nt_19980101_f13_v01_s.bin for
>> the 1st of Jan. 1998)
>>
>> the metadata information
>> (http://nsidc.org/cgi-bin/get_metadata.pl?id=nsidc-0051) gives the
>> following information (for the polar stereographic projection):
>> """
>> Data are scaled and stored as one-byte integers in flat binary arrays
>>
>> geographical coordinates
>> N: 90°     S: 30.98°     E: 180°     W: -180°
>>
>> Latitude Resolution: 25 km
>> Longitude Resolution: 25 km
>>
>> Distribution Size: 105212 bytes per southern file
>> """
>>
>> I am unfamiliar with non self-documented files (used to hdf and netcdf
>> !) and struggle to make sense of how to read these files and plot the
>> corresponding maps, I've tried using the array module and the
>> fromstring function
>>
>> file=open(filename,'rb')
>> a=array.array('B',file.read())
>> var=numpy.fromstring(a,dtype=np.int)
>>
> Try
>
> numpy.fromfile(f, dtype=np.int8)
>
> or uint8, depending on whether the data is signed or not. What you did
> is also correct except that the final dtype must be np.int8.
>
> (You can always do var = var.astype(np.int) *afterwards* to convert to a
> bigger integer type.)
>
> Dag Sverre
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
"""
Load data from SMMR and SSM/I ICE CONCENTRATION datasets.

See http://nsidc.org/data/polar_stereo/tools.html

The first 300 bytes contain the header. The data is stored as 1 byte integers. 

The number of columns is stored in the header[6:11] and the number of rows in header[12:17]

:author: David Huard 
:date: September 2009

Notes
=
ltln_25n.msk : Latitude and longitude lines, ie crosshairs. 

Cells containing land, coast, the hole at the pole or missing data are 
indicated as follows:
  land : 254
  coast : 253
  hole : 251
  missing1 : 255


Resolution
--
25km : (448, 304)

"""
import numpy as np
import struct, os
from datetime import datetime, timedelta
PATH = '/home/data/seaice/'

def load_meta(filename):
"""Return the meta data about the file."""
meta = {}
f = open(filename, 'r')
header = f.read(126)

meta['filename'] = f.read(24).strip(' \x00')
meta['title'] = f.read(80).strip(' \x00')
meta['info'] = f.read(70).strip(' \x00')

h = [header[i:i+5] for i in range(0,122,6)]
meta['missing'] = int(h[0])
meta['ncols'] =  int(h[1])
meta['nrows'] = int(h[2])
meta['latitude'] = float(h[4])
meta['orientation'] = float(h[5]) # With respect to Greenwich
meta['pole_j'] = int(float(h[7]))
meta['pole_i'] = int(float(h[8]))
meta['instrument'] = h[9]
meta['data descriptors'] = h[10]
meta['julianday_start'] = h[11]
meta['hour_start'] = h[12]
meta['minute_start'] = h[13]
meta['julianday_end'] = h[14]
meta['hour_end'] = int(h[15])
meta['minute_end'] = int(h[16])
meta['year'] = int(h[17])
meta['julianday'] = int(h[18])
meta['channel'] = h[19]
meta['scaling_factor'] = int(h[20])

meta['packing'] = float(h[20])
try:
meta['date_start'] = datetime(int(h[17]), 1, 1, int(h[12]), int(h[13])) + timedelta(days=h[11])
except: 
pass
return meta

def read(filename):
meta = load_meta(filename)
s = (meta['nrows'], meta['ncols'])
x = np.memmap(filename, dtype=np.uint8, mode='r', shape=s, offset=300)
return x

def coords25():
"""Return the longitude and latitude of the stereographic grid for 25 km 
resolution.
"""
s = (448, 304)
lon = np.memmap(os.path.join(PATH, 'tools', 'psn25lons_v2.dat'), dtype='250).round(2))
ax.set_xticklabels([0])
ax.set_yticklabels([0])
cb = plt.colorbar(mappable=im, ticks=np.linspace(0,1,11))
cb.set_clim(vmax=1)
cb.set_label('Sea ice concentration')
return fig
   

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] numpy 2.0, what else to do?

2010-02-15 Thread David Huard

In the list of things to do, I suggest deleting completely the old
histogram behaviour and the `new` keyword.

The `new` keyword argument has raised a deprecation warning since 1.3
and was set for removal in 1.4.

David H.

On Mon, Feb 15, 2010 at 9:21 AM, Robert Kern  wrote:
> On Mon, Feb 15, 2010 at 05:24, David Cournapeau  wrote:
>> David Cournapeau wrote:
>>
>>>
>>> It is always an ABI change, but is mostly backward compatible (which is
>>> neither the case of matplotlib or scipy AFAIK).
>>
>> This sentence does not make any sense: I meant that it is backward
>> compatible from an ABI POV, unless the structure PyArray_Array itself is
>> included in another structure (instead of merely being used).
>>
>> Neither matplotlib or scipy do that AFAIK - the main use-case for that
>> would be to inherit from numpy array at the C level, but I doubt many
>> extensions do that. For people who do C++, that's the same problem as
>> changing a base class, which always break the ABI,
>
> Actually, it's PyArray_Descr, which corresponds to numpy.dtype, that
> has been extended. That has even fewer possible use cases for
> subtyping. I know of none.
>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless
> enigma that is made terrible by our own mad attempt to interpret it as
> though it had an underlying truth."
>  -- Umberto Eco
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?

2010-02-02 Thread David Huard

On Tue, Feb 2, 2010 at 9:23 PM, Neil Martinsen-Burrell  
wrote:
> On 2010-02-02 19:53 , David Cournapeau wrote:
>> Travis Oliphant wrote:
>>
>>> I think we just signal the breakage in 1.4.1 and move forward.   The
>>> datetime is useful as a place-holder for data.  Math on date-time arrays
>>> just doesn't work yet.    I don't think removing it is the right
>>> approach.    It would be better to spend the time on fleshing out the
>>> ufuncs and conversion functions for date-time support.
>>
>> Just so that there is no confusion: it is only about removing it for
>> 1.4.x, not about removing datetime altogether. It seems that datetime in
>> 1.4.x has few users, whereas breaking ABI is a nuisance for many more
>> people. In particular, people who update NumPy 1.4.0 cannot use scipy or
>> matplotlib unless they build it by themselves as well - we are talking
>> about thousand of people at least assuming sourceforge numbers are accurate.
>>
>> More fundamentally though, what is your opinion about ABI ? Am I right
>> to understand you don't consider is as significant ?
>
> In previous discussions about compatibility-breaking, particularly in
> instances where compatibility has been broken, we have heard vociferous
> complaints from users on this list about NumPy's inability to maintain
> compatibility within minor releases.  The silent majority of people who
> just use NumPy and don't follow our development process are not here to
> express their displeasure.  Even the small number of people who are
> reporting import errors after upgrading their NumPy installations should
> be an indication to the developers that this *is* in fact a problem.
>
> I don't understand Travis's comment that "datetime is just a
> place-holder for data".  We have heard from a number of people that the
> current state of the datetime work is not sufficiently advanced to be
> useful for them.

I'd like to clarify this bit since I don't think this is accurate. My
view is that the state of the datetime
code is perfectly acceptable for developers, able to get the source,
compile the code and react appropriately
to the small glitches that inevitably occur with new code.
On the other hand, I don't see the documentation and the functionality
to be ready yet for distribution to a wider audience (read binary
distribution users) who are likely to feel frustration toward
compilation and compatibility issues.
In that sense, the proposition from David C. seems to strike a nice balance.

David


What is the place that needs holding here?  What
> difference does it make if that code is simply developed on a branch
> which will be incorporated into an ABI-breaking x.y release when
> datetime support is at a useful point in its development?  What's the
> particular benefit for NumPy users or developers in including a
> half-working feature in a release?  If we simply want the feature to
> start getting exercised by developers, then we should make a long-lived
> publicly available branch for those who would like to try it out.
> (Insert distributed version control plug here.)
>
> NumPy has become in the past 3-5 years a critical low-level library that
> supports a large number of Python projects.  As a library, the balance
> between compatibility and new features has to shift in favor of
> compatibility.  This is a change from the days when Travis O. owned the
> NumPy source tree and features were added at will (and we are all glad
> that they were added).
>
> As a simple user, I vote in favor of considering 1.4.0 as a buggy
> release of NumPy, removing datetime support (it's just one 4000 line
> commit, right?) and releasing an ABI compatible 1.4.1.  That should
> probably be accompanied by a roadmap hashed out at this year's SciPy
> conference that takes us up through adding datetime, Python 3 and a
> possible major rewrite (that will add the indirection necessary to make
> future ABI breaks unneccessary).
>
> -Neil
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] How to get the shape of an array slice without doing it

2010-01-29 Thread David Huard

For the record, here is what I came up with.

import numpy as np

def expand_ellipsis(index, ndim):
"""Replace the ellipsis, real or implied, of an index expression by slices.

Parameters
--
index : tuple
  Indexing expression.
ndim : int
  Number of dimensions of the array the index applies to.

Return
--
out : tuple
  An indexing expression of length `ndim` where the Elipsis are replaced
  by slices.
"""
n = len(index)
index = index + ndim * (slice(None),)

newindex = []
for i in index:
try:
if i == Ellipsis:
newindex.extend((ndim - n + 1)*(slice(None),))
else:
newindex.append(i)
except:
newindex.append(i)

return newindex[:ndim]

def indexedshape(shape, index):
"""Return the shape of an array sliced by index.

Parameters
--
shape : tuple
  Shape of the original array.
index : tuple
  Indexing sequence.

Return
--
out : tuple
  If array A has shape `shape`, then out = A[index].shape.

Example
---
>>> indexedshape((5,4,3,2), (Ellipsis, 0))
(5,4,3)
>>> indexedshape((5,4,3,2), (slice(None, None, 2), 2, [1,2],
[True, False]))
"""
index = expand_ellipsis(index, len(shape))
out = []
for s, i in zip(shape,index):
if type(i) == slice:
start, stop, stride = i.indices(s)
out.append(int(np.ceil((stop-start)*1./stride)))
elif np.isscalar(i):
pass
elif getattr(i, 'dtype', None) == np.bool:
out.append(i.sum())
else:
out.append(len(i))

return tuple(out)


def test_indexedshape():
from numpy.testing import assert_equal as eq
s = (6,5,4,3)
a = np.empty(s)
i = np.index_exp[::4, 3:, 0, np.array([True, False, True])]
eq(a[i].shape, indexedshape(s, i))

i = np.index_exp[1::4, 3:, np.array([0,1,2]), ::-1]
eq(a[i].shape, indexedshape(s, i))

i = (0,)
eq(a[i].shape, indexedshape(s, i))

i = (3, Ellipsis, 0)
eq(a[i].shape, indexedshape(s, i))

On Fri, Jan 29, 2010 at 1:27 PM,   wrote:
> On Fri, Jan 29, 2010 at 1:03 PM, Keith Goodman  wrote:
>> On Fri, Jan 29, 2010 at 9:53 AM,   wrote:
>>> I forgot about ellipsis, since I never use them,
>>> replace ellipsis by [slice(None)]*ndim or something like this
>>>
>>> I don't know how to access an ellipsis directly, is it even possible
>>> to construct an index list that contains an ellipsis?
>>> There is an object for it but I never looked at it.
>>
>> I haven't been following the discussion and I don't understand your
>> question and in a moment I will accidentally hit send...
>>
 class eli(object):
>>   ...:
>>   ...:         def __init__(self):
>>   ...:             pass
>>   ...:
>>   ...:     def __getitem__(self, index):
>>   ...:             print index
>>   ...:
>>
 x[...]
>> Ellipsis
 x[...,1]
>> (Ellipsis, 1)
>>
>> Ellipsis is a python class. Built in, no need to import.
>
> thanks, this makes it possible to construct index lists with Ellipsis,
> but it showed that my broadcast idea doesn't work this way
>
> Travis explained last year how slices and broadcasting are used for
> indexing, and it's quite a bit more complicated than this.
>
> Sorry for jumping in too fast.
>
> Josef
>
 indi= (slice(2,5),Ellipsis, np.arange(3)[:,None])
 ind2 = []
 for i in indi:
>        if not i is Ellipsis: ind2.append(i)
>        else: ind2.extend([slice(None)]*2)
>
>
 ind2
> [slice(2, 5, None), slice(None, None, None), slice(None, None, None),
> array([[0],
>       [1],
>       [2]])]
 np.broadcast(*ind2).shape
> (3, 1)
>
>
>
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Warning on http://scipy.org/ about binary incompatibility ?

2010-01-29 Thread David Huard

I'm a heavy user of scikits.timeseries so I am very interested in
having native datetime objects in Numpy. However, when I did play with
it about a week ago. I found inconsistencies between the actual code
and the NEP.  The "Example of use" section mostly doesn't work. I
understand the need to put it out there so it gets used, but for the
moment I think potential users are still those who compile from the
dev. tree anyway.

Thanks for all the hard work that has been put into this,

David

On Thu, Jan 28, 2010 at 7:58 PM, David Cournapeau  wrote:
> Charles R Harris wrote:
>>
>>
>> On Thu, Jan 28, 2010 at 12:33 AM, David Cournapeau
>> mailto:da...@silveregg.co.jp>> wrote:
>
>>
>>     Because Travis was against it when it was suggested last september or
>>     so. And removing in 1.4.x a feature introduced in 1.4.0 is weird.
>>
>>
>> But wasn't that decision based on the premiss that the datetime work
>> wouldn't break the ABI?
>
> Well, this and because Travis is the BFDL of NumPy as far as I am
> concerned :) So I think it should be his decision whether to remove it
> or not.
>
>> I don't see anything weird about making 1.4 work
>> with existing binaries.
>
> Indeed, but that's not really what I am saying :) I am saying there is a
> tradeoff between breaking people's code (for people using datetime) and
> keeping a compatible ABI.
>
> So the decision depends quite a bit on how many people use the datetime
> code.
>
>> If we are going to break the ABI, and it looks
>> like we will, then it would be better if the word went out early so that
>> projects that depend on numpy can be prepared for the change. So my
>> preference would be to remove the incompatibility in 1.4 and introduce
>> it in 1.5.
>
> Assuming not many people depend on datetime in 1.4.0, that would be my
> preference as well.
>
> cheers,
>
> David
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] How to get the shape of an array slice without doing it

2010-01-29 Thread David Huard

On Fri, Jan 29, 2010 at 12:10 PM,   wrote:
> On Fri, Jan 29, 2010 at 11:49 AM, David Huard  wrote:
>> Hi,
>>
>> I have a 4D "array" with a given shape, but the array is never
>> actually created since it is large and distributed over multiple
>> binary files. Typical usage would be to take slices across the 4D
>> array.
>>
>> I'd like to know what the shape of the resulting array would be if I
>> took a slice out of it.
>> That is, let's say my 4D array is A, I'd like to know
>>
>> A[ndindex].shape
>>
>> without actually creating A.
>>
>> ndindex should support all numpy constructions (integer, boolean,
>> array, slice, ...). I am guessing something already exists to do this,
>> but I just can't put my finger on it.
>
> trying out some things, just because it's a puzzling question
>
>>>> indi= (slice(2,5), np.arange(2), np.arange(3)[:,None])
>>>> np.broadcast(*indi).shape
> (3, 2)
>
> I don't know if this is ok for all possible cases, (and there are some
> confusing things with reordering axis, when slices and fancy indexing
> is mixed)
>

Hi josef,

Where then do you specify the shape of the A array ?  Maybe an example
would be clearer:

Let's say A's shape is (10, 1, 5, 20)
and the index is [::2, ..., 0]

A[::2, ..., 0] shape would be (5, 1, 5)

The broadcast idea has potential, I'll toy with it.

David



> Josef
>
>
>
>> Thanks.
>>
>> David
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

[Numpy-discussion] How to get the shape of an array slice without doing it

2010-01-29 Thread David Huard

Hi,

I have a 4D "array" with a given shape, but the array is never
actually created since it is large and distributed over multiple
binary files. Typical usage would be to take slices across the 4D
array.

I'd like to know what the shape of the resulting array would be if I
took a slice out of it.
That is, let's say my 4D array is A, I'd like to know

A[ndindex].shape

without actually creating A.

ndindex should support all numpy constructions (integer, boolean,
array, slice, ...). I am guessing something already exists to do this,
but I just can't put my finger on it.

Thanks.

David
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

[Numpy-discussion] Histogram - removing the "new" keyword for 1.4

2009-12-10 Thread David Huard

Hi all,

A long time ago, it was decided to change the default behaviour of the
histogram function. The new behaviour has been the default in 1.3 and usage
of the old behaviour has raised a warning.  According to the timeline
discussed at the time, version 1.4 would be the time to remove the old stuff
entirely. I was wondering if this was satisfactory for all, or if someone
still depends on "new=False" ? I don't think there is any harm in keeping it
around until 1.5, we'd just have to update the docstring to reflect this. Of
course, following the original plan would be better.

I'm sorry to bring this so late in the release cycle.

Cheers,

David Huard
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Convert data into rectangular grid

2009-09-29 Thread David Huard

On Mon, Sep 28, 2009 at 8:45 PM, jah  wrote:

> On Mon, Sep 28, 2009 at 4:48 PM,  wrote:
>
>> On Mon, Sep 28, 2009 at 7:19 PM, jah  wrote:
>> > Hi,
>> >
>> > Suppose I have a set of x,y,c data (something useful for
>> > matplotlib.pyplot.plot() ).  Generally, this data is not rectangular at
>> > all.  Does there exist a numpy function (or set of functions) which will
>> > take this data and construct the smallest two-dimensional arrays X,Y,C (
>> > suitable for matplotlib.pyplot.contour() ).
>> >
>> > Essentially, I want to pass in the data and a grid step size in the x-
>> and
>> > y-directions.  The function would average the c-values for all points
>> which
>> > land in any particular square.  Optionally, I'd like to be able to
>> specify a
>> > value to use when there are no points in x,y which are in the square.
>> >
>> > Hope this makes sense.
>>
>> If I understand correctly  numpy.histogram2d(x, y, ..., weights=c) might
>> do
>> what you want.
>>
>> There was a recent thread on its usage.
>>
>
> It is very close, but it normed=True, will first normalize the weights
> (undesirably) and then it will normalize the normalized weights by dividing
> by the cell area.  Instead, what I want is the cell value to be the average
> off all the points that were placed in the cell.  This seems like a common
> use case, so I'm guessing this functionality is present already.  So if 3
> points with weights [10,20,30] were placed in cell (i,j), then the cell
> should have value 20 (the arithmetic mean of the points placed in the cell).
>
>
Would this work for you ?

>>> s = histogram2d(x,y,weights=c)  # Not normalized, so you get the sum of
the weights
>>> n = histogram2d(x,y) # Now you have the number of elements in each bin
>>> mean = s/n


David


> Here is the desired use case:  I have a set of x,y,c values that I could
> pass into matplotlib's scatter() or hexbin().   I'd like to take this same
> set of points and transform them so that I can pass them into matplotlib's
> contour() function.  Perhaps matplotlib has a function which does this.
>
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] MFDatasets and NetCDF4

2009-09-25 Thread David Huard

Hi George,

On Fri, Sep 25, 2009 at 6:55 AM, George Nurser wrote:

> Hi,
> I hope this is the right place to ask this.
> I've found the MFDataset works well in reading NetCDF3 files, but it
> appears that it doesn't work at present for NetCDF4 files.
>
>
It works on my side for netCDF4 files. What error are you getting ?

> Is this an inherent problem with the NetCDF4 file structure, or would
> it be possible to implement the MFDataset for NetCDF4 files sometime?
> It would be very useful.
>
>
>From the docstring:

 Datasets must be in C{NETCDF4_CLASSIC, NETCDF3_CLASSIC or NETCDF3_64BIT}
format (C{NETCDF4} Datasets won't work).

I suspect your files are not in CLASSIC mode. NETCDF4 datasets are allowed
to have a more complex hierarchy than the CLASSIC mode, and I think this is
what makes concatenation difficult to implement. That is, there would be no
simple rule to determine which fields should be concatenated.

David

> --George Nurser.
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

[Numpy-discussion] Fortran reader for npy files

2009-08-28 Thread David Huard

Hi,

Has someone written a fortran reader for the "npy" binary files numpy.save
creates ?

Thanks,

David
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Faulty behavior of numpy.histogram?

2009-08-12 Thread David Huard

On Wed, Aug 12, 2009 at 3:12 AM, Danny Handoko wrote:

>  Dear all,
>
> We try to use numpy.histogram with combination of matplotlib.  We are using
> numpy 1.3.0, but a somewhat older matplotlib version of 0.91.2.
> Matplotlib's  axes.hist() function calls the numpy.histogram, passing
> through the 'normed' parameter.  However, this version of matplotlib uses
> '0' as the default value of 'normed' (I see it fixed in higher version).
> What I found strange is that if the 'normed' parameter of numpy.histogram is
> set with other object than 'True' or 'False', the output becomes None, but
> no exceptions are raised.  As a result, the matplotlib code that does
> something like this:
>
> >>> n, bins = numpy.histogram([1,2,3], 10, range = None, normed = 0)
> Traceback (most recent call last):
>   File "", line 1, in 
> TypeError: 'NoneType' object is not iterable
> results in the above exception.
>

This is now fixed. Thanks.


>
> Secondly, this matplotlib version also expects both outputs to be of the
> same length, which is no longer true with the new histogram semantics.  This
> can be easily reverted using the parameter 'new = False' in numpy.histogram,
> but this parameter is not available for the caller of axes.hist() function
> in matplotlib.  Is there any way to tell numpy to use the old semantics?
>
>

Could you go in the numpy source code and change the default value for new ?

David


> Upgrading to the newer matplotlib is a rather longer term solution, and we
> hope to be able to find some workaround/short-term solution
>
> Thank you,
>
>
> --
>
> Danny Handoko
>
> System Architecture and Generics
>
> Room 7G2.003 -- ph: x2968
>
> email: danny.hand...@asml.com
>
>
> -- The information contained in this communication and any attachments is
> confidential and may be privileged, and is for the sole use of the intended
> recipient(s). Any unauthorized review, use, disclosure or distribution is
> prohibited. Unless explicitly stated otherwise in the body of this
> communication or the attachment thereto (if any), the information is
> provided on an AS-IS basis without any express or implied warranties or
> liabilities. To the extent you are relying on this information, you are
> doing so at your own risk. If you are not the intended recipient, please
> notify the sender immediately by replying to this message and destroy all
> copies of this message and any attachments. ASML is neither liable for the
> proper and complete transmission of the information contained in this
> communication, nor for any delay in its receipt.
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

[Numpy-discussion] vectorize problem with f2py and gfortran 4.3

2009-08-10 Thread David Huard

Hi all,

A user on the pymc user list has reported a problem with f2py wrapped
fortran functions compiled with gfortran 4.3, which is the standard Ubuntu
Jaunty fortran compiler. I noticed the same bug in some of my own routines.
The problem, as far as I can understand, is that vectorize tries to find the
number of arguments by calling the function with no arguments and parsing
the error message. With numpy 1.3, python 2.6 and gfortran 4.3, the error
message is not what numpy expects, and does not contain the expected number
of arguments. So I am wondering if there is a reliable way to introspect
compiled extensions to provide the number of arguments needed by vectorize ?

Thanks,

David
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Indexing with callables (was: Yorick-like functionality)

2009-05-15 Thread David Huard

Josef,

You're right, you can see it as a moving average. For 1D, correlate(a,
[5,.5]) yields what I expect but does not take an axis keyword. For the 2D
case, I'm rather looking for

>>> ndimage.filters.correlate(b,0.25*np.ones((2,2)))[1:,1:]

So another one-liner... maybe not worth adding to the numpy namespace.

David




On Fri, May 15, 2009 at 4:47 PM,  wrote:

> On Fri, May 15, 2009 at 4:09 PM, David Huard 
> wrote:
> > Pauli and David,
> >
> > Can this indexing syntax do things that are otherwise awkward with the
> > current syntax ? Otherwise, I'm not warm to the idea of making indexing
> more
> > complex than it is.
> >
> > getv : this is useful but it feels a bit redundant with numpy.take. Is
> there
> > a reason why take could not support slices ?
> >
> > Drop_last: I don't think it is worth cluttering the namespace with a one
> > liner.
> >
> > append_one: A generalized stack method with broadcasting capability would
> be
> > more useful in my opinion, eg. ``np.stack(x, 1., axis=1)``
> >
> > zcen: This is indeed useful, particulary in its nd form, that is, when it
> > can be applied to multiples axes to find the center of a 2D or 3D cell in
> > one call. I'm appending the version I use below.
> >
> > Cheers,
> >
> > David
> >
> >
> > # This code is released in the public domain.
> > import numpy as np
> > def __midpoints_1d(a):
> > """Return `a` linearly interpolated at the mid-points."""
> > return (a[:-1] + a[1:])/2.
> >
> > def midpoints(a,  axis=None):
> > """Return `a` linearly interpolated at the mid-points.
> >
> > Parameters
> > --
> > a : array-like
> >   Input array.
> > axis : int or None
> >   Axis along which the interpolation takes place. None stands for all
> > axes.
> >
> > Returns
> > ---
> > out : ndarray
> >   Input array interpolated at the midpoints along the given axis.
> >
> > Examples
> > 
> > >>> a = [1,2,3,4]
> > >>> midpoints(a)
> > array([1.5, 2.5, 3.5])
> > """
> > x = np.asarray(a)
> > if axis is not None:
> > return np.apply_along_axis(__midpoints_1d,  axis, x)
> > else:
> > for i in range(x.ndim):
> > x = midpoints(x,  i)
> > return x
> >
>
> zcen is just a moving average, isn't it? For time series (1d),
> correlate works well, for 2d (nd?), there is
>
> >>> a= np.arange(5)
> >>> b = 1.0*a[:,np.newaxis]*np.arange(4)
> >>> ndimage.filters.correlate(b,0.5*np.ones((2,1)))[1:,1:]
> >>> ndimage.filters.correlate(b,0.5*np.ones((2,1)))[1:,1:]
>
> Josef
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Indexing with callables (was: Yorick-like functionality)

2009-05-15 Thread David Huard

Pauli and David,

Can this indexing syntax do things that are otherwise awkward with the
current syntax ? Otherwise, I'm not warm to the idea of making indexing more
complex than it is.

getv : this is useful but it feels a bit redundant with numpy.take. Is there
a reason why take could not support slices ?

Drop_last: I don't think it is worth cluttering the namespace with a one
liner.

append_one: A generalized stack method with broadcasting capability would be
more useful in my opinion, eg. ``np.stack(x, 1., axis=1)``

zcen: This is indeed useful, particulary in its nd form, that is, when it
can be applied to multiples axes to find the center of a 2D or 3D cell in
one call. I'm appending the version I use below.

Cheers,

David


# This code is released in the public domain.
import numpy as np
def __midpoints_1d(a):
"""Return `a` linearly interpolated at the mid-points."""
return (a[:-1] + a[1:])/2.

def midpoints(a,  axis=None):
"""Return `a` linearly interpolated at the mid-points.

Parameters
--
a : array-like
  Input array.
axis : int or None
  Axis along which the interpolation takes place. None stands for all
axes.

Returns
---
out : ndarray
  Input array interpolated at the midpoints along the given axis.

Examples

>>> a = [1,2,3,4]
>>> midpoints(a)
array([1.5, 2.5, 3.5])
"""
x = np.asarray(a)
if axis is not None:
return np.apply_along_axis(__midpoints_1d,  axis, x)
else:
for i in range(x.ndim):
x = midpoints(x,  i)
return x

On Thu, May 14, 2009 at 6:54 AM, Pauli Virtanen  wrote:

> Wed, 13 May 2009 13:18:45 -0700, David J Strozzi kirjoitti:
> [clip]
> > Many of you probably know of the interpreter yorick by Dave Munro. As a
> > Livermoron, I use it all the time.  There are some built-in functions
> > there, analogous to but above and beyond numpy's sum() and diff(), which
> > are quite useful for common operations on gridded data. Of course one
> > can write their own, but maybe they should be cleanly canonized?
>
> +0 from me for zcen and other, having small functions probably won't hurt
>   much
>
> [clip]
> > Besides zcen, yorick has builtins for "point centering", "un-zone
> > centering," etc.  Also, due to its slick syntax you can give these
> > things as array "indexes":
> >
> > x(zcen), y(dif), z(:,sum,:)
>
> I think you can easily subclass numpy.ndarray to offer the same feature,
> see below. I don't know if we want to add this feature (indexing with
> callables) to the Numpy's fancy indexing itself. Thoughts?
>
> -
>
> import numpy as np
> import inspect
>
> class YNdarray(np.ndarray):
>"""
>A subclass of ndarray that implements Yorick-like indexing with
> functions.
>
>Beware: not adequately tested...
>"""
>
>def __getitem__(self, key_):
>if not isinstance(key_, tuple):
>key = (key_,)
>scalar_key = True
>else:
>key = key_
>scalar_key = False
>
>key = list(key)
>
># expand ellipsis manually
>while Ellipsis in key:
>j = key.index(Ellipsis)
>key[j:j+1] = [slice(None)] * (self.ndim - len(key))
>
># handle reducing or mutating callables
>arr = self
>new_key = []
>real_axis = 0
>for j, v in enumerate(key):
>if callable(v):
>arr2 = self._reduce_axis(arr, v, real_axis)
>new_key.extend([slice(None)] * (arr2.ndim - arr.ndim + 1))
>arr = arr2
>elif v is not None:
>real_axis += 1
>new_key.append(v)
>else:
>new_key.append(v)
>
># final get
>if scalar_key:
>return np.ndarray.__getitem__(arr, new_key[0])
>else:
>return np.ndarray.__getitem__(arr, tuple(new_key))
>
>
>def _reduce_axis(self, arr, func, axis):
>return func(arr, axis=axis)
>
> x = np.arange(2*3*4).reshape(2,3,4).view(YNdarray)
>
> # Now,
>
> assert np.allclose(x[np.sum,...], np.sum(x, axis=0))
> assert np.allclose(x[:,np.sum,:], np.sum(x, axis=1))
> assert np.allclose(x[:,:,np.sum], np.sum(x, axis=2))
> assert np.allclose(x[:,np.sum,None,np.sum],
> x.sum(axis=1).sum(axis=1)[:,None])
>
> def get(v, s, axis=0):
>"""Index `v` with slice `s` along given axis"""
>ix = [slice(None)] * v.ndim
>ix[axis] = s
>return v[ix]
>
> def drop_last(v, axis=0):
>"""Remove one element from given array in given dimension"""
>return get(v, slice(None, -1), axis)
>
> assert np.allclose(x[:,drop_last,:], x[:,:-1,:])
>
> def zcen(v, axis=0):
>return .5*(get(v, slice(None,-1), axis) + get(v, slice(1,None), axis))
>
> assert np.allclose(x[0,1,zcen], .5*(x[0,1,1:] + x[0,1,:-1]))
>
> def append_one(v, axis=0):
>"""Append one element to the given array in given dimension,
>fill with ones"""
>new_shape = list(v.shape)
>n

Re: [Numpy-discussion] hairy optimization problem

2009-05-06 Thread David Huard

Hi Mathew,

You could use Newton's method to optimize for each vi sequentially. If you
have an expression for the jacobian, it's even better.

What I'd do is write a class with a method f(self, x, y) that records the
result of f(x,y) each time it is called. I would  then sample very coarsely
the x,y space where I guess my solutions are. You can then select the x,y
where v1 is maximum as your initial point for Newton's method and iterate
until you converge to the solution for v1. Since during the search for the
optimum your class stores the computed points, your initial guess for v2
should be a bit better than it was for v1, which should speed up the
convergence to the solution for v2, etc.

If you have multiple processors available, you can scatter function
evaluation among them using ipython. It's easier than it looks.

Hope someone comes up with a nicer solution,

David

On Wed, May 6, 2009 at 3:16 PM, Mathew Yeates  wrote:

> I have a function f(x,y) which produces N values [v1,v2,v3  vN]
> where some of the values are None (only found after evaluation)
>
> each evaluation of "f" is expensive and N is large.
> I want N x,y pairs which produce the optimal value in each column.
>
> A brute force approach would be to generate
> [v11,v12,v13,v14 ]
> [v21,v22,v23 ...]
> etc
>
> then locate the maximum of each column.
> This is far too slow ..Any other ideas?
>
>
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] efficient 3d histogram creation

2009-05-05 Thread David Huard

On Mon, May 4, 2009 at 4:18 PM,  wrote:

> On Mon, May 4, 2009 at 4:00 PM, Chris Colbert  wrote:
> > i'll take a look at them over the next few days and see what i can hack
> out.
> >
> > Chris
> >
> > On Mon, May 4, 2009 at 3:18 PM, David Huard 
> wrote:
> >>
> >>
> >> On Mon, May 4, 2009 at 7:00 AM,  wrote:
> >>>
> >>> On Mon, May 4, 2009 at 12:31 AM, Chris Colbert 
> >>> wrote:
> >>> > this actually sort of worked. Thanks for putting me on the right
> track.
> >>> >
> >>> > Here is what I ended up with.
> >>> >
> >>> > this is what I ended up with:
> >>> >
> >>> > def hist3d(imgarray):
> >>> > histarray = N.zeros((16, 16, 16))
> >>> > temp = imgarray.copy()
> >>> > bins = N.arange(0, 257, 16)
> >>> > histarray = N.histogramdd((temp[:,:,0].ravel(),
> >>> > temp[:,:,1].ravel(),
> >>> > temp[:,:,2].ravel()), bins=(bins, bins, bins))[0]
> >>> > return histarray
> >>> >
> >>> > this creates a 3d histogram of rgb image values in the range 0,255
> >>> > using 16
> >>> > bins per component color.
> >>> >
> >>> > on a 640x480 image, it executes in 0.3 seconds vs 4.5 seconds for a
> for
> >>> > loop.
> >>> >
> >>> > not quite framerate, but good enough for prototyping.
> >>> >
> >>>
> >>> I don't think your copy to temp is necessary, and use reshape(-1,3) as
> >>> in the example of Stefan, which will avoid copying the array 3 times.
> >>>
> >>> If you need to gain some more speed, then rewriting histogramdd and
> >>> removing some of the unnecessary checks and calculations looks
> >>> possible.
> >>
> >> Indeed, the strategy used in the histogram function is faster than the
> one
> >> used in the histogramdd case, so porting one to the other should speed
> >> things up.
> >>
> >> David
>
> is searchsorted faster than digitize and bincount ?
>

That depends on the number of bins and whether or not the bin width is
uniform. A 1D benchmark I did a while ago showed that if the bin width is
uniform, then the best strategy is to create a counter initialized to 0,
loop through the data, compute i = (x-bin0) /binwidth and increment counter
i by 1 (or by the weight of the data). If the bins are non uniform, then for
nbin > 30 you'd better use searchsort, and digitize otherwise.

For those interested in speeding up histogram code, I recommend reading a
thread started by Cameron Walsh on the 12/12/06 named "Histograms of
extremely large data sets" Code and benchmarks were posted.

Chris, if your bins all have the same width, then you can certainly write an
histogramdd routine that is way faster by using the indexing trick instead
of digitize or searchsort.

Cheers,

David




>
> Using the idea of histogramdd, I get a bit below a tenth of a second,
> my best for this problem is below.
> I was trying for a while what the fastest way is to convert a two
> dimensional array into a one dimensional index for bincount. I found
> that using the return index of unique1d is very slow compared to
> numeric index calculation.
>
> Josef
>
> example timed for:
> nobs = 307200
> nbins = 16
> factors = np.random.randint(256,size=(nobs,3)).copy()
> factors2 = factors.reshape(-1,480,3).copy()
>
> def hist3(factorsin, nbins):
>if factorsin.ndim != 2:
>factors = factorsin.reshape(-1,factorsin.shape[-1])
>else:
>factors = factorsin
>N, D = factors.shape
>darr = np.empty(factors.T.shape, dtype=int)
>nele = np.max(factors)+1
>bins = np.arange(0, nele, nele/nbins)
>bins[-1] += 1
>for i in range(D):
>darr[i] = np.digitize(factors[:,i],bins) - 1
>
>#add weighted rows
>darrind = darr[D-1]
>for i in range(D-1):
>darrind += darr[i]*nbins**(D-i-1)
>return np.bincount(darrind)  # return flat not reshaped
>

> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] efficient 3d histogram creation

2009-05-04 Thread David Huard

On Mon, May 4, 2009 at 7:00 AM,  wrote:

> On Mon, May 4, 2009 at 12:31 AM, Chris Colbert 
> wrote:
> > this actually sort of worked. Thanks for putting me on the right track.
> >
> > Here is what I ended up with.
> >
> > this is what I ended up with:
> >
> > def hist3d(imgarray):
> > histarray = N.zeros((16, 16, 16))
> > temp = imgarray.copy()
> > bins = N.arange(0, 257, 16)
> > histarray = N.histogramdd((temp[:,:,0].ravel(), temp[:,:,1].ravel(),
> > temp[:,:,2].ravel()), bins=(bins, bins, bins))[0]
> > return histarray
> >
> > this creates a 3d histogram of rgb image values in the range 0,255 using
> 16
> > bins per component color.
> >
> > on a 640x480 image, it executes in 0.3 seconds vs 4.5 seconds for a for
> > loop.
> >
> > not quite framerate, but good enough for prototyping.
> >
>
> I don't think your copy to temp is necessary, and use reshape(-1,3) as
> in the example of Stefan, which will avoid copying the array 3 times.
>
> If you need to gain some more speed, then rewriting histogramdd and
> removing some of the unnecessary checks and calculations looks
> possible.


Indeed, the strategy used in the histogram function is faster than the one
used in the histogramdd case, so porting one to the other should speed
things up.

David


>
> Josef
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Changes and new workflow on Trac

2009-03-10 Thread David Huard

Plain old firefox 3.0.6 on fedora 9.

On Tue, Mar 10, 2009 at 4:11 PM, Charles R Harris  wrote:

>
>
> On Tue, Mar 10, 2009 at 2:07 PM, Stéfan van der Walt wrote:
>
>> 2009/3/10 David Huard :
>> > but, if I try to login, I get the same error again. I tried to reset the
>> > password, register under a new name, but I always get the following
>> message:
>> >
>> > The browser has stopped trying to retrieve the requested item. The site
>> is
>> > redirecting the request in a way that will never complete.
>>
>> Does anyone else see the behaviour David is describing?
>>
>
> I don't. David, what browser are you using?
>
> Chuck
>
>
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Changes and new workflow on Trac

2009-03-10 Thread David Huard

On Tue, Mar 10, 2009 at 1:28 PM, David Huard  wrote:

>
>
> On Tue, Mar 10, 2009 at 9:44 AM, Stéfan van der Walt wrote:
>
>> Hi David
>>
>> 2009/3/10 David Huard :
>> > Stefan,
>> >
>> > The SciPy site is really nice, but the NumPy site returns a Page Load
>> Error.
>>
>> Which page are you referring to?
>>
>> http://projects.scipy.org/numpy
>>
>> seems to work fine.
>>
>
> Yes, this one. I deleted the cookies for the page and then it worked.
>
>
>> but, if I try to login, I get the same error again. I tried to reset the
password, register under a new name, but I always get the following message:

The browser has stopped trying to retrieve the requested item. The site is
redirecting the request in a way that will never complete.

Thanks,

David


>
>> Cheers
>> Stéfan
>> ___
>> Numpy-discussion mailing list
>> Numpy-discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Changes and new workflow on Trac

2009-03-10 Thread David Huard

On Tue, Mar 10, 2009 at 9:44 AM, Stéfan van der Walt wrote:

> Hi David
>
> 2009/3/10 David Huard :
> > Stefan,
> >
> > The SciPy site is really nice, but the NumPy site returns a Page Load
> Error.
>
> Which page are you referring to?
>
> http://projects.scipy.org/numpy
>
> seems to work fine.
>

Yes, this one. I deleted the cookies for the page and then it worked.


>
> Cheers
> Stéfan
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Changes and new workflow on Trac

2009-03-10 Thread David Huard

Stefan,

The SciPy site is really nice, but the NumPy site returns a Page Load Error.


David




On Mon, Mar 9, 2009 at 3:35 AM, Stéfan van der Walt wrote:

> Hi all,
>
> Here is an outline of recent changes made to the Trac system.
>
> I have modified the ticket workflow on
> projects.scipy.org/{numpy,scipy}to
>  accommodate patch review (see
> http://mentat.za.net/refer/workflow.png).  I hope this facility will
> make it easier to contribute, and I would like to have your
> feedback/suggestions.
>
> Instructions to contributers:
>
>  * [http://projects.scipy.org/numpy/newticket Contribute a patch] or
> file a bug report
>  * [http://docs.scipy.org Write documentation]
>  * [http://projects.scipy.org/numpy/report/12 Review patches] available
>  * [http://projects.scipy.org/numpy/report/13 Apply reviewed patches]
>
> The last two are new items.
>
> A ticket can be marked "needs_review", whereafter it can be changed to
> "review_positive" or "needs_work".  Also, a "design decision needed"
> state is provided for stalled tickets.
>
> Other changes:
>
> To simplify ticket structure, "severity" was removed ("priority"
> should be used instead).  Furthermore, tickets are no longer
> "accepted", but simply "assigned".  You can still assign
> tickets to yourself.
>
> Source repository:
>
> A git repository is available on http://projects.scipy.org/git and
> http://projects.scipy.org/git/{numpy,scipy}.
>  This repository can be
> browsed from Trac by clicking on the "Git Repo" button, or at
>
> http://projects.scipy.org/{numpy,scipy}/browse_git
>
> It can be cloned from
>
> http://projects.scipy.org/numpy.git
>
> Pauli installed the necessary SVN post-commit hooks to ensure that the
> git repository is always up to date.
>
> Ticket mailing lists:
>
> Trac tries to send out e-mails, but only a handful are going through.
> We are investigating the problem.
>
> Comments and suggestions are very welcome!  Thank you to David, Peter
> and Pauli for all their hard work on the server setup during the past
> week.
>
> Regards
> Stéfan
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] GMRES internal variables

2009-02-17 Thread David Huard

Nathan,

First of all, thanks to all your work on the sparse linear algebra package,
I am starting to use it and it's much appreciated.

Just a thought: wouldn't it be more natural to write gmres as a class rather
than a function ? That way, accessing the internal work arrays for reuse
would be much easier. These are useful when using gmres in a inexact newton
loop.

Cheers,

David




On Tue, Feb 17, 2009 at 2:15 AM, Nathan Bell  wrote:

> On Mon, Feb 16, 2009 at 10:49 PM,   wrote:
> >
> > I'm trying to run multiple instances of GMRES at the same time (one
> inside
> > another actually, used inside of the preconditioner routine) but am
> > running in to some problems. My guess is there is a single set of
> internal
> > variables associated with GMRES and when I initiate a new GMRES inside a
> > currently running GMRES, the old GMRES data gets the boot.
> >
> > Is this correct, and if so any suggestions on how to fix this?
> >
>
> This recently came up on SciPy-User:
> http://thread.gmane.org/gmane.comp.python.scientific.user/19197/focus=19206
>
> One solution:  PyAMG's GMRES implementation (pyamg.krylov.gmres)
> should be a drop-in replacement for SciPy's gmres().  Unlike the CG
> code mentioned above, you can't use gmres.py without some other
> compiled components (i.e. you'll need the whole pyamg package).
>
> We should have this resolved in the next scipy release (either 0.7.x or
> 0.8).
>
> --
> Nathan Bell wnb...@gmail.com
> http://graphics.cs.uiuc.edu/~wnbell/
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Bugs in histogram and matplotlib-hist

2008-11-12 Thread David Huard

On Wed, Nov 12, 2008 at 1:27 PM, Mike Ressler <[EMAIL PROTECTED]>wrote:

> On Wed, Nov 12, 2008 at 4:30 AM, Scott Sinclair <[EMAIL PROTECTED]>
> wrote:
> >> "Mike Ressler" <[EMAIL PROTECTED]> 11/12/08 1:19 AM
> >> I did an update to a Fedora 9 workstation yesterday that included
> >> updating numpy to 1.2.0 and matplotlib 0.98.3 (python version is
> >
> > They reported that the Fedora 9 matplotlib package is 0.91.4, which
> doesn't work with numpy-1.2.0. Perhaps the matplotlib on your system isn't
> what you expect?
>
> Argh! The one thing I didn't doublecheck before posting - you are
> correct, Scott; the Fedora box has 0.91.4; the machines I personally
> use more regularly have 0.98.3 on Archlinux. I'll see what can be done
> with package updating before I start patching. Thanks for pointing
> this out.


Mike, before patching, please take a look at the tickets related to
histogram on the numpy trac. Previously, histogram returned only used the
left bin edges and it caused a lot of problems with outliers and
normalization. We are not going back there.

Cheers,

David




>
>
> Mike
>
> --
> [EMAIL PROTECTED]
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

[Numpy-discussion] Changes to histogram semantics: follow-up

2008-11-12 Thread David Huard

NumPy users,

Revision 6020 proceeds with the planned changes to histogram semantics for
the 1.3 release.

This modification brings no change in functionality, only changes in the
warnings being raised:

  No warning is printed for the default behaviour (new=None).

  new=False now raises a DeprecationWarning. Users relying on the old
behaviour are encouraged to switch to the new semantics.

  new=True warns users that the `new` keyword will disappear in 1.4


Regards,

David Huard
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Apply a vector function to each row of a matrix

2008-10-10 Thread David Huard

On Thu, Oct 9, 2008 at 2:48 PM, Neal Becker <[EMAIL PROTECTED]> wrote:

> David Huard wrote:
>
> > On Thu, Oct 9, 2008 at 9:40 AM, Neal Becker <[EMAIL PROTECTED]> wrote:
> >
> >> David Huard wrote:
> >>
> >> > Neal,
> >> >
> >> > Look at: apply_along_axis
> >> >
> >> >
> >> I guess it'd be:
> >>
> >> b = empty_like(a)
> >> for row in a.shape[0]:
> >>  b[row,:] = apply_along_axis (func, row, a)
> >>
> >
> >> I don't suppose there is a way to do this without explicitly writing a
> >> loop.
> >
> >
> > Have you tried
> >
> > b = apply_along_axis(func, 1, a)
> >
> > It should work.
> >
> Yes, thanks.
>
> The doc for apply_along_axis is not clear.
>

> For one thing, it says:
> The output array. The shape of outarr depends on the return value of
> func1d. If it returns arrays with the same shape as the input arrays it
> receives, outarr has the same shape as arr.
>
> What happens if the 'if' clause is not true?
>

The shape along the axis is determined by the result's shape of your
function.

def func(x):
   ...: return x[::2]
   ...:

> [3]: a = random.rand(3,4)

> [4]: a
 <[4]:
array([[ 0.95979758,  0.37350614,  0.77423741,  0.62520089],
   [ 0.69060211,  0.91480227,  0.60105525,  0.20184552],
   [ 0.31540644,  0.19919848,  0.72567385,  0.63987393]])

> [5]: apply_along_axis(func, 1, a)
 <[5]:
array([[ 0.95979758,  0.77423741],
   [ 0.69060211,  0.60105525],
   [ 0.31540644,  0.72567385]])

I've edited the docstring at
http://sd-2116.dedibox.fr/pydocweb/doc/numpy.lib.shape_base.apply_along_axis/

Feel free to improve on it.

David


>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Apply a vector function to each row of a matrix

2008-10-09 Thread David Huard

On Thu, Oct 9, 2008 at 9:40 AM, Neal Becker <[EMAIL PROTECTED]> wrote:

> David Huard wrote:
>
> > Neal,
> >
> > Look at: apply_along_axis
> >
> >
> I guess it'd be:
>
> b = empty_like(a)
> for row in a.shape[0]:
>  b[row,:] = apply_along_axis (func, row, a)
>

> I don't suppose there is a way to do this without explicitly writing a
> loop.


Have you tried

b = apply_along_axis(func, 1, a)

It should work.


>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Apply a vector function to each row of a matrix

2008-10-09 Thread David Huard

Neal,

Look at: apply_along_axis


David

On Thu, Oct 9, 2008 at 8:04 AM, Neal Becker <[EMAIL PROTECTED]> wrote:

> Suppose I have a function (I wrote in c++) that accepts a numpy 1-d vector.
>  What is the recommended way to apply it to each row of a matrix, returning
> a new matrix result?  (Assume the function has signature newvec = f
> (oldvec))
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Help to process a large data file

2008-10-03 Thread David Huard

Frank, On Thu, Oct 2, 2008 at 3:20 PM, frank wang <[EMAIL PROTECTED]> wrote:

>
> Thans David and Chris for providing the nice solution.
>
>

Glad it helped.


> Both method works gread. I could not tell the speed difference between the
> two solutions. My data size is 1048577 lines.
>

I'd be curious to know what happens for larger files (~ 10 M lines). I'd
guess Chris solution would be the fastest since it works incrementally and
does not load the entire data in memory.  If you ever try, I'll be
interested to know how it turns out.

David


> I did not try the second solution from Chris since it is too slow as Chris
> stated.
>
> Frank
>
>
> > Date: Thu, 2 Oct 2008 17:43:37 +0200
> > From: [EMAIL PROTECTED]
> > To: numpy-discussion@scipy.org
> > CC: [EMAIL PROTECTED]
> > Subject: Re: [Numpy-discussion] Help to process a large data file
>
> >
> > Frank,
> >
> > I would imagine that you cannot get a much better performance in python
> > than this, which avoids string conversions:
> >
> > c = []
> > count = 0
> > for line in open('foo'):
> > if line == '1 1\n':
> > c.append(count)
> > count = 0
> > else:
> > if '1' in line: count += 1
> >
> > One could do some numpy trick like:
> >
> > a = np.loadtxt('foo',dtype=int)
> > a = np.sum(a,axis=1) # Add the two columns horizontally
> > b = np.where(a==2)[0] # Find with sum == 2 (1 + 1)
> > count = []
> > for i,j in zip(b[:-1],b[1:]):
> > count.append( a[i+1:j].sum() ) # Calculate number of lines with 1
> >
> > but on my machine the numpy version takes about 20 sec for a 'foo' file
> > of 2,500,000 lines versus 1.2 sec for the pure python version...
> >
> > As a side note, if i replace "line == '1 1\n'" with "line.startswith('1
> > 1')", the pure python version goes up to 1.8 sec... Isn't this a bit
> > weird, i'd think startswith() should be faster...
> >
> > Chris
> >
> > On Wed, Oct 01, 2008 at 07:27:27PM -0600, frank wang wrote:
> >
> > > Hi,
> > >
> > > I have a large data file which contains 2 columns of data. The two
> > > columns only have zero and one. Now I want to cound how many one in
> > > between if both columns are one. For example, if my data is:
> > >
> > > 1 0
> > > 0 0
> > > 1 1
> > > 0 0
> > > 0 1 x
> > > 0 1 x
> > > 0 0
> > > 0 1 x
> > > 1 1
> > > 0 0
> > > 0 1 x
> > > 0 1 x
> > > 1 1
> > >
> > > Then my count will be 3 and 2 (the numbers with x).
> > >
> > > Are there an efficient way to do this? My data file is pretty big.
> > >
> > > Thanks
> > >
> > > Frank
> > ___
> > Numpy-discussion mailing list
> > Numpy-discussion@scipy.org
> > http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
>
> --
> See how Windows connects the people, information, and fun that are part of
> your life. See 
> Now
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Help to process a large data file

2008-10-02 Thread David Huard

Frank,

How about that:

x = np.loadtxt('file')

z = x.sum(1)   # Reduce data to an array of 0,1,2

rz = z[z>0]   # Remove all 0s since you don't want to count those.

loc = np.where(rz==2)[0]  # The location of the (1,1)s

count = np.diff(loc) - 1  # The spacing between those (1,1)s, ie, the number
of elements that have one 1.


HTH,

David


On Wed, Oct 1, 2008 at 9:27 PM, frank wang <[EMAIL PROTECTED]> wrote:

>  Hi,
>
> I have a large data file which contains 2 columns of data. The two columns
> only have zero and one. Now I want to cound how many one in between if both
> columns are one. For example, if my data is:
>
> 1 0
> 0 0
> 1 1
> 0 0
> 0 1x
> 0 1x
> 0 0
> 0 1x
> 1 1
> 0 0
> 0 1x
> 0 1x
> 1 1
>
> Then my count will be 3 and 2 (the numbers with x).
>
> Are there an efficient way to do this? My data file is pretty big.
>
> Thanks
>
> Frank
>
> --
> See how Windows connects the people, information, and fun that are part of
> your life. See 
> Now
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Upper and lower envelopes

2008-09-30 Thread David Huard

On Tue, Sep 30, 2008 at 4:37 PM, Anne Archibald
<[EMAIL PROTECTED]>wrote:

> 2008/9/30 bevan <[EMAIL PROTECTED]>:
> > Hello,
> >
> > I have some XY data.  I would like to generate the equations for an upper
> and
> > lower envelope that excludes a percentage of the data points.
> >
> > I would like to define the slope of the envelope line (say 3) and then
> have my
> > code find the intercept that fits my requirements (say 5% of data below
> the
> > lower envelope).  This would then give me the equation and I could plot
> the
> > upper and lower envelopes.
> >
> >
> > I hope this makes sense.  Thanks for any help.
>
> For this particular problem - where you know the slope - it's not too
> hard. If the slope is b, and your points are x and y, compute y-b*x,
> then sort that array, and choose the 5th and 95th percentile values.
>

That's a pretty elegant solution.

Thanks for sharing,

David

>
> Anne
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Upper and lower envelopes

2008-09-30 Thread David Huard

Bevan,

You can estimate the intercept and slope using least-squares
(scipy.optimize.leastsq). Make sure though that errors in X are small
compared to errors in Y, otherwise, your slope will be underestimated.

Using the slope, you can write a function lower(b,a, X,Y) that will compute
y=aX+b and return True if Y < y. Computing the ratio of true elements will
give you the percentage of points below the curve. You can then find b such
that the ratio is .5 and .95 using scipy.optimize.fmin.

There are other ways to do this;

Make a 2D histogram of the data (normed), compute the cumulative sum along Y
and find the histogram bins (along x) such that the cumulative histogram is
approximately equal to .5 and .95.

Partition the data in N sets along the x-axis, fit a normal distribution to
each set and compute the quantile corresponding to .5 and .95 cumulative
probability density.

David

By the way, anonymous mails from newcomers don't get as much attention as
those that are signed. Call it mailing list etiquette.

On Tue, Sep 30, 2008 at 5:06 AM, bevan <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I have some XY data.  I would like to generate the equations for an upper
> and
> lower envelope that excludes a percentage of the data points.
>
> I would like to define the slope of the envelope line (say 3) and then have
> my
> code find the intercept that fits my requirements (say 5% of data below the
> lower envelope).  This would then give me the equation and I could plot the
> upper and lower envelopes.
>
>
> I hope this makes sense.  Thanks for any help.
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] loadtxt error

2008-09-24 Thread David Huard

Note that the fix was also backported to 1.2, for which binary builds are
available:

David

[ copied from a recent thread ]

The 1.2.0rc2 is now available:
http://svn.scipy.org/svn/numpy/tags/1.2.0rc2

The source tarball is here:
https://cirl.berkeley.edu/numpy/numpy-1.2.0rc2.tar.gz

Here is the universal Mac binary:
https://cirl.berkeley.edu/numpy/numpy-1.2.0rc2-py2.5-macosx10.5.dmg

Here are the Window's binaries:
http://www.ar.media.kyoto-u.ac.jp/members/david/archives/numpy/numpy-1.2
.0rc2-win32-superpack-python2.4.exe
http://www.ar.media.kyoto-u.ac.jp/members/david/archives/numpy/numpy-1.2
.0rc2-win32-superpack-python2.5.exe

Here are the release notes:
http://scipy.org/scipy/numpy/milestone/1.2.0





On Wed, Sep 24, 2008 at 2:36 AM, Joshua Lippai <[EMAIL PROTECTED]> wrote:

> David said the bug was fixed in the trunk, which you don't have; the
> development state of the main source code, or the trunk, is the latest
> state available, and releases are always a bit behind trunk since it
> would be kind of ridiculous to make a new release every time someone
> commits a change. You can download the current NumPy source code via
> subversion with
>
> svn co http://svn.scipy.org/svn/numpy/trunk numpy
>
> While the most recent release version number is 1.1.1, the trunk is at
> something more along the lines of 1.3.0dev5861 at the moment. You can
> check your numpy version from the Python shell by importing numpy and
> then calling the __version__ function. So you would type:
>
> import numpy
> numpy.__version__
>
> And the version would be displayed on screen. Bear in mind that unlike
> the release, which installs via an installer file you double click,
> you will have to compile numpy from the downloaded source code
> yourself. Detailed instructions for doing this (and also installing
> SciPy from source) are available here:
>
> http://www.scipy.org/Installing_SciPy/Windows
>
>
> Josh
>
> On Tue, Sep 23, 2008 at 10:14 PM, frank wang <[EMAIL PROTECTED]> wrote:
> > My numpy version is 1.1.1. I just downloaded and installed. It is the
> same
> > result. Also, when I use the list, I got similar error said list does not
> > have the find command.
> >
> > ---> 14
> > fid=loadtxt(fname,comments='"',dtype='|S4',converters={cols:lambda s
> > :int(s,16)},usecols=[cols])
> >  15
> >  16
> > #fid=loadtxt('ww36_5adcoutputsin45mhznotuner-0dbm_mux_adc_ddc_rmR.cs
> > v',comments='"',dtype='string',usecols=(0,))
> > C:\Python25\lib\site-packages\numpy\lib\io.pyc in loadtxt(fname, dtype,
> > comments
> > , delimiter, converters, skiprows, usecols, unpack)
> > 337 for i, conv in (user_converters or {}).iteritems():
> > 338 if usecols:
> > --> 339 i = usecols.find(i)
> > 340 converters[i] = conv
> > 341
> > AttributeError: 'list' object has no attribute 'find'
> >> c:\python25\lib\site-packages\numpy\lib\io.py(339)loadtxt()
> > 338 if usecols:
> > --> 339 i = usecols.find(i)
> > 340 converters[i] = conv
> >
> >
> > Thanks
> >
> > Frank
> >
> >> From: [EMAIL PROTECTED]
> >> To: numpy-discussion@scipy.org
> >> Date: Mon, 22 Sep 2008 20:10:11 -0400
> >> Subject: Re: [Numpy-discussion] loadtxt error
> >>
> >> On Monday 22 September 2008 19:56:47 frank wang wrote:
> >> > This error is caused that the usecols is a tuple and it does not have
> >> > find
> >> > command. I do not know how to fix this problem.
> >>
> >> Try to use a list instead of a tuple as a quick-fix.
> >> Anyway, Frank, you should try to give us the version of numpy you're
> >> using.
> >> Obviously, it's not the latest.
> >> ___
> >> Numpy-discussion mailing list
> >> Numpy-discussion@scipy.org
> >> http://projects.scipy.org/mailman/listinfo/numpy-discussion
> >
> >
> > 
> > See how Windows connects the people, information, and fun that are part
> of
> > your life. See Now
> > ___
> > Numpy-discussion mailing list
> > Numpy-discussion@scipy.org
> > http://projects.scipy.org/mailman/listinfo/numpy-discussion
> >
> >
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] loadtxt error

2008-09-23 Thread David Huard

This bug has been fixed in the trunk a couple of weeks ago.

Cheers,

David

On Mon, Sep 22, 2008 at 8:10 PM, Pierre GM <[EMAIL PROTECTED]> wrote:

> On Monday 22 September 2008 19:56:47 frank wang wrote:
> > This error is caused that the usecols is a tuple and it does not have
> find
> > command. I do not know how to fix this problem.
>
> Try to use a list instead of a tuple as a quick-fix.
> Anyway, Frank, you should try to give us the version of numpy you're using.
> Obviously, it's not the latest.
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Help speeding up element-wise operations for video processing

2008-09-16 Thread David Huard

Brendan,

Not sure if I understand correctly what you want, but ...

Numpy vector operations are performed in C, so there will be an iteration
over the array elements.

For parallel operations over all pixels, you'd need a package that talks to
your GPU, such as pyGPU.
I've never tried it and if you do, please report your experience, I'd be
very interested to hear about it.

HTH,

David




On Tue, Sep 16, 2008 at 4:50 AM, Stéfan van der Walt <[EMAIL PROTECTED]>wrote:

> Hi Brendan
>
> 2008/9/16 brendan simons <[EMAIL PROTECTED]>:
> > #interpolate the green pixels from the bayer filter image ain
> > g = greenMask * ain
> > gi = g[:-2, 1:-1].astype('uint16')
> > gi += g[2:, 1:-1]
> > gi += g[1:-1, :-2]
> > gi += g[1:-1, 2:]
> > gi /= 4
> > gi += g[1:-1, 1:-1]
> > return gi
>
> I may be completely off base here, but you should be able to do this
> *very* quickly using your GPU, or even just using OpenGL.  Otherwise,
> coding it up in ctypes is easy as well (I can send you a code snippet,
> if you need).
>
> > I do something similar for red and blue, then stack the interpolated red,
> > green and blue integers into an array of 24 bit integers and blit to the
> > screen.
> >
> > I was hoping that none of the lines would have to iterate over pixels,
> and
> > would instead do the adds and multiplies as single operations. Perhaps
> numpy
> > has to iterate when copying a subset of an array?  Is there a faster
> array
> > "crop" ?  Any hints as to how I might code this part up using ctypes?
>
> Have you tried formulating this as a convolution, and using
> scipy.signal's 2-d convolve or fftconvolve?
>
> Cheers
> Stéfan
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] BUG in numpy.loadtxt?

2008-09-05 Thread David Huard

Done in r5790.

On Fri, Sep 5, 2008 at 12:36 PM, Ryan May <[EMAIL PROTECTED]> wrote:

> David Huard wrote:
> > Hi Ryan,
> >
> > I applied your patch in r5788 on the trunk.
> > I noticed there was another bug occurring when both converters and
> > usecols are provided.
> > I've added regression tests for both bugs. Could you confirm that
> > everything is fine on your side ?
> >
>
> I can confirm that it works fine for me.  Can you or someone else
> backport this to the 1.2 branch so that this bug is fixed in the next
> release?
>
> Thanks,
>
> Ryan
>
> --
> Ryan May
> Graduate Research Assistant
> School of Meteorology
> University of Oklahoma
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] BUG in numpy.loadtxt?

2008-09-05 Thread David Huard

Hi Ryan,

I applied your patch in r5788 on the trunk.
I noticed there was another bug occurring when both converters and usecols
are provided.
I've added regression tests for both bugs. Could you confirm that everything
is fine on your side ?

Thanks,

On Thu, Sep 4, 2008 at 4:47 PM, Ryan May <[EMAIL PROTECTED]> wrote:

> Travis E. Oliphant wrote:
> > Ryan May wrote:
> >> Stefan (or anyone else who can comment),
> >>
> >> It appears that the usecols argument to loadtxt no longer accepts numpy
> >> arrays:
> >>
> >
> > Could you enter a ticket so we don't lose track of this.  I don't
> > remember anything being intentional.
> >
>
> Done: #905
> http://scipy.org/scipy/numpy/ticket/905
>
> I've attached a patch that does the obvious and coerces usecols to a
> list when it's not None, so it will work for any iterable.
>
> I don't think it was a conscious decision, just a consequence of the
> rewrite using different methods.  There are two problems:
>
> 1) It's an API break, technically speaking
> 2) It currently doesn't even accept tuples, which are used in the
> docstring.
>
> Can we hurry and get this into 1.2?
>
> Thanks,
> Ryan
>
> --
> Ryan May
> Graduate Research Assistant
> School of Meteorology
> University of Oklahoma
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] 1.2 tasks

2008-08-05 Thread David Huard

On Tue, Aug 5, 2008 at 1:36 PM, Jarrod Millman <[EMAIL PROTECTED]> wrote:

> On Tue, Aug 5, 2008 at 10:24 AM, Stéfan van der Walt <[EMAIL PROTECTED]>
> wrote:
> > Could you put in a check for new=True, and suppress those messages?  A
> > user that knows about the changes wouldn't want to see anything.
>
> Yes, that is all ready available.  Maybe the warning message for
> 'new=None' should mention this, though.
>

Done


>
> --
> Jarrod Millman
> Computational Infrastructure for Research Labs
> 10 Giannini Hall, UC Berkeley
> phone: 510.643.4014
> http://cirl.berkeley.edu/
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] 1.2 tasks

2008-08-05 Thread David Huard

On Tue, Aug 5, 2008 at 1:18 PM, Jarrod Millman <[EMAIL PROTECTED]> wrote:

> On Tue, Aug 5, 2008 at 8:48 AM, David Huard <[EMAIL PROTECTED]> wrote:
> > Thanks for the feedback. Here is what will be printed:
> >
> > If new=False
> >
> > The original semantics of histogram is scheduled to be
> > deprecated in NumPy 1.3. The new semantics fixes
> > long-standing issues with outliers handling. The main
> > changes concern
> > 1. the definition of the bin edges,
> > now including the rightmost edge, and
> > 2. the handling of upper outliers,
> > now ignored rather than tallied in the rightmost bin.
> >
> > Please read the docstring for more information.
> >
> >
> >
> > If new=None  (default)
> >
> > The semantics of histogram has been modified in
> > the current release to fix long-standing issues with
> > outliers handling. The main changes concern
> > 1. the definition of the bin edges,
> >now including the rightmost edge, and
> > 2. the handling of upper outliers, now ignored rather
> >than tallied in the rightmost bin.
> > The previous behaviour is still accessible using
> > `new=False`, but is scheduled to be deprecated in the
> > next release (1.3).
> >
> > *This warning will not printed in the 1.3 release.*
> >
> > Please read the docstring for more information.
>
> Thanks for taking care of this.  I thought that we were going to
> remove the new parameter in the 1.3 release.  Is that still the plan?
> If so, shouldn't the warning state "will be removed in the next minor
> release (1.3)" rather than "is scheduled to be deprecated in the next
> release (1.3)"?  In my mind the old behavior is deprecated in this
> release (1.2).
>

The roadmap that I propose is the following:

1.1 we warn about upcoming change, (new=False)
1.2 we make that change, (new=None) + warnings
1.3 we deprecate the old behaviour (new=True), no warnings.
1.4 remove the old behavior and remove the new keyword.

It's pretty much the roadmap exposed in the related ticket that I wrote
following discussions on the ML.

This leaves plenty of time for people to make their changes, and my guess
is that a lot of people will appreciate this, given that you were asked to
delay
the changes to histogram.


> The 1.2 release will be longer lived (~6 months) than the 1.1 release
> and I anticipate several bugfix releases (1.2.1, 1.2.2, 1.2.3, etc).
> So I think it is reasonable to just remove the old behavior in the 1.3
> release.


> --
> Jarrod Millman
> Computational Infrastructure for Research Labs
> 10 Giannini Hall, UC Berkeley
> phone: 510.643.4014
> http://cirl.berkeley.edu/
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] 1.2 tasks

2008-08-05 Thread David Huard

On Tue, Aug 5, 2008 at 4:04 AM, Vincent Schut <[EMAIL PROTECTED]> wrote:

> David Huard wrote:
> >
> >
> > On Mon, Aug 4, 2008 at 1:45 PM, Jarrod Millman <[EMAIL PROTECTED]
> > <mailto:[EMAIL PROTECTED]>> wrote:
>
> > 
>
> > Question: Should histogram raise a warning by default (new=True) to warn
> > users that the behaviour has changed ? Or warn only if new=False to
> > remind that
> > the old behaviour will be deprecated in 1.3 ?  I think that users will
> > prefer being annoyed at warnings than surprised by an unexpected change,
> > but repeated warnings
> > can become a nuisance.
> >
> > To minimize annoyance, we could also offer three possibilities:
> >
> > new=None (default) : Equivalent to True, print warning about change.
> > new=True : Don't print warning.
> > new=False : Print warning about future deprecation.
> >
> > So those who have already set new=True don't get warnings, and all
> > others are warned. Feedback ?
>
> As a regular user of histogram I say: please warn! Your proposal above
> seems OK to me. I do have histogram in a lot of kind of old (and
> sometimes long-running) code of mine, and I certainly would prefer to be
> warned.
>
> Vincent.

Thanks for the feedback. Here is what will be printed:

If new=False

The original semantics of histogram is scheduled to be
deprecated in NumPy 1.3. The new semantics fixes
long-standing issues with outliers handling. The main
changes concern
1. the definition of the bin edges,
now including the rightmost edge, and
2. the handling of upper outliers,
now ignored rather than tallied in the rightmost bin.

Please read the docstring for more information.

If new=None  (default)

The semantics of histogram has been modified in
the current release to fix long-standing issues with
outliers handling. The main changes concern
1. the definition of the bin edges,
   now including the rightmost edge, and
2. the handling of upper outliers, now ignored rather
   than tallied in the rightmost bin.
The previous behaviour is still accessible using
`new=False`, but is scheduled to be deprecated in the
next release (1.3).

*This warning will not printed in the 1.3 release.*

Please read the docstring for more information.

I modified the docstring to put the emphasis on the new semantics,
adapted the tests and updated the ticket.

David

>
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] 1.2 tasks

2008-08-04 Thread David Huard

On Mon, Aug 4, 2008 at 1:45 PM, Jarrod Millman <[EMAIL PROTECTED]> wrote:

> Here are the remaining tasks that I am aware of that need to be done
> before tagging 1.2.0b1 on the 8th.
>
> Median
> ==
> The call signature for median needs to change from
>  def median(a, axis=0, out=None, overwrite_input=False):
> to
>  def median(a, axis=None, out=None, overwrite_input=False):
> in both numpy/ma/extras.py and numpy/lib/function_base.py
>
> Histogram
> 
> The call signature for histogram needs to change from
>  def histogram(a, bins=10, range=None, normed=False, weights=None,
> new=False):
> to
>  def histogram(a, bins=10, range=None, normed=False, weights=None,
> new=True):
> in numpy/lib/function_base.py
>

Question: Should histogram raise a warning by default (new=True) to warn
users that the behaviour has changed ? Or warn only if new=False to remind
that
the old behaviour will be deprecated in 1.3 ?  I think that users will
prefer being annoyed at warnings than surprised by an unexpected change, but
repeated warnings
can become a nuisance.

To minimize annoyance, we could also offer three possibilities:

new=None (default) : Equivalent to True, print warning about change.
new=True : Don't print warning.
new=False : Print warning about future deprecation.

So those who have already set new=True don't get warnings, and all others
are warned. Feedback ?

David H.

> Documentation
> 
> The documentation project needs to merge in its changes.  Stefan will
> take care of this on the 5th.
>
> Please let me know ASAP if there is anything I am missing.
>
> Thanks,
>
> --
> Jarrod Millman
> Computational Infrastructure for Research Labs
> 10 Giannini Hall, UC Berkeley
> phone: 510.643.4014
> http://cirl.berkeley.edu/
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] "import numpy" is slow

2008-07-31 Thread David Huard

On Thu, Jul 31, 2008 at 1:12 PM, Christopher Barker
<[EMAIL PROTECTED]>wrote:

> David Cournapeau wrote:
> > Christopher Barker wrote:
> >> On my OS-X box (10.4.11, python2.5, numpy '1.1.1rc2'), it takes about 7
> >> seconds to import numpy!
> >
> > Hot or cold ? If hot, there is something horribly wrong with your setup.
>
> hot -- it takes about 10 cold.
>
> I've been wondering about that.
>
> time python -c "import numpy"
>
> real0m8.383s
> user0m0.320s
> sys 0m7.805s
>
> and similar results if run multiple times in a row.
>
> Any idea what could be wrong? I have no clue where to start, though I
> suppose a complete clean out and re-install of python comes to mind.
>

Is only 'import numpy' slow, or other packages import slowly too ?
Are there remote directories in your pythonpath ?
Do you have old `eggs` in the site-packages directory that point to remote
directories (installed with setuptools developp) ?
Try cleaning the site-packages directory. That did the trick for me once.

David


> oh, and this is a dual G5 PPC (which should have a faster disk than your
> Macbook)
>
>
> -Chris
>
>
> --
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R(206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115   (206) 526-6317   main reception
>
> [EMAIL PROTECTED]
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] The date/time dtype and the casting issue

2008-07-29 Thread David Huard

Hi,

Silent casting is often a source of bugs and I appreciate the strict rules
you want to enforce.
However, I think there should be a simpler mechanism for operations between
different types
than creating a copy of a variable with the correct type.

My suggestion is to have a dtype argument for methods such as add and subs:

>>> numpy.ones(3, dtype="t8[Y]").add(numpy.zeros(3, dtype="t8[fs]"),
dtype="t8[fs]")

This way, `implicit` operations (+,-) enforce strict rules, and `explicit`
operations (add, subs) let's
you do want you want at your own risk.

David


On Tue, Jul 29, 2008 at 9:12 AM, Francesc Alted <[EMAIL PROTECTED]> wrote:

> Hi,
>
> During the making of the date/time proposals and the subsequent
> discussions in this list, we have changed a couple of times our point
> of view about the way how the castings would work between different
> date/time types and the different time units (previously called
> resolutions).  So I'd like to expose this issue in detail here, and
> give yet another new proposal about this, so as to gather feedback from
> the community before consolidating it in the final date/time proposal.
>
> Casting proposal for date/time types
> 
>
> The operations among the proposed date/time types can be divided in
> three groups:
>
> * Absolute time versus relative time
>
> * Absolute time versus absolute time
>
> * Relative time versus relative time
>
> Now, here are our considerations for each case:
>
> Absolute time versus relative time
> --
>
> We think that in this case the absolute time should have priority for
> determining the time unit of the outcome.  That would represent what
> the people wants to do most of the times.  For example, this would
> allow to do:
>
> >>> series = numpy.array(['1970-01-01', '1970-02-01', '1970-09-01'],
> dtype='datetime64[D]')
> >>> series2 = series + numpy.timedelta(1, 'Y')  # Add 2 relative years
> >>> series2
> array(['1972-01-01', '1972-02-01', '1972-09-01'],
> dtype='datetime64[D]')  # the 'D'ay time unit has been chosen
>
> Absolute time versus absolute time
> --
>
> When operating (basically, only the substraction will be allowed) two
> absolute times with different unit times, we are proposing that the
> outcome would be to raise an exception.  This is because the ranges and
> timespans of the different time units can be very different, and it is
> not clear at all what time unit will be preferred for the user.  For
> example, this should be allowed:
>
> >>> numpy.ones(3, dtype="T8[Y]") - numpy.zeros(3, dtype="T8[Y]")
> array([1, 1, 1], dtype="timedelta64[Y]")
>
> But the next should not:
>
> >>> numpy.ones(3, dtype="T8[Y]") - numpy.zeros(3, dtype="T8[ns]")
> raise numpy.IncompatibleUnitError  # what unit to choose?
>
> Relative time versus relative time
> --
>
> This case would be the same than the previous one (absolute vs
> absolute).  Our proposal is to forbid this operation if the time units
> of the operands are different.  For example, this should be allowed:
>
> >>> numpy.ones(3, dtype="t8[Y]") + 3*numpy.ones(3, dtype="t8[Y]")
> array([4, 4, 4], dtype="timedelta64[Y]")
>
> But the next should not:
>
> >>> numpy.ones(3, dtype="t8[Y]") + numpy.zeros(3, dtype="t8[fs]")
> raise numpy.IncompatibleUnitError  # what unit to choose?
>
> Introducing a time casting function
> ---
>
> As forbidding operations among absolute/absolute and relative/relative
> types can be unacceptable in many situations, we are proposing an
> explicit casting mechanism so that the user can inform about the
> desired time unit of the outcome.  For this, a new NumPy function,
> called, say, ``numpy.change_unit()`` (this name is for the purposes of
> the discussion and can be changed) will be provided.  The signature for
> the function will be:
>
> change_unit(time_object, new_unit, reference)
>
> where 'time_object' is the time object whose unit is to be
> changed, 'new_unit' is the desired new time unit, and 'reference' is an
> absolute date that will be used to allow the conversion of relative
> times in case of using time units with an uncertain number of smaller
> time units (relative years or months cannot be expressed in days).  For
> example, that would allow to do:
>
> >>> numpy.change_unit( numpy.array([1,2], 'T[Y]'), 'T[d]' )
> array([365, 731], dtype="datetime64[d]")
>
> or:
>
> >>> ref = numpy.datetime64('1971', 'T[Y]')
> >>> numpy.change_unit( numpy.array([1,2], 't[Y]'), 't[d]',  ref )
> array([366, 365], dtype="timedelta64[d]")
>
> Note: we refused to use the ``.astype()`` method because of the
> additional 'time_reference' parameter that will sound strange for other
> typical uses of ``.astype()``.
>
> Opinions?
>
> --
> Francesc Alted
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listi

Re: [Numpy-discussion] Schedule for 1.2.0

2008-07-23 Thread David Huard

I think we should stick to what has been agreed and announced months ago.
It's called honouring our commitments and the project's image depends on it.

If the inconvenience of these API changes is worth the trouble, a 1.1.2 release
could be considered.

My two cents.

David

2008/7/22 Joe Harrington <[EMAIL PROTECTED]>:
> Hi Jarrod,
>
> I'm just catching up on my numpy lists and I caught this; sorry for
> the late reply!
>
>> Another issue that we should address is whether it is OK to postpone
>> the planned API changes to histogram and median.  A couple of people
>> have mentioned to me that they would like to delay the API changes to
>> 1.3, which seems reasonable to me.  If anyone would prefer that we
>> make the planned API changes for histogram and median in 1.2, please
>> speak now.
>
> I *strongly* want both these changes for 1.2, as I am sure do the many
> people teaching courses using numpy for the fall.  It is hard to get
> students to understand why there are inconsistencies and
> irrationalities in software, and it's even worse when it's
> open-source, since somehow it's the lecturer's fault that he picked a
> package that isn't right in some major way.  Worse, we're changing
> these behaviors like 6 months from now, so students will have to learn
> it wrong and code it wrong, and then their code may break on top of
> it.  On behalf of this year's new students and their instructors, I
> ask you to keep these changes in the release as planned.
>
> Thanks,
>
> --jh--
> Prof. Joseph Harrington
> Department of Physics
> MAP 414
> 4000 Central Florida Blvd.
> University of Central Florida
> Orlando, FL 32816-2385
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] 1.1.1rc1 to be tagged tonight

2008-07-21 Thread David Huard

Ryan, I committed your patch to the trunk and added a test for it from your
failing example.

Jarrod, though I'm also wary to touch the branch so late, the patch is minor
and I don't see how it could break something that was not already broken.

David

2008/7/20 Ryan May <[EMAIL PROTECTED]>:

> Jarrod Millman wrote:
>
>> Hello,
>>
>> This is a reminder that 1.1.1rc1 will be tagged tonight.  Chuck is
>> planning to spend some time today fixing a few final bugs on the 1.1.x
>> branch.  If anyone else is planning to commit anything to the 1.1.x
>> branch today, please let me know immediately.  Obviously now is not
>> the time to commit anything to the branch that could break anything,
>> so please be extremely careful if you have to touch the branch.
>>
>> Once the release is tagged, Chris and David will create binary
>> installers for both Windows and Mac.  Hopefully, this will give us an
>> opportunity to have much more widespread testing before releasing
>> 1.1.1 final at the end of the month.
>>
>>  Can I get anyone to look at this patch for loadtext()?
>
> I was trying to use loadtxt() today to read in some text data, and I had
> a problem when I specified a dtype that only contained as many elements
> as in columns in usecols.  The example below shows the problem:
>
> import numpy as np
> import StringIO
> data = '''STID RELH TAIR
> JOE 70.1 25.3
> BOB 60.5 27.9
> '''
> f = StringIO.StringIO(data)
> names = ['stid', 'temp']
> dtypes = ['S4', 'f8']
> arr = np.loadtxt(f, usecols=(0,2),dtype=zip(names,dtypes), skiprows=1)
>
> With current 1.1 (and SVN head), this yields:
>
> IndexErrorTraceback (most recent call last)
>
> /home/rmay/ in ()
>
> /usr/lib64/python2.5/site-packages/numpy/lib/io.pyc in loadtxt(fname,
> dtype, comments, delimiter, converters, skiprows, usecols, unpack)
>309 for j in xrange(len(vals))]
>310 if usecols is not None:
> --> 311 row = [converterseq[j](vals[j]) for j in usecols]
>312 else:
>313 row = [converterseq[j](val) for j,val in
> enumerate(vals)]
>
> IndexError: list index out of range
> -
>
> I've added a patch that checks for usecols, and if present, correctly
> creates the converters dictionary to map each specified column with
> converter for the corresponding field in the dtype. With the attached
> patch, this works fine:
>
> >arr
> array([('JOE', 25.301), ('BOB', 27.899)],
>  dtype=[('stid', '|S4'), ('temp', '
>
> Thanks,
> Ryan
>
> --
> Ryan May
> Graduate Research Assistant
> School of Meteorology
> University of Oklahoma
>
> --
> Ryan May
> Graduate Research Assistant
> School of Meteorology
> University of Oklahoma
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] numpy.loadtext() fails with dtype + usecols

2008-07-21 Thread David Huard

Looks good to me. I committed the patch to the trunk and added a regression
test (r5495).

David




2008/7/18 Charles R Harris <[EMAIL PROTECTED]>:

>
>
> On Fri, Jul 18, 2008 at 4:16 PM, Ryan May <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> I was trying to use loadtxt() today to read in some text data, and I had a
>> problem when I specified a dtype that only contained as many elements as in
>> columns in usecols.  The example below shows the problem:
>>
>> import numpy as np
>> import StringIO
>> data = '''STID RELH TAIR
>> JOE 70.1 25.3
>> BOB 60.5 27.9
>> '''
>> f = StringIO.StringIO(data)
>> names = ['stid', 'temp']
>> dtypes = ['S4', 'f8']
>> arr = np.loadtxt(f, usecols=(0,2),dtype=zip(names,dtypes), skiprows=1)
>>
>> With current 1.1 (and SVN head), this yields:
>>
>> IndexErrorTraceback (most recent call
>> last)
>>
>> /home/rmay/ in ()
>>
>> /usr/lib64/python2.5/site-packages/numpy/lib/io.pyc in loadtxt(fname,
>> dtype, comments, delimiter, converters, skiprows, usecols, unpack)
>>309 for j in xrange(len(vals))]
>>310 if usecols is not None:
>> --> 311 row = [converterseq[j](vals[j]) for j in usecols]
>>312 else:
>>313 row = [converterseq[j](val) for j,val in
>> enumerate(vals)]
>>
>> IndexError: list index out of range
>> --
>>
>> I've added a patch that checks for usecols, and if present, correctly
>> creates the converters dictionary to map each specified column with
>> converter for the corresponding field in the dtype. With the attached patch,
>> this works fine:
>>
>> >arr
>> array([('JOE', 25.301), ('BOB', 27.899)],
>>  dtype=[('stid', '|S4'), ('temp', '>
>> Comments?  Can I get this in for 1.1.1?
>>
>
> Can someone familiar with loadtxt comment on this patch?
>
> Chuck
>
>
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Histogram bin definition

2008-07-16 Thread David Huard

Hi Stefan,

It's designed this way. The main reason is that the default bin edges are
generated using

linspace(a.min(), a.max(), bin)

when bin is an integer.

If we leave the rightmost edge open, then the histogram of a 100 items array
will typically yield an histogram with 99 values because the maximum value
is an outlier. I thought the least surprising behavior was to make sure that
all items are counted.

The other reason has to do with backward compatibility, I tried to avoid
breakage for the simplest use case.

`histogram(r, bins=10)` yields the same thing as `histogram(r, bins=10,
new=True)`

We could avoid the open ended edge by defining the edges by
linspace(a.min(), a.max()+delta, bin), but people will wonder why the right
edge is 3.01 instead of 3.

Cheers,

David






2008/7/16 Stéfan van der Walt <[EMAIL PROTECTED]>:

> Hi all,
>
> I am busy documenting `histogram`, and the definition of a "bin"
> eludes me.  Here is the behaviour that troubles me:
>
> >>> np.histogram([1,2,1], bins=[0, 1, 2, 3], new=True)
> (array([0, 2, 1]), array([0, 1, 2, 3]))
>
> >From this result, it seems as if a bin is defined as the half-open
> interval [right_edge, left_edge).
>
> Now, looks what happens in the following case:
>
> >>> np.histogram([1,2,3], bins=[0,1,2,3], new=True)
> (array([0, 1, 2]), array([0, 1, 2, 3]))
>
> Here, the last bin is defined by the closed interval [right_edge,
> left_edge]!
>
> Is this a bug, or a design consideration?
>
> Regards
> Stéfan
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Revised list of backport candidates for 1.1.1

2008-07-15 Thread David Huard

The revision number for the backport of 5254 is 5419.

David

2008/7/15 Charles R Harris <[EMAIL PROTECTED]>:

> After the first round of backports the following remain.
>
> charris
> r5259
> r5312
> r5322
> r5324
> r5392
> r5394
> r5399
> r5406
> r5407
>
> dhuard
> r5254
>
> fperez
> r5298
> r5301
> r5303
>
> oliphant
> r5245
> r5255
>
> Chuck
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] RFC: A proposal for implementing some date/time types in NumPy

2008-07-14 Thread David Huard

2008/7/14 Francesc Alted <[EMAIL PROTECTED]>:

> [...]
> > DateArray([14-Jan-2001 14:34:33, 16-Jan-2001 10:09:11],
> >   freq='S')
>
> That's great.  However we only planned to import/export dates from the
> ``datetime`` module for the time being, mainly because of efficency but
> also simplicity.  Would many people be interested in seeing this kind
> of string date parsing integrated in the native NumPy types?
>
>
It's useful to have a complete string representation to write dates
to a file and be able to retrieve them later on. In this sense, a
strftime-like
write/read method would be appreciated (where the date format is specified
by the user or set by convention).

On the other hand, trying to second guess the format the date is formatted
in can quickly turn
into a regular expression nightmare (look at the mx.datetime module that
does this). I'd hate
to see you waste time on this.


David
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] nose changes checked in

2008-06-17 Thread David Huard

2008/6/17 Anne Archibald <[EMAIL PROTECTED]>:

> 2008/6/17 Alan McIntyre <[EMAIL PROTECTED]>:
> > On Tue, Jun 17, 2008 at 9:26 AM, David Huard <[EMAIL PROTECTED]>
> wrote:
> >> I noticed that NumpyTest and NumpyTestCase disappeared, and now I am
> >> wondering whether these classes part of the public interface or were
> they
> >> reserved for internal usage ?
> >>
> >> In the former, it might be well to deprecate them before removing them.
> >
> > ParametricTestCase is gone too.  There was at least one person using
> > it that said he didn't mind porting to the nose equivalent, but I
> > expect that's an indication there's more people out there using them.
> > If there's a consensus that they need to go back in and get marked as
> > deprecated, I'll put them back.
>
> Uh, I assumed NumpyTestCase was public and used it. I'm presumably not
> alone, so perhaps a deprecation warning would be good. What
> backward-compatible class should I use? unittest.TestCase?
>

Yes. You'll also have to replace the NumpyTest().run() by something else.
Either the new nose method or the old fashioned
if __name__ == '__main__':
 unittest.main()

Also, note that unittest.TestCase is more restrictive than NumpyTestCase
regarding the method names that are considered tests. While NumpyTestCase
accepted method names starting with check_, TestCase won't recognize those
as tests.

David



> Thanks,
> Anne
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] nose changes checked in

2008-06-17 Thread David Huard

I noticed that NumpyTest and NumpyTestCase disappeared, and now I am
wondering whether these classes part of the public interface or were they
reserved for internal usage ?

In the former, it might be well to deprecate them before removing them.

Cheers,

David

2008/6/17 David Cournapeau <[EMAIL PROTECTED]>:

> David Cournapeau wrote:
> > It does not work with python2.6a3, but it is a nose problem, apparently
> > (I have the exact same error)
> >
>
> Sorry, it is a python26 problem, not nose.
>
> cheers,
>
> David
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] NumpyTest problem

2008-06-10 Thread David Huard

Charles,

This bug appeared after your change in r5217:

Index: numpytest.py
===
--- numpytest.py(révision 5216)
+++ numpytest.py(révision 5217)
@@ -527,7 +527,7 @@
 all_tests = unittest.TestSuite(suite_list)
 return all_tests

-def test(self, level=1, verbosity=1, all=False, sys_argv=[],
+def test(self, level=1, verbosity=1, all=True, sys_argv=[],
  testcase_pattern='.*'):
 """Run Numpy module test suite with level and verbosity.

running
NumpyTest().test(all=False) works, but
NumpyTest().test(all=True) doesn't, that is, it finds 0 test.

David


2008/6/2 Charles R Harris <[EMAIL PROTECTED]>:

>
>
> On Mon, Jun 2, 2008 at 9:20 AM, David Huard <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> There are 2 problems with NumpyTest
>>
>> 1. It fails if the command is given the file name only (without a
>> directory structure)
>>
>> E.g.:
>>
>> [EMAIL PROTECTED]:~/repos/numpy/numpy/tests$ python test_ctypeslib.py
>> Traceback (most recent call last):
>>   File "test_ctypeslib.py", line 87, in 
>> NumpyTest().run()
>>   File "/usr/lib64/python2.5/site-packages/numpy/testing/numpytest.py",
>> line 655, in run
>> testcase_pattern=options.testcase_pattern)
>>   File "/usr/lib64/python2.5/site-packages/numpy/testing/numpytest.py",
>> line 575, in test
>> level, verbosity)
>>   File "/usr/lib64/python2.5/site-packages/numpy/testing/numpytest.py",
>> line 453, in _test_suite_from_all_tests
>> importall(this_package)
>>   File "/usr/lib64/python2.5/site-packages/numpy/testing/numpytest.py",
>> line 681, in importall
>> for subpackage_name in os.listdir(package_dir):
>> OSError: [Errno 2] No such file or directory: ''
>> [EMAIL PROTECTED]:~/repos/numpy/numpy/tests$
>>
>>
>>
>> 2. It doesn't find tests it used to find:
>>
>> [EMAIL PROTECTED]:~/repos/numpy/numpy$ python tests/test_ctypeslib.py
>>
>
>
> There haven't been many changes to the tests.  Could you fool with
> numpy.test(level=10,all=0) and such to see what happens? All=1 is now the
> default.
>
> I've also seen test run some tests twice. I don't know what was up with
> that.
>
> Chuck
>
>
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Bug in numpy.histogram?

2008-06-09 Thread David Huard

2008/6/9 Tommy Grav <[EMAIL PROTECTED]>:

> I understand this and agree, but it still means that the API for
> histogram is
> broken since normed can only be used with the new=True parameter. I
> though
> the whole point of the future warning was to avoid this. It is not a
> big deal,
> just means that one is forced to use the new API somewhat quicker :)
>

Tommy,

you should be able to use normed=True as long as bins edges are not
specified explicitly.
That is, by setting bins=number_of_bins and range=[bin_min, bin_max], normed
should not raise any warning.

The case bins=edges_array and normed=True was simply too ugly too fix using
the old calling semantic due to this right edge at infinity problem. Also,
since there was a bug in histogram for this combination, we thought it just
as well to force the switch to the new behavior.

Sorry for the inconvenience,

David





>
> Cheers
>Tommy
>
>
>
> On Jun 9, 2008, at 11:17 AM, Pauli Virtanen wrote:
>
> > ma, 2008-06-09 kello 11:11 -0400, Tommy Grav kirjoitti:
> >> With the most recent change in numpy 1.1 it seems that
> >> numpy.histogram
> >> was broken when wanting a normalized histogram. I thought the idea
> >> was
> >> to leave the functionality of histogram as it was in 1.1 and then
> >> break the api in 1.2?
> > [clip]
> >> data, bins = numpy.histogram(a,b,normed=True)
> >> Traceback (most recent call last):
> >>   File "", line 0, in 
> >>   File "/Library/Frameworks/Python.framework/Versions/2.5/lib/
> >> python2.5/site-packages/numpy/lib/function_base.py", line 189, in
> >> histogram
> >> raise ValueError, 'Use new=True to pass bin edges explicitly.'
> >> ValueError: Use new=True to pass bin edges explicitly.
> >
> > I think the point in this specific change was that numpy.histogram
> > previously returned invalid results when normed=True and explicit bins
> > were given; the previous code always normalized the results assuming
> > the
> > bins were of equal size.
> >
> > Moreover, I think it was not obvious what "normalized" results should
> > mean when one of the bins is of infinite size.
> >
> >   Pauli
> >
> >
> > ___
> > Numpy-discussion mailing list
> > Numpy-discussion@scipy.org
> > http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] NumpyTest problem

2008-06-02 Thread David Huard

Hi Alan,

Thanks for looking into it.

David


2008/6/2 Alan McIntyre <[EMAIL PROTECTED]>:

> David,
>
> We're in the process of switching to nose
> (http://www.somethingaboutorange.com/mrl/projects/nose/) as the test
> framework for 1.2; I'll try to keep an eye on stuff like that and make
> it work properly if I can.
>
> Alan
>
> On Mon, Jun 2, 2008 at 11:20 AM, David Huard <[EMAIL PROTECTED]>
> wrote:
> > Hi,
> >
> > There are 2 problems with NumpyTest
> >
> > 1. It fails if the command is given the file name only (without a
> directory
> > structure)
> >
> > E.g.:
> >
> > [EMAIL PROTECTED]:~/repos/numpy/numpy/tests$ python test_ctypeslib.py
> > Traceback (most recent call last):
> >   File "test_ctypeslib.py", line 87, in 
> > NumpyTest().run()
> >   File "/usr/lib64/python2.5/site-packages/numpy/testing/numpytest.py",
> line
> > 655, in run
> > testcase_pattern=options.testcase_pattern)
> >   File "/usr/lib64/python2.5/site-packages/numpy/testing/numpytest.py",
> line
> > 575, in test
> > level, verbosity)
> >   File "/usr/lib64/python2.5/site-packages/numpy/testing/numpytest.py",
> line
> > 453, in _test_suite_from_all_tests
> > importall(this_package)
> >   File "/usr/lib64/python2.5/site-packages/numpy/testing/numpytest.py",
> line
> > 681, in importall
> > for subpackage_name in os.listdir(package_dir):
> > OSError: [Errno 2] No such file or directory: ''
> > [EMAIL PROTECTED]:~/repos/numpy/numpy/tests$
> >
> >
> >
> > 2. It doesn't find tests it used to find:
> >
> > [EMAIL PROTECTED]:~/repos/numpy/numpy$ python tests/test_ctypeslib.py
> >
> > --
> > Ran 0 tests in 0.000s
> >
> > OK
> > [EMAIL PROTECTED]:~/repos/numpy/numpy$
> >
> > Cheers,
> >
> > David
> >
> >
> > ___
> > Numpy-discussion mailing list
> > Numpy-discussion@scipy.org
> > http://projects.scipy.org/mailman/listinfo/numpy-discussion
> >
> >
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] NumpyTest problem

2008-06-02 Thread David Huard

numpy.test(level=10,all=0) seems to work fine.
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

[Numpy-discussion] NumpyTest problem

2008-06-02 Thread David Huard

Hi,

There are 2 problems with NumpyTest

1. It fails if the command is given the file name only (without a directory
structure)

E.g.:

[EMAIL PROTECTED]:~/repos/numpy/numpy/tests$ python test_ctypeslib.py
Traceback (most recent call last):
  File "test_ctypeslib.py", line 87, in 
NumpyTest().run()
  File "/usr/lib64/python2.5/site-packages/numpy/testing/numpytest.py", line
655, in run
testcase_pattern=options.testcase_pattern)
  File "/usr/lib64/python2.5/site-packages/numpy/testing/numpytest.py", line
575, in test
level, verbosity)
  File "/usr/lib64/python2.5/site-packages/numpy/testing/numpytest.py", line
453, in _test_suite_from_all_tests
importall(this_package)
  File "/usr/lib64/python2.5/site-packages/numpy/testing/numpytest.py", line
681, in importall
for subpackage_name in os.listdir(package_dir):
OSError: [Errno 2] No such file or directory: ''
[EMAIL PROTECTED]:~/repos/numpy/numpy/tests$



2. It doesn't find tests it used to find:

[EMAIL PROTECTED]:~/repos/numpy/numpy$ python tests/test_ctypeslib.py

--
Ran 0 tests in 0.000s

OK
[EMAIL PROTECTED]:~/repos/numpy/numpy$

Cheers,

David
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] question about histogram2d

2008-05-29 Thread David Huard

Hi Darren,

If I remember correctly, the thinking under the current behavior is that it
preserves similarity of results with histogramdd, where the histogram is
oriented in the numpy order (columns, rows). I thought that making
histogram2d(x,y) return something different than histogramdd([x,y]) was
probably worst than satisfying the cartesian convention.

Regards,

David



2008/5/29 Darren Dale <[EMAIL PROTECTED]>:

> I have a question about histogram2d. Say I do something like:
>
> import numpy
> from numpy import random
> import pylab
>
> x=random.rand(1000)-0.5
> y=random.rand(1000)*10-5
>
> xbins=numpy.linspace(-10,10,100)
> ybins=numpy.linspace(-10,10,100)
> h,x,y=numpy.histogram2d(x,y,bins=[xbins,ybins])
>
> pylab.imshow(h,interpolation='nearest')
> pylab.show()
>
> The output is attached. I think I would have expected the transpose of what
> numpy histogram2d returned, so the tight x distribution appears along the x
> axis in the image. Maybe I am thinking about this incorrectly, or there is
> a
> convention I am unfamiliar with. If the behavior is correct, could the
> docstring include a comment explaining the orientation of the histogram
> array?
>
> Thanks,
> Darren
>
> --
> Darren S. Dale, Ph.D.
> Staff Scientist
> Cornell High Energy Synchrotron Source
> Cornell University
> 275 Wilson Lab
> Rt. 366 & Pine Tree Road
> Ithaca, NY 14853
>
> [EMAIL PROTECTED]
> office: (607) 255-3819
> fax: (607) 255-9001
> http://www.chess.cornell.edu
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] 1.1.0rc1 tagged

2008-05-19 Thread David Huard

Ticket 793 has a patch, submitted by Alan McIntyre, waiting for review from
someone C-API-wise.

Cheers,

David



2008/5/19 Neal Becker <[EMAIL PROTECTED]>:

> Jarrod Millman wrote:
>
> > Please test the release candidate:
> > svn co http://svn.scipy.org/svn/numpy/tags/1.1.0rc1 1.1.0rc1
> >
> > Also please review the release notes:
> > http://projects.scipy.org/scipy/numpy/milestone/1.1.0
> >
> > I am going to ask Chris and David to create Windows and Mac binaries,
> > which I hope they will have time to create ASAP.
> >
> > Sorry that it has taken me so long, I am on vacation with my family
> > and am having a difficult time getting on my computer.
> >
> > Thanks,
> >
> Built OK on Fedora F9 x86_64 using
>  ['lapack', 'f77blas', 'cblas', 'atlas']
>
> Used rpmbuild with slightly modified version of fedora 9 spec file.
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] let's use patch review

2008-05-15 Thread David Huard

2008/5/14 David Cournapeau <[EMAIL PROTECTED]>:

> On Wed, 2008-05-14 at 13:58 -1000, Eric Firing wrote:
> >
> > What does that mean?  How does one know when there is a consensus?
>
> There can be a system to make this automatic. For example, the code is
> never commited directly to svn, but to a gatekeeper, and people vote by
> an email command to say if they want the patch in; when the total number
> of votes is above some threshold, the gatekeeper commit the patch.
>

There is about 5 commits/day, I don't think it's a good idea to wait for a
vote on each one of them.


>
> David
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] f2py and -D_FORTIFY_SOURCE=2 compilation flag

2008-05-15 Thread David Huard

Works for me,

Thanks

David

2008/5/15 Pearu Peterson <[EMAIL PROTECTED]>:

>
>
> Robert Kern wrote:
> > On Wed, May 14, 2008 at 3:20 PM, David Huard <[EMAIL PROTECTED]>
> wrote:
> >> I filed a patch that seems to do the trick in ticket #792.
> >
> > I don't think this is the right approach. The problem isn't that
> > _FORTIFY_SOURCE is set to 2 but that f2py is doing (probably) bad
> > things that trip these buffer overflow checks. IIRC, Pearu wasn't on
> > the f2py mailing list at the time this came up; please try him again.
>
> I was able to reproduce the bug on a debian system. The fix with
> a comment on what was causing the bug, is in svn:
>
>   http://scipy.org/scipy/numpy/changeset/5173
>
> I should warn that the bug fix does not have unittests because:
> 1) testing the bug requires Fortran compiler that for NumPy is
> an optional requirement.
> 2) I have tested the fix with two different setups that should cover
> all possible configurations.
> 3) In the case of problems with the fix, users should notice it
> immediately.
> 4) I have carefully read the patch before committing.
>
> Regards,
> Pearu
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] f2py and -D_FORTIFY_SOURCE=2 compilation flag

2008-05-14 Thread David Huard

I filed a patch that seems to do the trick in ticket
#792.<http://scipy.org/scipy/numpy/ticket/792>


2008/5/14 David Huard <[EMAIL PROTECTED]>:

> Hi,
>
> On fedora 8, the docstrings of f2py generated extensions are strangely
> missing. On Ubuntu, the same modules do have the docstrings. The problem, as
> reported in the f2py ML, seems to come from the -D_FORTIFY_SOURCE flag which
> is set to 2 instead of 1. Could this be fixed in numpy.distutils and how ?
>
> Thanks,
>
> David
>
>
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

[Numpy-discussion] f2py and -D_FORTIFY_SOURCE=2 compilation flag

2008-05-14 Thread David Huard

Hi,

On fedora 8, the docstrings of f2py generated extensions are strangely
missing. On Ubuntu, the same modules do have the docstrings. The problem, as
reported in the f2py ML, seems to come from the -D_FORTIFY_SOURCE flag which
is set to 2 instead of 1. Could this be fixed in numpy.distutils and how ?

Thanks,

David
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] numpy release

2008-04-25 Thread David Huard

Thanks Chuck,

I didn't know there were other tests for histogram outside of
test_function_base.

The error is now raised only if bins are passed explicitly and normed=True.

David

2008/4/25 Charles R Harris <[EMAIL PROTECTED]>:

>
>
> On Fri, Apr 25, 2008 at 12:55 PM, Jarrod Millman <[EMAIL PROTECTED]>
> wrote:
>
> > On Fri, Apr 25, 2008 at 12:55 PM, David Huard <[EMAIL PROTECTED]>
> > wrote:
> > > > Done in r5085. I added a bunch of tests, but I'd appreciate if
> > someone
> > > could double check before the release. This is not the time to
> > introduce new
> > > bugs.
> > > >
> > > > Hopefully this is the end of the histogram saga.
> > > >
> > >
>
>
> This one?
>
> ERROR: Ticket #632
> --
> Traceback (most recent call last):
>   File
> "/usr/lib/python2.5/site-packages/numpy/core/tests/test_regression.py", line
> 812, in check_hist_bins_as_list
> hist,edges = np.histogram([1,2,3,4],[1,2])
>   File "/usr/lib/python2.5/site-packages/numpy/lib/function_base.py", line
> 184, in histogram
> raise ValueError, 'Use new=True to pass bin edges explicitly.'
> ValueError: Use new=True to pass bin edges explicitly.
>
> Chuck
>
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] numpy release

2008-04-25 Thread David Huard

2008/4/25 David Huard <[EMAIL PROTECTED]>:

> 2008/4/24 Jarrod Millman <[EMAIL PROTECTED]>:
>
> > On Thu, Apr 24, 2008 at 1:22 PM, David Huard wrote:
> > >  Assuming we want the next version to : ignore values outside of range
> > and
> > > accept and return the bin edges instead of the left edges, here could
> > be the
> > > new signature for 1.1:
> > >  h, edges = histogram(a, bins=10, normed=False, range=None,
> > normed=False,
> > > new=False)
> > >
> > >  If new=False, return the histogram and the left edges, with a warning
> > that
> > > in the next version, the edges will be returned. If new=True, return
> > the
> > > histogram and the edges.
> > >  If range is given explicitly , raise a warning saying that in the
> > next
> > > version, the outliers will be ignored.  To ignore outliers, use
> > new=True.
> > >  If bins is a sequence, raise an error saying that bins should be an
> > > integer. To use explicit edges, use new=True.
> > >
> > >  In 1.2, set new=True as the default, and in 2.3, remove new
> > altogether.
> >
> > +1
> > That sounds fine to me assuming 2.3 is 1.3.
> >
>
> Indeed.
>
> Done in r5085. I added a bunch of tests, but I'd appreciate if someone
> could double check before the release. This is not the time to introduce new
> bugs.
>
> Hopefully this is the end of the histogram saga.
>

Well, it's not... there is still an issue... give me a couple of minutes to
fix it.


>
> David
>
>
> > --
> > Jarrod Millman
> > Computational Infrastructure for Research Labs
> > 10 Giannini Hall, UC Berkeley
> > phone: 510.643.4014
> > http://cirl.berkeley.edu/
> > ___
> > Numpy-discussion mailing list
> > Numpy-discussion@scipy.org
> > http://projects.scipy.org/mailman/listinfo/numpy-discussion
> >
>
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] numpy release

2008-04-25 Thread David Huard

2008/4/24 Jarrod Millman <[EMAIL PROTECTED]>:

> On Thu, Apr 24, 2008 at 1:22 PM, David Huard wrote:
> >  Assuming we want the next version to : ignore values outside of range
> and
> > accept and return the bin edges instead of the left edges, here could be
> the
> > new signature for 1.1:
> >  h, edges = histogram(a, bins=10, normed=False, range=None,
> normed=False,
> > new=False)
> >
> >  If new=False, return the histogram and the left edges, with a warning
> that
> > in the next version, the edges will be returned. If new=True, return the
> > histogram and the edges.
> >  If range is given explicitly , raise a warning saying that in the next
> > version, the outliers will be ignored.  To ignore outliers, use
> new=True.
> >  If bins is a sequence, raise an error saying that bins should be an
> > integer. To use explicit edges, use new=True.
> >
> >  In 1.2, set new=True as the default, and in 2.3, remove new altogether.
>
> +1
> That sounds fine to me assuming 2.3 is 1.3.
>

Indeed.

Done in r5085. I added a bunch of tests, but I'd appreciate if someone could
double check before the release. This is not the time to introduce new bugs.


Hopefully this is the end of the histogram saga.

David


> --
> Jarrod Millman
> Computational Infrastructure for Research Labs
> 10 Giannini Hall, UC Berkeley
> phone: 510.643.4014
> http://cirl.berkeley.edu/
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Generating Bell Curves (was: Using normal() )

2008-04-25 Thread David Huard

Other suggestions for bounded bell-shaped functions that reach zero on a
finite interval:

 - Beta distribution: http://en.wikipedia.org/wiki/Beta_distribution
 - Cubic B-splines:http://www.ibiblio.org/e-notes/Splines/Basis.htm




2008/4/25 Bruce Southey <[EMAIL PROTECTED]>:

> Rich Shepard wrote:
> >Thanks to several of you I produced test code using the normal
> density
> > function, and it does not do what we need. Neither does the Gaussian
> > function using fwhm that I've tried. The latter comes closer, but the
> ends
> > do not reach y=0 when the inflection point is y=0.5.
> >
> >So, let me ask the collective expertise here how to generate the
> curves
> > that we need.
> >
> >We need to generate bell-shaped curves given a midpoint, width (where
> y=0)
> > and inflection point (by default, y=0.5) where y is [0.0, 1.0], and x is
> > usually [0, 100], but can vary. Using the NumPy arange() function to
> produce
> > the x values (e.g, arange(0, 100, 0.1)), I need a function that will
> produce
> > the associated y values for a bell-shaped curve. These curves represent
> the
> > membership functions for fuzzy term sets, and generally adjacent curves
> > overlap where y=0.5. It would be a bonus to be able to adjust the skew
> and
> > kurtosis of the curves, but the controlling data would be the
> > center/midpoint and width, with defaults for inflection point, and other
> > parameters.
> >
> >I've been searching for quite some time without finding a solution
> that
> > works as we need it to work.
> >
> > TIA,
> >
> > Rich
> >
> >
> Hi,
> You could use a Gamma distribution to get a skewed distribution. But to
> extend Keith's comment, continuous  distributions typically go from
> minus infinity or zero to positive infinity and, furthermore, the
> probability of a single point in a continuous distribution is always
> zero. The only way you are going to get this from a single continuous
> distribution is via some truncated distribution - essentially Keith's
> reply.
>
> Alternatively, you may get away with a discrete distribution like the
> Poisson since it very quickly approaches normality but is skewed. A
> multinomial distribution may also work but that is more assumptions. In
> either case, you have map the points into the valid space because it is
> the distribution within the set that is used not the distribution of the
> data.
>
> I do not see the requirement for overlapping curves because the expected
> distribution of each set should be independent of the data and of the
> other sets. In that case, you just find the mean and variance of each
> set to get the degree of overlap you require. The inflection point
> requirement is very hard to understand as it different meanings such as
> just crossing or same area under the curve. I don't see any simple
> solution to that - two normals with the same variance but different
> means probably would. If the sets are dependent then you need a
> multivariate solution. Really you probably need a mixture of
> distributions and/or generate your own function to get something that
> meets you full requirements.
>
> Regards
> Bruce
>
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] numpy release

2008-04-24 Thread David Huard

The problem I see with C is that it will break compatibility with the other
histogram functions, which also use bins.

So here is suggestion E:

The most common use case ( I think) is the following:
h, b = histogram(r, number_of_bins, normed=True/False) for which the
function behaves correctly.

Assuming we want the next version to : ignore values outside of range and
accept and return the bin edges instead of the left edges, here could be the
new signature for 1.1:
h, edges = histogram(a, bins=10, normed=False, range=None, normed=False,
new=False)

If new=False, return the histogram and the left edges, with a warning that
in the next version, the edges will be returned. If new=True, return the
histogram and the edges.
If range is given explicitly , raise a warning saying that in the next
version, the outliers will be ignored.  To ignore outliers, use new=True.
If bins is a sequence, raise an error saying that bins should be an integer.
To use explicit edges, use new=True.

In 1.2, set new=True as the default, and in 2.3, remove new altogether.

David

2008/4/24 Travis E. Oliphant <[EMAIL PROTECTED]>:

> Pauli Virtanen wrote:
> >
> >   C) Create a new parameter with more sensible behavior and a name
> > different from "bins", and deprecate (at least giving sequences to) the
> > "bins" parameter: put up a DeprecationWarning if the user does this, but
> > still produce the same results as the old histogram. This way the user
> > can forward-port her code at leisure.
> >
>
>
> >   D) Or, retain the old behavior (values below lowest bin ignored) and
> > just fix the docstring and the normed=True bug? (I have a patch doing
> > this.)
> >
> >
> > So which one (or something else) do we choose for 1.1.0?
> >
> >
> I like either C or D,  but prefer C if it can be done before 1.1.0.
>
> -Travis
>
>
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] numpy release

2008-04-23 Thread David Huard

2008/4/23, Stéfan van der Walt <[EMAIL PROTECTED]>:
>
> Hi Jarrod
>
> Of those tickets, the following are serious:
>
> http://projects.scipy.org/scipy/numpy/ticket/605 (a patch is
> available?, David Huard)
>   Fixing of histogram.
>

I haven't found a way to fix histogram reliably without breaking the current
behavior. There is a patch attached to the ticket, if the decision is to
break histogram.

David
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)

2008-04-09 Thread David Huard

2008/4/9, Gael Varoquaux <[EMAIL PROTECTED]>:
>
> [snip]
>
> Some people do not want their scripts to scale or to last more than a day.


And that's what Matlab is especially good at ! ; )

And I'll say the thing I'm dying to say since this started: If anybody other
than Travis had suggested we put financial functions in numpy the response
would have been: make it a scikit, let the functions mature and evolve, get
some feedback from users and then we'll see where they fit in. The fact that
we are still discussing this shows the huge amount of respect Travis has in
this community, but also the lack of guidelines for NumPy's growth. Maybe
it's time for us to decide on a procedure for NEPs (Numpy Enhancement
Proposals) !

Regards,

David
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] ticket #605

2008-04-09 Thread David Huard

Hello Jarrod and co.,

here is my personal version of the histogram saga.

The current version of histogram puts in the rightmost bin all values larger
than range, but does not put in the leftmost bin all values smaller than
bin, eg.

In [6]: histogram([1,2,3,4,5,6], bins=3, range=[2,5])
Out[6]: (array([1, 1, 3]), array([ 2.,  3.,  4.]))

It discards 1, but puts 2 in the first bin, 3 in the second bin, and 4,5,6
in the third bin.  Also, the docstring  says that outliers are put in the
closest bin, which is false. Another point to consider is normalization.
Currently, the normalization factor is db=bin[1]-bin[0]. Of course, if the
bins are not equally spaced, this will yield a spurious density. Also, I'd
argue that since the rightmost bin covers the space from bin[-1] to
infinity, it's density should always be zero.

Now if someone wants to explain all that in the docstring, that's fine by
me. I fully understand the need to avoid breaking people's code. I simply
hope that in the next big release, this behavior can be changed to something
that is simpler: bins are the bin edges (instead of the left edges), and
everything outside the edges is ignored. This would be a nice occasion to
add an axis keyword and possibly weights, and would make histogram
consistent with histogramdd. I'm willing to implement those changes, but I
don't know how to do so without breaking histogram's behavior.

I just got Bruce reply, so sorry for the overlap.

David

2008/4/9, Jarrod Millman <[EMAIL PROTECTED]>:
>
> Hello,
>
> I just turned this one into a blocker for now.  There has been a very
> long and good discussion about this ticket:
> http://projects.scipy.org/scipy/numpy/ticket/605
>
> Could someone (David?, Bruce?) briefly summarize the problem and the
> current proposed solution for us again?  Let's agree on the problem
> and the solution.  I want to have something similiar to what is
> written about median for this release:
> http://projects.scipy.org/scipy/numpy/milestone/1.0.5
>
> I agree with David's sentiment:  "This issue has been raised a number
> of times since I follow this ML. It's not the first time I've proposed
> patches, and I've already documented the weird behavior only to see
> the comments disappear after a while. I hope this time some kind of
> agreement will be reached."
>
> If you give me the short summary I will make sure Travis or Eric
> respond (and I will put it in the release notes).
>
> Thanks,
>
>
> --
> Jarrod Millman
> Computational Infrastructure for Research Labs
> 10 Giannini Hall, UC Berkeley
> phone: 510.643.4014
> http://cirl.berkeley.edu/
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Ticket #605 Incorrect behavior of numpy.histogram

2008-04-08 Thread David Huard

2008/4/8, Bruce Southey <[EMAIL PROTECTED]>:
>
> Hi,
> I agree that the current histogram should be changed. However, I am not
> sure 1.0.5 is the correct release for that.


We both agree.

David, this doesn't work for your code:
> r= np.array([1,2,2,3,3,3,4,4,4,4,5,5,5,5,5])
> dbin=[2,3,4]
> rc, rb=histogram(r, bins=dbin, discard=None)

Returns:
> rc=[3 3] # Really should be [3, 3, 9]
> rb=[-92233720368547758083 -9223372036854775808]


I used the convention that bins are the bin edges, including the right most
edge, this is why len(rc) =2 and len(rb)=3.

Now there clearly is a bug, and I traced it to the use of np.r_. Check this
out:

In [26]: dbin = [1,2,3]

In [27]: np.r_[-np.inf, dbin, np.inf]
Out[27]: array([-Inf,   1.,   2.,   3.,  Inf])

In [28]: np.r_[-np.inf, asarray(dbin), np.inf]
Out[28]:
array([-9223372036854775808,1,
2,  3, -9223372036854775808])

In [29]: np.r_[-np.inf, asarray(dbin).astype(float), np.inf]
Out[29]: array([-Inf,   1.,   2.,   3.,  Inf])

Is this a misuse of r_ or a bug ?


David








But I have not had time to find the error.
>
> Regards
> Bruce
>
>
>
> David Huard wrote:
> > Hans,
> >
> > Note that the current histogram is buggy, in the sense that it assumes
> > that all bins have the same width and computes db = bins[1]-bin[0].
> > This is why you get zeros everywhere.
> >
> > The current behavior has been heavily criticized and I think we should
> > change it. My proposal is to have for histogram the same behavior as
> > for histogramdd and histogram2d: bins are the bin edges, including the
> > rightmost bin, and values outside of the bins are not tallied. The
> > problem with this is that it breaks code, and I'm not sure it's such a
> > good idea to do this in a point release.
> >
> > My short term proposal would be to fix the normalization bug and
> > document the current behavior of histogram for the 1.0.5 release. Once
> > it's done, we can modify histogram and maybe print a warning the first
> > time it's used to notice users of the change.
> >
> > I'd like to hear the voice of experienced devs on this. This issue has
> > been raised a number of times since I follow this ML. It's not the
> > first time I've proposed patches, and I've already documented the
> > weird behavior only to see the comments disappear after a while. I
> > hope this time some kind of agreement will be reached.
> >
> > Regards,
> >
> > David
> >
> >
> >
> >
> > 2008/4/8, Hans Meine <[EMAIL PROTECTED]
>
> > <mailto:[EMAIL PROTECTED]>>:
>
> >
> > Am Montag, 07. April 2008 14:34:08 schrieb Hans Meine:
> >
> > > Am Samstag, 05. April 2008 21:54:27 schrieb Anne Archibald:
> > > > There's also a fourth option - raise an exception if any
> > points are
> > > > outside the range.
> > >
> > > +1
> > >
> > > I think this should be the default.  Otherwise, I tend towards
> > "exclude",
> > > in order to have comparable bin sizes (when plotting, I always
> > find peaks
> > > at the ends annoying); this could also be called "clip" BTW.
> > >
> > > But really, an exception would follow the Zen: "In the face of
> > ambiguity,
> > > refuse the temptation to guess."  And with a kwarg: "Explicit is
> > better
> > > than implicit."
> >
> >
> > When posting this, I did indeed not think this through fully; as
> > David (and
> > Tommy) pointed out, this API does not fit well with the existing
> > `bins`
> > option, especially when a sequence of bin bounds is given.  (I
> > guess I was
> > mostly thinking about the special case of discrete values and 1:1
> > bins, as
> > typical for uint8 data.)
> >
> > Thus, I would like to withdraw my above opinion from and instead
> > state that I
> > find the current API as clear as it gets.  If you want to exclude
> > values,
> > simply pass an additional right bound, and for including outliers,
> > passing -inf as additional left bound seems to do the trick.  This
> > could be
> > possibly added to the documentation though.
> >
> > The only critical aspect I see is the `normed` arg.  As it is now,
> the
> > rightmost bin has always infinite size, but it is not treated like
> > that:
> >

Re: [Numpy-discussion] Ticket #605 Incorrect behavior of numpy.histogram

2008-04-08 Thread David Huard

Hans,

Note that the current histogram is buggy, in the sense that it assumes that
all bins have the same width and computes db = bins[1]-bin[0]. This is why
you get zeros everywhere.

The current behavior has been heavily criticized and I think we should
change it. My proposal is to have for histogram the same behavior as for
histogramdd and histogram2d: bins are the bin edges, including the rightmost
bin, and values outside of the bins are not tallied. The problem with this
is that it breaks code, and I'm not sure it's such a good idea to do this in
a point release.

My short term proposal would be to fix the normalization bug and document
the current behavior of histogram for the 1.0.5 release. Once it's done, we
can modify histogram and maybe print a warning the first time it's used to
notice users of the change.

I'd like to hear the voice of experienced devs on this. This issue has been
raised a number of times since I follow this ML. It's not the first time
I've proposed patches, and I've already documented the weird behavior only
to see the comments disappear after a while. I hope this time some kind of
agreement will be reached.

Regards,

David




2008/4/8, Hans Meine <[EMAIL PROTECTED]>:
>
> Am Montag, 07. April 2008 14:34:08 schrieb Hans Meine:
>
> > Am Samstag, 05. April 2008 21:54:27 schrieb Anne Archibald:
> > > There's also a fourth option - raise an exception if any points are
> > > outside the range.
> >
> > +1
> >
> > I think this should be the default.  Otherwise, I tend towards
> "exclude",
> > in order to have comparable bin sizes (when plotting, I always find
> peaks
> > at the ends annoying); this could also be called "clip" BTW.
> >
> > But really, an exception would follow the Zen: "In the face of
> ambiguity,
> > refuse the temptation to guess."  And with a kwarg: "Explicit is better
> > than implicit."
>
>
> When posting this, I did indeed not think this through fully; as David
> (and
> Tommy) pointed out, this API does not fit well with the existing `bins`
> option, especially when a sequence of bin bounds is given.  (I guess I was
> mostly thinking about the special case of discrete values and 1:1 bins, as
> typical for uint8 data.)
>
> Thus, I would like to withdraw my above opinion from and instead state
> that I
> find the current API as clear as it gets.  If you want to exclude values,
> simply pass an additional right bound, and for including outliers,
> passing -inf as additional left bound seems to do the trick.  This could
> be
> possibly added to the documentation though.
>
> The only critical aspect I see is the `normed` arg.  As it is now, the
> rightmost bin has always infinite size, but it is not treated like that:
>
> In [1]: from numpy import *
>
> In [2]: histogram(arange(10), [2,3,4], normed = True)
> Out[2]: (array([ 0.1,  0.1,  0.6]), array([2, 3, 4]))
>
> Even worse, if you try to add an infinite bin to the left, this pulls all
> values to zero (technically, I understand that, but it looks really
> undesirable to me):
>
> In [3]: histogram(arange(10), [-inf, 2,3,4], normed = True)
> Out[3]: (array([ 0.,  0.,  0.,  0.]), array([-Inf,   2.,   3.,   4.]))
>
>
> --
> Ciao, /  /
>  /--/
> /  / ANS
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Ticket #605 Incorrect behavior of numpy.histogram

2008-04-07 Thread David Huard

> On Apr 7, 2008, at 4:14 PM, LB wrote:
> > +1 for axis and +1 for a keyword to define what to do with values
> > outside the range.
> >
> > For the keyword, ather than 'outliers', I would propose 'discard' or
> > 'exclude', because it could be used to describe the four
> > possibilities :
> >  - discard='low'  => values lower than the range are discarded,
> > values higher are added to the last bin
> >   - discard='up'   => values higher than the range are discarded,
> > values lower are added to the first bin
> >   - discard='out'  => values out of the range are discarded
> >   - discard=None=> values outside of this range are allocated to
> > the closest bin
> >

Suppose you set bins=5, range=[0,10], discard=None, should the returned bins
be [0,2,4,6,810] or [-inf, 2, 4, 6, 8, inf] ?
Now suppose normed=True, what should be the density for the first and last
bin ? It seems to me it should be zero since we are assuming that the bins
extend to -infinity and infinity, but then, taking the outliers into account
seems pretty useless.

Overall, I think "discard" is a confusing option with little added value.
Getting the outliers is simply a matter of defining the bin edges explictly,
ie [-inf, x0, x1, ..., xn, inf].

In any case, attached is a version of histogram implementing the axis and
discard keywords. I'd really prefer though if we dumped the discard option.

David

>
>
>
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
def histogram(a, bins=10, range=None, normed=False, discard='out', axis=None):
"""Compute the histogram from a set of data.

Parameters:

a : array
The data to histogram.

bins : int or sequence of floats
If an int, then the number of equal-width bins in the given range.
Otherwise, a sequence of the lower bound of each bin.

range : (float, float)
The lower and upper range of the bins. If not provided, then
(a.min(), a.max()) is used. Values outside of this range are
allocated according to the discard keyword.

normed : bool
If False, the result array will contain the number of samples in
each bin.  If True, the result array is the value of the
probability *density* function at the bin normalized such that the
*integral* over the range is 1. Note that the sum of all of the
histogram values will not usually be 1; it is not a probability
*mass* function.

discard : out, low, high, None
With out, values outside range are not tallied, using low (high),
values lower (greater) than range are discarded, and values higher
(lower) than range are tallied in the closest bin. Using None,
values outside of range are stored in the closest bin.

axis : None or int
Axis along which histogram is performed. If None, applies on the
entire array.

Returns:

hist : array
The values of the histogram. See `normed` for a description of the
possible semantics.

edges : float array
The bins edges.

SeeAlso:

histogramdd

"""
a = asarray(a).ravel()

if (range is not None):
mn, mx = range
if (mn > mx):
raise AttributeError, 'max must be larger than min in range parameter.'

if not iterable(bins):
if range is None:
range = (a.min(), a.max())
mn, mx = [mi+0.0 for mi in range]
if mn == mx:
mn -= 0.5
mx += 0.5
bins = linspace(mn, mx, bins+1, endpoint=True)
else:
bins = asarray(bins)
if (np.diff(bins) < 0).any():
raise AttributeError, 'bins must increase monotonically.'

if discard is None:
bins = np.r_[-np.inf, bins[1:-1], np.inf]
elif discard == 'low':
bins = np.r_[bins[:-1], np.inf]
elif discard == 'high':
bins = np.r_[-np.inf, bins[1:]]
elif discard == 'out':
pass
else:
raise ValueError, 'discard keyword not in None, out, high, low : %s'%discard

if axis is None:
return histogram1d(a.ravel(), bins, normed), bins
else:
return np.apply_along_axis(histogram1d, axis, a, bins, normed), bins

def histogram1d(a, bins, normed):
"""Internal usage function to compute an histogram on a 1D array.

Parameters:
  a : array
  The data to histogram.
  bins : sequence
  The edges of the bins.
  normed : bool
  If false, return the number of samples falling into each bin. If true, return
  the density of the sample in each bin.
"""
# best block size probably depends on processor cache size
block = 65536
n = np.zeros(bins.shape,

Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)

2008-04-07 Thread David Huard

2008/4/4, Joe Harrington <[EMAIL PROTECTED]>:
>
> import numpy  as N
> import numpy.math as N.M
> import numpy.trig as N.T
> import numpy.stat as N.S



I don't think the issue is whether to put everything in the base namespace
// everything in individual namespace, but rather to find an optimal and
intuitive mix between the two. For instance, the io functions would be
easier to find by typing np.io.loadtxt than by sifting through the 500+
items of the base namespace. The stats functions could equally well be in a
separate namespace, given that the most used are implemented as array
methods. I think this would allow numpy to grow more gracefully.

As for the financial functions, being specific to a discipline, I think they
rather belongs with scipy. The numpy namespace will quickly become a mess if
we add np.geology, np.biology, np.material, etc.

Of course, this raises the problem of distributing scipy, and here is a
suggestion:

Change the structure of scipy so that it looks like the scikits:

scipy/
sparse/
cluster/
financial/
...
fftpack/
   setup.py
   scipy/
__init__.py
fftpack/


The advantage is that each subpackage can be installed independently of the
others. For distribution, we could lump all the pure python or easy to
compile packages into scipy.common, and distribute the other packages such
as sparse and fftpack independently. My feeling is that such a lighter
structure would encourage projects with large code base to join the scipy
community. It would also allow folks with 56k modems to download only what
they need.

David
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Ticket #605 Incorrect behavior of numpy.histogram

2008-04-07 Thread David Huard

+1 for an outlier keyword. Note, that this implies that when bins are passed
explicitly, the edges are given (nbins+1), not simply the left edges
(nbins).

While we are refactoring histogram, I'd suggest adding an axis keyword. This
is pretty straightforward to implement using the np.apply_along_axis
function.

Also, I noticed that current normalization is buggy for non-uniform bin
sizes.
if normed:
db = bins[1] - bins[0]
return 1.0/(a.size*db) * n, bins

Finally, whatever option is chosen in the end, we should make sure it is
consistent across all histogram functions. This may mean that we will also
break the behavior of histogramdd and histogram2d.

Bruce: I did some work over the weekend on the histogram function, including
tests. If you want, I'll send that to you in the evening.

David




2008/4/7, Hans Meine <[EMAIL PROTECTED]>:
>
> Am Samstag, 05. April 2008 21:54:27 schrieb Anne Archibald:
>
> > There's also a fourth option - raise an exception if any points are
> > outside the range.
>
>
> +1
>
> I think this should be the default.  Otherwise, I tend towards "exclude",
> in
> order to have comparable bin sizes (when plotting, I always find peaks at
> the
> ends annoying); this could also be called "clip" BTW.
>
> But really, an exception would follow the Zen: "In the face of ambiguity,
> refuse the temptation to guess."  And with a kwarg: "Explicit is better
> than
> implicit."
>
> histogram(a, arange(10), outliers = "clip")
> histogram(a, arange(10), outliers = "include")
> # better names? "include"->"accumulate"/"map to border"/"map"/"boundary"
>
>
> --
> Ciao, /  /
>  /--/
> /  / ANS
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] loading data with gaps

2008-04-04 Thread David Huard

Hi Tim,

Look at the thread posted a couple of weeks ago named: loadtxt and missing
values

I'm guessing you'll find answers to your questions, if not, don't hesitate
to ask.

David


2008/4/3, Tim Michelsen <[EMAIL PROTECTED]>:
>
> Hello!
>
> How can I load a data file (e.g. CSV, DAT) in ASCII which has some gaps?
>
> The file has been saved with from a spreadsheet program which leaves
> cells with not data empty:
>
>
> 1,23.
> 2,13.
> 3,
> 4,34.
>
> Would this code be correct:
> ### test_loadtxt.py ###
> import numpy
> import maskedarray
>
> # load data which has empty 'cells' as beeing saved from spreadsheet:
> # 1,23.
> # 2,13.
> # 3,
> # 4,34.
> data = numpy.loadtxt('./loadtxt_test.csv',dtype=str,delimiter=',')
>
>
> # create a masked array with all no data ('', empty cells from CSV) masked
> my_masked_array = maskedarray.masked_equal(data,'')
> ##
>
> * How can I change the data type of my maskedarray (my_masked_array) to
> a type that allows me to perform calulations?
>
> * Would you do this task differently or more efficient?
>
> * What possibilities do I have to estimate/interpolate the masked values?
> A example would be nice.
>
> * How to I convert maskedarray (my_masked_array) to a array without
> masked values?
>
> Thanks in advance for your help,
> Tim Michelsenholg
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] isnan bug?

2008-03-20 Thread David Huard

Chris,

The trac  page is to place to file
tickets.

Note that you have to register first before you can file new tickets.


David

2008/3/20, Chris Withers <[EMAIL PROTECTED]>:
>
> Hi All,
>
> I'm faily sure that:
>
> numpy.isnan(datetime.datetime.now())
>
> ...should just return False and not raise an exception.
>
> Where can I raise a bug to this effect?
>
> cheers,
>
> Chris
>
>
> --
> Simplistix - Content Management, Zope & Python Consulting
> - http://www.simplistix.co.uk
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

[Numpy-discussion] Proposed change to average function

2008-03-18 Thread David Huard

In the process of addressing tickets for the next release, Charles Harris
and I made some changes to the internals of the average function which also
affects which input are accepted as valid.

According to the current documentation, weights can either be 1D or any
shape that can be broadcasted to a's shape. It seems, though, that the
broadcasting was partially broken. After some thought, we are proposing that
average only accepts weights that are either
 - 1D with length equal to a's shape along axis.
 - the same shape as a.

and raises an error otherwise. I think this reduces the risk of unexpected
results but wanted to know if anyone disagrees with the change.

The proposed version is implemented in revision 4888.


Regards,

David Huard
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] subset of array - statistics

2008-03-14 Thread David Huard

Look at the timeseries package in scikits (only on svn i'm afraid). You'll
find exactly what you're looking for. Conversion from daily to monthly or
yearly time series is a breeze.

Cheers,

David

2008/3/13, Joris De Ridder <[EMAIL PROTECTED]>:
>
>
> I am new to the world of Python and numpy
>
>
> Welcome.
>
> I have successfully imported the data into lists and then created a single
> array from the lists.
>
>
> I think putting each quantity in a 1D array is more practical in this
> case.
>
> I can get the rainfall total over the entire period using:
>
> 
>
> But what i would like to do is get an average rainfall for each month and
> also
> the ability to get rainfall totals for any month and Year
>
>
> Assuming that yr, mth and rain are 1D arrays, you may try something along
>
> [[average(rain[(yr == y) & (mth == m)]) for m in unique(mth[yr==y])] for y
> in unique(yr)]
>
> which gives you the monthly average rainfalls stored in lists, one for
> each year.
>
> The rain data cannot be reshaped in a 3D numpy array, because not all
> months have the same number of days, and not all years have the same number
> of months. If they could, numpy would allow you to do something like:
>
> average(rain.reshape(Nyear, Nmonth, Nday), axis =-1)
>
> to get the same result.
>
> J.
>
>
>
> Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm for more
> information.
>
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Help needed with numpy 10.5 release blockers

2008-03-14 Thread David Huard

I added a test for ticket 691. Problem is, there seems to be a new bug. I
don't know it its related to the change or if it was there before. Please
check this out.

David

2008/3/14, David Huard <[EMAIL PROTECTED]>:
>
> I added a test for ticket 690.
>
> 2008/3/13, Barry Wark <[EMAIL PROTECTED]>:
> >
> > I appologize that the Mac OSX buildbot has been so flakey. For some
> > reason it stops being able to resolve scipy.org on a regular basis
> > (though other processes on the same machine don't seem to have
> > trouble). Restarting the slave fixes the issue. Anyways, if anyone is
> > testing an OS X issue and the svn update fails, let me know.
> >
> >
> > Barry
> >
> >
> > On Thu, Mar 13, 2008 at 2:38 AM, Jarrod Millman <[EMAIL PROTECTED]>
> > wrote:
> > > On Wed, Mar 12, 2008 at 10:43 PM, Jarrod Millman <[EMAIL PROTECTED]>
> > wrote:
> > >  >  Stefan and I also triaged the remaining tickets--closing several
> > and
> > >  >  turning others in to release blockers:
> > >  >
> > http://scipy.org/scipy/numpy/query?status=new&severity=blocker&milestone=1.0.5&order=priority
> > >  >
> > >  >  I think that it is especially important that we spend some time
> > trying
> > >  >  to make the 1.0.5 release rock solid.  There are several important
> > >  >  changes in the trunk so I really hope we can get these tickets
> > >  >  resolved ASAP.  I need everyone's help getting this release
> > out.  If
> > >  >  you can help work on any of the open release blockers, please try
> > to
> > >  >  close them over the weekend.  If you have any ideas about the
> > tickets
> > >  >  but aren't exactly sure how to resolve them please post a message
> > to
> > >  >  the list or add a comment to the ticket.
> > >
> > >  Hello,
> > >
> > >  I just noticed that David Cournapeau fixed one of the blockers
> > moments
> > >  after I sent out my email asking for help:
> > >  http://projects.scipy.org/scipy/numpy/ticket/688
> > >
> > >  Thanks David!
> > >
> > >  So we are down to 12 tickets blocking the release.  Some of the
> > >  tickets are just missing tests, so they should be fairly easy to
> > >  implement--for anyone who wants to help get this release out ASAP.
> > >
> > >  Cheers,
> > >
> > >  --
> > >
> > >
> > > Jarrod Millman
> > >  Computational Infrastructure for Research Labs
> > >  10 Giannini Hall, UC Berkeley
> > >  phone: 510.643.4014
> > >  http://cirl.berkeley.edu/
> > >  ___
> > >  Numpy-discussion mailing list
> > >  Numpy-discussion@scipy.org
> > >  http://projects.scipy.org/mailman/listinfo/numpy-discussion
> > >
> > ___
> > Numpy-discussion mailing list
> > Numpy-discussion@scipy.org
> > http://projects.scipy.org/mailman/listinfo/numpy-discussion
> >
>
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Help needed with numpy 10.5 release blockers

2008-03-14 Thread David Huard

I added a test for ticket 690.

2008/3/13, Barry Wark <[EMAIL PROTECTED]>:
>
> I appologize that the Mac OSX buildbot has been so flakey. For some
> reason it stops being able to resolve scipy.org on a regular basis
> (though other processes on the same machine don't seem to have
> trouble). Restarting the slave fixes the issue. Anyways, if anyone is
> testing an OS X issue and the svn update fails, let me know.
>
>
> Barry
>
>
> On Thu, Mar 13, 2008 at 2:38 AM, Jarrod Millman <[EMAIL PROTECTED]>
> wrote:
> > On Wed, Mar 12, 2008 at 10:43 PM, Jarrod Millman <[EMAIL PROTECTED]>
> wrote:
> >  >  Stefan and I also triaged the remaining tickets--closing several and
> >  >  turning others in to release blockers:
> >  >
> http://scipy.org/scipy/numpy/query?status=new&severity=blocker&milestone=1.0.5&order=priority
> >  >
> >  >  I think that it is especially important that we spend some time
> trying
> >  >  to make the 1.0.5 release rock solid.  There are several important
> >  >  changes in the trunk so I really hope we can get these tickets
> >  >  resolved ASAP.  I need everyone's help getting this release out.  If
> >  >  you can help work on any of the open release blockers, please try to
> >  >  close them over the weekend.  If you have any ideas about the
> tickets
> >  >  but aren't exactly sure how to resolve them please post a message to
> >  >  the list or add a comment to the ticket.
> >
> >  Hello,
> >
> >  I just noticed that David Cournapeau fixed one of the blockers moments
> >  after I sent out my email asking for help:
> >  http://projects.scipy.org/scipy/numpy/ticket/688
> >
> >  Thanks David!
> >
> >  So we are down to 12 tickets blocking the release.  Some of the
> >  tickets are just missing tests, so they should be fairly easy to
> >  implement--for anyone who wants to help get this release out ASAP.
> >
> >  Cheers,
> >
> >  --
> >
> >
> > Jarrod Millman
> >  Computational Infrastructure for Research Labs
> >  10 Giannini Hall, UC Berkeley
> >  phone: 510.643.4014
> >  http://cirl.berkeley.edu/
> >  ___
> >  Numpy-discussion mailing list
> >  Numpy-discussion@scipy.org
> >  http://projects.scipy.org/mailman/listinfo/numpy-discussion
> >
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Transforming an array of numbers to an array of formatted strings

2008-03-13 Thread David Huard

['S%03d'%i for i in int_data]

David


2008/3/13, Alan G Isaac <[EMAIL PROTECTED]>:
>
> On Thu, 13 Mar 2008, Alexander Michael apparently wrote:
> > I want to format an array of numbers as strings.
>
>
> To what end?
> Note that tofile has a format option.
> And for 1d array ``x`` you can always do::
>
> strdata = list( fmt%xi for xi in x)
>
> Nice because the counter name does not "bleed" into your program.
>
> Cheers,
> Alan Isaac
>
>
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] loadtxt broken if file does not end in newline

2008-02-27 Thread David Huard

Hi Christopher,

The advantage of using regular expressions is that in this case it gives you
some flexibility that wasn't there before. For instance, if for any reason
there are two type of characters that coexist in the file to mark comments,
using

pattern = re.compile(comments)
for i,line in enumerate(fh):
 if i:
>
> David Huard wrote:
> > Would everyone be satisfied with a solution using regular expressions ?
>
>
> Maybe it's because regular expressions make me itch, but I think it's
> overkill for this.
>
> The issue here is a result of what I consider a wart in python's string
> methods -- string.find() returns a valid index( -1 ) when it fails to
> find anything. The usual way to work with this is to test for it:
>
> print "test for comment not found:"
> for line in SampleLines:
>  i = line.find(comments)
>  if i == -1:
>  line = line.strip()
>  else:
>  line = line[:i].strip()
>  print line
>
> which does seem like a lot of extra code.
>
> In this case, that wasn't' done, as most of the time there is a newline
> at the end that can be thrown away anyway, so the -1 index is OK. So
> that inspired the following solution -- just add an extra space every
> time:
>
> print "simply pad the line with a space:"
> for line in SampleLines:
>  line += " "
>
>  line = line[:(line).find(comments)].strip()
>
>  print line
>
> an extra string creation, but simple.
>
>
> > pattern = re.compile(r"""
> > ^\s* # leading white space
> > (.*) # Data
> > %s?  # Zero or one comment character
> > (.*) # Comments
> > \s*$ # Trailing white space
> > """%comments, re.VERBOSE)
>
>
> This pattern fails if the last character of the line is a comment
> character, and if it is a comment only line, though I'm sure that could
> be fixed. I still prefer the python string methods approaches, though.
>
> I've enclosed a little test code, that gives these results:
>
> old way -- this fails with no comment of newline
> 1 2 3 4 5
> 1 2 3 4
> 1 2 3 4 5
>
> with regular expression:
> 1 2 3 4 5
> 1 2 3 4 5
> 1 2 3 4 5#
> # 1 2 3 4 5
> simply pad the line with a space:
> 1 2 3 4 5
> 1 2 3 4 5
> 1 2 3 4 5
>
> test for comment not found:
> 1 2 3 4 5
> 1 2 3 4 5
> 1 2 3 4 5
>
> My suggestions work on all my test cases. We really should put these,
> and others, into a real unit test when this fix is added.
>
> -Chris
>
> --
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R(206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115   (206) 526-6317   main reception
>
> [EMAIL PROTECTED]
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
>
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] loadtxt broken if file does not end in newline

2008-02-27 Thread David Huard

Lisandro,

When you have some time, could you check this patch solves your problem (and
does not introduce new ones) ?

David


Index: numpy/lib/io.py
===
--- numpy/lib/io.py (revision 4824)
+++ numpy/lib/io.py (working copy)
@@ -11,6 +11,7 @@
 import cStringIO
 import tempfile
 import os
+import re

 from cPickle import load as _cload, loads
 from _datasource import DataSource
@@ -291,9 +292,12 @@
 converterseq = [_getconv(dtype.fields[name][0]) \
 for name in dtype.names]

+# Remove comments and leading/trailing white space
+pattern = re.compile(comments)
 for i,line in enumerate(fh):
 if i:
>
> I can look at it.
>
> Would everyone be satisfied with a solution using regular expressions ?
> That is, looking for the following pattern:
>
> pattern = re.compile(r"""
> ^\s* # leading white space
> (.*) # Data
> %s?  # Zero or one comment character
> (.*) # Comments
> \s*$ # Trailing white space
> """%comments, re.VERBOSE)
>
> match = pattern.search(line)
> line, comment = match.groups()
>
> instead of
>
> line = line[:line.find(comments)].strip()
>
> By the way, is there a test function for loadtxt and savetxt ? I couldn't
> find one.
>
>
> David
>
> 2008/2/26, Alan G Isaac <[EMAIL PROTECTED]>:
> >
> > On Tue, 26 Feb 2008, Lisandro Dalcin apparently wrote:
> > > I believe the current 'loadtxt' function is broken
> >
> >
> > I agree:
> >  > http://projects.scipy.org/pipermail/numpy-discussion/2007-November/030057.html
> > >
> >
> > Cheers,
> >
> > Alan Isaac
> >
> >
> >
> >
> > ___
> > Numpy-discussion mailing list
> > Numpy-discussion@scipy.org
> > http://projects.scipy.org/mailman/listinfo/numpy-discussion
> >
>
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] loadtxt broken if file does not end in newline

2008-02-27 Thread David Huard

I can look at it.

Would everyone be satisfied with a solution using regular expressions ?
That is, looking for the following pattern:

pattern = re.compile(r"""
^\s* # leading white space
(.*) # Data
%s?  # Zero or one comment character
(.*) # Comments
\s*$ # Trailing white space
"""%comments, re.VERBOSE)

match = pattern.search(line)
line, comment = match.groups()

instead of

line = line[:line.find(comments)].strip()

By the way, is there a test function for loadtxt and savetxt ? I couldn't
find one.


David

2008/2/26, Alan G Isaac <[EMAIL PROTECTED]>:
>
> On Tue, 26 Feb 2008, Lisandro Dalcin apparently wrote:
> > I believe the current 'loadtxt' function is broken
>
>
> I agree:
>  http://projects.scipy.org/pipermail/numpy-discussion/2007-November/030057.html
> >
>
> Cheers,
>
> Alan Isaac
>
>
>
>
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] David's build_with_scons branch merged!

2008-02-08 Thread David Huard

Jarrod and David,

I am reporting a success on FC8, Xeon. Some tests don't pass, but I don't
believe it is related to the build process.

Well done,

David

2008/2/8, Jarrod Millman <[EMAIL PROTECTED]>:
>
> Hello,
>
> In preparation for the upcoming NumPy 1.0.5 release, I just merged
> David Cournapeau's build_with_scons branch:
> http://projects.scipy.org/scipy/numpy/changeset/4773
>
> The current build system using numpy.distutils is still the default.
> NumPy does not include numscons; this merge adds scons support to
> numpy.distutils, provides some scons scripts, and modifies the
> configuration of numpy/core.  David has extensively tested these
> changes and I did a very quick sanity check to make sure I didn't
> completely break everything.
>
> Obviously, we will need to push back the 1.0.5 release date again to
> ensure that there is sufficient testing.  So please test these changes
> and let us know if you have any problems (or successes).
>
> David has been putting in a considerable effort over the last several
> months in developing numscons.  If you are interested in the
> advantages to Davids approach, please read the description here:
> http://projects.scipy.org/scipy/numpy/wiki/NumpyScons
>
> Thanks,
>
> --
> Jarrod Millman
> Computational Infrastructure for Research Labs
> 10 Giannini Hall, UC Berkeley
> phone: 510.643.4014
> http://cirl.berkeley.edu/
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] histogramdd memory needs

2008-02-04 Thread David Huard

2008/2/4, Lars Friedrich <[EMAIL PROTECTED]>:
>
> Hi,
>
> > 2) Is there a way to use another algorithm (at the cost of performance)
> >> > that uses less memory during calculation so that I can generate
> bigger
> >> > histograms?
> >
> >
> > You could work through your array block by block. Simply fix the range
> and
> > generate an histogram for each slice of 100k data and sum them up at the
> > end.
>
> Thank you for your answer.
>
> I sliced the (original) data into blocks. However, when I do this, I
> need at least twice the memory for the whole histogram (one for the
> temporary result and one for accumulating the total result). Assuming my
> histogram has a size of (280**3)*8 = 176 (megabytes) this does not help,
> I think.
>
> What I will try next is to compute smaller parts of the big histogram
> and combine them at the end. (Slice the histogram into blocks) Is it
> this, that you were recommending?


It was badly explained, sorry, but the goal is to reduce memory footprint,
so storing each intermediate result and adding them at the end does not help
indeed. You should update the partial histogram as soon as a block is
computed. I'm sending you a script that does this for 1D histograms. This
comes from the pymc code base. Look at the histogram function in utils.py.

Cheers,

David


Lars
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>


utils.py
Description: Binary data
C***
C RETURN THE HISTOGRAM OF ARRAY X, THAT IS, THE NUMBER OF ELEMENTS
C IN X FALLING INTO EACH BIN.
C THE BIN ARRAY CONSISTS IN N BINS STARTING AT BIN0 WITH WIDTH DELTA.
C HISTO H : | LOWER OUTLIERS | 1 | 2 | 3 | ... |  N  | UPPER OUTLIERS |
C INDEX i : |1   | 2 | 3 | 4 | ... | N+1 |  N+2   |

  SUBROUTINE FIXED_BINSIZE(X, BIN0, DELTA, N, NX, H)

C PARAMETERS
C --
C X : ARRAY 
C BIN0 : LEFT BIN EDGE
C DELTA : BIN WIDTH
C N : NUMBER OF BINS
C H : HISTOGRAM

  IMPLICIT NONE
  INTEGER :: N, NX, i, K
  DOUBLE PRECISION ::  X(NX), BIN0, DELTA
  INTEGER :: H(N+2), UP, LOW

CF2PY INTEGER INTENT(IN) :: N
CF2PY INTEGER INTENT(HIDE) :: NX = LEN(X)
CF2PY DOUBLE PRECISION DIMENSION(NX), INTENT(IN) :: X
CF2PY DOUBLE PRECISION INTENT(IN) :: BIN0, DELTA
CF2PY INTEGER DIMENSION(N+2), INTENT(OUT) :: H


  DO i=1,N+2
H(i) = 0
  ENDDO
  
C OUTLIERS INDICES
  UP = N+2
  LOW = 1

  DO i=1,NX
IF (X(i) >= BIN0) THEN
  K = INT((X(i)-BIN0)/DELTA)+1
  IF (K <= N) THEN
H(K+1) = H(K+1) + 1
  ELSE 
H(UP) = H(UP) + 1
  ENDIF
ELSE 
  H(LOW) = H(LOW) + 1
ENDIF
  ENDDO

  END SUBROUTINE



C***
C RETURN THE WEIGHTED HISTOGRAM OF ARRAY X, THAT IS, THE SUM OF THE 
C WEIGHTS OF THE ELEMENTS OF X FALLING INTO EACH BIN.
C THE BIN ARRAY CONSISTS IN N BINS STARTING AT BIN0 WITH WIDTH DELTA.
C HISTO H : | LOWER OUTLIERS | 1 | 2 | 3 | ... |  N  | UPPER OUTLIERS |
C INDEX i : |1   | 2 | 3 | 4 | ... | N+1 |  N+2   |

  SUBROUTINE WEIGHTED_FIXED_BINSIZE(X, W, BIN0, DELTA, N, NX, H)

C PARAMETERS
C --
C X : ARRAY 
C W : WEIGHTS
C BIN0 : LEFT BIN EDGE
C DELTA : BIN WIDTH
C N : NUMBER OF BINS
C H : HISTOGRAM

  IMPLICIT NONE
  INTEGER :: N, NX, i, K
  DOUBLE PRECISION ::  X(NX), W(NX), BIN0, DELTA, H(N+2)
  INTEGER :: UP, LOW

CF2PY INTEGER INTENT(IN) :: N
CF2PY INTEGER INTENT(HIDE) :: NX = LEN(X)
CF2PY DOUBLE PRECISION DIMENSION(NX), INTENT(IN) :: X, W
CF2PY DOUBLE PRECISION INTENT(IN) :: BIN0, DELTA
CF2PY DOUBLE PRECISION DIMENSION(N+2), INTENT(OUT) :: H


  DO i=1,N+2
H(i) = 0.D0
  ENDDO
  
C OUTLIERS INDICES
  UP = N+2
  LOW = 1

  DO i=1,NX
IF (X(i) >= BIN0) THEN
  K = INT((X(i)-BIN0)/DELTA)+1
  IF (K <= N) THEN
H(K+1) = H(K+1) + W(i)
  ELSE 
H(UP) = H(UP) + W(i)
  ENDIF
ELSE 
  H(LOW) = H(LOW) + W(i)
ENDIF
  ENDDO

  END SUBROUTINE


C*
C COMPUTE N DIMENSIONAL FLATTENED HISTOGRAM

  SUBROUTINE FIXED_BINSIZE_ND(X, BIN0, DELTA, N, COUNT, NX,D,NC)

C PARAMETERS
C --
C X : ARRAY (NXD)
C BIN0 : LEFT BIN EDGES (D)  
C DELTA : BIN WIDTH (D)
C N : NUMBER OF BINS (D)
C COUNT : FLATTENED HISTOGRAM (NC)
C NC : PROD(N(:)+2)

  IMPLICIT NONE
  INTEGER :: NX, D, NC,N(D), i, j, k, T
  DOUBLE PRECISION :: X(NX,D), BIN0(D), DELTA(D)
  INTEGER :: INDEX(NX), ORDER(D), MULT, COUNT(NC)


CF2PY DOUBLE PRECISION DIMENSION(NX,D), INTENT(IN) :: X
CF2PY DOUBLE PRECISION DIMENSION(D) :: BIN0, DELTA
CF2PY INTEGER INTENT(IN) :: N
CF2PY INTEGER DIMENSION(NC), INTENT(OUT) :: COUNT
CF2PY INTEG

Re: [Numpy-discussion] histogramdd memory needs

2008-02-01 Thread David Huard

Hi Lars,

[...]

2008/2/1, Lars Friedrich <[EMAIL PROTECTED]>:
>
>
> 1) How can I tell histogramdd to use another dtype than float64? My bins
> will be very little populated so an int16 should be sufficient. Without
> normalization, a Integer dtype makes more sense to me.


There is no way you'll be able to ask that without tweaking the histogramdd
function yourself.  The relevant bit of code is the instantiation of hist :

hist = zeros(nbin.prod(), float)


2) Is there a way to use another algorithm (at the cost of performance)
> that uses less memory during calculation so that I can generate bigger
> histograms?


You could work through your array block by block. Simply fix the range and
generate an histogram for each slice of 100k data and sum them up at the
end.

The current histogram and histogramdd implementation has the advantage of
being general, that is you can work with uniform or non-uniform bins, but it
is not particularly efficient, at least for large number of bins (>30).

Cheers,

David

My numpy version is '1.0.4.dev3937'
>
> Thanks,
> Lars
>
>
> --
> Dipl.-Ing. Lars Friedrich
>
> Photonic Measurement Technology
> Department of Microsystems Engineering -- IMTEK
> University of Freiburg
> Georges-Köhler-Allee 102
> D-79110 Freiburg
> Germany
>
> phone: +49-761-203-7531
> fax:   +49-761-203-7537
> room:  01 088
> email: [EMAIL PROTECTED]
> ___
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

1 2 >

1 - 100 of 122 matches

Mail list logo