[Numpy-discussion] Bug

2015-10-16 Thread josef.pktd

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug

2015-10-16 Thread josef.pktd
Sorry, wrong shortcut key, question will arrive later.

Josef

On Fri, Oct 16, 2015 at 1:40 PM,  wrote:

>
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in np.nonzero / Should index returning functions return ndarray subclasses?

2015-05-12 Thread Marten van Kerkwijk
Agreed that indexing functions should return bare `ndarray`. Note that in
Jaime's PR one can override it anyway by defining __nonzero__.  -- Marten

On Sat, May 9, 2015 at 9:53 PM, Stephan Hoyer sho...@gmail.com wrote:

  With regards to np.where -- shouldn't where be a ufunc, so subclasses or
 other array-likes can control its behavior with __numpy_ufunc__?

 As for the other indexing functions, I don't have a strong opinion about
 how they should handle subclasses. But it is certainly tricky to attempt to
 handle arbitrary subclasses. I would agree that the least error
 prone thing to do is usually to return base ndarrays. Better to force
 subclasses to override methods explicitly.

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in np.nonzero / Should index returning functions return ndarray subclasses?

2015-05-09 Thread Nathaniel Smith
On May 9, 2015 12:54 PM, Benjamin Root ben.r...@ou.edu wrote:

 Absolutely, it should be writable. As for subclassing, that might be
messy. Consider the following:

 inds = np.where(data > 5)

 In that case, I'd expect a normal, bog-standard ndarray because that is
what you use for indexing (although pandas might have a good argument for
having it return one of their special indexing types if data was a pandas
array...).

Pandas doesn't subclass ndarray (anymore), so they're irrelevant to this
particular discussion :-). Of course they're an argument for having a
cleaner more general way of allowing non-ndarray array-like objects, but
the legacy subclassing system will never be that.

 Next:

 foobar = np.where(data > 5, 1, 2)

 Again, I'd expect a normal, bog-standard ndarray because the scalar
elements are very simple. This question gets very complicated when
considering array arguments. Consider:

 merged_data = np.where(data > 5, data, data2)

 So, what should merged_data be? If both data and data2 are the same
types, then it would be reasonable to return the same type, if possible.
But what if they aren't the same? Maybe use array_priority to determine the
return type? Or, perhaps it does make sense to say sod it all and always
return an ndarray?

Not sure what this has to do with Jaime's post about nonzero? There is
indeed a potential question about what 3-argument where() should do with
subclasses, but that's effectively a different operation entirely and to
discuss it we'd need to know things like what it historically has done and
why that was causing problems.

-n
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in np.nonzero / Should index returning functions return ndarray subclasses?

2015-05-09 Thread Nathaniel Smith
On Sat, May 9, 2015 at 1:27 PM, Benjamin Root ben.r...@ou.edu wrote:

 On Sat, May 9, 2015 at 4:03 PM, Nathaniel Smith n...@pobox.com wrote:

 Not sure what this has to do with Jaime's post about nonzero? There is
 indeed a potential question about what 3-argument where() should do with
 subclasses, but that's effectively a different operation entirely and to
 discuss it we'd need to know things like what it historically has done and
 why that was causing problems.

 Because my train of thought started at np.nonzero(), which I have always
 just mentally mapped to np.where(), and then... squirrel!

 Indeed, np.where() has no bearing here.

Ah, gotcha :-).

There is an argument that we should try to reduce this confusion by
nudging people to use np.nonzero() consistently instead of np.where(),
via the documentation and/or a warning message...

-- 
Nathaniel J. Smith -- http://vorpus.org
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in np.nonzero / Should index returning functions return ndarray subclasses?

2015-05-09 Thread Stephan Hoyer
With regards to np.where -- shouldn't where be a ufunc, so subclasses or other 
array-likes can control its behavior with __numpy_ufunc__?


As for the other indexing functions, I don't have a strong opinion about how 
they should handle subclasses. But it is certainly tricky to attempt to handle 
arbitrary subclasses. I would agree that the least error prone thing to 
do is usually to return base ndarrays. Better to force subclasses to override 
methods explicitly.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in np.nonzero / Should index returning functions return ndarray subclasses?

2015-05-09 Thread Benjamin Root
On Sat, May 9, 2015 at 4:03 PM, Nathaniel Smith n...@pobox.com wrote:

 Not sure what this has to do with Jaime's post about nonzero? There is
 indeed a potential question about what 3-argument where() should do with
 subclasses, but that's effectively a different operation entirely and to
 discuss it we'd need to know things like what it historically has done and
 why that was causing problems.



Because my train of thought started at np.nonzero(), which I have always
just mentally mapped to np.where(), and then... squirrel!

Indeed, np.where() has no bearing here.

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Bug in np.nonzero / Should index returning functions return ndarray subclasses?

2015-05-09 Thread Jaime Fernández del Río
There is a reported bug (issue #5837
https://github.com/numpy/numpy/issues/5837) regarding different returns
from np.nonzero with 1-D vs higher dimensional arrays. A full summary of
the differences can be seen from the following output:

>>> class C(np.ndarray): pass
...
>>> a = np.arange(6).view(C)
>>> b = np.arange(6).reshape(2, 3).view(C)
>>> anz = a.nonzero()
>>> bnz = b.nonzero()

>>> type(anz[0])
<type 'numpy.ndarray'>
>>> anz[0].flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
>>> anz[0].base

>>> type(bnz[0])
<class '__main__.C'>
>>> bnz[0].flags
  C_CONTIGUOUS : False
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : False
  ALIGNED : True
  UPDATEIFCOPY : False
>>> bnz[0].base
array([[0, 1],
       [0, 2],
       [1, 0],
       [1, 1],
       [1, 2]])

The original bug report was only concerned with the non-writeability of
higher dimensional array returns, but there are more differences: 1-D
always returns an ndarray that owns its memory and is writeable, but higher
dimensional arrays return views, of the type of the original array, that
are non-writeable.

I have a branch that attempts to fix this by making both 1-D and n-D arrays:

   1. return a view, never the base array,
   2. return an ndarray, never a subclass, and
   3. return a writeable view.

I guess the most controversial choice is #2, and in fact making that change
breaks a few tests. I nevertheless think that all of the index returning
functions (nonzero, argsort, argmin, argmax, argpartition) should always
return a bare ndarray, not a subclass. I'd be happy to be corrected, but I
can't think of any situation in which preserving the subclass would be
needed for these functions.
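For concreteness, a minimal sketch (reusing the toy subclass C from the session
above) of what a caller can do today to get plain, writeable index arrays,
whichever way the decision goes; this is a caller-side workaround, not the
proposed fix itself:

import numpy as np

class C(np.ndarray):
    pass

b = np.arange(6).reshape(2, 3).view(C)

# Coerce whatever nonzero() returns into plain, writeable base ndarrays.
rows, cols = (np.asarray(idx).copy() for idx in b.nonzero())
print(type(rows), rows.flags.writeable)   # <class 'numpy.ndarray'> True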

Since we are changing the returns of a few other functions in 1.10
(diagonal, diag, ravel), it may be a good moment to revisit the behavior
for these other functions. Any thoughts?

Jaime

-- 
(\__/)
( O.o)
(  ) This is Bunny. Copy Bunny into your signature and help him with his
plans for world domination.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in np.nonzero / Should index returning functions return ndarray subclasses?

2015-05-09 Thread Nathaniel Smith
On May 9, 2015 10:48 AM, Jaime Fernández del Río jaime.f...@gmail.com
wrote:

 There is a reported bug (issue #5837) regarding different returns from
np.nonzero with 1-D vs higher dimensional arrays. A full summary of the
differences can be seen from the following output:

  class C(np.ndarray): pass
 ...
  a = np.arange(6).view(C)
  b = np.arange(6).reshape(2, 3).view(C)
  anz = a.nonzero()
  bnz = b.nonzero()

  type(anz[0])
 <type 'numpy.ndarray'>
  anz[0].flags
   C_CONTIGUOUS : True
   F_CONTIGUOUS : True
   OWNDATA : True
   WRITEABLE : True
   ALIGNED : True
   UPDATEIFCOPY : False
  anz[0].base

  type(bnz[0])
 <class '__main__.C'>
  bnz[0].flags
   C_CONTIGUOUS : False
   F_CONTIGUOUS : False
   OWNDATA : False
   WRITEABLE : False
   ALIGNED : True
   UPDATEIFCOPY : False
  bnz[0].base
 array([[0, 1],
[0, 2],
[1, 0],
[1, 1],
[1, 2]])

 The original bug report was only concerned with the non-writeability of
higher dimensional array returns, but there are more differences: 1-D
always returns an ndarray that owns its memory and is writeable, but higher
dimensional arrays return views, of the type of the original array, that
are non-writeable.

 I have a branch that attempts to fix this by making both 1-D and n-D
arrays:
 return a view, never the base array,

This doesn't matter, does it? View isn't a thing, only view of is
meaningful. And in this case, none of the returned arrays share any memory
with any other arrays that the user has access to... so whether they were
created as a view or not should be an implementation detail that's
transparent to the user?

 return an ndarray, never a subclass, and
 return a writeable view.
 I guess the most controversial choice is #2, and in fact making that
change breaks a few tests. I nevertheless think that all of the index
returning functions (nonzero, argsort, argmin, argmax, argpartition) should
always return a bare ndarray, not a subclass. I'd be happy to be corrected,
but I can't think of any situation in which preserving the subclass would
be needed for these functions.

I also can't see any logical reason why the return type of these functions
has anything to do with the type of the inputs. You can index me with my
phone number but my phone number is not a person. OTOH logic and ndarray
subclassing don't have much to do with each other; the practical effect is
probably more important. Looking at the subclasses I know about (masked
arrays, np.matrix, and astropy quantities), though, I also can't see much
benefit in copying the subclass of the input, and the fact that we were
never consistent about this suggests that people probably aren't depending
on it too much.

So in summary my feeling is: +1 to making them writable, no objection to
the view thing (though I don't see how it matters), and provisional +1 to
consistently returning ndarray (to be revised if the people who use the
subclassing functionality disagree).

-n
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in np.nonzero / Should index returning functions return ndarray subclasses?

2015-05-09 Thread Benjamin Root
Absolutely, it should be writable. As for subclassing, that might be messy.
Consider the following:

inds = np.where(data > 5)

In that case, I'd expect a normal, bog-standard ndarray because that is
what you use for indexing (although pandas might have a good argument for
having it return one of their special indexing types if data was a pandas
array...). Next:

foobar = np.where(data > 5, 1, 2)

Again, I'd expect a normal, bog-standard ndarray because the scalar
elements are very simple. This question gets very complicated when
considering array arguments. Consider:

merged_data = np.where(data > 5, data, data2)

So, what should merged_data be? If both data and data2 are the same
types, then it would be reasonable to return the same type, if possible.
But what if they aren't the same? Maybe use array_priority to determine the
return type? Or, perhaps it does make sense to say sod it all and always
return an ndarray?

I don't know the answer. I do find it interesting that the result from a
multi-dimensional array is not writable. I don't know why I have never
encountered that.
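A small sketch that just inspects what three-argument where returns for a
subclass input today (a masked array is used as the subclass here); it reports
the type rather than assuming it:

import numpy as np

data = np.ma.masked_array([1, 6, 3, 9], mask=[False, True, False, False])
data2 = np.zeros(4)

# Inspect, rather than assume, how the subclass is handled by np.where.
merged = np.where(data > 5, data, data2)
print(type(merged), merged)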


Ben Root


On Sat, May 9, 2015 at 2:42 PM, Nathaniel Smith n...@pobox.com wrote:

 On May 9, 2015 10:48 AM, Jaime Fernández del Río jaime.f...@gmail.com
 wrote:
 
  There is a reported bug (issue #5837) regarding different returns from
 np.nonzero with 1-D vs higher dimensional arrays. A full summary of the
 differences can be seen from the following output:
 
   class C(np.ndarray): pass
  ...
   a = np.arange(6).view(C)
   b = np.arange(6).reshape(2, 3).view(C)
   anz = a.nonzero()
   bnz = b.nonzero()
 
   type(anz[0])
  <type 'numpy.ndarray'>
   anz[0].flags
C_CONTIGUOUS : True
F_CONTIGUOUS : True
OWNDATA : True
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
   anz[0].base
 
   type(bnz[0])
  <class '__main__.C'>
   bnz[0].flags
C_CONTIGUOUS : False
F_CONTIGUOUS : False
OWNDATA : False
WRITEABLE : False
ALIGNED : True
UPDATEIFCOPY : False
   bnz[0].base
  array([[0, 1],
 [0, 2],
 [1, 0],
 [1, 1],
 [1, 2]])
 
  The original bug report was only concerned with the non-writeability of
 higher dimensional array returns, but there are more differences: 1-D
 always returns an ndarray that owns its memory and is writeable, but higher
 dimensional arrays return views, of the type of the original array, that
 are non-writeable.
 
  I have a branch that attempts to fix this by making both 1-D and n-D
 arrays:
  return a view, never the base array,

 This doesn't matter, does it? View isn't a thing, only view of is
 meaningful. And in this case, none of the returned arrays share any memory
 with any other arrays that the user has access to... so whether they were
 created as a view or not should be an implementation detail that's
 transparent to the user?

  return an ndarray, never a subclass, and
  return a writeable view.
  I guess the most controversial choice is #2, and in fact making that
 change breaks a few tests. I nevertheless think that all of the index
 returning functions (nonzero, argsort, argmin, argmax, argpartition) should
 always return a bare ndarray, not a subclass. I'd be happy to be corrected,
 but I can't think of any situation in which preserving the subclass would
 be needed for these functions.

 I also can't see any logical reason why the return type of these functions
 has anything to do with the type of the inputs. You can index me with my
 phone number but my phone number is not a person. OTOH logic and ndarray
 subclassing don't have much to do with each other; the practical effect is
 probably more important. Looking at the subclasses I know about (masked
 arrays, np.matrix, and astropy quantities), though, I also can't see much
 benefit in copying the subclass of the input, and the fact that we were
 never consistent about this suggests that people probably aren't depending
 on it too much.

 So in summary my feeling is: +1 to making them writable, no objection to
 the view thing (though I don't see how it matters), and provisional +1 to
 consistently returning ndarray (to be revised if the people who use the
 subclassing functionality disagree).

 -n

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Bug in 1.9?

2014-10-22 Thread Neil Girdhar
Hello,

Is this desired behaviour or a regression or a bug?

http://stackoverflow.com/questions/26497656/how-do-i-align-a-numpy-record-array-recarray

Thanks,

Neil
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in 1.9?

2014-10-22 Thread Charles R Harris
On Wed, Oct 22, 2014 at 11:32 AM, Neil Girdhar mistersh...@gmail.com
wrote:

 Hello,

 Is this desired behaviour or a regression or a bug?


 http://stackoverflow.com/questions/26497656/how-do-i-align-a-numpy-record-array-recarray

 Thanks,


I'd guess that the definition of aligned may have become stricter; that's
the only thing I think has changed. Maybe Julian can comment on that.

Chuck
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in 1.9?

2014-10-22 Thread Julian Taylor
On 22.10.2014 20:00, Charles R Harris wrote:
 
 
 On Wed, Oct 22, 2014 at 11:32 AM, Neil Girdhar mistersh...@gmail.com wrote:
 
 Hello,
 
 Is this desired behaviour or a regression or a bug?
 
 
 http://stackoverflow.com/questions/26497656/how-do-i-align-a-numpy-record-array-recarray
 
 Thanks,
 
 
 I'd guess that the definition of aligned may have become stricter,
 that's the only thing I think has changed. Maybe Julian can comment on that.
 

structured dtypes do not really have a well-defined alignment, e.g. the
stride of this one is 12, so when element 0 is aligned, element 1 is always
unaligned.

Before 1.9, structured dtypes always had the aligned flag set, even if
they were unaligned.
Now we require a minimum alignment of 16 for strings and structured
types, so that copying (which sometimes operates on the whole compound type
instead of on each item) always works.
This was the easiest way to get the testsuite running on sparc after
fixing a couple of code paths that were not updating alignment information,
which forced some functions to always take the super slow unaligned paths
(e.g. ufunc.at).
But the logic could certainly be improved.
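A minimal sketch illustrating the situation described above (the field layout
is the one from the linked Stack Overflow question; itemsize and the
isalignedstruct flag are simply what NumPy reports for it):

import numpy as np

# Packed layout: itemsize 12, so when element 0 sits at an 8-byte-aligned
# address, element 1 starts 12 bytes later and its f8 field is misaligned.
dt_packed = np.dtype([('x', 'f8'), ('y', 'i4')], align=False)
dt_aligned = np.dtype([('x', 'f8'), ('y', 'i4')], align=True)

print(dt_packed.itemsize, dt_packed.isalignedstruct)    # 12 False
print(dt_aligned.itemsize, dt_aligned.isalignedstruct)  # 16 True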
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in 1.9?

2014-10-22 Thread Charles R Harris
On Wed, Oct 22, 2014 at 12:28 PM, Julian Taylor 
jtaylor.deb...@googlemail.com wrote:

 On 22.10.2014 20:00, Charles R Harris wrote:
 
 
  On Wed, Oct 22, 2014 at 11:32 AM, Neil Girdhar mistersh...@gmail.com wrote:
 
  Hello,
 
  Is this desired behaviour or a regression or a bug?
 
 
 http://stackoverflow.com/questions/26497656/how-do-i-align-a-numpy-record-array-recarray
 
  Thanks,
 
 
  I'd guess that the definition of aligned may have become stricter,
  that's the only thing I think has changed. Maybe Julian can comment on
 that.
 

 structured dtypes have not really a well defined alignment, e.g. the
 stride of this is 12, so when element 0 is aligned element 1 is always
 unaligned.

 Before 1.9 structured dtype always had the aligned flag set, even if
 they were unaligned.
 Now we require a minimum alignment of 16 for strings and structured
 types so copying which sometimes works on the whole compound type
 instead of each item always works.
 This was the easiest way to get the testsuite running on sparc after
 fixing a couple of code paths not updating alignment information which
 forced some functions to always take super slow unaligned paths (e.g.
 ufunc.at)
 But the logic could certainly be improved.


The stackexchange example:

In [9]: a = np.zeros(4, dtype=dtype([('x', 'f8'), ('y', 'i4')],
align=False))

In [10]: a.data
Out[10]: <read-write buffer for 0x2f94440, size 48, offset 0 at 0x2f8caf0>

In [11]: a = np.zeros(4, dtype=dtype([('x', 'f8'), ('y', 'i4')],
align=True))
In [12]: a.data
Out[12]: <read-write buffer for 0x2f94030, size 64, offset 0 at 0x2f8c5b0>

Note that using an aligned dtype yields a different size on my 64 bit
system and 64 / 4 = 16.

Chuck
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in genfromtxt with usecols and converters

2014-08-27 Thread Derek Homeier
On 26 Aug 2014, at 09:05 pm, Adrian Altenhoff adrian.altenh...@inf.ethz.ch 
wrote:

 But you are right that the problem with using the first_values, which should 
 of course be valid,
 somehow stems from the use of usecols, it seems that in that loop
 
for (i, conv) in user_converters.items():
 
 i in user_converters and in usecols get out of sync. This certainly looks 
 like a bug, the entire way of
 modifying i inside the loop appears a bit dangerous to me. I’ll have a look if 
 I can make this safer.
 Thanks.
 
 As long as your data don’t actually contain any missing values you might 
 also simply use np.loadtxt.
 Ok, wasn't aware of that function so far. I will try that!
 
It was first_values that needed to be addressed by the original indices.
I have created a short test from your case and submitted a fix at
https://github.com/numpy/numpy/pull/5006

Cheers,
Derek
 
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in genfromtxt with usecols and converters

2014-08-26 Thread Derek Homeier
Hi Adrian,

 I tried to load data from a csv file into numpy using genfromtxt. I need
 only a subset of the columns and want to apply some conversions to the
 data. attached is a minimal script showing the error.
 In brief, I want to load columns 1,2 and 4. But in the converter
 function for the 4th column, I get the 3rd value. The issue does not
 occur if I also load the 3rd column.
 Did I somehow misunderstand how the function is supposed to work or is
 this indeed a bug?

not sure whether to call it a bug; the error seems to arise before reading any 
actual data
(even on reading from an empty string); when genfromtxt is checking the 
filling_values used
to substitute missing or invalid data it is apparently testing on default 
testing values of 1 or -1
which your conversion scheme does not know about. Although I think it is rather 
the user’s
responsibility to provide valid converters, probably the documentation should 
at least be
updated to make them aware of this requirement.
I see two possible fixes/workarounds:

provide a keyword argument filling_values=[0,0,'1:1']
or add the default filling values to your relEnum dictionary, e.g.  { … 
'-1':-1, '1':-1}
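A minimal sketch of the second option, reusing the data and converter from the
original report: the extra dictionary keys only exist so the converter also
tolerates the default probe values that genfromtxt may feed it while checking
the converters, before any real data is read.

import io
import numpy as np

relEnum = {'1:1': 0, '1:n': 1, 'm:1': 2, 'm:n': 3,
           '1': -1, '-1': -1}   # extra keys only for the converter check

def rel_conv(rel):
    # genfromtxt may pass bytes or str depending on version/encoding settings
    return relEnum[rel.decode() if isinstance(rel, bytes) else rel]

fn = io.BytesIO(b"1,5,-1,1:1,1.98\n2,8,-1,1:n,22.56\n3,3,-2,m:n,18.2\n")
data = np.genfromtxt(fn,
                     dtype=[('EntryNr1', 'i4'), ('EntryNr2', 'i4'), ('RelType', 'i1')],
                     delimiter=',', usecols=(0, 1, 3),
                     converters={0: int, 1: int, 3: rel_conv})
print(data)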

Could you check if this works for your case?

HTH,
Derek

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in genfromtxt with usecols and converters

2014-08-26 Thread Adrian Altenhoff
Hi Derek,

thanks for your answer.
 not sure whether to call it a bug; the error seems to arise before reading 
 any actual data
 (even on reading from an empty string); when genfromtxt is checking the 
 filling_values used
 to substitute missing or invalid data it is apparently testing on default 
 testing values of 1 or -1
 which your conversion scheme does not know about. Although I think it is 
 rather the user’s
 responsibility to provide valid converters, probably the documentation should 
 at least be
 updated to make them aware of this requirement.
 I see two possible fixes/workarounds:
 
 provide a keyword argument filling_values=[0,0,'1:1']
This workaround seems to work, but I doubt that the actual problem is
the converter function I pass. The '-1', which is used as the testing
value is the first_values from the 3rd column (line 1574 in npyio.py),
but the converter is defined for column 4. by setting the filling_values
to an array of length 3, this obviously makes the problem disappear. But
I think if the first row is used, it should also use the values from the
column for which the converter is defined.

Best
Adrian
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in genfromtxt with usecols and converters

2014-08-26 Thread Derek Homeier
Hi Adrian,

 not sure whether to call it a bug; the error seems to arise before reading 
 any actual data
 (even on reading from an empty string); when genfromtxt is checking the 
 filling_values used
 to substitute missing or invalid data it is apparently testing on default 
 testing values of 1 or -1
 which your conversion scheme does not know about. Although I think it is 
 rather the user’s
 responsibility to provide valid converters, probably the documentation 
 should at least be
 updated to make them aware of this requirement.
 I see two possible fixes/workarounds:
 
 provide a keyword argument filling_values=[0,0,'1:1']
 This workaround seems to work, but I doubt that the actual problem is
 the converter function I pass. The '-1', which is used as the testing
 value is the first_values from the 3rd column (line 1574 in npyio.py),
 but the converter is defined for column 4. by setting the filling_values
 to an array of length 3, this obviously makes the problem disappear. But
 I think if the first row is used, it should also use the values from the
 column for which the converter is defined.

it is certainly related to the converter function because a KeyError for the 
dictionary you provide is raised:
File "test.py", line 13, in <module>
    3: lambda rel: relEnum[rel.decode()]})
  File "/sw/lib/python3.4/site-packages/numpy/lib/npyio.py", line 1581, in genfromtxt
    missing_values=missing_values[i],)
  File "/sw/lib/python3.4/site-packages/numpy/lib/_iotools.py", line 784, in update
    tester = func(testing_value or asbytes('1'))
  File "test.py", line 13, in <lambda>
    3: lambda rel: relEnum[rel.decode()]})
KeyError: '-1'

But you are right that the problem with using the first_values, which should of 
course be valid,
somehow stems from the use of usecols, it seems that in that loop

for (i, conv) in user_converters.items():

i in user_converters and in usecols get out of sync. This certainly looks like 
a bug; the entire way of
modifying i inside the loop appears a bit dangerous to me. I’ll have a look if I 
can make this safer.

As long as your data don’t actually contain any missing values you might also 
simply use np.loadtxt.

Cheers,
Derek

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in genfromtxt with usecols and converters

2014-08-26 Thread Adrian Altenhoff
Hi Derek,

 But you are right that the problem with using the first_values, which should 
 of course be valid,
 somehow stems from the use of usecols, it seems that in that loop
 
 for (i, conv) in user_converters.items():
 
 i in user_converters and in usecols get out of sync. This certainly looks 
 like a bug, the entire way of
 modifying i inside the loop appears a bit dangerous to me. I’ll have a look if 
 I can make this safer.
Thanks.
 
 As long as your data don’t actually contain any missing values you might also 
 simply use np.loadtxt.
Ok, wasn't aware of that function so far. I will try that!

Best wishes
Adrian
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Bug in genfromtxt with usecols and converters

2014-08-25 Thread Adrian Altenhoff
Hi,

I tried to load data from a csv file into numpy using genfromtxt. I need
only a subset of the columns and want to apply some conversions to the
data. attached is a minimal script showing the error.
In brief, I want to load columns 1,2 and 4. But in the converter
function for the 4th column, I get the 3rd value. The issue does not
occur if I also load the 3rd column.
Did I somehow misunderstand how the function is supposed to work or is
this indeed a bug?

I'm using python 3.3.1 with numpy 1.8.1

Regards
Adrian
import numpy
import io

off1, off2 = 0, 4000
dtype = [('EntryNr1','i4'),('EntryNr2','i4'),('RelType', 'i1')]
fn = io.BytesIO("1,5,-1,1:1,1.98\n2,8,-1,1:n,22.56\n3,3,-2,m:n,18.2\n".encode('utf-8'))
relEnum = {'1:1':0, '1:n':1, 'm:1':2, 'm:n':3}
data = numpy.genfromtxt(fn, dtype=dtype,
                        delimiter=',',
                        usecols=(0,1,3),
                        converters={0: lambda nr: int(nr)+off1,
                                    1: lambda nr: int(nr)+off2,
                                    3: lambda rel: relEnum[rel.decode()]})
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in np.cross for 2D vectors

2014-07-17 Thread Sebastian Berg
On Di, 2014-07-15 at 10:22 +0100, Neil Hodgson wrote:
 Hi,
 
 We came across this bug while using np.cross on 3D arrays of 2D
 vectors.

Hi,

which numpy version are you using? Until recently, the cross product
simply did *not* work in a broadcasting manner (3d arrays of 2d
vectors), it did something, but usually not the right thing. This is
fixed in recent versions (not sure if 1.8 or only now with 1.9)

- Sebastian

 The first example shows the problem and we looked at the source for
 np.cross and believe we found the bug - an unnecessary swapaxes when
 returning the output (comment inserted in the code).
 
 Thanks
 Neil 
 
 # Example
 
 shape = (3,5,7,2)
 
 
 # These are effectively 3D arrays (3*5*7) of 2D vectors
 data1 = np.random.randn(*shape)
 data2 = np.random.randn(*shape)
 
 
 # The cross product of data1 and data2 should produce a (3*5*7) array
 of scalars
 cross_product_longhand =
 data1[:,:,:,0]*data2[:,:,:,1]-data1[:,:,:,1]*data2[:,:,:,0]
 print 'longhand output shape:',cross_product_longhand.shape # and it
 does
 
 
 cross_product_numpy = np.cross(data1,data2)
 print 'numpy output shape:',cross_product_numpy.shape # It seems to
 have transposed the last 2 dimensions
 
 
 if (cross_product_longhand == np.transpose(cross_product_numpy,
 (0,2,1))).all():
 print 'Unexpected transposition in numpy.cross (numpy version %s)'%
 np.__version__
 
 
 # np.cross L1464
 if axis is not None:
     axisa, axisb, axisc = (axis,)*3
 a = asarray(a).swapaxes(axisa, 0)
 b = asarray(b).swapaxes(axisb, 0)
 msg = "incompatible dimensions for cross product\n"\
       "(dimension must be 2 or 3)"
 if (a.shape[0] not in [2, 3]) or (b.shape[0] not in [2, 3]):
     raise ValueError(msg)
 if a.shape[0] == 2:
     if (b.shape[0] == 2):
         cp = a[0]*b[1] - a[1]*b[0]
         if cp.ndim == 0:
             return cp
         else:
             ## WE SHOULD NOT SWAPAXES HERE!
             ## For 2D vectors the first axis has been
             ## collapsed during the cross product
             return cp.swapaxes(0, axisc)
 
 
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in np.cross for 2D vectors

2014-07-17 Thread Neil Hodgson
 Hi,

 We came across this bug while using np.cross on 3D arrays of 2D vectors.


 What version of numpy are you using? This should already be solved in numpy
 master, and be part of the 1.9 release. Here's the relevant commit,
 although the code has been cleaned up a bit in later ones:

 https://github.com/numpy/numpy/commit/b9454f50f23516234c325490913224c3a69fb122

 Jaime

Yes, we are using 1.8 - sorry I should have checked!
Thanks
Neil
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in np.cross for 2D vectors

2014-07-17 Thread Neil Hodgson


 Hi,

 We came across this bug while using np.cross on 3D arrays of 2D vectors.


 What version of numpy are you using? This should already be solved in numpy
 master, and be part of the 1.9 release. Here's the relevant commit,
 although the code has been cleaned up a bit in later ones:

 https://github.com/numpy/numpy/commit/b9454f50f23516234c325490913224c3a69fb122

 Jaime

Hi,

which numpy version are you using? Until recently, the cross product
simply did *not* work in a broadcasting manner (3d arrays of 2d
vectors), it did something, but usually not the right thing. This is
fixed in recent versions (not sure if 1.8 or only now with 1.9)

- Sebastian


Hi, I thought I replied, but I don't see it on the list, so here goes again...

Yes, we are using 1.8, will confirm it's ok with 1.9

Thanks
Neil
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in np.cross for 2D vectors

2014-07-16 Thread Jaime Fernández del Río
On Tue, Jul 15, 2014 at 2:22 AM, Neil Hodgson hodgson.n...@yahoo.co.uk
wrote:

 Hi,

 We came across this bug while using np.cross on 3D arrays of 2D vectors.


What version of numpy are you using? This should already be solved in numpy
master, and be part of the 1.9 release. Here's the relevant commit,
although the code has been cleaned up a bit in later ones:

https://github.com/numpy/numpy/commit/b9454f50f23516234c325490913224c3a69fb122

Jaime

-- 
(\__/)
( O.o)
(  ) This is Bunny. Copy Bunny into your signature and help him with his
plans for world domination.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Bug in np.cross for 2D vectors

2014-07-15 Thread Neil Hodgson
Hi,

We came across this bug while using np.cross on 3D arrays of 2D vectors.
The first example shows the problem and we looked at the source for np.cross 
and believe we found the bug - an unnecessary swapaxes when returning the 
output (comment inserted in the code).

Thanks
Neil 

# Example


shape = (3,5,7,2)

# These are effectively 3D arrays (3*5*7) of 2D vectors
data1 = np.random.randn(*shape)
data2 = np.random.randn(*shape)

# The cross product of data1 and data2 should produce a (3*5*7) array of scalars
cross_product_longhand = data1[:,:,:,0]*data2[:,:,:,1] - data1[:,:,:,1]*data2[:,:,:,0]
print 'longhand output shape:', cross_product_longhand.shape  # and it does

cross_product_numpy = np.cross(data1, data2)
print 'numpy output shape:', cross_product_numpy.shape  # It seems to have transposed the last 2 dimensions

if (cross_product_longhand == np.transpose(cross_product_numpy, (0,2,1))).all():
    print 'Unexpected transposition in numpy.cross (numpy version %s)' % np.__version__

# np.cross L1464
if axis is not None:
    axisa, axisb, axisc = (axis,)*3
a = asarray(a).swapaxes(axisa, 0)
b = asarray(b).swapaxes(axisb, 0)
msg = "incompatible dimensions for cross product\n"\
      "(dimension must be 2 or 3)"
if (a.shape[0] not in [2, 3]) or (b.shape[0] not in [2, 3]):
    raise ValueError(msg)
if a.shape[0] == 2:
    if (b.shape[0] == 2):
        cp = a[0]*b[1] - a[1]*b[0]
        if cp.ndim == 0:
            return cp
        else:
            ## WE SHOULD NOT SWAPAXES HERE!
            ## For 2D vectors the first axis has been
            ## collapsed during the cross product
            return cp.swapaxes(0, axisc)
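For anyone stuck on the affected release, a minimal workaround sketch (the
helper name cross2d is ours, not NumPy's) that computes the 2D cross product
longhand along the last axis and so never goes through the swapaxes path:

import numpy as np

def cross2d(a, b):
    # z-component of the cross product of 2D vectors stored along the last
    # axis; sidesteps np.cross's extra swapaxes on the affected versions.
    a = np.asarray(a)
    b = np.asarray(b)
    return a[..., 0] * b[..., 1] - a[..., 1] * b[..., 0]

shape = (3, 5, 7, 2)
data1 = np.random.randn(*shape)
data2 = np.random.randn(*shape)
print(cross2d(data1, data2).shape)  # (3, 5, 7)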
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] bug with mmap'ed datetime64 arrays

2014-02-17 Thread Charles G. Waldman
test case:

#!/usr/bin/env python
import numpy as np
a=np.array(['2014', '2015', '2016'], dtype='datetime64')
x=np.datetime64('2015')
print a > x
np.save('test.npy', a)
b = np.load('test.npy', mmap_mode='c')
print b > x


result:

 [False False  True]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/t.py", line 12, in <module>
    print b > x
  File "/usr/lib64/python2.7/site-packages/numpy/core/memmap.py", line 279, in __array_finalize__
    if hasattr(obj, '_mmap') and np.may_share_memory(self, obj):
  File "/usr/lib64/python2.7/site-packages/numpy/lib/utils.py", line 298, in may_share_memory
    a_low, a_high = byte_bounds(a)
  File "/usr/lib64/python2.7/site-packages/numpy/lib/utils.py", line 258, in byte_bounds
    bytes_a = int(ai['typestr'][2:])
ValueError: invalid literal for int() with base 10: '8[Y]'


fix:
diff --git a/numpy/lib/utils.py b/numpy/lib/utils.py
index 1f1cdfc..c73f2f1 100644
--- a/numpy/lib/utils.py
+++ b/numpy/lib/utils.py
@@ -210,7 +210,7 @@ def byte_bounds(a):
     a_data = ai['data'][0]
     astrides = ai['strides']
     ashape = ai['shape']
-    bytes_a = int(ai['typestr'][2:])
+    bytes_a = a.dtype.itemsize
 
     a_low = a_high = a_data
     if astrides is None: # contiguous case


will submit pull request via github
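A small sketch of why the one-line change works: for datetime64 the
array-interface typestr carries the time unit, so parsing the size out of the
string breaks, while dtype.itemsize is always a plain integer:

import numpy as np

a = np.array(['2014', '2015', '2016'], dtype='datetime64')

typestr = a.__array_interface__['typestr']
print(typestr)           # '<M8[Y]' -- int(typestr[2:]) would raise ValueError
print(a.dtype.itemsize)  # 8, the value byte_bounds actually needs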
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] bug in comparing object arrays to None (?)

2014-01-27 Thread Charles G. Waldman
Hi Numpy folks.

I just noticed that comparing an array of type 'object' to None does
not behave as I expected.  Is this a feature or a bug?  (I can take a
stab at fixing it if it's a bug, as I believe it is).

>>> np.version.full_version
'1.8.0'

>>> a = np.array(['Frank', None, 'Nancy'])

>>> a
array(['Frank', None, 'Nancy'], dtype=object)

>>> a == 'Frank'
array([ True, False, False], dtype=bool)
# Return value is an array

>>> a == None
False
# Return value is scalar (BUG?)
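Until the behavior is fixed, a hedged workaround sketch for getting the
elementwise answer on affected versions is to test identity explicitly:

import numpy as np

a = np.array(['Frank', None, 'Nancy'], dtype=object)

# Elementwise identity test; avoids relying on `a == None`, whose behavior
# differs between the affected and fixed versions.
mask = np.array([x is None for x in a])
print(mask)  # [False  True False]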
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in comparing object arrays to None (?)

2014-01-27 Thread Warren Weckesser
On Mon, Jan 27, 2014 at 3:43 PM, Charles G. Waldman char...@crunch.io wrote:

 Hi Numpy folks.

 I just noticed that comparing an array of type 'object' to None does
 not behave as I expected.  Is this a feature or a bug?  (I can take a
 stab at fixing it if it's a bug, as I believe it is).

  np.version.full_version
 '1.8.0'

  a = np.array(['Frank', None, 'Nancy'])

  a
 array(['Frank', None, 'Nancy'], dtype=object)

  a == 'Frank'
 array([ True, False, False], dtype=bool)
 # Return value is an array

  a == None
 False
 # Return value is scalar (BUG?)



Looks like a fix is in progress:  https://github.com/numpy/numpy/pull/3514

Warren

___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Bug in resize of structured array (with initial size = 0)

2014-01-10 Thread Nicolas Rougier

Hi,

I've tried to resize a record array that was initially empty (on purpose, I need it)
and I got the following error (while it works for a regular array).


Traceback (most recent call last):
  File "test_resize.py", line 10, in <module>
    print np.resize(V,2)
  File "/usr/locaL/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/fromnumeric.py", line 1053, in resize
    if not Na: return mu.zeros(new_shape, a.dtype.char)
TypeError: Empty data-type


I'm using numpy 1.8.0, python 2.7.6, osx 10.9.1.
Can anyone confirm before I submit an issue?


Here is the script:

V = np.zeros(0, dtype=np.float32)
print V.dtype
print np.resize(V,2)

V = np.zeros(0, dtype=[('a', np.float32, 1)])
print V.dtype
print np.resize(V,2)
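A hedged workaround sketch for the affected versions: allocate the new size
with the same dtype and copy the old contents over, instead of calling
np.resize on the empty structured array (the field is written here without the
trailing shape of 1, which newer NumPy warns about):

import numpy as np

V = np.zeros(0, dtype=[('a', np.float32)])

# Grow the array manually: allocate, then copy the (possibly empty) old data.
W = np.zeros(2, dtype=V.dtype)
W[:len(V)] = V
print(W)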


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in numpy.correlate documentation

2013-10-14 Thread Bernhard Spinnler

On 11.10.2013, at 01:19, Julian Taylor jtaylor.deb...@googlemail.com wrote:
 
Yeah, unless the current behaviour is actually broken or redundant in
some way, we're not going to switch from one perfectly good convention
to another perfectly good convention and break everyone's code in the
process.
 
The most helpful thing would be if you could file a pull request that
just changes the docstring to what you think it should be. Extra bonus
points if it points out that there is another definition some people
might be expecting instead, and explains how those people can use the
existing functions to get what they want. :-)
 
-n
 
 
 IMHO, point[ing] out that there is another definition some people
 might be expecting instead, and explain[ing] how those people can use
 the existing functions to get what they want should be a requirement
 for the docstring (Notes section), not merely worth extra bonus
 points.  But then I'm not, presently, in a position to edit the
 docstring myself, so that's just MHO. 
 
 IAE, I found what appears to me to be another vote for the extant
  docstring: Box & Jenkins, 1976, Time Series Analysis: Forecasting and
 Control, Holden-Day, Oakland, pg. 374.  Perhaps a switch (with a
 default value that maintains current definition, so that extant uses
 would not require a code change) c/should be added to the function
  signature so that users can easily get what they want?
 
 
 As pointed out in another post in this thread, there are now at least
 three different definitions of correlation which are in use in different
 disciplines of science and engineering:
 
 Numpy code:
 
 z_numpyCode[k] = sum_n a[n+k] * conj(v[n])
 
 
 Numpy docs:
 
 z_numpyDoc[k] = sum_n a[n] * conj(v[n+k])
 = sum_n a[n-k] * conj(v[n])
 = z_numpyCode[-k]
 
 
 Wolfram Mathworld:
 
 z_mmca[k] = sum_n conj(a[n]) * v[n+k]
 = conj( sum_n a[n] * conj(v[n+k]) )
 = conj( z_numpyDoc[k] )
 = conj( z_numpyCode[-k] )
 
 I'm sure there are even more if you search long enough. But shouldn't
 the primary objective be to bring the docs in line with the code (which
 is definitely not broken)? It took me 2 days of debugging my
 code recently only to discover that numpy correlate() was calculating a
 different correlation than the docs said.
 
 I can try to come up with a proposal for the docs. Could anyone point me
 to where I can find the docs? I can clone the numpy repo, however, I'm
 not a numpy developer.
 
 
 yes we should only change the documentation to match the (hopefully
 correct) code.
 the documentation is in the docstring of the correlate function in
 numpy/core/numeric.py line 819
 ___

Ok, corrected the docstring, mentioning one alternative definition of 
correlation. Pull request filed: https://github.com/numpy/numpy/pull/3913.

Bernhard


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in numpy.correlate documentation

2013-10-10 Thread Bernhard Spinnler
It seems to me that Wolfram is following yet another path. From 
http://mathworld.wolfram.com/Autocorrelation.html and more importantly 
http://mathworld.wolfram.com/Cross-Correlation.html, equation (5):

z_mathworld[k] = sum_n conj(a[n]) * v[n+k] 
= conj( sum_n a[n] * conj(v[n+k]) )
= conj( z_numpyDocstring[k] )
= conj( z_numpyCode[-k] )

is the conjugate of what the numpy docstring says. So, now we have at least 
three definitions to chose from :-)

Cheers,
Bernhard

On 09.10.2013, at 22:19, David Goldsmith d.l.goldsm...@gmail.com wrote:

 Looks like Wolfram MathWorld would favor the docstring, but the possibility 
 of a use-domain dependency seems plausible (after all, a similar dilemma is 
 observed, e.g., w/ the Fourier Transform)--I guess one discipline's future is 
 another discipline's past. :-)
 
 http://mathworld.wolfram.com/Autocorrelation.html
 

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in numpy.correlate documentation

2013-10-10 Thread Bernhard Spinnler

On 10.10.2013, at 19:27, David Goldsmith d.l.goldsm...@gmail.com wrote:
 On Wed, Oct 9, 2013 at 7:48 PM, Bernhard Spinnler
 bernhard.spinn...@gmx.net wrote:
  Hi Richard,
 
  Ah, I searched the list but didn't find those posts before?
 
  I can easily imagine that correlation is defined differently in different
  disciplines. Both ways are correct and it's just a convention or definition.
  In my field (Digital Communications, Digital Signal Processing) the vast
  majority uses the convention implemented by the code. Here are a few
  examples of prominent text books:
 
   - Papoulis, Probability, Random Variables, and Stochastic Processes,
  McGraw-Hill, 2nd ed.
  - Benvenuto, Cherubini, Algorithms for Communications Systems and their
  Applications, Wiley.
  - Carlson, Communication Systems 4th ed. 2002, McGraw-Hill.
 
  Last not least, Matlab's xcorr() function behaves exactly like correlate()
  does right now, see
  - http://www.mathworks.de/de/help/signal/ref/xcorr.html
 
  But, as you say, the most important aspect might be, that most people will
  probably prefer changing the docs instead of changing the code.
 
 Yeah, unless the current behaviour is actually broken or redundant in
 some way, we're not going to switch from one perfectly good convention
 to another perfectly good convention and break everyone's code in the
 process.
 
 The most helpful thing would be if you could file a pull request that
 just changes the docstring to what you think it should be. Extra bonus
 points if it points out that there is another definition some people
 might be expecting instead, and explains how those people can use the
 existing functions to get what they want. :-)
 
 -n
 
 IMHO, point[ing] out that there is another definition some people might be 
 expecting instead, and explain[ing] how those people can use the existing 
 functions to get what they want should be a requirement for the docstring 
 (Notes section), not merely worth extra bonus points.  But then I'm not, 
 presently, in a position to edit the docstring myself, so that's just MHO.  
 
 IAE, I found what appears to me to be another vote for the extant 
 docstring: Box & Jenkins, 1976, Time Series Analysis: Forecasting and 
 Control, Holden-Day, Oakland, pg. 374.  Perhaps a switch (with a default 
 value that maintains current definition, so that extant uses would not 
 require a code change) c/should be added to the function signature so that 
 users can easily get what they want?
 

As pointed out in another post in this thread, there are now at least three 
different definitions of correlation which are in use in different disciplines 
of science and engineering:

Numpy code:

z_numpyCode[k] = sum_n a[n+k] * conj(v[n])


Numpy docs:

z_numpyDoc[k] = sum_n a[n] * conj(v[n+k])
 = sum_n a[n-k] * conj(v[n])
 = z_numpyCode[-k]


Wolfram Mathworld:

z_mmca[k] = sum_n conj(a[n]) * v[n+k]
 = conj( sum_n a[n] * conj(v[n+k]) )
 = conj( z_numpyDoc[k] )
 = conj( z_numpyCode[-k] )

I'm sure there are even more if you search long enough. But shouldn't the 
primary objective be to bring the docs in line with the code (which is 
definitely not broken)? It took me 2 days of debugging my code recently only 
to discover that numpy correlate() was calculating a different correlation than 
the docs said.

I can try to come up with a proposal for the docs. Could anyone point me to 
where I can find the docs? I can clone the numpy repo, however, I'm not a numpy 
developer.

Best wishes,
Bernhard


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in numpy.correlate documentation

2013-10-10 Thread Julian Taylor
On 10.10.2013 21:31, Bernhard Spinnler wrote:
 
 On 10.10.2013, at 19:27, David Goldsmith d.l.goldsm...@gmail.com wrote:

 On Wed, Oct 9, 2013 at 7:48 PM, Bernhard Spinnler
  bernhard.spinn...@gmx.net wrote:
  Hi Richard,
 
  Ah, I searched the list but didn't find those posts before?
 
  I can easily imagine that correlation is defined differently in
 different
  disciplines. Both ways are correct and it's just a convention or
 definition.
  In my field (Digital Communications, Digital Signal Processing)
 the vast
  majority uses the convention implemented by the code. Here are a few
  examples of prominent text books:
 
   - Papoulis, Probability, Random Variables, and Stochastic
 Processes,
  McGraw-Hill, 2nd ed.
  - Benvenuto, Cherubini, Algorithms for Communications Systems
 and their
  Applications, Wiley.
  - Carlson, Communication Systems 4th ed. 2002, McGraw-Hill.
 
  Last not least, Matlab's xcorr() function behaves exactly like
 correlate()
  does right now, see
  - http://www.mathworks.de/de/help/signal/ref/xcorr.html
 
  But, as you say, the most important aspect might be, that most
 people will
  probably prefer changing the docs instead of changing the code.

 Yeah, unless the current behaviour is actually broken or redundant in
 some way, we're not going to switch from one perfectly good convention
 to another perfectly good convention and break everyone's code in the
 process.

 The most helpful thing would be if you could file a pull request that
 just changes the docstring to what you think it should be. Extra bonus
 points if it points out that there is another definition some people
 might be expecting instead, and explains how those people can use the
 existing functions to get what they want. :-)

 -n


 IMHO, point[ing] out that there is another definition some people
 might be expecting instead, and explain[ing] how those people can use
 the existing functions to get what they want should be a requirement
 for the docstring (Notes section), not merely worth extra bonus
 points.  But then I'm not, presently, in a position to edit the
 docstring myself, so that's just MHO. 

 IAE, I found what appears to me to be another vote for the extant
  docstring: Box & Jenkins, 1976, Time Series Analysis: Forecasting and
 Control, Holden-Day, Oakland, pg. 374.  Perhaps a switch (with a
 default value that maintains current definition, so that extant uses
 would not require a code change) c/should be added to the function
  signature so that users can easily get what they want?

 
 As pointed out in another post in this thread, there are now at least
 three different definitions of correlation which are in use in different
 disciplines of science and engineering:
 
 Numpy code:
 
 z_numpyCode[k] = sum_n a[n+k] * conj(v[n])
 
 
 Numpy docs:
 
 z_numpyDoc[k] = sum_n a[n] * conj(v[n+k])
  = sum_n a[n-k] * conj(v[n])
  = z_numpyCode[-k]
 
 
 Wolfram Mathworld:
 
 z_mmca[k] = sum_n conj(a[n]) * v[n+k]
  = conj( sum_n a[n] * conj(v[n+k]) )
  = conj( z_numpyDoc[k] )
  = conj( z_numpyCode[-k] )
 
 I'm sure there are even more if you search long enough. But shouldn't
 the primary objective be to bring the docs in line with the code (which
 is definitely not broken)? It took me 2 days of debugging my
 code recently only to discover that numpy correlate() was calculating a
 different correlation than the docs said.
 
 I can try to come up with a proposal for the docs. Could anyone point me
 to where I can find the docs? I can clone the numpy repo, however, I'm
 not a numpy developer.
 

yes we should only change the documentation to match the (hopefully
correct) code.
the documentation is in the docstring of the correlate function in
numpy/core/numeric.py line 819
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in numpy.correlate documentation

2013-10-09 Thread Bernhard Spinnler
Hi Richard,

Ah, I searched the list but didn't find those posts before…

I can easily imagine that correlation is defined differently in different 
disciplines. Both ways are correct and it's just a convention or definition. In 
my field (Digital Communications, Digital Signal Processing) the vast majority 
uses the convention implemented by the code. Here are a few examples of 
prominent text books:

- Papoulis, Probability, Random Variables, and Stochastic Processes, 
McGraw-Hill, 2nd ed.
- Benvenuto, Cherubini, Algorithms for Communications Systems and their 
Applications, Wiley.
- Carlson, Communication Systems 4th ed. 2002, McGraw-Hill.

Last but not least, Matlab's xcorr() function behaves exactly like correlate() does 
right now, see
- http://www.mathworks.de/de/help/signal/ref/xcorr.html

But, as you say, the most important aspect might be, that most people will 
probably prefer changing the docs instead of changing the code.

Should I file a bug somewhere?

Cheers,
Bernhard


On 08.10.2013, at 21:10, Richard Hattersley rhatters...@gmail.com wrote:

 Hi Bernard,
 
 Looks like you're on to something - two other people have raised this 
 discrepancy before: https://github.com/numpy/numpy/issues/2588. 
 Unfortunately, when it comes to resolving the discrepancy one of the previous 
 comments takes the opposite view. Namely, that the docstring is correct and 
 the code is wrong.
 
 Do different domains use different conventions here? Are there some 
 references to back up one stance or another?
 
 But all else being equal, I'm guessing there'll be far more appetite for 
 updating the documentation than the code.
 
 Regards,
 Richard Hattersley
 
 
 On 7 October 2013 22:09, Bernhard Spinnler bernhard.spinn...@gmx.net wrote:
 The numpy.correlate documentation says:
 
 correlate(a, v) = z[k] = sum_n a[n] * conj(v[n+k])
 
 In [1]: a = [1, 2]
 
 In [2]: v = [2, 1j]
 
 In [3]: z = correlate(a, v, 'full')
 
 In [4]: z
 Out[4]: array([ 0.-1.j,  2.-2.j,  4.+0.j])
 
 However, according to the documentation, z should be
 
 z[-1] = a[1] * conj(v[0]) = 4.+0.j
 z[0]  = a[0] * conj(v[0]) + a[1] * conj(v[1]) = 2.-2.j
 z[1] = a[0] * conj(v[1]) = 0.-1.j
 
 which is the time reversed version of what correlate() calculates.
 
 IMHO, the correlate() code is correct. The correct formula in the docs (which 
 is also the correlation formula in standard text books) should be
 
 z[k] = sum_n a[n+k] * conj(v[n])
 
 Cheers,
 Bernhard
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion
 
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in numpy.correlate documentation

2013-10-09 Thread David Goldsmith
Looks like Wolfram MathWorld would favor the docstring, but the possibility
of a use-domain dependency seems plausible (after all, a similar dilemma
is observed, e.g., w/ the Fourier Transform)--I guess one discipline's
future is another discipline's past. :-)

http://mathworld.wolfram.com/Autocorrelation.html

DG

Date: Tue, 8 Oct 2013 20:10:41 +0100

 From: Richard Hattersley rhatters...@gmail.com
 Subject: Re: [Numpy-discussion] Bug in numpy.correlate documentation
 To: Discussion of Numerical Python numpy-discussion@scipy.org
 Message-ID:
 CAP=RS9k54vtNFHy9ppG=U09oEHwB=KLV0xvwR6BfFgB3o5S=
 f...@mail.gmail.com
 Content-Type: text/plain; charset=iso-8859-1

 Hi Bernard,

 Looks like you're on to something - two other people have raised this
 discrepancy before: https://github.com/numpy/numpy/issues/2588.
 Unfortunately, when it comes to resolving the discrepancy one of the
 previous comments takes the opposite view. Namely, that the docstring is
 correct and the code is wrong.

 Do different domains use different conventions here? Are there some
 references to back up one stance or another?

 But all else being equal, I'm guessing there'll be far more appetite for
 updating the documentation than the code.

 Regards,
 Richard Hattersley


 On 7 October 2013 22:09, Bernhard Spinnler bernhard.spinn...@gmx.net
 wrote:

  The numpy.correlate documentation says:
 
  correlate(a, v) = z[k] = sum_n a[n] * conj(v[n+k])
 

<snip>

  [so] according to the documentation, z should be
 
  z[-1] = a[1] * conj(v[0]) = 4.+0.j
  z[0]  = a[0] * conj(v[0]) + a[1] * conj(v[1]) = 2.-2.j
  z[1] = a[0] * conj(v[1]) = 0.-1.j
 
  which is the time reversed version of what correlate() calculates.

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in numpy.correlate documentation

2013-10-09 Thread Nathaniel Smith
On Wed, Oct 9, 2013 at 7:48 PM, Bernhard Spinnler
bernhard.spinn...@gmx.net wrote:
 Hi Richard,

 Ah, I searched the list but didn't find those posts before…

 I can easily imagine that correlation is defined differently in different
 disciplines. Both ways are correct and it's just a convention or definition.
 In my field (Digital Communications, Digital Signal Processing) the vast
 majority uses the convention implemented by the code. Here are a few
 examples of prominent text books:

 - Papoulis, Probability, Random Variables, and Stochastic Processes,
 McGraw-Hill, 2nd ed.
 - Benvenuto, Cherubini, Algorithms for Communications Systems and their
 Applications, Wiley.
 - Carlson, Communication Systems 4th ed. 2002, McGraw-Hill.

 Last not least, Matlab's xcorr() function behaves exactly like correlate()
 does right now, see
 - http://www.mathworks.de/de/help/signal/ref/xcorr.html

 But, as you say, the most important aspect might be, that most people will
 probably prefer changing the docs instead of changing the code.

Yeah, unless the current behaviour is actually broken or redundant in
some way, we're not going to switch from one perfectly good convention
to another perfectly good convention and break everyone's code in the
process.

The most helpful thing would be if you could file a pull request that
just changes the docstring to what you think it should be. Extra bonus
points if it points out that there is another definition some people
might be expecting instead, and explains how those people can use the
existing functions to get what they want. :-)

-n
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in numpy.correlate documentation

2013-10-08 Thread Richard Hattersley
Hi Bernard,

Looks like you're on to something - two other people have raised this
discrepancy before: https://github.com/numpy/numpy/issues/2588.
Unfortunately, when it comes to resolving the discrepancy one of the
previous comments takes the opposite view. Namely, that the docstring is
correct and the code is wrong.

Do different domains use different conventions here? Are there some
references to back up one stance or another?

But all else being equal, I'm guessing there'll be far more appetite for
updating the documentation than the code.

Regards,
Richard Hattersley


On 7 October 2013 22:09, Bernhard Spinnler bernhard.spinn...@gmx.netwrote:

 The numpy.correlate documentation says:

 correlate(a, v) = z[k] = sum_n a[n] * conj(v[n+k])

 In [1]: a = [1, 2]

 In [2]: v = [2, 1j]

 In [3]: z = correlate(a, v, 'full')

 In [4]: z
 Out[4]: array([ 0.-1.j,  2.-2.j,  4.+0.j])

 However, according to the documentation, z should be

 z[-1] = a[1] * conj(v[0]) = 4.+0.j
 z[0]  = a[0] * conj(v[0]) + a[1] * conj(v[1]) = 2.-2.j
 z[1] = a[0] * conj(v[1]) = 0.-1.j

 which is the time reversed version of what correlate() calculates.

 IMHO, the correlate() code is correct. The correct formula in the docs
 (which is also the correlation formula in standard text books) should be

 z[k] = sum_n a[n+k] * conj(v[n])

 Cheers,
 Bernhard
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Bug in numpy.correlate documentation

2013-10-07 Thread Bernhard Spinnler
The numpy.correlate documentation says:

correlate(a, v) = z[k] = sum_n a[n] * conj(v[n+k])

In [1]: a = [1, 2]

In [2]: v = [2, 1j]

In [3]: z = correlate(a, v, 'full')

In [4]: z
Out[4]: array([ 0.-1.j,  2.-2.j,  4.+0.j])

However, according to the documentation, z should be

z[-1] = a[1] * conj(v[0]) = 4.+0.j
z[0]  = a[0] * conj(v[0]) + a[1] * conj(v[1]) = 2.-2.j
z[1] = a[0] * conj(v[1]) = 0.-1.j

which is the time reversed version of what correlate() calculates.

IMHO, the correlate() code is correct. The correct formula in the docs (which 
is also the correlation formula in standard text books) should be

z[k] = sum_n a[n+k] * conj(v[n])

Cheers,
Bernhard
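
A minimal numerical check of the two conventions (corr_textbook is just an
illustrative helper written here for comparison, not a numpy function):

import numpy as np

a = np.array([1, 2], dtype=complex)
v = np.array([2, 1j], dtype=complex)

# What np.correlate actually computes in 'full' mode:
z = np.correlate(a, v, 'full')        # array([ 0.-1.j,  2.-2.j,  4.+0.j])

# Brute-force evaluation of the textbook formula
# z[k] = sum_n a[n+k] * conj(v[n]), for k = -(len(v)-1) ... len(a)-1:
def corr_textbook(a, v):
    out = []
    for k in range(-(len(v) - 1), len(a)):
        s = 0.0
        for n in range(len(v)):
            if 0 <= n + k < len(a):
                s += a[n + k] * np.conj(v[n])
        out.append(s)
    return np.array(out)

print(np.allclose(z, corr_textbook(a, v)))   # True: the code follows the textbook formula
# The docstring's formula z[k] = sum_n a[n] * conj(v[n+k]) gives the
# index-reversed sequence of the same values:
print(np.allclose(z[::-1], [4.+0.j, 2.-2.j, 0.-1.j]))   # True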
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Bug (?) converting list to array

2013-09-09 Thread Chad Kidder
I'm trying to enter a 2-D array and np.array() is returning a 1-D array of
lists.  I'm using Python (x,y) on Windows 7 with numpy 1.7.1.  Here's the
code that is giving me issues.

 f1 = [[15.207, 15.266, 15.181, 15.189, 15.215, 15.198], [-45, -57, -62,
-70, -72, -73.5, -77]]
 f1a = np.array(f1)
 f1a
array([[15.207, 15.266, 15.181, 15.189, 15.215, 15.198],
   [-45, -57, -62, -70, -72, -73.5, -77]], dtype=object)

What am I missing?
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug (?) converting list to array

2013-09-09 Thread Benjamin Root
The two lists are of different sizes.

Had to count twice to catch that.

Ben Root

On Mon, Sep 9, 2013 at 9:46 AM, Chad Kidder cckid...@gmail.com wrote:

 I'm trying to enter a 2-D array and np.array() is returning a 1-D array of
 lists.  I'm using Python (x,y) on Windows 7 with numpy 1.7.1.  Here's the
 code that is giving me issues.

  f1 = [[15.207, 15.266, 15.181, 15.189, 15.215, 15.198], [-45, -57,
 -62, -70, -72, -73.5, -77]]
  f1a = np.array(f1)
  f1a
 array([[15.207, 15.266, 15.181, 15.189, 15.215, 15.198],
[-45, -57, -62, -70, -72, -73.5, -77]], dtype=object)

 What am I missing?

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug (?) converting list to array

2013-09-09 Thread Nathaniel Smith
One list has 6 entries and one has 7, so they can't be aligned into a
single array. Possibly it would be better to raise an error here instead of
returning an object array, but that's what's going on.

-n
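
As an illustration of the fix on the caller's side (one of the seven values in
the second list is dropped here purely for the example): with equal-length
rows, np.array builds a regular 2-D float array instead of a 1-D object array.

import numpy as np

f1 = [[15.207, 15.266, 15.181, 15.189, 15.215, 15.198],
      [-45.0, -57.0, -62.0, -70.0, -72.0, -73.5]]
f1a = np.array(f1)
print(f1a)   # a proper (2, 6) array with dtype float64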
On 9 Sep 2013 14:49, Chad Kidder cckid...@gmail.com wrote:

 I'm trying to enter a 2-D array and np.array() is returning a 1-D array of
 lists.  I'm using Python (x,y) on Windows 7 with numpy 1.7.1.  Here's the
 code that is giving me issues.

  f1 = [[15.207, 15.266, 15.181, 15.189, 15.215, 15.198], [-45, -57,
 -62, -70, -72, -73.5, -77]]
  f1a = np.array(f1)
  f1a
 array([[15.207, 15.266, 15.181, 15.189, 15.215, 15.198],
[-45, -57, -62, -70, -72, -73.5, -77]], dtype=object)

 What am I missing?

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug (?) converting list to array

2013-09-09 Thread Chad Kidder
Oh, so there was a bug in the user...


On Mon, Sep 9, 2013 at 7:52 AM, Nathaniel Smith n...@pobox.com wrote:

 One list has 6 entries and one has 7, so they can't be aligned into a
 single array. Possibly it would be better to raise an error here instead of
 returning an object array, but that's what's going on.

 -n
 On 9 Sep 2013 14:49, Chad Kidder cckid...@gmail.com wrote:

 I'm trying to enter a 2-D array and np.array() is returning a 1-D array
 of lists.  I'm using Python (x,y) on Windows 7 with numpy 1.7.1.  Here's
 the code that is giving me issues.

  f1 = [[15.207, 15.266, 15.181, 15.189, 15.215, 15.198], [-45, -57,
 -62, -70, -72, -73.5, -77]]
  f1a = np.array(f1)
  f1a
 array([[15.207, 15.266, 15.181, 15.189, 15.215, 15.198],
[-45, -57, -62, -70, -72, -73.5, -77]], dtype=object)

 What am I missing?

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug (?) converting list to array

2013-09-09 Thread josef . pktd
On Mon, Sep 9, 2013 at 9:52 AM, Nathaniel Smith n...@pobox.com wrote:
 One list has 6 entries and one has 7, so they can't be aligned into a single
 array. Possibly it would be better to raise an error here instead of
 returning an object array, but that's what's going on.

It did at some point (and I relied on the exception to catch bugs,
since I'm still using mainly numpy 1.5)

 f1 = [[15.207, 15.266, 15.181, 15.189, 15.215, 15.198], [-45, -57, -62, 
 -70, -72, -73.5, -77]]
 np.array(f1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: setting an array element with a sequence.
 np.__version__
'1.5.1'

now we get object arrays (in scipy.stats, and I didn't know what to do
with them)

I don't remember any discussion on this.

Josef


 -n

 On 9 Sep 2013 14:49, Chad Kidder cckid...@gmail.com wrote:

 I'm trying to enter a 2-D array and np.array() is returning a 1-D array of
 lists.  I'm using Python (x,y) on Windows 7 with numpy 1.7.1.  Here's the
 code that is giving me issues.

  f1 = [[15.207, 15.266, 15.181, 15.189, 15.215, 15.198], [-45, -57,
  -62, -70, -72, -73.5, -77]]
  f1a = np.array(f1)
  f1a
 array([[15.207, 15.266, 15.181, 15.189, 15.215, 15.198],
[-45, -57, -62, -70, -72, -73.5, -77]], dtype=object)

 What am I missing?

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug (?) converting list to array

2013-09-09 Thread josef . pktd
On Mon, Sep 9, 2013 at 11:35 AM, Nathaniel Smith n...@pobox.com wrote:
 On 9 Sep 2013 15:50, josef.p...@gmail.com wrote:

 On Mon, Sep 9, 2013 at 9:52 AM, Nathaniel Smith n...@pobox.com wrote:
  One list has 6 entries and one has 7, so they can't be aligned into a
  single
  array. Possibly it would be better to raise an error here instead of
  returning an object array, but that's what's going on.

 It did at some point (and I relied on the exception to catch bugs,
 since I'm still using mainly numpy 1.5)

  f1 = [[15.207, 15.266, 15.181, 15.189, 15.215, 15.198], [-45, -57,
  -62, -70, -72, -73.5, -77]]
  np.array(f1)
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 ValueError: setting an array element with a sequence.
  np.__version__
 '1.5.1'

 now we get object arrays (in scipy.stats, and I didn't know what to do
 with them)

 I don't remember any discussion on this.

 There may not have been any.

Isn't it too late now?


 Feel free to submit a PR and we can argue about which way is better... (I
 also prefer the 1.5 approach personally.)

I'm just a balcony muppet (and user)

(and I lost the argument against object arrays in scipy.stats)

Josef


 -n


 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug (?) converting list to array

2013-09-09 Thread Nathaniel Smith
On 9 Sep 2013 15:50, josef.p...@gmail.com wrote:

 On Mon, Sep 9, 2013 at 9:52 AM, Nathaniel Smith n...@pobox.com wrote:
  One list has 6 entries and one has 7, so they can't be aligned into a
single
  array. Possibly it would be better to raise an error here instead of
  returning an object array, but that's what's going on.

 It did at some point (and I relied on the exception to catch bugs,
 since I'm still using mainly numpy 1.5)

  f1 = [[15.207, 15.266, 15.181, 15.189, 15.215, 15.198], [-45, -57,
-62, -70, -72, -73.5, -77]]
  np.array(f1)
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 ValueError: setting an array element with a sequence.
  np.__version__
 '1.5.1'

 now we get object arrays (in scipy.stats, and I didn't know what to do
 with them)

 I don't remember any discussion on this.

There may not have been any.

Feel free to submit a PR and we can argue about which way is better... (I
also prefer the 1.5 approach personally.)

-n
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Bug in gufuncs affecting umath_linalg

2013-08-06 Thread Jaime Fernández del Río
Hi,

I think I have found an undocumented feature of the gufuncs machinery. I
have filed a bug report:

https://github.com/numpy/numpy/issues/3582

Some more background on what i am seeing...

I have coded a gufunc with signature '(r,c,p),(g,g,g,q)->(r,c,q)'. It is a
color map, i.e. a transformation of a 3-dimensional array of p channels
(with p=3, an RGB image of r rows and c columns), into a 3-dimensional
array of q channels (with q=4, a CMYK image of the same size), via a
p-dimensional look-up-table (LUT).

For all practical purposes, the LUT always has the first three dimensions
identical, hence the repeated g's in the signature.

The function registered with this signature receives the expected values
in the dimensions argument: 'n', 'r', 'c', 'p', 'g', 'q', with 'n' being
the length of the gufunc loop.

But there is a problem with the steps argument. As expected I get a 13 item
long array: 3 main loop strides, 3 strides (r, c, p) for the first
argument, 4 strides (g, g, g, q) for the second, and 3 strides (r, c, q)
for the return. Everything is OK except for the strides for the repeating
'g's: instead of getting three different stride values, the first two are
the same as the last. This does not happen if I modify the signature to be
'(r,c,p),(i,j,k,q)->(r,c,q)', which is the workaround I am implementing in
my code for the time being. I have also managed to repeat the behavior in
repeated dimensions on other arguments, e.g. '(r,r,p),(i,j,k,q)->(r,c,p)'
shows the same issue for the strides of the first argument.

I have seen that the gufunc version of umath_linalg makes use of a similar,
repeated index scheme, e.g. in the 'solve' and 'det' gufuncs. At least for
these two, the tests in place do not catch the error. For solve, the tests
run these two cases (the results below come from the traditional linalg, as
I am running numpy 1.7.1):

 np.linalg.solve([[1, 2], [3, 4]], [[4, 3], [2, 1]])
array([[-6., -5.],
   [ 5.,  4.]])
 np.linalg.solve([[1+2j,2+3j], [3+4j,4+5j]], [[4+3j,3+2j], [2+1j,1+0j]])
array([[-6. +0.e+00j, -5. +0.e+00j],
   [ 5. -3.46944695e-16j,  4. -3.46944695e-16j]])

But because of their highly structured nature, these particular test cases
give the same result if you get the strides wrong (!!!):

 np.linalg.solve([[1, 2], [2, 3]], [[4, 3], [3, 2]])
array([[-6., -5.],
   [ 5.,  4.]])
 np.linalg.solve([[1+2j,2+3j], [2+3j,3+4j]], [[4+3j,3+2j], [3+2j,2+1j]])
array([[-6. -1.09314267e-15j, -5. -1.09314267e-15j],
   [ 5. +1.08246745e-15j,  4. +1.08246745e-15j]])

As for the determinant, no absolute check of the return value is performed:
the return of 'det' is compared to the product of the return of 'eigvals',
which also has the '(m, m)' signature, and interprets the data equally
wrong.

For my particular issue, I am simply going to register the gufunc with
non-repeating dimensions, check for equality in a Python wrapper, and
discard the repeated values in my C code. Not sure what is the best way of
going about umath_linalg. Probably better to fix the issue in the gufunc
machinery than to patch umath_linalg.
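
A rough, self-contained sketch of that Python-wrapper workaround; the name
_colormap_core is only a stand-in for the real compiled gufunc registered with
the non-repeating signature '(r,c,p),(i,j,k,q)->(r,c,q)':

import numpy as np

def _colormap_core(image, lut):
    # placeholder for the compiled gufunc; just returns an output of the right shape
    return np.zeros(image.shape[:-1] + (lut.shape[-1],))

def apply_lut(image, lut):
    # enforce in Python what the repeated 'g' core dimensions were meant to express
    if not (lut.shape[-4] == lut.shape[-3] == lut.shape[-2]):
        raise ValueError("LUT must have equal size along its first three core axes")
    return _colormap_core(image, lut)

rgb = np.random.rand(4, 5, 3)
lut = np.random.rand(17, 17, 17, 4)
print(apply_lut(rgb, lut).shape)   # (4, 5, 4)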

If there's any way, other than reporting it, in which I can help getting
this fixed, I'll be more than happy to do it. But for this job I am clearly
unqualified labor, and would need to work under someone else's command.

Regards,

Jaime

-- 
(\__/)
( O.o)
(> <) This is Conejo. Copy Conejo into your signature and help him with his
plans for world domination.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] bug fixes: which branch?

2013-06-16 Thread Eric Firing
What is the preferred strategy for handling bug fix PRs?  Initial fix on 
master, and then a separate PR to backport to v1.7.x?  Or the reverse? 
It doesn't look like v1.7.x is being merged into master regularly, so 
the matplotlib pattern (fix on maintenance, merge maintenance into 
master) seems not to be used here.

Thanks.

Eric
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug fixes: which branch?

2013-06-16 Thread Nathaniel Smith
On Sun, Jun 16, 2013 at 10:57 PM, Eric Firing efir...@hawaii.edu wrote:
 What is the preferred strategy for handling bug fix PRs?  Initial fix on
 master, and then a separate PR to backport to v1.7.x?  Or the reverse?
 It doesn't look like v1.7.x is being merged into master regularly, so
 the matplotlib pattern (fix on maintenance, merge maintenance into
 master) seems not to be used here.

Fix on master then backport is the current strategy.

-n
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in deepcopy() of rank-zero arrays?

2013-04-30 Thread Richard Hattersley
+1 for getting rid of this inconsistency

We've hit this with Iris (a met/ocean analysis package - see github), and
have had to add several workarounds.


On 19 April 2013 16:55, Chris Barker - NOAA Federal
chris.bar...@noaa.govwrote:

 Hi folks,

 In [264]: np.__version__
 Out[264]: '1.7.0'

 I just noticed that deep copying a rank-zero array yields a scalar --
 probably not what we want.

 In [242]: a1 = np.array(3)

 In [243]: type(a1), a1
 Out[243]: (numpy.ndarray, array(3))

 In [244]: a2 = copy.deepcopy(a1)

 In [245]: type(a2), a2
 Out[245]: (numpy.int32, 3)

 regular copy.copy() seems to work fine:

 In [246]: a3 = copy.copy(a1)

 In [247]: type(a3), a3
 Out[247]: (numpy.ndarray, array(3))

 Higher-rank arrays seem to work fine:

 In [253]: a1 = np.array((3,4))

 In [254]: type(a1), a1
 Out[254]: (numpy.ndarray, array([3, 4]))

 In [255]: a2 = copy.deepcopy(a1)

 In [256]: type(a2), a2
 Out[256]: (numpy.ndarray, array([3, 4]))

 Array scalars seem to work fine as well:

 In [257]: s1 = np.float32(3)

 In [258]: s2 = copy.deepcopy(s1)

 In [261]: type(s1), s1
 Out[261]: (numpy.float32, 3.0)

 In [262]: type(s2), s2
 Out[262]: (numpy.float32, 3.0)

 There are other ways to copy arrays, but in this case, I had a dict
 with a bunch of arrays in it, and needed a deepcopy of the dict. I was
 surprised to find that my rank-0 array got turned into a scalar.

 -Chris

 --

 Christopher Barker, Ph.D.
 Oceanographer

 Emergency Response Division
 NOAA/NOS/ORR(206) 526-6959   voice
 7600 Sand Point Way NE   (206) 526-6329   fax
 Seattle, WA  98115   (206) 526-6317   main reception

 chris.bar...@noaa.gov
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in deepcopy() of rank-zero arrays?

2013-04-30 Thread Chris Barker - NOAA Federal
hmm -- I suppose one of us should post an issue on github -- then ask for
it to be fixed before 1.8  ;-)

I'll try to get to the issue if no one beats me to it -- got to run now...

-Chris



On Tue, Apr 30, 2013 at 5:35 AM, Richard Hattersley
rhatters...@gmail.comwrote:

 +1 for getting rid of this inconsistency

 We've hit this with Iris (a met/ocean analysis package - see github), and
 have had to add several workarounds.


 On 19 April 2013 16:55, Chris Barker - NOAA Federal chris.bar...@noaa.gov
  wrote:

  Hi folks,

 In [264]: np.__version__
 Out[264]: '1.7.0'

 I just noticed that deep copying a rank-zero array yields a scalar --
 probably not what we want.

 In [242]: a1 = np.array(3)

 In [243]: type(a1), a1
 Out[243]: (numpy.ndarray, array(3))

 In [244]: a2 = copy.deepcopy(a1)

 In [245]: type(a2), a2
 Out[245]: (numpy.int32, 3)

 regular copy.copy() seems to work fine:

 In [246]: a3 = copy.copy(a1)

 In [247]: type(a3), a3
 Out[247]: (numpy.ndarray, array(3))

 Higher-rank arrays seem to work fine:

 In [253]: a1 = np.array((3,4))

 In [254]: type(a1), a1
 Out[254]: (numpy.ndarray, array([3, 4]))

 In [255]: a2 = copy.deepcopy(a1)

 In [256]: type(a2), a2
 Out[256]: (numpy.ndarray, array([3, 4]))

 Array scalars seem to work fine as well:

 In [257]: s1 = np.float32(3)

 In [258]: s2 = copy.deepcopy(s1)

 In [261]: type(s1), s1
 Out[261]: (numpy.float32, 3.0)

 In [262]: type(s2), s2
 Out[262]: (numpy.float32, 3.0)

 There are other ways to copy arrays, but in this case, I had a dict
 with a bunch of arrays in it, and needed a deepcopy of the dict. I was
 surprised to find that my rank-0 array got turned into a scalar.

 -Chris

 --

 Christopher Barker, Ph.D.
 Oceanographer

 Emergency Response Division
 NOAA/NOS/ORR(206) 526-6959   voice
 7600 Sand Point Way NE   (206) 526-6329   fax
 Seattle, WA  98115   (206) 526-6317   main reception

 chris.bar...@noaa.gov
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion



 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion




-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] bug in deepcopy() of rank-zero arrays?

2013-04-19 Thread Chris Barker - NOAA Federal
Hi folks,

In [264]: np.__version__
Out[264]: '1.7.0'

I just noticed that deep copying a rank-zero array yields a scalar --
probably not what we want.

In [242]: a1 = np.array(3)

In [243]: type(a1), a1
Out[243]: (numpy.ndarray, array(3))

In [244]: a2 = copy.deepcopy(a1)

In [245]: type(a2), a2
Out[245]: (numpy.int32, 3)

regular copy.copy() seems to work fine:

In [246]: a3 = copy.copy(a1)

In [247]: type(a3), a3
Out[247]: (numpy.ndarray, array(3))

Higher-rank arrays seem to work fine:

In [253]: a1 = np.array((3,4))

In [254]: type(a1), a1
Out[254]: (numpy.ndarray, array([3, 4]))

In [255]: a2 = copy.deepcopy(a1)

In [256]: type(a2), a2
Out[256]: (numpy.ndarray, array([3, 4]))

Array scalars seem to work fine as well:

In [257]: s1 = np.float32(3)

In [258]: s2 = copy.deepcopy(s1)

In [261]: type(s1), s1
Out[261]: (numpy.float32, 3.0)

In [262]: type(s2), s2
Out[262]: (numpy.float32, 3.0)

There are other ways to copy arrays, but in this case, I had a dict
with a bunch of arrays in it, and needed a deepcopy of the dict. I was
surprised to find that my rank-0 array got turned into a scalar.

-Chris
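
A small user-side workaround sketch (not a numpy fix): deep-copy a dict of
arrays without letting the 0-d arrays collapse to scalars.

import copy
import numpy as np

def deepcopy_arrays(d):
    # np.array(v, copy=True) preserves 0-d arrays as 0-d ndarrays
    return {k: np.array(v, copy=True) for k, v in d.items()}

d = {'a': np.array(3), 'b': np.array([3, 4])}
print(type(deepcopy_arrays(d)['a']))   # numpy.ndarray, still rank zero
print(type(copy.deepcopy(d)['a']))     # numpy.int32 on the affected versions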

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in np.records?

2013-04-06 Thread Ralf Gommers
On Wed, Mar 20, 2013 at 2:57 PM, Pierre Barbier de Reuille 
pierre.barbierdereui...@gmail.com wrote:

 Hey,

 I am trying to use titles for the record arrays. In the documentation, it
 is specified that any column can be set to None. However, trying this fails
 on numpy 1.6.2 because in np.core.records, on line 195, the strip method
 is called on the title object. This is really annoying. Could we fix this
 by replacing line 195 with:


 self._titles = [n.strip() if n is not None else None for n in
 titles[:self._nfields]]

 ?


That sounds reasonable. Ideally you'd send a pull request for this,
including a regression test. Otherwise providing a self-contained example
that can be turned into a test would help.

Ralf
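
A self-contained reproduction that could serve as the basis for such a
regression test (the field and title names here are made up): a None entry in
`titles` should be accepted, but on affected versions (e.g. 1.6.2) this raises
AttributeError because .strip() is called on None.

import numpy as np

r = np.rec.array([(1.0, 2)], formats=['f8', 'i4'],
                 names=['alpha', 'beta'],
                 titles=['Alpha title', None])
print(r)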
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Bug in einsum?

2013-03-13 Thread Jaakko Luttinen
Hi,

I have encountered a very weird behaviour with einsum. I try to compute
something like R*A*R', where * denotes a kind of matrix
multiplication. However, for particular shapes of R and A, the results
are extremely bad.

I compare two einsum results:
First, I compute in two einsum calls as (R*A)*R'.
Second, I compute the whole result in one einsum call.
However, the results are significantly different for some shapes.

My test:
import numpy as np
for D in range(30):
    A = np.random.randn(100,D,D)
    R = np.random.randn(D,D)
    Y1 = np.einsum('...ik,...kj->...ij', R, A)
    Y1 = np.einsum('...ik,...kj->...ij', Y1, R.T)
    Y2 = np.einsum('...ik,...kl,...lj->...ij', R, A, R.T)
    print("D=%d" % D, np.allclose(Y1,Y2), np.linalg.norm(Y1-Y2))

Output:
D=0 True 0.0
D=1 True 0.0
D=2 True 8.40339658678e-15
D=3 True 8.09995399928e-15
D=4 True 3.59428803435e-14
D=5 False 34.755610184
D=6 False 28.3576558351
D=7 False 41.5402690906
D=8 True 2.31709582841e-13
D=9 False 36.0161112799
D=10 True 4.76237746912e-13
D=11 True 4.5790782e-13
D=12 True 4.90302218301e-13
D=13 True 6.96175851271e-13
D=14 True 1.10067181384e-12
D=15 True 1.29095933163e-12
D=16 True 1.3466837332e-12
D=17 True 1.52265065763e-12
D=18 True 2.05407923852e-12
D=19 True 2.33327630748e-12
D=20 True 2.96849358082e-12
D=21 True 3.31063706175e-12
D=22 True 4.28163620455e-12
D=23 True 3.58951880681e-12
D=24 True 4.69973694769e-12
D=25 True 5.47385264567e-12
D=26 True 5.49643316347e-12
D=27 True 6.75132988402e-12
D=28 True 7.86435437892e-12
D=29 True 7.85453681029e-12

So, for D={5,6,7,9}, allclose returns False and the error norm is HUGE.
It doesn't seem like just some small numerical inaccuracy because the
error norm is so large. I don't know which one is correct (Y1 or Y2) but
at least either one is wrong in my opinion.

I ran the same test several times, and each time same values of D fail.
If I change the shapes somehow, the failing values of D might change
too, but I usually have several failing values.

I'm running the latest version from github (commit bd7104cef4) under
Python 3.2.3. With NumPy 1.6.1 under Python 2.7.3 the test crashes and
Python exits printing Floating point exception.

This seems so weird to me that I wonder if I'm just doing something stupid..

Thanks a lot for any help!
Jaakko
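
One possible cross-check, not part of the original test: compare both einsum
results against a plain np.dot reference for one of the failing sizes, to see
which of the two is off on an affected build.

import numpy as np

D = 5                                   # one of the sizes reported as failing
A = np.random.randn(100, D, D)
R = np.random.randn(D, D)
Y1 = np.einsum('...ik,...kj->...ij', R, A)
Y1 = np.einsum('...ik,...kj->...ij', Y1, R.T)
Y2 = np.einsum('...ik,...kl,...lj->...ij', R, A, R.T)
# reference result computed slice by slice with np.dot
Y3 = np.array([R.dot(A[i]).dot(R.T) for i in range(A.shape[0])])
print(np.allclose(Y1, Y3), np.allclose(Y2, Y3))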
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug with ufuncs made with frompyfunc

2013-01-09 Thread Nathaniel Smith
On Wed, Jan 9, 2013 at 7:23 AM, OKB (not okblacke)
brenb...@brenbarn.net wrote:
 A bug causing errors with using methods of ufuncs created with
 frompyfunc was mentioned on the list over a year ago:
 http://mail.scipy.org/pipermail/numpy-discussion/2011-
 September/058501.html

 Is there any word on the status of this bug?  I wasn't able to find
 a ticket in the bug tracker.

That thread says that it had already been fixed in the development
version of numpy, so it should be fixed in the upcoming 1.7. If you
want to be sure then you try it on the 1.7 release candidate.

-n
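
A minimal sketch of the kind of call the linked report is about (the exact
failing example is in the 2011 thread): create a ufunc with frompyfunc and use
one of its methods. On a fixed numpy this should simply print 10.

import numpy as np

add_obj = np.frompyfunc(lambda x, y: x + y, 2, 1)   # object-dtype 'add' ufunc
print(add_obj.reduce(np.arange(5)))                 # 0+1+2+3+4 = 10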
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Bug with ufuncs made with frompyfunc

2013-01-08 Thread OKB (not okblacke)
A bug causing errors with using methods of ufuncs created with 
frompyfunc was mentioned on the list over a year ago: 
http://mail.scipy.org/pipermail/numpy-discussion/2011-
September/058501.html

Is there any word on the status of this bug?  I wasn't able to find 
a ticket in the bug tracker.

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in as_strided/reshape

2012-08-10 Thread Dave Hirschfeld
Sebastian Berg sebastian at sipsolutions.net writes:

 
 Hello,
 
 looking at the code, when only adding/removing dimensions with size 1,
 numpy takes a small shortcut, however it uses 0 stride lengths as value
 for the new one element dimensions temporarily, then replacing it again
 to ensure the new array is contiguous.
 This replacing does not check if the dimension has more than size 1.
 Likely there is a better way to fix it, but the attached diff should do
 it.
 
 Regards,
 
 Sebastian
 

Thanks for the confirmation. So this doesn't get lost, I've opened issue #380 on
GitHub

https://github.com/numpy/numpy/issues/380

-Dave

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in as_strided/reshape

2012-08-09 Thread Dave Hirschfeld
Dave Hirschfeld dave.hirschfeld at gmail.com writes:

 
 It seems that reshape doesn't work correctly on an array which has been
 resized using the 0-stride trick e.g.
 
 In [73]: x = array([5])
 
 In [74]: y = as_strided(x, shape=(10,), strides=(0,))
 
 In [75]: y
 Out[75]: array([5, 5, 5, 5, 5, 5, 5, 5, 5, 5])
 
 In [76]: y.reshape([10,1])
 Out[76]: 
 array([[  5],
[  8],
[  762933412],
[-2013265919],
[ 26],
[ 64],
[  762933414],
[-2013244356],
[ 26],
[ 64]])  Should all be 5
 
 In [77]: y.copy().reshape([10,1])
 Out[77]: 
 array([[5],
[5],
[5],
[5],
[5],
[5],
[5],
[5],
[5],
[5]])
 
 In [78]: np.__version__
 Out[78]: '1.6.2'
 
 Perhaps a clause such as below is required in reshape?
 
 if any(stride == 0 for stride in y.strides):
 return y.copy().reshape(shape)
 else:
 return y.reshape(shape)
 
 Regards,
 Dave
 

Though it would be good to avoid the copy which you should be able to do in 
this 
case. Investigating further:

In [15]: y.strides
Out[15]: (0,)

In [16]: z = y.reshape([10,1])

In [17]: z.strides
Out[17]: (4, 4)

In [18]: z.strides = (0, 4)

In [19]: z
Out[19]: 
array([[5],
   [5],
   [5],
   [5],
   [5],
   [5],
   [5],
   [5],
   [5],
   [5]])

In [32]: y.reshape([5, 2])
Out[32]: 
array([[5, 5],
   [5, 5],
   [5, 5],
   [5, 5],
   [5, 5]])

In [33]: y.reshape([5, 2]).strides
Out[33]: (0, 0)

So it seems that reshape is incorrectly setting the stride of axis0 to 4, but 
only when the appended axis is of size 1.

-Dave




___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in as_strided/reshape

2012-08-09 Thread Sebastian Berg
Hello,

looking at the code, when only adding/removing dimensions with size 1,
numpy takes a small shortcut, however it uses 0 stride lengths as value
for the new one element dimensions temporarily, then replacing it again
to ensure the new array is contiguous.
This replacing does not check if the dimension has more than size 1.
Likely there is a better way to fix it, but the attached diff should do
it.

Regards,

Sebastian

On Do, 2012-08-09 at 13:06 +, Dave Hirschfeld wrote:
 Dave Hirschfeld dave.hirschfeld at gmail.com writes:
 
  
  It seems that reshape doesn't work correctly on an array which has been
  resized using the 0-stride trick e.g.
  
  In [73]: x = array([5])
  
  In [74]: y = as_strided(x, shape=(10,), strides=(0,))
  
  In [75]: y
  Out[75]: array([5, 5, 5, 5, 5, 5, 5, 5, 5, 5])
  
  In [76]: y.reshape([10,1])
  Out[76]: 
  array([[  5],
 [  8],
 [  762933412],
 [-2013265919],
 [ 26],
 [ 64],
 [  762933414],
 [-2013244356],
 [ 26],
 [ 64]])  Should all be 5
  
  In [77]: y.copy().reshape([10,1])
  Out[77]: 
  array([[5],
 [5],
 [5],
 [5],
 [5],
 [5],
 [5],
 [5],
 [5],
 [5]])
--- a/numpy/core/src/multiarray/shape.c
+++ b/numpy/core/src/multiarray/shape.c
@@ -273,21 +273,21 @@ PyArray_Newshape(PyArrayObject *self, PyArray_Dims *newdims,
 * appropriate value to preserve contiguousness
 */
if (order == NPY_FORTRANORDER) {
-if (strides[0] == 0) {
+if ((strides[0] == 0) && (dimensions[0] == 1)) {
strides[0] = PyArray_DESCR(self)->elsize;
}
for (i = 1; i < ndim; i++) {
-if (strides[i] == 0) {
+if ((strides[i] == 0) && (dimensions[i] == 1)) {
strides[i] = strides[i-1] * dimensions[i-1];
}
}
}
else {
-if (strides[ndim-1] == 0) {
+if ((strides[ndim-1] == 0) && (dimensions[ndim-1] == 1)) {
strides[ndim-1] = PyArray_DESCR(self)->elsize;
}
for (i = ndim - 2; i > -1; i--) {
-if (strides[i] == 0) {
+if ((strides[i] == 0) && (dimensions[i] == 1)) {
strides[i] = strides[i+1] * dimensions[i+1];
}
}
  
  In [78]: np.__version__
  Out[78]: '1.6.2'
  
  Perhaps a clause such as below is required in reshape?
  
  if any(stride == 0 for stride in y.strides):
  return y.copy().reshape(shape)
  else:
  return y.reshape(shape)
  
  Regards,
  Dave
  
 
 Though it would be good to avoid the copy which you should be able to do in 
 this 
 case. Investigating further:
 
 In [15]: y.strides
 Out[15]: (0,)
 
 In [16]: z = y.reshape([10,1])
 
 In [17]: z.strides
 Out[17]: (4, 4)
 
 In [18]: z.strides = (0, 4)
 
 In [19]: z
 Out[19]: 
 array([[5],
[5],
[5],
[5],
[5],
[5],
[5],
[5],
[5],
[5]])
 
 In [32]: y.reshape([5, 2])
 Out[32]: 
 array([[5, 5],
[5, 5],
[5, 5],
[5, 5],
[5, 5]])
 
 In [33]: y.reshape([5, 2]).strides
 Out[33]: (0, 0)
 
 So it seems that reshape is incorrectly setting the stride of axis0 to 4, but 
 only when the appended axis is of size 1.
 
 -Dave
 
 
 
 
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion
 

From eed2abca6144e16c5d9ca208ef90dd01f7dd6009 Mon Sep 17 00:00:00 2001
From: Sebastian Berg sebast...@sipsolutions.net
Date: Thu, 9 Aug 2012 17:17:32 +0200
Subject: [PATCH] Fix reshaping of arrays with stride 0 in a dimension with
 size of more than 1.

---
 numpy/core/src/multiarray/shape.c |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/numpy/core/src/multiarray/shape.c b/numpy/core/src/multiarray/shape.c
index 0672326..09a6cb0 100644
--- a/numpy/core/src/multiarray/shape.c
+++ b/numpy/core/src/multiarray/shape.c
@@ -273,21 +273,21 @@ PyArray_Newshape(PyArrayObject *self, PyArray_Dims *newdims,
  * appropriate value to preserve contiguousness
  */
 if (order == NPY_FORTRANORDER) {
-if (strides[0] == 0) {
+if ((strides[0] == 0) && (dimensions[0] == 1)) {
strides[0] = PyArray_DESCR(self)->elsize;
}
for (i = 1; i < ndim; i++) {
-if (strides[i] == 0) {
+if ((strides[i] == 0) && (dimensions[i] == 1)) {
strides[i] = strides[i-1] * dimensions[i-1];
}
}
}
else {
-if (strides[ndim-1] == 0) {
+if ((strides[ndim-1] == 0) && (dimensions[ndim-1] == 1)) {
strides[ndim-1] = 

[Numpy-discussion] Bug in as_strided/reshape

2012-08-08 Thread Dave Hirschfeld
It seems that reshape doesn't work correctly on an array which has been
resized using the 0-stride trick e.g.

In [73]: x = array([5])

In [74]: y = as_strided(x, shape=(10,), strides=(0,))

In [75]: y
Out[75]: array([5, 5, 5, 5, 5, 5, 5, 5, 5, 5])

In [76]: y.reshape([10,1])
Out[76]: 
array([[  5],
   [  8],
   [  762933412],
   [-2013265919],
   [ 26],
   [ 64],
   [  762933414],
   [-2013244356],
   [ 26],
   [ 64]])  Should all be 5

In [77]: y.copy().reshape([10,1])
Out[77]: 
array([[5],
   [5],
   [5],
   [5],
   [5],
   [5],
   [5],
   [5],
   [5],
   [5]])

In [78]: np.__version__
Out[78]: '1.6.2'

Perhaps a clause such as below is required in reshape?

if any(stride == 0 for stride in y.strides):
return y.copy().reshape(shape)
else:
return y.reshape(shape)


Regards,
Dave
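
The suggested clause, wrapped up as a self-contained sketch: on an affected
numpy it sidesteps the garbage values by copying whenever a 0-stride axis is
present.

import numpy as np
from numpy.lib.stride_tricks import as_strided

def safe_reshape(y, shape):
    # fall back to a copy if any axis has stride 0
    if any(stride == 0 for stride in y.strides):
        return y.copy().reshape(shape)
    return y.reshape(shape)

x = np.array([5])
y = as_strided(x, shape=(10,), strides=(0,))
print(safe_reshape(y, [10, 1]).ravel())   # ten 5s, with or without the C fix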



___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in numpy.where?

2012-07-30 Thread Phil Hodge
On 07/27/2012 03:58 PM, Andreas Mueller wrote:
 Hi Everybody.
 The bug is that no error is raised, right?
 The docs say

 where(condition, [x, y])

 x, y : array_like, optional
   Values from which to choose. `x` and `y` need to have the same
   shape as `condition`

 In the example you gave, x was a scalar.

net.max() returns an array:

  print type(net.max())
type 'numpy.float32'

That was the reason I cast it to a float to check that that did result 
in the correct behavior for `where`.

Phil
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in numpy.where?

2012-07-30 Thread Robert Kern
On Mon, Jul 30, 2012 at 2:30 PM, Phil Hodge ho...@stsci.edu wrote:
 On 07/27/2012 03:58 PM, Andreas Mueller wrote:
 Hi Everybody.
 The bug is that no error is raised, right?
 The docs say

 where(condition, [x, y])

 x, y : array_like, optional
   Values from which to choose. `x` and `y` need to have the same
   shape as `condition`

 In the example you gave, x was a scalar.

 net.max() returns an array:

   print type(net.max())
 type 'numpy.float32'

No, that's a scalar. The type would be numpy.ndarray if it were an array.

-- 
Robert Kern
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in numpy.where?

2012-07-30 Thread Travis Oliphant
Can you file a bug report on Github's issue tracker? 

Thanks,

-Travis

On Jul 26, 2012, at 1:33 PM, Phil Hodge wrote:

 On a Linux machine:
 
 uname -srvop
 Linux 2.6.18-308.8.2.el5 #1 SMP Tue May 29 11:54:17 EDT 2012 x86_64 
 GNU/Linux
 
 this example shows an apparent problem with the where function:
 
 Python 2.7.1 (r271:86832, Dec 21 2010, 11:19:43)
 [GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2
 Type help, copyright, credits or license for more information.
 import numpy as np
 print np.__version__
 1.5.1
 net = np.zeros(3, dtype='f4')
 net[1] = 0.00458849
 net[2] = 0.605202
 max_net = net.max()
  test = np.where(net <= 0., max_net, net)
 print test
 [ -2.23910537e-35   4.58848989e-03   6.05202019e-01]
 
 When I specified the dtype for net as 'f8', test[0] was 
 3.46244974e+68.  It worked as expected (i.e. test[0] should be 0.605202) 
 when I specified float(max_net) as the second argument to np.where.
 
 Phil
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in numpy.where?

2012-07-30 Thread Phil Hodge
On 07/30/2012 10:53 AM, Travis Oliphant wrote:
 Can you file a bug report on Github's issue tracker?

It's https://github.com/numpy/numpy/issues/369

Phil
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in numpy.mean() revisited

2012-07-27 Thread Henry Gomersall
On Thu, 2012-07-26 at 22:15 -0600, Charles R Harris wrote:
 I would support accumulating in 64 bits but, IIRC, the function will
 need to be rewritten so that it works by adding 32 bit floats to the
 accumulator to save space. There are also more stable methods that
 could also be investigated. There is a nice little project there for
 someone to cut their teeth on.

So a (very) quick read around suggests that using an interim mean gives
a more robust algorithm. The problem is that these techniques are
either multi-pass or inherently slower (due to, say, a division in the
loop).

Higher precision would not suffer the same potential slowdown and would
solve most cases of this problem.

Henry

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in numpy.mean() revisited

2012-07-27 Thread Nathaniel Smith
On Fri, Jul 27, 2012 at 5:15 AM, Charles R Harris
charlesr.har...@gmail.com wrote:
 I would support accumulating in 64 bits but, IIRC, the function will need to
 be rewritten so that it works by adding 32 bit floats to the accumulator to
 save space. There are also more stable methods that could also be
 investigated. There is a nice little project there for someone to cut their
 teeth on.

So the obvious solution here would be to make the ufunc reduce loop
smart enough that
  x = np.zeros(2 ** 30, dtype=float32)
  np.sum(x, dtype=float64)
does not upcast 'x' to float64's as a whole. This shouldn't be too
terrible to implement -- iterate over the float32 array, and only
upcast each inner-loop buffer as you go, instead of upcasting the
whole thing.

In fact, nditer might do this already?

Then using a wide accumulator by default would just take a few lines
of code in numpy.core._methods._mean to select the proper dtype and
downcast the result.

-n
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in numpy.where?

2012-07-27 Thread Benjamin Root
On Thu, Jul 26, 2012 at 2:33 PM, Phil Hodge ho...@stsci.edu wrote:

 On a Linux machine:

   uname -srvop
 Linux 2.6.18-308.8.2.el5 #1 SMP Tue May 29 11:54:17 EDT 2012 x86_64
 GNU/Linux

 this example shows an apparent problem with the where function:

 Python 2.7.1 (r271:86832, Dec 21 2010, 11:19:43)
 [GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2
 Type help, copyright, credits or license for more information.
   import numpy as np
   print np.__version__
 1.5.1
   net = np.zeros(3, dtype='f4')
   net[1] = 0.00458849
   net[2] = 0.605202
   max_net = net.max()
   test = np.where(net <= 0., max_net, net)
   print test
 [ -2.23910537e-35   4.58848989e-03   6.05202019e-01]

 When I specified the dtype for net as 'f8', test[0] was
 3.46244974e+68.  It worked as expected (i.e. test[0] should be 0.605202)
 when I specified float(max_net) as the second argument to np.where.

 Phil


Confirmed with version 1.7.0.dev-470c857 on a CentOS6 64-bit machine.
Strange indeed.

Breaking it down further:

 res = (net <= 0.)
 print res
[ True False False]
 np.where(res, max_net, net)
array([ -2.23910537e-35,   4.58848989e-03,   6.05202019e-01], dtype=float32)

Very Strange...

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in numpy.where?

2012-07-27 Thread Christopher Hanley
On Fri, Jul 27, 2012 at 2:01 PM, Benjamin Root ben.r...@ou.edu wrote:



 On Thu, Jul 26, 2012 at 2:33 PM, Phil Hodge ho...@stsci.edu wrote:

 On a Linux machine:

   uname -srvop
 Linux 2.6.18-308.8.2.el5 #1 SMP Tue May 29 11:54:17 EDT 2012 x86_64
 GNU/Linux

 this example shows an apparent problem with the where function:

 Python 2.7.1 (r271:86832, Dec 21 2010, 11:19:43)
 [GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2
 Type help, copyright, credits or license for more information.
   import numpy as np
   print np.__version__
 1.5.1
   net = np.zeros(3, dtype='f4')
   net[1] = 0.00458849
   net[2] = 0.605202
   max_net = net.max()
   test = np.where(net <= 0., max_net, net)
   print test
 [ -2.23910537e-35   4.58848989e-03   6.05202019e-01]

 When I specified the dtype for net as 'f8', test[0] was
 3.46244974e+68.  It worked as expected (i.e. test[0] should be 0.605202)
 when I specified float(max_net) as the second argument to np.where.

 Phil


 Confirmed with version 1.7.0.dev-470c857 on a CentOS6 64-bit machine.
 Strange indeed.

 Breaking it down further:

  res = (net <= 0.)
  print res
 [ True False False]
  np.where(res, max_net, net)
 array([ -2.23910537e-35,   4.58848989e-03,   6.05202019e-01],
 dtype=float32)

 Very Strange...

 Ben Root


What I find really interesting is that -2.23910537e-35 is the byte-swapped
version of 6.05202019e-01.

Chris
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in numpy.where?

2012-07-27 Thread Andreas Mueller
Hi Everybody.
The bug is that no error is raised, right?
The docs say

where(condition, [x, y])

x, y : array_like, optional
 Values from which to choose. `x` and `y` need to have the same
 shape as `condition`

In the example you gave, x was a scalar.

Cheers,
Andy
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in numpy.where?

2012-07-27 Thread Benjamin Root
On Fri, Jul 27, 2012 at 3:58 PM, Andreas Mueller
amuel...@ais.uni-bonn.dewrote:

 Hi Everybody.
 The bug is that no error is raised, right?
 The docs say

 where(condition, [x, y])

 x, y : array_like, optional
  Values from which to choose. `x` and `y` need to have the same
  shape as `condition`

 In the example you gave, x was a scalar.

 Cheers,
 Andy


Hmm, that is incorrect, I believe.  I have used a scalar before.  Maybe it
works because a scalar is broadcastable to the same shape as any other
N-dim array?

If so, then the wording of that docstring needs to be fixed.

No, I think Christopher hit it on the head.  For whatever reason, the
endian-ness somewhere is not being respected and causes a byte-swapped
version to show up.  How that happens, though, is beyond me.

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in numpy.where?

2012-07-27 Thread Andreas Mueller

On 07/27/2012 09:10 PM, Benjamin Root wrote:



On Fri, Jul 27, 2012 at 3:58 PM, Andreas Mueller 
amuel...@ais.uni-bonn.de mailto:amuel...@ais.uni-bonn.de wrote:


Hi Everybody.
The bug is that no error is raised, right?
The docs say

where(condition, [x, y])

x, y : array_like, optional
 Values from which to choose. `x` and `y` need to have the same
 shape as `condition`

In the example you gave, x was a scalar.

Cheers,
Andy


Hmm, that is incorrect, I believe.  I have used a scalar before.  
Maybe it works because a scalar is broadcastable to the same shape as 
any other N-dim array?


If so, then the wording of that docstring needs to be fixed.

No, I think Christopher hit it on the head.  For whatever reason, the 
endian-ness somewhere is not being respected and causes a byte-swapped 
version to show up.  How that happens, though, is beyond me.


Well, if you use np.repeat(max_net, 3) instead of max_net, it works as 
expected.

So if you use the function as documented, it does the right thing.
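
The two workarounds mentioned in this thread, side by side (a sketch; the
underlying byte-order issue itself is tracked as gh-369):

import numpy as np

net = np.zeros(3, dtype='f4')
net[1] = 0.00458849
net[2] = 0.605202
max_net = net.max()

test_cast      = np.where(net <= 0., float(max_net), net)          # cast the scalar
test_broadcast = np.where(net <= 0., np.repeat(max_net, 3), net)   # broadcast it
print(test_cast)        # [ 0.605202    0.00458849  0.605202  ]
print(test_broadcast)   # same result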
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in numpy.where?

2012-07-27 Thread Christopher Hanley
On Fri, Jul 27, 2012 at 4:10 PM, Benjamin Root ben.r...@ou.edu wrote:



 On Fri, Jul 27, 2012 at 3:58 PM, Andreas Mueller amuel...@ais.uni-bonn.de
  wrote:

 Hi Everybody.
 The bug is that no error is raised, right?
 The docs say

 where(condition, [x, y])

 x, y : array_like, optional
  Values from which to choose. `x` and `y` need to have the same
  shape as `condition`

 In the example you gave, x was a scalar.

 Cheers,
 Andy


 Hmm, that is incorrect, I believe.  I have used a scalar before.  Maybe it
 works because a scalar is broadcastable to the same shape as any other
 N-dim array?

 If so, then the wording of that docstring needs to be fixed.

 No, I think Christopher hit it on the head.  For whatever reason, the
 endian-ness somewhere is not being respected and causes a byte-swapped
 version to show up.  How that happens, though, is beyond me.

 Ben Root



It may have something to do with the dtype size as well.  The problem seen
with,
net = np.zeros(3, dtype='f4')

Disappears for
net = np.zeros(3, dtype='f8')

and above.

Chris
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] bug in numpy.where?

2012-07-26 Thread Phil Hodge
On a Linux machine:

  uname -srvop
Linux 2.6.18-308.8.2.el5 #1 SMP Tue May 29 11:54:17 EDT 2012 x86_64 
GNU/Linux

this example shows an apparent problem with the where function:

Python 2.7.1 (r271:86832, Dec 21 2010, 11:19:43)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2
Type help, copyright, credits or license for more information.
  import numpy as np
  print np.__version__
1.5.1
  net = np.zeros(3, dtype='f4')
  net[1] = 0.00458849
  net[2] = 0.605202
  max_net = net.max()
  test = np.where(net <= 0., max_net, net)
  print test
[ -2.23910537e-35   4.58848989e-03   6.05202019e-01]

When I specified the dtype for net as 'f8', test[0] was 
3.46244974e+68.  It worked as expected (i.e. test[0] should be 0.605202) 
when I specified float(max_net) as the second argument to np.where.

Phil
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Bug in numpy.mean() revisited

2012-07-26 Thread Tom Aldcroft
There was a thread in January discussing the non-obvious behavior of
numpy.mean() for large arrays of float32 values [1].  This issue is
nicely discussed at the end of the numpy.mean() documentation [2] with
an example:

 a = np.zeros((2, 512*512), dtype=np.float32)
 a[0, :] = 1.0
 a[1, :] = 0.1
 np.mean(a)
0.546875

From the docs and previous discussion it seems there is no technical
difficulty in choosing a different (higher precision) type for the
accumulator using the dtype arg, and in fact this is done
automatically for int values.
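
As a concrete illustration of that dtype workaround, reusing the example
quoted above (a sketch, not a proposed change):

import numpy as np

a = np.zeros((2, 512*512), dtype=np.float32)
a[0, :] = 1.0
a[1, :] = 0.1
print(np.mean(a))                    # 0.546875, the float32 accumulation error
print(np.mean(a, dtype=np.float64))  # 0.55, with a 64-bit accumulator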

My question is whether there would be any support for doing something
more than documenting this behavior.  I suspect very few people ever
make it below the fold for the np.mean() documentation.  Taking the
mean of large arrays of float32 values is a *very* common use case and
giving the wrong answer with default inputs is really disturbing.  I
recently had to rebuild a complex science data archive because of
corrupted mean values.

Possible ideas to stimulate discussion:
1. Always use float64 to accumulate float types that are 64 bits or
less.   Are there serious performance impacts to automatically using
float64 to accumulate float32 arrays?  I appreciate this would likely
introduce unwanted regressions (sometimes suddenly getting the right
answer is a bad thing).  So could this be considered for numpy 2.0?

2. Might there be a way to emit a warning if the number of values and
the max accumulated value [3] are such that the estimated fractional
error is above some tolerance?  I'm not even sure if this is a good
idea or if there will be howls from the community as their codes start
warning about inaccurate mean values.  Better idea along this line??

Cheers,
Tom

[1]: http://mail.scipy.org/pipermail/numpy-discussion/2012-January/059960.html
[2]: http://docs.scipy.org/doc/numpy/reference/generated/numpy.mean.html
[3]: Using the max accumulated value during accumulation instead of
the final accumulated value seems like the right thing for estimating
precision loss.  But this would affect performance so maybe just using
the final value would catch many cases.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in numpy.mean() revisited

2012-07-26 Thread Charles R Harris
On Thu, Jul 26, 2012 at 9:26 PM, Tom Aldcroft aldcr...@head.cfa.harvard.edu
 wrote:

 There was a thread in January discussing the non-obvious behavior of
 numpy.mean() for large arrays of float32 values [1].  This issue is
 nicely discussed at the end of the numpy.mean() documentation [2] with
 an example:

  a = np.zeros((2, 512*512), dtype=np.float32)
  a[0, :] = 1.0
  a[1, :] = 0.1
  np.mean(a)
 0.546875

 From the docs and previous discussion it seems there is no technical
 difficulty in choosing a different (higher precision) type for the
 accumulator using the dtype arg, and in fact this is done
 automatically for int values.

 My question is whether there would be any support for doing something
 more than documenting this behavior.  I suspect very few people ever
 make it below the fold for the np.mean() documentation.  Taking the
 mean of large arrays of float32 values is a *very* common use case and
 giving the wrong answer with default inputs is really disturbing.  I
 recently had to rebuild a complex science data archive because of
 corrupted mean values.

 Possible ideas to stimulate discussion:
 1. Always use float64 to accumulate float types that are 64 bits or
 less.   Are there serious performance impacts to automatically using
 float64 to accumulate float32 arrays?  I appreciate this would likely
 introduce unwanted regressions (sometimes suddenly getting the right
 answer is a bad thing).  So could this be considered for numpy 2.0?

 2. Might there be a way to emit a warning if the number of values and
 the max accumulated value [3] are such that the estimated fractional
 error is above some tolerance?  I'm not even sure if this is a good
 idea or if there will be howls from the community as their codes start
 warning about inaccurate mean values.  Better idea along this line??


I would support accumulating in 64 bits but, IIRC, the function will need
to be rewritten so that it works by adding 32 bit floats to the accumulator
to save space. There are also more stable methods that could also be
investigated. There is a nice little project there for someone to cut their
teeth on.

Chuck
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Bug in pickling an ndarray?

2012-06-30 Thread Daniel Hyams
I am having trouble pickling (and then unpickling) an ndarray. Upon
unpickling, the base attribute of the ndarray is set to some very strange
string (base was None when the ndarray was pickled, so it should remain
None).

I have tried on various platforms and versions of numpy, with inconclusive
results:

# tested: Linux (Suse 11.1), numpy 1.5.1   BUG
# Linux (Suse 11,0), numpy 1.6.1   OK
# Linux (Mint Debian), numpy 1.6.1 BUG
# Linux (Mint Debian), numpy 1.6.2 BUG
# OSX (Snow Leopard),  numpy 1.5.1rc1  BUG
# OSX (Snow Leopard),  numpy 1.6.2 BUG
# Windows 7,   numpy 1.4.1 OK

I have attached a script below that can be used to check for the problem; I
suppose that this is a bug report, unless I'm doing something terribly
wrong or my expectations for the base attribute are off.

 cut here -
# this little demo shows a problem with the base attribute of an ndarray, when
# pickling.  Before pickling, dset.base is None, but after pickling, it is some
# strange string.

import cPickle as pickle
import numpy
print numpy.__version__
#import pickle

dset = numpy.ones((2,2))

print "BEFORE PICKLING"
print dset
print "base = ", dset.base
print dset.flags

# pickle.
s = pickle.dumps(dset)

# now unpickle.
dset = pickle.loads(s)

print "AFTER PICKLING AND THEN IMMEDIATELY UNPICKLING"
print dset
print "base = ", dset.base
print dset.flags


-- 
Daniel Hyams
dhy...@gmail.com
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in pickling an ndarray?

2012-06-30 Thread Nathaniel Smith
On Sat, Jun 30, 2012 at 9:15 PM, Daniel Hyams dhy...@gmail.com wrote:
 I am having trouble pickling (and then unpickling) an ndarray. Upon
 unpickling, the base attribute of the ndarray is set to some very strange
 string (base was None when the ndarray was pickled, so it should remain
 None).

This sounds like correct behaviour to me -- is it causing you a
problem? In general ndarray's don't keep things like memory layout,
view sharing, etc. through pickling, and that means that things like
.flags and .base may change.

-n
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in pickling an ndarray?

2012-06-30 Thread Daniel Hyams
Hmmm, I wouldn't think that it is correct behavior; I would think that
*any* ndarray arising from pickling would have its .base attribute set to
None.  If not, then who is really the one that owns the data?

It was my understanding that .base should hold a reference to another
ndarray that the data is really coming from, or it's None.  It certainly
shouldn't be some random string, should it?

And yes, it is causing a problem for me, which is why I noticed it.  In my
application, ndarrays can come from various sources, pickling being one of
them.  Later in the app, I was wanting to resize the array, which you
cannot do if the data is not really owned by that array...I had explicit
check for myarray.base==None, which it is not when I get the ndarray from a
pickle.


-- 
Daniel Hyams
dhy...@gmail.com
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in pickling an ndarray?

2012-06-30 Thread Travis Oliphant
This is the expected behavior.   It is not a bug. 

NumPy arrays after pickling are views into the String that is created by the 
pickling machinery.   Thus, the base is set.  This was done to avoid an 
additional memcpy. 

This avoids a copy, but yes, it does mean that you can't resize the array until 
you make another copy. 

Best regards,

-Travis



On Jun 30, 2012, at 5:33 PM, Daniel Hyams wrote:

 Hmmm, I wouldn't think that it is correct behavior; I would think that *any* 
 ndarray arising from pickling would have its .base attribute set to None.  If 
 not, then who is really the one that owns the data? 
 
 It was my understanding that .base should hold a reference to another ndarray 
 that the data is really coming from, or it's None.  It certainly shouldn't be 
 some random string, should it?
 
 And yes, it is causing a problem for me, which is why I noticed it.  In my 
 application, ndarrays can come from various sources, pickling being one of 
 them.  Later in the app, I was wanting to resize the array, which you cannot 
 do if the data is not really owned by that array...I had explicit check for 
 myarray.base==None, which it is not when I get the ndarray from a pickle.
 
 
 -- 
 Daniel Hyams
 dhy...@gmail.com
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in pickling an ndarray?

2012-06-30 Thread Robert Kern
On Sat, Jun 30, 2012 at 11:33 PM, Daniel Hyams dhy...@gmail.com wrote:
 Hmmm, I wouldn't think that it is correct behavior; I would think that *any*
 ndarray arising from pickling would have its .base attribute set to None.
  If not, then who is really the one that owns the data?

 It was my understanding that .base should hold a reference to another
 ndarray that the data is really coming from, or it's None.  It certainly
 shouldn't be some random string, should it?

It can be any object that will keep the data memory alive while the
object is kept alive. It does not have to be an ndarray. In this case,
the numpy unpickling constructor takes the string object that the
underlying pickling machinery has just created and views its memory
directly. In order to keep Python from freeing that memory, the string
object needs to be kept alive via a reference, so it gets assigned to
the .base.

 And yes, it is causing a problem for me, which is why I noticed it.  In my
 application, ndarrays can come from various sources, pickling being one of
 them.  Later in the app, I was wanting to resize the array, which you cannot
 do if the data is not really owned by that array...

You also can't resize an array if any *other* array has a view on that
array too, so checking for ownership isn't going to help. .resize()
will raise an exception if it can't do this; it's better to just
attempt it and catch the exception than to look before you leap.
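
A minimal sketch of that attempt-and-catch approach (illustrative only, not
from the original post):

import numpy as np

a = np.arange(10)
b = a[:5]                  # a view: b does not own its data
try:
    b.resize(3)            # fails for views, unpickled arrays, etc.
except ValueError:
    b = b[:3].copy()       # fall back to a copy that does own its data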

 I had explicit check for
 myarray.base==None, which it is not when I get the ndarray from a pickle.

That is not the way to check if an ndarray owns its data. Instead,
check a.flags['OWNDATA']

-- 
Robert Kern
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Bug in pickling an ndarray?

2012-06-30 Thread Daniel Hyams
Thanks Travis and Robert for the clarification; it is much more clear
what is going on now.

As the demo code shows, a.flags['OWNDATA'] is also different on its
way out of the pickle, which also makes sense now.  So using that flag
instead of checking a.base for None is equivalent, at least in this
situation.

So is it a bug, then, that on Windows .base is set to None? (Of
course, this may be something that was fixed in later versions of
numpy; I was only able to test Windows with numpy 1.4.1.)

I'll just make a copy and discard the original to work around the
situation (which is what I already had done, but the inconsistent
behavior across versions and platforms made me think it was a bug).

Thanks again for the clear explanation of what is going on.


On Sat, Jun 30, 2012 at 6:33 PM, Daniel Hyams dhy...@gmail.com wrote:

 Hmmm, I wouldn't think that it is correct behavior; I would think that
 *any* ndarray arising from pickling would have its .base attribute set to
 None.  If not, then who is really the one that owns the data?

 It was my understanding that .base should hold a reference to another
 ndarray that the data is really coming from, or it's None.  It certainly
 shouldn't be some random string, should it?

 And yes, it is causing a problem for me, which is why I noticed it.  In my
 application, ndarrays can come from various sources, pickling being one of
 them.  Later in the app, I was wanting to resize the array, which you
 cannot do if the data is not really owned by that array...I had explicit
 check for myarray.base==None, which it is not when I get the ndarray from a
 pickle.


 --
 Daniel Hyams
 dhy...@gmail.com




-- 
Daniel Hyams
dhy...@gmail.com
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] bug in array instanciation?

2012-01-27 Thread Emmanuel Mayssat
In [20]: dt_knobs =
[('pvName',(str,40)),('start','float'),('stop','float'),('mode',(str,10))]

In [21]: r_knobs = np.recarray([],dtype=dt_knobs)

In [22]: r_knobs
Out[22]:
rec.array(('\xa0\x8c\xc9\x02\x00\x00\x00\x00(\xc8v\x02\x00\x00\x00\x00\x00\xd3\x86\x02\x00\x00\x00\x00\x10\xdeJ\x02\x00\x00\x00\x00\x906\xb9\x02',
1.63e-322, 1.351330465085e-312, '\x90\xc6\xa3\x02\x00\x00\x00\x00P'),
  dtype=[('pvName', '|S40'), ('start', 'f8'), ('stop', 'f8'),
('mode', '|S10')])

why is the array not empty?
--
E
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in array instanciation?

2012-01-27 Thread Robert Kern
On Fri, Jan 27, 2012 at 21:17, Emmanuel Mayssat emays...@gmail.com wrote:
 In [20]: dt_knobs =
 [('pvName',(str,40)),('start','float'),('stop','float'),('mode',(str,10))]

 In [21]: r_knobs = np.recarray([],dtype=dt_knobs)

 In [22]: r_knobs
 Out[22]:
 rec.array(('\xa0\x8c\xc9\x02\x00\x00\x00\x00(\xc8v\x02\x00\x00\x00\x00\x00\xd3\x86\x02\x00\x00\x00\x00\x10\xdeJ\x02\x00\x00\x00\x00\x906\xb9\x02',
 1.63e-322, 1.351330465085e-312, '\x90\xc6\xa3\x02\x00\x00\x00\x00P'),
      dtype=[('pvName', '|S40'), ('start', 'f8'), ('stop', 'f8'),
 ('mode', '|S10')])

 why is the array not empty?

The shape [] creates a rank-0 array, which is essentially a scalar.

[~]
|1 x = np.array(10)

[~]
|2 x
array(10)

[~]
|3 x.shape
()


If you want an empty array, you need at least one dimension of size 0:

[~]
|7 r_knobs = np.recarray([0], dtype=dt_knobs)

[~]
|8 r_knobs
rec.array([],
  dtype=[('pvName', '|S40'), ('start', 'f8'), ('stop', 'f8'),
('mode', '|S10')])
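
As an aside (not part of Robert's reply): the strange-looking field values in
the rank-0 record are simply uninitialized memory, since np.recarray, like
np.empty, allocates without zeroing. A small sketch, with np.zeros as a
zero-filled alternative:

import numpy as np

dt = [('pvName', 'S40'), ('start', 'f8'), ('stop', 'f8'), ('mode', 'S10')]

r = np.recarray([], dtype=dt)   # rank-0: shape (), size 1; memory is NOT zeroed
print(r.shape)                  # ()
print(r.size)                   # 1

clean = np.zeros((), dtype=dt)  # the same rank-0 record, but zero-initialized
print(clean['start'])           # 0.0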

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
  -- Umberto Eco
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] bug in numpy.mean() ?

2012-01-24 Thread K . -Michael Aye
I know I know, that's pretty outrageous to even suggest, but please 
bear with me, I am stumped as you may be:

2-D data file here:
http://dl.dropbox.com/u/139035/data.npy

Then:
In [3]: data.mean()
Out[3]: 3067.024383998

In [4]: data.max()
Out[4]: 3052.4343

In [5]: data.shape
Out[5]: (1000, 1000)

In [6]: data.min()
Out[6]: 3040.498

In [7]: data.dtype
Out[7]: dtype('float32')


A mean value calculated per loop over the data gives me 3045.747251076416
I first thought I still misunderstand how data.mean() works, per axis 
and so on, but did the same with a flattenend version with the same 
results.

Am I really soo tired that I can't see what I am doing wrong here?
For completion, the data was read by a osgeo.gdal dataset method called 
ReadAsArray()
My numpy.__version__ gives me 1.6.1 and my whole setup is based on 
Enthought's EPD.

Best regards,
Michael



___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in numpy.mean() ?

2012-01-24 Thread Bruce Southey
On 01/24/2012 12:33 PM, K.-Michael Aye wrote:
 I know I know, that's pretty outrageous to even suggest, but please
 bear with me, I am stumped as you may be:

 2-D data file here:
 http://dl.dropbox.com/u/139035/data.npy

 Then:
 In [3]: data.mean()
 Out[3]: 3067.024383998

 In [4]: data.max()
 Out[4]: 3052.4343

 In [5]: data.shape
 Out[5]: (1000, 1000)

 In [6]: data.min()
 Out[6]: 3040.498

 In [7]: data.dtype
 Out[7]: dtype('float32')


 A mean value calculated per loop over the data gives me 3045.747251076416
 I first thought I still misunderstand how data.mean() works, per axis
 and so on, but did the same with a flattenend version with the same
 results.

 Am I really soo tired that I can't see what I am doing wrong here?
 For completion, the data was read by a osgeo.gdal dataset method called
 ReadAsArray()
 My numpy.__version__ gives me 1.6.1 and my whole setup is based on
 Enthought's EPD.

 Best regards,
 Michael



 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion
You have a million 32-bit floating point numbers that are in the
thousands. Thus you are exceeding 32-bit float precision and, if you
can, you need to increase the precision of the accumulator in np.mean() or
change the input dtype:
>>> a.mean(dtype=np.float32) # default and lacks precision
3067.024383998
>>> a.mean(dtype=np.float64)
3045.747251076416
>>> a.mean(dtype=np.float128)
3045.7472510764160156
>>> b=a.astype(np.float128)
>>> b.mean()
3045.7472510764160156

Otherwise you are left to using some alternative approach to calculate 
the mean.

Bruce
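
One such alternative approach, sketched here for illustration (it is not from
Bruce's post, and it is essentially the shifted-mean trick suggested later in
this thread): keep the data in float32 but accumulate only small differences
around an offset.

import numpy as np

# stand-in for the data.npy array from the thread
data = np.random.uniform(3040.5, 3052.5, (1000, 1000)).astype(np.float32)

naive = data.mean()                        # float32 accumulation: can be noticeably off
offset = data.flat[0]                      # any value near the data
shifted = offset + (data - offset).mean()  # differences are small, so float32 copes
wide = data.mean(dtype=np.float64)         # or simply widen the accumulator

print(naive)
print(shifted)
print(wide)

Widening the accumulator with dtype=np.float64, as shown above, is usually the
simpler fix.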





___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in numpy.mean() ?

2012-01-24 Thread Kathleen M Tacina
I have confirmed this on a 64-bit linux machine running python 2.7.2
with the development version of numpy.  It seems to be related to using
float32 instead of float64.   If the array is first converted to a
64-bit float (via astype), mean gives an answer that agrees with your
looped-calculation value: 3045.747250002.  With the original 32-bit
array, averaging successively on one axis and then on the other gives
answers that agree with the 64-bit float answer to the second decimal
place.


In [125]: d = np.load('data.npy')

In [126]: d.mean()
Out[126]: 3067.024383998

In [127]: d64 = d.astype('float64')

In [128]: d64.mean()
Out[128]: 3045.747251076416

In [129]: d.mean(axis=0).mean()
Out[129]: 3045.748750002

In [130]: d.mean(axis=1).mean()
Out[130]: 3045.74448

In [131]: np.version.full_version
Out[131]: '2.0.0.dev-55472ca'



--
On Tue, 2012-01-24 at 12:33 -0600, K.-Michael Aye wrote:

 I know I know, that's pretty outrageous to even suggest, but please 
 bear with me, I am stumped as you may be:
 
 2-D data file here:
 http://dl.dropbox.com/u/139035/data.npy
 
 Then:
 In [3]: data.mean()
 Out[3]: 3067.024383998
 
 In [4]: data.max()
 Out[4]: 3052.4343
 
 In [5]: data.shape
 Out[5]: (1000, 1000)
 
 In [6]: data.min()
 Out[6]: 3040.498
 
 In [7]: data.dtype
 Out[7]: dtype('float32')
 
 
 A mean value calculated per loop over the data gives me 3045.747251076416
 I first thought I still misunderstand how data.mean() works, per axis 
 and so on, but did the same with a flattenend version with the same 
 results.
 
 Am I really soo tired that I can't see what I am doing wrong here?
 For completion, the data was read by a osgeo.gdal dataset method called 
 ReadAsArray()
 My numpy.__version__ gives me 1.6.1 and my whole setup is based on 
 Enthought's EPD.
 
 Best regards,
 Michael
 
 
 
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

-- 
--
Kathleen M. Tacina
NASA Glenn Research Center
MS 5-10
21000 Brookpark Road
Cleveland, OH 44135
Telephone: (216) 433-6660
Fax: (216) 433-5802
--
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in numpy.mean() ?

2012-01-24 Thread Zachary Pincus

On Jan 24, 2012, at 1:33 PM, K.-Michael Aye wrote:

 I know I know, that's pretty outrageous to even suggest, but please 
 bear with me, I am stumped as you may be:
 
 2-D data file here:
 http://dl.dropbox.com/u/139035/data.npy
 
 Then:
 In [3]: data.mean()
 Out[3]: 3067.024383998
 
 In [4]: data.max()
 Out[4]: 3052.4343
 
 In [5]: data.shape
 Out[5]: (1000, 1000)
 
 In [6]: data.min()
 Out[6]: 3040.498
 
 In [7]: data.dtype
 Out[7]: dtype('float32')
 
 
 A mean value calculated per loop over the data gives me 3045.747251076416
 I first thought I still misunderstand how data.mean() works, per axis 
 and so on, but did the same with a flattenend version with the same 
 results.
 
 Am I really soo tired that I can't see what I am doing wrong here?
 For completion, the data was read by a osgeo.gdal dataset method called 
 ReadAsArray()
 My numpy.__version__ gives me 1.6.1 and my whole setup is based on 
 Enthought's EPD.


I get the same result:

In [1]: import numpy

In [2]: data = numpy.load('data.npy')

In [3]: data.mean()
Out[3]: 3067.024383998

In [4]: data.max()
Out[4]: 3052.4343

In [5]: data.min()
Out[5]: 3040.498

In [6]: numpy.version.version
Out[6]: '2.0.0.dev-433b02a'

This is on OS X 10.7.2 with Python 2.7.1, on an Intel Core i7. Running python as a
32- vs. 64-bit process doesn't make a difference.

The data matrix doesn't look too strange when I view it as an image -- all 
pretty smooth variation around the (min, max) range. But maybe it's still 
somehow floating-point pathological?

This is fun too:
In [12]: data.mean()
Out[12]: 3067.024383998

In [13]: (data/3000).mean()*3000
Out[13]: 3020.807437501

In [15]: (data/2).mean()*2
Out[15]: 3067.024383998

In [16]: (data/200).mean()*200
Out[16]: 3013.67541


Zach


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in numpy.mean() ?

2012-01-24 Thread Val Kalatsky
Just what Bruce said.

You can run the following to confirm:
np.mean(data - data.mean())

If for some reason you do not want to convert to float64 you can add the
result of the previous line to the bad mean:
bad_mean = data.mean()
good_mean = bad_mean + np.mean(data - bad_mean)

Val

On Tue, Jan 24, 2012 at 12:33 PM, K.-Michael Aye kmichael@gmail.comwrote:

 I know I know, that's pretty outrageous to even suggest, but please
 bear with me, I am stumped as you may be:

 2-D data file here:
 http://dl.dropbox.com/u/139035/data.npy

 Then:
 In [3]: data.mean()
 Out[3]: 3067.024383998

 In [4]: data.max()
 Out[4]: 3052.4343

 In [5]: data.shape
 Out[5]: (1000, 1000)

 In [6]: data.min()
 Out[6]: 3040.498

 In [7]: data.dtype
 Out[7]: dtype('float32')


 A mean value calculated per loop over the data gives me 3045.747251076416
 I first thought I still misunderstand how data.mean() works, per axis
 and so on, but did the same with a flattenend version with the same
 results.

 Am I really soo tired that I can't see what I am doing wrong here?
 For completion, the data was read by a osgeo.gdal dataset method called
 ReadAsArray()
 My numpy.__version__ gives me 1.6.1 and my whole setup is based on
 Enthought's EPD.

 Best regards,
 Michael



 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in numpy.mean() ?

2012-01-24 Thread Zachary Pincus
 You have a million 32-bit floating point numbers that are in the 
 thousands. Thus you are exceeding the 32-bitfloat precision and, if you 
 can, you need to increase precision of the accumulator in np.mean() or 
 change the input dtype:
 a.mean(dtype=np.float32) # default and lacks precision
 3067.024383998
 a.mean(dtype=np.float64)
 3045.747251076416
 a.mean(dtype=np.float128)
 3045.7472510764160156
 b=a.astype(np.float128)
 b.mean()
 3045.7472510764160156
 
 Otherwise you are left to using some alternative approach to calculate 
 the mean.
 
 Bruce

Interesting -- I knew that float64 accumulators were used with integer arrays, 
and I had just assumed that 64-bit or higher accumulators would be used with 
floating-point arrays too, instead of the array's dtype. This is actually quite 
a bit of a gotcha for floating-point imaging-type tasks -- good to know!

Zach
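
A quick way to see the difference (a sketch, not from the original message;
the exact float32 result varies by platform and NumPy version, as this thread
shows):

import numpy as np

ints = 4000 * np.ones((1024, 1024), dtype=np.int32)
floats = 4000 * np.ones((1024, 1024), dtype=np.float32)

print(ints.mean())                    # 4000.0: integer input is accumulated in float64
print(floats.mean())                  # may drift away from 4000.0
print(floats.mean(dtype=np.float64))  # 4000.0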
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in numpy.mean() ?

2012-01-24 Thread K . -Michael Aye
Thank you Bruce and all, 
I knew I was doing something wrong (should have read the mean method
doc more closely). I am of course glad that it's so easily understandable.
But if the error can get so big, wouldn't it be a better idea for the
accumulator to always be of type 'float64' and then convert later to 
the type of the original array? 
As one can see in this case, the result would be much closer to the true value.


Michael


On 2012-01-24 19:01:40 +, Val Kalatsky said:



Just what Bruce said. 

You can run the following to confirm:
np.mean(data - data.mean())

If for some reason you do not want to convert to float64 you can add 
the result of the previous line to the bad mean:

bad_mean = data.mean()
good_mean = bad_mean + np.mean(data - bad_mean)

Val

On Tue, Jan 24, 2012 at 12:33 PM, K.-Michael Aye 
kmichael@gmail.com wrote:

I know I know, that's pretty outrageous to even suggest, but please
bear with me, I am stumped as you may be:

2-D data file here:
http://dl.dropbox.com/u/139035/data.npy

Then:
In [3]: data.mean()
Out[3]: 3067.024383998

In [4]: data.max()
Out[4]: 3052.4343

In [5]: data.shape
Out[5]: (1000, 1000)

In [6]: data.min()
Out[6]: 3040.498

In [7]: data.dtype
Out[7]: dtype('float32')


A mean value calculated per loop over the data gives me 3045.747251076416
I first thought I still misunderstand how data.mean() works, per axis
and so on, but did the same with a flattenend version with the same
results.

Am I really soo tired that I can't see what I am doing wrong here?
For completion, the data was read by a osgeo.gdal dataset method called
ReadAsArray()
My numpy.__version__ gives me 1.6.1 and my whole setup is based on
Enthought's EPD.

Best regards,
Michael



___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in numpy.mean() ?

2012-01-24 Thread eat
Hi,

Oddly, numpy 1.6 seems to behave in a more consistent manner:

In []: sys.version
Out[]: '2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)]'
In []: np.version.version
Out[]: '1.6.0'

In []: d= np.load('data.npy')
In []: d.dtype
Out[]: dtype('float32')

In []: d.mean()
Out[]: 3045.74718
In []: d.mean(dtype= np.float32)
Out[]: 3045.74718
In []: d.mean(dtype= np.float64)
Out[]: 3045.747251076416
In []: (d- d.min()).mean()+ d.min()
Out[]: 3045.7472508750002
In []: d.mean(axis= 0).mean()
Out[]: 3045.74724
In []: d.mean(axis= 1).mean()
Out[]: 3045.74724

Or do the results of the calculations depend more on the platform?


My 2 cents,
eat
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in numpy.mean() ?

2012-01-24 Thread Kathleen M Tacina
I found something similar, with a very simple example.

On 64-bit linux, python 2.7.2, numpy development version:

In [22]: a = 4000*np.ones((1024,1024),dtype=np.float32)

In [23]: a.mean()
Out[23]: 4034.16357421875

In [24]: np.version.full_version
Out[24]: '2.0.0.dev-55472ca'


But, a Windows XP machine running python 2.7.2 with numpy 1.6.1 gives:
a = np.ones((1024,1024),dtype=np.float32)
a.mean()
4000.0
np.version.full_version
'1.6.1'


On Tue, 2012-01-24 at 17:12 -0600, eat wrote:

 Hi,
 
 
 
 Oddly, but numpy 1.6 seems to behave more consistent manner:
 
 
 In []: sys.version
 Out[]: '2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit
 (Intel)]'
 In []: np.version.version
 Out[]: '1.6.0'
 
 
 In []: d= np.load('data.npy')
 In []: d.dtype
 Out[]: dtype('float32')
 
 
 In []: d.mean()
 Out[]: 3045.74718
 In []: d.mean(dtype= np.float32)
 Out[]: 3045.74718
 In []: d.mean(dtype= np.float64)
 Out[]: 3045.747251076416
 In []: (d- d.min()).mean()+ d.min()
 Out[]: 3045.7472508750002
 In []: d.mean(axis= 0).mean()
 Out[]: 3045.74724
 In []: d.mean(axis= 1).mean()
 Out[]: 3045.74724
 
 
 Or does the results of calculations depend more on the platform?
 
 
 
 
 My 2 cents,
 eat

-- 
--
Kathleen M. Tacina
NASA Glenn Research Center
MS 5-10
21000 Brookpark Road
Cleveland, OH 44135
Telephone: (216) 433-6660
Fax: (216) 433-5802
--
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in numpy.mean() ?

2012-01-24 Thread eat
Hi

On Wed, Jan 25, 2012 at 1:21 AM, Kathleen M Tacina 
kathleen.m.tac...@nasa.gov wrote:

 I found something similar, with a very simple example.

 On 64-bit linux, python 2.7.2, numpy development version:

 In [22]: a = 4000*np.ones((1024,1024),dtype=np.float32)

 In [23]: a.mean()
 Out[23]: 4034.16357421875

 In [24]: np.version.full_version
 Out[24]: '2.0.0.dev-55472ca'


 But, a Windows XP machine running python 2.7.2 with numpy 1.6.1 gives:
 a = np.ones((1024,1024),dtype=np.float32)
 a.mean()
 4000.0
 np.version.full_version
 '1.6.1'

This indeed looks very nasty, regardless of whether it is a version or
platform related problem.

-eat




 On Tue, 2012-01-24 at 17:12 -0600, eat wrote:

 Hi,



  Oddly, but numpy 1.6 seems to behave more consistent manner:



  In []: sys.version

  Out[]: '2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit
 (Intel)]'

  In []: np.version.version

  Out[]: '1.6.0'



  In []: d= np.load('data.npy')

  In []: d.dtype

  Out[]: dtype('float32')



  In []: d.mean()

  Out[]: 3045.74718

  In []: d.mean(dtype= np.float32)

  Out[]: 3045.74718

  In []: d.mean(dtype= np.float64)

  Out[]: 3045.747251076416

  In []: (d- d.min()).mean()+ d.min()

  Out[]: 3045.7472508750002

  In []: d.mean(axis= 0).mean()

  Out[]: 3045.74724

  In []: d.mean(axis= 1).mean()

  Out[]: 3045.74724



  Or does the results of calculations depend more on the platform?





  My 2 cents,

  eat

   --
 --
 Kathleen M. Tacina
 NASA Glenn Research Center
 MS 5-10
 21000 Brookpark Road
 Cleveland, OH 44135
 Telephone: (216) 433-6660
 Fax: (216) 433-5802
 --


 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in numpy.mean() ?

2012-01-24 Thread josef . pktd
On Tue, Jan 24, 2012 at 7:21 PM, eat e.antero.ta...@gmail.com wrote:

 Hi

 On Wed, Jan 25, 2012 at 1:21 AM, Kathleen M Tacina 
 kathleen.m.tac...@nasa.gov wrote:

 I found something similar, with a very simple example.

 On 64-bit linux, python 2.7.2, numpy development version:

 In [22]: a = 4000*np.ones((1024,1024),dtype=np.float32)

 In [23]: a.mean()
 Out[23]: 4034.16357421875

 In [24]: np.version.full_version
 Out[24]: '2.0.0.dev-55472ca'


 But, a Windows XP machine running python 2.7.2 with numpy 1.6.1 gives:
 a = np.ones((1024,1024),dtype=np.float32)
 a.mean()
 4000.0
 np.version.full_version
 '1.6.1'

 This indeed looks very nasty, regardless of whether it is a version or
 platform related problem.


Looks like it is platform specific; same result as eat.

Windows 7,
Python 2.6.5 (r265:79096, Mar 19 2010, 21:48:26) [MSC v.1500 32 bit
(Intel)] on win32

>>> a = np.ones((1024,1024),dtype=np.float32)
>>> a.mean()
1.0

>>> (4000*a).dtype
dtype('float32')
>>> (4000*a).mean()
4000.0

>>> b = np.load("data.npy")
>>> b.mean()
3045.74718
>>> b.shape
(1000, 1000)
>>> b.mean(0).mean(0)
3045.74724
>>> _.dtype
dtype('float64')
>>> b.dtype
dtype('float32')

>>> b.mean(dtype=np.float32)
3045.74718

Josef



 -eat




 On Tue, 2012-01-24 at 17:12 -0600, eat wrote:

 Hi,



  Oddly, but numpy 1.6 seems to behave more consistent manner:



  In []: sys.version

  Out[]: '2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit
 (Intel)]'

  In []: np.version.version

  Out[]: '1.6.0'



  In []: d= np.load('data.npy')

  In []: d.dtype

  Out[]: dtype('float32')



  In []: d.mean()

  Out[]: 3045.74718

  In []: d.mean(dtype= np.float32)

  Out[]: 3045.74718

  In []: d.mean(dtype= np.float64)

  Out[]: 3045.747251076416

  In []: (d- d.min()).mean()+ d.min()

  Out[]: 3045.7472508750002

  In []: d.mean(axis= 0).mean()

  Out[]: 3045.74724

  In []: d.mean(axis= 1).mean()

  Out[]: 3045.74724



  Or does the results of calculations depend more on the platform?





  My 2 cents,

  eat

   --
 --
 Kathleen M. Tacina
 NASA Glenn Research Center
 MS 5-10
 21000 Brookpark Road
 Cleveland, OH 44135
 Telephone: (216) 433-6660
 Fax: (216) 433-5802
 --


 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion



 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in numpy.mean() ?

2012-01-24 Thread Charles R Harris
On Tue, Jan 24, 2012 at 4:21 PM, Kathleen M Tacina 
kathleen.m.tac...@nasa.gov wrote:

 I found something similar, with a very simple example.

 On 64-bit linux, python 2.7.2, numpy development version:

 In [22]: a = 4000*np.ones((1024,1024),dtype=np.float32)

 In [23]: a.mean()
 Out[23]: 4034.16357421875

 In [24]: np.version.full_version
 Out[24]: '2.0.0.dev-55472ca'


 But, a Windows XP machine running python 2.7.2 with numpy 1.6.1 gives:
 a = np.ones((1024,1024),dtype=np.float32)
 a.mean()
 4000.0
 np.version.full_version
 '1.6.1'



Yes, the results are platform/compiler dependent. The 32 bit platforms tend
to use extended precision accumulators and the x87 instruction set. The 64
bit platforms tend to use sse2+. Different precisions, even though you
might think they are the same.

snip

Chuck
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in numpy.mean() ?

2012-01-24 Thread josef . pktd
On Wed, Jan 25, 2012 at 12:03 AM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Tue, Jan 24, 2012 at 4:21 PM, Kathleen M Tacina
 kathleen.m.tac...@nasa.gov wrote:

 I found something similar, with a very simple example.

 On 64-bit linux, python 2.7.2, numpy development version:

 In [22]: a = 4000*np.ones((1024,1024),dtype=np.float32)

 In [23]: a.mean()
 Out[23]: 4034.16357421875

 In [24]: np.version.full_version
 Out[24]: '2.0.0.dev-55472ca'


 But, a Windows XP machine running python 2.7.2 with numpy 1.6.1 gives:
 a = np.ones((1024,1024),dtype=np.float32)
 a.mean()
 4000.0
 np.version.full_version
 '1.6.1'



 Yes, the results are platform/compiler dependent. The 32 bit platforms tend
 to use extended precision accumulators and the x87 instruction set. The 64
 bit platforms tend to use sse2+. Different precisions, even though you might
 think they are the same.

Just to confirm: same computer as before, but the Python 3.2 version is
64-bit, and now I get the Linux result.

Python 3.2 (r32:88445, Feb 20 2011, 21:30:00) [MSC v.1500 64 bit
(AMD64)] on win32

>>> import numpy as np
>>> np.__version__
'1.5.1'
>>> a = 4000*np.ones((1024,1024),dtype=np.float32)
>>> a.mean()
4034.16357421875
>>> a.mean(0).mean(0)
4000.0
>>> a.mean(dtype=np.float64)
4000.0

Josef


 snip

 Chuck

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in PyArray_GetCastFunc

2011-12-04 Thread Charles R Harris
On Sat, Dec 3, 2011 at 5:28 PM, Geoffrey Irving irv...@naml.us wrote:

 When attempting to cast to a user defined type, PyArray_GetCast looks
 up the cast function in the dictionary but doesn't check if the entry
 exists.  This causes segfaults.  Here's a patch.

 Geoffrey

 diff --git a/numpy/core/src/multiarray/convert_datatype.c
 b/numpy/core/src/multiarray/convert_datatype.c
 index 818d558..4b8f38b 100644
 --- a/numpy/core/src/multiarray/convert_datatype.c
 +++ b/numpy/core/src/multiarray/convert_datatype.c
 @@ -81,7 +81,7 @@ PyArray_GetCastFunc(PyArray_Descr *descr, int type_num)
 key = PyInt_FromLong(type_num);
 cobj = PyDict_GetItem(obj, key);
 Py_DECREF(key);
 -if (NpyCapsule_Check(cobj)) {
+if (cobj && NpyCapsule_Check(cobj)) {
 castfunc = NpyCapsule_AsVoidPtr(cobj);
 }
 }
 __


I'm thinking NpyCapsule_Check should catch this. From the documentation it
probably should:

int PyCObject_Check(PyObject *p)
Return true if its argument is a PyCObject
(http://docs.python.org/release/2.7/c-api/cobject.html#PyCObject)

I don't think NULL is a valid PyCObject ;) However, it should be easy to
add the NULL check to the numpy version of the function. I'll do that.

Chuck
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in PyArray_GetCastFunc

2011-12-04 Thread Geoffrey Irving
On Sun, Dec 4, 2011 at 10:02 AM, Charles R Harris
charlesr.har...@gmail.com wrote:


 On Sat, Dec 3, 2011 at 5:28 PM, Geoffrey Irving irv...@naml.us wrote:

 When attempting to cast to a user defined type, PyArray_GetCast looks
 up the cast function in the dictionary but doesn't check if the entry
 exists.  This causes segfaults.  Here's a patch.

 Geoffrey

 diff --git a/numpy/core/src/multiarray/convert_datatype.c
 b/numpy/core/src/multiarray/convert_datatype.c
 index 818d558..4b8f38b 100644
 --- a/numpy/core/src/multiarray/convert_datatype.c
 +++ b/numpy/core/src/multiarray/convert_datatype.c
 @@ -81,7 +81,7 @@ PyArray_GetCastFunc(PyArray_Descr *descr, int type_num)
             key = PyInt_FromLong(type_num);
             cobj = PyDict_GetItem(obj, key);
             Py_DECREF(key);
 -            if (NpyCapsule_Check(cobj)) {
  +            if (cobj && NpyCapsule_Check(cobj)) {
                 castfunc = NpyCapsule_AsVoidPtr(cobj);
             }
         }
 __


 I'm thinking NpyCapsule_Check should catch this. From the documentation it
 probably should:

 int PyCObject_Check(PyObject *p)
 Return true if its argument is a PyCObject

 I don't think NULL is a valid PyCObject ;) However, it should be easy to add
 the NULL check to the numpy version of the function. I'll do that.

That would work, but I think it would match the rest of the Python API
better if NpyCapsule_Check required a non-null argument.
PyCapsule_Check and essentially every other Python API function have
documented undefined behavior if you pass in null, so it might be
surprising if one numpy macro violated this.  Incidentally, every other
use of NpyCapsule_Check correctly tests for null.

Geoffrey
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in PyArray_GetCastFunc

2011-12-04 Thread Charles R Harris
On Sun, Dec 4, 2011 at 6:30 PM, Geoffrey Irving irv...@naml.us wrote:

 On Sun, Dec 4, 2011 at 10:02 AM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 
 
  On Sat, Dec 3, 2011 at 5:28 PM, Geoffrey Irving irv...@naml.us wrote:
 
  When attempting to cast to a user defined type, PyArray_GetCast looks
  up the cast function in the dictionary but doesn't check if the entry
  exists.  This causes segfaults.  Here's a patch.
 
  Geoffrey
 
  diff --git a/numpy/core/src/multiarray/convert_datatype.c
  b/numpy/core/src/multiarray/convert_datatype.c
  index 818d558..4b8f38b 100644
  --- a/numpy/core/src/multiarray/convert_datatype.c
  +++ b/numpy/core/src/multiarray/convert_datatype.c
  @@ -81,7 +81,7 @@ PyArray_GetCastFunc(PyArray_Descr *descr, int
 type_num)
  key = PyInt_FromLong(type_num);
  cobj = PyDict_GetItem(obj, key);
  Py_DECREF(key);
  -if (NpyCapsule_Check(cobj)) {
  +if (cobj && NpyCapsule_Check(cobj)) {
  castfunc = NpyCapsule_AsVoidPtr(cobj);
  }
  }
  __
 
 
  I'm thinking NpyCapsule_Check should catch this. From the documentation
 it
  probably should:
 
  int PyCObject_Check(PyObject *p)
  Return true if its argument is a PyCObject
 
  I don't think NULL is a valid PyCObject ;) However, it should be easy to
 add
  the NULL check to the numpy version of the function. I'll do that.

 That would work, but I think would match the rest of the Python API
 better if NpyCapsule_Check required a nonnull argument.
 PyCapsule_Check and essentially every other Python API function have
 documented undefined behavior if you pass in null, so it might be
 surprising one numpy macro violates this.  Incidentally, every other
 use of NpyCapsule_Check correctly tests for null.


Good points. I may change it back ;)

Chuck
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] bug in PyArray_GetCastFunc

2011-12-03 Thread Geoffrey Irving
When attempting to cast to a user defined type, PyArray_GetCast looks
up the cast function in the dictionary but doesn't check if the entry
exists.  This causes segfaults.  Here's a patch.

Geoffrey

diff --git a/numpy/core/src/multiarray/convert_datatype.c
b/numpy/core/src/multiarray/convert_datatype.c
index 818d558..4b8f38b 100644
--- a/numpy/core/src/multiarray/convert_datatype.c
+++ b/numpy/core/src/multiarray/convert_datatype.c
@@ -81,7 +81,7 @@ PyArray_GetCastFunc(PyArray_Descr *descr, int type_num)
 key = PyInt_FromLong(type_num);
 cobj = PyDict_GetItem(obj, key);
 Py_DECREF(key);
-if (NpyCapsule_Check(cobj)) {
+if (cobj && NpyCapsule_Check(cobj)) {
 castfunc = NpyCapsule_AsVoidPtr(cobj);
 }
 }
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] bug in reshape ?

2011-11-22 Thread josef . pktd
might be an old story (np.__version__ is '1.5.1')

I thought for once it's easier to use reshape to add a new axis
instead of ...,None
but my results got weird (normal(0,1) sample of 2.13795875e-314)

>>> x = 1
>>> y = np.arange(3)
>>> z = np.arange(2)[:,None]
>>> np.broadcast(x,y,z)
<numpy.broadcast object at 0x04C0DCA0>
>>> np.broadcast_arrays(x,y,z)
[array([[1, 1, 1],
   [1, 1, 1]]), array([[0, 1, 2],
   [0, 1, 2]]), array([[0, 0, 0],
   [1, 1, 1]])]
>>> x1, y1, z1 = np.broadcast_arrays(x,y,z)
>>> map(np.shape, (x1, y1, z1))
[(2, 3), (2, 3), (2, 3)]

shape looks fine, let's add an extra axis with reshape

>>> x1.reshape(2,3,1)
array([[[  1],
[  1],
[ 1099464714]],

   [[-2147481592],
[184],
[  1]]])

what's that ?

>>> (0+x1).reshape(2,3,1)
array([[[1],
[1],
[1]],

   [[1],
[1],
[1]]])

>>> (y1*1.).reshape(2,3,1)
array([[[ 0.],
[ 1.],
[ 2.]],

   [[ 0.],
[ 1.],
[ 2.]]])

>>> (y1).reshape(2,3,1)
array([[[  0],
[  1],
[  2]],

   [[  0],
[ 1099447643],
[-2147483648]]])


>>> x1, y1, z1 = np.broadcast_arrays(x,y,z)
>>> x1[...,None]
array([[[1],
[1],
[1]],

   [[1],
[1],
[1]]])

>>> x1.shape
(2, 3)
>>> x1.reshape(2,3,1)
array([[[  1],
[  1],
[ 1099464730]],

   [[-2147479536],
[ -445054780],
[ 1063686842]]])


the background story: playing broadcasting tricks for
http://projects.scipy.org/scipy/ticket/1544

Josef
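
A hedged aside, not from the thread: the garbage values look like
uninitialized memory showing through when reshape is applied to the
zero-stride views returned by broadcast_arrays in this NumPy version. Forcing
a real copy first appears to sidestep it, as does adding the axis with
indexing:

import numpy as np

x1, y1, z1 = np.broadcast_arrays(1, np.arange(3), np.arange(2)[:, None])

safe = np.ascontiguousarray(x1).reshape(2, 3, 1)   # copy first, then reshape
print(safe.shape)        # (2, 3, 1)

also_safe = x1[..., None]                          # adding an axis directly also works
print(also_safe.shape)   # (2, 3, 1)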
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

