Re: [Numpy-discussion] numpy.spacing question

2014-12-04 Thread Alok Singhal
On Thu, Dec 4, 2014 at 4:25 PM, Ryan Nelson  wrote:
>
> I guess I'm a little confused about how the spacing values are calculated.

np.spacing(x) is basically the same as np.nextafter(x, np.inf) - x,
i.e., it returns the minimum positive number that can be added to x to
get a number that's different from x.

> My expectation is that the first logical test should give an output array
> where all of the results are the same. But it is also very likely that I
> don't have any idea what's going on. Can someone provide some clarification?

For 1e-10, np.spacing() is 1.2924697071141057e-26.  1e-10 * eps is
2.2204460492503132e-26, which, when added to 1e-10, rounds to the
closest number representable in 64-bit floating point.  That number
happens to be 1e-10 + 2*np.spacing(1e-10), not
1e-10 + 1*np.spacing(1e-10).
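
A quick numerical check of the above (the values are for IEEE double
precision):

```python
import numpy as np

x = 1e-10
s = np.spacing(x)                  # ULP of x: 2**-86, about 1.2925e-26
assert s == np.nextafter(x, np.inf) - x

eps = np.finfo(np.float64).eps     # 2**-52, about 2.2204e-16
# x*eps is about 1.72 ULPs of x, so x + x*eps rounds up by 2 ULPs, not 1:
assert (x + x * eps) - x == 2 * s
```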

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] longlong format error with Python <= 2.6 in scalartypes.c

2011-08-19 Thread Alok Singhal
On Thu, Aug 18, 2011 at 9:01 PM, Mark Wiebe  wrote:
> On Thu, Aug 4, 2011 at 4:08 PM, Derek Homeier
>  wrote:
>>
>> Hi,
>>
>> commits c15a807e and c135371e (thus most immediately addressed to Mark,
>> but I am sending this to the list hoping for more insight on the issue)
>> introduce a test failure with Python 2.5+2.6 on Mac:
>>
>> FAIL: test_timedelta_scalar_construction (test_datetime.TestDateTime)
>> --
>> Traceback (most recent call last):
>>  File
>> "/Users/derek/lib/python2.6/site-packages/numpy/core/tests/test_datetime.py",
>> line 219, in test_timedelta_scalar_construction
>>    assert_equal(str(np.timedelta64(3, 's')), '3 seconds')
>>  File "/Users/derek/lib/python2.6/site-packages/numpy/testing/utils.py",
>> line 313, in assert_equal
>>    raise AssertionError(msg)
>> AssertionError:
>> Items are not equal:
>>  ACTUAL: '%lld seconds'
>>  DESIRED: '3 seconds'
>>
>> due to the "lld" format passed to PyUString_FromFormat in scalartypes.c.
>> In the current npy_common.h I found the comment
>>  *      in Python 2.6 the %lld formatter is not supported. In this
>>  *      case we work around the problem by using the %zd formatter.
>> though I did not notice that problem when I cleaned up the
>> NPY_LONGLONG_FMT definitions in that file (and it is not entirely clear
>> whether the comment only pertains to Windows...). Anyway changing the
>> formatters in scalartypes.c to "zd" as well removes the failure and still
>> works with Python 2.7 and 3.2 (at least on Mac OS). However I am wondering
>> if
>> a) NPY_[U]LONGLONG_FMT should also be defined conditional to the Python
>> version (and if "%zu" is a valid formatter), and
>> b) scalartypes.c should use NPY_LONGLONG_FMT from npy_common.h
>>
>> I am attaching a patch implementing a), but only the quick and dirty
>> solution to b).
>
> I've touched this stuff as little as possible, because I rather dislike the
> way the *_FMT macros are set up right now. I added a comment about
> NPY_INTP_FMT in npy_common.h which I see you read. If you're going to try to
> fix this, I hope you fix it deeper than this patch so it's not error-prone
> anymore.
> NPY_INTP_FMT is used together with PyErr_Format/PyString_FromFormat, whereas
> the other *_FMT are used with the *printf functions from the C libraries.
> These are not compatible, and the %zd hack was put in place because it
> exists even in Python 2.4, and Py_ssize_t seems to match the pointer
> size in all CPython versions.
> Switching the timedelta64 format in scalartypes.c.src to "%zd" won't help on
> 32-bit platforms, because Py_ssize_t won't be a 64-bit type there, unlike
> in the NPY_INTP_FMT case, where it works. In summary:
> * There need to be changes to create a clear distinction between the *_FMT
> for PyString_FromFormat vs the *_FMT for C library *printf functions
> * I suspect we're out of luck for 32-bit older versions of CPython with
> PyString_FromFormat
> Cheers,
> -Mark

By the way, the above bug is fixed in the current master (see
https://github.com/numpy/numpy/commit/730b861120094b1ab38670b9a8895a36c19296a7).
 I fixed it in the most direct way possible, because the "correct" way
would require changes to a lot of places.
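
As a sanity check (on a NumPy build that contains the fix), the assertion
that originally failed in the test now passes:

```python
import numpy as np

# The assertion from test_timedelta_scalar_construction that produced
# '%lld seconds' on Python <= 2.6 before the fix:
assert str(np.timedelta64(3, 's')) == '3 seconds'
```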



Re: [Numpy-discussion] NumPy Featured on RCE-Cast

2011-02-14 Thread Alok Singhal




Re: [Numpy-discussion] Indexing a 2-d array with a 1-d mask

2011-02-09 Thread Alok Singhal
On Wed, Feb 9, 2011 at 1:24 AM, Friedrich Romstedt
 wrote:
> 2011/2/8 Alok Singhal :
>> In [6]: data2 = numpy.zeros((0, 5), 'd')
>> In [7]: mask2 = numpy.zeros(0, 'bool')
>> In [8]: data2[mask2]
>> 
>> Traceback (most recent call last):
>>  File "", line 1, in 
>> IndexError: invalid index
>>
>> I would have expected the above to give me a 0x5 array.
>
>> Is there any other way to do what I
>> am doing?
>
> Like so (works only for ndim==2):
>
>>>> d1 = numpy.arange(0).reshape((0, 5))
>>>> d2 = numpy.arange(15).reshape((3, 5))
>>>> index1 = numpy.asarray([], dtype=numpy.bool)
>>>> index2 = numpy.asarray([True, False, True], dtype=numpy.bool)
>>>> x = numpy.arange(5)
>>>> (x1, y1) = numpy.meshgrid(x, index1.nonzero()[0])
>>>> (x2, y2) = numpy.meshgrid(x, index2.nonzero()[0])
>>>> (x1, y1)
> (array([], shape=(0, 5), dtype=int32), array([], shape=(0, 5), dtype=int32))
>>>> print x2, "\n", y2
> [[0 1 2 3 4]
>  [0 1 2 3 4]]
> [[0 0 0 0 0]
>  [2 2 2 2 2]]
>>>> d1[y1, x1]
> array([], shape=(0, 5), dtype=int32)
>>>> d2[y1, x1]
> array([], shape=(0, 5), dtype=int32)
>>>> d2[y2, x2]
> array([[ 0,  1,  2,  3,  4],
>       [10, 11, 12, 13, 14]])

Yeah, I can do it by creating the full index array, but I have huge
data sets, so I was hoping to avoid that.  For now, I can just check
for the borderline case and keep using the memory-efficient indexing
for the "regular" cases.

> I don't know if the other thing is a bug, but it looks like one.  I
> could imagine that it has something to do with the implicit slicing on
> the array without data?  Rather an imperfection ...
>
> Consider this:
>
>>>> d1 = numpy.arange(0).reshape((0,))
>>>> d2 = numpy.arange(0).reshape((0, 5))
>>>> d3 = numpy.arange(0).reshape((5, 0))
>>>> d1[[]]
> array([], dtype=int32)
>>>> d2[[]]
> Traceback (most recent call last):
>  File "", line 1, in 
> IndexError: invalid index
>>>> d2[[], 0]
> array([], dtype=int32)
>>>> d3[[]]
> array([], shape=(0, 0), dtype=int32)
>>>> d3[0, []]
> array([], dtype=int32)
>>>> d3[:, []]
> Traceback (most recent call last):
>  File "", line 1, in 
> IndexError: invalid index
>
> Ticket?

I think so too, although I don't know whether this behavior is a
feature of advanced indexing rather than a bug.

Thanks,
Alok



[Numpy-discussion] Indexing a 2-d array with a 1-d mask

2011-02-08 Thread Alok Singhal
Hi,

I have an NxM array, which I am indexing with a 1-d, length N boolean
array.  For example, with a 3x5 array:

In [1]: import numpy
In [2]: data = numpy.arange(15)
In [3]: data.shape = 3, 5

Now, I want to select rows 0 and 2, so I can do:

In [4]: mask = numpy.array([True, False, True])
In [5]: data[mask]
Out[5]:
array([[ 0,  1,  2,  3,  4],
       [10, 11, 12, 13, 14]])

But when the shape of 'data' is a 0xM, this indexing fails:

In [6]: data2 = numpy.zeros((0, 5), 'd')
In [7]: mask2 = numpy.zeros(0, 'bool')
In [8]: data2[mask2]

Traceback (most recent call last):
 File "", line 1, in 
IndexError: invalid index

I would have expected the above to give me a 0x5 array.

Of course, I can check on "len(data)" and not use the above indexing
when it is zero, but I am hoping that I don't need to special case the
boundary condition and have numpy fancy indexing do the "right thing"
always.  Is this a bug in numpy?  Is there any other way to do what I
am doing?
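
(For readers finding this thread later: this boundary case was eventually
fixed, and on a modern NumPy release the boolean mask indexes an empty
first axis as expected:)

```python
import numpy as np

data2 = np.zeros((0, 5))
mask2 = np.zeros(0, dtype=bool)

# Boolean indexing along axis 0 works even when len(data2) == 0,
# returning an empty 0x5 result instead of raising IndexError:
assert data2[mask2].shape == (0, 5)
```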

Here is my numpy setup (numpy installed from the git repository):

In [1]: import numpy
In [2]: numpy.__version__
Out[2]: '1.6.0.dev-13c83fd'
In [3]: numpy.show_config()
blas_info:
   libraries = ['blas']
   library_dirs = ['/usr/lib']
   language = f77
lapack_info:
   libraries = ['lapack']
   library_dirs = ['/usr/lib']
   language = f77
atlas_threads_info:
 NOT AVAILABLE
blas_opt_info:
   libraries = ['blas']
   library_dirs = ['/usr/lib']
   language = f77
   define_macros = [('NO_ATLAS_INFO', 1)]
atlas_blas_threads_info:
 NOT AVAILABLE
lapack_opt_info:
   libraries = ['lapack', 'blas']
   library_dirs = ['/usr/lib']
   language = f77
   define_macros = [('NO_ATLAS_INFO', 1)]
atlas_info:
 NOT AVAILABLE
lapack_mkl_info:
 NOT AVAILABLE
blas_mkl_info:
 NOT AVAILABLE
atlas_blas_info:
 NOT AVAILABLE
mkl_info:
 NOT AVAILABLE
In [4]: import sys
In [5]: print sys.version
2.6.5 (r265:79063, Apr 16 2010, 13:09:56)
[GCC 4.4.3]

Thanks!



Re: [Numpy-discussion] datetime64

2009-10-29 Thread Alok Singhal
Hi,

On 29/10/09: 12:18, Ariel Rokem wrote:
> I want to start trying out the new dtype for representation of arrays
> of times, datetime64, which is implemented in the current svn. Is
> there any documentation anywhere? I know of this proposal:
> 
> http://numpy.scipy.org/svn/numpy/tags/1.3.0/doc/neps/datetime-proposal3.rst
> 
> but apparently the current implementation of the dtype didn't follow
> this proposal - the hypothetical examples in the spec don't work with
> the implementation.
> I just want to see a couple of examples on how to initialize arrays of
> this dtype, and what kinds of operations can be done with them (and
> with timedelta64).

I think the only thing that works as of now for dates and deltas is
using datetime.datetime and datetime.timedelta objects in the
initialization of the arrays.  See
http://projects.scipy.org/numpy/ticket/1225 for some tests.

Even when you construct the arrays using datetime.datetime objects,
things are a bit strange:

In [1]: import numpy as np
In [2]: np.__version__
Out[2]: '1.4.0.dev7599'
In [3]: import datetime
In [4]: d = datetime.datetime(2009, 10, 5, 12, 35, 2)
In [5]: d1 = datetime.datetime.now()
In [6]: np.array([d, d1], 'M')
Out[6]: array([2009-10-04 23:27:37.359744, 2009-10-29 00:10:59.677844], 
dtype=datetime64[ns])
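
(For readers coming to this thread later: the datetime64 implementation
was overhauled for NumPy 1.7, and on a modern NumPy arrays can be built
directly from ISO 8601 strings with an explicit unit, along the lines of:)

```python
import numpy as np

# Construction from ISO 8601 strings with an explicit unit:
a = np.array(['2009-10-05T12:35:02', '2009-10-29T00:10:59'],
             dtype='datetime64[s]')
assert str(a.dtype) == 'datetime64[s]'
assert str(a[0]) == '2009-10-05T12:35:02'

# timedelta64 arises naturally from subtraction:
d = a[1] - a[0]
assert d.dtype == np.dtype('timedelta64[s]')
```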

-Alok

-- 
   *   *
Alok Singhal   *   * *
http://www.astro.virginia.edu/~as8ca/
   **


Re: [Numpy-discussion] Different results from repeated calculation, part 2

2008-08-14 Thread Alok Singhal
On 14/08/08: 10:20, Keith Goodman wrote:
> A unit test is attached. It contains three tests:
> 
> In test1, I construct matrices x and y and then repeatedly calculate z
> = calc(x,y). The result z is the same every time. So this test passes.
> 
> In test2, I construct matrices x and y each time before calculating z
> = calc(x,y). Sometimes z is slightly different. But the x's test to be
> equal and so do the y's. This test fails (on Debian Lenny, Core 2 Duo,
> with libatlas3gf-sse2 but not with libatlas3gf-sse).
> 
> test3 is the same as test2 but I calculate z like this: z =
> calc(100*x,y) / (100 * 100). This test passes.
> 
> I get:
> 
> ==
> FAIL: repeatability #2
> --
> Traceback (most recent call last):
>   File "/home/[snip]/test/repeat_test.py", line 73, in test_repeat_2
> self.assert_(result, msg)
> AssertionError: Max difference = 2.04946e-16

Could this be because of how the calculations are done?  If the
floating-point numbers are stored in the CPU registers, in this case
(Intel Core Duo), they are 80-bit values, whereas 'double' precision
is 64 bits.  Depending on gcc's optimization settings, the number of
automatic variables, etc., it is entirely possible that the numbers
are kept in registers in some cases and stored in RAM in others.
Thus, in your tests, some intermediate values are sometimes held in
the wider registers, making those calculations differ slightly from
the ones whose values were spilled to memory.

See "The pitfalls of verifying floating-point computations" at
http://portal.acm.org/citation.cfm?doid=1353445.1353446 (or if that
needs subscription, you can download the PDF from
http://arxiv.org/abs/cs/0701192).  The paper has a lot of examples of
surprises like this.  Quote:

  We shall discuss the following myths, among others:

  ...

  - "Arithmetic operations are deterministic; that is, if I do z=x+y in
two places in the same program and my program never touches x and y
in the meantime, then the results should be the same."
  
  - A variant: "If x < 1 tests true at one point, then x < 1 stays true
later if I never modify x."
  
  ...

-Alok



Re: [Numpy-discussion] min() of array containing NaN

2008-08-13 Thread Alok Singhal
On 12/08/08: 18:31, Charles R Harris wrote:
> On Tue, Aug 12, 2008 at 6:28 PM, Charles R Harris
> <[EMAIL PROTECTED]> wrote:
>> I suppose you could use
>> min(a,b) = (abs(a - b) + a + b)/2
>> which would have that effect.
>
> Hmm, that is for the max; min would be
> (a + b - |a - b|)/2

This would break when there is an overflow because of
addition/subtraction:

def new_min(a, b):
  return (a + b - abs(a-b))/2

a = 1e308
b = -1e308

new_min(a, b) # returns -inf
min(a, b) # returns -1e308
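
For the question in this thread's subject (min of arrays containing NaN),
NumPy also provides explicit NaN-aware reductions; a quick sketch on a
modern NumPy release:

```python
import numpy as np

a = np.array([1.0, np.nan, 3.0])
# np.min propagates NaN; np.nanmin ignores it:
assert np.isnan(np.min(a))
assert np.nanmin(a) == 1.0

# Element-wise: np.fmin skips NaN where possible, np.minimum propagates it:
assert np.fmin(np.nan, 2.0) == 2.0
assert np.isnan(np.minimum(np.nan, 2.0))
```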
