Re: [Numpy-discussion] Trac internal error
On Tue, May 13, 2008 at 1:37 AM, Pauli Virtanen [EMAIL PROTECTED] wrote: Something seems to be wrong with the Trac: http://scipy.org/scipy/numpy/timeline Internal Error Ticket changes event provider (TicketModule) failed: SubversionException: (Can't open file '/home/scipy/svn/numpy/db/revprops/5159': Permission denied, 13) Although this one sorted itself by morning, I kept running into permissions issues in various places today. I finally went ahead and changed some things and I hope these permissions issues will disappear. If anyone else runs into something like this in the next few days, please let me know. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] ticket 788: possible blocker
2008/5/14 Matthew Brett [EMAIL PROTECTED]: Hi, Stefan, sometimes the fix really is clear and a test is like closing the barn door after the horse has bolted. Sometimes it isn't even clear *how* to test. I committed one fix and omitted a test because I couldn't think of anything really reasonable. I think concentrating on unit tests is more productive in the long run because we will find *new* bugs, and if done right they will also cover spots where old bugs were found. I must say that I have certainly (correctly) fixed a bug, and then broken the code somewhere else resulting in the same effect as the original bug, and missed it because I didn't put in a test the first time. I do agree (with everyone else I think) that it's a very good habit to get into to submit a test with every fix, no matter how obvious. Best, I agree as well: what may be obvious to someone is not obvious to someone else, and there are many examples where I thought the code did this but in fact did that (and I saw it regularly in my courses with some students). Matthieu -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher
[Numpy-discussion] mac osx10.5 and external library crash (possible f2py/numpy problem?)
I've just moved all my stuff from an Intel iMac on 10.4.11 to a Mac Pro on 10.5.2. On both machines I have the same universal ActiveState Python (both 2.5.1 and 2.5.2.2 give the same problem). I have some codes in Fortran from which I build a shared library using f2py from numpy. Now when I import the library as built on OSX 10.4 it all works fine, but when I import the library as built on 10.5 I get the following error message:

willgoose-macpro:system garrywillgoose$ python
ActivePython 2.5.1.1 (ActiveState Software Inc.) based on
Python 2.5.1 (r251:54863, May 1 2007, 17:40:00)
[GCC 4.0.1 (Apple Computer, Inc. build 5250)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import tsimdtm
Fatal Python error: Interpreter not initialized (version mismatch?)
Abort trap

It's only my libraries that are the problem. Now also (1) if I move this library back to my 10.4 machine it works fine, and (2) if I move the library built on my 10.4 machine over to 10.5 it also works fine. It's only the library built on 10.5 and run on 10.5 that is the problem. Re f2py, it appears to be independent of Fortran compiler (g95, gfortran and Intel Fortran all give the same result). I upgraded to the latest version of ActiveState 2.5.2 but same problem. OSX 10.4 has Xcode 2.4 installed, 10.5 has 3.0 installed. Reinstalled all the Fortran compilers and the numpy support library ... still the same. Any ideas (1) what the error means in the first place and (2) what I should do? I've cross-posted this on the Python-mac and numpy discussion since it appears to be an OSX/numpy/f2py interaction. Prof Garry Willgoose, Australian Professorial Fellow in Environmental Engineering, Director, Centre for Climate Impact Management (C2IM), School of Engineering, The University of Newcastle, Callaghan, 2308 Australia. 
Centre webpage: www.c2im.org.au Phone: (International) +61 2 4921 6050 (Tues-Fri AM); +61 2 6545 9574 (Fri PM-Mon) FAX: (International) +61 2 4921 6991 (Uni); +61 2 6545 9574 (personal and Telluric) Env. Engg. Secretary: (International) +61 2 4921 6042 email: [EMAIL PROTECTED]; [EMAIL PROTECTED] email-for-life: [EMAIL PROTECTED] personal webpage: www.telluricresearch.com/garry "Do not go where the path may lead, go instead where there is no path and leave a trail" Ralph Waldo Emerson
Re: [Numpy-discussion] Trac internal error
ti, 2008-05-13 kello 23:00 -0700, Jarrod Millman kirjoitti: On Tue, May 13, 2008 at 1:37 AM, Pauli Virtanen [EMAIL PROTECTED] wrote: Something seems to be wrong with the Trac: http://scipy.org/scipy/numpy/timeline Internal Error Ticket changes event provider (TicketModule) failed: SubversionException: (Can't open file '/home/scipy/svn/numpy/db/revprops/5159': Permission denied, 13) Although this one sorted itself by morning, I kept running into permissions issues in various places today. I finally went ahead and changed some things and I hope these permissions issues will disappear. If anyone else runs into something like this in the next few days, please let me know. There's a permission error here: http://scipy.org/scipy/numpy/wiki/CodingStyleGuidelines -- Pauli Virtanen
Re: [Numpy-discussion] 1.1 dev on the trunk and the road to 1.2
On Wed, May 14, 2008 at 1:47 AM, Travis E. Oliphant [EMAIL PROTECTED] wrote: Eric Firing wrote: Jarrod Millman wrote: On Tue, May 13, 2008 at 9:39 PM, Charles R Harris [EMAIL PROTECTED] wrote: I was getting ready to add a big code cleanup, so you lucked out ;) Let's get this release out as quickly as possible once masked arrays are ready to go. What is left to do for the masked arrays? Just want to make sure I haven't missed something. Thanks, Masked arrays: the only thing I know of as a possibility is the suggestion that the new code be accessible with a DeprecationWarning from numpy.core as well as from numpy. I'm neutral on this, and will not try to implement it. All is not yet well in matrix land; the end of the numpy.test() output is:

==
ERROR: check_array_from_matrix_list (numpy.core.tests.test_defmatrix.TestNewScalarIndexing)
--
Traceback (most recent call last):
  File "/usr/local/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 194, in check_array_from_matrix_list
    x = array([a, a])
ValueError: setting an array element with a sequence.

==
FAIL: check_dimesions (numpy.core.tests.test_defmatrix.TestNewScalarIndexing)
--
Traceback (most recent call last):
  File "/usr/local/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 190, in check_dimesions
    assert_equal(x.ndim, 1)
  File "/usr/local/lib/python2.5/site-packages/numpy/testing/utils.py", line 145, in assert_equal
    assert desired == actual, msg
AssertionError: Items are not equal:
ACTUAL: 2
DESIRED: 1

--
Ran 1004 tests in 1.534s
FAILED (failures=1, errors=1)
Out[2]: unittest._TextTestResult run=1004 errors=1 failures=1

I'm aware of these errors. I will try to clean them up soon by fixing the dimension reduction assumption mistakes. Nah, leave it. You not only have to fix the descent in dimensions, you will have to fix PyArray_FromDescAndData (or whatever it is). 
I vote for simply noting that x = array([a, a]) won't work, and keeping it around as a reminder of what a horrible mistake the current Matrix implementation is. The code is difficult as it is, and if we keep this up it will start looking like Windows with 15 years of compatibility crud. Chuck
Re: [Numpy-discussion] searchsorted() and memory cache
I will post any new insights as I continue to work on this... OK, I have isolated a sample of my data that illustrates the terrible performance with the binary search. I have uploaded it as a pytables file to http://astraw.com/framenumbers.h5 in case anyone wants to have a look themselves. Here's an example of the type of benchmark I've been running:

import fastsearch.downsamp
import fastsearch.binarysearch
import tables

h5 = tables.openFile('framenumbers.h5', mode='r')
framenumbers = h5.root.framenumbers.read()
keys = h5.root.keys.read()
h5.close()

def bench(implementation):
    for key in keys:
        implementation.index(key)

downsamp = fastsearch.downsamp.DownSampledPreSearcher(framenumbers)
binary = fastsearch.binarysearch.BinarySearcher(framenumbers)

# The next two lines are IPython-specific, and the 2nd takes a looong time:
%timeit bench(downsamp)
%timeit bench(binary)

Running the above gives:

In [14]: %timeit bench(downsamp)
10 loops, best of 3: 64 ms per loop

In [15]: %timeit bench(binary)
10 loops, best of 3: 184 s per loop

Quite a difference (a factor of about 3000)! At this point, I haven't delved into the dataset to see what makes it so pathological -- performance is nowhere near this bad for the binary search algorithm with other sets of keys.
[Numpy-discussion] HASH TABLES IN PYTHON
To Whom It May Concern, I was wondering if anyone has ever worked with hash tables within the Python programming language? I will need to utilize this ability for quick numerical calculations. Thank You, David Blubaugh
Re: [Numpy-discussion] HASH TABLES IN PYTHON
On Wed, May 14, 2008 at 10:20 AM, Blubaugh, David A. [EMAIL PROTECTED] wrote: To Whom It May Concern, I was wondering if anyone has ever worked with hash tables within the Python Programming language? I will need to utilize this ability for quick numerical calculations. Yes. Python dicts are hash tables. PS: Please do not post with ALL CAPS. -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco
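A minimal illustration of Robert's point (the memoization helper below is a made-up example, not anything from the thread): a dict gives average O(1) insertion and lookup by key, which is exactly what a hash table provides.

```python
# Python dicts are hash tables: average O(1) insertion and lookup by key.
# Hypothetical memoization helper for "quick numerical calculations".
cache = {}

def cached_square(x):
    # compute x*x once; repeat queries are served from the hash table
    if x not in cache:
        cache[x] = x * x
    return cache[x]

print(cached_square(12))  # computed and stored
print(cached_square(12))  # looked up in the dict
```

For anything fancier (custom hashing, disk-backed tables), the stdlib also offers `shelve` and friends, but for in-memory numerical lookups a plain dict is usually all that is needed.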
Re: [Numpy-discussion] how to use masked arrays
On Wednesday 14 May 2008 13:19:55 Eric Firing wrote: Pierre GM wrote: (almost) equivalent [1]: mydata._data and mydata.view(np.ndarray) Shouldn't the former be discouraged, on the grounds that a leading underscore, by Python convention, indicates an attribute that is not part of the public API, but is instead part of the potentially changeable implementation? Eric,
* Please keep the note [1] in mind: the two commands are NOT equivalent: the former outputs a subclass of ndarray (when appropriate), the latter a regular ndarray.
* You can use mydata.data to achieve the same result as mydata._data. In practice, both _data and data are properties, without an fset method and with fget=lambda x: x.view(x._baseclass). I'm not very comfortable with using .data myself, it looks a bit awkward (personal taste), and it may let a user think that the read buffer object is accessed (when in fact, it's mydata.data.data...)
* The syntax ._data is required for backwards compatibility (that was the data portion of the old MaskedArray object). So is ._mask.
* You can also use the getdata(mydata) function: it returns the ._data part of a masked array or the argument as an ndarray, depending which is available.
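A short sketch of the access patterns Pierre describes (the array contents here are made up):

```python
import numpy as np
import numpy.ma as ma

# a masked array with one masked element
x = ma.array([1.0, 2.0, 3.0], mask=[False, True, False])

a = x.view(np.ndarray)   # view as a plain ndarray; the mask is dropped
b = ma.getdata(x)        # helper function: returns the data part as an ndarray
c = x.data               # public spelling of the ._data property

print(type(a).__name__, type(b).__name__, type(c).__name__)
```

All three expose the underlying values without the mask; as noted above, for subclasses of MaskedArray the ._data/.data route can return the subclass's base class rather than a bare ndarray.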
Re: [Numpy-discussion] searchsorted() and memory cache
On Wed, May 14, 2008 at 8:09 AM, Andrew Straw [EMAIL PROTECTED] wrote: I will post any new insights as I continue to work on this... OK, I have isolated a sample of my data that illustrates the terrible performance with the binary search. I have uploaded it as a pytables file to http://astraw.com/framenumbers.h5 in case anyone wants to have a look themselves. Here's an example of the type of benchmark I've been running:

import fastsearch.downsamp
import fastsearch.binarysearch
import tables

h5 = tables.openFile('framenumbers.h5', mode='r')
framenumbers = h5.root.framenumbers.read()
keys = h5.root.keys.read()
h5.close()

def bench(implementation):
    for key in keys:
        implementation.index(key)

downsamp = fastsearch.downsamp.DownSampledPreSearcher(framenumbers)
binary = fastsearch.binarysearch.BinarySearcher(framenumbers)

# The next two lines are IPython-specific, and the 2nd takes a looong time:
%timeit bench(downsamp)
%timeit bench(binary)

Running the above gives:

In [14]: %timeit bench(downsamp)
10 loops, best of 3: 64 ms per loop

In [15]: %timeit bench(binary)
10 loops, best of 3: 184 s per loop

Quite a difference (a factor of about 3000)! At this point, I haven't delved into the dataset to see what makes it so pathological -- performance is nowhere near this bad for the binary search algorithm with other sets of keys.

It can't be that bad Andrew, something else is going on. And 191 MB isn't *that* big, I expect it should fit in memory with no problem. Chuck
[Numpy-discussion] f2py and -D_FORTIFY_SOURCE=2 compilation flag
Hi, On Fedora 8, the docstrings of f2py-generated extensions are strangely missing. On Ubuntu, the same modules do have the docstrings. The problem, as reported on the f2py ML, seems to come from the -D_FORTIFY_SOURCE flag, which is set to 2 instead of 1. Could this be fixed in numpy.distutils, and how? Thanks, David
Re: [Numpy-discussion] Extending an ndarray
Thanks Alan, that works! Soren

On Wed, May 14, 2008 at 9:02 PM, Alan McIntyre [EMAIL PROTECTED] wrote: Here's one way (probably not the most efficient or elegant):

# example original array
a = arange(1, 26).reshape(5, 5)
# place copy of 'a' into upper left corner of a larger array of zeros
b = zeros((10, 10))
b[:5, :5] = a

On Wed, May 14, 2008 at 2:48 PM, Søren Nielsen [EMAIL PROTECTED] wrote: Hi, I've loaded an image into a ndarray. I'd like to extend the ndarray with a border of zeros all around the ndarray.. does anyone here know how to do this? Thanks, Soren
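Alan's snippet can be generalized to a centered border of arbitrary width; a sketch (the helper name is made up):

```python
import numpy as np

def add_zero_border(img, width=1):
    """Return a copy of 2-D array `img` surrounded by a `width`-pixel border of zeros."""
    h, w = img.shape
    out = np.zeros((h + 2 * width, w + 2 * width), dtype=img.dtype)
    out[width:width + h, width:width + w] = img  # place original in the center
    return out

a = np.arange(1, 26).reshape(5, 5)
b = add_zero_border(a, 2)   # a 5x5 image becomes 9x9
```

Matching the output dtype to the input avoids silently promoting integer images to float.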
Re: [Numpy-discussion] searchsorted() and memory cache
Charles R Harris wrote: On Wed, May 14, 2008 at 8:09 AM, Andrew Straw [EMAIL PROTECTED] wrote: Quite a difference (a factor of about 3000)! At this point, I haven't delved into the dataset to see what makes it so pathological -- performance is nowhere near this bad for the binary search algorithm with other sets of keys. It can't be that bad Andrew, something else is going on. And 191 MB isn't *that* big, I expect it should fit in memory with no problem. I agree the performance difference seems beyond what one would expect due to cache misses alone. I'm at a loss to propose other explanations, though. Ideas?
Re: [Numpy-discussion] f2py and -D_FORTIFY_SOURCE=2 compilation flag
On Wed, May 14, 2008 at 2:46 PM, David Huard [EMAIL PROTECTED] wrote: Hi, On fedora 8, the docstrings of f2py generated extensions are strangely missing. On Ubuntu, the same modules do have the docstrings. The problem, as reported in the f2py ML, seems to come from the -D_FORTIFY_SOURCE flag which is set to 2 instead of 1. Could this be fixed in numpy.distutils and how ? There is no string FORTIFY_SOURCE anywhere in the numpy codebase. -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco
Re: [Numpy-discussion] f2py and -D_FORTIFY_SOURCE=2 compilation flag
I filed a patch that seems to do the trick in ticket #792: http://scipy.org/scipy/numpy/ticket/792 2008/5/14 David Huard [EMAIL PROTECTED]: Hi, On fedora 8, the docstrings of f2py generated extensions are strangely missing. On Ubuntu, the same modules do have the docstrings. The problem, as reported in the f2py ML, seems to come from the -D_FORTIFY_SOURCE flag which is set to 2 instead of 1. Could this be fixed in numpy.distutils and how ? Thanks, David
[Numpy-discussion] let's use patch review
Hi, I read the recent flamebait about unittests, formal procedures for a commit etc. and it was amusing. :) I think Stefan is right about the unit tests. I also think that Travis is right that there is no formal procedure that can assure what we want. I think that the solution is patch review. Every big/successful project does it. And the workflow as I see it is this:
1) Travis will fix a bug, and submit it to a patch review. If he is busy, that's the only thing he will do.
2) Someone else reviews it. Stefan will be the one who will always point out missing tests.
3) There needs to be a common consensus that the patch is ok to go in.
4) When the patch is reviewed and ok to go in, anyone with commit access will commit it.
I think it's as simple as that. Sometimes no one has enough time to write a proper test, yet someone has a free minute to fix a bug. Then I think it's ok to put the code in, as I think it's good to fix a bug now. However, the issue is definitely not closed and the bug is not fixed (!) until someone writes a proper test. I.e. putting code in that is not tested, but doesn't break things, is imho ok, as it will not hurt anyone and it will temporarily fix a bug (but of course the code will be broken at some point in the future, if there is no test for it). Now, the problem is that all patch review tools suck in some way. Currently the most promising is the one from Guido here: http://code.google.com/p/rietveld/ It's open source, you can run it on your own server, or use it online here: http://codereview.appspot.com/ I suggest you read the docs on how to use it; I am still learning it. Also it works fine for svn, but not for Mercurial, so we are not using it in SymPy yet. So to also do some work besides just talk, I started with this issue: http://projects.scipy.org/scipy/numpy/ticket/788 and submitted the code (not my code though :) in there for a review here: http://codereview.appspot.com/953 and added some comments. So what do you think? 
Ondrej P.S. efiring, my comments are real questions to your patch. :)
Re: [Numpy-discussion] f2py and -D_FORTIFY_SOURCE=2 compilation flag
On Wed, May 14, 2008 at 3:20 PM, David Huard [EMAIL PROTECTED] wrote: I filed a patch that seems to do the trick in ticket #792. I don't think this is the right approach. The problem isn't that _FORTIFY_SOURCE is set to 2 but that f2py is doing (probably) bad things that trip these buffer overflow checks. IIRC, Pearu wasn't on the f2py mailing list at the time this came up; please try him again. -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco
[Numpy-discussion] embedded PyArray_FromDimsAndData Segmentation Fault
Hi all, the PyArray_FromDimsAndData is still causing me headaches. Is there anybody out there who can find the error in the following code?

#include <Python.h>
#include <numpy/ndarrayobject.h>

int main(int argc, char** argv)
{
    int dimensions = 2;
    void* value = malloc(sizeof(double)*100);
    int* size = (int*)malloc(sizeof(int)*2);
    size[0] = 10;
    size[1] = 10;
    for(int i=0;i<100;i++)
        ((double*)value)[i] = 1.0;
    for(int i=0;i<100;i++)
        printf("%e ",((double*)value)[i]);
    printf("\n%d %d\n",dimensions,size[0]);
    PyArray_FromDimsAndData(dimensions,size,NPY_DOUBLELTR,(char*)value); //TROUBLE HERE
    return 0;
}

I always get a segmentation fault at the PyArray_FromDimsAndData call. I want to create copies of C arrays, copy them into a running Python interpreter as nd-arrays and modify them with some Python functions. If I did this in a module, I would have to call the import_array(); function, I know. However, this is all outside of any module, and when I add it before PyArray_FromDimsAndData I get the following compilation error:

src/test.cpp:24: error: return-statement with no value, in function returning 'int'

Does anybody have a clue? Best, Thomas
Re: [Numpy-discussion] embedded PyArray_FromDimsAndData Segmentation Fault
On Wed, May 14, 2008 at 5:42 PM, Thomas Hrabe [EMAIL PROTECTED] wrote: Hi all, the PyArray_FromDimsAndData is still causing me headaches. Is there anybody out there who can find the error in the following code?

#include <Python.h>
#include <numpy/ndarrayobject.h>

int main(int argc, char** argv)
{
    int dimensions = 2;
    void* value = malloc(sizeof(double)*100);
    int* size = (int*)malloc(sizeof(int)*2);
    size[0] = 10;
    size[1] = 10;
    for(int i=0;i<100;i++)
        ((double*)value)[i] = 1.0;
    for(int i=0;i<100;i++)
        printf("%e ",((double*)value)[i]);
    printf("\n%d %d\n",dimensions,size[0]);
    PyArray_FromDimsAndData(dimensions,size,NPY_DOUBLELTR,(char*)value); //TROUBLE HERE
    return 0;
}

I always get a segmentation fault at the PyArray_FromDimsAndData call. I want to create copies of C arrays, copy them into a running Python interpreter as nd-arrays and modify them with some Python functions. If I did this in a module, I would have to call the import_array(); function, I know. However, this is all outside of any module, and when I add it before PyArray_FromDimsAndData I get the following compilation error: src/test.cpp:24: error: return-statement with no value, in function returning 'int'

You can't use numpy outside of Python. Put your code into a Python extension module. I can explain the proximate causes of the error messages if you really want, but they aren't really relevant. The ultimate problem is that you aren't in a Python extension module. -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco
Re: [Numpy-discussion] embedded PyArray_FromDimsAndData Segmentation Fault
Hi Thomas

2008/5/15 Thomas Hrabe [EMAIL PROTECTED]: PyArray_FromDimsAndData(dimensions,size,NPY_DOUBLELTR,(char*)value); //TROUBLE HERE

I didn't know a person could write a stand-alone program using NumPy this way (can you?); but what I do know is that FromDimsAndData is deprecated, and that it can be replaced here by

PyArray_SimpleNewFromData(dimensions, size, NPY_CDOUBLE, value);

where

npy_intp* size = malloc(sizeof(npy_intp)*2);

Regards Stéfan
Re: [Numpy-discussion] searchsorted() and memory cache
On Wed, May 14, 2008 at 2:00 PM, Andrew Straw [EMAIL PROTECTED] wrote: Charles R Harris wrote: On Wed, May 14, 2008 at 8:09 AM, Andrew Straw [EMAIL PROTECTED] wrote: Quite a difference (a factor of about 3000)! At this point, I haven't delved into the dataset to see what makes it so pathological -- performance is nowhere near this bad for the binary search algorithm with other sets of keys. It can't be that bad Andrew, something else is going on. And 191 MB isn't *that* big, I expect it should fit in memory with no problem. I agree the performance difference seems beyond what one would expect due to cache misses alone. I'm at a loss to propose other explanations, though. Ideas? I just searched for 2**25/10 keys in a 2**25 array of reals. It took less than a second when vectorized. In a python loop it took about 7.7 seconds. The only thing I can think of is that the search isn't getting any cpu cycles for some reason. How much memory is it using? Do you have any nans and such in the data? Chuck
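Chuck's vectorized-vs-loop comparison can be sketched like this (smaller sizes than his 2**25 so it runs quickly; the data is random, and timings will vary by machine):

```python
import numpy as np

rng = np.random.default_rng(0)
a = np.sort(rng.random(2**16))   # sorted "haystack" of reals
keys = rng.random(2**16 // 10)   # keys to locate

# vectorized: one searchsorted call handles every key at once
idx_vec = np.searchsorted(a, keys)

# python loop: one searchsorted call per key -- same answers, far more
# per-call overhead, which is where the loop's slowdown comes from
idx_loop = np.array([np.searchsorted(a, k) for k in keys])
```

Wrapping each version in a timing harness (e.g. `timeit`) reproduces the qualitative gap Chuck reports, though nothing like the factor-of-3000 pathology Andrew is seeing.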
Re: [Numpy-discussion] embedded PyArray_FromDimsAndDataSegmentation Fault
I didn't know a person could write a stand-alone program using NumPy this way (can you?) Well, this is possible when you embed Python and use the simple objects such as ints, strings, etc. Why should it be impossible to do it for NumPy then? My plan is to send multidimensional arrays from C to Python and to apply some Python-specific functions to them.
Re: [Numpy-discussion] let's use patch review
Ondrej Certik wrote: Hi, I read the recent flamebait about unittests, formal procedures for a commit etc. and it was amusing. :) I think Stefan is right about the unit tests. I also think that Travis is right that there is no formal procedure that can assure what we want. I think that a solution is a patch review. Every big/successful project does it. And the workflow as I see it is this: Are you sure numpy is big enough that a formal mechanism is needed--for everyone? It makes good sense for my (rare) patches to be reviewed, but shouldn't some of the core developers be allowed to simply get on with it? As it is, my patches can easily be reviewed because I don't have commit access. 1) Travis will fix a bug, and submit it to a patch review. If he is busy, that's the only thing he will do 2) Someone else reviews it. Stefan will be the one who will always point out missing tests. That we can agree on! 3) There needs to be a common consensus that the patch is ok to go in. What does that mean? How does one know when there is a consensus? 4) when the patch is reviewed and ok to go in, anyone with a commit access will commit it. But it has to be a specific person in each case, not anyone. I think it's as simple as that. Sometimes no one has enough time to write a proper test, yet someone has a free minute to fix a bug. Then I think it's ok to put the code in, as I think it's good to fix a bug now. However, How does that fit with the workflow above? Does Travis commit the bugfix, or not? the issue is definitely not closed and the bug is not fixed (!) until someone writes a proper test. I.e. putting code in that is not tested, however it doesn't break things, is imho ok, as it will not hurt anyone and it will temporarily fix a bug (but of course the code will be broken at some point in the future, if there is no test for it). That is overstating the case; for 788, for example, no one in his right mind would undo the one-line correction that Travis made. 
Chances are, there will be all sorts of breakage and foulups and the revelation of new bugs in the future--but not another instance that would be caught by the test for 788. [...] http://codereview.appspot.com/953 and added some comments. So what do you think? Looks like it could be useful. I replied to the comments. I haven't read the docs, and I don't know what the next step is when a revision of the patch is in order, as it is in this case. Eric Ondrej P.S. efiring, my comments are real questions to your patch. :)
Re: [Numpy-discussion] embedded PyArray_FromDimsAndDataSegmentation Fault
On Wed, May 14, 2008 at 6:40 PM, Thomas Hrabe [EMAIL PROTECTED] wrote: I didn't know a person could write a stand-alone program using NumPy this way (can you?) Well, this is possible when you embed Python and use the simple objects such as ints, strings, etc. Why should it be impossible to do it for NumPy then? numpy exposes its API as a pointer to an array which contains function pointers. import_array() imports the extension module, accesses the PyCObject that contains this pointer, and sets a global pointer appropriately. There are #define macros to emulate the functions by dereferencing the appropriate element of the array and calling it with the given macro arguments. The reason you get the error about returning nothing when the return type of main() is declared int is because this macro is only intended to work inside of an initmodule() function of an extension module, whose return type is void. import_array() includes error handling logic and will return if there is an error. You get the segfault without import_array() because all of the functions you try to call are trying to dereference an array which has not been initialized. My plan is to send multidimensional arrays from C to python and to apply some python specific functions to them. Well, first you need to call Py_Initialize() to start the VM. Otherwise, you can't import numpy to begin with. I guess you could write a void load_numpy(void) function which just exists to call import_array(). Just be sure to check the exception state appropriately after it returns. But for the most part, it's much better to drive your C code using Python than the other way around. -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco
Re: [Numpy-discussion] let's use patch review
On Wed, 2008-05-14 at 13:58 -1000, Eric Firing wrote: What does that mean? How does one know when there is a consensus? There can be a system to make this automatic. For example, the code is never committed directly to svn, but to a gatekeeper, and people vote by an email command to say if they want the patch in; when the total number of votes is above some threshold, the gatekeeper commits the patch. David
Re: [Numpy-discussion] searchsorted() and memory cache
Aha, I've found the problem -- my values were int64 and my keys were uint64. Switching to the same data type immediately fixes the issue! It's not a memory cache issue at all. Perhaps searchsorted() should emit a warning if the keys require casting... I can't believe how bad the hit was. -Andrew Charles R Harris wrote: On Wed, May 14, 2008 at 2:00 PM, Andrew Straw [EMAIL PROTECTED] wrote: Charles R Harris wrote: On Wed, May 14, 2008 at 8:09 AM, Andrew Straw [EMAIL PROTECTED] wrote: Quite a difference (a factor of about 3000)! At this point, I haven't delved into the dataset to see what makes it so pathological -- performance is nowhere near this bad for the binary search algorithm with other sets of keys. It can't be that bad Andrew, something else is going on. And 191 MB isn't *that* big, I expect it should fit in memory with no problem. I agree the performance difference seems beyond what one would expect due to cache misses alone. I'm at a loss to propose other explanations, though. Ideas? I just searched for 2**25/10 keys in a 2**25 array of reals. It took less than a second when vectorized. In a python loop it took about 7.7 seconds. The only thing I can think of is that the search isn't getting any cpu cycles for some reason. How much memory is it using? Do you have any nans and such in the data?
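The fix Andrew describes -- cast the keys to the searched array's dtype once, up front, instead of letting each lookup trigger a conversion -- might look like this (the data below is made up to mimic his framenumbers):

```python
import numpy as np

framenumbers = np.arange(0, 1000000, 2, dtype=np.int64)  # sorted int64 values
keys = framenumbers[::100].astype(np.uint64)             # uint64: mismatched dtype

# one explicit cast up front, so the dtypes agree during the search
keys_cast = keys.astype(framenumbers.dtype)
idx = np.searchsorted(framenumbers, keys_cast)
```

With matching dtypes the binary search runs directly on the raw buffers; with a mismatch, the conversion machinery can dominate the runtime, which appears to be what produced the factor-of-3000 slowdown in the thread.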