[Numpy-discussion] Asking proposal review/feedback for GSOC 15
Hi,

My name is Oğuzhan (you may use 'Oguzhan'). I submitted a proposal on the system with the title 'NumPy - Vector math library integration'. Ralf commented on my proposal and advised me to ask for feedback on the mailing list, so here I am. I would appreciate any feedback from the community.

I think community members are able to view my proposal; its visibility is set to 'Organization members'. I entered my name in its original form, so in case any mentor would like to search for it, here it is as it appears on the system:

Name: Oğuzhan Ünlü

Thanks in advance,

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Rewrite np.histogram in c?
Hope this isn't too off-topic, but it would be very nice if np.histogram and np.histogram2d supported masked arrays. Or is masked-array support out of scope for functions outside the numpy.ma package?

On Mon, Mar 16, 2015 at 2:35 PM, Robert McGibbon rmcgi...@gmail.com wrote:
> Hi,
>
> It sounds like putting together a PR makes sense then. I'll try hacking on this a bit.
>
> -Robert
>
> On Mar 16, 2015 11:20 AM, Jaime Fernández del Río jaime.f...@gmail.com wrote:
>> On Mon, Mar 16, 2015 at 9:28 AM, Jerome Kieffer jerome.kief...@esrf.fr wrote:
>>> On Mon, 16 Mar 2015 06:56:58 -0700 Jaime Fernández del Río jaime.f...@gmail.com wrote:
>>>> Dispatching to a different method seems like a no-brainer indeed. The question is whether we really need to do this in C.
>>>
>>> I need to do both unweighted and weighted histograms, and we got a factor of 5 using (simple) cython: it is in the proceedings of Euroscipy, last year. http://arxiv.org/pdf/1412.6367.pdf
>>
>> If I read your paper and code properly, you got 5x faster, mostly because you combined the weighted and unweighted histograms into a single search of the array, and because you used an algorithm that can only be applied to equal-sized bins, similarly to the 10x speed-up Robert was reporting.
>>
>> I think that having a special path for equal-sized bins is a great idea: let's do it, PRs are always welcome! Similarly, getting the counts together with the weights seems like a very good idea.
>>
>> I also think that writing it in Python is going to take us 80% of the way there: most of the improvements both of you have reported are not likely to be coming from the language chosen, but from the algorithm used. And if C proves to be sufficiently faster to warrant using it, it should be confined to the number crunching: I don't think there is any point in rewriting argument parsing in C.
>>
>> Also, keep in mind `np.histogram` can now handle arrays of just about **any** dtype. Handling that complexity in C is not a ride in the park. Other functions like `np.bincount` and `np.digitize` cheat by only handling `double` typed arrays, a luxury that histogram probably can't afford at this point in time.
>>
>> Jaime
>>
>> --
>> (\__/)
>> ( O.o)
>> ( )
>> This is Bunny. Copy Bunny into your signature and help him with his plans for world domination.
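To make the equal-sized-bins idea from the quoted discussion concrete, here is a minimal pure-NumPy sketch. It is an illustration under stated assumptions, not the code from the paper or from any NumPy PR: `uniform_histogram` is a hypothetical helper name, it only handles float-convertible 1-D data, and its treatment of values that fall exactly on a bin edge may differ from `np.histogram` because of floating-point rounding. The bin index is computed arithmetically instead of by searching the bin edges, and the counts and the weighted sums are produced from the same index array.

```python
import numpy as np


def uniform_histogram(a, bins, range_, weights=None):
    """Counts (and optional weighted sums) for `bins` equal-width bins.

    Hypothetical helper for illustration only; not part of NumPy.
    """
    a = np.asarray(a, dtype=float).ravel()
    lo, hi = range_
    # Drop values outside [lo, hi], as np.histogram does.
    keep = (a >= lo) & (a <= hi)
    a = a[keep]
    # Equal-width bins: the bin index follows from arithmetic, no search needed.
    idx = ((a - lo) * (bins / (hi - lo))).astype(np.intp)
    # Values equal to the right edge belong to the last bin.
    idx = np.minimum(idx, bins - 1)
    counts = np.bincount(idx, minlength=bins)
    if weights is None:
        return counts, None
    w = np.asarray(weights, dtype=float).ravel()[keep]
    # Reuse the same indices for the weighted histogram.
    sums = np.bincount(idx, weights=w, minlength=bins)
    return counts, sums


counts, sums = uniform_histogram(
    [0.1, 0.2, 1.5, 2.5, 3.0, 3.0], 3, (0.0, 3.0), weights=[1, 2, 3, 4, 5, 6]
)
```

Because the whole computation stays in vectorized NumPy calls, a sketch like this is one way to check how much of the reported speed-up comes from the algorithm before deciding whether C or Cython is needed.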
[Numpy-discussion] Reminder - Summer School Advanced Scientific Programming in Python in Munich, Germany
Reminder: Deadline for application is 23:59 UTC, March 31, 2015.

Advanced Scientific Programming in Python
=========================================
a Summer School by the G-Node, the Bernstein Center for Computational Neuroscience Munich and the Graduate School of Systemic Neurosciences

Scientists spend more and more time writing, maintaining, and debugging software. While techniques for doing this efficiently have evolved, only few scientists have been trained to use them. As a result, instead of doing their research, they spend far too much time writing deficient code and reinventing the wheel. In this course we will present a selection of advanced programming techniques, incorporating theoretical lectures and practical exercises tailored to the needs of a programming scientist. New skills will be tested in a real programming project: we will team up to develop an entertaining scientific computer game.

We use the Python programming language for the entire course. Python works as a simple programming language for beginners, but more importantly, it also works great in scientific simulations and data analysis. We show how clean language design, ease of extensibility, and the great wealth of open source libraries for scientific computing and data visualization are driving Python to become a standard tool for the programming scientist.

This school is targeted at Master or PhD students and Post-docs from all areas of science. Competence in Python or in another language such as Java, C/C++, MATLAB, or Mathematica is absolutely required. Basic knowledge of Python is assumed. Participants without any prior experience with Python should work through the proposed introductory materials before the course.

Date and Location
=================
August 31 - September 5, 2015. Munich, Germany.

Preliminary Program
===================
Day 0 (Mon Aug 31): Best Programming Practices
  • Best Practices for Scientific Computing
  • Version control with git and how to contribute to Open Source with github
  • Object-oriented programming design patterns

Day 1 (Tue Sept 1): Software Carpentry
  • Test-driven development, unit testing and quality assurance
  • Debugging, profiling and benchmarking techniques
  • Advanced Python: generators, decorators, and context managers

Day 2 (Wed Sept 2): Scientific Tools for Python
  • Advanced NumPy
  • The Quest for Speed (intro): Interfacing to C with Cython
  • Contributing to Open Source Software/Programming in teams

Day 3 (Thu Sept 3): The Quest for Speed
  • Writing parallel applications in Python
  • Python 3: why should I care
  • Programming project

Day 4 (Fri Sept 4): Efficient Memory Management
  • When parallelization does not help: the starving CPUs problem
  • Programming project

Day 5 (Sat Sept 5): Practical Software Development
  • Programming project
  • The Pelita Tournament

Every evening we will have the tutors' consultation hour: Tutors will answer your questions and give suggestions for your own projects.

Applications
============
You can apply on-line at https://python.g-node.org

Applications must be submitted before 23:59 UTC, March 31, 2015. Notifications of acceptance will be sent by May 1, 2015.

No fee is charged but participants should take care of travel, living, and accommodation expenses. Candidates will be selected on the basis of their profile. Places are limited: acceptance rate is usually around 20%.

Prerequisites: You are supposed to know the basics of Python to participate in the lectures.

Preliminary Faculty
===================
  • Pietro Berkes, Enthought Inc., UK
  • Marianne Corvellec, Plotly Technologies Inc., Montréal, Canada
  • Kathryn D. Huff, Department of Nuclear Engineering, University of California - Berkeley, USA
  • Zbigniew Jędrzejewski-Szmek, Krasnow Institute, George Mason University, USA
  • Eilif Muller, Blue Brain Project, École Polytechnique Fédérale de Lausanne, Switzerland
  • Juan Nunez-Iglesias, Victorian Life Sciences Computation Initiative, University of Melbourne, Australia
  • Rike-Benjamin Schuppner, Institute for Theoretical Biology, Humboldt-Universität zu Berlin, Germany
  • Bartosz Teleńczuk, European Institute for Theoretical Neuroscience, CNRS, Paris, France
  • Nelle Varoquaux, Centre for Computational Biology Mines ParisTech, Institut Curie, U900 INSERM, Paris, France
  • Tiziano Zito, Forschungszentrum Jülich GmbH, Germany

Organized by Tiziano Zito (head) and Zbigniew Jędrzejewski-Szmek for the German Neuroinformatics Node of the INCF Germany, Christopher Roppelt for the German Center for Vertigo and Balance Disorders (DSGZ) and the Graduate School of Systemic Neurosciences (GSN) of the Ludwig-Maximilians-Universität Munich Germany, Christoph Hartmann for the Frankfurt Institute for Advanced Studies (FIAS) and International Max Planck Research School (IMPRS) for Neural Circuits, Frankfurt Germany, and Jakob Jordan for the Institute of Neuroscience and Medicine (INM-6) and Institute for Advanced
[Numpy-discussion] element-wise array segmental function operation?
Hi all,

I want to know whether there is a terse way to apply a function to every array element, where the function behaves according to the element's value. For example:

[code]
def fun(v):
    if 0 <= v < 60:
        return f1(v)  # where f1 is a function
    elif 60 <= v < 70:
        return f2(v)
    elif 70 <= v < 80:
        return f3(v)
    # ...and so on...
[/code]

For 'a=numpy.array([20,50,75])', I hope to get numpy.array([f1(20), f1(50), f3(75)]).

Thanks in advance,
Lee
Re: [Numpy-discussion] element-wise array segmental function operation?
On 23.03.2015 07:46, oyster wrote:
> Hi all,
>
> I want to know whether there is a terse way to apply a function to every array element, where the function behaves according to the element's value. For example:
>
> [code]
> def fun(v):
>     if 0 <= v < 60:
>         return f1(v)  # where f1 is a function
>     elif 60 <= v < 70:
>         return f2(v)
>     elif 70 <= v < 80:
>         return f3(v)
>     # ...and so on...
> [/code]
>
> For 'a=numpy.array([20,50,75])', I hope to get numpy.array([f1(20), f1(50), f3(75)]).

piecewise should be what you are looking for:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.piecewise.html
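As a rough illustration of the `numpy.piecewise` suggestion, the sketch below assumes the ranges in the question are the half-open intervals [0, 60), [60, 70) and [70, 80). The functions `f1`, `f2` and `f3` are placeholders invented here, since the original post does not define them. Two things are worth knowing: each function receives the whole sub-array of matching elements, so it has to be vectorized, and the output takes the dtype of the input array, so a float array is used.

```python
import numpy as np


# Placeholder functions; the original post does not say what f1, f2, f3 do.
def f1(v):
    return v * 1.0


def f2(v):
    return v + 100.0


def f3(v):
    return -v


a = np.array([20, 50, 65, 75], dtype=float)

# One boolean mask per value range (assumed half-open intervals).
conditions = [
    (0 <= a) & (a < 60),
    (60 <= a) & (a < 70),
    (70 <= a) & (a < 80),
]

# Each function is applied to the elements selected by its condition;
# elements matching no condition are set to 0.
result = np.piecewise(a, conditions, [f1, f2, f3])
```

If the functions only work on scalars, wrapping them with `np.vectorize` is one option, at the cost of Python-level looping.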
[Numpy-discussion] ANN: pandas 0.16.0 released
Hello,

We are proud to announce v0.16.0 of pandas, a major release from 0.15.2. This release includes a small number of API changes, several new features, enhancements, and performance improvements along with a large number of bug fixes. This was 4 months of work by 60 authors encompassing 204 issues.

We recommend that all users upgrade to this version.

*Highlights:*

- *DataFrame.assign* method, see here http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#whatsnew-0160-enhancements-assign
- *Series.to_coo/from_coo* methods to interact with *scipy.sparse*, see here http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#whatsnew-0160-enhancements-sparse
- Backwards incompatible change to *Timedelta* to conform the *.seconds* attribute with *datetime.timedelta*, see here http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#whatsnew-0160-api-breaking-timedelta
- Changes to the *.loc* slicing API to conform with the behavior of *.ix*, see here http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#whatsnew-0160-api-breaking-indexing
- Changes to the default for ordering in the *Categorical* constructor, see here http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#whatsnew-0160-api-breaking-categorical
- Enhancement to the *.str* accessor to make string operations easier, see here http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#whatsnew-0160-enhancements-string
- The *pandas.tools.rplot*, *pandas.sandbox.qtpandas* and *pandas.rpy* modules are deprecated. We refer users to external packages like seaborn http://stanford.edu/~mwaskom/software/seaborn/, pandas-qt https://github.com/datalyze-solutions/pandas-qt and rpy2 http://rpy.sourceforge.net/ for similar or equivalent functionality, see here for more detail http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#deprecations

See a full description of the Whatsnew for v0.16.0 http://pandas.pydata.org/pandas-docs/stable/whatsnew.html

*What is it:*

*pandas* is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language.

Documentation: http://pandas.pydata.org/pandas-docs/stable/

Source tarballs, Windows wheels and macosx wheels are available on PyPI: https://pypi.python.org/pypi/pandas

Windows binaries are courtesy of Christoph Gohlke and are built on Numpy 1.9.
macosx wheels are courtesy of Matthew Brett and are built on Numpy 1.7.1.

Please report any issues here: https://github.com/pydata/pandas/issues

Thanks

The Pandas Development Team
Re: [Numpy-discussion] GSoC projects
Hi Lulu, welcome!

On Mon, Mar 23, 2015 at 6:09 AM, Lulu Li c...@alum.mit.edu wrote:
> My apologies if I am posting to the wrong mailing list. I am interested in the NumPy project ideas for Google Summer of Code 2015 as posted here: https://github.com/scipy/scipy/wiki/GSoC-project-ideas. In particular, knowing C and Python, I am interested in porting parts of numpy from C to Cython or pythonic types. I wonder if these projects are still looking for participants? If so, I will be excited to put together a proposal and work on them this summer.

Proposals are still very welcome. There has been some interest in this particular project idea, but I haven't seen any submitted proposals yet. And even if there were, you can still submit yours.

The deadline is closing in fast, so you'll have to be quick though. Try to post a first draft asap, so you can get some feedback and improve your proposal before the 27th. Also keep in mind that one of the requirements for getting your proposal accepted is that you have submitted at least one patch to Numpy. This allows us to interact with you and gives you an idea of how the Numpy development process works.

Cheers,
Ralf
Re: [Numpy-discussion] Installation on Windows
Hi,

thank you all! This turned out to be more complicated than I expected. I tried installing the indicated compiler, VCForPython27.msi, but that didn't change anything. On the other hand, I don't want to install any special distribution of Python; I want to stick to the standard distribution to be sure my own code can run anywhere.

I only need numpy to test the language guesser langid.py, as I need a guesser for sentences (a very small amount of text, which makes language identification tricky). langid.py happens to depend on numpy. I will try some other language guesser instead. I might install Anaconda on a virtual machine to compare numpy with other solutions.

Yours,
Per Tunedal

On Fri, Mar 20, 2015, at 15:59, Sebastian wrote:
> Hi,
>
> as you ask how to install Numpy and not how to compile it, I guess you are looking for a so-called distribution. A distribution bundles pre-compiled packages of Numpy and others together for simple usage of Numpy. Otherwise you have to compile it yourself with various dependencies. That's easy to accomplish. Have a look at
>
> https://winpython.github.io/
> https://code.google.com/p/pythonxy/
> http://docs.continuum.io/anaconda/
>
> regards,
> Sebastian
>
> On 03/20/2015 09:45 AM, Per Tunedal wrote:
>> Hi, how do I install Numpy on Windows? I've tried the setup.py file, but get an error message:
>>
>> setup.py install
>>
>> gives:
>>
>> No module named msvccompiler in numpy.distutils; trying from distutils
>> error: Unable to find vcvarsall.bat
>>
>> Yours, Per Tunedal
>
> --
> python programming - mail server - photo - video - https://sebix.at
> To verify my cryptographic signature or send me encrypted mails, get my key at https://sebix.at/DC9B463B.asc and on public keyservers.
[Numpy-discussion] GSoC students: please read
Hi all,

It's great to see that this year there are a lot of students interested in doing a GSoC project with Numpy or Scipy. So far five proposals have been submitted, and it looks like several more are being prepared now. I'd like to give you a bit of advice as well as an idea of what's going to happen in the next few weeks.

The deadline for submitting applications is 27 March. Don't wait until the last day to submit your proposal! It has happened before that Melange was overloaded and unavailable, and the Google program admins will not accept that as an excuse and allow you to submit later. So as soon as your proposal is in good shape, put it in. You can still continue revising it.

From 28 March until 13 April we will continue to interact with you, as we request slots from the PSF and rank the proposals. We don't know how many slots we will get this year, but to give you an impression: for the last two years we got 2 slots. Hopefully we can get more this year, but that's far from certain.

Our ranking will be based on a combination of factors: the interaction you've had with potential mentors and the community until now (and continue to have), the quality of your submitted PRs, the quality and projected impact of your proposal, your enthusiasm, the match with potential mentors, etc. We will also organize a video call (Skype / Google Hangout / ...) with each of you during the first half of April to be able to exchange ideas over a higher-bandwidth medium than email.

Finally, a note on mentoring: we will be able to mentor all proposals submitted or suggested until now. Due to the large interest and the technical nature of a few topics, it has in some cases taken a bit long to provide feedback on draft proposals; however, there are no showstoppers in this regard. Please continue improving your proposals and working with your potential mentors.

Cheers,
Ralf
Re: [Numpy-discussion] GSoC students: please read
On Mon, Mar 23, 2015 at 10:29 PM, Stephan Hoyer sho...@gmail.com wrote:
> On Mon, Mar 23, 2015 at 2:21 PM, Ralf Gommers ralf.gomm...@gmail.com wrote:
>> It's great to see that this year there are a lot of students interested in doing a GSoC project with Numpy or Scipy. So far five proposals have been submitted, and it looks like several more are being prepared now.
>
> Hi Ralf,
>
> Is there a centralized place for non-mentors to view proposals and give feedback?

Hi Stephan, there isn't really. All students post their drafts to the mailing list, where they can get feedback. They're free to keep that draft wherever they want: blogs, Github, StackEdit, ftp sites and more are all being used. The central overview is in Melange (the official GSoC tool), but that's not publicly accessible.

Note that an overview of project ideas can be found at https://github.com/scipy/scipy/wiki/GSoC-project-ideas. If you're particularly interested in one or more of those, it should be easy to find in the mailing list archive which students sent draft proposals for feedback.

Your comments on individual proposals will be much appreciated.

Cheers,
Ralf
Re: [Numpy-discussion] GSoC students: please read
On Mon, Mar 23, 2015 at 2:21 PM, Ralf Gommers ralf.gomm...@gmail.com wrote:
> It's great to see that this year there are a lot of students interested in doing a GSoC project with Numpy or Scipy. So far five proposals have been submitted, and it looks like several more are being prepared now.

Hi Ralf,

Is there a centralized place for non-mentors to view proposals and give feedback?

Thanks,
Stephan
Re: [Numpy-discussion] Rewrite np.histogram in c?
On Mon, Mar 23, 2015 at 2:59 PM, Daniel da Silva var.mail.dan...@gmail.com wrote:
> Hope this isn't too off-topic: but it would be very nice if np.histogram and np.histogram2d supported masked arrays. Is this out of scope for outside the numpy.ma package?

Right now it looks like there's no histogram function at all for masked arrays; it would be good to improve that situation. If it's as easy as adding something like the following to np.histogram:

    if isinstance(a, np.ma.MaskedArray):
        a = a.data[~a.mask]

then I think it makes sense to add that.

Ralf
Re: [Numpy-discussion] Rewrite np.histogram in c?
On 2015/03/23 7:36 AM, Ralf Gommers wrote:
> On Mon, Mar 23, 2015 at 2:59 PM, Daniel da Silva var.mail.dan...@gmail.com wrote:
>> Hope this isn't too off-topic: but it would be very nice if np.histogram and np.histogram2d supported masked arrays. Is this out of scope for outside the numpy.ma package?
>
> Right now it looks like there's no histogram function at all for masked arrays - would be good to improve that situation. If it's as easy as adding to np.histogram something like:
>
>     if isinstance(a, np.ma.MaskedArray):
>         a = a.data[~a.mask]

It looks like it requires a little more than that, but not much. For full support, a new mask would need to be made from the logical_or of the mask of a and the mask of weights, and then used to compress both a and weights.

Eric

> then it makes sense to add that I think.
>
> Ralf
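The combined-mask approach described in this reply can be sketched as follows. This is only an illustration of the idea, not code from NumPy: `masked_histogram` is a hypothetical wrapper name, and it assumes that `weights`, when given, has the same shape as `a`. It relies on `np.ma.getmaskarray` and `np.ma.getdata`, which also accept plain (unmasked) arrays.

```python
import numpy as np


def masked_histogram(a, bins=10, range=None, weights=None):
    """Histogram that skips masked entries of `a` and of `weights`.

    Hypothetical wrapper for illustration; not part of numpy or numpy.ma.
    """
    # Full boolean mask of `a` (all False for a plain ndarray).
    mask = np.ma.getmaskarray(a)
    if weights is not None:
        # An element is dropped if it is masked in either array.
        mask = np.logical_or(mask, np.ma.getmaskarray(weights))
        weights = np.ma.getdata(weights)[~mask]
    # Compress the data with the combined mask, then defer to np.histogram.
    data = np.ma.getdata(a)[~mask]
    return np.histogram(data, bins=bins, range=range, weights=weights)


a = np.ma.array([1.0, 2.0, 3.0, 4.0], mask=[False, True, False, False])
w = np.ma.array([10.0, 20.0, 30.0, 40.0], mask=[False, False, True, False])
weighted, edges = masked_histogram(a, bins=2, range=(0, 4), weights=w)
```

In the usage lines, the second element is masked in `a` and the third in `w`, so only the first and last elements contribute to the weighted histogram.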
Re: [Numpy-discussion] Asking proposal review/feedback for GSOC 15
On Mon, Mar 23, 2015 at 12:23 PM, Oğuzhan Ünlü cengoguzhanu...@gmail.com wrote:
> Hi, My name is Oğuzhan(You may use 'Oguzhan'). I submitted a proposal on the system with the title 'NumPy - Vector math library integration'. Ralf commented on my proposal and advised to ask for a feedback on mailing list and here I am. I would appreciate any feedback from community. I think community members are able to view my proposal, its visibility is set to 'Organization members'. I preferred my name in its original form, if any mentor would like to search, I provide my name on system below.
>
> Name: Oğuzhan Ünlü

Hi Oğuzhan,

There are only a handful of potential mentors signed up in Melange, and this list is read by hundreds of people. So it would be good to post your proposal in a publicly accessible place and post the link here. Good options are on Github or on StackEdit.

Cheers,
Ralf

P.S. for those who do have access to Melange: http://www.google-melange.com/gsoc/proposal/review/org/google/gsoc2015/blacksimit/5741031244955648
Re: [Numpy-discussion] Rewrite np.histogram in c?
On Mar 23, 2015 6:59 AM, Daniel da Silva var.mail.dan...@gmail.com wrote:
> Hope this isn't too off-topic: but it would be very nice if np.histogram and np.histogram2d supported masked arrays. Is this out of scope for outside the numpy.ma package?

Usually the way this kind of thing is handled is by adding an np.ma.histogram function.

-n
[Numpy-discussion] Vector math library integration
Hello!

I want to contribute to NumPy/SciPy; specifically, I am interested in the project 'Vector math library integration'. I have good C and Python skills, so I believe I can do it. Please send me additional information about this idea asap.

Have a nice day!

Best regards,
Akbar

IRC: aki93 at freenode dot net