[Numpy-discussion] PyArray_Scalar() and Unicode
I apologize ahead of time for anything I might be totally missing, but in order to make PyArray_Scalar() work on non-CPython interpreters, I need to refactor that function significantly. I've made (untested but correct-looking) changes to the function to handle all of the data types except Unicode. I just got a crash course in Unicode today, so my understanding is limited.

It seems the most compatible way to turn the UCS4 data into a PyUnicodeObject would be to first convert it to UCS2 and then use PyUnicode_DecodeUTF16() to create the Python object. There are a few problems with this. The biggest problem for me is that PyUCS2Buffer_FromUCS4() doesn't appear to produce UCS2 at all, but rather UTF-16, since it produces surrogate pairs for code points above 0x. My first question is: is there any time when the data produced by PyUCS2Buffer_FromUCS4() wouldn't be parseable by a standards-compliant UTF-16 decoder?

Aside from that, converting to UCS2 (possibly after making a word-aligned copy of the original data) and then converting that to the native storage, which is likely UTF-16 anyway, is horribly wasteful. The ideal way to accomplish this would be to simply use PyUnicode_DecodeUTF32() on the original data and be done with it. The biggest problem with this approach is that it's not very compatible: it requires Python 2.6, and it currently isn't implemented in PyPy (but that's fixable).

I talked briefly to Stéfan about this, and he mentioned that you were involved in all of this and that things are in a state of flux. So before I devote a significant amount of time and thought to this, I thought I'd put myself out into the open air and see if there are any major holes in my rationale, or if things will change significantly enough that I should adjust my approach.

Thanks,
Dan
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
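The two conversion routes described in this message can be compared in a small sketch. It is written in Python 3 rather than against the C API, and the helper name ucs4_to_utf16_units is made up for illustration; it is not how PyUCS2Buffer_FromUCS4() is implemented. It relies only on the UTF-16 rule that code points above U+FFFF are encoded as a surrogate pair.

```python
import struct

def ucs4_to_utf16_units(code_points):
    """Encode UCS4 code points as a list of 16-bit UTF-16 code units."""
    units = []
    for cp in code_points:
        if cp > 0xFFFF:
            # Outside the Basic Multilingual Plane: split into a surrogate pair.
            cp -= 0x10000
            units.append(0xD800 | (cp >> 10))    # high (lead) surrogate
            units.append(0xDC00 | (cp & 0x3FF))  # low (trail) surrogate
        else:
            # Passed through unchanged, including values in 0xD800..0xDFFF.
            units.append(cp)
    return units

# 'A', 'e' with acute accent, and MUSICAL SYMBOL G CLEF (needs a surrogate pair)
code_points = [0x41, 0xE9, 0x1D11E]

# Route 1: UCS4 -> UTF-16 code units -> standard UTF-16 decoder
units = ucs4_to_utf16_units(code_points)
utf16_bytes = struct.pack("<%dH" % len(units), *units)
decoded_via_utf16 = utf16_bytes.decode("utf-16-le")

# Route 2: decode the 4-byte data directly as UTF-32
ucs4_bytes = struct.pack("<%dI" % len(code_points), *code_points)
decoded_via_utf32 = ucs4_bytes.decode("utf-32-le")
```

Both routes yield the same string for this input. One case worth checking: in this sketch a UCS4 value that itself falls in the surrogate range (0xD800 to 0xDFFF) is passed through unchanged, and a strict UTF-16 decoder rejects a lone surrogate. Whether PyUCS2Buffer_FromUCS4() behaves the same way would need to be checked against the NumPy source.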
Re: [Numpy-discussion] Technicalities of the SVN - GIT transition
Sorry to interrupt, but do you have a usable, up-to-date, cloneable git repository somewhere? I noticed that you had a repository on github, but it's about a month out of date. I understand if what you're working on isn't ready to be cloned yet. In the meantime I can use git-svn, but I figured if you had something better, I'd use that.

Thanks,
Dan

On Jun 1, 2010 12:59 AM, David Cournapeau courn...@gmail.com wrote:

Hi there,

I have looked back into how to convert the existing numpy svn repository into git. It went quite smoothly using svn2git (developed by the KDE team for their own transition), but there are a few questions which need to be answered:

- Shall we keep the old svn branches? I think most of them are just cruft and can be safely removed. We would then just keep the release branches (in maintenance/***).
- Tag conversion: svn has no real notion of tags, so translating them into git tags cannot be done automatically and safely (and we do have some rewritten tags in the svn repo). I was thinking about writing a small script to create them manually afterwards for the releases, in the svntags/***.
- Author conversion: according to git, there are around 50 committers in numpy. Several of them are duplicates and should be merged, I think (kern vs rkern, Travis' accounts as well), but there is also the option to set up real emails. Since emails are private, I don't want to just scrape them without asking permission first. I don't know how we should proceed here.

The author conversion needs to be decided upfront (as changing committer names changes every commit's SHA-1 hash); tags and dropping branches may be done later.

Last time we discussed things, there were some concerns about space: the numpy git repo is around 17 MB for the full history, 36 MB if one includes the working tree, compared to 43 MB for a trunk checkout from svn. The master branch (the trunk in git) has ~6500 commits.
cheers,
David
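Why the author mapping has to be settled before the conversion can be seen from how git names commits: the commit ID is a SHA-1 over the commit object, and the author and committer lines are part of that object. The following Python sketch hashes a parentless commit the way git does. The mapping, the placeholder email addresses, and the timestamp are hypothetical and only for illustration; the tree ID is git's well-known empty tree.

```python
import hashlib

EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # git's empty tree object

def commit_sha1(author, message, timestamp=1275350340, tz="+0000"):
    """Return the SHA-1 git would assign to a parentless commit object."""
    ident = "%s %d %s" % (author, timestamp, tz)
    body = ("tree %s\nauthor %s\ncommitter %s\n\n%s\n"
            % (EMPTY_TREE, ident, ident, message)).encode("utf-8")
    # git hashes "<type> <size>\0" followed by the object body.
    header = b"commit " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()

# Hypothetical author map that merges two svn logins into one git identity.
AUTHOR_MAP = {
    "kern": "rkern <rkern@example.invalid>",
    "rkern": "rkern <rkern@example.invalid>",
}

merged_a = commit_sha1(AUTHOR_MAP["kern"], "Fix typo")
merged_b = commit_sha1(AUTHOR_MAP["rkern"], "Fix typo")
unmerged = commit_sha1("kern <kern@example.invalid>", "Fix typo")
```

With the merged mapping both logins produce the same commit ID, while leaving the identity unmerged produces a different one. Because each commit also records its parent's ID, changing one author line later would change the ID of every descendant commit as well.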
Re: [Numpy-discussion] My GSoC Proposal to Implement a Subset of NumPy for PyPy
Wow, that's a very cool idea. I think that's an excellent approach to allowing user RPython functions. Maciej expressed concern that this could create a support burden for RPython on the core PyPy developers (there aren't many of them). I think, handled correctly, this could help create a community knowledgeable about RPython that could support it, but initially I would be the only one supporting this use.

I think the best approach would be to provide a similar fast_vectorize() in addition to accelerating normal Python looping constructs with the JIT compiler; that way we can have faster code without even trying. I'm going to contact the author of that paper to see what its implementation status is, as the paper mentioned he was trying to get it into SciPy.

Sorry about the latency, I tend to make multiple drafts of my emails before I send them...

On Apr 21, 2010 5:20 AM, Dag Sverre Seljebotn da...@student.matnat.uio.no wrote:

Dan Roberts wrote:
Thanks for the reply. You're certainly right that your work is extremely be...

This might be relevant?
http://conference.scipy.org/proceedings/SciPy2008/paper_16/

--
Dag Sverre
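To make the vectorize idea concrete, here is a minimal pure-Python sketch of a decorator that turns a scalar function into an elementwise one. The name naive_vectorize is hypothetical; this is not the fast_vectorize() from the paper and not an API of PyPy or NumPy. It only shows the shape of code a JIT would be asked to speed up.

```python
def naive_vectorize(scalar_func):
    """Wrap a scalar function so it applies elementwise over equal-length sequences."""
    def elementwise(*sequences):
        lengths = {len(seq) for seq in sequences}
        if len(lengths) > 1:
            raise ValueError("all inputs must have the same length")
        # One call of the scalar function per element position.
        return [scalar_func(*args) for args in zip(*sequences)]
    elementwise.__name__ = scalar_func.__name__
    return elementwise

@naive_vectorize
def clipped_add(x, y):
    """Add two numbers, clipping the result at 10.0."""
    total = x + y
    return total if total < 10.0 else 10.0

result = clipped_add([1.0, 4.0, 9.0], [2.0, 5.0, 3.0])
```

The scalar function contains a branch, which is awkward to express with prebuilt elementwise operations but trivial as ordinary Python. On an interpreter without a JIT, a wrapper like this is slow because every element costs a Python-level call.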
Re: [Numpy-discussion] My GSoC Proposal to Implement a Subset of NumPy for PyPy
Thanks for the reply. You're certainly right that your work is extremely beneficial to mine. At present I'm afraid a great deal of NumPy C code isn't easily reusable, and it's great you're addressing that. I may not have been thinking in line with Maciej, but I was thinking ufuncs would be written in pure Python and JIT-compiled to an efficient form. (We can make lots of nice assumptions about them.) That said, I think being able to write generic ufuncs is a very good idea, and absolutely doable.

On Apr 20, 2010 7:48 AM, Travis Oliphant oliph...@enthought.com wrote:

On Apr 16, 2010, at 11:50 PM, Dan Roberts wrote:
Hello NumPy Users, Hi everybody, my name i...

Hi Daniel,

This sounds like a great project, and I think it has promise. I would especially pay attention to the requests to make it easy to write ufuncs and generalized ufuncs in RPython. That has the most possibility of being immediately useful.

Your timing is also very good. I am going to be spending some time refactoring NumPy to separate out the CPython interface from the underlying algorithms. I think this refactoring should help you in your long-term goals. If you have any input while the refactoring is taking place, we are always open to suggestions and criticisms.

Thanks for writing a NumPy-related proposal.

Best regards,
-Travis

Thanks,
Daniel Roberts

--
Travis Oliphant
Enthought Inc.
1-512-536-1057
http://www.enthought.com
oliph...@enthought.com
Re: [Numpy-discussion] My GSoC Proposal to Implement a Subset of NumPy for PyPy
Oops, I intended to dig more through the NumPy source before I sent the final version of that message, so I could be speaking from an informed standpoint.
Re: [Numpy-discussion] My GSoC Proposal to Implement a Subset of NumPy for PyPy
On Apr 18, 2010 6:46 PM, Dan Roberts ademan...@gmail.com wrote:

I've been trying my best to take my time formulating my replies, but I need to respond eventually. :-)

This is embarrassing, but I'm actually not sure where I talked about an interface specifically. I did rather nebulously talk about interfacing with C code and LAPACK. The interfaces there would be provided by the respective code and consumed by micronumpy, at least as I currently see it.

I haven't consulted Maciej about this yet, but I think working backwards from a complete C NumPy depends on a great many ifs, many of which I think aren't satisfied. I need to look into this, but I assume NumPy operates on array structures directly, rather than through an interface. If it goes through an interface, there's a real possibility that approach would work. It would require me to write some adaptors, but I think that would be OK, and a low enough time investment. Like I said, I'm currently speaking from ignorance, so I need to look into it and get back to you.

Cheers,
Dan

P.S. I agree about the sparse matrices; I've bugged fijal a small bit about that.

P.P.S. Forgot to CC the mailing list... assumed this mail client would do it for me.. lol

On Apr 17, 2010 12:25 AM, Stéfan van der Walt ste...@sun.ac.za wrote:

Hi Dan

On 17 April 2010 06:50, Dan Roberts ademan...@gmail.com wrote:
Hi everybody, my name is Dan...

Thanks for the introduction, and welcome to NumPy!

I hadn't prepared for review by the NumPy mentors, but this can make my proposal stronger t...

This proposal builds a bridge between two projects, so even if it technically falls under the...

Why should we bother reimplementing anything? PyPy, for those who are unfamiliar, has the ...

Your code has a fairly specialised application and it's worth discussing exactly where it wou...
[Numpy-discussion] My GSoC Proposal to Implement a Subset of NumPy for PyPy
Hello NumPy users,

My name is Dan Roberts, and my Google Summer of Code proposal was categorized under NumPy rather than PyPy, so it will end up being reviewed by mentors for the NumPy project. I'd like to take this chance to introduce myself and my proposal.

I hadn't prepared for review by the NumPy mentors, but this can make my proposal stronger than before. With a bit of help from all of you, I can dedicate my summer to creating more useful code than I would have previously. I realize that from the perspective of NumPy, my proposal might seem lacking, so I'd also like to invite the scrutiny of all the readers of this list.

Why should we bother reimplementing anything? PyPy, for those who are unfamiliar, has the ability to Just-in-Time compile itself and the programs it's running. One of the major advantages of this is that code operating on NumPy arrays could potentially be written in pure Python, with normal looping constructs, and be nearly as fast as a ufunc painstakingly crafted in C. I'd love to see as much Python and as little C as possible, and I'm sure I'm not alone in that wish.

A short introduction: I've been coding in Python for the past few years, and have become increasingly interested in speeding up what has become my favorite language. To that end I've become interested in both the PyPy and NumPy projects. I've spent a fair amount of time frustrating the PyPy developers with silly questions, written a bit of code for them, and now my GSoC proposal involves both them and NumPy.

Finally, I'd like to ask all of you: what features are most important to you? It's not practical, wise, or even possible for me to reimplement more than a small portion of NumPy, but if I can address the most important parts, maybe I can make this project useful enough for some of you to use, and close enough for the rest of you that I can drum up some support for more development in the future.
My proposal lives at http://codespeak.net/~dan/gsoc/micronumpy.html

Thanks for making it this far through my long-winded introduction! I welcome all constructive criticism and thoughts.

Thanks,
Daniel Roberts
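As an illustration of what "pure Python, with normal looping constructs" could look like, here is a small sketch of an elementwise operation written as an ordinary loop. It uses the standard-library array module as a stand-in for a typed array; the function name saxpy and this choice of container are assumptions for the example, not part of the micronumpy proposal.

```python
from array import array

def saxpy(a, x, y):
    """Return a*x + y elementwise, using only an ordinary Python loop."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    out = array("d", [0.0]) * len(x)  # zero-filled output buffer of doubles
    for i in range(len(x)):
        out[i] = a * x[i] + y[i]
    return out

x = array("d", [1.0, 2.0, 3.0])
y = array("d", [10.0, 20.0, 30.0])
z = saxpy(2.0, x, y)
```

On CPython a loop like this pays interpreter overhead on every iteration, which is the reason such operations are usually written in C. The hope expressed in the proposal is that a tracing JIT could compile the loop body to something close to the C version.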