This sounds a lot like Theano. Did you look into it?

-=- Olivier
On 20 March 2012 13:49, mark florisson <markflorisso...@gmail.com> wrote:
> On 13 March 2012 18:18, Travis Oliphant <tra...@continuum.io> wrote:
> >>>
> >>> (Mark F., how does the above match how you feel about this?)
> >>
> >> I would like collaboration, but from a technical perspective I think
> >> this would be much more involved than just dumping the AST to an IR
> >> and generating some code from there. For vector expressions I think
> >> sharing code would be more feasible than arbitrary (parallel) loops,
> >> etc. Cython as a compiler can make many decisions that a Python
> >> (bytecode) compiler can't make (at least without annotations and a
> >> well-defined subset of the language (not so much the syntax as the
> >> semantics)). I think in numba, if parallelism is to be supported, you
> >> will want a prange-like construct, as proving independence between
> >> iterations can be very hard to near impossible for a compiler.
> >
> > I completely agree that you have to define some kind of syntax to get
> > parallelism. But, a prange construct would not be out of the question,
> > of course.
> >
> >>
> >> As for code generation, I'm not sure how llvm would do things like
> >> slicing arrays, reshaping, resizing etc (for vector expressions you
> >> can first evaluate all slicing and indexing operations and then
> >> compile the remaining vector expression), but for loops and array
> >> reassignment within loops this would have to invoke the actual slicing
> >> code from the llvm code (I presume).
> >
> > There could be some analysis on the byte-code, prior to emitting the
> > llvm code in order to handle lots of things. Basically, you have to
> > "play" the byte-code on a simple machine anyway in order to emit the
> > correct code. The big thing about Cython is you have to typedef too
> > many things that are really quite knowable from the code. If Cython
> > could improve its type inference, then it would be a more suitable
> > target.
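As a rough illustration of how much type information is "knowable from the code", here is a minimal sketch that infers int or float result types for a tiny arithmetic subset of Python. It is only an assumption-laden example: it walks the Python AST with the standard `ast` module rather than "playing" the byte-code, and `infer_type` and its `env` mapping are hypothetical names, not part of Cython or numba.

```python
import ast


def infer_type(node, env):
    """Infer int or float for a small arithmetic subset of Python.

    env maps argument names to their (assumed known) types.
    """
    if isinstance(node, ast.Expression):
        return infer_type(node.body, env)
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return type(node.value)
    if isinstance(node, ast.Name):
        return env[node.id]  # type supplied for a function argument
    if isinstance(node, ast.BinOp):
        left = infer_type(node.left, env)
        right = infer_type(node.right, env)
        if isinstance(node.op, ast.Div):
            return float  # true division always yields a float
        if isinstance(node.op, (ast.Add, ast.Sub, ast.Mult)):
            # Usual promotion: any float operand makes the result float
            return float if float in (left, right) else int
    # Anything outside the subset is deliberately not guessed at
    raise NotImplementedError(ast.dump(node))


tree = ast.parse("a * 2 + b", mode="eval")
result = infer_type(tree, {"a": int, "b": float})
```

Only the argument types are given here; the types of the intermediate results follow from the expression itself.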
> >
> >> There are many other things, like bounds checking, wraparound, etc,
> >> that are all supported in both numpy and Cython, but going through an
> >> llvm layer would, as far as I can see, require re-implementing those,
> >> at least if you want top-notch performance. Personally, I think for
> >> non-trivial performance-critical code (for loops with indexing,
> >> slicing, function calls, etc) Cython is a better target.
> >
> > With libclang it is really quite possible to imagine a cython -> C
> > target that itself compiles to llvm so that you can do everything at
> > that intermediate layer. However, LLVM is a much better layer for
> > optimization than C now that there are a lot of people collaborating
> > on that layer. I think it would be great if Cython targeted LLVM
> > actually instead of C.
> >
> >>
> >> Finally, as for non-vector-expression code, I really believe Cython is
> >> a better target. cython.inline can have high overhead (at least the
> >> first time it has to compile), but with better (numpy-aware) type
> >> inference or profile guided optimizations (see recent threads on the
> >> cython-dev mailing list), in addition to things like prange, I
> >> personally believe Cython targets most of the use cases where numba
> >> would be able to generate performing code.
> >
> > Cython and Numba certainly overlap. However, Cython requires:
> >
> > 1) learning another language
> > 2) creating an extension module --- loading bit-code files and
> >    dynamically executing (even on a different machine from the one
> >    that initially created them) can be a powerful alternative for
> >    run-time compilation and distribution of code.
> >
> > These aren't show-stoppers obviously. But, I think some users would
> > prefer an even simpler approach to getting fast-code than Cython (which
> > currently doesn't do enough type-inference and requires building a
> > dlopen extension module).
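For reference, the bounds checking and wraparound handling mentioned earlier in the quoted exchange amounts to per-index logic like the following, which any code generator would have to reproduce. This is a minimal sketch in plain Python: `normalize_index` is a hypothetical helper, and its two flags are named after Cython's `boundscheck` and `wraparound` directives only for readability.

```python
def normalize_index(index, length, wraparound=True, boundscheck=True):
    """Turn a user-supplied index into a valid offset along one axis."""
    if wraparound and index < 0:
        # Negative indices count from the end of the axis
        index += length
    if boundscheck and not 0 <= index < length:
        raise IndexError(
            "index %d is out of bounds for axis of length %d" % (index, length)
        )
    return index


last = normalize_index(-1, 5)
```

With both flags switched off the index is passed through untouched, which is where the speed comes from, at the cost of undefined behaviour for bad indices in compiled code.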
>
> Dag and I have been discussing this at PyCon, and here is my take on
> it (at this moment :).
>
> Definitely, if you can avoid Cython then that is easier and more
> desirable in many ways. So perhaps we can create a third project
> called X (I'm not very creative, maybe ArrayExprOpt), that takes an
> abstract syntax tree in a rather simple form, performs code
> optimizations such as rewriting loops with array accesses to vector
> expressions, fusing vector expressions and loops, etc, and spits out a
> transformed AST containing these optimizations. If runtime information
> is given, such as actual shape and stride information, the
> transformations could figure out there and then whether to do things
> like collapsing, axes swapping, blocking (as in, introducing more axes
> or loops to retain discontiguous blocks in the cache), blocked memory
> copies to contiguous chunks, etc. The AST could then also say whether
> the final expressions are vectorizable. Part of this functionality is
> already in numpy's nditer, except that this would be implicit and do
> more (and hopefully with minimal overhead).
>
> So numba, Cython and maybe numexpr could use the functionality, simply
> by building the AST from Python and converting back (if necessary) to
> their own AST. As such, the AST optimizer would be only part of any
> (runtime) compiler's pipeline, and it should be very flexible to
> retain any information (metadata regarding actual types, control flow
> information, etc) provided by the original AST. It would not do
> control flow analysis, type inference or promotion, etc, but only deal
> with abstract types like integers, reals and arrays (C, Fortran or
> partly contiguous or strided). It would not deal with objects, but
> would allow inserting nodes like UnreorderableNode and SideEffectNode
> wrapping parts of the original AST. In short, it should be as easy as
> possible to convert from an original AST to this project's AST and
> back again afterwards.
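The axis collapsing mentioned above can be decided from shape and stride information alone. The following is a minimal sketch under stated assumptions: `collapse_axes` is a hypothetical name, strides are in bytes as in numpy, and only adjacent axes in the given order are merged (no axis swapping or blocking).

```python
def collapse_axes(shape, strides):
    """Merge adjacent axes that can be walked as one flat axis.

    Two neighbouring axes can be collapsed when one step along the outer
    axis covers exactly the whole extent of the inner axis.
    """
    if not shape:
        return (), ()
    out_shape = [shape[0]]
    out_strides = [strides[0]]
    for dim, stride in zip(shape[1:], strides[1:]):
        if out_strides[-1] == dim * stride:
            # Outer stride equals inner extent: fold into a single axis
            out_shape[-1] *= dim
            out_strides[-1] = stride
        else:
            # A gap between rows: keep this as a separate loop level
            out_shape.append(dim)
            out_strides.append(stride)
    return tuple(out_shape), tuple(out_strides)


# C-contiguous 2x3x4 array of 8-byte elements collapses to one loop
contiguous = collapse_axes((2, 3, 4), (96, 32, 8))
# A 3x4 view into a 3x8 array keeps two loop levels
sliced = collapse_axes((3, 4), (64, 8))
```

Fewer loop levels means less indexing overhead and a longer innermost loop, which is what makes the remaining expression a better candidate for vectorization.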
>
> As the project matures many optimizations may be added that deal with
> all sorts of loop restructuring and ways to efficiently utilize the
> cache as well as enable vectorization and possibly parallelism.
> Perhaps it could even generate a different AST depending on whether
> execution targets the CPU or the GPU (with optionally available
> information such as cache sizes, GPU shared/local memory sizes, etc).
>
> Seeing that this would be a part of my master's dissertation, my
> supervisor would require me to write the code, so at least until
> August I think I would have to write (at least the bulk of) this.
> Otherwise I can also make other parts of my dissertation's project
> more prominent to make up for it. Anyway, my question is, is there
> interest from at least the numba and numexpr projects (if code can be
> transformed into vector operations, it makes sense to use numexpr for
> that, I'm not sure what numba's interest is in that).
>
> > -Travis
> >
> >>
> >>> Dag
> >>> _______________________________________________
> >>> NumPy-Discussion mailing list
> >>> NumPy-Discussion@scipy.org
> >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion