On 20 March 2012 20:49, Olivier Delalleau <sh...@keba.be> wrote:
> I doubt Theano is already as smart as you'd want it to be right now, however
> the core mechanisms are there to perform graph optimizations and move
> computations to GPU. It may save time to start from there instead of
> starting all over from scratch. I'm not sure though, but it looks like it
> would be worth considering it at least.

Thanks for the suggestion Olivier. As Dag said, we discussed it, and indeed
we (or I) should look a lot deeper into it, see what components are
reusable there, and discuss with the Theano community if and how we can
collaborate.

> -=- Olivier
>
> On 20 March 2012 15:40, Dag Sverre Seljebotn <d.s.seljeb...@astro.uio.no>
> wrote:
>
>> We talked some about Theano. There are some differences in project goals
>> which mean that it makes sense to make this a separate project: Cython
>> wants to use this to generate C code up front from the Cython AST at
>> compilation time; numba also has a different frontend (parsing of python
>> bytecode) and a different backend (LLVM).
>>
>> However, it may very well be possible that Theano could be refactored so
>> that the more essential algorithms working on the syntax tree could be
>> pulled out and shared with cython and numba. Then the question is whether
>> the core of Theano is smart enough to compete with Fortran compilers and
>> support arbitrarily strided inputs optimally. Otherwise one might as well
>> start from scratch. I'll leave that for Mark to figure out...
>>
>> Dag
>> --
>> Sent from my Android phone with K-9 Mail. Please excuse my brevity.
>>
>>
>> Olivier Delalleau <sh...@keba.be> wrote:
>>>
>>> This sounds a lot like Theano, did you look into it?
>>>
>>> -=- Olivier
>>>
>>> On 20 March 2012 13:49, mark florisson <markflorisso...@gmail.com> wrote:
>>>>
>>>> On 13 March 2012 18:18, Travis Oliphant <tra...@continuum.io> wrote:
>>>> >>>
>>>> >>> (Mark F., how does the above match how you feel about this?)
>>>> >>
>>>> >> I would like collaboration, but from a technical perspective I think
>>>> >> this would be much more involved than just dumping the AST to an IR
>>>> >> and generating some code from there. For vector expressions I think
>>>> >> sharing code would be more feasible than arbitrary (parallel) loops,
>>>> >> etc.
>>>> >> Cython as a compiler can make many decisions that a Python
>>>> >> (bytecode) compiler can't make (at least without annotations and a
>>>> >> well-defined subset of the language (not so much the syntax as the
>>>> >> semantics)). I think in numba, if parallelism is to be supported, you
>>>> >> will want a prange-like construct, as proving independence between
>>>> >> iterations can be very hard to near impossible for a compiler.
>>>> >
>>>> > I completely agree that you have to define some kind of syntax to get
>>>> > parallelism. But, a prange construct would not be out of the question,
>>>> > of course.
>>>> >
>>>> >>
>>>> >> As for code generation, I'm not sure how llvm would do things like
>>>> >> slicing arrays, reshaping, resizing etc (for vector expressions you
>>>> >> can first evaluate all slicing and indexing operations and then
>>>> >> compile the remaining vector expression), but for loops and array
>>>> >> reassignment within loops this would have to invoke the actual
>>>> >> slicing code from the llvm code (I presume).
>>>> >
>>>> > There could be some analysis on the byte-code, prior to emitting the
>>>> > llvm code in order to handle lots of things. Basically, you have to
>>>> > "play" the byte-code on a simple machine anyway in order to emit the
>>>> > correct code.
>>>> > The big thing about Cython is you have to typedef too many things that
>>>> > are really quite knowable from the code. If Cython could improve its
>>>> > type inference, then it would be a more suitable target.
>>>> >
>>>> >> There are many other things, like
>>>> >> bounds checking, wraparound, etc, that are all supported in both
>>>> >> numpy and Cython, but going through an llvm layer would, as far as I
>>>> >> can see, require re-implementing those, at least if you want
>>>> >> top-notch performance.
>>>> >> Personally, I think for non-trivial performance-critical
>>>> >> code (for loops with indexing, slicing, function calls, etc) Cython
>>>> >> is a better target.
>>>> >
>>>> > With libclang it is really quite possible to imagine a cython -> C
>>>> > target that itself compiles to llvm so that you can do everything at
>>>> > that intermediate layer. However, LLVM is a much better layer for
>>>> > optimization than C now that there are a lot of people collaborating
>>>> > on that layer. I think it would be great if Cython targeted LLVM
>>>> > actually instead of C.
>>>> >
>>>> >>
>>>> >> Finally, as for non-vector-expression code, I really believe Cython
>>>> >> is a better target. cython.inline can have high overhead (at least the
>>>> >> first time it has to compile), but with better (numpy-aware) type
>>>> >> inference or profile guided optimizations (see recent threads on the
>>>> >> cython-dev mailing list), in addition to things like prange, I
>>>> >> personally believe Cython targets most of the use cases where numba
>>>> >> would be able to generate performant code.
>>>> >
>>>> > Cython and Numba certainly overlap. However, Cython requires:
>>>> >
>>>> > 1) learning another language
>>>> > 2) creating an extension module --- loading bit-code files and
>>>> > dynamically executing (even on a different machine from the one that
>>>> > initially created them) can be a powerful alternative for run-time
>>>> > compilation and distribution of code.
>>>> >
>>>> > These aren't show-stoppers obviously. But, I think some users would
>>>> > prefer an even simpler approach to getting fast-code than Cython (which
>>>> > currently doesn't do enough type-inference and requires building a
>>>> > dlopen extension module).
>>>>
>>>> Dag and I have been discussing this at PyCon, and here is my take on
>>>> it (at this moment :).
>>>>
>>>> Definitely, if you can avoid Cython then that is easier and more
>>>> desirable in many ways.
>>>> So perhaps we can create a third project
>>>> called X (I'm not very creative, maybe ArrayExprOpt), that takes an
>>>> abstract syntax tree in a rather simple form, performs code
>>>> optimizations such as rewriting loops with array accesses to vector
>>>> expressions, fusing vector expressions and loops, etc, and spits out a
>>>> transformed AST containing these optimizations. If runtime information
>>>> is given such as actual shape and stride information the
>>>> transformations could figure out there and then whether to do things
>>>> like collapsing, axes swapping, blocking (as in, introducing more axes
>>>> or loops to retain discontiguous blocks in the cache), blocked memory
>>>> copies to contiguous chunks, etc. The AST could then also say whether
>>>> the final expressions are vectorizable. Part of this functionality is
>>>> already in numpy's nditer, except that this would be implicit and do
>>>> more (and hopefully with minimal overhead).
>>>>
>>>> So numba, Cython and maybe numexpr could use the functionality, simply
>>>> by building the AST from Python and converting back (if necessary) to
>>>> their own AST. As such, the AST optimizer would be only part of any
>>>> (runtime) compiler's pipeline, and it should be very flexible to
>>>> retain any information (metadata regarding actual types, control flow
>>>> information, etc) provided by the original AST. It would not do
>>>> control flow analysis, type inference or promotion, etc, but only deal
>>>> with abstract types like integers, reals and arrays (C, Fortran or
>>>> partly contiguous or strided). It would not deal with objects, but
>>>> would allow inserting nodes like UnreorderableNode and SideEffectNode
>>>> wrapping parts of the original AST. In short, it should be as easy as
>>>> possible to convert from an original AST to this project's AST and
>>>> back again afterwards.
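
[Illustration, not part of the original thread.] The "collapsing" decision
mentioned above, made from runtime shape and stride information, might look
roughly like the sketch below. The helper name collapse_axes is hypothetical
and does not exist in numpy, Cython or numba; the sketch assumes shape and
strides arrive as tuples in the form numpy reports them (strides in bytes),
and it only merges neighbouring axes without attempting axes swapping or
blocking.

```python
def collapse_axes(shape, strides):
    """Merge adjacent axes that can be walked as one flat axis.

    Hypothetical helper for illustration only. `shape` and `strides` are
    tuples as reported by a numpy array (strides in bytes).
    """
    if not shape:
        return (), ()
    out_shape, out_strides = [shape[0]], [strides[0]]
    for dim, stride in zip(shape[1:], strides[1:]):
        # One step along the outer axis equals a full pass over the inner
        # axis, so a single loop can traverse both.
        if out_strides[-1] == dim * stride:
            out_shape[-1] *= dim
            out_strides[-1] = stride
        else:
            out_shape.append(dim)
            out_strides.append(stride)
    return tuple(out_shape), tuple(out_strides)


# C-contiguous float64 array of shape (3, 4, 5): one flat loop suffices.
print(collapse_axes((3, 4, 5), (160, 40, 8)))   # ((60,), (8,))
# The same array sliced as a[:, ::2, :]: the two outer axes merge.
print(collapse_axes((3, 2, 5), (160, 80, 8)))   # ((6, 5), (80, 8))
# Fortran-ordered strides: nothing merges without swapping axes first.
print(collapse_axes((5, 4, 3), (8, 40, 160)))   # ((5, 4, 3), (8, 40, 160))
```

With a real array the inputs would simply be a.shape and a.strides. The last
case hints at why the proposal lists axes swapping next to collapsing:
reordering the axes by stride first would let the same merge rule apply.
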
>>>>
>>>> As the project matures many optimizations may be added that deal with
>>>> all sorts of loop restructuring and ways to efficiently utilize the
>>>> cache as well as enable vectorization and possibly parallelism.
>>>> Perhaps it could even generate a different AST depending on whether
>>>> execution targets the CPU or the GPU (with optionally available
>>>> information such as cache sizes, GPU shared/local memory sizes, etc).
>>>>
>>>> Seeing that this would be a part of my master's dissertation, my
>>>> supervisor would require me to write the code, so at least until
>>>> August I think I would have to write (at least the bulk of) this.
>>>> Otherwise I can also make other parts of my dissertation's project
>>>> more prominent to make up for it. Anyway, my question is, is there
>>>> interest from at least the numba and numexpr projects? (If code can be
>>>> transformed into vector operations, it makes sense to use numexpr for
>>>> that; I'm not sure what numba's interest is in that.)
>>>>
>>>> > -Travis
>>>> >
>>>> >>
>>>> >>> Dag
>>>> >>> _______________________________________________
>>>> >>> NumPy-Discussion mailing list
>>>> >>> NumPy-Discussion@scipy.org
>>>> >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion