This sounds a lot like Theano, did you look into it?

-=- Olivier

Le 20 mars 2012 13:49, mark florisson <markflorisso...@gmail.com> a écrit :

> On 13 March 2012 18:18, Travis Oliphant <tra...@continuum.io> wrote:
> >>>
> >>> (Mark F., how does the above match how you feel about this?)
> >>
> >> I would like collaboration, but from a technical perspective I think
> >> this would be much more involved than just dumping the AST to an IR
> >> and generating some code from there. For vector expressions I think
> >> sharing code would be more feasible than arbitrary (parallel) loops,
> >> etc. Cython as a compiler can make many decisions that a Python
> >> (bytecode) compiler can't make (at least without annotations and a
> >> well-defined subset of the language (not so much the syntax as the
> >> semantics)). I think in numba, if parallelism is to be supported, you
> >> will want a prange-like construct, as proving independence between
> >> iterations can be very hard to near impossible for a compiler.
> >
> > I completely agree that you have to define some kind of syntax to get
> parallelism.  But, a prange construct would not be out of the question, of
> course.
> >
> >>
> >> As for code generation, I'm not sure how llvm would do things like
> >> slicing arrays, reshaping, resizing etc (for vector expressions you
> >> can first evaluate all slicing and indexing operations and then
> >> compile the remaining vector expression), but for loops and array
> >> reassignment within loops this would have to invoke the actual slicing
> >> code from the llvm code (I presume).
> >
> > There could be some analysis on the byte-code, prior to emitting the
> llvm code in order to handle lots of things.   Basically, you have to
> "play" the byte-code on a simple machine anyway in order to emit the
> correct code.   The big thing about Cython is you have to typedef too many
> things that are really quite knowable from the code.   If Cython could
> improve it's type inference, then it would be a more suitable target.
> >
> >> There are many other things, like
> >> bounds checking, wraparound, etc, that are all supported in both numpy
> >> and Cython, but going through an llvm layer would as far as I can see,
> >> require re-implementing those, at least if you want top-notch
> >> performance. Personally, I think for non-trivial performance-critical
> >> code (for loops with indexing, slicing, function calls, etc) Cython is
> >> a better target.
> >
> > With libclang it is really quite possible to imagine a cython -> C
> target that itself compiles to llvm so that you can do everything at that
> intermediate layer.   However,  LLVM is a much better layer for
> optimization than C now that there are a lot of people collaborating on
> that layer.   I think it would be great if Cython targeted LLVM actually
> instead of C.
> >
> >>
> >> Finally, as for non-vector-expression code, I really believe Cython is
> >> a better target. cython.inline can have high overhead (at least the
> >> first time it has to compile), but with better (numpy-aware) type
> >> inference or profile guided optimizations (see recent threads on the
> >> cython-dev mailing list), in addition to things like prange, I
> >> personally believe Cython targets most of the use cases where numba
> >> would be able to generate performing code.
> >
> > Cython and Numba certainly overlap.  However, Cython requires:
> >
> >        1) learning another language
> >        2) creating an extension module --- loading bit-code files and
> dynamically executing (even on a different machine from the one that
> initially created them) can be a powerful alternative for run-time
> compilation and distribution of code.
> >
> > These aren't show-stoppers obviously.   But, I think some users would
> prefer an even simpler approach to getting fast-code than Cython (which
> currently doesn't do enought type-inference and requires building a dlopen
> extension module).
>
> Dag and I have been discussing this at PyCon, and here is my take on
> it (at this moment :).
>
> Definitely, if you can avoid Cython then that is easier and more
> desirable in many ways. So perhaps we can create a third project
> called X (I'm not very creative, maybe ArrayExprOpt), that takes an
> abstract syntax tree in a rather simple form, performs code
> optimizations such as rewriting loops with array accesses to vector
> expressions, fusing vector expressions and loops, etc, and spits out a
> transformed AST containing these optimizations. If runtime information
> is given such as actual shape and stride information the
> transformations could figure out there and then whether to do things
> like collapsing, axes swapping, blocking (as in, introducing more axes
> or loops to retain discontiguous blocks in the cache), blocked memory
> copies to contiguous chunks, etc. The AST could then also say whether
> the final expressions are vectorizable. Part of this functionality is
> already in numpy's nditer, except that this would be implicit and do
> more (and hopefully with minimal overhead).
>
> So numba, Cython and maybe numexpr could use the functionality, simply
> by building the AST from Python and converting back (if necessary) to
> its own AST. As such, the AST optimizer would be only part of any
> (runtime) compiler's pipeline, and it should be very flexible to
> retain any information (metadata regarding actual types, control flow
> information, etc) provided by the original AST. It would not do
> control flow analysis, type inference or promotion, etc, but only deal
> with abstract types like integers, reals and arrays (C, Fortran or
> partly contiguous or strided). It would not deal with objects, but
> would allow to insert nodes like UnreorderableNode and SideEffectNode
> wrapping parts of the original AST. In short, it should be as easy as
> possible to convert from an original AST to this project's AST and
> back again afterwards.
>
> As the project matures many optimizations may be added that deal with
> all sorts of loop restructuring and ways to efficiently utilize the
> cache as well as enable vectorization and possibly parallelism.
> Perhaps it could even generate a different AST depending on whether
> execution target the CPU or the GPU (with optionally available
> information such as cache sizes, GPU shared/local memory sizes, etc).
>
> Seeing that this would be a part of my master dissertation, my
> supervisor would require me to write the code, so at least until
> August I think I would have to write (at least the bulk of) this.
> Otherwise I can also make other parts of my dissertation's project
> more prominent to make up for it. Anyway, my question is, is there
> interest from at least the numba and numexpr projects (if code can be
> transformed into vector operations, it makes sense to use numexpr for
> that, I'm not sure what numba's interest is in that).
>
> > -Travis
> >
> >
> >
> >
> >>
> >>> Dag
> >>> _______________________________________________
> >>> NumPy-Discussion mailing list
> >>> NumPy-Discussion@scipy.org
> >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>>
> >>>
> >>>
> >>> _______________________________________________
> >>> NumPy-Discussion mailing list
> >>> NumPy-Discussion@scipy.org
> >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>>
> >> _______________________________________________
> >> NumPy-Discussion mailing list
> >> NumPy-Discussion@scipy.org
> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion@scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Reply via email to