11.03.2012 15:12, xavier.gn...@gmail.com wrote:
[clip]
> If this description is correct, Numba is an additional pass once the
> cpython bytecode has been produced by cpython.
> Is it correct??
> Is python bytecode a good intermediate representation to perform numpy
> related optimization?
>
> One current big issue with numpy is that C=A+B+D produces temporaries.
> numexpr addresses this issue and it would be great to get the same
> result by default in numpy.
> numexpr also optimizes polynomials using Horner's method. It is hard to
> do that at bytecode level, isn't it?

My impression is that dealing with Python's bytecode is not necessarily
significantly harder than dealing with the AST. Your example reads

  1           0 LOAD_NAME                0 (A)
              3 LOAD_NAME                1 (B)
              6 BINARY_ADD
              7 LOAD_NAME                2 (D)
             10 BINARY_ADD
             11 STORE_NAME               3 (C)

For instance, interpreting the bytecode (e.g. a loop body) once with
dummy objects lets you know what the final compound expression is.

> Unladen swallow wanted to replace the full cpython implementation by a
> jit compiler built using LLVM... but unladen swallow is dead.

To get speed gains, you need to optimize not only the bytecode
interpreter side, but also the object space: Python classes, strings
and all that. Keeping in mind Python's dynamism, there are potential
side effects everywhere. I guess this is what sank the swallow.

Just speeding up effectively statically typed code dealing with arrays
and basic types, on the other hand, sounds much easier.

The PyPy guys have a much more ambitious approach, and are getting nice
results. Also with arrays: as I understand it, the fact that they want
to be able to do this sort of optimization is the main reason why they
want to reimplement the core parts of Numpy in RPython.

The second issue is that, unfortunately, their emulation of CPython's
C-API at the moment seems to have quite large overheads. Porting Numpy
on top of that is possible: I managed to get basic things (apart from
string/unicode arrays) to work, but things took very large speed hits
(on the order of 100x for things like `arange(10000).sum()`). This
means Numpy's speed advantage only shows up at rather large array
sizes. The reason is probably that Numpy uses PyObjects heavily
internally, so the cost of passing objects through the emulation layer
accumulates.

-- 
Pauli Virtanen
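
To make the dummy-object remark above concrete, here is a minimal sketch. It is only an illustration under my own assumptions, not how Numba or numexpr work: instead of writing a bytecode interpreter, it lets CPython run the bytecode once on placeholder operands that record what was done to them. The names `Sym`, `trace` and `loop_body` are hypothetical, and only `+` and `*` are handled.

```python
class Sym:
    """Dummy operand that records operations instead of computing them."""

    def __init__(self, expr):
        # A leaf is a plain name; a compound node is (op, lhs, rhs).
        self.expr = expr

    def _binop(self, op, other, swap=False):
        lhs = self.expr
        rhs = other.expr if isinstance(other, Sym) else other
        if swap:  # reflected operator: the dummy was the right operand
            lhs, rhs = rhs, lhs
        return Sym((op, lhs, rhs))

    def __add__(self, other):
        return self._binop("add", other)

    def __radd__(self, other):
        return self._binop("add", other, swap=True)

    def __mul__(self, other):
        return self._binop("mul", other)

    def __rmul__(self, other):
        return self._binop("mul", other, swap=True)


def trace(func, *names):
    """Run func once on dummy operands and return the recorded expression."""
    return func(*(Sym(name) for name in names)).expr


def loop_body(A, B, D):
    # Same statement as in the bytecode listing: two BINARY_ADDs.
    C = A + B + D
    return C


expr = trace(loop_body, "A", "B", "D")
print(expr)  # ('add', ('add', 'A', 'B'), 'D')
```

With the whole expression tree in hand, an optimizer could in principle evaluate it in one fused loop without temporaries, or rewrite a polynomial in Horner form. The sketch breaks down as soon as the traced code branches on array values, which is where the "potential side effects everywhere" problem comes back.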