Hi everyone,

There is a new development that could be a giant leap for FOSS numerical computing (a la MATLAB and Mathematica). It also calls for participation from the FOSS community, especially CS and math enthusiasts and anyone who needs a replacement for MATLAB (if you are interested, please read to the end).

A company named Continuum Analytics is developing a numerical computing engine called 'Blaze' for the Python language. It is touted as the next generation of NumPy. For anyone who doesn't know yet, NumPy is a library that makes it possible to do MATLAB-style matrix calculations in Python. Continuum Analytics is headed by Travis Oliphant, the creator of NumPy.

What is new in Blaze?

I got to read some of the early docs for Blaze. Here are some of my early impressions. It is a generalization of NumPy concepts, with a fresh data model and computing paradigm.

Data model:

The data model of Blaze has arrays/tables that combine concepts from software like NumPy, pandas, PyTables and Theano. Its features are:

1. You can combine multiple data buffers (called chunks) into a single array or table. These chunks can include RAM buffers, GPGPU buffers, hard disk locations and even network streams. This makes it possible to define extremely large (in principle unbounded) arrays. NumPy can only handle arrays that live in a single RAM buffer, and PyTables was needed to handle storage.

2. Rows or columns of arrays/tables can be tagged by name (like keys in a Python dict). For example, you can name columns time, temperature, pressure etc. This is a feature already provided by pandas and the R language.

3. The 'dtype' of NumPy is generalized and extended. A dtype defines how raw data buffers in memory are to be interpreted. As a consequence, you will be able to define arrays with very complex type layouts.

Computation model:

The computation engine (which crunches the numbers held in the arrays) is designed like a virtual machine that accepts Python inputs. Its features are:

1. It can be optimized for the target processing system. For example, different algorithms can be developed for CPUs, SMPs (i.e., multicore processors), clusters, GPGPUs, DSPs or vector processors (like Intel MIC, Nvidia Fermi, Adapteva Parallella).

2. It supports 'lazy evaluation', a very significant feature which makes Blaze behave like Mathematica, SymPy or Maxima. NumPy presently evaluates each expression as soon as it is executed. Blaze, on the other hand, will by default wait until an evaluation is forced. This means that when you write a Blaze expression in Python (like x = a + b, y = x**2), Blaze will create a data structure called an AST (abstract syntax tree) instead of executing the specified operation. An AST is a machine representation of the expression. ASTs from multiple expressions are combined (you will get y = (a+b)**2), and finally executed when a result is required. This allows optimization of expressions before evaluation. For example, (a*b + a*c) can be rewritten as (a*(b+c)), which saves one multiplication, since multiplication is expensive. You also get the advantages of symbolic mathematics. For example, when you do sin(x)**2 + cos(x)**2, you can get an array of 1s without computing any sines or cosines.

Blaze has many more features like these, but these are the ones I could figure out so far. An official announcement is yet to come. To know more about Blaze, you can refer to these:

1. Slides presented for PyCon:
   https://github.com/ContinuumIO/blaze/raw/master/slides.pdf

2. Docs within the Blaze repositories:
   Principles: https://github.com/ContinuumIO/ArrayServer/blob/master/docs/principles.txt
   NDTables doc: https://github.com/ContinuumIO/ArrayServer/blob/master/docs/ndtable.rst

3. GitHub repositories:
   https://github.com/ContinuumIO/blazeprototype
   https://github.com/ContinuumIO/ArrayServer

Blaze could boost FOSS numerical computing if done successfully. It would create a powerful and versatile numerics tool for multiple target machines. But there is a catch. Continuum Analytics is a company with commercial interests. In the principles.txt document, they state that they intend to publish the specifications for Blaze and a 'reference implementation' as open source, but an optimized version will be closed source.
This is just like BitTorrent, whose protocol is open but whose canonical implementation is closed. Other developers have produced good free implementations like Transmission. In the same way, Blaze will need optimized free implementations for its various backends. Developers need to step up to this, and there is a good opportunity too. For example, someone could do an implementation for Intel MIC or a DSP coprocessor as a (very large) college project.

If anyone is interested, please share your comments. More insights on the design documentation, your thoughts about the project, and expressions of interest are welcome.

Regards,
Gokul Das
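
P.S. For anyone who wants to see what lazy evaluation looks like in practice, here is a minimal sketch in plain Python. To be clear, this is not Blaze's API or its actual implementation: the names Expr, Leaf, BinOp, factor, evaluate, count_ops and show are all made up for illustration. It only demonstrates the general idea from the lazy evaluation point above: operators build an expression tree instead of computing, the tree can be rewritten (here just the single pattern a*b + a*c -> a*(b+c)), and the arithmetic happens only when a result is requested.

```python
from dataclasses import dataclass


class Expr:
    """Base class: operators build tree nodes instead of computing values."""

    def __add__(self, other):
        return BinOp("+", self, other)

    def __mul__(self, other):
        return BinOp("*", self, other)


@dataclass(frozen=True)
class Leaf(Expr):
    """A named input whose value is supplied only at evaluation time."""
    name: str


@dataclass(frozen=True)
class BinOp(Expr):
    """An unevaluated binary operation ("+" or "*") on two sub-expressions."""
    op: str
    left: Expr
    right: Expr


def factor(node):
    """Rewrite a*b + a*c as a*(b + c) wherever that exact pattern occurs."""
    if isinstance(node, Leaf):
        return node
    left, right = factor(node.left), factor(node.right)
    if (node.op == "+"
            and isinstance(left, BinOp) and left.op == "*"
            and isinstance(right, BinOp) and right.op == "*"
            and left.left == right.left):
        return BinOp("*", left.left, BinOp("+", left.right, right.right))
    return BinOp(node.op, left, right)


def evaluate(node, env):
    """Walk the tree and do the real arithmetic using values from env."""
    if isinstance(node, Leaf):
        return env[node.name]
    lhs, rhs = evaluate(node.left, env), evaluate(node.right, env)
    return lhs + rhs if node.op == "+" else lhs * rhs


def count_ops(node, op):
    """Count how many times the operator op appears in the tree."""
    if isinstance(node, Leaf):
        return 0
    return (node.op == op) + count_ops(node.left, op) + count_ops(node.right, op)


def show(node):
    """Render the tree as a fully parenthesized string."""
    if isinstance(node, Leaf):
        return node.name
    return f"({show(node.left)} {node.op} {show(node.right)})"


a, b, c = Leaf("a"), Leaf("b"), Leaf("c")
expr = a * b + a * c        # builds a tree; no arithmetic happens here
optimized = factor(expr)    # rewritten tree with one multiplication fewer

print(show(expr))                                       # ((a * b) + (a * c))
print(show(optimized))                                  # (a * (b + c))
print(evaluate(optimized, {"a": 2, "b": 3, "c": 4}))    # 14
```

Since evaluate only relies on + and *, the same sketch should also work if the values in env are NumPy arrays rather than plain numbers. A real engine would of course need many more rewrite rules and a proper evaluation backend; this is only meant to make the idea concrete.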