Hello,

TACO consists of three things:
- An array API
- A scheduling language
- A language for describing the sparse storage formats (modes) of tensors
So it combines arrays with scheduling, and adds sparse tensors for a lot of 
different applications. It also includes an auto-scheduler. The generated code 
is on par with, or faster than, e.g. MKL and other equivalent libraries, and 
it can fuse arbitrary expressions. For more complicated expressions involving 
sparse operands, this fusion is big-O superior to composing the individual 
operations.
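To illustrate why fusion is a big-O win with sparse operands, here is a plain 
NumPy sketch of the idea (not TACO code; the names and sizes are invented): 
for a sampled product A = B * (C @ D) with B sparse, composing the operations 
materializes the dense n-by-n intermediate C @ D, which is O(n^2 k) work, 
while a fused kernel evaluates (C @ D)[i, j] only at the nnz(B) stored 
positions, which is O(nnz * k) work.

```python
import numpy as np

# Sketch: A = B * (C @ D) elementwise, with B sparse (COO-style arrays).
# Composed: build the full dense n x n intermediate C @ D  -> O(n^2 * k).
# Fused: evaluate (C @ D)[i, j] only where B has stored entries -> O(nnz * k).

rng = np.random.default_rng(0)
n, k, nnz = 100, 8, 50

# Sparse B as COO arrays, with distinct positions so both paths agree.
flat = rng.choice(n * n, size=nnz, replace=False)
rows, cols = flat // n, flat % n
vals = rng.standard_normal(nnz)

C = rng.standard_normal((n, k))
D = rng.standard_normal((k, n))

# Composed version: dense intermediate.
dense_B = np.zeros((n, n))
dense_B[rows, cols] = vals
composed = dense_B * (C @ D)

# Fused version: per stored entry, a single length-k dot product.
fused_vals = vals * np.sum(C[rows] * D[:, cols].T, axis=1)

assert np.allclose(composed[rows, cols], fused_vals)
```

The fused path never touches the zero positions of B, which is where the 
asymptotic advantage comes from.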

The limitations are:
- Right now it can only compute Einstein-summation-type expressions. We’re 
(along with Rawn, another member of the TACO team) trying to extend that to 
arbitrary pointwise expressions and reductions (such as exp(tensor), 
sum(tensor), ...).
- It requires a C compiler at runtime. We’re writing an LLVM backend for it 
that will hopefully remove that requirement.
- It can’t do arbitrary non-pointwise functions, e.g. SVD or inverse. This is 
a long way from being completely solved.
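To make that scope concrete, here is a NumPy sketch (not TACO syntax; the 
arrays are made up for illustration): Einstein-summation-type expressions are 
products with reductions over repeated indices, which np.einsum can express; 
pointwise functions and standalone reductions are the planned extension; and 
SVD or inverse fall in the non-pointwise class.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 5))
B = rng.standard_normal((5, 6))
v = rng.standard_normal(6)

# In scope today: Einstein-summation-type expressions -- products with
# reductions over repeated indices, e.g. y_i = sum_{j,k} A_ij B_jk v_k.
y = np.einsum('ij,jk,k->i', A, B, v)

# The planned extension: pointwise functions and plain reductions.
z = np.exp(A)    # pointwise function applied to a tensor
s = A.sum()      # standalone reduction

# Out of reach for now: non-pointwise functions such as SVD.
U, sigma, Vt = np.linalg.svd(A)
```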

As for why not Numba/llvmlite: rewriting TACO would be a large, difficult 
task; wrapping and extending it is much easier.

Best regards,
Hameer Abbasi

--
Sent from Canary (https://canarymail.io)

> On Wednesday, Nov. 25, 2020 at 9:07 AM, YueCompl <compl....@icloud.com> 
> wrote:
> Great to know.
>
> Skimmed through the project readme. So TACO currently generates C code as 
> an intermediate language; if the purpose is tensors, why not use Numba's 
> llvmlite for it?
>
> I'm aware that scheduling code tends not to be an array program, and 
> llvmlite may be tailored too much toward optimizing more general programs. 
> How is TACO doing in this regard?
>
> Compl
>
> > On 2020-11-25, at 02:27, Hameer Abbasi <einstein.edi...@gmail.com> wrote:
> > Hello,
> >
> > We’re trying to do a part of this in the TACO team, with a Python wrapper 
> > in the form of PyData/Sparse. It will allow abstract array operations and 
> > scheduling to take place, but there are a bunch of constraints, the most 
> > important one being that a C compiler cannot be required at runtime.
> >
> > However, this may take a while to materialize, as we need an LLVM backend, 
> > and a Python wrapper (matching the NumPy API), and support for arbitrary 
> > functions (like universal functions).
> >
> > https://github.com/tensor-compiler/taco
> > http://fredrikbk.com/publications/kjolstad-thesis.pdf
> >
> >
> > > On Tuesday, Nov. 24, 2020 at 7:22 PM, YueCompl <compl....@icloud.com> 
> > > wrote:
> > > Is there some community interest to develop fusion-based 
> > > high-performance array programming? Something like 
> > > https://github.com/AccelerateHS/accelerate#an-embedded-language-for-accelerated-array-computations, 
> > > but that embedded DSL is far less pleasing than Python as the surface 
> > > language for optimized Numpy code in C.
> > >
> > > I imagine that we might be able to transpile a Numpy program into fused 
> > > LLVM IR, then deploy part as host code on CPUs and part as CUDA code on 
> > > GPUs?
> > >
> > > I know Numba is already doing the array part, but it is too limited in 
> > > addressing more complex non-array data structures. I had been processing 
> > > ~20K separate data series with some intermediate variables for each; it 
> > > took 30+ GB of RAM and kept compiling, yet gave no result after 10+ 
> > > hours.
> > >
> > > Compl
> > >
> > >
> > > > On 2020-11-24, at 23:47, PIERRE AUGIER 
> > > > <pierre.aug...@univ-grenoble-alpes.fr> wrote:
> > > > Hi,
> > > >
> > > > I recently took a bit of time to study the comment "The ecological 
> > > > impact of high-performance computing in astrophysics" published in 
> > > > Nature Astronomy (Zwart, 2020, 
> > > > https://www.nature.com/articles/s41550-020-1208-y, 
> > > > https://arxiv.org/pdf/2009.11295.pdf), where it is stated that "Best 
> > > > however, for the environment is to abandon Python for a more 
> > > > environmentally friendly (compiled) programming language.".
> > > >
> > > > I wrote a simple Python-Numpy implementation of the problem used for 
> > > > this study (https://www.nbabel.org) and, accelerated by 
> > > > Transonic-Pythran, it's very efficient. Here are some numbers (elapsed 
> > > > times in s, smaller is better):
> > > >
> > > > | # particles | Py  | C++ | Fortran | Julia |
> > > > |-------------|-----|-----|---------|-------|
> > > > | 1024        | 29  | 55  | 41      | 45    |
> > > > | 2048        | 123 | 231 | 166     | 173   |
> > > >
> > > > The code and a modified figure are here: 
> > > > https://github.com/paugier/nbabel (There is no check on the results 
> > > > for https://www.nbabel.org, so one still has to be very careful.)
> > > >
> > > > I think that the Numpy community should spend a bit of energy to show 
> > > > what can be done with the existing tools to get very high performance 
> > > > (and low CO2 production) with Python. This work could be the basis of a 
> > > > serious reply to the comment by Zwart (2020).
> > > >
> > > > Unfortunately the Python solution in https://www.nbabel.org is very 
> > > > bad in terms of performance (and therefore CO2 production). It is also 
> > > > true for most of the Python solutions for the Computer Language 
> > > > Benchmarks Game in 
> > > > https://benchmarksgame-team.pages.debian.net/benchmarksgame/ (codes 
> > > > here 
> > > > https://salsa.debian.org/benchmarksgame-team/benchmarksgame#what-else).
> > > >
> > > > We could try to fix this so that people see that in many cases, it is 
> > > > not necessary to "abandon Python for a more environmentally friendly 
> > > > (compiled) programming language". One of the longest and hardest 
> > > > tasks would be to implement the different cases of the Computer 
> > > > Language Benchmarks Game in standard and modern Python-Numpy. Then, 
> > > > optimizing and accelerating such code should be doable, and we should 
> > > > be able to get very good performance at least for some cases. Good 
> > > > news for this project: (i) the first point can be done by anyone with 
> > > > good knowledge of Python-Numpy (many potential workers), (ii) for some 
> > > > cases, there are already good Python implementations, and (iii) the 
> > > > work can easily be parallelized.
> > > >
> > > > It is not a criticism, but the (beautiful and very nice) new Numpy 
> > > > website https://numpy.org/ is not very convincing in terms of 
> > > > performance. It says: "Performant. The core of NumPy is well-optimized 
> > > > C code. Enjoy the flexibility of Python with the speed of compiled 
> > > > code." It's true that the core of Numpy is well-optimized C 
> > > > code but to seriously compete with C++, Fortran or Julia in terms of 
> > > > numerical performance, one needs to use other tools to move the 
> > > > compiled-interpreted boundary outside the hot loops. So it could be 
> > > > reasonable to mention such tools (in particular Numba, Pythran, Cython 
> > > > and Transonic).
> > > >
> > > > Is there already something planned to answer Zwart (2020)?
> > > >
> > > > Any opinions or suggestions on this potential project?
> > > >
> > > > Pierre
> > > >
> > > > PS: Of course, alternative Python interpreters (PyPy, GraalPython, 
> > > > Pyjion, Pyston, etc.) could also be used, especially if HPy 
> > > > (https://github.com/hpyproject/hpy) is successful (C core of Numpy 
> > > > written in HPy, Cython able to produce HPy code, etc.). However, I tend 
> > > > to be a bit skeptical about the ability of such technologies to reach very 
> > > > high performance for low-level Numpy code (performance that can be 
> > > > reached by replacing whole Python functions with optimized compiled 
> > > > code). Of course, I hope I'm wrong! IMHO, it does not remove the need 
> > > > for a successful HPy!
> > > >
> > > > --
> > > > Pierre Augier - CR CNRS http://www.legi.grenoble-inp.fr
> > > > LEGI (UMR 5519) Laboratoire des Ecoulements Geophysiques et Industriels
> > > > BP53, 38041 Grenoble Cedex, France tel:+33.4.56.52.86.16
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion
