FWIW, the real benefits of TACO come from generating code that contracts higher-order sparse tensors, which is difficult to write by hand and unlikely to exist as a kernel in a hand-tuned library. The "novel compiler techniques" mentioned on the website enable the compiler to reason about co-iteration through multiple sparse tensors at a time.
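
To make "co-iteration" concrete, here is a minimal sketch of the simplest case: multiplying two sparse vectors elementwise by walking both sorted index lists at once. This is not TACO output or TACO's API. The function name `sparse_elementwise_mul` and the sorted (indices, values) layout are assumptions chosen for the example. The point is only that the loop structure depends on which operands are sparse, and for higher-order tensors with several sparse operands these merge loops nest and become hard to write by hand.

```python
def sparse_elementwise_mul(a_idx, a_val, b_idx, b_val):
    """Elementwise product of two sparse vectors.

    Each vector is given as a sorted list of nonzero indices plus a
    parallel list of values (hypothetical layout for this sketch).
    """
    out_idx, out_val = [], []
    i = j = 0
    # Co-iterate: advance through both index lists in lockstep.
    while i < len(a_idx) and j < len(b_idx):
        if a_idx[i] == b_idx[j]:
            # Both operands are nonzero here, so the product is stored.
            out_idx.append(a_idx[i])
            out_val.append(a_val[i] * b_val[j])
            i += 1
            j += 1
        elif a_idx[i] < b_idx[j]:
            i += 1  # a has a nonzero that b lacks; the product is zero
        else:
            j += 1  # b has a nonzero that a lacks; the product is zero
    return out_idx, out_val


idx, val = sparse_elementwise_mul([0, 2, 5], [1.0, 2.0, 3.0],
                                  [2, 3, 5], [4.0, 5.0, 6.0])
print(idx, val)
```

Multiplication only needs the intersection of the two index sets, so the loop stops as soon as either list runs out. Addition would need the union instead, which means extra loops to drain whichever list is left over.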
Rohan

On Sun, Dec 12, 2021 at 9:50 AM Mark Adams <mfad...@lbl.gov> wrote:

>> It may be different with the optimization turned on. I am surprised
>> that it is 40 usually it is lower.
>
> Sure, he was underperforming by 4x so he was using ~10 cores of a P9. Way
> below saturation (20-30).
> (but he only got 20x speedup, not sure about that, load balance?, the
> matrix looks huge so probably not communication, but maybe)