On Fri, Oct 22, 2021 at 8:36 PM Matt Mahoney <[email protected]> wrote:
> Unfortunately GPUs aren't very good at handling sparse matrices. Following
> pointers isn't parallel and requires random memory access, which is 50-100
> times slower than sequential.

Well, if you can keep memory accesses on chip, it is more like 5 times slower than sequential, and if you can increase the number of on-chip memory banks crossbar-connected to large numbers of CPUs, you can achieve a lot of parallelism. Of course, that means, even with 7nm fabs, that you need to minimize the size of the model. Fortunately, the ridiculous bragging rights over "large models" hide the fact that we can expect enormous model compression ratios without loss of accuracy -- so there is good reason to believe a small mammal could eat the eggs.

Anyone know a mixed-signal IC consultant?

<https://jimbowery.blogspot.com/2013/04/a-circuit-minimizing-multicore-shared.html>
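
For illustration only, the access pattern under discussion can be sketched as a sparse matrix-vector product in compressed sparse row (CSR) form. This is a minimal pure-Python sketch, not anything taken from the thread: the function name `csr_matvec` and the small 3x3 matrix are assumptions made up for the example. It shows where the indirect, data-dependent read happens, not how fast it is on any particular hardware.

```python
def csr_matvec(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form.

    Hypothetical helper for illustration; names are not from the thread.
    """
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        # The nonzeros of row i sit contiguously in values/col_idx,
        # so these reads are sequential.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # x is read through col_idx[k]: an indirect (gather) access
            # whose address is known only after col_idx[k] is loaded.
            y[i] += values[k] * x[col_idx[k]]
    return y


# Assumed example matrix: [[2, 0, 1], [0, 0, 3], [4, 5, 0]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 2, 0, 1]
values = [2.0, 1.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]

y = csr_matvec(row_ptr, col_idx, values, x)
print(y)
```

The reads of `values` and `col_idx` stream in order, while the reads of `x` jump wherever the column indices point. That second pattern is the random access the quoted message refers to, and it is the part that would benefit from staying in on-chip memory banks.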
