Thanks, Hugh! Nicely explained.
../Dave
On Jul 14, 2020, 10:15 AM -0400, D. Hugh Redelmeier via talk <talk@gtalug.org>, wrote:
> | From: David Mason via talk <talk@gtalug.org>
>
> | The short answer is: Machine Learning (and other data-mining-like
> | applications)
>
> A much LONGER answer:
>
> There has been a field of Computing on GPUs for perhaps a dozen years.
> GPUs have evolved into having a LOT of Floating Point units that can
> act simultaneously, mostly in lock-step.
>
> They are nasty to program: conventional high-level languages and
> programmers aren't very good at exploiting GPUs.
>
> NVidia's Cuda (dominant) and the industry-standard OpenCL (struggling)
> are used to program the combination of the host CPU and the GPU.
>
> Generally, a set of subroutines is written to exploit a GPU and those
> subroutines get called by conventional programs. Examples of such
> libraries: TensorFlow, PyTorch, OpenBLAS. The first two are for
> machine learning.
>
> Some challenges GPU programmers face:
>
> - GPUs cannot do everything that programmers are used to. A program
>   using a GPU must be composed of a Host CPU program and a GPU
>   program. (Some languages let you do the split within a single
>   program, but there still is a split.)
>
> - GPU programming requires a lot of effort designing how data gets
>   shuffled in and out of the GPU's dedicated memory. Without care,
>   the time eaten by this can easily overwhelm the time saved by using
>   a GPU instead of just the host CPU.
>
>   Like any performance problem, one needs to measure to get an
>   accurate understanding. The result might easily suggest massive
>   changes to a program.
>
> - Each GPU links its ALUs into fixed-size groups. Problems must be
>   mapped onto these groups, even if that isn't natural. A typical
>   size is 64 ALUs. Each ALU in a group is either executing the same
>   instruction, or is idled.
>
>   OpenCL and Cuda help the programmer create doubly-nested loops that
>   map well onto this hardware.
>
>   Lots of compute-intensive algorithms are not easy to break down
>   into this structure.
>
> - GPUs are not very good at conventional control-flow. And it is
>   different from what most programmers expect. For example, when an
>   "if" is executed, all compute elements in a group are tied up, even
>   if they are not active. Think how this applies to loops.
>
> - Each GPU is kind of different; it is hard to program generically.
>   This is made worse by the fact that Cuda, the most popular
>   language, is proprietary to NVidia. Lots of politics here.
>
> - GPUs are not easy to share safely amongst multiple processes. This
>   is slowly improving.
>
> - New GPUs are getting better, so one should perhaps revisit existing
>   programs regularly.
>
> - GPU memories are not virtual. If you hit the limit of memory on a
>   card, you've got to change your program.
>
>   Worse: there is a three-or-more-level hierarchy of fixed-size
>   memories within the GPU that needs to be explicitly managed.
>
> - GPU software is oriented to performance. Compile times are long.
>   Debugging is hard and different.
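To make a few of those challenges concrete, here is a minimal Cuda
sketch (my addition, not from Hugh's message; the kernel and names are
just for illustration). It shows the host/GPU program split, the
explicit shuffling of data in and out of the GPU's dedicated memory,
and the mapping of work onto fixed-size groups of threads:

    // saxpy.cu: build with "nvcc saxpy.cu -o saxpy"
    #include <cstdio>
    #include <cuda_runtime.h>

    // The GPU program: every thread in every group runs this same code.
    __global__ void saxpy(int n, float a, const float *x, float *y)
    {
        // Map the (group, thread-in-group) pair onto a flat index.
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)                   // threads past the end sit idle,
            y[i] = a * x[i] + y[i];  // as with any divergent "if"
    }

    // The host program: allocate, copy in, launch, copy back out.
    int main()
    {
        const int n = 1 << 20;
        const size_t bytes = n * sizeof(float);

        // Host-side buffers.
        float *x = (float *)malloc(bytes);
        float *y = (float *)malloc(bytes);
        for (int i = 0; i < n; i++) { x[i] = 1.0f; y[i] = 2.0f; }

        // Separate buffers in the GPU's dedicated memory.
        float *dx, *dy;
        cudaMalloc(&dx, bytes);
        cudaMalloc(&dy, bytes);

        // Shuffle data in; this transfer time is part of the real
        // cost and is worth measuring.
        cudaMemcpy(dx, x, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(dy, y, bytes, cudaMemcpyHostToDevice);

        // Launch n threads in fixed-size groups of 256.
        const int perGroup = 256;
        const int groups = (n + perGroup - 1) / perGroup;
        saxpy<<<groups, perGroup>>>(n, 2.0f, dx, dy);

        // Shuffle the result back out.
        cudaMemcpy(y, dy, bytes, cudaMemcpyDeviceToHost);
        printf("y[0] = %f\n", y[0]);   // expect 4.0

        cudaFree(dx); cudaFree(dy); free(x); free(y);
        return 0;
    }

The same shape (allocate device buffers, copy in, launch over groups,
copy out) is what OpenCL spells out even more verbosely.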
> Setting up the hardware and software for GPU computing is stupidly
> challenging. Alex gave a talk to GTALUG (video available) about his
> playing with this. Here's what I remember:
>
> - AMD is mostly open source but not part of most distros (why???).
>   You need to use select distros plus out-of-distro software.
>   Support for APUs (AMD processor chips with built-in GPUs) is still
>   missing (dumb).
>
> - NVidia is closed source. Alex found it easier to get going. Still
>   work: it requires out-of-distro software.
>
> - He didn't try Intel. Ubiquitous, but not popular for GPU computing
>   since all its units are integrated and thus limited in crunch.
>
>   Intel, being behind, is the nicest player.
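As a quick sanity check after any of those setup routes, a probe like
this (my sketch, using the stock Cuda runtime API; not something from
Alex's talk) tells you whether the toolchain actually sees a GPU:

    // devcheck.cu: build with "nvcc devcheck.cu -o devcheck"
    #include <cstdio>
    #include <cuda_runtime.h>

    int main()
    {
        int n = 0;
        cudaError_t err = cudaGetDeviceCount(&n);
        if (err != cudaSuccess) {
            // A broken install usually fails right here, before any
            // kernel ever runs.
            printf("CUDA error: %s\n", cudaGetErrorString(err));
            return 1;
        }
        for (int i = 0; i < n; i++) {
            cudaDeviceProp p;
            cudaGetDeviceProperties(&p, i);
            // warpSize is the fixed group size Hugh mentions above
            // (32 on NVidia GPUs; AMD's wavefront is typically 64).
            printf("GPU %d: %s, %zu MiB, group size %d\n",
                   i, p.name, p.totalGlobalMem >> 20, p.warpSize);
        }
        return 0;
    }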
---
Post to this mailing list talk@gtalug.org
Unsubscribe from this mailing list https://gtalug.org/mailman/listinfo/talk