
Moore's Law is not dead. It's merely pining for the fjords
By Jon Stokes | Last updated May 5, 2010 11:04 AM

What if I took to the pages of a major business magazine and made the
bold recommendation that, because humans have run out of new places on
Earth that we can migrate to, it is past time for us to make the
collective leap to faster-than-light travel so that we can explore
neighboring solar systems? Your reaction would probably be something
like, "Yes, of course everyone would love to go faster than light, and
if it were as easy as just deciding we all want to do it then it would
be done already."

That's pretty much how I felt after reading a recent [Forbes op-ed][1]
by NVIDIA's Bill Dally, in which he declares, "It is past time for the
computing industry—and everyone who relies on it for continued
improvements in productivity, economic growth and social progress—to
take the leap into parallel processing." Obviously yes, we would all
love to just magically jump right into parallel processing, and
transform all of our existing serial workloads into parallel workloads.
But there are two big problems: 1) nobody knows the percentage of
existing serial workloads that can be usefully parallelized (but it's
probably small), and 2) parallel programming is *hard*.

 [1]:
http://www.forbes.com/2010/04/29/moores-law-computing-processing-opinions-contributors-bill-dally.html

### A lot of serial, not as much parallel

Note that in the preceding paragraph, I spoke of "workloads" and not
"programs." That's because the problem isn't that existing software has
been written one way and it needs to be rewritten in some new way. It's
that the tasks that the software carries out are inherently serial. Of
course, Dally is well aware of this distinction, but he conveniently
ignores it because it doesn't help his point. However, the example that
Dally uses to distinguish serial from parallel is actually a very good
illustration of why we can't just rewrite serial software and make it
parallel.

Here's Dally's analogy: "Reading this essay is a serial process—you read
one word after another. But counting the number of words, for example,
is a problem best solved using parallelism. Give each paragraph to a
different person, and the work gets done far more quickly." Yep, the
process of reading is definitely serial—there's *no way* to accomplish
the task in parallel (believe me, I've tried), and no amount of
programming wizardry will make it otherwise. Word counts, on the other
hand, can be done either in serial or in parallel; but counting words is
a much less interesting and useful undertaking than reading.
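
To make Dally's analogy concrete, here's a minimal sketch (in Python, with
a made-up `paragraphs` list) of the word-count half of it: each paragraph
can be counted independently by a separate worker, and the partial counts
are summed at the end. There's no analogous trick for the act of reading
itself.

```python
# Dally's word-count example in sketch form: counting is "embarrassingly
# parallel" because each paragraph can be counted on its own and the
# partial counts simply added together at the end.
from multiprocessing import Pool

def count_words(paragraph):
    # Each worker does the same small, independent job.
    return len(paragraph.split())

def parallel_word_count(paragraphs):
    with Pool() as pool:                    # one worker process per core
        partial_counts = pool.map(count_words, paragraphs)
    return sum(partial_counts)              # cheap serial reduction at the end

if __name__ == "__main__":
    paragraphs = ["Reading this essay is a serial process.",
                  "But counting the words is not."]
    print(parallel_word_count(paragraphs))  # -> 13
```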

As with reading vs. word counts, it has so far turned out that the bulk
of ordinary computing tasks that are interesting and worthwhile are
serial tasks; the parallel stuff, while critically important in a few
key verticals, is niche. This is unfortunate for NVIDIA, because NVIDIA
is in the parallel business. Now, the set of "interesting things that we
want to do with computers that are best done in parallel" could one day
grow larger than the set of "interesting things that we want to do with
computers that can only be done serially," and if it does, that will be
great for everyone (not just NVIDIA); but so far we appear to be on
track for the opposite outcome.

Ultimately, NVIDIA's fundamental problem boils down to this simple fact:
you can do parallel tasks in a serial manner, but you can't do serial
tasks in a parallel manner. For most of computing's history the hardware
on offer has been serial, so parallel tasks have tended to be done
serially simply because that was the hardware available. Some percentage
of those tasks can be rethought to work in parallel, but, as I said
above, so far this percentage has been disappointingly low.
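
To make that asymmetry concrete, here's a small sketch of my own (in
Python, with made-up examples, not anything from Dally's piece). The
first function is a parallel-friendly task that a serial machine handles
just fine, one element at a time; the second has a loop-carried
dependence, so no number of extra cores can help it.

```python
# The asymmetry in sketch form: a parallel-friendly task runs fine on
# serial hardware, but a task whose steps depend on each other offers
# nothing to split up.

def parallel_friendly(xs):
    # Each element is handled independently; a plain serial loop works,
    # and so would farming the elements out to as many cores as you have.
    return [x * x for x in xs]

def inherently_serial(x0, n):
    # Each iteration consumes the previous iteration's result, so step i
    # cannot start until step i-1 has finished, no matter how many cores
    # are sitting idle.
    x = x0
    for _ in range(n):
        x = 3.9 * x * (1.0 - x)  # logistic map: a simple dependent recurrence
    return x

print(parallel_friendly([1, 2, 3]))  # [1, 4, 9]
print(inherently_serial(0.5, 10))
```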

### Are our programmers learning?

Dally talks a bit about practices and approaches, as if writing parallel
software is mainly a matter of tools and training. Would that it were so.

There are some folks who honestly believe that if we gave computer
science students the right tools for explicitly expressing parallelism
and we totally reformed the comp sci curriculum so that students are
trained to use these tools from day one, we'd enter into some sort of
golden age of parallelism. But the number of people who think this way
is shrinking, at least from what I've informally observed. This issue
came up in an untranscribed portion of the [conversation][2] that I had
with Stanford president and RISC pioneer John Hennessy, and it has come
up in many other conversations that I've had since with folks in the
field: most humans just don't seem to be wired to learn parallel
programming at the level that our processor hardware now
demands. It's not that it can't be done—a few people can really take to
it and do it well. But, like the innate potential to become a chess
grandmaster, the innate potential to be a real parallel-programming
wizard is not evenly distributed in the population. Some professors and
CS grad students I've talked to have observed the same thing I have:
that when you get to the point in a computer science curriculum where
you introduce parallel programming, you lose too many students.

 [2]:
http://arstechnica.com/tech-policy/news/2009/04/ars-asks-stanfords-president-what-would-you-do-with-800-billion.ars/2

Right now, it's probably fair to say that the "tools and training will
fix it" school of thought is still the mainstream, but as core counts
increase and the gap widens between our hardware's peak theoretical
performance and what real-world, ordinary programmers can actually get
out of that hardware, we'll see more people concede that programmers
just need to be able to write serial code that the machine then
parallelizes for them. In other words, the programmer will get the board
ready, and some software and/or hardware designed by grandmasters will
then take over and do all of the grandmaster-level chess playing.
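
As a sketch of what that division of labor can look like today (my own
illustration, using Python's standard library rather than anything Dally
or Hennessy proposes): the programmer writes an ordinary, serial-looking
per-item function, and a process pool written by someone else decides how
to spread the calls across cores. The `score_document` function and
`documents` list are made-up names for illustration.

```python
# "Setting up the board": the application programmer writes plain,
# serial-looking per-item code; a library built by parallelism experts
# handles the actual distribution of work across cores.
from concurrent.futures import ProcessPoolExecutor

def score_document(text):
    # Pure, independent, per-item work: no shared state, no locks,
    # nothing for the application programmer to synchronize.
    return sum(len(word) for word in text.split())

if __name__ == "__main__":
    documents = ["first document", "second document", "a third, longer document"]

    # Serial version: the built-in map, one call after another.
    serial_scores = list(map(score_document, documents))

    # The "grandmaster" layer: the code has the same shape, but the
    # executor's map fans the calls out to worker processes behind the scenes.
    with ProcessPoolExecutor() as pool:
        parallel_scores = list(pool.map(score_document, documents))

    assert parallel_scores == serial_scores
```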

### Right about one thing: it will take time

Responding to this essay is ultimately like fighting a heavyweight boxer
who has both hands tied behind his back. Dally is a smart guy with
impeccable credentials, but in this short, nonspecialist format he
appears to have fallen into a common trap that ensnares specialists of
all types when they're pressed, for commercial reasons, into writing for
a lay audience they're not practiced at addressing: he got just enough
rope to hang himself. I would love to read a longer technical essay in
which he makes whatever points he intended to make with this piece, but
on his own terms.

In the end, I can definitely agree with him on one thing: the industry
has a long slog ahead of it before it's ready to start using Moore's Law
to once again deliver the kinds of performance gains (as opposed to
functionality and system-level power efficiency increases) that it gave
us up until recently. That process will take not just time, but a whole
lot more investment in fundamental computer science research.
