On Thu, Oct 21, 2010 at 6:04 PM, Darren Duncan <dar...@darrenduncan.net> wrote:

> Aaron Sherman wrote:
>
>> Things that typically precipitate threading in an application:
>>
>>   - Blocking IO
>>   - Event management (often as a crutch to avoid asynchronous code)
>>   - Legitimately parallelizable, intense computing
>>
>> Interestingly, the first two tend to be where most of the need comes from
>> and the last one tends to be what drives most discussion of threading.
>

> The last one in particular would legitimately get attention when one
> considers that it is for this that the concern about using multi-core
> machines efficiently comes into play.


That sounds great, but what's the benefit to a common use case? Sorting
lists with higher processor overhead and waste heat in applications that
traditionally weren't processor-bound in the first place?

Over the past 20+ years, I've seen some very large, processor-bound
applications that could (and in some cases, did) benefit from threading over
multiple cores. However, they were so far in the minority as to be nearly
invisible, and in many cases such applications can simply be run multiple
times per host in order to VERY efficiently consume every available
processor.
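
Something like this sketch, assuming a Proc::Async-style API (the
worker.p6 script and its --shard flag are made up for illustration):

    my @workers = (1..4).map: -> $n {
        # Each worker is a separate OS process, so the kernel spreads
        # them across cores for us: no shared state, no locks.
        my $proc = Proc::Async.new('perl6', 'worker.p6', "--shard=$n");
        $proc.start;    # a Promise that is kept when the process exits
    };
    await @workers;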

The vast majority of my computing experience has been in places where I'm
actually willing to use Perl, a grossly inefficient language (I say this,
coming as I do from C, not in comparison to other HLLs), because my
performance concerns are either non-existent or related almost entirely to
non-trivial IO (i.e. anything sendfile can do).


>  The first 2 are more about lowering latency and appearing responsive to a
> user on a single core machine.


Write me a Web server, and we'll talk. Worse, write a BitTorrent client that
tries to store its results into a high-performance local datastore without
reducing theoretical, back-of-the-napkin throughput by a staggering amount.
Shockingly enough, neither of these frequently used examples is
processor-bound.
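
To make that concrete, here is a sketch of a single line-echo server in
Perl 6, assuming an IO::Socket::Async-style non-blocking API; it spends
essentially all of its time waiting on the network, not computing:

    react {
        whenever IO::Socket::Async.listen('0.0.0.0', 8080) -> $conn {
            # The react block multiplexes every connection; the handler
            # only runs when a buffer actually has data in it.
            whenever $conn.Supply.lines -> $line {
                $conn.print("you said: $line\n");
            }
        }
    }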

The vast majority of today's applications are written with network
communications in mind to one degree or another. "The user" isn't so much
the interesting part as servicing network and disk IO responsively enough
that the hardware and network protocol stacks wait on you to empty or fill a
buffer as infrequently as possible. This is essential in such rare
circumstances as:

   - Database intensive applications
   - Moving large data files across wide area networks
   - Parsing and interpreting highly complex languages inline from
   data received over multiple, simultaneous network connections (sounds like
   this should be rare, but your browser does it every time you click on a
   link)

Just in working with Rakudo, I have to use git, make, and Perl itself, all
of which can improve CPU performance all they like, but will ultimately run
slowly if they don't handle reading dozens of files, possibly from multiple
IO devices (disks, network filesystems, remote repositories, etc.), as
responsively as possible.
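
As a toy illustration, here is a sketch that slurps a batch of files
concurrently (the 'src' directory is made up); whether it actually goes
faster is decided by the disks and the filesystem, not by the core count:

    my @files = dir('src').grep(*.f);     # all plain files in 'src'
    my @jobs  = @files.map: -> $file {
        # Each start block is mostly IO-bound work handed to the
        # scheduler; it yields a basename => contents pair when done.
        start { $file.basename => $file.slurp }
    };
    my %contents = await @jobs;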

Now, to back up and think this through, there is one place where multi-core
processor usage is going to become critical over the next few years: phones.
Android-based phones are going multi-core within the next six months. My
money is on a multi-core iPhone within a year. These platforms are going to
need to take advantage of multiple cores primarily for single-application
performance in a low-power environment.

So, I don't want you to think that I'm blind to the need you describe. I
just don't want you to be unrealistic about the application balance out
there.


> I think that Perl 6's implicit multi-threading approach such as for
> hyperops or junctions is a good best first choice to handle many common
> needs, the last list item above, without users having to think about it.
>  Likewise any pure functional code. -- Darren Duncan
>
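
(For anyone following along, the implicit parallelism Darren means looks
roughly like this sketch; the spec allows, but does not require, an
implementation to spread the work across cores:)

    my @a = 1 .. 100_000;
    my @b = 100_000 ... 1;

    # Hyperoperator: elementwise addition with no promised evaluation
    # order, so an implementation is free to parallelize it.
    my @sums = @a »+« @b;

    # Junction: the == is logically applied against all three values at
    # once, again with no ordering guarantee.
    say "interesting port" if 8080 == any(80, 443, 8080);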

It's very common for people working on the design or implementation of a
programming language to become myopic with respect to the importance of
executing code as quickly as possible, and I'm not faulting anyone for that.
It's probably a good thing in most circumstances, but in this case, assuming
that the largest need is going to be the execution of code turns out to be a
misleading instinct. Computers execute code far, far less than you would
expect, and the cost of failing to service events is often orders of
magnitude greater than the cost of spending twice the number of cycles doing
so.

PS: Want an example of how important IO is? Google has their own
multi-core-friendly network protocol modifications to Linux that have been
pushed out in the past 6 months:

http://www.h-online.com/open/features/Kernel-Log-Coming-in-2-6-35-Part-3-Network-support-1040736.html

They had to do this because single cores can no longer keep up with the
network.
