On Thu, Oct 21, 2010 at 6:04 PM, Darren Duncan <dar...@darrenduncan.net> wrote:
> Aaron Sherman wrote:
>>
>> Things that typically precipitate threading in an application:
>>
>>    - Blocking IO
>>    - Event management (often as a crutch to avoid asynchronous code)
>>    - Legitimately parallelizable, intense computing
>>
>> Interestingly, the first two tend to be where most of the need comes from
>> and the last one tends to be what drives most discussion of threading.
>
> The last one in particular would legitimately get attention when one
> considers that it is for this that the concern about using multi-core
> machines efficiently comes into play.

That sounds great, but what's the benefit to a common use case? Sorting
lists with higher processor overhead and waste heat in applications that
traditionally weren't processor-bound in the first place?

Over the past 20+ years, I've seen some very large, processor-bound
applications that could (and in some cases, did) benefit from threading
over multiple cores. However, they were so far in the minority as to be
nearly invisible, and in many cases such applications can simply be run
multiple times per host in order to VERY efficiently consume every
available processor. The vast majority of my computing experience has been
in places where I'm actually willing to use Perl, a grossly inefficient
language (I say this coming as I do from C, not in comparison to other
HLLs), because my performance concerns are either non-existent or related
almost entirely to non-trivial IO (i.e. anything sendfile can do).

> The first 2 are more about lowering latency and appearing responsive to a
> user on a single core machine.

Write me a Web server, and we'll talk. Worse, write a BitTorrent client
that tries to store its results into a high-performance, local datastore
without reducing theoretical, back-of-the-napkin throughput by a staggering
amount. Shockingly enough, neither of these frequently used examples is
processor-bound.
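To make the Web-server point concrete, here's a quick sketch (mine, in
Python rather than Perl 6, since that's easiest to show runnable) of the
"event management" item from the list above: a single-threaded echo server
that stays responsive to any number of connections on one core, with no
threads anywhere. Names like `serve` and the port are just illustrative.

```python
# A single-threaded, event-driven echo server: all concurrency comes from
# multiplexing readiness events, not from threads or multiple cores.
import selectors
import socket

sel = selectors.DefaultSelector()

def accept(server_sock):
    """Register each new client connection for read events."""
    conn, _addr = server_sock.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, echo)

def echo(conn):
    """Service one readable client; an empty read means the peer closed."""
    data = conn.recv(4096)
    if data:
        conn.sendall(data)      # echo the buffer straight back
    else:
        sel.unregister(conn)
        conn.close()

def serve(port):
    server = socket.socket()
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("127.0.0.1", port))
    server.listen()
    server.setblocking(False)
    sel.register(server, selectors.EVENT_READ, accept)
    while True:
        # Block until some socket is ready, then dispatch its callback.
        for key, _mask in sel.select():
            key.data(key.fileobj)
```

The whole thing is IO-bound: the process spends nearly all of its time
asleep in `select()`, waiting on buffers, which is exactly why adding cores
buys it almost nothing.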
The vast majority of today's applications are written with network
communications in mind to one degree or another. "The user" isn't as
interesting as servicing network and disk IO responsively enough that
hardware and network protocol stacks wait on you to empty or fill a buffer
as infrequently as possible. This is essential in such rare circumstances
as:

 - Database-intensive applications
 - Moving large data files across wide area networks
 - Parsing and interpreting highly complex languages inline from data
   received over multiple, simultaneous network connections (sounds like
   this should be rare, but your browser does it every time you click on a
   link)

Just in working with Rakudo, I have to use git, make and Perl itself, all
of which can improve CPU performance all they like, but will ultimately run
slow if they don't handle reading dozens of files, possibly from multiple
IO devices (disks, network filesystems, remote repositories, etc.) as
responsively as possible.

Now, to back up and think this through, there is one place where multi-core
processor usage is going to become critical over the next few years:
phones. Android-based phones are going multi-core within the next six
months. My money is on a multi-core iPhone within a year. These platforms
are going to need to take advantage of multiple cores for primarily
single-application performance in a low-power environment.

So, I don't want you to think that I'm blind to the need you describe. I
just don't want you to be unrealistic about the application balance out
there.

> I think that Perl 6's implicit multi-threading approach such as for
> hyperops or junctions is a good best first choice to handle many common
> needs, the last list item above, without users having to think about it.
> Likewise any pure functional code.
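For what it's worth, I agree that's the right shape for the feature. A
hyperop like `@a »+« @b` promises only that the per-element operation has
no ordering dependency, which leaves the runtime free to fan it out across
cores. A rough analogue of that contract, sketched in Python rather than
Perl 6 (the `hyper_map` and `double` names are mine, purely illustrative):

```python
# Implicit data parallelism: the caller states an element-wise operation
# and never mentions threads, processes, or scheduling.
from multiprocessing import Pool

def double(n):
    # Any side-effect-free per-element function qualifies.
    return n * 2

def hyper_map(func, values, workers=4):
    """Apply func element-wise, possibly in parallel across processes.

    Order of results is preserved, just as a hyperop preserves the
    shape of its operands."""
    with Pool(processes=workers) as pool:
        return pool.map(func, values)

if __name__ == "__main__":
    print(hyper_map(double, [1, 2, 3, 4]))  # [2, 4, 6, 8]
```

The user-visible semantics are identical whether this runs on one core or
eight, which is exactly what makes it safe to parallelize "without users
having to think about it."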
> -- Darren Duncan

It's very common for people working on the design or implementation of a
programming language to become myopic with respect to the importance of
executing code as quickly as possible, and I'm not faulting anyone for
that. It's probably a good thing in most circumstances, but in this case,
assuming that the largest need is going to be the execution of code turns
out to be a misleading instinct. Computers execute code far, far less than
you would expect, and the cost of failing to service events is often orders
of magnitude greater than the cost of spending twice the number of cycles
doing so.

PS: Want an example of how important IO is? Google has their own multi-core
friendly network protocol modifications to Linux that have been pushed out
in the past 6 months:

http://www.h-online.com/open/features/Kernel-Log-Coming-in-2-6-35-Part-3-Network-support-1040736.html

They had to do this because single cores can no longer keep up with the
network.