David Barbour wrote:
Your approach to parallelism strikes me as simplistic. Like saying the
Earth is at the center of the Solar System, and the Sun goes around the
Earth. It sounds simple. It's "easy to conceptualize". Oh, and it
requires epicyclic orbits to account for every other planet. Doesn't
sound so simple anymore. In this way, simplistic becomes a complexity
multiplier in disguise.
You propose an actor per object. It sounds simple to you, and "easy to
conceptualize". But now programmers face challenges in controlling
latency, and in supporting replay, testing, maintenance, verification,
and consistency. This is in addition to problems hand-waved through,
like line-of-sight and collision detection. It doesn't sound so simple anymore.
The whole point of architecture is to generate the overall outline of a
system, to address a particular problem space within the constraints at
hand. The KISS principle applies (along with "seek simplicity and
distrust it"). If there isn't a degree of simplicity and elegance in an
architecture, the architect hasn't done a particularly good job.
In the past, limitations of hardware, languages, and run-time
environments have dictated against taking parallel (or more accurately,
concurrent) approaches to problems, even when massive concurrency is the
best mapping onto the problem domain - resulting in very ugly code.
Yes, there are additional problems introduced by modeling a problem as
massively concurrent - and those are, I think, areas for fruitful
research. In particular, regarding the ones you cite:
- control latency, support replay, testing, maintenance, verification:
these are nothing new at the systems level (think about either all the
different things that run on a common server, or about all the things
that go on in a distributed system such as the federated collection of
SMTP servers that we're relying on right now)
- consistency: isn't your message "Avoid the Concurrency Trap by
Embracing Non-Determinism"? Isn't the key question what it means to
"embrace non-determinism," and how to design systems in an inherently
indeterminate environment? (more below)
- now line-of-sight and collision detection, which are more specific to
the simulation domain, are interesting in two regards:
-- collision detection (and weapons effects) is easy if you allow each
actor to determine for itself that "I'm hit," but not so easy if you want
independent verification by a referee or a physical environment model -
the latter pretty much requires some kind of serialization, and the
question becomes how
-- line-of-sight calculations are the bane of simulators - right now,
the practice is for each entity to do its own line-of-sight
calculations (it doesn't matter whether it's an object invoked by a
control thread or an asynchronous actor) - each entity takes a look
around (scans a database) to determine who it can see (and be seen by),
who it can't, what's blocking its view of other objects, etc. This is
very compute intensive, and it's where coders spend a LOT of time
optimizing (a CGF has to do this 20 times a second or more, while GIS
systems tend to take 30 seconds to several minutes to do the same thing
-- when I was in the simulation business, I sat in several rather
amusing meetings, watching coders from a well-known GIS firm, as their
jaws dropped when told how fast our stuff did line-of-sight
calculations). I expect that there are some serious efficiencies to be
gained by performing LOS calculations from a global perspective, and
that these can benefit from massive parallelism - I expect there's work
on ray tracing and rendering that applies - but that gets pretty far
afield from my own experience. (A rough sketch of the per-entity scan
versus a single global pass follows below.)
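To make that contrast concrete, here's a minimal sketch in Python - with
an invented terrain function and made-up entity positions, nothing like a
real CGF or elevation database - of the per-entity scan described above
versus a single global pass that tests each symmetric pair once. The
flat pair list in the second version is an embarrassingly parallel
workload.

    import math

    def terrain_height(x, y):
        # Stand-in for a real elevation-database lookup.
        return 10.0 * math.sin(x / 100.0) * math.cos(y / 100.0)

    def has_los(a, b, samples=32):
        # Sample the straight line between two positions (x, y, z tuples)
        # and see whether the terrain pokes above it anywhere.
        for i in range(1, samples):
            t = i / samples
            x = a[0] + t * (b[0] - a[0])
            y = a[1] + t * (b[1] - a[1])
            z = a[2] + t * (b[2] - a[2])
            if terrain_height(x, y) > z:
                return False
        return True

    def per_entity_scan(entities):
        # Today's practice: every entity independently scans everyone
        # else, so each pair is tested twice, 20+ times a second.
        return {me: [other for other, pos in entities.items()
                     if other != me and has_los(my_pos, pos)]
                for me, my_pos in entities.items()}

    def global_scan(entities):
        # Same answer from a global perspective: LOS is symmetric, so
        # each pair is tested once and shared - one pair per core (or
        # per GPU thread) if you want to parallelize it.
        names = list(entities)
        visible = {name: [] for name in names}
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                if has_los(entities[a], entities[b]):
                    visible[a].append(b)
                    visible[b].append(a)
        return visible

    entities = {"tank1": (0, 0, 5), "tank2": (300, 40, 5), "heli1": (150, 90, 120)}
    print(per_entity_scan(entities))
    print(global_scan(entities))   # same visibility, computed once per pair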
The old sequential model, or even the pipeline technique I suggest, does
not contradict the known, working structure for consistency.
But is consistency the issue at hand?
This line of conversation goes back to a comment that the limits to
exploiting parallelism come down to people thinking sequentially and to
the inherent complexity of designing parallel algorithms. I argue that quite
a few problems are more easily viewed through the lens of concurrency -
using network protocols and military simulation as examples that I'm
personally familiar with.
You seem to be making the case for sequential techniques that maintain
consistency. But is that really the question? This entire thread started
with a posting about a paper you were giving on Project Renaissance -
that contained two points that stood out to me:
"If we cannot skirt Amdahl’s Law, the last 900 cores will do us no good
whatsoever. What does this mean? We cannot afford even tiny amounts of
serialization."
"Avoid the Concurrency Trap by Embracing Non-Determinism?" (actually not
from the post, but from the Project Renaissance home page)
In this, I think we're in violent agreement - the key to taking advantage
of parallelism is to "embrace non-determinism."
In this context, I've been enjoying Carl Hewitt's recent writings about
indeterminacy in computing. If I might paraphrase a bit, isn't the point
that 'complex computing systems are inherently and always indeterminate,
let's just accept this, not try to force consistency where it can't be
forced, and get on with finding ways to solve problems that work in an
indeterminate environment.'
Which comes back to my original comment that there are broad classes of
problems that are more readily addressed through the lens of massive
concurrency (as a first-order architectural view). And that new hardware
advances (multi-core architectures, graphic processors), and
language/run-time models (actors, Erlang-like massive concurrency), now
allow us to architect systems around massive concurrency (when the model
fits the problem).
And, returning to this context:
"... For huge classes of problems - anything that's remotely
transactional or event driven, simulation, gaming come to mind
immediately - it's far easier to conceptualize as spawning a
process than trying to serialize things. The stumbling block has
always been context switching overhead. That problem goes away as
your hardware becomes massively parallel. "
Are you arguing that:
a) such problems are NOT easier to conceptualize as parallel and
asynchronous, or,
b) parallelism is NOT removing obstacles to taking actor-like
approaches to these classes of problems, or
c) something else?
I would argue all three.
Ahh... then I would counter that:
a) you selectively conceptualize only part of the system - an
idealized happy path. It is much more difficult to conceptualize your
whole system - i.e. all those sad paths you created but ignored. Many
simulators have collision detection, soft real-time latency
constraints, and consistency requirements. It is not easy to
conceptualize how your system achieves these.
In this one, I write primarily from personal experience and observation.
There is a huge class of systems that are inherently concurrent and
inherently not serializable. Pretty much any distributed system
qualifies - email and transaction processing come to mind. I happen to
think that simulators fall into this class as well - and in this regard
there's an existence proof:
- Today's simulators are built both ways:
-- CGFs and SAFs (multiple entities simulated on a single box) -
generally written with an object-oriented paradigm in C++ or Java,
highly optimized for performance, with code that is incredibly hard to
follow, and turns out to be rather brittle
-- networked simulations (e.g. F-16 man-in-the-loop simulators linked by
a network) are inherently independent processes, linked by networks that
have all kinds of indeterminacies vis-a-vis packet delays, packet
delivery order, and packet loss (you pretty much have to use multicast
UDP and accept packet loss, or you can't keep up with real-time
simulation - and the pilots tend to throw up all over the simulators if
the timing is off - sensitive thing, the human vestibular system). The
result is a much simpler architecture, and systems that are much easier
to follow (a bare-bones sketch of this style of state exchange follows
the list).
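For flavor, here's a bare-bones sketch of that style of state exchange in
Python. The multicast group, port, and packet layout are invented for
illustration - real simulator traffic is far richer - but it shows the
fire-and-forget send and a receiver that simply drops anything stale or
lost.

    import json
    import socket
    import struct

    GROUP, PORT = "239.1.1.1", 5007          # hypothetical multicast group

    def make_sender():
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
        return sock

    def send_state(sock, entity_id, seq, pos):
        # Fire and forget: no acknowledgement, no retransmission.
        payload = json.dumps({"id": entity_id, "seq": seq, "pos": pos}).encode()
        sock.sendto(payload, (GROUP, PORT))

    def make_receiver():
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", PORT))
        mreq = struct.pack("4sl", socket.inet_aton(GROUP), socket.INADDR_ANY)
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
        sock.setblocking(False)
        return sock

    def drain_updates(sock, world, latest_seq):
        # Lost packets are simply never seen; late or reordered packets
        # are discarded by the per-entity sequence check, not replayed.
        while True:
            try:
                data, _ = sock.recvfrom(1500)
            except BlockingIOError:
                return
            msg = json.loads(data)
            if msg["seq"] > latest_seq.get(msg["id"], -1):
                latest_seq[msg["id"]] = msg["seq"]
                world[msg["id"]] = msg["pos"]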
Very different run-time environments, very different system
architectures. Both work.
Personally, I find networked simulators a lot easier to conceptualize
than today's CGFs -- in one case, adding a new entity (say a plane) to a
simulation means adding a new box that has a clean interface (see the
small actor-style sketch below). In the other, it involves adding a new
object, and having to understand all kinds of behind-the-scenes magic
going on as multiple control threads wind their way through all the
objects in the system. One is a clean mapping onto the problem space;
the other is just ugly.
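As an illustration of that "new entity = new box" view, here's a small
sketch using a Python thread plus a mailbox queue as a stand-in for an
actor or a networked node. The message names are invented; the point is
only that adding a plane means constructing one more self-contained unit
rather than threading new logic through shared objects.

    import queue
    import threading

    class Entity(threading.Thread):
        def __init__(self, name, world_outbox):
            super().__init__(daemon=True)
            self.name = name
            self.mailbox = queue.Queue()     # the entity's entire interface
            self.world_outbox = world_outbox
            self.pos = (0.0, 0.0, 0.0)

        def run(self):
            while True:
                msg = self.mailbox.get()
                if msg["type"] == "tick":
                    # Advance own state, then publish it; no shared memory.
                    self.world_outbox.put({"from": self.name, "pos": self.pos})
                elif msg["type"] == "stop":
                    break

    world_outbox = queue.Queue()
    plane = Entity("f16-01", world_outbox)   # adding an entity = one more box
    plane.start()
    plane.mailbox.put({"type": "tick"})
    print(world_outbox.get())
    plane.mailbox.put({"type": "stop"})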
Yes, as noted above, serious problems remain - but the question is
whether serial or parallel approaches are more tractable at the
architectural level.
b) parallelism is not concurrency; it does not suggest actor-like
approaches. Pipeline and data parallelism are well proven alternatives
used in real practice. There are many others, which I have mentioned
before.
Fair point. If we limit ourselves to a discussion of pipelines and data
parallelism, I'll concede that they do not necessarily lead to cleaner
conceptual mappings between problems and systems architectures. In fact,
for the examples I've been talking about, my sense is that a pipelined
approach to simulation is not particularly easier to comprehend than
current approaches - though it might take better advantage of large
numbers of processing cores. In the case of email, I can't even begin to
think about applying synchronous parallelism to messages flowing through
a federation of mail servers.
On the other hand, if we look at the larger question of "skirting
Amdahl's Law" in an environment with lots of processing cores -
certainly within some definitions of "parallelism" - then actor-like
massive-concurrency approaches are in bounds, and the availability of
more cores allows for running more actors without running into
resource conflicts.
c) performance - context-switching overhead - isn't the most important
stumbling block. Consistency, correctness, and complexity are each more
important.
Ahh... here I'll throw it back to the question of architectures and
design patterns that assume inconsistency as the norm (biological
metaphors, if you will). And maybe add a touch of protocol layering
techniques: IP packets are inherently unreliable and probabilistic in
behavior, so we layer TCP on top to provide reliable connections; in
other cases - like VoIP and video streaming - we can't go back and
retransmit, so we either forget about lost packets or use
forward-error-correcting codes (a toy illustration of the FEC idea
follows below).
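To show the shape of that last option, here's a toy illustration - not a
real codec - where an XOR parity packet is sent after each group of data
packets so that a single loss within the group can be rebuilt without
retransmission. Real schemes (Reed-Solomon, fountain codes) are far more
capable; the packet contents here are made up.

    def xor_parity(packets):
        # Pad all packets in the group to a common length, then XOR them
        # byte by byte to form one parity packet.
        size = max(len(p) for p in packets)
        parity = bytearray(size)
        for p in packets:
            for i, byte in enumerate(p.ljust(size, b"\0")):
                parity[i] ^= byte
        return bytes(parity)

    def recover(received, parity):
        # 'received' is the group as it arrived, with None marking the one
        # lost packet; XORing the parity with the survivors rebuilds it.
        missing = received.index(None)
        rebuilt = bytearray(parity)
        for p in received:
            if p is not None:
                for i, byte in enumerate(p.ljust(len(parity), b"\0")):
                    rebuilt[i] ^= byte
        return missing, bytes(rebuilt).rstrip(b"\0")

    group = [b"pos update 1", b"pos update 2", b"pos update 3"]
    parity = xor_parity(group)
    print(recover([group[0], None, group[2]], parity))  # -> (1, b'pos update 2')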
Like you, I believe we can achieve parallel designs while improving
simplicity. But I think I will eschew turning tanks into actors.
Agreed on the first; not, obviously, on the second.
--
In theory, there is no difference between theory and practice.
In practice, there is. .... Yogi Berra