On Tue, Feb 14, 2017 at 1:07 PM, Patrick Georgi <pgeo...@google.com> wrote:
> 2017-02-14 17:12 GMT+01:00 Aaron Durbin via coreboot <coreboot@coreboot.org>:
>> For an optimized bootflow all pieces of work that need to be done
>> pretty much need to be closely coupled. One needs to globally
>> optimize the full sequence.
> Like initializing slow hardware even before RAM init (as long as it's
> just an initial command)?
> How about using PIT/IRQ0 plus some tiny register-only interrupt
> routine to do trivial register wrangling (we do have register
> scripts)?
I don't think I properly understand your suggestion. For this particular
eMMC case, are you suggesting taking the PIT interrupt and doing the next
piece of work in it?

>> that we seem to be absolutely needing to
>> maintain boot speeds. Is Chrome OS going against the tide of coreboot
>> wanting to solve those sorts of issues?
> The problem is that two basic qualities collide here: speed and
> simplicity. The effect is that people ask to stop a second to
> reconsider the options.
> MP init and parallelism are the "go to" solution for all
> performance-related issues of the last 10 years, but they're not
> without cost. Questioning this approach doesn't mean that we shouldn't
> go there at all, just that the obvious answers might not lead to
> simple solutions.
>
> As Andrey stated elsewhere, we're far from CPU bound.

Agreed. But our chunking of work is very coarsely sectioned up. I think
the other-CPU path is an attempt to work around the coarseness of the
work steps in the dependency chain.

> For his concrete example: does eMMC init fail if you ping it more
> often than every 10ms? It better not: you already stated that it's
> hard to guarantee those 10ms, so there needs to be some spare room. We
> could look at the largest chunk of the init process that could be
> restructured to implement cooperative multithreading on a single core
> for as many tasks as possible, to cut down on all those udelays (or
> even mdelays). Maybe we could even build a compiler plugin to ensure
> at compile time that the resulting code is proper (loops either have
> low bounds or are yielding; yield()/sched()/... aren't called within
> critical sections)...

That's a possibility, but you have to solve that case for each
combination of hardware present and/or per platform. Building up the
dependency chain is the most important piece, and from there, ensuring
execution context is not lost for longer than a set amount of time.
We're miles away from that, since we run to completion serially right
now.

> Once we leave that scheduling to physics (i.e. enabling multicore
> operation), all bets are off (or we have to synchronize the execution
> to a degree that we could just as well do it manually). A lot of
> complexity just to have 8 times the CPU power for the same amount of
> IO-bound tasks.
>
>
> Patrick

--
coreboot mailing list: coreboot@coreboot.org
https://www.coreboot.org/mailman/listinfo/coreboot