On Tue, Oct 27, 2015 at 2:26 PM, Dmitry Osipenko <dig...@gmail.com> wrote:
> 25.10.2015 20:39, Peter Crosthwaite пишет: > > On Sun, Oct 25, 2015 at 6:23 AM, Dmitry Osipenko <dig...@gmail.com> wrote: >> >>> 25.10.2015 02:55, Peter Crosthwaite пишет: >>> >>> On Sat, Oct 24, 2015 at 3:22 PM, Dmitry Osipenko <dig...@gmail.com> >>>> wrote: >>>> >>>>> >>>>> 24.10.2015 22:45, Peter Crosthwaite пишет: >>>>> >>>>>> >>>>>> >>>>>> >>>>>> This looks like a give-up without trying to get the correct value. If >>>>>> the calculated value (using the normal-path logic below) is sane, you >>>>>> should just use it. If it comes out bad then you should clamp to 1. >>>>>> >>>>>> I am wondering whether this clamping policy (as in original code as >>>>>> well) is correct at all though. The value of a free-running >>>>>> short-interval periodic timer (poor mans random number generator) >>>>>> without any actual interrupt generation will be affected by QEMUs >>>>>> asynchronous handling of timer events. So if I set limit to 100, then >>>>>> sample this timer every user keyboard stroke, I should get a uniform >>>>>> distribution on [0,100]. Instead in am going to get lots of 1s. This >>>>>> >>>>> >>>>> >>>>> >>>>> Right, that's a good example. What about to scale ptimer period to >>>>> match >>>>> adjusted timer_mod interval? >>>>> >>>>> >>>> Thats just as incorrect as changing the limit IMO. The guest could get >>>> confused with a timer running at the wrong frequency. >>>> >>>> is more broken in the original code (as you state), as I will get > >>>>>> 100, but I think we have traded broken for slightly less broken. I >>>>>> think the correct semantic is to completely ignoring rate limitin >>>>>> except for the scheduling on event callbacks. That is, the timer >>>>>> >>>>> >>>>> >>>>> >>>>> I'm missing you here. What event callbacks? >>>>> >>>>> >>>> when timer_mod() hits, and it turn triggers some device specific event >>>> (usually an interrupt). >>>> >>>> There are two basic interactions for any QEMU timer. There are >>>> synchronous events, i.e. the guest reading (polling) the counter which >>>> is what this patch tries to fix. The second is the common case of >>>> periodic interrupt generation. My proposal is that rate limiting does >>>> not affect synchronous operation, only asynchronous (so my keystroke >>>> RNG case works). In the current code, if ptimer_get_count() is called >>>> when the event has passed it returns 0 under the assumption that the >>>> timer_mod callback is about to happen. With rate-limiting that may be >>>> well in the future. >>>> >>>> >>> ptimer_tick() would happen on the next QEMU loop cycle, so it might be >>> more >>> reasonable to return counter = 1 here, wouldn't it? >>> >>> >>> interval is not rate limited, instead the timer_mod interval >>>>>> (next_event -last_event) just has a 10us clamp. >>>>>> >>>>>> The means the original code semantic of returning counter = 0 for an >>>>>> already triggered timer is wrong. It should handle in-the-past >>>>>> wrap-arounds as wraparounds by triggering the timer and redoing the >>>>>> math with the new interval values. So instead the logic would be >>>>>> something like: >>>>>> >>>>>> timer_val = -1; >>>>>> >>>>>> for(;;) { >>>>>> >>>>>> if (!enabled) { >>>>>> return delta; >>>>>> } >>>>>> >>>>>> timer_val = (next-event - now) * scaling(); >>>>>> if (timer_val >= 0) { >>>>>> return timer_val; >>>>>> } >>>>>> /* Timer has actually expired but we missed it, reload it and try >>>>>> again >>>>>> */ >>>>>> ptimer_tick(); >>>>>> } >>>>>> >>>>> >>>>> >>>>> >>>>> Why do you think that ptimer_get_count() == 0 in case of the running >>>>> periodic timer that was expired while QEMU was "on the way" to ptimer >>>>> code >>>>> is bad and wrong? >>>>> >>>> >>>> >>>> Because you may have gone past the time when it was actually zero and >>>> reloaded and started counting again. It should return the real >>>> (reloaded and resumed) value. This is made worse by rate limiting as >>>> the timer will spend a long time at the clamp value waiting for the >>>> rate-limited tick to fix it. >>>> >>>> Following on from before, we don't want the async stuff to affect >>>> sync. As the async callbacks are heavily affected by rate limiting we >>>> don't want the polled timer having to rely on the callbacks (async >>>> path) at all for correct operation. That's what the current >>>> implementation of ptimer_get_count assumes with that 0-clamp. >>>> >>>> >>> Alright, that make sense now. Thanks for clarifying. >>> >>> From the guest point of view it's okay (no?), do we really >>>>> need to overengineer that corner case? >>>>> >>>>> >>>> Depends on your use case. Your bug report is correct in that the timer >>>> should never be outside the bounds of the limit. But you are fixing >>>> that very specifically with a minimal change rather than correcting >>>> the larger ptimer_get_count() logic which I think is more broken than >>>> you suggest it is. >>>> >>>> >>>>>> ptimer_reload() then needs to be patched to make sure it always >>>>>> timer_mod()s in the future, otherwise this loop could iterate a large >>>>>> number of times. >>>>>> >>>>>> This means that when the qemu_timer() actually ticks, a large number >>>>>> or cycles may have occured, but we can justify that in that callback >>>>>> event latency (usually interrupts) is undefined anyway and coalescing >>>>>> of multiples may have happened as part of that. This usually matches >>>>>> expectations of real guests where interrupt latency is ultimately >>>>>> undefined. >>>>>> >>>>> >>>>> >>>>> >>>>> ptimer_tick() is re-arm'ing the qemu_timer() in case of periodic mode. >>>>> Hope >>>>> I haven't missed your point here. >>>>> >>>>> >>>> Yes. But it is also capable of doing the heavy lifting for our already >>>> expired case. Basic idea is, if the timer is in a bad state (should >>>> have hit) but hasn't, do the hit to put the timer into a good state >>>> (by calling ptimer_tick) then just do the right thing. That's what the >>>> loop does. It should also work for an in-the-past one-shot. >>>> >>>> >>> Summarizing what we have now: >>> >>> There are two issues with ptimer: >>> >>> 1) ptimer_get_count() return wrong values with adjusted .limit >>> >>> Patch V7 doesn't solve that issue, but makes it slightly better by >>> clamping >>> returned value to [0, 1]. That's not what we want, we need to return >>> counter >>> value in it's valid range [0, limit]. >>> >>> You are rejecting variant of scaling ptimer period, saying that it might >>> affect software behavior inside the guest. But by adjusting the timer, we >>> might already causing same misbehavior in case of blazing fast host >>> machine. >>> >> >> It is a different misbehaviour. We are modelling the polled timer >> perfectly but limiting the frequency of callbacks (interrupts). I >> think this is the lesser of two evils. >> >> I'll scratch my head a bit more on it. If you have any better idea, please >>> share. >>> >>> 2) ptimer_get_count() return fake 0 value in case of the expired >>> qemu_timer() without triggering ptimer_tick() >>> >>> You're suggesting to solve it by running ptimer_tick(). So if emulated >>> device uses ptimer tick event (scheduled qemu bh) to raise interrupt, it >>> would do it by one QEMU loop cycle earlier. >>> >>> >> Yes, this is ok, as even in a rate limited scenario there is no reason >> to absolutely force the rate limit. If a poll happens it should just >> flush the waiting interrupt. >> >> My question here: is it always legal for the guest software to be able to >>> get counter = 0 on poll while CPU interrupt on timer expire hasn't >>> happened >>> yet (would happen after one QEMU cycle). >>> >> >> Yes. And I am going a step further by saying it is ok for the guest >> software to see the timer value wrapped around before the expire too. >> >> > Let's imagine a hardware with a such restriction: timer interrupt has > highest priority and CPU immediately switches to the interrupt handler in a > such way that it won't ever could see counter = 0 / wraparound (with > interrupt enabled) before entering the handler. > > Is it unrealistic? > > Yes. And if it is possible in real HW, I don't think this is valid for QEMU outside of icount mode. > For instance (on QEMU): > > CPU | Timer > --------------------------------------------------------------------- > start_periodic_timer | timer starts ticking > ..... > QEMU starts to execute | > translated block | > | QEMU timer expires > | > CPU reads the timer register, | ptimer_get_count() return > ptimer_get_count() called | wrapped around value > ..... > CPU interrupt handler kicks in | timer continue ticking, so > | any value is valid here > CPU stops the timer and sets | > counter to 0, returns from the | > handler | > ..... > Now, for some reason, software | > sees that timer is stopped | > and do something using read | > value | > > Program code sketch: > > timer_interrupt_handler() > { > write32(1, TIMER_STOP); > write32(0, TIMER_COUNTER); > write32(TIMER_IRQ_CLEAR, TIMER_STATUS); > > return IRQ_HANDLED; > } > > program() > { > ..... > > ..... <--- timer expired here > ..... <--- interrupt handler executed here on real HW > > var1 = read32(TIMER_COUNTER); <--- Emulated got wrapped, > real got 0 > > ..... <--- interrupt handler executed here on QEMU > > if (read32(TIMER_STATUS) & TIMER_RUNNING) { > ..... > } else { > ..... > > write(var1 >> 16, SOME_DEV_REGISTER); > } > > ..... > } > > Might emulated program behave differently from the real HW after it? > Probably. > > I want to mention that not only beefy generic CPU's are the ptimer users. > > However, it seems that no one of the current ptimer users has a such > restriction since it would already return 0 on expire and ptimer_tick() > would happen after it. We can agree on keeping ptimer less universal in > favor of > the expire optimization, so somebody may improve it later if it would be > needed. > > I think removing the rate limiter's and clamping-affect on the read value makes it more universal if anything. > Do we agree? > > I'm not sure, what are you referring to as the "expire optimizsation"? > I guess it might cause software >>> misbehavior if it assumes that the real hardware has CPU and timer >>> running >>> in the same clock domain, i.e. such situation might be not possible. >>> >> >> Assumptions about the CPU clocking only make sense in icount mode, >> where the rate limiter is disable anyway. >> >> > Timer limiter has nothing to do with a returned value for the expired > timer. Clock cycle accurate execution isn't relevant to upstream QEMU, I > meant clocking in general. Emulated behaviour shouldn't diverge from the > real HW. > > But when the rate limiter is on for a short interval, it massively distorts this. Regards, Peter > So I'm >>> suggesting to return counter = 1 and allow ptimer_tick() happen on it's >>> own. >>> >>> >> My alternate suggestion is, if you detect that the tick should have >> already happened, just make it happen. I don't see the need to rate >> limit a polled timer. >> >> > Yes, I got your idea and it is absolutely correct if we agree on the above > tradeoff (if that tradeoff exists). > > -- > Dmitry >