On Tue, Oct 27, 2015 at 2:26 PM, Dmitry Osipenko <dig...@gmail.com> wrote:

> 25.10.2015 20:39, Peter Crosthwaite wrote:
>
> On Sun, Oct 25, 2015 at 6:23 AM, Dmitry Osipenko <dig...@gmail.com> wrote:
>>
>>> 25.10.2015 02:55, Peter Crosthwaite wrote:
>>>
>>> On Sat, Oct 24, 2015 at 3:22 PM, Dmitry Osipenko <dig...@gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>> 24.10.2015 22:45, Peter Crosthwaite wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> This looks like a give-up without trying to get the correct value. If
>>>>>> the calculated value (using the normal-path logic below) is sane, you
>>>>>> should just use it. If it comes out bad then you should clamp to 1.
>>>>>>
>>>>>> I am wondering whether this clamping policy (as in original code as
>>>>>> well) is correct at all though. The value of a free-running
>>>>>> short-interval periodic timer (poor mans random number generator)
>>>>>> without any actual interrupt generation will be affected by QEMUs
>>>>>> asynchronous handling of timer events. So if I set limit to 100, then
>>>>>> sample this timer every user keyboard stroke, I should get a uniform
>>>>>> distribution on [0,100]. Instead I am going to get lots of 1s. This
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Right, that's a good example. What about scaling the ptimer period to
>>>>> match the adjusted timer_mod interval?
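>>>>>
>>>>> Roughly what I have in mind (just a sketch to illustrate the idea;
>>>>> MIN_TIMER_INTERVAL_NS and effective_period are made-up names):
>>>>>
>>>>>     /* If the timer_mod interval was clamped up to the rate limit,
>>>>>      * stretch the period used for counter readback by the same
>>>>>      * ratio, so ptimer_get_count() still spans [0, delta]. */
>>>>>     interval = s->delta * s->period;
>>>>>     if (interval < MIN_TIMER_INTERVAL_NS) {
>>>>>         s->effective_period = MIN_TIMER_INTERVAL_NS / s->delta;
>>>>>     } else {
>>>>>         s->effective_period = s->period;
>>>>>     }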
>>>>>
>>>>>
>>>> That's just as incorrect as changing the limit IMO. The guest could get
>>>> confused with a timer running at the wrong frequency.
>>>>
>>>> is more broken in the original code (as you state), as I will get >
>>>>>> 100, but I think we have traded broken for slightly less broken. I
>>>>>> think the correct semantic is to completely ignore rate limiting
>>>>>> except for the scheduling on event callbacks. That is, the timer
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> I'm not following you here. What event callbacks?
>>>>>
>>>>>
>>>> When timer_mod() hits, it in turn triggers some device-specific event
>>>> (usually an interrupt).
>>>>
>>>> There are two basic interactions for any QEMU timer. The first is
>>>> synchronous events, i.e. the guest reading (polling) the counter, which
>>>> is what this patch tries to fix. The second is the common case of
>>>> periodic interrupt generation. My proposal is that rate limiting does
>>>> not affect synchronous operation, only asynchronous (so my keystroke
>>>> RNG case works). In the current code, if ptimer_get_count() is called
>>>> when the event has passed it returns 0 under the assumption that the
>>>> timer_mod callback is about to happen. With rate-limiting that may be
>>>> well in the future.
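>>>>
>>>> Roughly (paraphrasing from memory, field names approximate), the polled
>>>> path today is shaped like:
>>>>
>>>>     if (!s->enabled) {
>>>>         return s->delta;
>>>>     }
>>>>     now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
>>>>     if (now >= s->next_event) {
>>>>         return 0;   /* assumes the timer_mod callback will fire "now" */
>>>>     }
>>>>     return (s->next_event - now) / s->period;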
>>>>
>>>>
>>> ptimer_tick() would happen on the next QEMU loop cycle, so it might be
>>> more
>>> reasonable to return counter = 1 here, wouldn't it?
>>>
>>>
>>> interval is not rate limited, instead the timer_mod interval
>>>>>> (next_event -last_event) just has a 10us clamp.
>>>>>>
>>>>>> This means the original code semantic of returning counter = 0 for an
>>>>>> already triggered timer is wrong. It should handle in-the-past
>>>>>> wrap-arounds as wraparounds by triggering the timer and redoing the
>>>>>> math with the new interval values. So instead the logic would be
>>>>>> something like:
>>>>>>
>>>>>> timer_val = -1;
>>>>>>
>>>>>> for (;;) {
>>>>>>     if (!enabled) {
>>>>>>         return delta;
>>>>>>     }
>>>>>>
>>>>>>     timer_val = (next_event - now) * scaling();
>>>>>>     if (timer_val >= 0) {
>>>>>>         return timer_val;
>>>>>>     }
>>>>>>     /* Timer has actually expired but we missed it, reload it and
>>>>>>      * try again. */
>>>>>>     ptimer_tick();
>>>>>> }
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Why do you think that ptimer_get_count() == 0 is bad and wrong in the
>>>>> case of a running periodic timer that expired while QEMU was "on the
>>>>> way" to the ptimer code?
>>>>>
>>>>
>>>>
>>>> Because you may have gone past the time when it was actually zero and
>>>> reloaded and started counting again. It should return the real
>>>> (reloaded and resumed) value. This is made worse by rate limiting as
>>>> the timer will spend a long time at the clamp value waiting for the
>>>> rate-limited tick to fix it.
>>>>
>>>> Following on from before, we don't want the async stuff to affect
>>>> sync. As the async callbacks are heavily affected by rate limiting we
>>>> don't want the polled timer having to rely on the callbacks (async
>>>> path) at all for correct operation. That's what the current
>>>> implementation of ptimer_get_count assumes with that 0-clamp.
>>>>
>>>>
>>> Alright, that makes sense now. Thanks for clarifying.
>>>
>>>   From the guest's point of view it's okay (no?), do we really
>>>>> need to overengineer that corner case?
>>>>>
>>>>>
>>>> Depends on your use case. Your bug report is correct in that the timer
>>>> should never be outside the bounds of the limit. But you are fixing
>>>> that very specifically with a minimal change rather than correcting
>>>> the larger ptimer_get_count() logic which I think is more broken than
>>>> you suggest it is.
>>>>
>>>>
>>>>>> ptimer_reload() then needs to be patched to make sure it always
>>>>>> timer_mod()s in the future, otherwise this loop could iterate a large
>>>>>> number of times.
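>>>>>>
>>>>>> Something along these lines (sketch only, assuming the usual
>>>>>> delta/period/next_event fields):
>>>>>>
>>>>>>     /* in ptimer_reload(): never arm the QEMU timer in the past */
>>>>>>     now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
>>>>>>     interval = s->delta * s->period;   /* ns per full count-down */
>>>>>>     if (interval && s->next_event <= now) {
>>>>>>         /* skip whole missed periods so the loop above terminates */
>>>>>>         s->next_event += ((now - s->next_event) / interval + 1) * interval;
>>>>>>     }
>>>>>>     timer_mod(s->timer, s->next_event);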
>>>>>>
>>>>>> This means that when the qemu_timer() actually ticks, a large number
>>>>>> of cycles may have occurred, but we can justify that in that callback
>>>>>> event latency (usually interrupts) is undefined anyway and coalescing
>>>>>> of multiples may have happened as part of that. This usually matches
>>>>>> expectations of real guests where interrupt latency is ultimately
>>>>>> undefined.
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ptimer_tick() re-arms the qemu_timer() in periodic mode. Hope I haven't
>>>>> missed your point here.
>>>>>
>>>>>
>>>> Yes. But it is also capable of doing the heavy lifting for our already
>>>> expired case. The basic idea is: if the timer is in a bad state (should
>>>> have hit) but hasn't, do the hit to put the timer into a good state
>>>> (by calling ptimer_tick) then just do the right thing. That's what the
>>>> loop does. It should also work for an in-the-past one-shot.
>>>>
>>>>
>>> Summarizing what we have now:
>>>
>>> There are two issues with ptimer:
>>>
>>> 1) ptimer_get_count() returns wrong values with an adjusted .limit
>>>
>>> Patch V7 doesn't solve that issue, but makes it slightly better by
>>> clamping the returned value to [0, 1]. That's not what we want; we need to
>>> return the counter value in its valid range [0, limit].
>>>
>>> You are rejecting the variant of scaling the ptimer period, saying that it
>>> might affect software behaviour inside the guest. But by adjusting the
>>> timer, we might already be causing the same misbehaviour in the case of a
>>> blazing fast host machine.
>>>
>>
>> It is a different misbehaviour. We are modelling the polled timer
>> perfectly but limiting the frequency of callbacks (interrupts). I
>> think this is the lesser of two evils.
>>
>> I'll scratch my head a bit more on it. If you have any better idea, please
>>> share.
>>>
>>> 2) ptimer_get_count() returns a fake 0 value in the case of an expired
>>> qemu_timer() without triggering ptimer_tick()
>>>
>>> You're suggesting to solve it by running ptimer_tick(). So if an emulated
>>> device uses the ptimer tick event (scheduled qemu bh) to raise an
>>> interrupt, it would do it one QEMU loop cycle earlier.
>>>
>>>
>> Yes, this is ok, as even in a rate limited scenario there is no reason
>> to absolutely force the rate limit. If a poll happens it should just
>> flush the waiting interrupt.
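>>
>> I.e. something like (sketch; ptimer_state fields as usual):
>>
>>     /* in ptimer_get_count(): a poll flushes any overdue tick instead
>>      * of waiting for the rate-limited callback */
>>     while (s->enabled &&
>>            qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) >= s->next_event) {
>>         ptimer_tick(s);   /* raises the bh/interrupt and re-arms */
>>     }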
>>
>> My question here: is it always legal for the guest software to be able to
>>> get counter = 0 on a poll while the CPU interrupt on timer expiry hasn't
>>> happened yet (it would happen after one QEMU cycle)?
>>>
>>
>> Yes. And I am going a step further by saying it is ok for the guest
>> software to see the timer value wrapped around before the expire too.
>>
>>
> Let's imagine hardware with such a restriction: the timer interrupt has the
> highest priority and the CPU immediately switches to the interrupt handler
> in such a way that it can never see counter = 0 / a wraparound (with the
> interrupt enabled) before entering the handler.
>
> Is it unrealistic?
>
>
Yes. And even if it is possible in real HW, I don't think this is valid for
QEMU outside of icount mode.



> For instance (on QEMU):
>
>              CPU                  |               Timer
> ---------------------------------------------------------------------
>   start_periodic_timer            |       timer starts ticking
>                                 .....
>   QEMU starts to execute          |
>   translated block                |
>                                   |       QEMU timer expires
>                                   |
>   CPU reads the timer register,   |       ptimer_get_count() returns a
>   ptimer_get_count() called       |       wrapped-around value
>                                 .....
>   CPU interrupt handler kicks in  |       timer continues ticking, so
>                                   |       any value is valid here
>   CPU stops the timer and sets    |
>   counter to 0, returns from the  |
>   handler                         |
>                                 .....
>   Now, for some reason, software  |
>   sees that the timer is stopped  |
>   and does something using the    |
>   read value                      |
>
> Program code sketch:
>
> timer_interrupt_handler()
> {
>         write32(1, TIMER_STOP);
>         write32(0, TIMER_COUNTER);
>         write32(TIMER_IRQ_CLEAR, TIMER_STATUS);
>
>         return IRQ_HANDLED;
> }
>
> program()
> {
>         .....
>
>         ..... <--- timer expired here
>         ..... <--- interrupt handler executed here on real HW
>
>         var1 = read32(TIMER_COUNTER); <--- Emulated got wrapped,
>                                            real got 0
>
>         ..... <--- interrupt handler executed here on QEMU
>
>         if (read32(TIMER_STATUS) & TIMER_RUNNING) {
>                 .....
>         } else {
>                 .....
>
>                 write(var1 >> 16, SOME_DEV_REGISTER);
>         }
>
>         .....
> }
>
> Might the emulated program behave differently from the real HW after this?
> Probably.
>
> I want to mention that beefy generic CPUs are not the only ptimer users.
>
> However, it seems that none of the current ptimer users has such a
> restriction, since it would already return 0 on expiry and ptimer_tick()
> would happen after it. We can agree on keeping ptimer less universal in
> favor of the expire optimization, so somebody may improve it later if it is
> ever needed.
>
>
I think removing the rate limiter's and the clamp's effect on the read value
makes it more universal, if anything.


> Do we agree?
>
>
I'm not sure; what are you referring to as the "expire optimization"?


> I guess it might cause software
>>> misbehavior if it assumes that the real hardware has the CPU and timer
>>> running in the same clock domain, i.e. such a situation might not be
>>> possible.
>>>
>>
>> Assumptions about the CPU clocking only make sense in icount mode,
>> where the rate limiter is disabled anyway.
>>
>>
> The timer rate limiter has nothing to do with the value returned for an
> expired timer. Clock-cycle-accurate execution isn't relevant to upstream
> QEMU; I meant clocking in general. Emulated behaviour shouldn't diverge
> from the real HW.
>
>
But when the rate limiter is on for a short interval, it massively distorts
this.

Regards,
Peter


> So I'm
>>> suggesting to return counter = 1 and allow ptimer_tick() to happen on its
>>> own.
>>>
>>>
>> My alternate suggestion is: if you detect that the tick should have
>> already happened, just make it happen. I don't see the need to rate
>> limit a polled timer.
>>
>>
> Yes, I got your idea and it is absolutely correct if we agree on the above
> tradeoff (if that tradeoff exists).
>
> --
> Dmitry
>
